Transaction isolation levels in relational databases

During one of my recruitment interviews, I was asked how database transactions interact with one another and what influence we have on this. Although I had read about transaction isolation levels in relational databases in the past, I hadn’t had the opportunity to think about the problem in production, so I didn’t initially connect it with what my interviewer was getting at. It seems like a good topic to start with, so let’s get going.

A database transaction is a sequence of database operations which are performed as a single unit of work. Its purpose is to guarantee the consistency of the information we modify through commit and rollback mechanisms. If one of the database operations (for example, one save/update out of several) fails, all previous changes are rolled back so that no data inconsistency is created.
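This all-or-nothing behaviour can be sketched with Python's built-in sqlite3 module (the accounts table and the transfer amounts here are invented purely for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 50)")
conn.commit()

try:
    # Two updates that must succeed or fail together: a money transfer.
    conn.execute("UPDATE accounts SET balance = balance - 70 WHERE id = 1")
    conn.execute("UPDATE accounts SET balance = balance + 70 WHERE id = 2")
    raise RuntimeError("simulated failure before commit")
except RuntimeError:
    conn.rollback()  # both updates are undone, so no inconsistency remains

balances = [row[0] for row in conn.execute("SELECT balance FROM accounts ORDER BY id")]
print(balances)  # → [100, 50]: the partial transfer left no trace
```

Without the rollback, a failure between the two updates could leave the money withdrawn but never deposited.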

ACID is the standard transactional model in relational databases. Its name is an acronym of the words atomicity, consistency, isolation and durability. It is a set of properties designed to guarantee data validity despite errors, power failures and other mishaps.

Atomicity says that either all operations performed as part of the transaction are committed, or none of them are.

Consistency says that a transaction transforms the database from one consistent state to another consistent state. Different transaction isolation levels affect how data consistency is maintained during and after a transaction.

Isolation says that a transaction is isolated from other transactions to avoid conflicts and maintain a consistent state.

Durability means that once a transaction is completed, its effects are permanent and resilient to system failure. Although it relates more to data safety, it also matters for isolation, as it determines what happens to the data after a transaction ends.

When working with transactions, we may encounter several characteristic, undesirable phenomena:

A dirty read occurs when a transaction reads data that another transaction has not yet committed. If that other transaction eventually rolls back its changes, the first transaction has read data that never officially existed.
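A toy simulation of the effect, with two plain dictionaries standing in for committed and uncommitted state (an invented illustration, not how a real engine stores data):

```python
committed = {"balance": 100}  # state everyone can see after a commit
pending = {}                  # transaction B's writes, not yet committed

def read(key, read_uncommitted=False):
    # Under READ UNCOMMITTED a reader may see another transaction's
    # pending writes; otherwise only committed state is visible.
    if read_uncommitted and key in pending:
        return pending[key]
    return committed[key]

pending["balance"] = 40   # transaction B updates the value, no commit yet

dirty = read("balance", read_uncommitted=True)   # 40: a dirty read
safe = read("balance")                           # 100: committed value only

pending.clear()  # transaction B rolls back: the value 40 never existed
print(dirty, safe)  # → 40 100
```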

A non-repeatable read occurs when a transaction reads the same row twice but gets different data each time. Example: transaction one reads a row. Transaction two modifies or deletes the same row and commits. If transaction one reads the same row again, it gets different data than the first time, or it discovers that the row no longer exists.
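The sequence can be reproduced with two sqlite3 connections; plain SELECTs outside an explicit transaction behave roughly like read committed here, which is exactly what makes the anomaly visible (table and values are invented):

```python
import sqlite3
import tempfile
import os

path = os.path.join(tempfile.mkdtemp(), "demo.db")
t1 = sqlite3.connect(path)  # stands in for transaction one
t2 = sqlite3.connect(path)  # stands in for transaction two
t1.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, qty INTEGER)")
t1.execute("INSERT INTO items VALUES (1, 10)")
t1.commit()

# Transaction one reads the row for the first time.
first = t1.execute("SELECT qty FROM items WHERE id = 1").fetchall()[0][0]

# Transaction two modifies the same row and commits.
t2.execute("UPDATE items SET qty = 99 WHERE id = 1")
t2.commit()

# Transaction one reads the same row again and gets different data.
second = t1.execute("SELECT qty FROM items WHERE id = 1").fetchall()[0][0]
print(first, second)  # → 10 99: a non-repeatable read
```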

A phantom read is a phenomenon in which new rows appear or disappear when the same query is re-executed during a single transaction. Example: transaction one reads a set of rows that meet its search criteria. Then transaction two adds a new row that meets those criteria and commits. If transaction one performs the same search query again, it gets a different set of records.
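Phantom reads can be reproduced the same way, with two sqlite3 connections and an insert instead of an update (table and values invented for illustration):

```python
import sqlite3
import tempfile
import os

path = os.path.join(tempfile.mkdtemp(), "demo.db")
t1 = sqlite3.connect(path)  # stands in for transaction one
t2 = sqlite3.connect(path)  # stands in for transaction two
t1.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
t1.execute("INSERT INTO orders VALUES (1, 500), (2, 80)")
t1.commit()

# Transaction one counts the rows matching its search criteria.
before = t1.execute("SELECT COUNT(*) FROM orders WHERE amount > 100").fetchall()[0][0]

# Transaction two inserts a new matching row and commits.
t2.execute("INSERT INTO orders VALUES (3, 900)")
t2.commit()

# Re-running the same query now returns an extra, "phantom" row.
after = t1.execute("SELECT COUNT(*) FROM orders WHERE amount > 100").fetchall()[0][0]
print(before, after)  # → 1 2
```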

At the read uncommitted level, the system allows data to be read before transactions have finished. It is a good fit when fast processing is required and the risk of reading unconfirmed data is an acceptable cost. Dirty reads, non-repeatable reads, and phantom reads may all occur at this level. Use it where performance is crucial and minimal isolation is enough.

The read committed level eliminates dirty reads, which is a significant improvement in data integrity and consistency. It is stricter than read uncommitted but still allows some data inconsistency that may be acceptable in certain business scenarios. Use this level wherever avoiding dirty reads is essential but non-repeatable and phantom reads are tolerable.

The repeatable read level is still susceptible to phantom reads but eliminates both dirty reads and non-repeatable reads. To achieve this, the system must lock the rows a transaction has read against concurrent updates, which can lead to delays and blocked access to data. Use this level where reads must be stable within a transaction and phantom reads are tolerable.

The serializable level offers the highest isolation, guaranteeing complete data consistency and eliminating all of the problems above. However, this complete isolation may increase resource consumption and decrease performance, especially under intensive concurrent access across multiple transactions. Use this level where complete isolation is critical, even at the expense of performance and higher resource consumption.
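The trade-offs above can be summarised in a small mapping; it follows the ANSI SQL standard (individual database engines sometimes deviate from it in the details):

```python
# True means the anomaly may still occur at that isolation level (ANSI SQL).
ANOMALIES = {
    "READ UNCOMMITTED": {"dirty": True,  "non_repeatable": True,  "phantom": True},
    "READ COMMITTED":   {"dirty": False, "non_repeatable": True,  "phantom": True},
    "REPEATABLE READ":  {"dirty": False, "non_repeatable": False, "phantom": True},
    "SERIALIZABLE":     {"dirty": False, "non_repeatable": False, "phantom": False},
}

# The stricter the level, the fewer anomalies it permits.
for level, anomalies in ANOMALIES.items():
    allowed = [name for name, possible in anomalies.items() if possible]
    print(f"{level}: {', '.join(allowed) or 'none'}")
```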

Transaction isolation levels can be configured in several different ways, depending on the effect we want to achieve.

In most relational database management systems (RDBMS), you can set the default isolation level for the entire database. New transactions within the database will then use that isolation level unless it is overridden at the session or transaction level.
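The exact statement is engine-specific. As a hedged sketch, PostgreSQL exposes a per-database default through its default_transaction_isolation setting; the database name app_db, the credentials, and the psycopg2 driver are all assumptions here, and a running server is required, so treat this as illustration rather than a runnable snippet:

```python
import psycopg2  # assumption: PostgreSQL accessed through the psycopg2 driver

# Hypothetical connection; ALTER DATABASE needs suitable privileges.
conn = psycopg2.connect("dbname=postgres user=admin")
conn.autocommit = True  # run the statement outside an explicit transaction
cur = conn.cursor()
# Every new session connecting to app_db now defaults to REPEATABLE READ
# unless it overrides the level at session or transaction scope.
cur.execute(
    "ALTER DATABASE app_db SET default_transaction_isolation = 'repeatable read'"
)
```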

This approach makes it possible to control the isolation level for individual application sessions. You achieve it by changing the isolation level at the beginning of the session.
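SQLite offers a session-scoped example of this through PRAGMA read_uncommitted, which takes effect only in shared-cache mode; a sketch using the standard library (table and values invented for illustration):

```python
import sqlite3

# Two connections to one shared in-memory database (SQLite shared-cache mode).
a = sqlite3.connect("file:demo?mode=memory&cache=shared", uri=True)
b = sqlite3.connect("file:demo?mode=memory&cache=shared", uri=True)

a.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, msg TEXT)")
a.commit()

# Session-level setting: connection b switches itself to READ UNCOMMITTED.
b.execute("PRAGMA read_uncommitted = 1")

# Connection a inserts a row but does not commit yet.
a.execute("INSERT INTO logs VALUES (1, 'pending')")

# Thanks to its session setting, b already sees a's uncommitted row;
# without the PRAGMA this read would fail with "database table is locked".
rows = b.execute("SELECT msg FROM logs").fetchall()
a.rollback()  # the row b just read never officially existed
print(rows)
```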

In some cases, you may want to set the isolation level on the transaction itself. Such a transaction will use the specified isolation level regardless of session-level configuration or database-wide settings.
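A hedged sketch in PostgreSQL syntax: the driver choice (psycopg2), the credentials, and the accounts table are assumptions, and a running server is required. In PostgreSQL, SET TRANSACTION must come before any query in the transaction and affects only that transaction:

```python
import psycopg2  # assumption: PostgreSQL accessed through the psycopg2 driver

conn = psycopg2.connect("dbname=app_db user=app")  # hypothetical credentials
cur = conn.cursor()
# psycopg2 opens a transaction implicitly at the first statement; issuing
# SET TRANSACTION first scopes the isolation level to this transaction only.
cur.execute("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE")
cur.execute("UPDATE accounts SET balance = balance - 70 WHERE id = 1")
conn.commit()  # the next transaction falls back to the session/database default
```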

Many ORM frameworks also let you control the isolation level from application code, which gives you more flexibility to tailor it to the needs of your application. For the details, I refer you to the documentation of the ORM framework you use.