This article is designed to introduce you the concepts of isolation levels—the concepts that every database programmer needs to know to develop applications that allow multi-user access to the database.
Data consistency is a term used to describe the accuracy of data. For example, suppose you have a table that maintains a count of vehicles available for sale in car dealerships. When a vehicle is transferred from dealership A to dealership B, the count has to decrease in dealership A and increase in dealership B. Both actions must take place; if one occurs and another does not, the data becomes inaccurate—data inconsistency occurs.
Data inconsistency can occur due to several factors:
- Data is accessed/updated by multiple users at the same time
- Transition fails to perform all necessary actions due to a crash
To prevent the occurrence of the above situations, a DBMS (database management system) relies on isolation levels and locks.
An isolation levels mechanism is used to isolate each transaction in a multi-user environment. The correct use of the isolation levels mechanism prevents applications from introducing errors that can occur from the following situations:
1. Lost Updates: This situation occurs when two transactions attempt to update the same data. Consider the following example:
- Transaction A reads row 1.
- Transaction B reads row 1.
- Transaction A updates row 1.
- Transaction B updates row 1, overlaying changes applied by Transaction A.
In the above situation, updates performed by Transaction A are lost.
2. Dirty Reads: This situation occurs when transactions read data that has not been committed. Consider the following example:
- Transaction A inserts row 1 without committing.
- Transaction B reads row 1.
- Transaction B rolls back row 1.
- Transaction A now has a row that physically does not exist.
3. Nonrepeatable Reads: This situation occurs when a transaction reads the same query multiple times and results are not the same each time. Consider the following example:
- Transaction A reads a row of data.
- Transaction B modifies this row and commits.
- Transaction A re-reads the same row and sets back different data values.
4. Phantoms: This situation occurs when a row of data matches the first time but does not match subsequent times. Consider the following example:
- Transaction A reads two rows based on a Query A where clause.
- Transaction B inserts a new row that happens to fall under Transaction A Query A's where clause.
- Transaction A runs Query A again and now gets back three rows.
The above four phenomenona demonstrate that there is a need to utilize a mechanism called ISOLATION LEVELS. The next sections of this article will explain DB2 isolation levels. Most of the rules regarding isolation levels are the same across most transactional databases but some rules may vary; therefore, I advise you consult other sources when dealing with databases other than DB2. A DB2 database supports four isolation levels:
- REPEATABLE READ: Protects against Lost Updates, Dirty Reads, Nonrepeatable Reads, and Phantoms
- READ STABILITY: Protects against Lost Updates, Dirty Reads, and Nonrepeatable Reads. Read stability does not protect against Phantoms.
- CURSOR STABILITY: Protects against Nonrepeatable Reads and Phantoms. Cursor Stability does not protect against Lost Updates and Dirty Reads.
- UNCOMMITED READ: Protects against Lost Updates. Uncommitted Read does not protect against Phantoms, Dirty Reads, and Nonrepeatable Reads.
This isolation level is the most restrictive of the four. It makes sure that none of the phenomenona will occur. It is also the most restrictive when it comes to database concurrency. (Database concurrency controls ensure that transactions occur in an ordered fashion. The main job of these controls is to protect transactions issued by different users/applications from the effects of each other. They must preserve the four characteristics of database transactions: atomicity, isolation, consistency and durability.) When a transaction with isolation level REAPETABLE READ (RR) executes, it locks every row it references. It is important to understand that the lock is held for every row, not only rows that are updated, but also those that are merely selected. Consider the following situation:
A car dealership manager is executing a report that produces a total number of vehicles for sale. During this time, a salesperson is attempting to submit a purchase order that will decrement the number of vehicles on the lot. If the manager's report is running with isolation level RR, the purchase order will not go through because all rows are being locked by the manager's report.
The above situation reveals the importance of understanding the implications of using the Repeatable Read isolation level.
Now, look at another example but in more technical terms:
- Transaction A with Repeatable Read isolation level selects all rows on table A.
- Transaction B attempts to update any of the rows selected by Transaction A and fails because all rows selected by Transaction A are locked.
Read Stability is not as restrictive as the Repeatable Read isolation level; therefore, it does not prevent all the phenomenona. The difference between Read Stability and Repeatable Read is as follows:
In Read Stability, only rows that are retrieved or modified are locked, whereas in Repeatable Read, all rows that are being referenced are locked. This crucial difference makes Phantoms possible, but protects against Lost Updates, Dirty Reads, and Nonrepeatable Reads. Look at the following scenario:
- Transaction A with isolation level Read Stability selects row 1.
- Transaction B cannot update row 1 while Transaction B is holding a lock on row 1. This prevents against Lost Updates, Dirty Reads, and Nonrepeatable Reads.
- Transaction B can insert a new row that can show up in a result set of Transaction A. This allows for Phantoms.