The notion of a transaction is fundamental to business systems architectures. A transaction, simply put, ensures that only agreed-upon, consistent, and acceptable state changes are made to a system—regardless of system failure or concurrent access to the system's resources.
With the advent of Web service architectures, distributed applications (macroservices) are being built by assembling existing, smaller services (microservices). The microservices are usually built with no a priori knowledge of how they may be combined. The resulting complex architectures introduce new challenges to existing transaction models. Several new standards are being proposed that specify how application servers and transaction managers implement new transaction models that accommodate the introduced complexities.
The first section of this series introduces the fundamental concepts behind transactions and explains how transactions are managed within the current Java/J2EE platforms. Later sections discuss the challenges of using existing transaction models for Web services, explain newly proposed models and standards for Web service transactions, and finally, detail proposed implementations of these new models on the Java platform.
A transaction may be thought of as an interaction with the system, resulting in a change to the system state. While the interaction is in the process of changing system state, any number of events can interrupt the interaction, leaving the state change incomplete and the system state in an inconsistent, undesirable form. Any change to system state within a transaction boundary, therefore, has to ensure that the change leaves the system in a stable and consistent state.
A transactional unit of work is one in which the following four fundamental transactional properties are satisfied: atomicity, consistency, isolation, and durability (ACID). We will examine each property in detail.
It is common to refer to a transaction as a "unit of work." In describing a transaction as a unit of work, we are describing one fundamental property of a transaction: that the activities within it must be considered indivisible—that is, atomic.
A Flute Bank customer may interact with Flute's ATM and transfer money from a checking account to a savings account. Within the Flute Bank software system, a transfer transaction involves two actions: debit of the checking account and credit to the savings account. For the transfer transaction to be successful, both actions must complete successfully. If either one fails, the transaction fails. The atomic property of transactions dictates that all individual actions that constitute a transaction must succeed for the transaction to succeed, and, conversely, that if any individual action fails, the transaction as a whole must fail.
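The all-or-nothing behavior of the transfer can be sketched in a few lines of Java. This is only an in-memory illustration: the class AtomicTransferSketch, its method names, and the snapshot-and-restore rollback are assumptions made for the example, not part of any Flute Bank system or Java API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory ledger used only to illustrate atomicity. */
final class AtomicTransferSketch {
    private final Map<String, Long> balances = new HashMap<>();

    void open(String account, long cents) { balances.put(account, cents); }

    long balance(String account) { return require(account); }

    /** Debits one account and credits another; on any failure both effects are undone. */
    boolean transfer(String from, String to, long cents) {
        Map<String, Long> snapshot = new HashMap<>(balances); // state to restore on failure
        try {
            debit(from, cents);
            credit(to, cents);
            return true;               // both actions succeeded: keep the changes
        } catch (RuntimeException e) {
            balances.clear();
            balances.putAll(snapshot); // undo any partial work
            return false;
        }
    }

    private void debit(String account, long cents) {
        long current = require(account);
        if (current < cents) throw new IllegalStateException("insufficient funds");
        balances.put(account, current - cents);
    }

    private void credit(String account, long cents) {
        balances.put(account, require(account) + cents);
    }

    private long require(String account) {
        Long value = balances.get(account);
        if (value == null) throw new IllegalArgumentException("unknown account: " + account);
        return value;
    }
}
```

If the credit fails after the debit has been applied, the debit is undone as well, so an observer sees either both changes or neither.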
A database or other persistent store usually defines referential and entity integrity rules to ensure that data in the store is consistent. A transaction that changes the data must ensure that the data remains in a consistent state, that is, that data integrity rules are not violated, regardless of whether the transaction succeeded or failed. The data in the store may be inconsistent while the transaction is in progress, but the inconsistency is invisible to other transactions, and consistency must be restored when the transaction completes.
When multiple transactions are in progress, one transaction may want to read the same data another transaction has changed but not committed. Until the transaction commits, the changes it has made should be treated as transient state, because the transaction could roll back the change. If other transactions read intermediate or transient states caused by a transaction in progress, additional application logic must be executed to handle the effects of some transactions having read potentially erroneous data. The isolation property of transactions dictates how concurrent transactions that act on the same subset of data behave. That is, the isolation property determines the degree to which effects of multiple transactions, acting on the same subset of application state, are isolated from each other.
At the lowest level of isolation, a transaction may read data that is in the process of being changed by another transaction but that has not yet been committed. If the transaction making the change is rolled back, the transaction that read the data will have read a value that was never committed. This level of isolation (read uncommitted, which permits "dirty reads") can cause erroneous results but allows the highest concurrency.
An isolation of read committed ensures that a transaction can read only data that has been committed. This level of isolation is more restrictive (and consequently provides less concurrency) than a read uncommitted isolation level and helps avoid the problem associated with the latter level of isolation.
An isolation level of repeatable read signifies that a transaction that read a piece of data is guaranteed that the data will not be changed by another transaction until the transaction completes. The name "repeatable read" for this level of isolation comes from the fact that a transaction with this isolation level can read the same data repeatedly and be guaranteed to see the same value.
The most restrictive form of isolation is serializable. This level of isolation combines the properties of repeatable-read and read-committed isolation levels, effectively ensuring that transactions that act on the same piece of data are serialized and will not execute concurrently.
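In Java, these four levels correspond to constants defined on java.sql.Connection. The sketch below lists them from least to most restrictive; the enum IsolationLevel and its helper methods are names invented for this illustration, while the TRANSACTION_* constants are the standard JDBC ones.

```java
import java.sql.Connection;

/** The four isolation levels, ordered from least to most restrictive. */
enum IsolationLevel {
    READ_UNCOMMITTED(Connection.TRANSACTION_READ_UNCOMMITTED),
    READ_COMMITTED(Connection.TRANSACTION_READ_COMMITTED),
    REPEATABLE_READ(Connection.TRANSACTION_REPEATABLE_READ),
    SERIALIZABLE(Connection.TRANSACTION_SERIALIZABLE);

    /** The matching java.sql.Connection constant. */
    final int jdbcConstant;

    IsolationLevel(int jdbcConstant) { this.jdbcConstant = jdbcConstant; }

    /** Only the lowest level lets a transaction see uncommitted changes. */
    boolean allowsDirtyReads() { return this == READ_UNCOMMITTED; }

    /** Relies on the declaration order above. */
    boolean atLeastAsRestrictiveAs(IsolationLevel other) { return compareTo(other) >= 0; }
}
```

An application would typically pass the constant to Connection.setTransactionIsolation(level.jdbcConstant) before starting work. Not every driver or database supports all four levels.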
The durability property of transactions refers to the fact that the effect of a transaction must endure beyond the life of a transaction and application. That is, state changes made within a transactional boundary must be persisted onto permanent storage media, such as disks, databases, or file systems. If the application fails after the transaction has committed, the system should guarantee that the effects of the transaction will be visible when the application restarts. Transactional resources are also recoverable: should the persisted data be destroyed, recovery procedures can be executed to recover the data to a point in time (provided the necessary administrative tasks were properly executed). Any change committed by one transaction must be durable until another valid transaction changes the data.
Isolation Levels and Locking
Traditionally, transaction isolation levels are achieved by taking locks on the data that they access until the transaction completes. There are two primary modes for taking locks: optimistic and pessimistic. These two modes are necessitated by the fact that when a transaction accesses data, its intention to change (or not change) the data may not be readily apparent.
Some systems take a pessimistic approach and lock the data so that other transactions may read but not update the data accessed by the first transaction until the first transaction completes. Pessimistic locking guarantees that the first transaction can always apply a change to the data it first accessed.
In an optimistic locking mode, the first transaction accesses data but does not take a lock on it. A second transaction may change the data while the first transaction is in progress. If the first transaction later decides to change the data it accessed, it has to detect the fact that the data is now changed and inform the initiator of the fact. In optimistic locking, therefore, the fact that a transaction accessed data first does not guarantee that it can, at a later stage, update it.
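One common way to detect such a conflicting change is to keep a version counter with the data. The following is a minimal sketch under that assumption; VersionedRecord, Snapshot, and updateIfUnchanged are hypothetical names, and a real system would usually keep the version in a database column rather than in memory.

```java
/** Hypothetical record that detects conflicting updates with a version counter. */
final class VersionedRecord {
    private long value;
    private int version;

    VersionedRecord(long value) { this.value = value; }

    /** Reads the value together with the version it had at read time; no lock is kept. */
    synchronized Snapshot read() { return new Snapshot(value, version); }

    /** Applies the update only if nobody changed the record since the snapshot was taken. */
    synchronized boolean updateIfUnchanged(Snapshot seen, long newValue) {
        if (seen.version != version) {
            return false;          // another transaction got there first; caller must be told
        }
        value = newValue;
        version++;
        return true;
    }

    static final class Snapshot {
        final long value;
        final int version;

        Snapshot(long value, int version) {
            this.value = value;
            this.version = version;
        }
    }
}
```

A transaction that read the record first can still lose the update: if a second transaction updates the record in between, the first transaction's updateIfUnchanged call returns false and the initiator has to retry or abandon the change.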
At the most fundamental level, locks can be classified into (in increasingly restrictive order) shared, update, and exclusive locks. A shared lock signifies that another transaction can take an update or another shared lock on the same piece of data. Shared locks are used when data is read (usually in pessimistic locking mode).
An update lock ensures that another transaction can take only a shared lock on the same data. Update locks are held by transactions that intend to change data (not just read it).
If a transaction locks a piece of data with an exclusive lock, no other transaction may take a lock on the data. The isolation level of a transaction determines which locks it takes: for example, a transaction with an isolation level of read uncommitted does not result in any locks on the data read by the transaction, and a transaction with repeatable read isolation can take only a shared lock on data it has read.
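The compatibility rules described above for shared, update, and exclusive locks can be written down as a small lookup. This is a sketch of the rules exactly as stated in the text; the enum LockMode and the method permits are names chosen for the example, and real lock managers often define additional modes.

```java
/** Lock modes in increasingly restrictive order. */
enum LockMode {
    SHARED, UPDATE, EXCLUSIVE;

    /** True if another transaction may take {@code requested} while this lock is held. */
    boolean permits(LockMode requested) {
        switch (this) {
            case SHARED:
                return requested == SHARED || requested == UPDATE;
            case UPDATE:
                return requested == SHARED;
            default:
                return false;   // EXCLUSIVE: no other lock may be taken
        }
    }
}
```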
Locking to achieve transaction isolation may not be practical for all transactional environments; however, it remains the most common mechanism to achieve transaction isolation.
In a simple Java application that interacts with a database management system (DBMS), the application can demarcate transaction boundaries using explicit SQL commits and rollbacks. A more sophisticated application environment, with multiple transactional resources distributed across a network, requires a dedicated component to manage the complexity of coordinating transactions to completion.
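For the simple single-database case, explicit demarcation with JDBC might look like the sketch below. The JDBC calls (setAutoCommit, commit, rollback) are standard; the class name, the account table, and its columns are assumptions made for the example.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class JdbcTransferSketch {
    // Hypothetical table and column names.
    private static final String DEBIT =
            "UPDATE account SET balance = balance - ? WHERE id = ?";
    private static final String CREDIT =
            "UPDATE account SET balance = balance + ? WHERE id = ?";

    /** Runs both updates inside one local transaction on the given connection. */
    static void transfer(Connection con, long fromId, long toId, long cents)
            throws SQLException {
        boolean previousAutoCommit = con.getAutoCommit();
        con.setAutoCommit(false);              // transaction boundary starts here
        try {
            applyUpdate(con, DEBIT, cents, fromId);
            applyUpdate(con, CREDIT, cents, toId);
            con.commit();                      // make both changes permanent together
        } catch (SQLException | RuntimeException e) {
            con.rollback();                    // undo whatever was done so far
            throw e;
        } finally {
            con.setAutoCommit(previousAutoCommit);
        }
    }

    private static void applyUpdate(Connection con, String sql, long cents, long id)
            throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setLong(1, cents);
            ps.setLong(2, id);
            if (ps.executeUpdate() != 1) {
                throw new SQLException("account not found: " + id);
            }
        }
    }
}
```

This pattern works only while a single connection to a single resource is involved. Once a second database or a message queue takes part, commit and rollback on one connection can no longer guarantee a consistent outcome.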
A transaction manager works with applications and application servers to provide services to control the scope and duration of transactions. A transaction manager also helps coordinate the completion of global transactions across multiple transactional resource managers (e.g., database management systems), provides support for transaction synchronization and recovery, and may provide the ability to communicate with other transaction manager instances.
A transaction context contains information about a transaction. Conceptually, a transaction context is a data structure that contains a unique transaction identifier, a timeout value, and the reference to the transaction manager that controls the transaction scope. In Java applications, a transaction manager associates a transaction context with the currently executing thread. Multiple threads may be associated with the same transaction context, dividing a transaction's work into parallel tasks, if possible. The context also has to be passed from one transaction manager to another if a transaction spans multiple transaction managers.
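The association between a context and a thread can be illustrated with a ThreadLocal. This is a conceptual sketch only: TxContext and TxContextHolder are hypothetical classes, the reference to the controlling transaction manager is omitted for brevity, and real transaction managers keep considerably more state.

```java
import java.util.UUID;

/** Hypothetical transaction context: a unique identifier plus a timeout. */
final class TxContext {
    final String transactionId;
    final int timeoutSeconds;

    TxContext(int timeoutSeconds) {
        this.transactionId = UUID.randomUUID().toString();
        this.timeoutSeconds = timeoutSeconds;
    }
}

/** Keeps track of which context is associated with the calling thread. */
final class TxContextHolder {
    private static final ThreadLocal<TxContext> CURRENT = new ThreadLocal<>();

    /** Creates a new context and associates it with the calling thread. */
    static TxContext begin(int timeoutSeconds) {
        if (CURRENT.get() != null) {
            throw new IllegalStateException("thread already has a transaction");
        }
        TxContext ctx = new TxContext(timeoutSeconds);
        CURRENT.set(ctx);
        return ctx;
    }

    /** Returns the context of the calling thread, or null if there is none. */
    static TxContext current() { return CURRENT.get(); }

    /** Associates an existing context with the calling thread, e.g. a worker thread. */
    static void associate(TxContext ctx) { CURRENT.set(ctx); }

    /** Removes the association for the calling thread. */
    static void clear() { CURRENT.remove(); }
}
```

A worker thread does not see the context automatically; it has to be associated explicitly, which is how several threads can end up sharing one transaction.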
Two separate but interconnected Java specifications pertain to the operation and implementation of Java transaction managers. These are detailed in the next sections.
Transaction Manager versus TP Monitor
Transaction processing (TP) monitors, such as CICS and IMS/DC, enhance the underlying operating system's scalability and its ability to manage large transaction volumes, by taking on some of the roles of the underlying operating system. For example, a TP monitor, in addition to managing transactions, also performs connection pooling and task/thread pooling and scheduling. Transaction management is only one function of a TP monitor. In today's J2EE environment, application servers perform a similar function and may be thought of as modern equivalents of TP monitors.
Two-Phase Commit and Global Transactions
Global transactions span multiple resource managers. To coordinate global transactions, the coordinating transaction manager and all participating resource managers should implement a multiphased completion protocol, such as the two-phase commit (2PC) protocol (Figure 1). Although there are several proprietary implementations of this protocol, X/Open XA is the industry standard. Two distinct phases ensure that either all the participants commit or all of them roll back changes.
During the first, or prepare, phase, the global coordinator inquires if all participants are prepared to commit changes. If all the participants respond in the affirmative (if they feel that the work can be committed), the transaction progresses to the second, or commit, phase, in which all participants are asked to commit changes.
The two-phase commit protocol ensures that either all participants commit changes or none of them does. A simplified explanation follows of how this happens in a typical transaction manager. To keep the discussion brief, we examine only a few failure scenarios. Once a transaction starts, it is said to be in-flight. If a machine or communication failure occurs when the transaction is in-flight, the transaction will be rolled back eventually.
In the prepare phase, participants log their responses (preparedness to commit) to the coordinator, and the state of the transaction for each participant is marked in-doubt. At the end of the prepare phase, the transaction is in-doubt. If a participant cannot communicate with the global coordinator after it is in the in-doubt state, it will wait for resynchronization with the coordinator. If resynchronization cannot take place within a predefined time, the participant may make a heuristic decision either to roll back or commit that unit of work. (Heuristic or arbitrary decisions taken by the participant are a rare occurrence. We emphasize this because it is a conceptual difference between current transaction models and those such as BTP, discussed later in the chapter.)
During the second phase, the coordinator asks all participants to commit changes. The participants log the request and begin commit processing. If a failure occurs during commit processing at one of the participants, the commit is retried when the participant restarts.
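The two phases can be summarized in a short coordinator sketch. The interface Participant and the classes TwoPhaseCoordinator and RecordingParticipant are hypothetical names introduced here; logging, in-doubt resynchronization, retries, and heuristic decisions are deliberately left out.

```java
import java.util.List;

/** Hypothetical participant in a global transaction. */
interface Participant {
    boolean prepare();   // vote: true means "prepared to commit"
    void commit();
    void rollback();
}

final class TwoPhaseCoordinator {
    /** Returns true if every participant committed, false if all were rolled back. */
    static boolean complete(List<? extends Participant> participants) {
        // Phase 1: ask every participant whether it is prepared to commit.
        for (Participant p : participants) {
            boolean vote;
            try {
                vote = p.prepare();
            } catch (RuntimeException e) {
                vote = false;                  // a failure counts as a "no" vote
            }
            if (!vote) {
                for (Participant q : participants) {
                    q.rollback();              // one "no" vote rolls everyone back
                }
                return false;
            }
        }
        // Phase 2: everyone voted yes, so ask everyone to commit.
        for (Participant p : participants) {
            p.commit();
        }
        return true;
    }
}

/** Simple participant that records its final state, used to exercise the sketch. */
final class RecordingParticipant implements Participant {
    private final boolean vote;
    String state = "active";

    RecordingParticipant(boolean vote) { this.vote = vote; }

    @Override public boolean prepare() { return vote; }
    @Override public void commit() { state = "committed"; }
    @Override public void rollback() { state = "rolled back"; }
}
```

With two participants that both vote yes, complete returns true and both end up committed. If either votes no, complete returns false and both end up rolled back.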
Transaction managers can provide transactional support for applications using different implementation models. The most common is the flat transaction model. A transaction manager that follows the flat transaction model does not allow transactions to be nested within other transactions. The flat transaction model can be illustrated by examining the transaction model employed in J2EE application servers today.
As an example, Flute Bank provides a bill payment service. The service is implemented by an EJB that interacts with two other EJBs: the account management EJB (to update account balance) and the check writing EJB (to write out a check to the payee).
Table 1a illustrates the scope of transactions in the scenario where all three EJBs are deployed with declarative transaction attribute of Required.
Table 1b illustrates the scope of transactions where the bill payment service EJB and account management EJB are deployed with transaction attribute Required but the check writing EJB is deployed with a transaction policy of RequiresNew. In this scenario, when the check writing EJB method is executed, the container suspends T1 and starts a new transaction, T2. Check writing occurs within the scope of the second transaction. When control returns to the bill payment EJB, T2 is committed, and the previous transaction is activated. So, in a flat transaction model, two transactions are executed in different scopes—T2's scope is not within T1's scope.
Table 1a All EJBs Deployed with Transaction Attribute Required
Table 1b Check Writing EJB with Transaction Attribute RequiresNew
Transaction sequence: T1 (started, then suspended); T2 (started, then terminated); T1 (resumed, then terminated)
In effect, one business activity (bill payment) is executed under the scope of two separate transactions. In a flat transaction model, the only way to correctly control the scope of transactions that span multiple services is to reach an agreement beforehand on how these services will be combined for a business activity and to apply appropriate transaction policies for the services by agreement.
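The suspend-and-resume behavior of the flat model can be simulated in plain Java. The sketch below is not an EJB container: FlatTxContainer, its Attribute enum, and the log format are inventions for this example that mimic how a container might treat the Required and RequiresNew attributes.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy container that mimics Required and RequiresNew in a flat transaction model. */
final class FlatTxContainer {
    enum Attribute { REQUIRED, REQUIRES_NEW }

    private final List<String> log = new ArrayList<>();
    private String current;   // transaction associated with the caller, or null
    private int counter;

    void invoke(String bean, Attribute attribute, Runnable body) {
        String suspended = null;
        boolean started = false;
        if (attribute == Attribute.REQUIRES_NEW || current == null) {
            suspended = current;                     // null when nothing was active
            if (suspended != null) log.add("suspend " + suspended);
            current = "T" + (++counter);
            log.add("begin " + current);
            started = true;
        }
        log.add(bean + " runs in " + current);
        boolean ok = false;
        try {
            body.run();
            ok = true;
        } finally {
            if (started) {                           // only the starter ends the transaction
                log.add((ok ? "commit " : "rollback ") + current);
                current = suspended;
                if (suspended != null) log.add("resume " + suspended);
            }
        }
    }

    List<String> log() { return log; }
}
```

Invoking a bill payment bean with REQUIRED, which in turn invokes account management with REQUIRED and check writing with REQUIRES_NEW, produces a log in which T2 begins and commits while T1 is suspended. The outcome of T2 is therefore independent of the outcome of T1.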
What if the check writing service were created and hosted by a different company? What if the check writing company did not want its deployment attributes to be dictated by Flute Bank? What if other customers (Flute Bank's competitor) of the check writing company wanted a contradictory transaction policy? Would it not be in Flute Bank's interest to have its bill payment transaction control the outcome of the business activity, regardless of whether the check writing company decided to start a new transaction or not?
A nested transaction model, shown in Figure 2, is one solution to the above problem. It allows transactions to consist of other transactions: a top-level transaction may contain subtransactions. In a nested transaction model, with respect to the above example, the bill payment service would start a top-level transaction (t). Both the account management service and the check writing service are free to start new transactions (ts1 and ts2). But both these subtransactions are within the scope of the top-level transaction (t). Transactions in a nested transaction model also adhere to the ACID properties of transactions—that is, t completes successfully only if ts1 and ts2 both complete successfully. If either subtransaction fails, the top-level transaction would fail, thereby guaranteeing atomicity.
A nested transaction model allows services to be built independently and later combined into applications. Each service can determine the scope of its transaction boundaries. The application or service that orchestrates the combination of services controls the scope of the top-level transaction.
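A minimal sketch of this model follows, using the rule stated above that the top-level transaction succeeds only if every subtransaction has succeeded. The class NestedTx and its methods are hypothetical; note that a subtransaction's commit is treated as provisional until the top-level transaction commits.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical nested transaction: a parent succeeds only if all children did. */
final class NestedTx {
    enum Status { ACTIVE, COMMITTED, ROLLED_BACK }

    final String name;
    private final List<NestedTx> children = new ArrayList<>();
    private Status status = Status.ACTIVE;

    private NestedTx(String name) { this.name = name; }

    static NestedTx topLevel(String name) { return new NestedTx(name); }

    /** Starts a subtransaction within the scope of this transaction. */
    NestedTx beginSub(String subName) {
        NestedTx child = new NestedTx(subName);
        children.add(child);
        return child;
    }

    Status status() { return status; }

    /** Commits only if every subtransaction has committed; otherwise rolls back. */
    boolean commit() {
        if (status != Status.ACTIVE) return false;
        for (NestedTx child : children) {
            if (child.status != Status.COMMITTED) {
                rollback();
                return false;
            }
        }
        status = Status.COMMITTED;
        return true;
    }

    /** Rolls back this transaction and everything nested within it. */
    void rollback() {
        status = Status.ROLLED_BACK;
        for (NestedTx child : children) {
            child.rollback();   // a child's earlier commit was only provisional
        }
    }
}
```

If ts1 commits but ts2 rolls back, the commit of the top-level transaction fails and the work of ts1 is rolled back with it, so the business activity as a whole stays atomic.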
Java Transaction API (JTA)
In a J2EE environment, the transaction manager has to communicate with the application server, the application program, and the resource managers, using a well-defined and standard API. The Java Transaction API (JTA) is defined precisely for this purpose. It does not specify how the transaction manager itself has to be implemented but how components external to the transaction manager communicate with it. The JTA defines a set of high-level interfaces that describe the contract between the transaction manager and three application components that interact with it: the application program, resource manager, and application server. These are described below. The next section describes the Java Transaction Service (JTS) specification—a related specification that deals with the implementation details of transaction managers.
JTA contract between transaction managers and application programs. In J2EE applications, a client application or server EJB component whose transaction policy is managed in the bean code (TX_BEAN_MANAGED) can explicitly demarcate (i.e., start and stop) transactions. For this purpose, the application gets a reference to a user transaction object that implements the javax.transaction.UserTransaction interface. This interface defines methods that allow the application, among other things, to begin, commit, or roll back a transaction. The transaction that is started or stopped is associated with the calling user thread.
In a Java client application, the UserTransaction object is obtained by looking up the application server's JNDI-based registry. There is no standard JNDI name for storing the UserTransaction reference, so the client application must know beforehand (usually by reading a configuration file) how to obtain the UserTransaction object from the application server's JNDI registry. In an EJB, the UserTransaction object is exposed through the EJBContext.
JTA contract between transaction managers and application servers. J2EE containers are required to support container-managed transactions. The container demarcates transactions and manages thread and resource pooling. This requires the application server to work closely with the transaction manager. For example, in an EJB where the container has to manage transaction demarcation, the EJB container has to communicate with the transaction manager to start, commit, and suspend transactions. The J2EE application server and transaction manager communicate via the javax.transaction.TransactionManager interface.
JTA contract between transaction managers and transactional resource managers. Resource managers provide an application access to resources, which can be databases, JMS queues, or any other transactional resource. An example of a global transaction is the one that changes a Flute Bank employee's address: it involves updating the employee master database (Oracle) and posting a message to a JMS queue (MQSeries) for the external payroll company (because payroll taxes can change based on employee address). For the transaction manager to coordinate and synchronize the transaction with different resource managers (in this example, Oracle and MQSeries), it must use a standards-based, well-known transaction protocol that both resource managers understand (such as X/Open XA).
JTA defines the javax.transaction.xa.XAResource interface, which allows a transactional resource manager, such as a DBMS or JMS implementation, to participate in a global transaction. The XAResource interface is a Java mapping of the industry-standard X/Open XA. J2EE applications communicate to the resource managers via a resource adapter (e.g., JDBC connection), and JDBC 2.0 extensions support distributed transactions. JDBC 2.0 provides two interfaces that support JTA-compliant resources: the javax.sql.XAConnection and javax.sql.XADataSource.
Similarly, JMS providers implement the javax.jms.XAConnection and the javax.jms.XASession interfaces in supporting JTA transaction managers.
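To give a feel for how a transaction manager might drive resources through the XAResource interface at completion time, consider the following simplified sketch. The XAResource, Xid, and XAException types and the constants used are the standard ones from javax.transaction.xa; the classes SimpleXid and XaCompletionSketch are hypothetical. As a simplifying assumption, one Xid is shared by all resources, whereas a real transaction manager would normally give each resource its own branch and would also handle logging and recovery.

```java
import java.util.List;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

/** Minimal Xid implementation; a real transaction manager supplies its own. */
final class SimpleXid implements Xid {
    private final int formatId;
    private final byte[] globalId;
    private final byte[] branch;

    SimpleXid(int formatId, byte[] globalId, byte[] branch) {
        this.formatId = formatId;
        this.globalId = globalId.clone();
        this.branch = branch.clone();
    }

    @Override public int getFormatId() { return formatId; }
    @Override public byte[] getGlobalTransactionId() { return globalId.clone(); }
    @Override public byte[] getBranchQualifier() { return branch.clone(); }
}

final class XaCompletionSketch {
    /** Ends the work on each resource, prepares all, then commits all or rolls back. */
    static boolean complete(Xid xid, List<? extends XAResource> resources)
            throws XAException {
        int n = resources.size();
        for (int i = 0; i < n; i++) {
            resources.get(i).end(xid, XAResource.TMSUCCESS);  // work is finished
        }
        boolean[] readOnly = new boolean[n];
        try {
            for (int i = 0; i < n; i++) {
                // XA_RDONLY means the resource changed nothing and needs no phase two.
                readOnly[i] = resources.get(i).prepare(xid) == XAResource.XA_RDONLY;
            }
        } catch (XAException prepareFailed) {
            for (int i = 0; i < n; i++) {
                if (readOnly[i]) continue;
                try {
                    resources.get(i).rollback(xid);
                } catch (XAException ignored) {
                    // The resource may already have rolled back on its own.
                }
            }
            return false;
        }
        for (int i = 0; i < n; i++) {
            if (!readOnly[i]) {
                resources.get(i).commit(xid, false);  // false: two-phase commit
            }
        }
        return true;
    }
}
```

The application itself never calls these methods. It works with connections obtained from the resource adapter, while the transaction manager and application server enlist the underlying XAResource objects and drive them to completion.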
Java Transaction Service
The JTA specification's main purpose is to define how a client application, the application server, and the resource managers communicate with the transaction manager. Because JTA provides interfaces that map to X/Open standards, a JTA-compliant transaction manager can control and coordinate a transaction that spans multiple resource managers (distributed transactions). JTA does not specify how a transaction manager is to be implemented, nor does it address how multiple transaction managers communicate with each other to participate in the same transaction. That is, it does not specify how a transaction context can be propagated from one transaction manager to another. The Java Transaction Service (JTS) specification addresses these concepts.
JTS specifies the implementation contracts for Java transaction managers. It is the Java mapping of the CORBA Object Transaction Service (OTS) 1.1 specification. The OTS specification defines a standard mechanism for generating and propagating a transaction context between transaction managers, using the IIOP protocol. JTS uses the OTS interfaces (primarily org.omg.CosTransactions and org.omg.CosTSPortability) for interoperability and portability. Because JTS is based on OTS, it is not unusual to find implementations that support transactions propagating from non-Java CORBA clients as well.
Within a JTS-compliant transaction manager implementation, a communication resource manager component handles incoming and outgoing transaction requests. Details of the interaction between the application server and a JTS transaction manager can be found in the JTS specification document.
In addition to being a Java mapping of OTS, the JTS specification also mandates that a JTS-compliant transaction manager implement all JTA interfaces—that is, a JTS-based transaction manager interfaces with an application program, the application server, and resource managers, using the JTA. Figure 3 shows the relationship between the application components involved in a transaction.