Java EE 5 Performance Management and Optimization

Friday Jun 30th 2006 by Steven Haines

Make performance and optimization a priority in building your application instead of an afterthought.

A discussion held during a recent client visit...

“Okay, I understand how to gather metrics, but now what do I do with them?” John asked, looking confounded. “If I have application response time, instrumentation, and application server metrics, what should I have my developers do to ensure that the next deployment will be successful?”

“That is a very good question. At its core, it involves a change of mind-set by your entire development organization, geared toward performance. You’ll most likely feel resistance from your developers, but if they follow these steps and embrace performance testing from the outset, then you’ll better your chances of success more than a hundredfold,” I said.

“I can deal with upset developers,” John responded. “The important thing is that the application meets performance criteria when it goes live. I’ll make sure that they follow the proper testing procedures; they have to understand the importance of application performance. I just can’t face the idea of calling the CEO and telling him that we failed again!”

“Don’t worry, I’ve helped several customers implement this methodology into their development life cycle, and each one has been successful. It is a discipline that, once adopted, becomes second nature. The key is to get started now!”

“Tell me more,” John stated calmly, in contrast with his stressed demeanor. I knew that John had seen the light and was destined for success in the future.

Performance Overview

All too often in application development, performance is an afterthought. I once worked for a company that fully embraced the Rational Unified Process (RUP) but took it to an extreme. The application the company built spent years in architecture, and the first of ten iterations took nearly nine months to complete. The company learned much through its efforts and became increasingly efficient in subsequent iterations, but one thing that the organization did not learn until very late in the game was the importance of application performance. In the last couple of iterations, it started implementing performance testing and learned that part of the core architecture was flawed—specifically, the data model needed to be rearchitected. Because object models are built on top of data models, the object model also had to change. In addition, all components that interact with the object model had to change, and so on. Finally, the application had to go through another lengthy QA cycle that uncovered new bugs as well as the reemergence of former bugs.

That company learned the hard way that the later in the development life cycle performance issues are identified, the more expensive they are to fix. Figure 1 illustrates this idea graphically. You can see that a performance issue identified during the application’s development is inexpensive to fix, but one found later can cause the cost to balloon. Thus, you must ensure the performance of your application from the early stages of its architecture and test it at each milestone to preserve your efforts.

Figure 1. The relationship between the time taken to identify performance issues and the repair costs

A common theme has emerged from the customer sites I visit where few or no performance issues are identified: these customers kept the performance of the application in mind when designing its architecture. At these engagements, the root causes of most problems were related to load or application server configuration; the applications themselves had very few problems.

This is the first in a series of three articles that formalizes the methodology you should implement to ensure the performance of your application at each stage of development, QA, and deployment. I have helped customers implement this methodology into their organizations and roll out their applications to production successfully.

Performance in Architecture

The first step in developing any application of consequence is to perform an architectural analysis of a business problem domain. To review, application business owners work with application technical owners to define the requirements of the system. Application business owners are responsible for ensuring that when the application is complete it meets the needs of the end users, while application technical owners are responsible for determining the feasibility of options and defining the best architecture to solve the business needs. Together, these two groups design the functionality of the application.

In most organizations, the architecture discussions end at this analysis stage; the next step is usually the design of the actual solution. And this stage is where the architectural process needs to be revolutionized. Specifically, these groups need to define intelligent SLAs for each use case, they need to define the life cycles of major objects, and they need to address requirements for sessions.


An intelligent SLA maintains three core traits. It is

  • Reasonable
  • Specific
  • Flexible

An SLA must satisfy end-user expectations but still be reasonable enough to be implemented. An unreasonable SLA will be ignored by all parties until end users complain. This is why SLAs need to be defined by both the application business owner and the application technical owner: the business owner pushes for the best SLAs for the users, while the technical owner impresses upon the business owner what the business requirement realistically entails. If the business requirement cannot be satisfied in a way acceptable to the application business owner, then the application technical owner needs to present all options and the cost of each (in terms of effort). The business requirement may need to be changed or divided into subprocesses that can be satisfied reasonably.

An intelligent SLA needs to be specific and measurable. In this requirement, you are looking for a hard and fast number, not a statement such as “The search functionality will respond within a reasonable user tolerance threshold.” How do you test “reasonable”? You need to remove all subjectivity from this exercise. After all, what is the point in defining an SLA if you cannot verify it?

Finally, an intelligent SLA needs to be flexible. It needs to account for variations in behavior as a result of unforeseen factors, but define a hard threshold for how flexible it is allowed to be. For example, an SLA may read “The search functionality will respond within three seconds (specific) for 95 percent of requests (flexible).” The occasional seven-second response time is acceptable, as long as the integrity of the application is preserved—it responds well most of the time. By defining concrete values for the specific value as well as the limitations of the flexible value, you can quantify what “most of the time” means to the performance of the application, and you have a definite value with which to evaluate and verify the SLA.
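As a minimal sketch of how an SLA of this shape could be verified, the following Java class computes the fraction of measured response times that fall within the threshold and compares it with the required percentage. The class and method names are hypothetical, the sample values are invented for illustration, and counting a response that lands exactly on the threshold as passing is an assumption.

```java
import java.util.Arrays;

/** Checks a "within T ms for P percent of requests" style SLA against measured samples. */
class SlaPercentileCheck {

    /** Returns the fraction (0.0 to 1.0) of samples at or below the threshold. */
    static double fractionWithin(long[] responseTimesMs, long thresholdMs) {
        if (responseTimesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long within = Arrays.stream(responseTimesMs)
                .filter(t -> t <= thresholdMs)
                .count();
        return (double) within / responseTimesMs.length;
    }

    /** True when at least requiredFraction of the samples meet the threshold. */
    static boolean meetsSla(long[] responseTimesMs, long thresholdMs, double requiredFraction) {
        return fractionWithin(responseTimesMs, thresholdMs) >= requiredFraction;
    }

    public static void main(String[] args) {
        // Hypothetical measurements: 19 fast responses and one seven-second outlier.
        long[] samples = new long[20];
        Arrays.fill(samples, 1200L);
        samples[19] = 7000L;

        System.out.println("Fraction within 3 s: " + fractionWithin(samples, 3000L));
        System.out.println("Meets SLA: " + meetsSla(samples, 3000L, 0.95));
    }
}
```

With these invented samples, the single seven-second response is tolerated because 19 of the 20 requests still meet the three-second threshold; a second slow response would push the result below 95 percent.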


Although you define specific performance criteria and a measure of flexibility, defining either a hard upper limit of tolerance or a relative upper limit is also a good idea. I prefer to specify a relative upper limit, measured in the number of standard deviations from the mean. The reason for defining an SLA in this way is that, on paper, a three-second response time for 95 percent of requests is tolerable, but that figure says nothing about drastically divergent response times among the remaining requests, such as a 30-second response. Statistically, such extremes should be rare, but an upper limit is a good safeguard against them.
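The relative upper limit can be sketched in the same way: compute the mean and standard deviation of the measured response times, then report how many standard deviations the worst sample lies from the mean. This is an illustrative sketch only; the class name is hypothetical, the sample data is invented, and the use of the population standard deviation (dividing by n rather than n - 1) is an assumption.

```java
/** Measures how far response times stray from their mean, in standard deviations. */
class SlaDeviationCheck {

    static double mean(double[] samples) {
        double sum = 0.0;
        for (double s : samples) {
            sum += s;
        }
        return sum / samples.length;
    }

    /** Population standard deviation (divides by n); an assumption of this sketch. */
    static double standardDeviation(double[] samples) {
        double m = mean(samples);
        double squared = 0.0;
        for (double s : samples) {
            squared += (s - m) * (s - m);
        }
        return Math.sqrt(squared / samples.length);
    }

    /** Largest distance of any sample from the mean, expressed in standard deviations. */
    static double maxDeviations(double[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double m = mean(samples);
        double sd = standardDeviation(samples);
        if (sd == 0.0) {
            return 0.0; // all samples identical, so nothing strays from the mean
        }
        double worst = 0.0;
        for (double s : samples) {
            worst = Math.max(worst, Math.abs(s - m) / sd);
        }
        return worst;
    }

    static boolean withinLimit(double[] samples, double maxAllowedDeviations) {
        return maxDeviations(samples) <= maxAllowedDeviations;
    }

    public static void main(String[] args) {
        // Hypothetical response times in seconds: mostly 2 s, with one 30 s response.
        double[] samples = new double[20];
        java.util.Arrays.fill(samples, 2.0);
        samples[19] = 30.0;

        System.out.println("Worst sample, in standard deviations: " + maxDeviations(samples));
        System.out.println("Within two standard deviations: " + withinLimit(samples, 2.0));
    }
}
```

A percentage-based check alone would accept this data set, since 95 percent of the requests are fast; the deviation check is what flags the 30-second response.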

An important aspect of defining intelligent SLAs is tracking them. The best way to do this is to integrate them into your application use cases. A use case is built from a general thought, such as “The application must provide search functionality for its patient medical records,” but then the use case is divided into scenarios. Each scenario defines a path that the use case may follow given varying user actions. For example, what does the application do when the patient exists? What does it do when the patient does not exist? What if the search criterion returns more than one patient record? Each of these business processes needs to be explicitly called out in the use case, and each needs to have an SLA associated with it. The following exercise demonstrates the format that a proper use case containing intelligent SLAs should follow.


Use Case

The Patient Management System must provide functionality to search for specific patient medical history information.

Scenarios

Scenario 1: The Patient Management System returns one distinct record.
Scenario 2: The Patient Management System returns more than one match.
Scenario 3: The Patient Management System does not find any users meeting the specified criteria.

Preconditions

The user has successfully logged in to the application.

Triggers

The user enters search criteria and submits data using the Web interface.

Description

Scenario 1:
1. The Patient Management
2. . . .

Scenario 2:
3. . . .

Postconditions

The Patient Management System displays the results to the user.

SLAs

Scenario 1: The Patient Management System will return a specific patient matching the specified criteria in less than three seconds for 95 percent of requests. The response time will at no point stray more than two standard deviations from the mean.

Scenario 2: The Patient Management System will return a collection of patients matching the specified criteria in less than five seconds for 95 percent of requests. The response time will at no point stray more than two standard deviations from the mean.

Scenario 3: When the Patient Management System cannot find a user matching the specified criteria, it will inform the user in less than two seconds for 95 percent of requests. The response time will at no point stray more than two standard deviations from the mean.

The format of this use case varies from traditional use cases with the addition of the SLA component. In the SLA component, you explicitly call out the performance requirements for each scenario. The performance criteria include the following:

  • The expected tolerance level: Respond in less than three seconds.
  • The measure of flexibility: Meet the tolerance level for 95 percent of requests.
  • The upper threshold: Do not stray more than two standard deviations from the observed mean.

With each of these performance facets explicitly defined, the developers implementing code to satisfy the use case understand what is expected of them and can structure unit tests accordingly. The QA team has a specific value to test and measure the quality of the application against. Next, when the QA team, or a delegated performance capacity assessor, performs a formal capacity assessment, an extremely accurate assessment can be built and a proper degradation model constructed. Finally, when the application reaches production, enterprise Java system administrators have values from which to determine if the application is meeting its requirements.
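One way a developer might carry the use case SLAs into a unit test is sketched below: each scenario's threshold and required percentage is recorded as data, and a small harness times an operation repeatedly and checks the result against the SLA. All names here are hypothetical, the timed operation is a trivial stand-in for the real search call, and a timing loop on a developer machine is only a rough proxy for a formal capacity assessment.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** One scenario's SLA: a threshold and the fraction of requests that must beat it. */
class ScenarioSla {
    final long thresholdMs;
    final double requiredFraction;

    ScenarioSla(long thresholdMs, double requiredFraction) {
        this.thresholdMs = thresholdMs;
        this.requiredFraction = requiredFraction;
    }

    /** True when enough samples finished in less than the threshold. */
    boolean isMetBy(long[] elapsedMs) {
        if (elapsedMs.length == 0) {
            return false;
        }
        int within = 0;
        for (long e : elapsedMs) {
            if (e < thresholdMs) {
                within++;
            }
        }
        return (double) within / elapsedMs.length >= requiredFraction;
    }
}

class SlaTimingHarness {

    /** Runs the operation the given number of times and returns each elapsed time in ms. */
    static long[] time(Runnable operation, int runs) {
        long[] elapsed = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            operation.run();
            elapsed[i] = (System.nanoTime() - start) / 1_000_000L;
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Thresholds taken from the use case above; the labels are shortened for the sketch.
        Map<String, ScenarioSla> slas = new LinkedHashMap<>();
        slas.put("Scenario 1: one distinct record", new ScenarioSla(3000L, 0.95));
        slas.put("Scenario 2: more than one match", new ScenarioSla(5000L, 0.95));
        slas.put("Scenario 3: no match", new ScenarioSla(2000L, 0.95));

        // Stand-in for the real search call, which is not part of this sketch.
        Runnable standInSearch = () -> Math.sqrt(42.0);

        for (Map.Entry<String, ScenarioSla> entry : slas.entrySet()) {
            long[] elapsed = time(standInSearch, 100);
            System.out.println(entry.getKey() + " met: " + entry.getValue().isMetBy(elapsed));
        }
    }
}
```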

All of this specific assessment is possible because the application business owner and application technical owner took time to carefully determine these values in the architecture phase. My aim here is to impress upon you the importance of up-front research and a solid communication channel between the business and technical representatives.

Object Life Cycle Management

The most significant problem plaguing production enterprise Java applications is memory management. The root cause of 90 percent of my customers’ problems is memory related and can manifest in one of two ways:

  • Object cycling
  • Loitering objects (lingering object references)

Recall that object cycling is the rapid creation and deletion of objects in a short period of time, which increases the frequency of garbage collection and may result in short-lived objects being tenured prematurely. Loitering objects are caused by poor object management: the application developer does not know exactly when an object should be released from memory, so the reference is maintained longer than necessary. This reflects a failure to understand the impact of reference management on application performance. The condition results in an overabundance of objects residing in memory, which can have the following effects:

  • Garbage collection may run slower, because more live objects must be examined.
  • Garbage collection can become less effective at reclaiming objects.
  • Swapping on the physical machine can result, because less physical memory is available for other processes to use.
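To make the idea of a loitering object concrete, the sketch below shows a long-lived registry that keeps a reference to every per-request buffer it is given. When the code forgets to remove a buffer after the request completes, the buffer stays reachable and cannot be garbage collected. The class names and buffer size are invented for illustration and do not come from any particular application.

```java
import java.util.ArrayList;
import java.util.List;

/** A hypothetical per-request object that holds a sizable buffer. */
class SearchResultBuffer {
    final byte[] data = new byte[64 * 1024];
}

/** A long-lived registry; anything it still references cannot be garbage collected. */
class ActiveSearches {
    private final List<SearchResultBuffer> active = new ArrayList<>();

    void begin(SearchResultBuffer buffer) {
        active.add(buffer);
    }

    void end(SearchResultBuffer buffer) {
        active.remove(buffer);
    }

    int retained() {
        return active.size();
    }
}

class LoiteringDemo {

    /** Simulates requests; when release is false, every buffer loiters in the registry. */
    static int runRequests(ActiveSearches registry, int requests, boolean release) {
        for (int i = 0; i < requests; i++) {
            SearchResultBuffer buffer = new SearchResultBuffer();
            registry.begin(buffer);
            // ... the request would be processed here ...
            if (release) {
                registry.end(buffer); // reference released at the correct time
            }
        }
        return registry.retained();
    }

    public static void main(String[] args) {
        System.out.println("Retained without release: "
                + runRequests(new ActiveSearches(), 100, false));
        System.out.println("Retained with release: "
                + runRequests(new ActiveSearches(), 100, true));
    }
}
```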

Neglecting object life cycle management can result in memory leaks and eventually application server crashes. I discuss techniques for detecting and avoiding object cycling later in this article, because it is a development or design issue, but object life cycle management is an architectural issue.

To avoid loitering objects, take control of the management of object life cycles by defining object life cycles inside use cases. I am not advocating that each use case should define every int, boolean, and float that will be created in the code to satisfy the use case; rather, each use case needs to define the major application-level components upon which it depends. For example, in the Patient Management System, daily summary reports may be generated every evening that detail patient metrics such as the number of cases of heart disease identified this year and the common patient profile attributes for each. This report would be costly to build on a per-request basis, so the architects of the system may dictate that the report needs to be cached at the application level (or in the application scope so that all requests can access it).
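A minimal sketch of such an application-level cache follows: the report is built once, on first use, and the same instance is then served to every request. The class name and report content are hypothetical. In a servlet container, a holder like this would typically be stored in the application scope (the ServletContext) rather than created ad hoc as it is here.

```java
import java.util.function.Supplier;

/** Holds one expensive, shared report for the whole application. */
class ApplicationScopedReport {
    private final Supplier<String> builder;
    private volatile String cached;

    ApplicationScopedReport(Supplier<String> builder) {
        this.builder = builder;
    }

    /** Builds the report on first use, then serves the same instance to every caller. */
    String get() {
        String report = cached;
        if (report == null) {
            synchronized (this) {
                report = cached;
                if (report == null) {
                    report = builder.get(); // the costly step runs only once
                    cached = report;
                }
            }
        }
        return report;
    }
}

class ApplicationScopedReportDemo {
    public static void main(String[] args) {
        ApplicationScopedReport report =
                new ApplicationScopedReport(() -> "Daily patient summary (hypothetical content)");
        System.out.println(report.get());
        System.out.println("Same instance on second call: " + (report.get() == report.get()));
    }
}
```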

Defining use case dependencies and application-level object life cycles provides a deeper understanding of what should and should not be in the heap at any given time. Here are some guidelines to help you identify application-level objects that need to be explicitly called out and mapped to use cases in a dependency matrix:

  • Expensive objects, in terms of both allocated size as well as allocation time, that will be accessed by multiple users
  • Commonly accessed data
  • Nontemporal user session objects
  • Global counters and statistics management objects
  • Global configuration options

The most common examples of application-level object candidates are frequently accessed business objects, such as those stored in a cache. If your application uses entity beans, then you need to carefully determine the size of the entity bean cache by examining use cases; this can be extrapolated to apply to any caching infrastructure. The point is that if you are caching data in the heap to satisfy specific use cases, then you need to determine how much data is required to satisfy the use cases. And if anyone questions the memory footprint, then you can trace it directly back to the use cases.
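The traceability described above can be kept very simply, for example as a budget that records how many cached entries each use case needs and roughly how large each entry is. The sketch below is illustrative only: the class is hypothetical, and the entry counts and per-entry sizes in the example are invented numbers, not measurements.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Records the heap each use case is expected to need for cached data. */
class CacheBudget {
    private final Map<String, Long> bytesByUseCase = new LinkedHashMap<>();

    /** Adds a requirement of entries * bytesPerEntry for the named use case. */
    void require(String useCase, long entries, long bytesPerEntry) {
        long bytes = Math.multiplyExact(entries, bytesPerEntry);
        bytesByUseCase.merge(useCase, bytes, Long::sum);
    }

    long totalBytes() {
        long total = 0L;
        for (long bytes : bytesByUseCase.values()) {
            total += bytes;
        }
        return total;
    }

    /** Read-only view used to trace the footprint back to individual use cases. */
    Map<String, Long> breakdown() {
        return Collections.unmodifiableMap(bytesByUseCase);
    }
}

class CacheBudgetDemo {
    public static void main(String[] args) {
        CacheBudget budget = new CacheBudget();
        // Invented figures: 10,000 cached patient records of about 2 KB each,
        // plus one summary report of about 5 MB.
        budget.require("Patient search", 10_000L, 2_048L);
        budget.require("Daily summary report", 1L, 5_000_000L);

        System.out.println(budget.breakdown());
        System.out.println("Total bytes: " + budget.totalBytes());
    }
}
```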

The other half of the practice of object life cycle management is defining when objects should be removed from memory. In the previous example, the medical summary report is updated every evening, so at that point the old report should be removed from memory to make room for the new report. Knowing when to remove objects is probably more important than knowing when to create objects. If an object is not already in memory, then you can create it, but if it is in memory and no one needs it anymore, then that memory is lost for as long as the reference is held.
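The removal half of the life cycle can be sketched as a holder that swaps in the new report and, in the same step, drops its reference to the old one so that the old report becomes eligible for garbage collection once no request is still using it. The class names are hypothetical, and the nightly job that would call the swap is assumed rather than shown.

```java
import java.util.concurrent.atomic.AtomicReference;

/** A hypothetical summary report; the fields are placeholders for real report data. */
class SummaryReport {
    final String generatedOn;
    final String body;

    SummaryReport(String generatedOn, String body) {
        this.generatedOn = generatedOn;
        this.body = body;
    }
}

/** Keeps exactly one report reachable at application level. */
class CurrentReportHolder {
    private final AtomicReference<SummaryReport> current = new AtomicReference<>();

    SummaryReport current() {
        return current.get();
    }

    /**
     * Installs the fresh report and returns the previous one. After this call the
     * holder no longer references the old report, so it can be collected as soon
     * as no in-flight request still holds it.
     */
    SummaryReport replaceWith(SummaryReport fresh) {
        return current.getAndSet(fresh);
    }
}

class CurrentReportHolderDemo {
    public static void main(String[] args) {
        CurrentReportHolder holder = new CurrentReportHolder();
        holder.replaceWith(new SummaryReport("day 1", "hypothetical metrics"));

        // The evening job (assumed, not shown) would make a call like this one.
        SummaryReport old = holder.replaceWith(new SummaryReport("day 2", "hypothetical metrics"));

        System.out.println("Replaced report from: " + old.generatedOn);
        System.out.println("Current report from: " + holder.current().generatedOn);
    }
}
```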

Application Session Management

Just as memory mismanagement is the most prevalent issue impacting the performance of enterprise Java applications, HTTP sessions are by far the biggest culprit in memory abuse. HTTP is a stateless protocol, and as such the conversation between the Web client and Web server terminates at the conclusion of a single request: the Web client submits a request to the Web server (most commonly GET or POST), and then the Web server performs its business logic, constructs a response, and returns the response to the Web client. This ends the Web conversation and terminates the relationship between client and server.

In order to sustain a long-term conversation between a Web client and Web server, the Web server constructs a unique identifier for the client and includes it with its response to the request; internally the Web server maintains all user data and associates it with that identifier. On subsequent requests, the client submits this unique identifier to identify itself to the Web server.
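The mechanism can be sketched as a server-side map from a generated identifier to that client's data. This is a simplified illustration with hypothetical names, not how any particular application server implements it; in a servlet container the equivalent work is done for you by the HttpSession facility.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Associates each client identifier with the data the server keeps for that client. */
class SessionRegistry {
    private final Map<String, Map<String, Object>> sessions = new ConcurrentHashMap<>();

    /** Creates an empty session and returns the identifier to send back to the client. */
    String create() {
        String id = UUID.randomUUID().toString();
        sessions.put(id, new ConcurrentHashMap<>());
        return id;
    }

    /** Returns the data for the identifier the client presented, or null if unknown. */
    Map<String, Object> lookup(String id) {
        return sessions.get(id);
    }

    int size() {
        return sessions.size();
    }
}

class SessionRegistryDemo {
    public static void main(String[] args) {
        SessionRegistry registry = new SessionRegistry();

        // First request: the server creates the identifier and returns it with the response.
        String id = registry.create();
        registry.lookup(id).put("lastPage", "search");

        // Subsequent request: the client submits the identifier to identify itself.
        System.out.println("Last page: " + registry.lookup(id).get("lastPage"));
        System.out.println("Unknown client: " + registry.lookup("not-a-real-id"));
    }
}
```

Note that nothing in this sketch ever removes an entry, which is exactly the problem the next paragraphs address.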

This sounds like a good idea, but it creates the following problem: if the HTTP protocol is truly stateless and the conversation between Web client and Web server can only be renewed by a client interaction, then what does the Web server do with the client’s information if that client never returns? Obviously, the Web server throws the information away, but the real question relates to how long the Web server should keep the information.

All application servers provide a session time-out value that constrains the amount of time user data is maintained. When the user makes any request to the server, the user’s time-out is reset, and once the time-out has been exceeded, the user’s stateful information is discarded. A practical example of this is logging in to your online banking application. You can view your account balances, transfer funds, and pay bills, but if you sit idle for too long, you are forced to log in again. The session time-out period for a banking application is usually quite short for security reasons (for example, if you log in to your bank account and then leave your computer unattended to go to a meeting, you do not want someone else who wanders by your desk to be able to access your account). On the other hand, when you shop at some online retailers, you can add items to your shopping cart and return six months later to see that old book on DNA synthesis and methylation that you still do not have time to read sitting there. Such a retailer uses a more advanced infrastructure to support this feature (and a heck of a lot of hardware and memory), but the question remains: how long should you hold on to data between user requests before discarding it?
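The time-out behavior described here, where every request resets the clock and idle sessions are eventually discarded, can be sketched as follows. The class is hypothetical and takes its clock as a parameter so that the behavior can be demonstrated without waiting; a real application would rely on the container's own setting, such as HttpSession.setMaxInactiveInterval or the session-timeout element in web.xml, rather than code like this.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Tracks the last access time of each session and discards idle ones. */
class ExpiringSessions {
    private final long timeoutMs;
    private final LongSupplier clock;
    private final Map<String, Long> lastAccessMs = new HashMap<>();

    ExpiringSessions(long timeoutMs, LongSupplier clock) {
        this.timeoutMs = timeoutMs;
        this.clock = clock;
    }

    /** Called on every request; resets the session's time-out. */
    void touch(String sessionId) {
        lastAccessMs.put(sessionId, clock.getAsLong());
    }

    boolean isActive(String sessionId) {
        Long last = lastAccessMs.get(sessionId);
        return last != null && clock.getAsLong() - last <= timeoutMs;
    }

    /** Discards every session idle for longer than the time-out; returns how many. */
    int purgeExpired() {
        long now = clock.getAsLong();
        int before = lastAccessMs.size();
        lastAccessMs.values().removeIf(last -> now - last > timeoutMs);
        return before - lastAccessMs.size();
    }
}

class ExpiringSessionsDemo {
    public static void main(String[] args) {
        // A hand-advanced clock stands in for wall-clock time in this sketch.
        long[] now = {0L};
        ExpiringSessions sessions = new ExpiringSessions(15 * 60 * 1000L, () -> now[0]);

        sessions.touch("banking-user");
        now[0] += 10 * 60 * 1000L; // ten minutes idle
        System.out.println("Active after 10 minutes: " + sessions.isActive("banking-user"));

        now[0] += 20 * 60 * 1000L; // a further twenty minutes idle
        System.out.println("Active after 30 minutes: " + sessions.isActive("banking-user"));
        System.out.println("Sessions purged: " + sessions.purgeExpired());
    }
}
```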

The definitive time-out value must come from the application business owner. He or she may have specific, legally binding commitments with end users and business partners. But an application technical owner can control the quantity of data that is held resident in memory for each user. In the aforementioned example, do you think that the retailer maintains everyone’s shopping cart in memory for all time? I suspect that shopping cart data is maintained in memory for a fixed session length, and afterward persisted to a database for later retrieval.

As a general guideline, sessions should be as small as possible while still realizing the benefits of being resident in memory. I usually maintain temporal data describing what the user does in a particular session, such as the page the user came from, the options the user has enabled, and so on. More significant data, such as objects stored in a shopping cart, opened reports, or partial result sets, are best stored in stateful session beans: rather than being maintained in a hash map that can conceivably grow indefinitely, as HTTP session objects are, stateful session beans are stored in predefined caches. The size of each stateful session bean cache can be defined upon deployment, on a per-bean basis, which asserts an upper limit on memory consumption. If your sessions are heavy and your user load is large, then this upper limit can prevent your application servers from crashing. When the cache is full, an existing bean must be selected and written out to persistent storage before a new bean can be added. The danger is that if the cache is sized too small, the cost of maintaining the cache can outweigh the benefits of having the cache in the first place.
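The behavior of a bounded cache that writes entries out to persistent storage when it is full can be illustrated with the sketch below. It is not how an EJB container implements its stateful session bean cache; the class is hypothetical, a plain in-memory map stands in for persistent storage, and least-recently-used selection is an assumption about which entry gets written out.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** A fixed-size in-memory cache that moves its least recently used entry to a backing store. */
class PassivatingCache<K, V> {
    private final int maxInMemory;
    // Stand-in for persistent storage (disk or database) in this sketch.
    private final Map<K, V> persistentStore = new HashMap<>();
    private final LinkedHashMap<K, V> inMemory;

    PassivatingCache(int maxInMemory) {
        this.maxInMemory = maxInMemory;
        // Access-ordered map: the eldest entry is the least recently used one.
        this.inMemory = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                if (size() > PassivatingCache.this.maxInMemory) {
                    persistentStore.put(eldest.getKey(), eldest.getValue());
                    return true; // evict from memory after writing it out
                }
                return false;
            }
        };
    }

    void put(K key, V value) {
        persistentStore.remove(key);
        inMemory.put(key, value);
    }

    /** Returns the value, reading it back from storage if it had been written out. */
    V get(K key) {
        V value = inMemory.get(key);
        if (value == null) {
            value = persistentStore.remove(key);
            if (value != null) {
                inMemory.put(key, value); // may push another entry out to storage
            }
        }
        return value;
    }

    int inMemoryCount() {
        return inMemory.size();
    }

    int passivatedCount() {
        return persistentStore.size();
    }
}

class PassivatingCacheDemo {
    public static void main(String[] args) {
        PassivatingCache<String, String> carts = new PassivatingCache<>(2);
        carts.put("user-a", "cart-a");
        carts.put("user-b", "cart-b");
        carts.put("user-c", "cart-c"); // cache is full, so user-a is written out

        System.out.println("In memory: " + carts.inMemoryCount()
                + ", written out: " + carts.passivatedCount());
        System.out.println("Read back: " + carts.get("user-a"));
    }
}
```

Every read of a written-out entry forces another entry out, which is the maintenance cost that grows when the cache is sized too small for the working set.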


Here you learned how to integrate proactive performance testing throughout the development life cycle. The process begins by integrating performance criteria into use cases, which involves modifying use cases to include specific SLA sections that include performance criteria for each use case scenario.

About the Author

Steven Haines is the author of three Java books: The Java Reference Guide (InformIT/Pearson, 2005), Java 2 Primer Plus (SAMS, 2002), and Java 2 From Scratch (QUE, 1999). In addition to contributing chapters and coauthoring other books, as well as technical editing countless software publications, he is also the Java Host on As an educator, he has taught all aspects of Java at Learning Tree University as well as at the University of California, Irvine. By day he works as a Java EE 5 Performance Architect at Quest Software, defining performance tuning and monitoring software as well as managing and performing Java EE 5 performance tuning engagements for large-scale Java EE 5 deployments, including those of several Fortune 500 companies.

Source of this material

Pro Java EE 5 Performance Management and Optimization
By Steven Haines

Published: May 2006, Paperback: 424 pages
Published by Apress
ISBN: 1590596102
Retail price: $49.99
This material is from Chapter 5 of the book.

Copyright 2017 © QuinStreet Inc. All Rights Reserved