Ensuring Performance During the Quality Assurance Phase

Friday Jul 14th 2006 by Steven Haines

As components are integrated and tested in a production staging environment, it is critical to identify and resolve J2EE application bottlenecks.

The integration of components usually falls more on development than on QA, but the exercise typically ends at functional testing. Development ensures that the components work together as designed, and then the QA team tests the details of the iteration's use cases. Now that your use cases have performance criteria integrated into them, QA has a perfect opportunity to evaluate the iteration against those criteria. The notion I am promoting is new: an application that meets all of its functional requirements but does not satisfy its SLAs (Service Level Agreements) does not pass QA. The QA team should respond exactly as it would to missing functionality: return the application to development to be fixed.

Performance integration testing comes in two flavors:

  • Performance integration general test
  • Performance integration load test

QA performs the integration general test under minimal load; the amount of that load is a subset of the expected load and is defined formally in the test plan. For example, if the expected load is 1,500 simultaneous users, this test might run against 50 users. The purpose of this test is to identify any gross performance problems that appear as the components are integrated. Do not run a full-load test at this stage: if a full-load test fails, identifying the root cause can be difficult, because under a completely unsustainable load most aspects of the application and environment fail at once. Furthermore, if the integrated application cannot satisfy a minimal load, there is no reason to subject it to a full load.

After the application has survived the performance integration general test, the next test is the performance integration load test. During this test, turn up the user load to the expected load or, if you do not have a test environment that mirrors production, scale the load down appropriately for a single JVM. For example, if you are trying to support 1,500 users across four JVMs, you might send 400 users at a single JVM. Each use case implemented in this integration is tested against its formal SLAs. The performance integration load test is probably the most difficult test for the application to pass, but it offers the opportunity to tune the application and application server, and it ensures that the application's performance stays on track.
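
To make the mechanics concrete, here is a minimal sketch of a home-grown load driver for these tests. The target URL, the 50-user count, and the two-second SLA threshold are illustrative assumptions, not values taken from any real test plan; in practice, a commercial load generator gives you ramp-up control and reporting that this sketch lacks.

import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Minimal load driver: runs a fixed number of virtual users against a
// single URL and reports the average response time against an SLA.
public class IntegrationLoadTest {
    private static final int USERS = 50;          // subset of the expected load
    private static final int REQUESTS_PER_USER = 20;
    private static final long SLA_MILLIS = 2000;  // assumed use case SLA

    public static void main(String[] args) throws Exception {
        final String target = "http://staging.example.com/app/search.do"; // hypothetical
        ExecutorService pool = Executors.newFixedThreadPool(USERS);
        final AtomicLong totalMillis = new AtomicLong();
        final AtomicLong completed = new AtomicLong();

        for (int i = 0; i < USERS; i++) {
            pool.submit(new Runnable() {
                public void run() {
                    for (int r = 0; r < REQUESTS_PER_USER; r++) {
                        try {
                            long start = System.currentTimeMillis();
                            HttpURLConnection conn =
                                (HttpURLConnection) new URL(target).openConnection();
                            conn.getResponseCode();          // wait for the response
                            drain(conn.getInputStream());    // read the full body
                            totalMillis.addAndGet(System.currentTimeMillis() - start);
                            completed.incrementAndGet();
                        } catch (Exception e) {
                            e.printStackTrace();
                        }
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.MINUTES);

        long avg = totalMillis.get() / Math.max(1L, completed.get());
        System.out.println("Average response time: " + avg + " ms"
            + (avg <= SLA_MILLIS ? " (within SLA)" : " (SLA VIOLATION)"));
    }

    private static void drain(InputStream in) throws Exception {
        byte[] buffer = new byte[4096];
        while (in.read(buffer) != -1) { /* discard the response body */ }
        in.close();
    }
}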

Balanced Representative Load Testing

Probably the most important aspect of performance tuning in integration or staging environments is ensuring that you are accurately reproducing the behavior of your users. This is referred to as balanced representative load testing. Each load scenario that you play against your environment needs to represent a real-world user interaction with your application, complete with accurate think times (that is, the wait time between requests). Furthermore, these representative actions must be balanced according to their observed percentage of occurrence.

For example, a user may log in once, but then perform five searches, submit one form, and log out. Therefore, the logon, logoff, and submission functionalities should each receive one-eighth of the load, and the search functionality should receive the remaining five-eighths of the load for this transaction. If your load scripts do not represent real-world user actions, balanced in the way users will actually use your application, then you can have no confidence that your tuning efforts are valid. Consider the same example with the actions balanced improperly, say with each action receiving one-fourth of the load. Logon and logoff functionalities may be far less database-intensive than search functionality, but they may be much heavier on a JCA connector to a Lightweight Directory Access Protocol (LDAP) server. Tuning against the misbalanced load leaves you with too few database connections to service your database requests and extraneous JCA connections. A simple imbalance among transactions can disrupt your entire environment.
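
A hand-rolled load script can enforce this balance with simple weighted selection. The following sketch assumes the one-eighth/five-eighths mix from the example above; the 5-to-15-second think-time range is an illustrative assumption.

import java.util.Random;

// Weighted scenario selection: the logon/search/submit/logoff mix from the
// example above (1/8, 5/8, 1/8, 1/8), with randomized think times between
// requests. Weights and think-time bounds are illustrative assumptions.
public class BalancedScenario {
    private static final String[] ACTIONS = { "logon", "search", "submit", "logoff" };
    private static final int[] WEIGHTS   = { 1, 5, 1, 1 };  // out of 8
    private static final Random random = new Random();

    // Pick the next action in proportion to its observed frequency.
    static String nextAction() {
        int total = 0;
        for (int w : WEIGHTS) total += w;
        int pick = random.nextInt(total);
        for (int i = 0; i < WEIGHTS.length; i++) {
            pick -= WEIGHTS[i];
            if (pick < 0) return ACTIONS[i];
        }
        throw new IllegalStateException("unreachable");
    }

    // Simulate a user's think time between requests (assumed 5-15 seconds).
    static void think() throws InterruptedException {
        Thread.sleep(5000 + random.nextInt(10000));
    }

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 8; i++) {
            System.out.println("executing: " + nextAction());
            think();
        }
    }
}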

There are two primary techniques for extracting end-user behavior: processing access log files or adding a network device to your environment that monitors end-user behavior. The former is the less exact of the two but can provide insight into user pathways through your Web site along with accurate think times. The latter is more exact and can be configured to provide deeper insight into customer profiling and application logic.
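
As a sketch of the log-processing approach, the following assumes entries in the common log format, groups them by client host, and treats the gap between a host's consecutive requests as an observed think time; the five-minute session cutoff is an assumption to adjust for your site.

import java.io.BufferedReader;
import java.io.FileReader;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Derives observed think times from an access log in the common log format:
// host ident user [dd/MMM/yyyy:HH:mm:ss Z] "request" status bytes
public class ThinkTimeExtractor {
    public static void main(String[] args) throws Exception {
        SimpleDateFormat fmt = new SimpleDateFormat("dd/MMM/yyyy:HH:mm:ss Z", Locale.US);
        Map<String, Date> lastSeen = new HashMap<String, Date>();
        List<Long> thinkTimes = new ArrayList<Long>();

        BufferedReader in = new BufferedReader(new FileReader(args[0]));
        String line;
        while ((line = in.readLine()) != null) {
            int space = line.indexOf(' ');
            int open = line.indexOf('[');
            int close = line.indexOf(']');
            if (space < 0 || open < 0 || close < 0) continue;  // skip malformed lines

            String host = line.substring(0, space);
            Date time = fmt.parse(line.substring(open + 1, close));

            Date previous = lastSeen.put(host, time);
            if (previous != null) {
                long gap = time.getTime() - previous.getTime();
                // Gaps beyond a few minutes are new sessions, not think times.
                if (gap > 0 && gap < 5 * 60 * 1000) thinkTimes.add(gap);
            }
        }
        in.close();

        long total = 0;
        for (long t : thinkTimes) total += t;
        System.out.println("Average think time: "
            + (thinkTimes.isEmpty() ? 0 : total / thinkTimes.size()) + " ms");
    }
}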

Production Staging Testing

Seldom will your applications run in isolation; rather, they typically run in a shared environment with other applications competing for resources. Therefore, testing your applications in a staging environment designed to mirror production is imperative. Just as the integration test phase is split into two steps, so is the production staging test:

  • Performance production staging general test
  • Performance production staging load test

The general test places a small user load on the production staging environment with the goal of uncovering any egregiously slow functionality or drained resources. Again, this step comes before the second, full-load test because a full-load test may completely break the environment and consume all of its resources, thereby obscuring the true cause of performance issues. If the application cannot satisfy a minimal load while running in a shared environment, then it is not meaningful to subject it to excessive load.

Identifying Performance Issues

When running these performance tests, you need to pay particular attention to the following potentially problematic environmental facets:

  • Application code
  • Platform configuration
  • External resources

Application code can perform poorly as a result of being subjected to a significant user load. Performance unit tests help identify poorly written algorithms, but code that performs well under a low user load commonly experiences performance issues as the load increases significantly. The problems occur because subtle programmatic issues become exaggerated and only then manifest themselves. Consider creating an object inside a servlet to satisfy a user request and then destroying it. This is no problem whatsoever for a single user or even a couple dozen users. Now send 5,000 users at that servlet: it must create and destroy that object 5,000 times. This behavior results in excessive garbage collection, premature tenuring of objects, CPU spikes, and other performance abnormalities. This example underscores the fact that only after testing under load can you truly have confidence in the quality of your components.
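
The following sketch illustrates the anti-pattern; ReportFormatter is a hypothetical helper class standing in for any allocation-heavy object.

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical helper: imagine it allocates buffers, caches, and so on.
class ReportFormatter {
    String format(String id) { return "<html><body>Report " + id + "</body></html>"; }
}

// Anti-pattern: a new helper object per request. Harmless for a dozen users,
// but at 5,000 concurrent users the per-request allocation and collection
// shows up as garbage collection pressure and premature tenuring.
public class ReportServlet extends HttpServlet {
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        ReportFormatter formatter = new ReportFormatter(); // created per request
        resp.getWriter().println(formatter.format(req.getParameter("id")));
    }   // formatter becomes garbage as soon as this method returns
}

If the helper is stateless and thread-safe, promoting it to a single shared instance field removes the per-request allocation entirely; if it is expensive and stateful, pooling it accomplishes the same goal.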

Platform configuration includes the entire environment in which the application runs: the application server, JVM, operating system, and hardware. Each piece of this layered execution model must be properly configured for optimal performance. As integration and production staging tests run, you need to monitor and assess the performance of each layer. For example, you need to ensure that you have enough threads in the application server to process incoming requests, that your JVM's heap is properly tuned to minimize major garbage collections, that your operating system's process scheduler is allotting enough CPU to the JVM, and that your hardware is running optimally on a fast network. Ensuring proper configuration requires a depth of knowledge across a breadth of technologies.
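
As one concrete illustration of the JVM layer alone, a HotSpot heap can be sized and instrumented with standard flags such as these; the sizes shown are starting-point assumptions to be validated against your own garbage collection logs, not recommendations:

java -Xms1024m -Xmx1024m \
     -XX:NewSize=256m -XX:MaxNewSize=256m \
     -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
     -Xloggc:gc.log \
     weblogic.Server

Fixing -Xms equal to -Xmx avoids pauses caused by heap resizing, and the GC logging flags produce the raw data you need to judge whether major collections are under control.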

Finally, most enterprise-scale applications interact with external resources that may or may not be under your control. In the most common case, enterprise applications interact with one or more databases, but external resources can also include legacy systems, messaging servers, and, in recent years, Web services. As the acceptance of SOAs has grown, applications can be rapidly assembled by piecing together existing code that exposes functionality through services. Although this capability promotes the application architect to an application assembler, permitting rapid development of enterprise solutions, it also adds another tier to the application. And with that tier come additional operating systems, environments, and, in some circumstances, services delivered by third-party vendors at run time over the Internet.

The first step in identifying performance issues is to establish monitoring capabilities in your integration and production staging environments and to record the application's behavior while under load. This record lists service requests that can be sorted by execution count, average execution time, and total execution time. These service requests are then traced back to use cases to validate them against predefined SLAs. Any service request whose response time exceeds its SLA needs to be analyzed to determine the cause. Figure 1 shows a breakdown of service requests running inside the MedRec application. In this 30-second time slice, two service requests spent an extensive amount of time executing: GET /admin/viewrequests.do was executed 12 times, accounting for 561 seconds, and POST /patient/register.do was executed 10 times, accounting for 357 seconds.




Figure 1. Breakdown of service requests running inside the MedRec application
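
Commercial monitors produce this record out of the box, but a rudimentary version can be approximated with a standard servlet filter. This sketch accumulates only the execution count and total time per request URI; the class and method names are mine, not part of MedRec.

import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;

// Accumulates execution count and total time per service request so requests
// can be ranked and checked against their SLAs.
public class ResponseTimeFilter implements Filter {
    private static final Map<String, AtomicLong> counts =
        new ConcurrentHashMap<String, AtomicLong>();
    private static final Map<String, AtomicLong> totalMillis =
        new ConcurrentHashMap<String, AtomicLong>();

    public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain)
            throws IOException, ServletException {
        String uri = ((HttpServletRequest) req).getRequestURI();
        long start = System.currentTimeMillis();
        try {
            chain.doFilter(req, resp);
        } finally {
            long elapsed = System.currentTimeMillis() - start;
            counter(counts, uri).incrementAndGet();
            counter(totalMillis, uri).addAndGet(elapsed);
        }
    }

    // Lazily create the per-URI counter in a thread-safe way.
    private static AtomicLong counter(Map<String, AtomicLong> map, String key) {
        AtomicLong value = map.get(key);
        if (value == null) {
            map.putIfAbsent(key, new AtomicLong());
            value = map.get(key);
        }
        return value;
    }

    // Call from a reporting servlet or shutdown hook to dump the record.
    public static void dump() {
        for (Map.Entry<String, AtomicLong> e : counts.entrySet()) {
            long n = e.getValue().get();
            long total = totalMillis.get(e.getKey()).get();
            System.out.println(e.getKey() + ": count=" + n
                + ", total=" + total + " ms, avg=" + (total / n) + " ms");
        }
    }

    public void init(FilterConfig config) { }
    public void destroy() { }
}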

As shown in Figure 2, the average exclusive time for each method that satisfies the POST /patient/register.do service request reveals that the HTTP POST at the WebLogic cluster consumed, on average, 35.477 seconds of the 35.754-second average for the total service request. This is significant: the request passed quickly from the Web server to the application server but then waited there for a processing thread. The remainder of the request processed relatively quickly.




Figure 2. Breakdown of response time for the POST /patient/register.do service request for each method in a hierarchical request tree

Figure 3 shows a view of the performance metrics for the application server during this recorded session. This screen is broken into three regions: the top region shows the heap behavior, the middle shows the thread pool information, and the bottom shows the database connection pool information.

Figure 3 confirms our suspicions: the number of idle threads during the session hit zero, and the number of pending requests grew as high as 38. Furthermore, toward the end of the session, the database connection usage peaked at 100 percent and the heap was experiencing significant garbage collection.

This level of diagnosis requires insight into the application, application server, and external dependency behaviors. With this information, you are empowered to determine exactly where and why your application is slowing.




Figure 3. Performance metrics for the application server during this recorded session
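
Much of this insight comes from commercial monitoring tools, but the JVM-level signals can at least be sampled with the standard java.lang.management API introduced in Java 5. The sketch below reports heap usage and live thread counts; application-server specifics, such as WebLogic's execute queue and JDBC connection pool metrics, are exposed through the server's own MBeans and are not shown here.

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;

// Samples the same signals discussed above, heap utilization and thread
// counts, using the platform MXBeans available in any Java 5 JVM.
public class PlatformSampler {
    public static void main(String[] args) throws InterruptedException {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();

        while (true) {
            MemoryUsage heap = memory.getHeapMemoryUsage();
            System.out.println("heap used: " + (heap.getUsed() >> 20) + " MB"
                + " / committed: " + (heap.getCommitted() >> 20) + " MB"
                + ", live threads: " + threads.getThreadCount());
            Thread.sleep(30000);   // sample on the same 30-second granularity
        }
    }
}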

Summary

As components are integrated and tested in a production staging environment, application bottlenecks are identified and resolved.

About the Author

Steven Haines is the author of three Java books: The Java Reference Guide (InformIT/Pearson, 2005), Java 2 Primer Plus (SAMS, 2002), and Java 2 From Scratch (QUE, 1999). In addition to contributing chapters and coauthoring other books, as well as technical editing countless software publications, he is also the Java Host on InformIT.com. As an educator, he has taught all aspects of Java at Learning Tree University as well as at the University of California, Irvine. By day he works as a Java EE 5 Performance Architect at Quest Software, defining performance tuning and monitoring software as well as managing and performing Java EE 5 performance tuning engagements for large-scale Java EE 5 deployments, including those of several Fortune 500 companies.

Source of this material

Pro Java EE 5 Performance Management and Optimization
By Steven Haines



Published: May 2006, Paperback: 424 pages
Published by Apress
ISBN: 1590596102
Retail price: $49.99
This material is from Chapter 5 of the book.
