Building in J2EE Performance during the Development Phase

Friday Jul 7th 2006 by Steven Haines

When building a car, is it premature to test the performance of your alternator before the car is assembled and you try to start it? Obviously the answer is No, it’s not premature. The same principle holds for application development.

Continuing the theme of my last article, Java EE 5 Performance Management and Optimization, have you ever heard anyone ask the following question: “When developers are building their individual components before a single use case is implemented, isn’t it premature to start performance testing?”

Let me ask a similar question: When building a car, is it premature to test the performance of your alternator before the car is assembled and you try to start it? The answer to this question is obviously “No, it’s not premature. I want to make sure that the alternator works before building my car!” If you would never assemble a car from untested parts, why would you assemble an enterprise application from untested components? Furthermore, because you integrate performance criteria into use cases, use cases will fail testing if they do not meet their performance criteria. In short, performance matters!

In development, components are tested in unit tests. A unit test is designed to test the functionality and performance of an individual component, independently from other components that it will eventually interact with. The most common unit testing framework is an open source initiative called JUnit. JUnit’s underlying premise is that alongside the development of your components, you should write tests to validate each piece of functionality of your components. A relatively new development paradigm, Extreme Programming (www.xprogramming.com), promotes building test cases prior to building the components themselves, which forces you to better understand how your components will be used prior to writing them.

JUnit focuses on functional testing, but side projects spawned from JUnit include performance and scalability testing. Performance tests measure expected response time, and scalability tests measure functional integrity under load. Formal performance unit test criteria should do the following:

  • Identify memory issues
  • Identify poorly performing methods and algorithms
  • Measure the coverage of unit tests to ensure that the majority of code is being tested
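
As noted above, performance tests measure expected response time. Even before adopting a dedicated performance-testing extension, you can approximate this idea with plain JUnit. The following is only a minimal sketch; the ReportGenerator class and the 200 ms budget are illustrative assumptions, not part of any application discussed here:

import junit.framework.TestCase;

public class ReportGeneratorPerformanceTest extends TestCase
{
  /**
   * Illustrative only: fail if the hypothetical report generation
   * exceeds an assumed 200 ms budget on the test machine.
   */
  public void testGenerateReportWithinBudget()
  {
    ReportGenerator generator = new ReportGenerator();   // hypothetical component under test

    long start = System.currentTimeMillis();
    generator.generate();                                // hypothetical method under test
    long elapsed = System.currentTimeMillis() - start;

    assertTrue( "Report generation took " + elapsed + " ms", elapsed < 200 );
  }
}

Wall-clock assertions like this are noisy, so treat them as a coarse regression tripwire rather than a substitute for the profiling techniques described below.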

Memory leaks are the most dangerous and difficult-to-diagnose problems in enterprise Java applications. The best way to avoid memory leaks at a code level is to run your components through a memory profiler. A memory profiler takes a snapshot of your heap (after first running garbage collection), allows you to run your tests, takes another snapshot of your heap (after garbage collection again), and shows you all of the objects that remain in the heap. The analysis of the heap differences identifies objects abandoned in memory. Your task is then to look at these objects and decide if they should remain in the heap or if they were left there by mistake. Another dangerous form of memory misuse is object cycling: the rapid creation and destruction of objects. Because it increases the frequency of garbage collection, excessive object cycling may result in the premature tenuring of short-lived objects, necessitating a major garbage collection to reclaim them.
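
To make object cycling concrete, here is a contrived sketch (not taken from any application discussed here). The first method allocates and discards temporary objects on every pass through the loop; the second reuses a single StringBuilder and avoids the churn:

// Contrived illustration of object cycling: each pass through the loop creates
// new temporary objects that become garbage almost immediately, driving up
// garbage collection frequency.
public String buildReportNaively( String[] lines )
{
  String report = "";
  for( int i = 0; i < lines.length; i++ )
  {
    report += lines[i] + "\n";   // allocates and discards objects on every pass
  }
  return report;
}

// Reusing a single StringBuilder keeps the number of short-lived objects low.
public String buildReport( String[] lines )
{
  StringBuilder report = new StringBuilder();
  for( int i = 0; i < lines.length; i++ )
  {
    report.append( lines[i] ).append( '\n' );
  }
  return report.toString();
}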

After considering memory issues, you need to quantify the performance of methods and algorithms. Because SLAs are defined at the use case level, not at the component level, measuring component response times against them may be premature in the development phase. Rather, the strategy is to run your components through a code profiler. A code profiler reveals the most frequently executed sections of your code and those that account for the majority of the components’ execution times. The resulting relative weighting of hot spots in the code allows for intelligent tuning and code refactoring. You should run code profiling on your components while executing your unit tests, because your unit tests attempt to mimic end-user actions and alternate user scenarios. Code profiling your unit tests should give you a good idea about how your component will react to real user interactions.

Coverage profiling reports the percentage of classes, methods, and lines of code that were executed during a test or use case. Coverage profiling is important in assessing the efficacy of unit tests. If both the code and memory profiling of your code are good, but you are exercising only 20 percent of your code, then your confidence in your tests should be minimal. Not only do you need to receive favorable results from your functional unit tests and your code and memory performance unit tests, but you also need to ensure that you are effectively testing your components.

This level of testing can be further extended to any code that you outsource. You should require your outsourcing company to provide you with unit tests for all components it develops, and then execute a performance test against those unit tests to measure the quality of the components you are receiving. By combining code and memory profiling with coverage profiling, you can quickly determine whether the unit tests are written properly and have acceptable results.

Once the criteria for tests are met, the final key step to effectively implementing this level of testing is automation. You need to integrate functional and performance unit testing into your build process—only by doing so can you establish a repeatable and trackable procedure. Because running performance unit tests can burden memory resources, you might try executing functional tests during nightly builds and executing performance unit tests on Friday-night builds, so that you can come in on Monday to test result reports without impacting developer productivity. This suggestion’s success depends a great deal on the size and complexity of your environment, so, as always, adapt this plan to serve your application’s needs.
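
How you wire this into the build depends on your build system, but JUnit's text runner can be invoked programmatically, so a build script could call an entry point such as the following. This is only a sketch: the AllFunctionalTests and AllPerformanceTests suites and the run.perf.tests system property are assumptions for illustration, not names from any real project.

import junit.framework.TestResult;
import junit.textui.TestRunner;

public class NightlyTestDriver
{
  public static void main( String[] args )
  {
    // Functional unit tests run on every build.
    TestResult functional = TestRunner.run( AllFunctionalTests.suite() );   // hypothetical suite

    // Performance unit tests run only when the build requests them,
    // for example on the Friday-night build.
    TestResult performance = null;
    if( Boolean.getBoolean( "run.perf.tests" ) )
    {
      performance = TestRunner.run( AllPerformanceTests.suite() );          // hypothetical suite
    }

    boolean failed = !functional.wasSuccessful() ||
                     ( performance != null && !performance.wasSuccessful() );

    // A non-zero exit status lets the build script flag the failure.
    System.exit( failed ? 1 : 0 );
  }
}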

When performance unit tests are written prior to, or at least concurrently with, component development, then component performance can be assessed at each build. If such extensive assessment is not realistic, then the reports need to be evaluated at each major development milestone. For the developer, milestones are probably at the completion of the component or a major piece of functionality for the component. But at minimum, performance unit tests need to be performed prior to the integration of components. Again, building a high-performance car from tested and proven high-performance parts is far more effective than from scraps gathered from the junkyard.

Unit Testing

I thought this section would be a good opportunity to talk a little about unit testing tools and methods, though this discussion is not meant to be exhaustive. JUnit is, again, the tool of choice for unit testing. JUnit is a simple regression-testing framework that enables you to write repeatable tests. Originally written by Erich Gamma and Kent Beck, JUnit has been embraced by thousands of developers and has grown into a collection of unit testing frameworks for a plethora of technologies. The JUnit Web site (www.junit.org) hosts support information and links to the other JUnit derivations.

JUnit offers the following benefits to your unit testing:

  • Faster coding: How many times have you written debug code inside your classes to verify values or test functionality? JUnit eliminates this by allowing you to write test cases in closely related, but centralized and external, classes.
  • Simplicity: If you have to spend too much time implementing your test cases, then you won’t do it. Therefore, the creators of JUnit made it as simple as possible.
  • Single result reports: Rather than generating loads of reports, JUnit will give you a single pass/fail result, and, for any failure, show you the exact point where the application failed.
  • Hierarchical testing structure: Test cases exercise specific functionality, and test suites execute multiple test cases. JUnit supports test suites of test suites, so when developers build test cases for their classes, they can easily assemble them into a test suite at the package level, and then incorporate that into parent packages and so forth. The result is that a single, top-level test execution can exercise hundreds of unit test cases.
  • Developer-written tests: These tests are written by the same person who wrote the code, so the tests accurately target the intricacies of the code that the developer knows can be problematic. This test differs from a QA-written one, which exercises the external functionality of the component or use case—instead, this test exercises the internal functionality.
  • Seamless integration: Tests are written in Java, which makes the integration of test cases and code seamless.
  • Free: JUnit is open source and licensed under the Common Public License Version 1.0, so you are free to use it in your applications.

From an architectural perspective, JUnit can be described by looking at two primary components: TestCase and TestSuite. All code that tests the functionality of your class or classes must extend junit.framework.TestCase. The test class can implement one or more tests by defining public void methods that start with test and accept no parameters, for example:

public void testMyFunctionality() { ... }

For multiple tests, you have the option of initializing and cleaning up the environment before and after each test by implementing the following two methods: setUp() and tearDown(). In setUp() you initialize the environment, and in tearDown() you clean up the environment. Note that these methods are called before and after each test method to eliminate side effects between test cases; this makes each test case truly independent.

Inside each TestCase “test” method, you can create objects, execute functionality, and then test the return values of those functional elements against expected results. If the return values are not as expected, then the test fails; otherwise, it passes. The mechanism that JUnit provides to validate actual values against expected values is a set of assert methods:

  • assertEquals() methods test primitive types.
  • assertTrue() and assertFalse() test Boolean values.
  • assertNull() and assertNotNull() test whether or not an object is null.
  • assertSame() and assertNotSame() test object identity (whether two references point to the same object).
  • In addition, JUnit offers a fail() method that you can call anywhere in your test case to immediately mark a test as failing. A short sketch showing these assertions in use follows this list.
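
As a brief illustration (the values are hypothetical and not tied to the metric classes shown later), a test method inside a TestCase subclass might combine these assertions as follows:

public void testAssertions()
{
  String expected = "heap";
  String actual = new String( "heap" );

  assertEquals( 42, 42 );                    // primitive values
  assertEquals( 1.0, 1.0001, 0.001 );        // doubles take a tolerance (delta)
  assertTrue( actual.length() > 0 );
  assertFalse( actual.length() == 0 );
  assertNotNull( actual );
  assertEquals( expected, actual );          // equal by value (equals())
  assertNotSame( expected, actual );         // but not the same object reference

  if( actual.length() > 100 )
  {
    fail( "Did not expect a value this long" );
  }
}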

    JUnit tests are executed by one of the TestRunner classes (there is one for command-line execution and two for GUI execution), and each version implements the following steps (a simplified sketch of the discovery step follows the list):

    1. It loads your TestCase class.
    2. It uses reflection to discover all public methods whose names start with “test”.
    3. For each such method, it calls setUp(), executes the test method, and then calls tearDown().
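
    The discovery in step 2 boils down to scanning the class for public, parameterless methods whose names begin with test. The following is a simplified sketch of that idea only, not JUnit's actual implementation (the real runner builds a TestSuite and lets TestCase drive setUp() and tearDown() for you):

    import java.lang.reflect.Method;

    // Simplified illustration of reflective test discovery; JUnit's real runner
    // is more involved and also manages the fixture lifecycle for you.
    public static void runTests( Class testClass ) throws Exception
    {
      Method[] methods = testClass.getMethods();
      for( int i = 0; i < methods.length; i++ )
      {
        Method m = methods[i];
        if( m.getName().startsWith( "test" ) && m.getParameterTypes().length == 0 )
        {
          Object freshFixture = testClass.newInstance();   // a new instance per test
          m.invoke( freshFixture, new Object[ 0 ] );       // execute the test method
        }
      }
    }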

    As an example, I have a set of classes that model data metrics. A metric contains a set of data points, where each data point represents an individual sample, such as the size of the heap at a given time. I purposely do not list the code for the metric or data point classes; rather, I list the JUnit tests. Recall that according to one of the tenets of Extreme Programming, we write test cases before writing code. Listing 1 shows the test case for the DataPoint class, and Listing 2 shows the test case for the Metric class.

    Listing 1. DataPointTest.java

    package com.javasrc.metric;
    
    import junit.framework.TestCase;
    import java.util.*;
    
    /**
     * Tests the core functionality of a DataPoint
     */
    public class DataPointTest extends TestCase
    {
    /**
     * Maintains our reference DataPoint
     */
    private DataPoint dp;
    
    /**
     * Create a DataPoint for use in this test
     */
    protected void setUp()
    {
      dp = new DataPoint( new Date(), 5.0, 1.0, 10.0 );
    }
    
    /**
     * Clean up: do nothing for now
     */
    protected void tearDown()
    {
    }
    
    /**
     * Test the range of the DataPoint
     */
    public void testRange()
    {
      assertEquals( 9.0, dp.getRange(), 0.001 );
    }
    
    /**
     * See if the DataPoint scales properly
     */
    public void testScale()
    {
      dp.scale( 10.0 );
      assertEquals( 50.0, dp.getValue(), 0.001 );
      assertEquals( 10.0, dp.getMin(), 0.001 );
      assertEquals( 100.0, dp.getMax(), 0.001 );
    }
    
    /**
     * Try to add a new DataPoint to our existing one
     */
    public void testAdd()
    {
      DataPoint other = new DataPoint( new Date(), 4.0, 0.5, 20.0 );
      dp.add( other );
      assertEquals( 9.0, dp.getValue(), 0.001 );
      assertEquals( 0.5, dp.getMin(), 0.001 );
      assertEquals( 20.0, dp.getMax(), 0.001 );
    }
    
    /**
     * Test the compare functionality of our DataPoint to ensure that
     * when we construct Sets of DataPoints they are properly ordered
     */
    public void testCompareTo()
    {
      try
      {
        // Sleep for 100ms so we can be sure that the time of
        // the new data point is later than the first
        Thread.sleep( 100 );
      }
      catch( Exception e )
      {
        // Ignore an interruption; the sleep exists only to separate the timestamps
      }
    
      // Construct a new DataPoint
      DataPoint other = new DataPoint( new Date(), 4.0, 0.5, 20.0 );
    
      // Should return -1 because other occurs after dp
      int result = dp.compareTo( other );
      assertEquals( -1, result );
    
      // Should return 1 because dp occurs before other
      result = other.compareTo( dp );
      assertEquals( 1, result );
    
      // Should return 0 because dp == dp
      result = dp.compareTo( dp );
      assertEquals( 0, result );
     }
    }
    
    

    Listing 2. MetricTest.java

    package com.javasrc.metric;
    
    import junit.framework.TestCase;
    import java.util.*;
    
    public class MetricTest extends TestCase
    {
      private Metric sampleHeap;
    
      protected void setUp()
      {
        this.sampleHeap = new Metric( "Test Metric",
                                      "Value/Min/Max",
                                      "megabytes" );
        double heapValue = 100.0;
        double heapMin = 50.0;
        double heapMax = 150.0;
    
        for( int i=0; i<10; i++ )
        {
          DataPoint dp = new DataPoint( new Date(),
                                        heapValue,
                                        heapMin,
                                        heapMax );
          this.sampleHeap.addDataPoint( dp );
          try
          {
            Thread.sleep( 50 );
          }
          catch( Exception e )
          {
            // Ignore an interruption; the sleep only spaces out the timestamps
          }
          // Update the heap values
          heapMin -= 1.0;
          heapMax += 1.0;
          heapValue += 1.0;
        }
      }
    
    public void testMin()
    {
      assertEquals( 41.0, this.sampleHeap.getMin(), 0.001 );
    }
    
    public void testMax()
    {
      assertEquals( 159.0, this.sampleHeap.getMax(), 0.001 );
    }
    
    public void testAve()
    {
      assertEquals( 104.5, this.sampleHeap.getAve(), 0.001 );
    }
    
    public void testMaxRange()
    {
      assertEquals( 118.0, this.sampleHeap.getMaxRange(), 0.001 );
    }
    
    public void testRange()
    {
      assertEquals( 118.0, this.sampleHeap.getRange(), 0.001 );
    }
    
    public void testSD()
    {
      assertEquals( 3.03, this.sampleHeap.getStandardDeviation(), 0.01 );
    }
    
    public void testVariance()
    {
      assertEquals( 9.17, this.sampleHeap.getVariance(), 0.01 );
    }
    
    public void testDataPointCount()
    {
      assertEquals( 10, this.sampleHeap.getDataPoints().size() );
    }
    }
    

    In Listing 1, you can see that the DataPoint class, in addition to maintaining the observed value for a point in time, supports minimum and maximum values for the time period, computes the range, and supports scaling and adding data points. The sample test case creates a DataPoint object in the setUp() method and then exercises each piece of functionality.

    Listing 2 shows the test case for the Metric class. The Metric class aggregates the DataPoint objects and provides access to the collective minimum, maximum, average, range, standard deviation, and variance. In the setUp() method, the test creates a set of data points and builds the metric to contain them. Each subsequent test case uses this metric and validates values computed by hand to those computed by the Metric class.
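
    If you want to verify the expected values in Listing 2 yourself, the setUp() loop makes the hand calculations straightforward: the ten data point values are 100 through 109 (average 104.5), the per-point minimums fall from 50 down to 41 (so the collective minimum is 41), and the per-point maximums rise from 150 up to 159 (so the collective maximum is 159). The overall range and the largest single-point range are therefore both 159 - 41 = 118. The sum of squared deviations of the values from their mean is 82.5, giving a sample variance of 82.5 / 9, or about 9.17, and a standard deviation of about 3.03; this assumes Metric uses the sample (n - 1) form of the variance, as the test's expected values imply.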

    Listing 3 rolls both of these test cases into a test suite that can be executed as one test.

    Listing 3. MetricTestSuite.java

    package com.javasrc.metric;
    
    import junit.framework.Test;
    import junit.framework.TestSuite;
    
    public class MetricTestSuite
    {
      public static Test suite()
      {
        TestSuite suite = new TestSuite();
        suite.addTestSuite( DataPointTest.class );
        suite.addTestSuite( MetricTest.class );
        return suite;
      }
    }
    

    A TestSuite exercises all tests in all classes added to it by calling the addTestSuite() method. A TestSuite can contain TestCases or TestSuites, so once you build a suite of test cases for your classes, a master test suite can include your suite and inherit all of your test cases.
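
    For example, a hypothetical top-level suite for the whole application could pull in the metric suite alongside suites from other packages. Only MetricTestSuite below comes from this article; the surrounding names are illustrative assumptions:

    package com.javasrc;

    import junit.framework.Test;
    import junit.framework.TestSuite;

    import com.javasrc.metric.MetricTestSuite;

    public class AllTests
    {
      public static Test suite()
      {
        TestSuite suite = new TestSuite( "All com.javasrc tests" );
        suite.addTest( MetricTestSuite.suite() );
        // suite.addTest( com.javasrc.web.WebTestSuite.suite() );   // hypothetical additional suite
        return suite;
      }
    }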

    The final step in this example is to execute either an individual test case or a test suite. After downloading JUnit from www.junit.org, add the junit.jar file to your CLASSPATH and then invoke either its command-line interface or GUI interface. The three classes that execute these tests are as follows:

    • junit.textui.TestRunner
    • junit.swingui.TestRunner
    • junit.awtui.TestRunner

    As the package names imply, textui is the command-line interface, while swingui and awtui are graphical interfaces built on Swing and AWT, respectively. You can pass an individual test case or an entire test suite as an argument to the TestRunner class. For example, to execute the test suite that we created earlier, you would use this:

    java junit.swingui.TestRunner com.javasrc.metric.MetricTestSuite
    

    Unit Performance Testing

    Unit performance testing has three aspects:

  • Memory profiling
  • Code profiling
  • Coverage profiling

    This section explores each facet of performance profiling. I provide examples of what to look for and the step-by-step process to implement each type of testing.

    Memory Profiling

    Let’s first look at memory profiling. To illustrate how to determine if you do, in fact, have a memory leak, I modified the BEA MedRec application to capture the state of the environment every time an administrator logs in and to store that information in memory. My intent is to demonstrate how a simple tracking change left to its own devices can introduce a memory leak.

    The steps you need to perform on your code for each use case are as follows (a rough programmatic approximation of steps 1 through 4 appears after the list):

    1. Request a garbage collection and take a snapshot of your heap.
    2. Perform your use case.
    3. Request a garbage collection and take another snapshot of your heap.
    4. Compare the two snapshots (the difference between them includes all objects remaining in the heap) and identify any unexpected loitering objects.
    5. For each suspect object, open the heap snapshot and track down where the object was created.
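
    Steps 1 through 4 are normally driven from the profiler's user interface, but a crude, first-order approximation is possible in plain Java. The sketch below only measures bytes and relies on System.gc(), which is merely a request, so treat it as a smoke test rather than a replacement for heap snapshots:

    // Rough heap-delta check around a use case; illustrative only.
    public static long measureHeapGrowth( Runnable useCase )
    {
      Runtime rt = Runtime.getRuntime();

      System.gc();                                        // step 1: request a collection
      long before = rt.totalMemory() - rt.freeMemory();   // snapshot of used heap

      useCase.run();                                      // step 2: perform the use case

      System.gc();                                        // step 3: request another collection
      long after = rt.totalMemory() - rt.freeMemory();

      return after - before;                              // step 4: the approximate difference
    }
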
    Note

    A memory leak can be detected with a single execution of a use case or through a plethora of executions of a use case. In the latter case, the memory leak will scream out at you. So, while analyzing individual use cases is worthwhile, when searching for subtle memory leaks, executing your use case multiple times makes finding them easier.

    In this scenario, I performed steps 1 through 3 with a load tester that executed the MedRec administration login use case almost 500 times. Figure 1 shows the difference between the two heap snapshots.

    Figure 1. The snapshot difference between the heaps before and after executing the use case

    Figure 1 shows that my use case yielded 8,679 new objects added to the heap. Most of these objects are collection classes, and I suspect they are part of BEA’s infrastructure. I scanned this list looking for my code, which in this case consists of any class in the com.bea.medrec package. Filtering on those classes, I was interested to see a large number of com.bea.medrec.actions.SystemSnapShot instances, as shown in Figure 2.

    Note

    The screen shots in this article are from Quest Software’s JProbe and PerformaSure products.

    Figure 2. The snapshot difference between the heaps, filtered on my application packages

    Realize that rarely is a loitering object a single simple object; rather, it is typically a subgraph that maintains its own references. In this case, the SystemSnapShot class is a dummy class that holds a set of primitive type arrays with the names timestamp, memoryInfo, jdbcInfo, and threadDumps, but in a real-world scenario these arrays would be objects that reference other objects and so forth. By opening the second heap snapshot and looking at one of the SystemSnapShot instances, you can see all objects that it references. As shown in Figure 3, the SystemSnapShot class references four objects: timestamp, memoryInfo, jdbcInfo, and threadDumps. A loitering object, then, has a far greater impact than the object itself.

    Next, let’s look at the referrer tree. We repeatedly ask the same question: What class is referencing this one? Starting from the SystemSnapShot, we work upward until we reach one of our own classes. Figure 4 shows that the SystemSnapShot is referenced by an Object array, which is referenced by an ArrayList, which is in turn referenced by the AdminLoginAction.

    Figure 3. The SystemSnapShot class references four objects: timestamp, memoryInfo, jdbcInfo, and threadDumps.

    Figure 4. Here we can see that the AdminLoginAction class created the SystemSnapShot, and that it stored it in an ArrayList.

    Finally, we can look into the AdminLoginAction code to see that it creates the new SystemSnapShot instance we are looking at and adds it to its cache in line 66, as shown in Figure 5.

    You need to perform this type of memory profiling test on your components during your performance unit testing. For each object that is left in the heap, you need to ask yourself whether or not you intended to leave it there. It’s OK to leave things on the heap as long as you know that they are there and you want them to be there. The purpose of this test is to identify and document potentially troublesome objects and objects that you forgot to clean up.

    Figure 5. The AdminLoginAction source code

    Code Profiling

    The purpose of code profiling is to identify sections of your code that are running slowly and then determine why. The perfect example I have to demonstrate the effectiveness of code profiling is a project that I gave to my Data Structures and Algorithm Analysis class—compare and quantify the differences among the following sorting algorithms for various values of n (where n represents the sample size of the data being sorted):

    • Bubble sort
    • Selection sort
    • Insertion sort
    • Shell sort
    • Heap sort
    • Merge sort
    • Quick sort

    As a quick primer on sorting algorithms, each of the aforementioned algorithms has its strengths and weaknesses. The first four algorithms run in O(N²) time, meaning that the run time grows quadratically with the number of items to sort, N; as N increases, the time required for the sort to complete grows in proportion to N². The last three algorithms run in O(N log N) time, meaning that the run time grows log-linearly: as N increases, the time required grows in proportion to N log N, which is far slower than quadratic growth for large N. Achieving O(N log N) performance requires additional overhead that may cause the last three algorithms to actually run slower than the first four for a small number of items. My recommendation is to always examine both the nature of the data you want to sort today and the projected nature of the data throughout the life cycle of the product prior to selecting your sorting algorithm.

    With that foundation in place, I provided my students with a class that implements the aforementioned sorting algorithms. I really wanted to drive home the dramatic difference between executing these sorting algorithms on 10 items as opposed to 10,000 items, or even 1,000,000 items. For this exercise, I think it would be useful to profile this application against 5,000 randomly generated integers, which is enough to show the differences between the algorithms, but not so excessive that I have to leave my computer running overnight.

    Figure 6 shows the results of this execution, sorting each method by its cumulative run time.

    Figure 6. The profiled methods used to sort 5,000 random integers using the seven sorting algorithms

    We view the method response times sorted by cumulative time, because some of the algorithms make repeated calls to other methods to perform their sorting (for example, the quickSort() method makes 5,000 calls to q_sort()). We have to ignore the main() method, because it calls all seven sorting methods. (Its cumulative time is almost 169 seconds, but its exclusive method time is only 90 milliseconds, demonstrating that most of its time is spent in other method calls—namely, all of the sorting method calls.) The slowest method by far is the bubbleSort() method, accounting for 80 seconds in total time and 47.7 percent of total run time for the program.

    The next question is, why did it take so long? Two pieces of information can give us insight into the length of time: the number of external calls the method makes and the amount of time spent on each line of code. Figure 7 shows the number of external calls that the bubbleSort() method makes.

    Figure 7. The number of external calls that the bubbleSort() method makes

    This observation is significant—in order to sort 5,000 items, the bubble sort algorithm required almost 12.5 million comparisons. It immediately alerts us to the fact that if we have a considerable number of items to sort, bubble sort is not the best algorithm to use. Taking this example a step further, Figure 8 shows a line-by-line breakdown of call counts and time spent inside the bubbleSort() method.

    Figure 8. Profiling the bubbleSort() method

    By profiling the bubbleSort() method, we see that 45 percent of its time is spent comparing items, and 25 percent is spent managing a for loop; these two lines account for 56 cumulative seconds. Figure 8 clearly illustrates the core issue of the bubble sort algorithm: on line 15 it executes the for loop 12,502,500 times, which resolves to 12,479,500 comparisons.
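
    You can reproduce the order of magnitude of these numbers without a profiler by instrumenting the algorithm directly. The sketch below is my own illustration rather than the class handed out for the exercise; for 5,000 items the counter comes out to N(N-1)/2 = 12,497,500, right around the 12.5 million comparisons reported above:

    // Instrumented bubble sort: a comparison counter makes the O(N^2) cost visible.
    public class BubbleSortDemo
    {
      private static long comparisons = 0;

      public static void bubbleSort( int[] data )
      {
        for( int i = 0; i < data.length - 1; i++ )
        {
          for( int j = 0; j < data.length - 1 - i; j++ )
          {
            comparisons++;
            if( data[j] > data[j + 1] )
            {
              int temp = data[j];
              data[j] = data[j + 1];
              data[j + 1] = temp;
            }
          }
        }
      }

      public static void main( String[] args )
      {
        int[] data = new int[ 5000 ];
        java.util.Random random = new java.util.Random();
        for( int i = 0; i < data.length; i++ )
        {
          data[i] = random.nextInt();
        }
        bubbleSort( data );
        // For 5,000 items this prints 12,497,500, roughly the 12.5 million
        // comparisons observed in the profiler.
        System.out.println( "Comparisons: " + comparisons );
      }
    }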

    To be successful in deploying high-performance components and applications, you need to apply this level of profiling to your code.

    Coverage Profiling

    Identifying and rectifying memory issues and slow-running algorithms gives you confidence in the quality of your components, but that confidence is meaningful only as long as you are exercising all—or at least most—of your code. That is where coverage profiling comes in; coverage profiling reveals the percentage of classes, methods, and lines of code that are executed by a test. Coverage profiling can provide strong validation that your unit and integration tests are effectively exercising your components.

    In this section, I’ll show a test of a graphical application that I built to manage my digital pictures, run inside a coverage profiler and filtered to my own classes. I purposely chose not to test it extensively in order to present an interesting example. Figure 9 shows a class summary of the code that I tested, with six profiled classes in three packages displayed in the browser window and the methods of the JThumbnailPalette class with missed lines shown in the pane below.

    Figure 9. Coverage profile of a graphical application

    The test exercised all six classes, but missed a host of methods and classes. For example, in the JThumbnailPalette class, the test completely failed to call the methods getBackgroundColor(), setBackgroundColor(), setTopRow(), and others. Furthermore, even though the paint() method was called, the test missed 16.7 percent of the lines. Figure 10 shows the specific lines of code within the paint() method that the test did not execute.

    Figure 10 reveals that most lines of code were executed 17 times, but the code that handles painting a scrolled set of thumbnails was skipped. With this information in hand, the tester needs to move the scroll bar, or configure an automated test script to move it, to ensure that this piece of code is exercised.
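
    If you would rather not depend on someone remembering to drag the scroll bar, java.awt.Robot can script the interaction as part of an automated run. This is a rough sketch only; the screen coordinates are placeholders and would need to match wherever the thumbnail palette actually appears on screen:

    import java.awt.Robot;
    import java.awt.event.KeyEvent;

    // Rough sketch of scripting a scroll so the scrolled-painting path in
    // paint() is exercised during a coverage run; coordinates are placeholders.
    public static void scrollThumbnailPalette() throws Exception
    {
      Robot robot = new Robot();
      robot.setAutoDelay( 50 );                     // small pause between generated events

      robot.mouseMove( 600, 400 );                  // placeholder: position over the palette
      robot.mouseWheel( 3 );                        // scroll down a few notches

      robot.keyPress( KeyEvent.VK_PAGE_DOWN );      // or drive the scroll bar from the keyboard
      robot.keyRelease( KeyEvent.VK_PAGE_DOWN );
    }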

    Coverage is a powerful profiling tool, because without it, you may miss code that your users will encounter when they use your application in a way that you do not expect (and rest assured, they definitely will).

    Figure 10. A look inside the JThumbnailPalette’s paint() method

    Summary

    As components are built, performance unit tests should be performed alongside functional unit tests. These performance tests look for memory issues and poorly performing code, and they validate test coverage to ensure that the majority of component code is actually being exercised.

    About the Author

    Steven Haines is the author of three Java books: The Java Reference Guide (InformIT/Pearson, 2005), Java 2 Primer Plus (SAMS, 2002), and Java 2 From Scratch (QUE, 1999). In addition to contributing chapters and coauthoring other books, as well as technical editing countless software publications, he is also the Java Host on InformIT.com. As an educator, he has taught all aspects of Java at Learning Tree University as well as at the University of California, Irvine. By day he works as a Java EE 5 Performance Architect at Quest Software, defining performance tuning and monitoring software as well as managing and performing Java EE 5 performance tuning engagements for large-scale Java EE 5 deployments, including those of several Fortune 500 companies.

    Source of this material

    Pro Java EE 5 Performance Management and Optimization
    By Steven Haines



    Published: May 2006, Paperback: 424 pages
    Published by Apress
    ISBN: 1590596102
    Retail price: $49.99
    This material is from Chapter 5 of the book.
