Designing Performance Testing Metrics into Highly Distributed J2EE Applications

By Frank Teti

01 Feb 2004 | TheServerSide.com

Introduction

In ancient computing times, a wise system architect once said, "you can't manage an application you can't monitor." Yet documenting the design of a new application, let alone how it might be monitored, is often treated as beyond the scope of the project. This article describes a reusable mechanism for capturing application-bound performance statistics for highly distributed J2EE applications.

Applications need to be engineered to support established service levels, that is, Service Level Agreements (SLAs). In some cases, an SLA might be very granular, for example stating the elapsed time expected to persist data related to a person. The architecture depicted in Figure 1 was implemented to handle Internet order entry, including a self-service feature that allows a customer to update his own profile.



Figure 1. Distributed J2EE Architecture


Because of expectations of high order volumes, the application was engineered to provide distributed EJB transactions, which can be pooled and clustered for high availability and scalability. Highly distributed applications not only need to be engineered from a distributed object standpoint, but from a load balancing and clustering perspective, too. The architecture does not reflect load balancing and clustering implementation details.

This document describes a basic methodology for performance testing the distributed architecture. Essentially, the document describes the design and edits required to collect basic instrumentation/timing estimates for a distributed J2EE application.


The Distributed Architecture and Time Synchronization

The J2EE application relies on five tiers: Web, EJB, EAI (Enterprise Application Integration), ERP (Enterprise Resource Planning) and DBMS. An Internet client makes an HTTP request to the Web tier, which forwards the request over RMI/IIOP to the EJB tier. The EJB tier communicates with the SeeBeyond EAI tier using a SeeBeyond MUX e*Way. The EAI tier communicates with the Peoplesoft CRM (Customer Relationship Management) application within the ERP tier using a Peoplesoft adapter. Finally, the CRM application integrates with an Oracle DBMS over SQL*net.

A key part of any distributed environment is the time server. Within the methodology, instrumentation metrics are collected on the server where the code is executed, so it is important that all machines within the distributed J2EE architecture are time synchronized. Within the IBM AIX UNIX environment, the xntpd daemon sets the internal clock time on a machine. The service is used to synchronize time services with an external machine using the Network Time Protocol (NTP). Typically, there is a time server, which does not necessarily need to be another AIX or UNIX machine, within an organization's network that all machines synchronize with on startup. (In reality, machines with xntpd configured synchronize constantly, not just on startup.)

Alternatively, a machine can synchronize with an external time server on the Internet. The xntpd daemon defaults to off, so time settings, such as the IP address of the time server to synchronize with, need to be configured within an ntp.conf file. Because instrumentation metrics are collected at the second or even millisecond level, clock discrepancies between machines can lead to spurious performance statistics.
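The effect of an unsynchronized clock can be illustrated with a minimal sketch. The class name, method names and the sample offset used in the usage note are hypothetical and are not part of the application described here. The sketch assumes that one machine's clock runs ahead of the other's by a fixed offset: an elapsed time computed from two timestamps taken on the same machine is unaffected, while a difference between timestamps taken on two different machines absorbs the whole offset.

```java
// Hypothetical illustration of how a clock offset between two machines
// distorts elapsed times computed across machines.
final class ClockSkewDemo {

    private ClockSkewDemo() {
    }

    // Timestamp as reported by a machine whose clock is off by offsetMillis.
    static long reported(long trueMillis, long offsetMillis) {
        return trueMillis + offsetMillis;
    }

    // Elapsed time between two reported timestamps, in milliseconds.
    static long elapsed(long startMillis, long endMillis) {
        return endMillis - startMillis;
    }
}
```

For example, if the EJB host runs 40 ms fast and a request truly leaves the Web host at 1000 and reaches the EJB host at 1010, then elapsed(reported(1000, 0), reported(1010, 40)) yields 50 rather than the true 10. The reply leg is distorted in the opposite direction and can even come out negative.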


Object Specification and Instrumentation Behavior

The proposed J2EE architecture relies on a number of well-known design patterns, such as the Business Delegate (a.k.a. Proxy), Data Access Object, Model-View-Controller, Session Facade and Transfer Object (a.k.a. Value Object). A transfer object is a serializable class that groups related attributes, forming a composite value. In the architecture, the Transfer or Value Object is used as the return type of a remote business method, such as a method on an Enterprise JavaBean. Fetching multiple attributes through the value or transfer object in one server roundtrip decreases network traffic and minimizes latency and server resource usage.

Figure 2 depicts the PersonValueObject, which implements IPersonValueObject and Serializable interfaces. The PersonWorkerBean uses a PersonValueObject for containing person and instrumentation attributes. Instrumentation attributes are long values, which contain a timestamp from calling System.currentTimeMillis(). The PersonValueObject also includes accessor methods for all attributes.
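A minimal sketch of such a value object follows. It is not the code from Figure 2: the lastName attribute is a hypothetical stand-in for the person attributes, and the names T1 through T6 are assumed from the timing attributes discussed later in this article.

```java
import java.io.Serializable;

// Sketch of the interface: one illustrative person attribute (an assumption)
// plus accessors for the six instrumentation timestamps.
interface IPersonValueObject {
    String getLastName();
    void setLastName(String lastName);

    long getT1(); void setT1(long t1);
    long getT2(); void setT2(long t2);
    long getT3(); void setT3(long t3);
    long getT4(); void setT4(long t4);
    long getT5(); void setT5(long t5);
    long getT6(); void setT6(long t6);
}

// Serializable so that it can be passed by value between tiers.
class PersonValueObject implements IPersonValueObject, Serializable {
    private static final long serialVersionUID = 1L;

    private String lastName;
    // Each timestamp holds a value obtained from System.currentTimeMillis().
    private long t1, t2, t3, t4, t5, t6;

    public String getLastName() { return lastName; }
    public void setLastName(String lastName) { this.lastName = lastName; }

    public long getT1() { return t1; }
    public void setT1(long t1) { this.t1 = t1; }
    public long getT2() { return t2; }
    public void setT2(long t2) { this.t2 = t2; }
    public long getT3() { return t3; }
    public void setT3(long t3) { this.t3 = t3; }
    public long getT4() { return t4; }
    public void setT4(long t4) { this.t4 = t4; }
    public long getT5() { return t5; }
    public void setT5(long t5) { this.t5 = t5; }
    public long getT6() { return t6; }
    public void setT6(long t6) { this.t6 = t6; }
}
```

A caller on a given tier would record its timestamp with a call such as vo.setT1(System.currentTimeMillis()).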



Figure 2. UML Class Diagram Value Object Specification (a.k.a Transfer Object)


While it is good OO practice to provide accessor methods instead of allowing direct manipulation of instance variables, doing so also increases overhead by imposing an additional method call. You want accessors to hide the implementation of get and set, and to allow subclasses to change the synchronization of a get or set, but there will be an overhead cost. Value objects can be subclassed, or they may have associations with other value objects. For example, if requirements dictated that all value objects in the application needed to be performance tested, refactoring the instrumentation operations into a "FrameworkValueObject" would be strongly recommended. In this alternative specification to Figure 2, the PersonValueObject would extend the FrameworkValueObject and implement the IPersonValueObject interface; the IPersonValueObject interface would extend the IFrameworkValueObject interface. The operations available through the IPersonValueObject would, however, be the same as depicted in Figure 2 for either specification. The advantage of this model is that all value objects would extend the FrameworkValueObject and therefore have instrumentation behavior. However, this is a complex relationship and, while correct from a UML (Unified Modeling Language) perspective, it would in and of itself impose a performance cost on the application.
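The alternative specification can be sketched as follows. This is an illustration of the inheritance relationships only, under stated assumptions: only T1 and T6 are shown (T2 through T5 would follow the same pattern), and the lastName attribute is hypothetical.

```java
import java.io.Serializable;

// Instrumentation operations shared by every value object.
interface IFrameworkValueObject {
    long getT1();
    void setT1(long t1);
    long getT6();
    void setT6(long t6);
}

// Base class that supplies the instrumentation behavior once.
abstract class FrameworkValueObject implements IFrameworkValueObject, Serializable {
    private static final long serialVersionUID = 1L;

    private long t1;
    private long t6;

    public long getT1() { return t1; }
    public void setT1(long t1) { this.t1 = t1; }
    public long getT6() { return t6; }
    public void setT6(long t6) { this.t6 = t6; }
}

// The person interface inherits the timing operations from the framework interface.
interface IPersonValueObject extends IFrameworkValueObject {
    String getLastName();              // hypothetical person attribute
    void setLastName(String lastName);
}

// The person value object adds only person-specific state.
class PersonValueObject extends FrameworkValueObject implements IPersonValueObject {
    private static final long serialVersionUID = 1L;

    private String lastName;

    public String getLastName() { return lastName; }
    public void setLastName(String lastName) { this.lastName = lastName; }
}
```

Any further value object would gain the timing accessors simply by extending FrameworkValueObject, which is the reuse the paragraph above describes.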

For value objects that have a direct association with other value objects, those relationships are implemented as class instance variables. UML recommends that for value objects that have an aggregation or composition association with other value objects, those relationships should be implemented as a Collection. A Collection is the root interface in the Java Collection hierarchy. A Collection represents a group of objects, known as its elements. This interface is typically used to pass collections around and manipulate them where maximum generality is desired. For example, a PersonValueObject contains one to many AddressValueObjects or a Collection of AddressValueObjects. However, this relationship is not included in the diagram, because it is not related to capturing timing estimates, but it is important to "real world" applications.
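A minimal sketch of the one-to-many relationship, assuming a hypothetical AddressValueObject with a single city attribute. Generics are used for readability even though they postdate the J2EE platform discussed in this article.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;

// Hypothetical address value object with a single illustrative attribute.
class AddressValueObject implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String city;

    AddressValueObject(String city) { this.city = city; }

    String getCity() { return city; }
}

// The aggregation is held as a Collection rather than as individual fields.
class PersonValueObject implements Serializable {
    private static final long serialVersionUID = 1L;

    private final Collection<AddressValueObject> addresses =
            new ArrayList<AddressValueObject>();

    void addAddress(AddressValueObject address) {
        addresses.add(address);
    }

    // Read-only view, so callers cannot change the aggregation directly.
    Collection<AddressValueObject> getAddresses() {
        return Collections.unmodifiableCollection(addresses);
    }
}
```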


Instrumentation Data Collection Process

The objectives of the methodology for this project were:

  • To estimate the total elapsed time to marshal and persist a PersonValueObject within middleware components.

  • To determine which tiers or key processes were the most expensive from a performance perspective and where optimization might be needed.

  • To be able to log timing estimates from multiple tiers in a single log file, which makes performance analysis and reporting simple.

Once the value object was wired for collecting instrumentation/timing, instrumentation edits were required in the J2EE and SeeBeyond application code. This required identifying the objects that would set timing estimates into the value object for the tier, and which object would be responsible for getting the estimates and writing them to a log file. The number of places in the code that timing metrics were collected was, in a way, an arbitrary decision. It was determined that the following objects needed to be reworked:

  • PersonWorkerBean on the Web Tier - responsible for setting timing attributes (T1 and T6) on the PersonValueObject to determine total time spent marshaling the remote Session EJB from the Web tier, and for getting all timing estimates from the PersonValueObject and writing them to a single log file.

  • EJB SessionBean on the EJB Tier - responsible for setting timing estimates (T2 and T5) on the PersonValueObject to determine elapsed time spent on the EJB tier getting a connection to SeeBeyond and parsing the PersonValueObject into a delimited string for SeeBeyond processing.

  • PersonDAO (Data Access Object) on the EJB Tier - responsible for setting timing estimates (T3 and T4) on the PersonValueObject to determine total time spent on the EAI tier.
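The division of responsibility listed above can be sketched in a single process. This is a simplified illustration, not the project's code: the class names ending in Sketch, the TimingRecord class and the injected clock are all hypothetical. In the real application the value object is passed by value between tiers, and T3 and T4 arrive in the reply from the EAI tier rather than being read from a local clock.

```java
import java.util.function.LongSupplier;

// Simplified stand-in for the PersonValueObject: timing attributes only.
class TimingRecord {
    long t1, t2, t3, t4, t5, t6;
}

// Stand-in for the PersonDAO: sets T3 and T4 around the EAI tier call.
class PersonDaoSketch {
    private final LongSupplier clock;

    PersonDaoSketch(LongSupplier clock) { this.clock = clock; }

    void createPerson(TimingRecord vo) {
        vo.t3 = clock.getAsLong();
        // ... the EAI tier would process the request here ...
        vo.t4 = clock.getAsLong();
    }
}

// Stand-in for the EJB SessionBean: sets T2 on entry and T5 on exit.
class SessionBeanSketch {
    private final LongSupplier clock;
    private final PersonDaoSketch dao;

    SessionBeanSketch(LongSupplier clock, PersonDaoSketch dao) {
        this.clock = clock;
        this.dao = dao;
    }

    void create(TimingRecord vo) {
        vo.t2 = clock.getAsLong();
        dao.createPerson(vo);
        vo.t5 = clock.getAsLong();
    }
}

// Stand-in for the PersonWorkerBean: sets T1 and T6 and owns the final record.
class WorkerBeanSketch {
    private final LongSupplier clock;
    private final SessionBeanSketch session;

    WorkerBeanSketch(LongSupplier clock, SessionBeanSketch session) {
        this.clock = clock;
        this.session = session;
    }

    TimingRecord createPerson() {
        TimingRecord vo = new TimingRecord();
        vo.t1 = clock.getAsLong();
        session.create(vo);
        vo.t6 = clock.getAsLong();
        return vo; // the worker would now write all six values to the log
    }
}
```

Passing System::currentTimeMillis as the clock gives wall-clock timestamps; injecting the clock is a design choice made here only so that the nesting of the six timestamps can be checked deterministically.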


Architectural Limitations

The input for testing the application was generated using Mercury Interactive's LoadRunner via an HTML browser; this discussion, however, is out of scope for this report. Internet/network timing estimates, that is, download time from the HTTP server to the browser, are not collected through the value object; intranet download timing estimates must be imputed by estimating elapsed overall response time through LoadRunner minus PersonValueObject timing estimates collected in the framework_log4j.log.

The target of the log4j log output can be a file, an OutputStream, a java.io.Writer, a remote log4j server, a remote Unix Syslog daemon, or even an NT event logger, among many others. The performance of the log4j package is also quite good. On an AMD Duron clocked at 800 MHz running JDK 1.3.1, it costs about 5 nanoseconds to determine whether a logging statement should be logged or not. Actual logging is also quite fast, starting at 21 microseconds using the basic configuration.

It is important to point out that the purpose of the methodology is not memory and CPU application profiling as would be performed by tools such as JProbe from Quest Software or Optimizeit from Borland. The only tools that are engineered for distributed performance monitoring similar to the methodology discussed here are PathWAI Dashboard from Candle Corp. and Introscope from Wily Technology.



Figure 3. UML Sequence Diagram - Create a PersonValueObject


A successful execution of the application, from Step 1 (createPerson on the PersonWorkerBean) to Step 23 (writing the timing data to a Log4j log file), would result in a single row that can be parsed into the following matrix. Timing estimates are collected in milliseconds. The table below depicts the elapsed time for processes within the application tiers; which processes to report is a somewhat arbitrary decision based on requirements. In effect, various permutations and combinations of processing times can be estimated, as long as the analyst/programmer has a solid understanding of the context in which each timestamp was captured. For example, 250 represents the total elapsed time marshaling the PersonValueObject from the Web tier through the entire application, whereas the Incremental Elapsed Time column is an imputed estimate for processing within tiers.
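One plausible way to derive such a matrix from the six timestamps is sketched below. The class and method names are hypothetical, and the formulas are an assumption based on which object sets each timestamp, not a reproduction of the original table: T6 - T1 is the total seen from the Web tier, T5 - T2 is the time spent in the EJB tier and below, and T4 - T3 is the time spent in the EAI tier and below. Incremental times are obtained by subtracting each inner span from the span that encloses it.

```java
// Hypothetical helper that turns six timestamps (milliseconds) into
// cumulative and incremental elapsed times per tier.
final class ElapsedTimes {
    final long total;        // T6 - T1: whole round trip seen from the Web tier
    final long ejbAndBelow;  // T5 - T2: EJB tier plus everything it calls
    final long eaiAndBelow;  // T4 - T3: EAI tier plus everything it calls

    ElapsedTimes(long t1, long t2, long t3, long t4, long t5, long t6) {
        this.total = t6 - t1;
        this.ejbAndBelow = t5 - t2;
        this.eaiAndBelow = t4 - t3;
    }

    // Time attributable to the Web tier and the Web-to-EJB hop.
    long webIncremental() {
        return total - ejbAndBelow;
    }

    // Time attributable to the EJB tier and the EJB-to-EAI hop.
    long ejbIncremental() {
        return ejbAndBelow - eaiAndBelow;
    }

    // Time attributable to the EAI tier and the tiers behind it.
    long eaiIncremental() {
        return eaiAndBelow;
    }
}
```

With invented sample timestamps 1000, 1020, 1050, 1200, 1230 and 1250, chosen only so that the total matches the 250 ms figure mentioned above, the incremental times come to 40, 60 and 150 ms, which add up to the total of 250 ms.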


Implementation Details

Step 1: Implement instrumentation code edits for the IPersonValueObject Interface for the PersonValueObject Class (see Figure 2).

Step 2: Implement instrumentation code edits for the PersonValueObject Class (see Figure 2).

Step 3: Implement instrumentation code edits for the log4j.xml file, which sets the priority level for application logging to the framework_log4j.log file. log4j relies on a configuration file for runtime settings. It also provides the ability to set the log levels dynamically (e.g. DEBUG, INFO, WARN).

 <?xml version="1.0" encoding="UTF-8"?>
 <!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
 <log4j:configuration debug="true">
   <appender name="TEMP" class="org.apache.log4j.FileAppender">
     <param name="File" value="framework_log4j.log"/>
     <layout class="org.apache.log4j.PatternLayout">
       <param name="ConversionPattern" value="%d - %m%n"/>
     </layout>
   </appender>
   <category name="proofofconcept">
     <priority value="info"/>
     <appender-ref ref="TEMP"/>
   </category>
 </log4j:configuration>

Step 4: Implement instrumentation code edits for the PersonWorkerBean (see Figure 3). The PersonWorkerBean logs the timestamps to the framework_log4j.log file by invoking the info method on a static reference to the Log4j Logger class. There are really two key classes: the PropertyConfigurator and the Logger. The PropertyConfigurator.configure method is used to pass the location of the configuration file. (PropertyConfigurator reads properties-format files; for an XML file such as the log4j.xml in Step 3, log4j provides the DOMConfigurator class for the same purpose.) The Logger is used to write the logging messages, to set the logging level and to identify which classes in the application you want to log errors for. The recommended pattern for extending the Logger class within Log4j is wrapping it or holding a reference to it; subclassing the Logger class is strongly discouraged. Log4j provides the ability to set the log levels dynamically (e.g. DEBUG, INFO, WARN) or to provide the level through a configuration file. By setting the level to "info", all messages of "info" and above will be written to the log file. Provided that the application does not throw any exceptions, there should be only one "info" message written to the file, which includes timing estimates for all middleware tiers.
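The single "info" message could be assembled along the following lines. This is a sketch under stated assumptions: the TimingLogRow class and the "T1=...|T2=..." layout are hypothetical, since the article does not specify the row format. The log4j calls appear only in comments so that the block compiles without the log4j library; Logger.getLogger and Logger.info are part of the log4j 1.2 API, and the category name is taken from the configuration file in Step 3.

```java
// Hypothetical helper that formats the timestamps into one log row.
final class TimingLogRow {

    private TimingLogRow() {
    }

    // Joins the timestamps as "T1=<ms>|T2=<ms>|..." (an assumed layout).
    static String format(long... timestamps) {
        StringBuilder row = new StringBuilder();
        for (int i = 0; i < timestamps.length; i++) {
            if (i > 0) {
                row.append('|');
            }
            row.append('T').append(i + 1).append('=').append(timestamps[i]);
        }
        return row.toString();
    }

    // With log4j 1.2 on the classpath, the PersonWorkerBean could then do:
    //
    //   private static final org.apache.log4j.Logger LOG =
    //           org.apache.log4j.Logger.getLogger("proofofconcept");
    //   ...
    //   LOG.info(TimingLogRow.format(t1, t2, t3, t4, t5, t6));
}
```

Keeping all six values in one message means each successful request produces exactly one row, which keeps the later analysis a matter of splitting on a single delimiter.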

Step 5: Implement instrumentation code edits for the PersonServiceSessionBean (see Figure 3). The PersonSessionBean create method receives a serialized copy of the PersonValueObject, sets the timing estimate into the PersonValueObject and calls createPerson on the PersonDAO. In the model, the DAO is used to get a connection to the SeeBeyond EAI application server, and the PersonValueObject is not directly persisted into a DBMS. In most J2EE applications, by contrast, the Data Access Object pattern is used to specify how to persist data into a DBMS.

Step 6: Implement instrumentation code edits for the PersonDAO class (see Figure 3). The createPerson method on the PersonDAO sends the PersonValueObject to the SeeBeyond application server using a MUX e*Way, a synchronous protocol. For this implementation, the PersonValueObject attributes were sent as a text message, because the SeeBeyond application server only supported text messages. The returning message from SeeBeyond, because it represented a string, not a PersonValueObject, needed to be parsed using a StringTokenizer. This edit for instrumentation code assumes that there are three tokens in the format "Id^timeStamp1^timeStamp2". The Id is generated by SeeBeyond and represents the primary key for the PersonValueObject created.

The SeeBeyond application server provided timestamps for integration with the Peoplesoft CRM. SeeBeyond also returned the generated primary key, which would be required for persisting associated value objects, such as Address value objects. This is not modeled in the UML sequence diagram, because the central theme is how to collect timing estimates within disparate computing tiers.

 // Builds a PersonValueObject from the SeeBeyond reply "Id^timeStamp1^timeStamp2".
 // Requires: import java.util.StringTokenizer;
 protected IPersonValueObject createValueObject( String rs )
 {
   IPersonValueObject valueObject = null;
   try
   {
     valueObject = getPersonValueObject();
     if ( valueObject == null ) // create one
     {
       valueObject = new PersonValueObject();
     }
     StringTokenizer st = new StringTokenizer( rs, "^" );
     // Token 1: primary key generated by SeeBeyond
     PersonPrimaryKey pk = new PersonPrimaryKey( st.nextToken() );
     valueObject.setPrimaryKey( pk );
     // Tokens 2 and 3: timestamps supplied by SeeBeyond
     valueObject.setT3( Long.parseLong( st.nextToken() ) );
     valueObject.setT4( Long.parseLong( st.nextToken() ) );
   }
   catch ( Exception exc )
   {
     printMessage( "PersonDAO:createValueObject()-" + exc );
   }
   return( valueObject );
 }
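The method above relies on a broad catch block to handle a malformed reply. A stricter variant, shown here as a standalone sketch rather than as the project's code, checks the token count before reading any token. The EaiReply class is hypothetical; only the "Id^timeStamp1^timeStamp2" format comes from the article.

```java
import java.util.StringTokenizer;

// Hypothetical holder for the three fields of the EAI tier reply.
final class EaiReply {
    final String id;
    final long t3;
    final long t4;

    private EaiReply(String id, long t3, long t4) {
        this.id = id;
        this.t3 = t3;
        this.t4 = t4;
    }

    // Parses "Id^timeStamp1^timeStamp2"; throws IllegalArgumentException
    // (NumberFormatException is a subclass) if the reply is malformed.
    static EaiReply parse(String reply) {
        if (reply == null) {
            throw new IllegalArgumentException("reply is null");
        }
        StringTokenizer st = new StringTokenizer(reply, "^");
        if (st.countTokens() != 3) {
            throw new IllegalArgumentException(
                    "expected 3 tokens but found " + st.countTokens());
        }
        String id = st.nextToken();
        long t3 = Long.parseLong(st.nextToken().trim());
        long t4 = Long.parseLong(st.nextToken().trim());
        return new EaiReply(id, t3, t4);
    }
}
```

Because StringTokenizer skips empty tokens, a reply with a missing field such as "^1050^1200" yields only two tokens and is rejected by the count check.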


Architectural Implications

Ideally, the best design would allow for a ValueObject to be passed to all middleware tiers. SeeBeyond on the EAI tier actually has asynchronous support using JMS (Java Message Service) text messages, but not for JMS ObjectMessage type messages. Most implementers would prefer ObjectMessage type messages. For example, with an ObjectMessage type using JMS, the PersonValueObject would not have been parsed into a String object, but could have been sent directly to SeeBeyond. ObjectMessages, that is, messages containing a serialized Java object, make it clear to programmers which data type and attributes they are receiving and sending. With support for JMS ObjectMessages, the PersonEAI object depicted in Figure 4 could have been implemented in SeeBeyond, which could have marshaled the PersonValueObject directly. For distributed object programmers, this is what it is all about: transferring objects around the network.
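What "a serialized Java object" means in practice can be sketched with plain Java serialization, which is the mechanism an ObjectMessage body relies on. The sketch deliberately avoids the JMS API so that it runs without a JMS provider; SerializationSketch and the two-attribute PersonValueObject are hypothetical. With JMS, the sender would instead call Session.createObjectMessage with the value object and the receiver would call ObjectMessage.getObject.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical value object reduced to two attributes for illustration.
class PersonValueObject implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String lastName;
    private final long t1;

    PersonValueObject(String lastName, long t1) {
        this.lastName = lastName;
        this.t1 = t1;
    }

    String getLastName() { return lastName; }
    long getT1() { return t1; }
}

// Round trip through bytes, standing in for the trip across the network.
final class SerializationSketch {

    private SerializationSketch() {
    }

    static byte[] toBytes(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new IllegalStateException("serialization failed", e);
        }
        return buffer.toByteArray();
    }

    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("deserialization failed", e);
        }
    }
}
```

The receiver gets back a typed PersonValueObject with its attributes intact, with no delimiter convention or token order to agree on, which is the advantage over the text message approach described above.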



Figure 4. View of participating classes for UML Sequence Diagram - Create a PersonValueObject


About the Author

Frank is a Lead Architect specializing in J2EE application design and implementation of enterprise-wide applications. He can be reached at frank_teti@hotmail.com.


Other related articles by Frank Teti:

When SOAP Just Won't Do, Network Computing, 11/03
http://www.networkcomputing.com/showitem.jhtml?docid=1424ws1

Design for EJB Message-Driven Beans with UML Use Cases, Enterprise Architect, Winter 2003
http://www.ftponline.com/ea/magazine/winter/columns/modeling/

The new world of clustering, Application Development Trends, 12/01
http://www.adtmag.com/article.asp?id=5752

Boosting Java Performance, Application Development Trends, 12/00
http://www.adtmag.com/article.asp?id=2667
