New Article 'Improving J2EE Application Performance' Posted


  1. A new TSS article by Scott Marlow, "Improving J2EE Application Performance", describes how to achieve a high level of performance in a J2EE application. It lays out a structured approach that ranges from broad strokes (monitoring J2EE Application Server resource usage) to fine strokes (finding bottlenecks in the application).

    Read "Improving J2EE Application Performance"

    Threaded Messages (21)

  2. Great article. A test environment is extremely important, especially once the initial application goes into production. This is important because when the next addition to the application comes down from the business folks, and you can't interrupt production, where do you test? The little QA environment that doesn't simulate the production env? Definitely plan for performance up front and have somewhere to test when you add or change functionality to your baseline. cb
  3. I have written a free Java performance monitoring tool called JAMon that I thought participants in this thread
    would be interested in. It couldn't be easier to use and provides a wealth of information,
    including performance and scalability statistics and more. It includes capabilities that a number of the emails mention, including monitoring of production Java code, min/max/avg/std dev for response times, runtime disabling, ...
    A link with typical sample data:
    To learn about and download JAMon go to:
    If you know of anyone who wants to monitor their code, please refer them
    to this link. Thanks. Let me know what you think.
    JAMon is being hosted by Jack Shirazi, author of the O'Reilly book "Java
    Performance Tuning" -
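
The kind of statistics mentioned above (min/max/avg/std dev for response times) can be sketched in plain Java. This is not JAMon's API; the class name `ResponseTimeStats` and its methods are hypothetical, and the sketch only illustrates how such numbers could be accumulated without storing every sample.

```java
// Hypothetical illustration only; JAMon's real classes and methods differ.
class ResponseTimeStats {
    private long count;
    private double mean;
    private double m2;                       // running sum of squared deviations
    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;

    // Record one response time in milliseconds (running update, no sample list kept).
    synchronized void record(double millis) {
        count++;
        double delta = millis - mean;
        mean += delta / count;
        m2 += delta * (millis - mean);
        if (millis < min) min = millis;
        if (millis > max) max = millis;
    }

    synchronized long getCount()     { return count; }
    synchronized double getMin()     { return count == 0 ? 0.0 : min; }
    synchronized double getMax()     { return count == 0 ? 0.0 : max; }
    synchronized double getAverage() { return mean; }

    // Sample standard deviation; 0 until at least two values are recorded.
    synchronized double getStdDev() {
        return count < 2 ? 0.0 : Math.sqrt(m2 / (count - 1));
    }

    public static void main(String[] args) throws InterruptedException {
        ResponseTimeStats stats = new ResponseTimeStats();
        for (int i = 0; i < 5; i++) {
            long start = System.nanoTime();
            Thread.sleep(10);                // stands in for the code being measured
            stats.record((System.nanoTime() - start) / 1_000_000.0);
        }
        System.out.println("hits=" + stats.getCount()
                + " min=" + stats.getMin() + " max=" + stats.getMax()
                + " avg=" + stats.getAverage() + " stddev=" + stats.getStdDev());
    }
}
```

The methods are synchronized so that several request threads can record into one instance, which is the situation a server-side monitor would face.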
  4. Steve, thank you for the JAMon tool...


    I tried JAMon and found it useful for measuring performance in server code. I also used it in some standalone (multi-threaded) utilities.

    My email address is no longer smarlow at silverstream dot com, it is now smarlow at novell dot com.

  5. new contact information

    I got to use your JAMon tool again recently, still pretty cool! I am now at Vestmark and can be reached at smarlow at vestmark dot com -Scott
  6. Blah blah blah. Honestly, very little good stuff in that article. Nothing we haven't seen many times over.
  7. See, Tracy, there is always something new for some people, even in articles like this one.
  8. Tracey, reinforcing good practice is a good idea.

    Our project is currently in the performance testing phase. Therefore, this article does a lot to support the effort and helps ensure that we didn't miss any steps. Not everyone related to a project will know what "you" take for granted.
  9. Did anybody notice the "kill -3" gem? Didn't know about
    this one!
  10. Yes, kill -3 is really, really neat. It can tell you a lot when you want to know what your app is up to. I found out about the feature last year and within minutes I was able to spot a bug that had been haunting our application for weeks!

  11. kill -3??? excuse the stupid question (tracy) what is it? how can I use it?

    thanks in advance
  12. Good Article.

    I have some questions about our performance measurement. At GE, we have a lot of J2EE applications which go through QA. For load testing we use the WebLoad tool, and for performance we use JProbe.

    The performance test box is a Sun Netra running Solaris with 512 MB RAM, but the production boxes are Sun E250s running Solaris with 2 GB RAM.

    How do we measure performance in the test environment? Does performance relate to RAM or CPU (from a hardware standpoint)? Is a test done on the 512 MB RAM box valid?


  13. Lawrence:
    Assuming everything else is equal (database speed, processor speed and network latency), the test on the 512MB box is definitely valid. It may not be directly comparable, but you can assume that any test you run on your test environment will perform the same or better when it has more memory to work with in production.

    Performance relates to both CPU and RAM. Faster processors and more RAM usually result in better response times and more load handled.

    Hope this helps-
    Caryn Eldridge
  14. Pedro,

    Read step five in the performance article for more information about using the "kill -3" command to get the Java stack trace for all threads.

    The following summary text is clipped from

    What is a Java stack trace? A Java stack trace is a user-friendly snapshot of the threads and monitors in a Java Virtual Machine (JVM). Depending on how complex your application or applet is, a stack trace can range from fifty lines to thousands of lines of diagnostics.
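
On Unix, `kill -3 <pid>` sends SIGQUIT to the JVM process, which makes it print the stack traces of all threads to its console output without stopping the application. When signalling the process is not convenient, a similar snapshot can be collected from inside the application. The following is a minimal sketch, assuming Java 5 or later (for `Thread.getAllStackTraces()`); the class and method names are hypothetical, and unlike a real thread dump it does not list monitor ownership.

```java
import java.util.Map;

// Hypothetical helper: an in-process approximation of a "kill -3" thread dump.
class ThreadDumpSketch {

    // Builds a text snapshot of every live thread and its current stack frames.
    static String dumpAllThreads() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            out.append('"').append(thread.getName()).append('"')
               .append(" state=").append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

Taking several snapshots a few seconds apart and comparing them is a common way to spot threads that stay blocked at the same frame.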

  15. Did you guys notice that the author makes a mistake here?
    "When the rates change, simply replace the old array with an updated one (object assignments are atomic in Java)."

  16. Qing Yan,

    I double-checked the Java Language specification to verify that object assignments are atomic. See

    The following text is mentioned in the Java Language specification:
    1. A lock action acts as if it flushes all variables from the thread's working memory; before use they must be assigned or loaded from main memory.

    2. If a thread is to perform an unlock action on any lock, it must first copy all assigned values in its working memory back out to main memory.

    3. In the absence of explicit synchronization, an implementation is free to update the main memory in an order that may be surprising. Therefore the programmer who prefers to avoid surprises should use explicit synchronization.

    To find out whether a one-line assignment is safe without a synchronization lock, you have to carefully read the Java language specification. Even after you have read it, you may still wonder if it is safe to assign a shared object variable a new value without locking the shared object that contains the variable. You could wrap the shared object variable read/write operations with a class that deals with copying the thread working memory to/from main memory (the wrapper class would enter/exit a synchronized block on every read and write).

    You are correct to point out that updating a shared object variable with a new object instance is not a simple operation (I hope this is what you meant by the mistake). Each thread needs to enter/exit a synchronized block before the update takes effect for that thread. We removed the text from the article to avoid any further confusion.

    Scott Marlow
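
The wrapper idea described in the reply above can be sketched as follows. This is a minimal illustration under assumptions, not code from the article: the class name `SharedRates` and its methods are hypothetical. Both the reader and the writer go through synchronized methods on the same object, so a reader that enters after the writer has left is guaranteed to see the new array.

```java
// Hypothetical wrapper: all access to the shared array goes through one lock.
class SharedRates {
    private double[] rates;                  // guarded by the lock on "this"

    SharedRates(double[] initial) {
        this.rates = initial.clone();
    }

    // Readers take the same lock as the writer, so they see the latest array.
    synchronized double rateAt(int index) {
        return rates[index];
    }

    // Swaps in a whole new array; the old one is never modified in place.
    synchronized void replaceAll(double[] updated) {
        this.rates = updated.clone();
    }

    public static void main(String[] args) throws InterruptedException {
        SharedRates shared = new SharedRates(new double[] {1.10, 1.25});
        Thread updater = new Thread(() -> shared.replaceAll(new double[] {1.15, 1.30}));
        updater.start();
        updater.join();
        System.out.println(shared.rateAt(0));
    }
}
```

Under the revised memory model introduced with Java 5, declaring the field `volatile` is another way to publish a replacement array safely; the synchronized wrapper is shown here because it is the approach the reply describes.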
  17. Good article. I followed all the links in the article and found them useful with one exception. The Introduction to Java StackTrace proved to be utterly useless!
  18. I've just completed a contract to improve the performance of a French government website based on WL 6.1.

    They started out with two problems: a team composed of programmers with little or no experience of Java and J2EE technology, and no requirement to make performance an essential part of development.

    From this I'd like to make two points. The developers didn't really have any concept of threads, shared data etc. As a result the application just wouldn't run multi-user. But of course they'd never tested in this environment as everyone worked with a local WL instance effectively testing their code single user.

    The second point, and this is the J2EE project I've worked on that has suffered this. Performance work and testing have to be done from the start... this is of course a doctrine of XP and a very good one too IMHO. By the end of the project the mistakes have been made and there is often little beyond a major rewrite that can be done.

    So to begin with:

    1. In a J2EE project no code should be accepted that does not run multiuser without errors. Build use-case tests as you go along with OpenSTA or Cactus, and include multiuser tests. People shouldn't even check code into source control if it does not pass these tests (well, CVS; you are using CVS, aren't you? What, still paying $$$$ for source control?).

    The point of these tests is not to tune the performance but to ensure that the code runs multi-user without bugs. Discourage your developers from spending huge amounts of time at the front end tuning code that may not be on the critical path (remember the good old 80:20 rule).

    2. In the most recent project someone on the team had built a 'JSP' framework that sat above all JSPs and shared the request and response objects! The first use case I ran with 2 users failed horribly. So secondly, don't reimplement the functionality provided by your application server. And don't let inexperienced developers embark on flights of fancy.
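
The kind of multiuser check described in point 1 can be sketched in plain Java. This is an illustration under assumptions rather than an OpenSTA or Cactus test: `MultiUserCheck`, `failedUsers`, and `SharedRequestPage` are hypothetical names, and the deliberately broken page imitates the shared-request mistake from point 2.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class MultiUserCheck {

    // Runs the same use case from several threads at once and returns how many
    // simulated users hit an error. A user stops at its first error.
    static int failedUsers(int users, int iterations, Runnable useCase) {
        ExecutorService pool = Executors.newFixedThreadPool(users);
        CountDownLatch startGate = new CountDownLatch(1);
        AtomicInteger failed = new AtomicInteger();
        for (int u = 0; u < users; u++) {
            pool.execute(() -> {
                try {
                    startGate.await();       // release all users together
                    for (int i = 0; i < iterations; i++) {
                        useCase.run();
                    }
                } catch (Throwable t) {
                    failed.incrementAndGet();
                }
            });
        }
        startGate.countDown();
        pool.shutdown();
        try {
            if (!pool.awaitTermination(60, TimeUnit.SECONDS)) {
                pool.shutdownNow();
                throw new IllegalStateException("simulated users did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for users", e);
        }
        return failed.get();
    }

    // Deliberately broken: keeps per-request data in a field shared by all users.
    static class SharedRequestPage {
        private String currentUser;

        String render(String user) {
            currentUser = user;
            Thread.yield();                  // widen the window for interleaving
            return "Hello " + currentUser;
        }
    }

    public static void main(String[] args) {
        SharedRequestPage page = new SharedRequestPage();
        int failed = failedUsers(8, 10_000, () -> {
            String user = Thread.currentThread().getName();
            if (!page.render(user).equals("Hello " + user)) {
                throw new IllegalStateException("saw another user's data");
            }
        });
        System.out.println(failed + " of 8 simulated users saw errors");
    }
}
```

Because the failure depends on thread interleaving, a clean run does not prove the code is safe; the point of such a check is to catch code that obviously cannot run multiuser before it reaches source control.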

    A note on OpenSTA: this is an open source load testing tool which is quite useful for building end-to-end use cases (especially because it is free). I found that a lot of the data reported by OpenSTA was bogus, like Responses/Second. In the end I got this information quite simply from the server log files (2-server WL cluster with Apache for static pages and load balancing).
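
Getting responses per second out of server log files can be sketched as below. The post does not say which log format was used, so this assumes access-log lines in Common Log Format, where the timestamp sits between square brackets with one-second resolution. The class name and the sample lines are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; assumes "[dd/MMM/yyyy:HH:mm:ss zone]" timestamps per line.
class RequestsPerSecond {

    // Counts log lines per timestamp. Seconds with no requests do not appear.
    static Map<String, Integer> countBySecond(List<String> logLines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : logLines) {
            int open = line.indexOf('[');
            int close = line.indexOf(']', open + 1);
            if (open < 0 || close < 0) {
                continue;                    // skip lines without a timestamp
            }
            counts.merge(line.substring(open + 1, close), 1, Integer::sum);
        }
        return counts;
    }

    // Highest number of requests seen within any single second.
    static int peak(Map<String, Integer> counts) {
        int peak = 0;
        for (int n : counts.values()) {
            peak = Math.max(peak, n);
        }
        return peak;
    }

    public static void main(String[] args) {
        // Made-up sample lines, not taken from the system discussed in the thread.
        List<String> lines = Arrays.asList(
            "10.0.0.1 - - [11/Jan/2003:10:15:01 +0100] \"GET /index.html HTTP/1.0\" 200 1043",
            "10.0.0.2 - - [11/Jan/2003:10:15:01 +0100] \"GET /search HTTP/1.0\" 200 2310",
            "10.0.0.1 - - [11/Jan/2003:10:15:02 +0100] \"GET /index.html HTTP/1.0\" 200 1043");
        Map<String, Integer> counts = countBySecond(lines);
        System.out.println(counts + " peak=" + peak(counts));
    }
}
```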

    However, max response times were useful. In the application I was testing we would get an average response time of, say, 1 second per page under a 400 VU load, but would often get rogue pages taking 15 - 20 seconds. I assumed they were waiting on monitors, but they were quite few and far between and difficult to trace with thread dumps.

    I was also never able to load the cluster above 35% CPU utilization during testing. In theory, the ideal is to get 100% CPU load (i.e. CPU time == real time), and this will give you your max number of simultaneous users. Below 100% you are waiting on I/O or monitors. Make sure you have more than 1 host to run tests from and that you are on an isolated LAN segment. Both are difficult things to get out of management, who seem to view all testing as superfluous in these Internet development days.

    Probably the biggest thing you can do to affect Java execution speeds is change the VM. The latest Hotspot VMs offer far better performance on synchronized data where there is no contention. So you can either pull out your Hashtables and Vectors and replace them with ArrayLists and HashMaps or simply swap to a later VM.
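
The collection swap mentioned above can be sketched as follows, assuming a generics-capable Java version. The class and method names are hypothetical. One caveat worth keeping in mind: `ArrayList` and `HashMap` do no locking of their own, so the swap is only safe for data that is not shared between threads; shared data still needs some form of synchronization.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example of replacing the synchronized legacy collections.
class CollectionSwapSketch {

    // Built and read by a single thread, so no locking is needed.
    static List<String> pageNames() {
        List<String> names = new ArrayList<>();          // previously: new Vector<>()
        names.add("home");
        names.add("search");
        return names;
    }

    // Shared between request threads, so it is still wrapped for thread safety.
    static Map<String, Integer> sharedHitCounts() {
        return Collections.synchronizedMap(new HashMap<>()); // previously: new Hashtable<>()
    }

    public static void main(String[] args) {
        Map<String, Integer> hits = sharedHitCounts();
        for (String page : pageNames()) {
            hits.merge(page, 1, Integer::sum);
        }
        System.out.println(hits);
    }
}
```

Declaring variables against the `List` and `Map` interfaces keeps the choice of implementation in one place, which makes this kind of swap a small change.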

    Just some thoughts,

    David George
  19. David,

    Could you explain to us why you were not able to load the cluster above 35% CPU utilization during testing?
    When we first designed the RUBiS benchmark we noticed that the database was the bottleneck. After a redesign to unload the DB, the app server was the bottleneck, but in any case our bean code (SB or EB) was less than 2% of the overall execution time, leaving very little room for optimization (you can see the results in this article).
    I think that one of the problems is that most of the time you will also need a cluster of clients to stress a clustered app. In RUBiS, we already needed up to 4 client nodes to stress a single app server, depending on the implementation we used. We noticed problems emulating a large number of clients from a single node. Any comments on that?

  20. Could you explain to us why you were not able to load the cluster above 35% CPU utilization during testing?

    I've been thinking about this. The first problem was that the client really lacked resources. As a consequence they were unable to provide a separate LAN segment for testing (yes, I know this is hard to believe, but it was impossible to get a faulty RJ45 plug replaced in the 2 months I was there).

    So the obvious candidate is that I was just not able to load the system sufficiently. It took me 3 of the 4 weeks I had to test the system just to get it to run without errors and support more than 1 user, so I didn't have much time to investigate.

    It wasn't the DB, I checked that. Also I was concerned, like you, that the test client should have been distributed across more machines. I ran two tests, one accessing random static pages and the other accessing pages which in turn went to the database via EJBs.

    For the static test I was able to get WebLogic 6.1 to process around 200 requests per second; with the EJB test the maximum throughput was closer to 10 requests per second. So I figure that my client was quite capable of loading the server to a greater degree than I achieved. But I would still agree with your conclusion that to perform real tests you need to have more than 1 host for your exclusive use.

    I still suspect that there was a problem with the application architecture, but without something like Introscope it was hard to tell. I ran JProbe but found it too intrusive for this level of testing. It is a fine product for micro-performance tuning, however.


  21. If you are interested in J2EE Performance tuning, take a look at, a new J2EE Performance Monitor (30-Day Trial). We have seen clients improve performance over 60% in 2 weeks, as JView 2004 shows the data in an easy-to-read format, so even inexperienced performance tuners can get results pretty quickly.
  22. Link has moved!

    The article mentioned has been moved. Here is the new address (as of 2007/01/11). regards, michael