Best Practices for J2EE Performance Engineering


  1. Best Practices for J2EE Performance Engineering (29 messages)

    Learn all about the best practices for performance engineering in J2EE Web applications. Darren Broemmer, author of "J2EE Best Practices" (Wiley), looks at the role of performance in the development process and discusses techniques to optimize the architecture, design, and code of J2EE application components such as Entity Beans and Message-Driven Beans, as well as XML processing.

    Read J2EE Best Practices for Performance.

    Darren Broemmer will also be speaking at TheServerSide Symposium.

    Threaded Messages (29)

  2. typical error

    On page 11 the author states that the following statement is
    inefficient because of the creation of many temporary String objects:

    String result = value1 + value2 + value3 + value4;

    I read this in "Performance" books very often. But I think it's either obvious (so why repeat it in every book?) or, even worse, not true.

    From the Java Api doc of class StringBuffer:

    "... String buffers are used by the compiler to implement the binary string
    concatenation operator +. For example, the code:

         x = "a" + 4 + "c"

    is compiled to the equivalent of:

         x = new StringBuffer().append("a").append(4).append("c")
                               .toString()
    ..."
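
    A quick, self-contained check (my own sketch; the class name is made up) confirms that the single expression and the explicit StringBuffer chain produce the same result:

```java
// Sketch: the compiler turns the first expression into the
// equivalent of the second, so both yield the same String.
public class ConcatDemo {
    public static void main(String[] args) {
        String x1 = "a" + 4 + "c";
        String x2 = new StringBuffer().append("a").append(4).append("c")
                                      .toString();
        System.out.println(x1.equals(x2)); // prints "true"
    }
}
```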

    Ingo
  3. typical error

    Ingo:

    In my understanding, the use of StringBuffer is still the efficient path: the + operator compiles down to StringBuffer append() method invocations rather than String concat() calls. The StringBuffer Javadoc, directly below your snippet, goes on to state:

    "... creates a new string buffer (initially empty), appends the string representation of each operand to the string buffer in turn, and then converts the contents of the string buffer to a string. Overall, this avoids creating many temporary strings. "

    The last statement was the key for me, and simply part of a larger point that I was trying to make about the performance implications (via garbage collection) of sections of code that create many small, temporary objects. I agree with you that the use of StringBuffer vs. String is a detailed optimization that is rehashed many times in the literature ... it was only intended for use in the larger discussion of when to perform optimizations, establish development best practices, and analyze sections of code with regard to performance.

    Darren Broemmer
  4. String vs. StringBuffer

    You are right, and the way I like to illustrate the difference to my students is through a simple example: reading characters from a stream (e.g. a file). Consider a file containing the alphabet: abcdefg...

    If you read this in character by character into a String:
    String s = "";
    while ( ... ) {
        s += c;
    }

    In the end, the temporary Strings in memory look like this:
    a
    ab
    abc
    abcd
    abcde
    ...

    If, on the other hand you use a StringBuffer:

    StringBuffer sb = new StringBuffer();
    while ( ... ) {
        sb.append( c );
    }
    String s = sb.toString();

    You do not have any of those temporary Strings (no "a", "ab", "abc"), just the final String.
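
    To make the alphabet example concrete, here is a self-contained sketch (my own class name; it reads from an in-memory string rather than a file so it runs as-is):

```java
public class BuildDemo {
    public static void main(String[] args) {
        String input = "abcdefg";

        // Approach 1: String concatenation. Each += allocates a new
        // String, leaving "a", "ab", "abc", ... behind as garbage.
        String s1 = "";
        for (int i = 0; i < input.length(); i++) {
            s1 += input.charAt(i);
        }

        // Approach 2: StringBuffer. Characters accumulate in one
        // internal buffer; only the final toString() creates a String.
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < input.length(); i++) {
            sb.append(input.charAt(i));
        }
        String s2 = sb.toString();

        System.out.println(s1.equals(s2)); // prints "true"
    }
}
```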

    Hope that helps!
  5. String vs. StringBuffer

    There is a difference between:

    1)
    s = "a" + "b" + "c"

    and

    2)
    s = "a"
    s += "b"
    s += "c"

    1) is OK in terms of temporary Strings.
    2) is not.

    Don't mix the two cases.
    If 2) had been stated in the book, I wouldn't have commented on it.
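
    A sketch of the two cases side by side (my own class name; note that with compile-time constants like these, javac actually folds case 1 into the single literal "abc", so no buffer is created at runtime at all):

```java
public class TwoCases {
    public static void main(String[] args) {
        // Case 1: one expression - compiled as a single StringBuffer
        // chain (or folded to "abc" outright for constant operands).
        String s1 = "a" + "b" + "c";

        // Case 2: three statements - each += compiles to a *new*
        // StringBuffer plus a new String, so temporaries pile up.
        String s2 = "a";
        s2 += "b";
        s2 += "c";

        System.out.println(s1.equals(s2)); // prints "true"
    }
}
```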

    Ingo
  6. Red Herrings

    Why, oh why, do people get so hung up on this StringBuffer stuff?? If I ever see a Java enterprise application whose performance is dominated by *String manipulation*, I will be .... well, I don't know what I'll be ... it's simply not going to happen.

    Oh, I don't mean string manipulation that happens in an XSLT processor, or a SOAP library, or an O/R tool's SQL rendering ... libraries like these should always be using StringBuffers. But in *business* code??

    You *regularly* see completely hilarious things like this:


      if ( log.isDebugEnabled() ) {
         log.debug( "Retrieving User: " + user.getName() );
      }

      preparedStatement.executeQuery(); ///!!!!!!!!!!!


    Please, everyone! The above is an abomination. It takes three (!) lines of code to write a log message (in order to avoid a single string concatenation). Immediately following that, it hits the database.

    Remember the Rules:
    (1) Don't do it
    (2) (For experts) Do it later.

    sorry...I had to vent....
  7. Red Herrings

    You're absolutely correct that things like database access are much more of a concern (in terms of scalable performance) than the concatenation of strings. Our company has worked hands-on with many (probably >100) different J2EE applications that were suffering serious scalable performance issues, and it is safe to say that the overwhelming majority (85%?) were bottlenecked on the database. Only one was related to inefficient use of CPU/memory, and a profiler found it instantly (96% of the entire JVM CPU time, across 2x Sun E10000 domains, was spent in one method).

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Easily share live data across a cluster!
  8. Red Herrings

    \Purdy\
    You're absolutely correct that things like database access are much more of a concern (in terms of scalable performance) than the concatenation of strings. Our company has worked hands-on with many (probably >100) different J2EE applications that were suffering serious scalable performance issues, and it is safe to say that the overwhelming majority (85%?) were bottlenecked on the database. Only one was related to inefficient use of CPU/memory, and a profiler found it instantly (96% of the entire JVM CPU time, across 2x Sun E10000 domains, was spent in one method).
    \Purdy\

    I think your 85% may be overstating the facts a bit across many J2EE projects. But not by too much :-). Certainly in an environment like J2EE, there are many different sorts of calls and operations with widely varying costs. And the first thing you should look at performance-wise is your most expensive operations - database calls, calls out over the network, etc. And the bulk of your problems are going to lie there.

    However - I think people are being overzealous in downplaying the memory allocation issue. The issue here isn't the cost of the allocation; it's the cost of garbage collection later. Java programmers as a rule tend to be rather fast and loose about the number of objects they create during processing, and they don't have the benefit of the stack-based objects that a language like C++ provides.

    The end result is that a process can spend a surprising amount of time garbage collecting. In the JMS work I'm pursuing right now, the original code created around 35 objects per JMS publish operation. This might not seem like a big deal - until you think in terms of doing 500 or more publishes a second. In this sort of scenario, the JVM spent a distressing amount of time garbage collecting.

    In all, I'm not going to go overboard and say that developers should obsess over this. Most of the time it's the least of their performance worries. But it shouldn't be discounted altogether. Java programs have a tendency to create an awful lot of objects, and there is a cost to cleaning up after them.

    The only good news is that the Java 1.4 generational garbage collector is much, much better tuned for collecting "temporaries" of this kind. Under 1.4 I find I don't need to worry much at all about object creation, because the generational collector is exceptionally good at fast object allocation and at very efficiently cleaning up very short-lived objects. But for people working on older JVMs, it's still a concern that should be watched.

        -Mike
  9. Red Herrings

    Mike: the original code created around 35 objects per JMS publish operation. This might not seem like a big deal - until you think in terms of doing 500 or more publishes a second. In this sort of scenario, the JVM spent a distressing amount of time garbage collecting.

    That's only 15,000 objects you are allocating per second. While it sounds terrible, it is not a very large number, relatively speaking. If I remember correctly from some profiling we did, I think that a certain (well-known vendor name omitted) type-4 JDBC driver will typically create around that many temporary objects for a single SQL call ;-). Have you looked at verbose gc output to get an idea of the total number of allocations going on? Also, how big were those 35 objects? Were they forcing a full GC?

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Easily share live data across a cluster!
  10. Red Herrings

    \Purdy\
    That's only 15,000 objects you are allocating per second. While it sounds terrible, it is not a very large number, relatively speaking
    \Purdy\

    It depends on your environment and your goals. I readily admit that it's not a concern in many situations. But for some it is. In the case I'm citing, there's no really good reason for 30 of those objects to be created per send. That's an awful lot of garbage being created per second. And if you compare it to a stack-based C++ solution, it compares very poorly.

    \Purdy\
    If I remember correctly from some profiling we did, I think that a certain (well-known vendor-name omitted) type-4 JDBC driver will typically create around that many temporary objects for a single SQL call ;-).
    \Purdy\

    Well, that's object creation around a heavyweight call. It's likely that you won't be able to do very many of those per second.

    \Purdy\
    Have you looked at verbose gc output to get an idea of the total number of allocations going on? Also, how big were those 35 objects? Were they forcing a full GC?
    \Purdy\

    Yes, I have done profiling, and I've worked with two app server vendors on this issue as well. The pronouncement from one vendor was that the server spending 5%-8% of its time garbage collecting was within their acceptable range. To me, there's no really good reason for an app server to spend 5%-8% of its time garbage collecting. In my specific case, the number was around 10%, and most of that was truly unnecessary.

    When dealing with server processes, this can rapidly become an issue. Despite the fact that memory is pretty cheap these days, even my own semi-optimized code needs 256MB to run comfortably, and GC delays perceptibly eat into the runtime execution. A comparable C++ solution can run in less than 40MB.

    This isn't an indictment of Java or J2EE - I think there are real benefits to be derived here that outweigh the downsides. But at the same time, a lot of Java and J2EE apps suck memory like there's no tomorrow, and spend more time GC'ing than you might expect. To my mind, memory consumption and GC times are the biggest obstacle to J2EE adoption at this point - we're an order of magnitude higher here than a C++ solution. Some of this is because Java doesn't support stack objects, and it's often difficult for the runtime to optimize new() operations into something like stack allocation. But some of it is because Java developers - even producers of J2EE products - create a huge number of unnecessary objects.

        -Mike
  11. Red Herrings

    Mike,

    I'm in general agreement with what you're saying. Did you get any before vs. after information from chopping those extra 30 allocations per message? In other words, did it make a difference to optimize that?

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Easily share live data across a cluster!
  12. Red Herrings

    \Purdy\
    I'm in general agreement with what you're saying. Did you get any before vs. after information from chopping those extra 30 allocations per message? In other words, did it make a difference to optimize that?
    \Purdy\

    Under Java 1.4, it didn't make any noticeable difference in performance. So long as you set the "new" generation to an appropriate size, the GC is unbelievably fast at picking up even tens of megabytes of short-lived objects. But you can get into trouble if your new generation size is set too small.

    Under Java 1.3, it did make a difference of about 5% over the life of the server.

    As a related side note, the GC under 1.4 has a fabulous -Xverbosegc option which can be used to spot-check for memory leaks. This option reports:

    - why GC happened (Alloc failure, System.gc, Old generation full, Perm generation full, Train generation full, etc). So you can see what triggered a garbage collect

    - Sizes of each memory generation area before and after the collect. So you see something like:

      Eden before/after, Survivor before/after, Old before/after, Perm before/after

    With this sort of output, if you have a memory leak you can quickly see objects migrating from the Eden space outward to the older areas (Eden -> Survivor -> Old). If you see more and more stuff migrating out to the older generation over time, chances are that you have a memory leak.
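
    A toy program for trying this out (my own example; run it with the standard -verbose:gc flag, e.g. java -verbose:gc LeakDemo): the list deliberately retains everything it allocates, so successive collections show more and more data surviving into the older generation - the migration pattern described above.

```java
import java.util.ArrayList;
import java.util.List;

// Simulated leak: "retained" holds every allocation, so those objects
// survive each collection and migrate Eden -> Survivor -> Old, while
// the scratch arrays die young in Eden.
public class LeakDemo {
    public static void main(String[] args) {
        List<byte[]> retained = new ArrayList<byte[]>();
        for (int i = 0; i < 10000; i++) {
            retained.add(new byte[1024]); // ~10MB retained in total
            byte[] scratch = new byte[1024]; // short-lived garbage
        }
        System.out.println(retained.size()); // prints "10000"
    }
}
```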

         -Mike
  13. Red Herrings

    > Why, oh why, do people get so hung up on this StringBuffer stuff??

    (as a response to my mail)

    Because people write about it and people teach it. So it should be right (from the java perspective).

    Maybe it's irrelevant from the perspective of the book's title -> take it out.

    > Why, oh why, do people get so hung up on this StringBuffer stuff??
    (as a response in general)

    Because that's one of the few concrete performance hints people know, besides the more general ones like "Test early, test often" or "Don't do it. Do it later.".

    Ingo
  14. Red Herrings

    Gavin writes:

    <em>

      if ( log.isDebugEnabled() ) {
         log.debug( "Retrieving User: " + user.getName() );
      }

      preparedStatement.executeQuery(); ///!!!!!!!!!!!


    Please, everyone! The above is an abomination. It takes three (!) lines of code to write a log message (in order to avoid a single string concatenation). Immediately following that, it hits the database.

    </em>

    I have absolutely no idea what you mean. The test is not against string concatenation but to make sure debug is enabled.

    Did I miss something? What is wrong with this code? And what do you think is the correct way to write it?

    --
    Cedric
    confused
  15. Red Herrings

    Cedric,

    The reason most people use an 'if' statement as shown in Gavin's example is to avoid unnecessary string concatenation in case debugging is off. Say you have a statement like the one below.

        log.debug( "Retrieving User: " + user.getName() );
      
    If debugging is turned off, then the string concatenation happens anyway and the result is discarded by the debug() method; the concatenation therefore wastes CPU cycles. I think the point Gavin was trying to make is that this waste is tiny compared to what happens in the next operation (the DB query). The overall performance gain is so small because the DB query will dominate the result, and now we have 3 lines of code for debugging rather than 1 (simplicity and readability).
  16. isDebugEnabled()

    I have absolutely no idea what you mean. The test is not against string concatenation but to make sure debug is enabled. <

    All logging frameworks should, as the first line of the debug() method, do something like:

      if ( !isDebugEnabled() ) return;

    or equivalent. Remember that logging frameworks are heavily optimised for the case of NO logging, which is kinda upside down since most software optimizes its functionality, not its non-functionality ;)
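
    A minimal sketch of such a guard (a hypothetical Logger class of my own, not any real framework's API): the level check is the very first thing debug() does, so when logging is off the call costs almost nothing beyond assembling its argument.

```java
public class Logger {
    private final boolean debugEnabled;

    public Logger(boolean debugEnabled) {
        this.debugEnabled = debugEnabled;
    }

    public boolean isDebugEnabled() {
        return debugEnabled;
    }

    public void debug(String message) {
        if (!isDebugEnabled()) {
            return; // bail out before any formatting or I/O
        }
        System.out.println("DEBUG: " + message);
    }
}
```

    The caller-side if (log.isDebugEnabled()) test therefore only saves the cost of building the message string; the debug() call itself is already cheap when the level is off.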


    :)
  17. isDebugEnabled()

    <quote>

    All logging frameworks should, as the first line of the debug() method, do something like:

      if ( !isDebugEnabled() ) return;

    or equivalent. Remember that logging frameworks are heavily optimised for the case of NO logging, which is kinda upside down since most software optimizes its functionality, not its non-functionality ;)

    </quote>

    Yes, I know that, but I still don't see why the snippet you pasted is so monstrous. The DB cost is much higher than the logging statement, but still, both need to be there.

    So, what is the "right" way of doing that?

    --
    Cedric
  18. isDebugEnabled()

    So, what is the "right" way of doing that?


    Simply...

      log.debug( "Retrieving User: " + user.getName() );
      preparedStatement.executeQuery();

    The isDebugEnabled() call is unnecessary and just clutters the code. The only difference from a direct log.debug call is the on-demand assembly of the parameter string - and this isn't worth the effort.

    Juergen
  19. isDebugEnabled()

    <quote>
    Simply...

      log.debug( "Retrieving User: " + user.getName() );
      preparedStatement.executeQuery();

    The isDebugEnabled() call is unnecessary and just clutters the code. The only difference from a direct log.debug call is the on-demand assembly of the parameter string - and this isn't worth the effort.

    </quote>

    If you are using an optimized version of a prepared statement (imagine an implementation with a delayed write), then it becomes significant. Or imagine that your company invests in caching software that makes all these "executeQuery" calls return immediately... Suddenly, you find yourself having to go through all your code and reintroduce all those boolean tests that you so cleverly removed.

    In other words, you are writing code at step 1 because you know that step 2 is going to execute in that many milliseconds. This is a very dangerous assumption and it spells "premature optimization" loud to me.

    Keep the "if debug()", keep the "executeQuery", and let JVMs and external software optimize your code as needed.

    --
    Cedric
  20. YAGNI

    If you are using an optimized version of a prepared statement (imagine an implementation with a delayed write), then it becomes significant. Or imagine that your company invests in a caching software that makes all these "executeQuery" return immediately... Suddenly, you find yourself having to go through all your code and reintroduce all these boolean tests that you so cleverly removed. <


    But even in that case, the code fragment will not be dominated by the string concatenation. Even WITH a JDBC cache, the work done by the prepared statement still creates FAR more garbage than one (1) instance of java.lang.String.

    If, a posteriori, you should discover that log messages are a bottleneck in your system, THEN optimize them. Before that, YAGNI applies.
  21. YAGNI

    \King\
    But even in that case, the code fragment will not be dominated by the string concatenation. Even WITH a JDBC cache, the work done by the prepared statement still creates FAR more garbage than one (1) instance of java.lang.String.
    If, a posteriori, you should discover that log messages are a bottleneck in your system, THEN optimize them. Before that, YAGNI applies.
    \King\

    I'm not sure why it is - perhaps we should call it OO fever - but a surprising number of Java developers take this attitude. The problem here is that you're focusing on a single piece of a single operation on a single thread.

    Given that J2EE applications tend to be heavily layered, you have to consider the cost of an entire invocation chain from "request" all the way to the end points, and back again to the originator. It's not unusual to see a stack backtrace in an application server which is 20-30 layers deep. And that's just one piece of the invocation tree. The whole chain may appear to run "fast" for a single invocation - but let's say each major layer creates 5 objects. This doesn't seem to be a big deal. But it's not at all unusual in J2EE for a single end-client request to generate 500-1000 temporary objects to service that request (when you consider the entire invocation tree). To some this number may seem crazy - but not when you realize that each developer looking at his little piece of his world says "Oh, I'm only creating 5 or 6 objects - big deal". The trouble comes in when you put all the layers together.

    This is for a single client request. Run a bunch of these, and you've got a lot of garbage.

    Now consider a heavily loaded system where it's common to have 20 simultaneous requests. That's 10,000-20,000 pieces of garbage created.

    You may not pay the price of this garbage while the requests are being processed. But you will pay 1) in terms of the maximum memory required by your server, and 2) in GC times.

    It may sound crazy, but simple memory usage and over-creation of temporary objects can directly affect the scalability of a server.

    It's not something you need to generally worry about in single-user scenarios - for a GUI it may not be a big deal, because there's only so much a single user can physically do in a given amount of time. But it's a big concern on the server side.

    As I mentioned in another post, the runtime characteristics of systems that create a lot of garbage are greatly improved running under Java 1.4. Even so, it's still possible to just plain run out of memory, even with a generational collector.

    Note that this problem only exists in GC languages that don't have stack-based object allocation. In a language like C or C++, most "temporaries" are allocated on the stack. Freeing them is truly a "free" operation, because it's part of the call-return from the function/method.

         -Mike
  22. JDBC drivers

    It's not unusual to see a stack backtrace in an application server which is 20-30 layers deep. And that's just one piece of the invocation tree. The whole chain may appear to run "fast" for a single invocation - but let's say each major layer creates 5 objects. This doesn't seem to be a big deal. But it's not at all unusual in J2EE for a single end-client request to generate 500-1000 temporary objects to service that request (when you consider the entire invocation tree). To some this number may seem crazy - but not when you realize that each developer looking at his little piece of his world says "Oh, I'm only creating 5 or 6 objects - big deal". The trouble comes in when you put all the layers together. <

    I STRONGLY encourage anyone who believes this stuff to do some actual profiling with JProbe. I promise you that in a real environment your application server and JDBC driver (particularly the latter) create FAR more garbage than your business code. In my experience, for web applications with 1000s of *concurrent* users, bottlenecks are absolutely associated with data access and, in one case, XML/XSLT processing.

    Seriously, you won't believe just how much garbage is created by some JDBC drivers.

    Anyway, everyone agrees that in current JVMs object allocation / garbage collection is cheap, so unless you are developing for an obsolete platform, what's the big deal?
  23. JDBC drivers

    \King\
    I STRONGLY encourage anyone who believes this stuff to do some actual profiling with JProbe. I promise you that in a real environment your application server and JDBC driver (particularly the latter) create FAR more garbage than your business code.
    \King\

    First off, the garbage numbers I cited aren't made up - I've profiled this in many J2EE applications. Those amounts of garbage creation are real.

    Secondly - the reason app servers/JDBC drivers/etc. create so much garbage is that the developers there have made exactly the same assumptions that are being propagated here, e.g. that memory is cheap.

    The funny thing to me is that because commercial code is creating so much garbage, you advocate not worrying about your own! I say quite the opposite: clean up your own garbage, and at the same time complain to vendors when you see their garbage creation rates going through the roof.

    \King\
     In my experience, for web applications with 1000s of *concurrent* users, bottlenecks are absolutely associated with data access and, in one case, XML/XSLT processing.
    \King\

    1000s of concurrent users on a single J2EE app server? I think not. Most likely, you're forced to deploy multiple servers. Guess why that is - well, there are many reasons, but one of them is that each app server is guaranteed to eat 250-500MB (or even much more).

    Perhaps you think this is OK. But to me - I've written C and C++ servers that serve more users than the J2EE app server equivalent and eat a tenth as much memory. This means that less hardware is required, and more money can be spent on ensuring proper failover solutions.

    As I said before, the issue isn't transaction times - the issue is server scalability, how many simultaneous users can a given server process support for a given hardware cost.

    Once again - I'm not down on J2EE. It's my bread and butter and I don't want to go back to C and C++. But memory really is a resource, and not an infinite one.

    \King\
    Anyway, everyone agrees that in current JVMs object allocation / garbage collection is cheap, so unless you are developing for an obsolete platform, what's the big deal?
    \King\

    I'd hardly call Java 1.3.x an obsolete platform - and its garbage collection is not cheap.

    And as I said before, even under Java 1.4.x, running out of memory under high load is not unusual at all with app servers. And the high memory cost of app servers right now often militates against running much of anything else on the same machine (because the app server is eating so much RAM).

        -Mike
  24. Memory

    Most likely, you're forced to deploy multiple servers <

    Correct, two E10Ks do the trick.....

    >> each app server is going to be guaranteed to eat 250-500MB (or even much more). <
    Absolutely. Much more than that, in fact. But guess what: a few gig of RAM these days is MUCH, MUCH less expensive to my employer than my time or the time of any other developer.

    >> I've written C and C++ servers that serve more users than the J2EE app server equivalent, and eat a tenth as much memory. This means that less hardware is required, and more money can be spent on ensuring proper failover solutions. <
    Of course C solutions eat less memory. Duh. That's a no-brainer. But a J2EE solution will save you money in other ways. And you can spend *that* money on memory.

    >> The funny thing to me is that because commercial code is creating so much garbage, you advocate not worrying about your own! <
    That's exactly what I'm advocating. If the JDBC driver and app server produce one to three orders of magnitude more garbage than my business logic (that is NOT made up), then optimizing the object allocation of the business logic has the potential to reduce total object allocation by, say, .003 - 3 per cent. Do you really think that is worth the effort?

    It is one of the basic rules of optimization that you optimize the most expensive part, never the least expensive part. In this case, reduce data access via caching, etc, or yes, email your database vendor and complain about object allocation (not likely to get you very far).
  25. Memory

    \King\
    Correct, two E10Ks does the trick.....
    \King\

    Well, if you can afford 2 of those then I can see why memory isn't an issue for you. But it may be for people running on lesser hardware.

    \King\
    Absolutely. Much more than that, in fact. But guess what: a few gig of RAM these days is MUCH, MUCH less expensive to my employer than my time or the time of any other developer.
    \King\

    This is a rather narrow view of the problem. The issue isn't the cost of RAM; it's that each machine has a finite amount of RAM. And on many systems the absolute RAM max is only 2-4GB.

    I won't go into great detail here, since it's obvious your mind is made up. But what's the cost of getting an out of memory error in production because your JVM heap setting is set "only" to 1GB?

    \King\
    Of course C solutions eat less memory. duh. Thats a no-brainer. But a J2EE solution will save you money in other ways. And you can spend *that* money on memory.
    \King\

    You seem resigned to the idea that J2EE solutions have to use significantly more memory than a C or C++ solution. Based on the design of Java and J2EE, it doesn't have to be that way.

    \King\
    That's exactly what I'm advocating. If the JDBC driver and app server produce one to three orders of magnitude more garbage than my business logic (that is NOT made up), then optimizing the object allocation of the business logic has the potential to reduce total object allocation by, say, .003 - 3 per cent. Do you really think that is worth the effort?
    \King\

    I cited 1,000 objects for processing a given request as typical for the projects I've seen, where this is happening in the application code. I've measured this. You're saying "one to three orders of magnitude more garbage than my business logic" is coming from the JDBC driver and app server.

    Excuse me - a range of 1 to 3 orders of magnitude tells me that you haven't measured anything. Assuming 1,000 as my baseline, you're saying that your JDBC driver and app server generate "roughly" between 10,000 and 1 million objects per request. Your measurements seem to have a distressingly high margin of error :-/

    \King\
    It is one of the basic rules of optimization that you optimize the most expensive part, never the least expensive part. In this case, reduce data access via caching, etc, or yes, email your database vendor and complain about object allocation (not likely to get you very far).
    \King\

    I'm not necessarily talking about just optimization. I'm talking about everything from original design through optimization. You've repeatedly come back to time, but there are many finite resources in computing - CPU time, RAM space, disk space, network bandwidth, etc. Your view seems to treat RAM space as not being limited (or so plentiful that it doesn't matter), but this isn't true for a number of production setups.

         -Mike
  26. 1000??

    I cited 1,000 objects for processing a given request as typical for the projects I've seen, where this is happening in the application code. I've measured this. You're saying "one to three orders of magnitude more garbage than my business logic" is coming from the JDBC driver and app server. <

    Looks like we are talking about two different things:

    I said "business logic" - 1000 objects allocated in BUSINESS logic is certainly not a "typical" average, at least not the way I build applications. You have done a bait and switch! You are talking about 1000 objects in "application code". That is not what I said. All kinds of things could be considered "application code", and the developer has a greater or lesser degree of control over object allocation in each of them.

    And different application architectures certainly affect memory usage. I tend to use quite lightweight architectures.

    >> Excuse me - a range of 1 to 3 orders of magnitude tells me that you haven't measured anything. Assuming 1,000 as my baseline, you're saying that your JDBC driver and app server generate "roughly" between 10,000 and 1 million objects per request. Your measurements seem to have a distressingly high margin of error :-/ <
    I am not assuming your baseline of 1000 objects per request, obviously.

    And anyway, working from *your* assumption of 1000 objects created in application code, how in hell are an extra 10, 50, or even 100 objects created in log messages going to affect performance??
  27. 1000??[ Go to top ]

    \King\
    Looks like we are talking about two different things:
    I said "business logic" - 1000 objects allocated in BUSINESS logic is certainly not a "typical" average, at least not the way I build applications. You have done a bait and switch! You are talking about 1000 objects in "application code". That is not what I said. All kinds of things could be considered as "application logic", for which the developer has a greater or lesser degree of control of object allocation.

    And different application architectures certainly affect memory usage. I tend to use quite lightweight architectures.
    \King\

    I didn't intend to bait and switch you. In many systems I've been involved in, where I've measured object allocation rates (among other things), 1000 objects per request in the application code is typical. You didn't dispute that figure as abnormal, so I assumed that your own work was in the same ballpark. Apparently it's not.

    I myself strongly prefer lightweight frameworks, unless truly unusual requirements push me into something more complex. One of the many benefits is that such frameworks usually create very few objects - log4j is a great example of this: it goes to great lengths to avoid allocating objects if the current logger/level is disabled, and even when logging is enabled, it has a number of optimization tricks to speed things up. That's a good quality in a product that sits at the lowest level of an application's (or another product's) stack.

    In any case, while your designs and attitude seem reasonable, I've seen many designs and applications where rampant object allocation was the norm, and ultimately caused problems.

    \King\
    I am not assuming your baseline of 1000 objects per request, obviously.
    And anyway, working from *your* assumption of 1000 objects created in application code, how in hell is an extra 10, 50 or even 100 objects created in log messages going to affect performance??
    \King\

    The number isn't necessarily that low. Consider a trivial example, where the business logic is iterating over a result set and doing some calculations. It's common in some circles to put debugging statements in that iterator loop to see what data is being processed. So if you're iterating over 100 rows, that's 100 objects right there at a minimum. That's in one little piece of a request. Add in any other loops that might be part of the request, and then stray logging statements here and there, and it adds up to a surprising number of objects.
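    Mike's loop scenario can be sketched in a few lines. The class and method names here are illustrative, and a plain counter stands in for a real allocation profiler:

    ```java
    public class LoopAllocation {
        // Returns how many temporary debug-message Strings the loop builds.
        // Each concatenation creates a fresh String (plus the compiler's
        // hidden builder object), whether or not the message is ever logged.
        static int buildDebugMessages(int rowCount) {
            int messagesBuilt = 0;
            for (int i = 0; i < rowCount; i++) {
                String msg = "processing row " + i; // one throwaway String per row
                if (msg != null) {
                    messagesBuilt++;
                }
            }
            return messagesBuilt;
        }

        public static void main(String[] args) {
            // 100 rows means at least 100 temporary message Strings
            System.out.println(buildDebugMessages(100));
        }
    }
    ```

    If this loop sits inside a request that itself runs hundreds of times per second, those "cheap" temporaries become a steady stream of garbage.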

    In the end, some people say reasonably "memory isn't too expensive", use lightweight frameworks, and come up with a nice design that has pretty minimal object creation rates per request. But there are a lot of developers out there who say "memory is free" - or worse, don't really understand when they're creating objects and when they're not. Or they neglect the impact of loops (or don't realize someone is looping around _them_). If they're lucky, they'll end up with an average of 1,000 objects allocated in their code per request (in my experience that's typical). The worst cases are far worse than 1,000.

    Of course people shouldn't go crazy trying to eliminate every possible allocation. That truly is a waste of development resources. But at the same time, when multiple developers indicate that you shouldn't worry about generating temporaries, an awful lot of people take that literally - and end up stuck in the GC thread for 10%/15%/20% of their server CPU time.

         -Mike
  28. typical error[ Go to top ]

    Darren,

    Unfortunately, the passage he refers to is simply incorrect:

    Consider now the case of the very small object, such as the intermediate strings created by the following line of code:
    String result = value1 + value2 + value3 + value4;
    This is a commonly referenced example in which, because String objects are immutable, you find out that value1 and value2 are concatenated to form an intermediate String object, which is then concatenated to value3, and so on until the final String result is created.

    You can easily verify it either by looking at the specification (JLS) or disassembling/decompiling a test case. For example, in jasm:

    new StringBuffer
    dup
    invokespecial void StringBuffer.<init>()
    aload value1
    invokevirtual StringBuffer StringBuffer.append(String)
    aload value2
    invokevirtual StringBuffer StringBuffer.append(String)
    aload value3
    invokevirtual StringBuffer StringBuffer.append(String)
    aload value4
    invokevirtual StringBuffer StringBuffer.append(String)
    invokevirtual String StringBuffer.toString()
    astore result

    There is a new StringBuffer, a new String (from toString) and probably just one new char[] (owned by the StringBuffer initially, then "shared" by the String constructor invoked by the toString of StringBuffer.)
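    To Cameron's point, here is a small sketch (names are illustrative, and it uses StringBuilder, the unsynchronized replacement that modern compilers emit in place of StringBuffer). It contrasts the single-expression case, which compiles to one builder chain, with concatenation in a loop, where writing the builder explicitly still matters:

    ```java
    public class ConcatDemo {
        // A single concatenation expression compiles to roughly
        // new StringBuilder().append(v1).append(v2).append(v3).append(v4).toString()
        // - one builder, one final String, not a cascade of intermediates.
        static String oneExpression(String v1, String v2, String v3, String v4) {
            return v1 + v2 + v3 + v4;
        }

        // Concatenating with += inside a loop would allocate a new builder
        // and a new String on every iteration; an explicit builder avoids that.
        static String loopConcat(String[] parts) {
            StringBuilder sb = new StringBuilder();
            for (String p : parts) {
                sb.append(p);
            }
            return sb.toString();
        }

        public static void main(String[] args) {
            System.out.println(oneExpression("a", "b", "c", "d"));        // abcd
            System.out.println(loopConcat(new String[] {"a", "b", "c"})); // abc
        }
    }
    ```

    So the book's warning is wrong for the quoted one-liner, but the underlying advice is still sound for concatenation spread across loop iterations.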

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Easily share live data across a cluster!
  29. Hello,

    I still side with the people who write if(log.isDebugEnabled()) to avoid object creation. You really have to sit in front of a WebLogic (or similar) management console watching the memory usage grow and then see the thing stop for 4 seconds doing GC every 30 seconds in order to appreciate that you want LESS GC very badly.
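    The guard pattern described above can be sketched with java.util.logging's isLoggable, the standard-library analogue of log4j's isDebugEnabled (class and method names here are illustrative; the counter stands in for a profiler):

    ```java
    import java.util.logging.Level;
    import java.util.logging.Logger;

    public class GuardedLogging {
        private static final Logger log =
                Logger.getLogger(GuardedLogging.class.getName());

        // Returns how many debug-message Strings actually get built.
        // With the guard in place and FINE (debug) disabled, the answer is
        // zero: the concatenation inside the if-block is never executed.
        static int countGuardedAllocations(int rowCount) {
            log.setLevel(Level.INFO); // FINE (debug) disabled for this demo
            int built = 0;
            for (int i = 0; i < rowCount; i++) {
                if (log.isLoggable(Level.FINE)) {
                    log.fine("processing row " + i); // skipped entirely
                    built++;
                }
            }
            return built;
        }

        public static void main(String[] args) {
            System.out.println(countGuardedAllocations(100)); // 0
        }
    }
    ```

    The same loop without the guard would build one temporary String per row even though nothing is ever written to the log.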

    Early in an application's life, the SQL query may well cost more than a thousand String constructors and GC events. But when it comes to streamlining the whole thing, the query may get wrapped by a cache lookup with a high hit rate, and suddenly your String operations are the bottleneck again. At least I always felt much more comfortable after optimizing away everything possible in the core methods.
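    A minimal sketch of that cache-in-front-of-the-query situation (all names are hypothetical; a HashMap stands in for a real cache, and a counter stands in for the query cost):

    ```java
    import java.util.HashMap;
    import java.util.Map;

    public class QueryCache {
        private final Map<String, String> cache = new HashMap<>();
        int misses = 0; // how often the "expensive" query actually runs

        // Stands in for the costly SQL query.
        private String runQuery(String key) {
            misses++;
            return "row-for-" + key;
        }

        // Cache-aside lookup: only a miss pays the query cost.
        String lookup(String key) {
            String hit = cache.get(key);
            if (hit == null) {
                hit = runQuery(key);
                cache.put(key, hit);
            }
            return hit;
        }

        public static void main(String[] args) {
            QueryCache c = new QueryCache();
            c.lookup("42");
            c.lookup("42");
            c.lookup("42");
            System.out.println(c.misses); // 1: repeated lookups are now cheap,
                                          // so per-request String work dominates
        }
    }
    ```

    Once the hit rate is high, each request's cost is dominated by whatever runs on every call - which is exactly where stray String churn resurfaces as the bottleneck.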

    Cheers,
        Henrik
  30. Gavin, what about your Seam now?[ Go to top ]

    Gavin, quite a few people have tried your Seam and run into out-of-memory issues. Ooooh, yes, they are using poor P4 PCs with 1GB of memory. That's their own problem. Seam is ONLY for E10Ks.