Discussions

News: Top 10 Performance Problems taken from Zappos, Monster, Thomson and Co

  1. For a recent edition of the Swiss Computerworld Magazine we listed our Top 10 Performance Problems as we have seen them over the years when working with our clients. I hope this list is enlightening, and I've included follow-up links to blog posts that help explain how to solve these problems:

    #1: Too Many Database Calls,
    #2: Synchronized to Death,
    #3: Too chatty on the remoting channels,
    #4: Wrong usage of O/R-Mappers,
    #5: Memory Leaks,
    #6: Problematic 3rd Party Code/Components,
    #7: Wasteful handling of scarce resources,
    #8: Bloated web frontends,
    #9: Wrong Cache Strategy leads to excessive Garbage Collection,
    #10: Intermittent Problems

    Read the full blog with detailed information about all these problems and links to follow-up articles.

    Threaded Messages (8)

  2. Good Summary

    This is a very good summary and reflects a lot of the issues that I have seen.  One I didn't see (perhaps I missed it) that is probably the worst is "unnecessary allocation of memory", for lack of a better name.  This may have some overlap with #5 (memory/space leaks) but I think there is a meaningful distinction.

    I see a very common pattern of database development in situations where there is a significant number (say hundreds or thousands, though I often deal with hundreds of thousands) of fairly large objects that are used to build some sort of output or result.  What many (most?) developers do in this situation is build a large collection of these objects, and once that collection is fully filled, they take each item from the collection and process it individually.  This adds a lot of extra memory overhead, and a lot of time too, since all that memory must be allocated and GC'd.  With a small change, each object can usually be handled as it is retrieved, and with modern garbage collectors you tend to use little more than the space required by one of these objects for the entire process.
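
    As a rough illustration of the difference, here is a minimal plain-JDBC sketch (the table, columns and RowWriter callback are invented for the example) that hands each row off as it is read instead of accumulating the whole result set first:

        import java.sql.Connection;
        import java.sql.PreparedStatement;
        import java.sql.ResultSet;

        public class RowByRowExport {

            // Hypothetical callback that consumes one record of output at a time.
            public interface RowWriter {
                void write(String id, String payload);
            }

            // Each row is processed as soon as it is fetched, so heap usage stays
            // roughly constant instead of growing with the size of the result set.
            public static void export(Connection con, RowWriter writer) throws Exception {
                String sql = "SELECT id, payload FROM big_table"; // illustrative query
                try (PreparedStatement ps = con.prepareStatement(sql)) {
                    ps.setFetchSize(500); // hint to the driver to stream rows in batches
                    try (ResultSet rs = ps.executeQuery()) {
                        while (rs.next()) {
                            writer.write(rs.getString("id"), rs.getString("payload"));
                        }
                    }
                }
            }
        }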

    With web services this has another twist, because popular WS-* frameworks will not start serializing XML to the client (or parsing it) until the entire object set is created or the entire message has been received.  You end up with highly serial processes with high memory requirements.  This slows the services down considerably relative to what they could do.

  3. Good Summary

    Great feedback. You are right - memory usage and the resulting GC problems are another big topic area of performance problems. With GC there is also a lot to optimize when tweaking the GC settings of the JVM.

  4. Good Summary

    > Great feedback. You are right - memory usage and the resulting GC problems are another big topic area of performance problems. With GC there is also a lot to optimize when tweaking the GC settings of the JVM.

    I'm tempted to believe that in a lot of cases twiddling with the GC settings is dealing with the symptoms of the issue and not the source of the problem.  We had a case in the past where the maximum heap size was bumped up on a server to avoid OOMEs, but when that was done, the GC didn't run very often.  Because DB connections were not being closed properly, connections were only returned to the pool when the objects using them were GC'd.  So increasing the max heap resulted in connection leaks.  It's much better to solve the root causes IMO.
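
    The root-cause fix in a case like that is releasing connections deterministically instead of relying on finalization.  A minimal sketch (the DataSource, table and query are assumed for the example) using try-with-resources:

        import java.sql.Connection;
        import java.sql.PreparedStatement;
        import java.sql.ResultSet;
        import javax.sql.DataSource;

        public class CustomerDao {

            private final DataSource dataSource; // assumed to be configured/injected elsewhere

            public CustomerDao(DataSource dataSource) {
                this.dataSource = dataSource;
            }

            // The connection goes back to the pool when the try block exits,
            // whether or not the query succeeds; nothing depends on when GC runs.
            public String findName(long id) throws Exception {
                try (Connection con = dataSource.getConnection();
                     PreparedStatement ps = con.prepareStatement(
                             "SELECT name FROM customer WHERE id = ?")) {
                    ps.setLong(1, id);
                    try (ResultSet rs = ps.executeQuery()) {
                        return rs.next() ? rs.getString(1) : null;
                    }
                }
            }
        }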

    I'm sure there are many legitimate cases where GC tweaks are needed, so no flames please.

  5. Good Summary

    I totally agree with you on that point. I guess I wasn't clear with my last statement. I didn't mean just pumping up the memory, because that doesn't solve the problem, as you said. I was talking more about fine-tuning different memory and GC settings, such as the size of the heap generations or certain settings that affect the strategy of the GC. We optimized GC and memory settings based on the memory profile of our applications. If you are dealing with a large number of short-lived objects you may want to give the eden space enough room to keep some of those objects from being promoted to the older generations. There are quite a few settings that are interesting to explore.
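
    For illustration, the kind of knobs being discussed look like this on a HotSpot JVM (the values and the myapp.jar name are placeholders; sensible numbers can only come from measuring your own application's allocation profile):

        # Fixed 2 GB heap, 512 MB young generation, eden-to-survivor ratio of 8,
        # and the concurrent mark-sweep collector for the old generation.
        java -Xms2g -Xmx2g -Xmn512m -XX:SurvivorRatio=8 \
             -XX:+UseConcMarkSweepGC -jar myapp.jar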

  6. Good Summary

    > I totally agree with you on that point. I guess I wasn't clear with my last statement. I didn't mean just pumping up the memory, because that doesn't solve the problem, as you said. I was talking more about fine-tuning different memory and GC settings, such as the size of the heap generations or certain settings that affect the strategy of the GC. We optimized GC and memory settings based on the memory profile of our applications. If you are dealing with a large number of short-lived objects you may want to give the eden space enough room to keep some of those objects from being promoted to the older generations. There are quite a few settings that are interesting to explore.

    Sorry, I understand what you mean.  I just haven't had to do much of this.  The standard settings have usually been fine for my needs, so it seems unnecessary to me.  I don't mean to say that's really the case in general; it's just my experience.  It's also something that isn't normally my area of responsibility, so I try not to butt my nose in.  There may be more of this going on with the applications I deal with than I realize.

    Actually, I have messed around with using concurrent GC so I'm full of it.

  7. Good Summary

    Sorry, to be more clear: I'm usually taking something that executes in, say, 100 seconds and modifying the code so it runs in around 1, or taking something that uses a heap of 2 GB and getting it down to hundreds of KB with the kinds of changes I mentioned above.  After that, an additional 50% speed-up usually isn't even worth bothering over.  I wish I had more opportunity to work on things where squeezing out every last millisecond really mattered.

    So in that context, I've listened to a lot of bad developers talk about modifying GC settings to fix performance issues that are the result of bad coding practices that are (from what I can tell) quite common.  I've had quite a few developers argue with me and tell me I'm wrong about these things and then talk about GC settings.  For example, I once asked a developer not to put all the records into a list and then process them, explaining that it could result in an OOME.  He demurred and said he was afraid of keeping the cursor open too long.  In QA, the program failed with an OOME, and even the maximum heap size was not adequate to solve the issue.

    I'm not saying this to toot my own horn or anything.  I just get a distinct impression that there's a general lack of understanding of this in the Java development community.  It may be that it's because this is what tools like Hibernate and iBatis (at least appear to) do by default, and it therefore seems like the 'right' way to do it.

  8. Large Query Support

    > general lack of understanding of this in the Java development community

    I concur with James. You can look at Hibernate's StatelessSession, iBatis' RowHandler, or Ebean ORM's QueryListener, which provides a persistence context per object graph. These can be much more memory-efficient for processing large queries on a per-object-graph basis.

    Currently there is no similar mechanism in JPA for large queries and I do believe there is a general lack of understanding around this issue.
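
    As a rough sketch of what the Hibernate side of this looks like (the Order entity and the handler callback are invented for the example; the Query/scroll API shown is from the classic Hibernate 3.x-era releases):

        import org.hibernate.ScrollMode;
        import org.hibernate.ScrollableResults;
        import org.hibernate.SessionFactory;
        import org.hibernate.StatelessSession;

        public class OrderExporter {

            // Hypothetical callback that processes one Order at a time.
            public interface OrderHandler {
                void handle(Object order);
            }

            // StatelessSession keeps no first-level cache, so each Order becomes
            // garbage as soon as the handler is done with it.
            public static void exportAll(SessionFactory sessionFactory, OrderHandler handler) {
                StatelessSession session = sessionFactory.openStatelessSession();
                try {
                    ScrollableResults results = session
                            .createQuery("from Order")        // hypothetical mapped entity
                            .scroll(ScrollMode.FORWARD_ONLY);
                    try {
                        while (results.next()) {
                            handler.handle(results.get(0));
                        }
                    } finally {
                        results.close();
                    }
                } finally {
                    session.close();
                }
            }
        }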

  9. Large Query Support

    > general lack of understanding of this in the Java development community

    > I concur with James. You can look at Hibernate's StatelessSession, iBatis' RowHandler, or Ebean ORM's QueryListener, which provides a persistence context per object graph. These can be much more memory-efficient for processing large queries on a per-object-graph basis.

    You're right there with me, but this is just the tip of the iceberg.  Consider a simple read-only web service that pulls from a database.  The following steps occur in creating and returning a response:

    1. Query the database
    2. Build an in-memory representation of the response
    3. Transform that data into XML
    4. Send the bytes across the socket

    Assume that these steps take 2 seconds, 1 second, 1 second, and 1 second respectively.

    In most approaches that I see, these steps all take place sequentially, so it takes 5 seconds total.

    If, however, you start step two as soon as you have enough detail to build the first object, and start transforming and transmitting as soon as possible, you can reduce this time even on a single core.  The reason is that the database and the client are generally on different machines, so instead of your service sitting there like a bump on a log in IO wait, it is building your objects.  If your client is built in a similar way, the user (assuming there is one) might even see some part of the response before the query finishes.
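
    A rough sketch of that overlap using StAX, writing each record to the response stream as soon as it is read from the cursor (the query and XML layout are invented for the example):

        import java.io.OutputStream;
        import java.sql.Connection;
        import java.sql.ResultSet;
        import java.sql.Statement;
        import javax.xml.stream.XMLOutputFactory;
        import javax.xml.stream.XMLStreamWriter;

        public class StreamingXmlResponse {

            // The XML for earlier rows can already be on its way to the client
            // while we are still waiting on the database for later rows.
            public static void write(Connection con, OutputStream out) throws Exception {
                XMLStreamWriter xml =
                        XMLOutputFactory.newInstance().createXMLStreamWriter(out, "UTF-8");
                xml.writeStartDocument("UTF-8", "1.0");
                xml.writeStartElement("records");
                try (Statement st = con.createStatement();
                     ResultSet rs = st.executeQuery("SELECT id, name FROM item")) { // illustrative
                    while (rs.next()) {
                        xml.writeStartElement("record");
                        xml.writeAttribute("id", rs.getString("id"));
                        xml.writeCharacters(rs.getString("name"));
                        xml.writeEndElement();
                        xml.flush(); // push completed records toward the client immediately
                    }
                }
                xml.writeEndElement();
                xml.writeEndDocument();
                xml.close();
            }
        }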

    This is often easier said than done, because the libraries we use are often not designed in this manner, and there are some downsides to the approach, but I think people need to be more aware that the way things are usually done is not the only way to do it.  At the very least we should all understand how the common approach works and why it doesn't support large messages very well.