JVM Lies: The OutOfMemory Myth


  1. JVM Lies: The OutOfMemory Myth (26 messages)

    Kenneth Roper has posted "JVM Lies: The OutOfMemory Myth," addressing what happens when a JVM throws an OutOfMemoryError: as many who've encountered it have noticed, the JVM may say it's out of memory when it doesn't look like it, and while throwing more RAM at the JVM may help, that's often the wrong solution.
    I expect I am not alone in having the knee-jerk reaction that any application's memory problems can be solved by cranking up the heap. I blame James Gosling, or whoever decided that the JRE 1.1 JVM's heap should default to 64M. Even at the start of my Java programming career in 1998 I remember quickly running out of heap space and needing to look up what this non-standard -Xmx switch did. Increasing this value made these problems just disappear. However, instead of doing the obvious and increasing the -Xmx, I added extra GC debugging output and attempted to replicate the problem. We have plenty of spare memory on our hardware, so any time spent on such an obvious issue is arguably a waste: there was important business functionality I could have been delivering instead of messing around with JVM switches. However, being at times more stubborn than is good for me, I insisted on understanding exactly what was going on... [snip] Depending on the flavour of JVM, an OutOfMemoryError can indicate a shortage of memory in one of several areas. These broader concepts are common to generational GC algorithms across the major JVM vendors including Sun, IBM and BEA, although the specifics I refer to below relate to the Sun HotSpot GC model.
    • The first is the tenured generation. This is usually what I mean when I say "the heap". Memory is segmented into several generations; it is when the tenured generation is full, and cannot be expanded any further, that the JVM considers itself OutOfMemory.
    • The second is the permanent generation. This does not resize during the lifetime of the application, regardless of how much free space may exist in the rest of the heap, but remains at whatever it was originally set to (the default is 64M). Should this prove too small for the perm generation, the JVM will throw an OOME even if there's plenty of heap left. Adding the -XX:+PrintHeapAtGC switch will tell you if this is the case.
    • The third possibility is that your operating system is out of memory, e.g. you've asked for a 2GB heap on a box with 1GB RAM and 512MB swap space (not a typical server, admittedly, but it serves as an example).
    Great stuff.
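    To see which limit you're hitting in practice, here is a minimal sketch (hypothetical class name, not from the article) that provokes a heap OutOfMemoryError with one oversized allocation; the error's message text ("Java heap space", "Requested array size exceeds VM limit", "PermGen space", etc.) usually names the exhausted area:

```java
public class OomProbe {
    // Request an array far larger than any plausible -Xmx and report
    // the JVM's verdict instead of letting the error propagate.
    static String probe() {
        try {
            long[] huge = new long[Integer.MAX_VALUE - 2]; // ~16 GB request
            return "allocated " + huge.length + " longs";
        } catch (OutOfMemoryError e) {
            return "OutOfMemoryError: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() reflects the effective -Xmx ceiling.
        System.out.println("effective max heap: " + rt.maxMemory() / (1024 * 1024) + "M");
        System.out.println(probe());
    }
}
```

    On any machine whose heap ceiling is below ~16 GB, the probe reports the OutOfMemoryError rather than succeeding.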

    Threaded Messages (26)

  2. Re: JVM Lies: The OutOfMemory Myth

    Of course all of this ignores the very real possibility that you or someone whose code you are using gobbled horrific amounts of memory. It's a good idea to first look for what's using all the memory and see if that's appropriate -- rather than trying to just toss memory at the problem. Of course looking for what's really using the memory can be non-trivial...
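    One quick way to start that non-trivial hunt on Sun's JDK 5/6 JVMs is a live class histogram from the jmap tool (the pid is a placeholder for your JVM's process id):

```
jmap -histo <pid> | head -20
```

    The histogram lists instance counts and total bytes per class, which often points straight at the gobbler.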
  3. closures

    It will be interesting to see what happens in regards to OOM when and if closures are added to Java, specifically, if they are implemented similarly to anonymous inner classes. Maybe I'm talking out of my arse, but it seems in this case, Perm Gen space use would explode. -- Bill Burke http://bill.burkecentral.com
  4. Re: closures

    With popular frameworks like Spring and Hibernate dynamically creating proxy classes left and right, I'd say PermGen space has already exploded.
  5. Re: closures

    I think it started before Spring and Hibernate. JSPs didn't help the Perm Space cause either.
  6. Re: closures

    I think it started before Spring and Hibernate. JSPs didn't help the Perm Space cause either.
    Anybody know why they don't let the Perm Gen space grow like a normal heap? Would also be cool if classes could be GCed.
  7. Re: closures

    Anybody know why they don't let the Perm Gen space grow like a normal heap? Would also be cool if classes could be GCed.
    Perm Gen can grow. You set min and max (at least in Sun's JVMs) similar to heap. Also classes can be GC'ed -- but the rules for when they can be GC'ed are much more restrictive than for other objects (see the JVM specs for details).
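    For reference, the Sun HotSpot switches in question look like this (the sizes here are illustrative values, not recommendations):

```
java -XX:PermSize=64m -XX:MaxPermSize=256m -XX:+PrintHeapAtGC MyApp
```

    PermSize sets the initial permanent generation size and MaxPermSize the ceiling it can grow to.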
  8. JDK update 4

    Anybody know why they don't let the Perm Gen space grow like a normal heap? Would also be cool if classes could be GCed.
    http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6541037
  9. OutOfMemory Problem

    I had the same problem with JBoss, and increasing the perm space was the solution.
  10. Re: JDK update 4

    Dear god. I hope this doesn't mean that (more) clueless Java developers will add System.gc() all over the code. I've seen countless examples of System.gc() being used to solve memory problems in the same way that sleep() solves race conditions (if you get my meaning). Nothing pisses me off more than having to explain to co-workers why this is the wrong approach :) I would love to know what actual use-case Sun's customer was having that required this colorful addition.
  11. Re: JDK update 4

    But the mentioned bug is more about loaded classes not being cleaned up during GC, not about resizing PermSpace (which can be done using -XX:PermSize= and -XX:MaxPermSize=). Regards, Ingo
  12. A 4th reason

    You can also get an OutOfMemoryError if you've run out of address space. For example, you can easily chew up gigabytes of address space through thread creation. Just one more reason to do the reasonable thing and use thread pools.
  13. Re: A 4th reason

    Agreed, you can get a java.lang.OutOfMemoryError: unable to create new native thread when your application tries to start too many threads. In that case you might need to decrease the default stack assigned to each thread using the -Xss parameter, so that each thread has less stack but you can create more of them. For some operating systems this is not enough; you should also decrease the OS stack size using the "ulimit -s" command.
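    Alongside the global -Xss switch, a per-thread stack size can also be requested programmatically via Thread's four-argument constructor; the value is only a hint the JVM may round up or ignore. A small sketch with hypothetical names:

```java
public class StackSizeDemo {
    // Run a task on a thread with an explicitly requested stack size,
    // returning true once the thread has finished. The JVM treats the
    // stackBytes value as a suggestion, so don't rely on it exactly.
    static boolean runWithStack(Runnable task, long stackBytes) throws InterruptedException {
        Thread t = new Thread(null, task, "small-stack", stackBytes);
        t.start();
        t.join();
        return !t.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        final boolean[] ran = {false};
        runWithStack(() -> ran[0] = true, 256 * 1024); // request a 256 KB stack
        System.out.println("task ran: " + ran[0]);
    }
}
```

    Threads that only coordinate (rather than recurse deeply) can get by with small stacks, which is exactly the trade-off -Xss makes globally.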
  14. MythBusters

    "JVM Lies: The OutOfMemory Myth": So the solution to my problem was:
    1. reduce the heap allocated to the JVM.
    2. remove the memory leaks caused by native objects not being freed in a timely fashion.
  15. GCViewer

    We've found gcviewer invaluable for quickly ascertaining whether we have slow memory leaks, or too-frequent or too-costly full GCs: http://www.tagtraum.com/gcviewer.html
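    gcviewer reads the JVM's GC log, which on Sun JVMs you can produce with flags along these lines (the log file name is arbitrary):

```
java -verbose:gc -Xloggc:gc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps MyApp
```

    The resulting gc.log records each collection's generation sizes, pause time, and timestamp, which is what the tool graphs.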
  16. Beautiful Evidence

    Metric Monitoring - GC: http://blog.jinspired.com/?p=33 You can also correlate your GC metrics with other metrics, including transactions & traces (http://blog.jinspired.com/?p=32) as well as probes (http://blog.jinspired.com/?p=164). This is all available in the free development edition. This blog article shows how you can analyze the problem from both ends (allocation -> collection): http://blog.jinspired.com/?p=156 A very old article shows how to determine whether the OOM errors are related to concurrent capacity problems: http://www.jinspired.com/products/jxinsight/outofmemoryexceptions.html William
  17. JRockit

    This article and discussion illustrate two advantages of using the JRockit JVM, advantages that you particularly love when working on the production/operations side:
    * JRockit does not need a separate esoteric PermGen setting, thereby simplifying your work. I have lost count of all the occasions where we deployed a new version of an application to a non-JRockit installation, only to get an OutOfMemory error a couple of hours later. Often it turned out that a new 3rd-party library had been added (names withheld to protect the guilty ;) and when all classes had loaded, the total amount of memory needed for class definitions (once again) exceeded our PermGen setting. Some links of interest:
    http://www.jroller.com/agileanswers/entry/preventing_java_s_java_lang
    http://deepthoughts.orsomethinglikethat.com/2006/12/23/run-jboss-as-a-windows-service-on-beas-jrockit-jvm-using-java-service-wrapper-not-javaservice/
    http://crashingdaily.wordpress.com/2007/02/04/crashing-tomcat/
    * JRockit provides powerful low-overhead monitoring tools (e.g. the graphical JRockit Mission Control) that let you see exactly what's going on inside your JVM in real time, letting you find the root cause of your problems rather than just guessing, be it in development, test or production. Especially when coming from a non-JRockit environment it's absolute bliss to have all this information presented to you graphically, in real time, and even in production:
    - number of instances per class, with expandable links to referrers
    - lock and contention visualization
    - time spent on method optimizations (current and delta)
    - heap distribution
    - virtual memory page usage, etc.
    - an API for extending the core functionality, e.g. adding filters or focusing on certain classes or parts of your application
    I would recommend that anyone working on understanding, analyzing and optimizing their JVM behaviour have a look at it. /Par
  18. Re: JRockit

    To be clear, you cannot use the JRockit tools within a non-JRockit environment, and frankly the real value offered by such data is questionable, as you are basically testing your application in a completely different execution environment from your production environment. I understand that it is not always possible to recreate the same environment and workload conditions in a test environment, but using a completely different JVM seems redundant and counterproductive to the testing goal. There are many profiling and performance management tools that will work within a standard JVM environment and can be used across multiple vendor platforms whilst providing much more application-level contextual information (is this not the whole point?). Frankly, a lot of what is reported by the JRockit tooling has been in other tools for a very long time. The real benefit of using such tooling seems to be in tuning the particular JVM's internals and not the application code. William
  19. Re: JRockit

    To be clear, that has never been claimed or even hinted at. My suggestion, for those who have an open mind, was to try out the JRockit tools against their applications running on a JRockit JVM (a JVM as "standard" as any other; that's the whole idea with Java JVMs, after all), and if you like the results, consider switching to JRockit even for your runtime to gain the advantages mentioned. Also, I understand that you want to plug your product, William, but please do so by stating specific facts that other readers might learn something from, rather than using vague (possibly and arguably wrong) negatives about what you perceive as competing software (in this case something that can be had for free vs buying a product). As for the real benefit of using the JRockit tools, I can assure you that both I and my colleagues have learned a lot about our application code and have been able to improve it accordingly. I am sure, if we were to raise it to a selling/marketing level, BEA would be able to provide you with more customer feedback to support that claim. Other than that, I am sure your product is great. :)
  20. Re: JRockit

    Par, my follow-up was not to plug my product but just to clarify that the tooling promoted by yourself works **only** with the JRockit VM. I do not perceive the BEA JRockit tooling as competition. This is nonsense. I have had a very good working relationship with the VM team itself and they have been very quick to resolve issues for us. I am sure they would be more than willing to state that our joint discussions in the past have in some way driven their own tool development. For example, we had thread contention monitoring at the trace level (method invocation) years before it appeared in any tool, including the recently released JRockit latency analyzer. We were also one of the few performance management tools to use their pre-Java 5 management API. We have in the past been extremely open (sometimes a little too open) in our discussions with them, which we would not have been if we thought they were actually competition. This is not a free-vs-buy issue. JRockit is available in two editions: a Developer Edition, free of charge and time limited, and an Enterprise Edition, priced per CPU. JXInsight has a free Development Edition as well as a licensed Server Edition. William
  21. To be clear...

    To be clear you cannot use JRockit tools within a non-JRockit environment and frankly it is questionably the real value offered by such data as you are basically testing your application in a completely different execution environment to that of your production environment. I understand that is not always possible to recreate the same environment and workload conditions in a test environment but to use a completely different JVM seems redundant and counter productive to the testing goal.

    There are many profiling and performance management tools that will work within a standard JVM environment and can be used across multiple vendor platforms whilst providing much more application level contextual information (is this not the whole point). Frankly a lot of what is reported by the JRockit tooling has been in other tools for a very long time. The real benefit in use such tooling seems to be in tuning the particular JVM internals and not the application code.

    William
    Running under a profiler is not realistic either in some cases, especially due to the performance degradation. I've had issues in code that were tracked down easily with JRockit that would have taken months of continuous running under some of the other profilers to find. And fixing them under JRockit made the leaks go away under other VMs as well. Testing an app under a different VM is generally no harder than testing it under a profiler.
  22. Re: To be clear...

    Hi Jesse,
    Testing an app under a different VM is generally no harder than testing it under a profiler.
    I really hope you were deliberately being overly simplistic in formulating the statement above. A different JVM is a completely different beast, especially with regard to GC and threading behavior. Changes in GC alone can introduce concurrency issues which then result in performance and reliability problems previously undetected -- unless of course we are talking about a HelloWorld-sized Java application. We test our product on many different platforms and I am always amazed at the differences and deviations from "expected" behavior and execution times across each platform.
    Call Stack Capture: Sun vs BEA: http://www.jinspired.com/products/jxinsight/callstackbenchmark.html
    Java 5 Mgmt Sampling: IBM vs Sun vs BEA: http://blog.jinspired.com/?p=189
    Yes, a typical code-level profiler can easily perturb the timing, resulting in problems similar to those mentioned above, but I was not talking about low-level code profiling. There are many solutions on the market that can instrument selective parts of an application and capture event profiles at extremely low overhead with relatively good consistency across platforms and JVM vendors.
    Hybrid Profiling: http://blog.jinspired.com/?p=190
    At the end of the day, no one is realistically going to use a different JVM in a pre-production or production environment. Organizations will not even consider a different version/build of the same vendor's JVM. This does not happen in practice -- IBM and Sun still dominate the production deployments. William
  23. Re: To be clear...


    I really hope you were deliberately being overly simplistic in formulating the statement above. A different JVM is a completely different beast especially with regard to GC and threading behavior. Changes in GC alone can introduce concurrency issues which then result in performance and reliability problems - previously undetected. Unless of course we are talking about a HelloWorld sized Java application.
    Azul Systems sells gear with hundreds of CPUs and a fully concurrent GC, and we find the most amazing bugs in old, supposedly stable programs. Adding more CPUs of course allows more concurrency (and thus more opportunity for latent datarace bugs to appear), but so does adding concurrent GC. Just the timing of soft-ref clearing has exposed any number of bugs in supposedly stable programs. The owners of these programs were definitely heading for trouble the next time they upgraded JVMs; Azul just tends to get there first. Cliff
    cliffc at azulsystems dot com http://www.azulsystems.com
  24. Re: To be clear...

    Running under a profiler is not realistic either in some cases, esp. due to the performance degradation.
    Blatant plug for Azul Systems' gear: our profiling tools are always on, with no special launch flags. You can attach to a week-old production app, see leaks, surf the heap, see heap allocation rates, GC cycle & pause times, look at the JIT'd code (annotated with hardware perf counters), live thread stacks, hot contended locks, I/O rates, etc...
    I've had issues in code that were tracked down easily with jrockit that would have taken months of continous running under some of the other profilers to find.

    And fixing it under JRockit made the leaks go away under other VMs as well.

    Testing an app under a different VM is generally no harder than testing it under a profiler.
    Ditto for Azul Systems'. Buy one today for your department-wide Java development & QA! :-) ... I'm only sorta being tongue-in-cheek here: a small Azul box makes a great departmental-sized development box: ~100 cpus sharing 48Gig all in a flat SMP package. You can test with more cpus than you'll use in production (unless you deploy on our gear of course) and shake out your datarace bugs in development. Cliff Click
    cliffc at azulsystems dot com http://www.azulsystems.com
  25. Turn on permgen garbage collection

    It's possible to turn on garbage collection for the permgen space. See http://my.opera.com/karmazilla/blog/2007/03/13/good-riddance-permgen-outofmemoryerror for an explanation.
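    For reference, the linked post concerns Sun HotSpot's concurrent collector; as best I recall, the flags involved are these (the application name is a placeholder):

```
java -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled MyApp
```

    With CMS enabled, these switches allow unreachable classes to be unloaded from the permanent generation instead of accumulating until an OOME.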
  26. EJBs and object leaks

    I really remember fighting OOMEs when EJB 1.1 was around... sloppy programmers would load tables with a large number of rows and then scratch their heads when the JVM blew up. Another cause of OOMEs is just bad programming when it comes to object clean-up. Some programmers rely too heavily on the GC and the Java selling point of "no memory leaks". I have even fallen victim to not properly coding objects so as to allow the GC to get to them. I do avoid calling the System GC method at all costs. However, there have been a few rare circumstances when manually calling the GC was the only way to get back memory resources. Regards, Tom Pridham
  27. Real-world OOM problems in client applications

    There is another set of real-world problems related to the out of memory error. Let us assume that we are writing a shrink-wrap client application using a 32-bit Sun JVM. The application needs to work with a wide range of in-memory data sizes, for instance a picture editor or a word processor. The application is quite well written, does not have memory leaks, and the permanent generation is sized properly.
    The first problem we run into is how to set the maximum heap size for the JVM. If we set it too low, the application may run out of memory with large input data. If we set it too high, the JVM may not start due to the lack of a contiguous chunk of virtual memory. The usual solution would be to expose the maximum heap size parameter to the end user, but the end user has no idea what this means and has never been in a situation where this parameter needs to be set. An alternative would be to write a special native JVM launcher that examines the layout of virtual memory and determines the maximum valid value of this parameter automatically. Does anybody have a better solution (besides switching to JRockit or a 64-bit JVM)?
    The second problem surfaces when an out of memory error eventually occurs, as it is not easily possible to completely prevent it unless the size of the data being processed is limited or the application is written in a very special way so as to hold only part of the data in memory. In the case of this error, we should preserve the in-memory data in a non-corrupt state and give the user a meaningful explanation of what has happened and what the next steps are. One way to ensure validity of the data would be some sort of in-memory transactions, or an undo mechanism, or self-healing data structures. In addition, we would need to free up enough memory to be able to get to a valid state of the in-memory data and then save it onto disk (there are several straightforward ways to do this). It is less obvious what to tell the user to do next. Are there any proven techniques or frameworks to deal with these issues?
    The third problem has to do with full garbage collection breaking data locality. When a system is stressed with respect to memory and has swapped part of the JVM heap onto disk, and a full garbage collection kicks in, the JVM is in effect completely stalled, as it needs to thrash the disk in order to access each piece of the heap at least once. Is there any way to mitigate this issue? Artem
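    For the "preserve data after an OOME" problem, one well-known technique (not from this thread; all names and sizes below are illustrative) is a "memory parachute": hold a reserve buffer during normal work and drop it when an OutOfMemoryError arrives, so the recovery path that saves state has heap to work with. A sketch:

```java
public class OomParachute {
    // Reserve block held during normal operation, released only when
    // recovery begins so the save-and-report path has headroom.
    private static byte[] parachute = new byte[2 * 1024 * 1024];

    static String process(boolean simulateOom) {
        try {
            if (simulateOom) {
                // Stand-in for real heap exhaustion during document processing.
                throw new OutOfMemoryError("Java heap space");
            }
            return "ok";
        } catch (OutOfMemoryError e) {
            parachute = null; // drop ~2 MB so the recovery code can allocate
            // ...snapshot the in-memory data to disk, then inform the user...
            return "recovered: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(process(true)); // prints "recovered: Java heap space"
    }
}
```

    The parachute only buys a bounded amount of headroom, so the recovery path must itself allocate frugally; it complements, rather than replaces, limiting how much data is held in memory.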