News: Detecting Memory Problems Before they Hurt

  1. Detecting Memory Problems Before they Hurt (11 messages)

    In the latest issue of The Java Specialists' Newsletter, Heinz demonstrates how we can plug into the JDK 1.5 memory MX bean to detect when the memory usage of the Old Generation Pool exceeds some threshold.

    OutOfMemoryError Warning System

    In Issue 061 of my newsletter, I asked readers whether their applications had ever caused an OutOfMemoryError. I then asked them to email me if they would like to know how to receive a warning, shortly before it was about to happen. Wow, the response! The requests kept on pouring in, and so far, I have had over 200 inquiries. At the time, my ideas for a warning system were sketchy at best, and would have been hopelessly inaccurate, in comparison to what JDK 1.5 offers us.

    JDK 1.5 has added some wonderful new management beans that make writing an OutOfMemoryError Warning System possible. The most difficult part was probably finding resources on the topic. Google turned up two resources: The JDK documentation and a website written in Japanese ;-)

    An OutOfMemoryError (OOME) is bad. It can happen at any time, in any thread, and when it does, that thread typically dies. There is little that you can do about it, except to exit the program, change the -Xmx value, and restart the JVM. If you then make the -Xmx value too large, you slow down your application. The secret is to make the maximum heap value the right size, neither too small nor too big. Often, there is not enough memory to build up a stack trace for the OOME, so you cannot even determine where it occurred, or why. You can use the exception catching mechanism of Issue 089, but that is an after-the-fact measure rather than a preventative one.

    The article then goes on to demonstrate how to do this.

    The newsletter is available at [Issue 092] - OutOfMemoryError Warning System.
  2. I don't know why there are no comments on this, but this is pretty clever stuff; at least I thought so. J2SE 5 (version 1.5) is full of new features. There are more changes in 1.5 than there were in 1.4 (from 1.3), and probably even more than in the move to Java 2 way back in the 90s.

    Good stuff, thanks for the posting.

    -John-
  3. This is great news. But fixing soft references in the JVM would help more.

    I have an application that uses caching. But I can't use soft references since they clear too fast and not in a FIFO order. With this hack I can use the OOM warnings to clear my caches and hopefully avoid OOM exceptions.
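
    A minimal sketch of that idea, assuming the cache is exposed as a thread-safe Map: a listener empties the cache whenever a heap pool crosses its usage threshold. The names CacheEvictingListener, install, and fraction are hypothetical and chosen only for illustration; the java.lang.management calls are the JDK 1.5 API discussed in this thread.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.Map;
import javax.management.Notification;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;

// Hypothetical helper: clears a cache when a memory threshold notification arrives.
class CacheEvictingListener implements NotificationListener {
    private final Map<?, ?> cache;

    CacheEvictingListener(Map<?, ?> cache) {
        this.cache = cache;
    }

    public void handleNotification(Notification notification, Object handback) {
        // Ignore every notification type except "usage threshold exceeded".
        if (MemoryNotificationInfo.MEMORY_THRESHOLD_EXCEEDED.equals(notification.getType())) {
            cache.clear();
        }
    }

    // Registers the listener and arms a usage threshold on each heap pool that supports one.
    static void install(Map<?, ?> cache, double fraction) {
        NotificationEmitter emitter = (NotificationEmitter) ManagementFactory.getMemoryMXBean();
        emitter.addNotificationListener(new CacheEvictingListener(cache), null, null);
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null || pool.getType() != MemoryType.HEAP) {
                continue; // invalid pool or non-heap pool
            }
            // getMax() is -1 when the pool has no defined maximum; skip those pools.
            if (pool.isUsageThresholdSupported() && usage.getMax() > 0) {
                pool.setUsageThreshold((long) (usage.getMax() * fraction));
            }
        }
    }
}
```

    Calling CacheEvictingListener.install(cache, 0.8) at startup would arm the threshold at 80% of each pool's maximum. The handler runs on a separate JVM thread, so a ConcurrentHashMap is a safer choice than a plain HashMap for the cache.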
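  4. -XX:SoftRefLRUPolicyMSPerMB (the HotSpot option that sets how many milliseconds a softly reachable object is kept alive per megabyte of free heap)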
  5. fixing soft references in the JVM would help more
    Are they broken? Or just overly-pessimistically implemented? (My understanding is the latter.)

    Also, for supporting large caches, have you considered any of the caching solutions that support disk overflow etc.? There are even open source ones with disk support. If you are clustering, check out Coherence .. it can aggregate the caches from all the cluster nodes into one giant clustered cache, and move stuff off heap (into NIO buffers, disk storage, etc.) to avoid the heap issues that you described.

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Clustered JCache for Grid Computing!
  6. It is trivial to control soft references. Here is a very simple soft cache implementation: http://voruta.sourceforge.net/xref/net/sf/voruta/SoftRefMemoryCache.html. You can fork it, and you can use "MAX_STRONG_REF" as a parameter to tune the cache.
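
    For readers who cannot open the link, here is a rough sketch of the general technique, not the Voruta code itself: every value is held through a SoftReference, and the most recently used entries are additionally held strongly so the collector cannot clear them. The class name SoftCache and the constructor parameter maxStrongRefs are hypothetical; maxStrongRefs only plays a role similar to the "MAX_STRONG_REF" setting mentioned above.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: soft cache that pins the N most recently used values.
class SoftCache<K, V> {
    // All entries, softly referenced; the GC may clear these under memory pressure.
    private final Map<K, SoftReference<V>> soft = new HashMap<K, SoftReference<V>>();
    // The most recently used entries, strongly referenced.
    private final Map<K, V> strong;

    SoftCache(final int maxStrongRefs) {
        // Access-ordered LinkedHashMap that drops its eldest entry once the limit is passed.
        this.strong = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxStrongRefs;
            }
        };
    }

    synchronized void put(K key, V value) {
        soft.put(key, new SoftReference<V>(value));
        strong.put(key, value);
    }

    synchronized V get(K key) {
        SoftReference<V> ref = soft.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            soft.remove(key); // never cached, or already cleared by the GC
            return null;
        }
        strong.put(key, value); // mark as recently used
        return value;
    }

    synchronized int strongSize() {
        return strong.size();
    }
}
```

    A larger maxStrongRefs keeps more entries safe from the collector at the cost of more heap that can never be reclaimed under pressure.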
  7. It is trivial to control soft references. Here is a very simple soft cache implementation: http://voruta.sourceforge.net/xref/net/sf/voruta/SoftRefMemoryCache.html. You can fork it, and you can use "MAX_STRONG_REF" as a parameter to tune the cache.
    Will definitely check out your memory cache :-)

    I have seen OOME occur in places where the Soft Reference would not have been of any use. For example, I had to migrate some MySQL database access to Microsoft SQL Server. The JDBC access code worked fine under MySQL, but when we ported it, we started getting OOME after a few days. This is obviously rather frustrating, as it becomes difficult to establish whether the bug has been detected and removed. I eventually gave up and simply rewrote the entire database access mechanism.

    When we logged the OOME, all we saw was:

    java.lang.OutOfMemoryError

    and that was it, no thread dump, nothing. Once in a blue moon we saw an OOME with a stack trace, and that was the only reason why we suspected the database, since the OOME appeared while we were reading from the database.

    With an early warning system, you can get a list of what the threads are busy doing, and then have a better chance of figuring out the problem.
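
    As a rough illustration of what such a dump could look like (a sketch, not the newsletter's code), Thread.getAllStackTraces(), which is new in JDK 1.5, returns the stack of every live thread. The class name ThreadDumper and the method dumpAllThreads are hypothetical.

```java
import java.util.Map;

// Hypothetical helper: renders the name, state and stack of every live thread.
class ThreadDumper {
    static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            sb.append('"').append(thread.getName()).append("\" state=")
              .append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

    Calling dumpAllThreads() from a memory threshold listener and logging the result would record what each thread was doing while there is still heap left to build the report.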

    Last Thursday I also went and wrote a similar early warning system for monitoring the number of threads, and for auto-detecting Thread deadlocks in JDK 1.5.
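
    That code is not shown in this thread, but a minimal sketch of both checks, assuming the JDK 1.5 ThreadMXBean, might look like the following. The class name ThreadWatch, its method names, and the limit parameter are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: checks the live thread count and looks for monitor deadlocks.
class ThreadWatch {
    // True when the number of live threads exceeds the given limit.
    static boolean tooManyThreads(int limit) {
        return ManagementFactory.getThreadMXBean().getThreadCount() > limit;
    }

    // Returns details of threads deadlocked on object monitors, or an empty array if none.
    static ThreadInfo[] deadlockedThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findMonitorDeadlockedThreads(); // null when no deadlock exists
        if (ids == null) {
            return new ThreadInfo[0];
        }
        return threads.getThreadInfo(ids, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        System.out.println("too many threads: " + tooManyThreads(500));
        System.out.println("deadlocked threads: " + deadlockedThreads().length);
    }
}
```

    Unlike the memory pools, the thread bean offers no threshold notification for these values, so a sketch like this would have to be polled from a timer.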

    Heinz
  8. I don't know why there are no comments on this, but this is pretty clever stuff; at least I thought so. J2SE 5 (version 1.5) is full of new features. There are more changes in 1.5 than there were in 1.4 (from 1.3), and probably even more than in the move to Java 2 way back in the 90s. Good stuff, thanks for the posting. -John-
    Thanks for the nice comment John :-)

    Heinz
  9. If I'm reading the code correctly, it looks like it's too specific and contains an error. By too specific, I mean that an OOME can be the result of running out of space for the permanent generation as well as for the old generation. As for the error, it will be triggered when the permanent generation causes a MemoryNotificationInfo.MEMORY_THRESHOLD_EXCEEDED event to occur. In that case, the code will get the max and used memory from the *old* generation as well as modify the usage threshold on the *old* generation. Here's some alternate code (hope it formats properly):

    package com.dotech;

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryMXBean;
    import java.lang.management.MemoryNotificationInfo;
    import java.lang.management.MemoryPoolMXBean;
    import java.lang.management.MemoryUsage;
    import java.util.ArrayList;
    import java.util.Collection;
    import java.util.HashMap;
    import java.util.Map;
    import javax.management.Notification;
    import javax.management.NotificationBroadcaster;
    import javax.management.NotificationListener;
    import javax.management.openmbean.CompositeData;

    public class MemoryWarningSystem {

        private static final Map<String, MemoryPoolMXBean> POOLS = new HashMap<String, MemoryPoolMXBean>();

        private static final NotificationListener listener = new NotificationListener() {
                public void handleNotification(Notification notification, Object handback) {
                    if (MemoryNotificationInfo.MEMORY_THRESHOLD_EXCEEDED.equals(notification.getType())) {
                        CompositeData cd = (CompositeData)notification.getUserData();
                        MemoryNotificationInfo info = MemoryNotificationInfo.from(cd);
                        MemoryUsage memUsage = info.getUsage();
                        String poolName = info.getPoolName();
                        System.out.printf("Notification: %s, count=%d, usage=[%s]%n",
                                          poolName,
                                          info.getCount(),
                                          memUsage);
                        MemoryPoolMXBean memPool = POOLS.get(poolName);
                        setUsageThreshold(memPool, 0.8);
                    }
                }
            };

        public static void setUsageThreshold(MemoryPoolMXBean memPool, double percentage) {
            MemoryUsage memUsage = memPool.getUsage();
            long max = memUsage.getMax();
            memPool.setUsageThreshold((long)(max * percentage));
        }

        public static void main(String[] args) {
            MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
            ((NotificationBroadcaster)mem).addNotificationListener(listener, null, null);

            for (MemoryPoolMXBean memPool : ManagementFactory.getMemoryPoolMXBeans()) {
                if (memPool.isUsageThresholdSupported()) {
                    POOLS.put(memPool.getName(), memPool);
                    setUsageThreshold(memPool, 0.6);
                }
            }

            Collection<byte[]> data = new ArrayList<byte[]>();
            while (true) {
                data.add(new byte[1024 * 1024]);
                try {
                    Thread.sleep(250);
                } catch (InterruptedException exc) {
                    exc.printStackTrace();
                    System.exit(1);
                }
            }
        }
    }

    When run:

    java -Xms4m -Xmx8m -cp classes com.dotech.MemoryWarningSystem

    You'll get something like:

    Notification: PS Old Gen, count=1, usage=[init = 1966080(1920K) used = 4194368(4096K) committed = 5308416(5184K) max = 5636096(5504K)]
    Notification: PS Old Gen, count=2, usage=[init = 1966080(1920K) used = 5242960(5120K) committed = 5308416(5184K) max = 5636096(5504K)]
    Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
  10. Permanent Generation OOME

    If I'm reading the code correctly, it looks like it's too specific and contains an error. By too specific, I mean that an OOME can be the result of running out of space for the permanent generation as well as for the old generation. As for the error, it will be triggered when the permanent generation causes a MemoryNotificationInfo.MEMORY_THRESHOLD_EXCEEDED event to occur. In that case, the code will get the max and used memory from the *old* generation as well as modify the usage threshold on the *old* generation.
    Very interesting, Kris. I have not encountered an OOME of the Permanent Generation. Thanks for the correction :-)

    Incidentally, I am currently working on a newsletter to warn you if you are creating too many threads, since that has been a source of OOME for me. There is an additional benefit that I will show when it's ready :-)

    Kind regards

    Heinz
  11. I promised Heinz a follow-up showing how to set the size of the permanent generation, but I thought I'd add another example as well. I'm currently dealing with a commercial app server that seems to have a leak in its permanent generation. The problem is accelerated in development settings where frequent app redeployment occurs. So, my initial thought is that this has to do with poor class loader management, and hence the example showing how to blow out the permanent generation through class loading. I run this with:

    java -XX:PermSize=4m -XX:MaxPermSize=4m -cp classes com.dotech.LotsOfClassLoaders

    Which produces:

    Exception in thread "main" java.lang.OutOfMemoryError: PermGen space

    PermSize sets the initial size of the permanent generation while MaxPermSize sets the maximum size. If you want to play with this on your system, you'll have to use your own JAR URL and class name. Just be sure that you're not using something that can be loaded by the built-in class loaders.

    package com.dotech;

    import java.net.MalformedURLException;
    import java.net.URL;
    import java.net.URLClassLoader;
    import java.util.ArrayList;
    import java.util.Collection;

    public class LotsOfClassLoaders {

        public static void main(String[] args) {
            ClassLoader parentLoader = LotsOfClassLoaders.class.getClassLoader();
            Collection<ClassLoader> loaders = new ArrayList<ClassLoader>();
            try {
                URL jarUrl = new URL("file:///home/kschneid/apache/jakarta/commons/commons-httpclient-2.0/commons-httpclient-2.0.jar");
                URL[] urls = new URL[] { jarUrl };
                while (true) {
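                    // Each new loader defines its own copy of the class in the permanent generation.
                    ClassLoader loader = URLClassLoader.newInstance(urls, parentLoader);
                    try {
                        loader.loadClass("org.apache.commons.httpclient.HttpClient");
                        // Holding on to the loader keeps its classes reachable, so they are never unloaded.
                        loaders.add(loader);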
                    } catch (ClassNotFoundException exc) {
                        exc.printStackTrace();
                        System.exit(2);
                    }
                }
            } catch (MalformedURLException exc) {
                exc.printStackTrace();
                System.exit(1);
            }
        }
    }
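
    To check what the PermSize and MaxPermSize settings above actually produced, one option is to print the init and max values of every memory pool. This is a sketch under assumptions: the class name PoolReport is hypothetical, and pool names differ between JVMs and collectors, so the permanent generation may appear under different names. On Java 8 and later the permanent generation was replaced by Metaspace, so these flags no longer apply there.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper: lists each memory pool with its type and its init/used/max sizes in bytes.
class PoolReport {
    static String describePools() {
        StringBuilder sb = new StringBuilder();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // A value of -1 for init or max means "undefined" for that pool.
            sb.append(String.format("%-30s %-16s init=%d used=%d max=%d%n",
                    pool.getName(), pool.getType(),
                    usage.getInit(), usage.getUsed(), usage.getMax()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describePools());
    }
}
```

    Running it with the same -XX:PermSize and -XX:MaxPermSize flags on a JVM that still has a permanent generation should show matching init and max values for that pool.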
  12. It seems that it is just monitoring heap memory usage. Is there any API to raise an exception if a variable uses more than a specified size?