
News: Artima Article: "Myths of Memory Management"

  1. Artima Article: "Myths of Memory Management" (76 messages)

    In this Artima Article, author Christopher Diggins discusses "Myths of Memory Management," saying that "there is a widespread belief that garbage collection leads to automatic memory management, and that without a GC memory management is necessarily complex. These are both false."
    Given that a programmer in Java or C# should be setting their pointers to NULL, what is the difference between that and explicitly calling a delete? The answer is more or less nothing. The real question is what happens when the programmer makes the inevitable mistake. Undefined behaviour is unacceptable for most of us sane people, but silently keeping things alive is a very hard-to-detect resource leak. I think a programmer would have the best of both worlds if their language avoided undefined behaviour, threw errors when things aren't deleted when they should be, and prevented things from being deleted which shouldn't be.
    What do you think? Is resource management that difficult to get right? Do you think setting references to null is better than using scoping rules to invalidate references?

    Threaded Messages (76)

  2. Artima Article: "Myths of Memory Management"

    It could be an interesting feature to have a Development-Time-Only 'delete' in java that throws/logs an exception if other references exist to the object. Of course, a method that returns the number of references that exist to an object would also work.
  3. From the article...

    "The best way to assure a resource is released is for the
    > programmer to set pointers to null when they no longer refer
    > to a resource (or let them go out of scope), and to
    > minimize the sharing of objects (which is just good
    > practice)."

    Even the author knows that going out of scope is one way to allow the objects to be marked for GC, but he still thinks setting to null and calling delete are equivalent?

    I agree with the rest, this article is looking for a problem that doesn't exist.
  4. setting refs to null is code smell

    I consider setting references to null a code smell. I rely on proper program structure that ensures stuff goes out of scope. If you need to set a ref to null, that means your variable doesn't go out of scope. In other words, it is sort of a global variable. Global variables suck for this and other reasons.
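    A small sketch of that point (the class and names here are mine, not from the thread): confining a reference to the narrowest block makes nulling unnecessary, because the object becomes unreachable as soon as the block exits.

```java
public class ScopingDemo {
    static int process() {
        int total = 0;
        {
            // 'buffer' is only reachable inside this block; once the block
            // exits, the array is eligible for GC without any explicit null.
            int[] buffer = new int[1024];
            for (int i = 0; i < buffer.length; i++) buffer[i] = i;
            total = buffer[buffer.length - 1];
        }
        // 'buffer' is out of scope here; no need for buffer = null.
        return total;
    }

    public static void main(String[] args) {
        System.out.println(process());
    }
}
```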
  5. setting refs to null is code smell

    I consider setting references to null a code smell. I rely on proper program structure that ensures stuff goes out of scope. If you need to set a ref to null, that means your variable doesn't go out of scope. In other words, it is sort of a global variable. Global variables suck for this and other reasons.

    I agree, but this isn't really what we are talking about. We are talking about setting references to null in order to 'help' the garbage collector. It always surprises me how many people still believe this.

    I will note that there is one case where someone demonstrated it was necessary when it wouldn't seem to be. In a loop like this:

    while (condition) {
        Object o = getBigObject();
    }

    it's normally fine to leave this alone. However, since o will generally be compiled out to a method-level reference, it holds onto the big object until the next call to getBigObject() changes the reference. If the VM runs out of memory in the middle of getBigObject(), however, it can't reclaim the memory referenced by o, so it throws an OutOfMemoryError. If you set o to null at the end of the loop body, you can resolve this issue.
  6. setting refs to null is code smell

    In a loop like this: while (condition) { Object o = getBigObject(); } it's normally fine to leave this alone. However, since o will generally be compiled out to a method-level reference, it holds onto the big object until the next call to getBigObject() changes the reference. If the VM runs out of memory in the middle of getBigObject(), however, it can't reclaim the memory referenced by o, so it throws an OutOfMemoryError. If you set o to null at the end of the loop body, you can resolve this issue.

    If nulling is important as you insist, then the compiler should automatically do it after the loop. If the compiler doesn't do this, then maybe it isn't as important as you claim.
  7. setting refs to null is code smell

    If nulling is important as you insist, then the compiler should automatically do it after the loop. If the compiler doesn't do this, then maybe it isn't as important as you claim.

    It amazes me that some people think that fallacious arguments are a substitute for actual knowledge.
  8. setting refs to null is code smell

    Here's an example of the issue:

        public static void main(String[] args)
        {
            int[] array = null;
            
            for (int i = 0; i < 100; i++) {
                System.out.println(i);
                
                //array = null;
                
                array = new int[300000];
            }
            
            System.out.println(array);
        }

    Set -Xmx1m and run without nulling. It fails with Sun's 1.4.2_04 compiler and VM giving an OutOfMemoryError. Uncomment the line that nulls the array and it executes to completion.
  9. interesting

    Here's an example of the issue:

        public static void main(String[] args)
        {
            int[] array = null;
            
            for (int i = 0; i < 100; i++) {
                System.out.println(i);
                
                //array = null;
                
                array = new int[300000];
            }
            
            System.out.println(array);
        }

    Set -Xmx1m and run without nulling. It fails with Sun's 1.4.2_04 compiler and VM giving an OutOfMemoryError. Uncomment the line that nulls the array and it executes to completion.

    Reading the code, I wouldn't have guessed that's what would happen. thanks for sharing the info.

    peter
  10. setting refs to null is code smell

    array = new int[300000];

    That will compile to something like:
    ldc 300000
    newarray int
    astore array

    After the newarray op, the stack will contain a reference to a new array of 300000 ints, and the register named "array" will still contain a reference to an old array of 300000 ints. The old array is not GC-able until after the next op, which assigns the new array reference to the register named "array". At that point, nothing retains a reference to the old array, and only then can it be GC'd.

    Peace,

    Cameron Purdy
    Tangosol Coherence: This space available.
  11. setting refs to null is code smell

    One point about my code example above. There is nothing special about assigning the ref to null. It's not that the ref is null that allows the array to be GC'd, it's that there are no longer any refs pointing to it. For example,

    array = new int[0];

    Would also allow the 300000 length array to be GC'd before the next array is created. I only add this because it's a common misconception about Java that setting refs to null causes Objects to be GC'd.
  12. setting refs to null is code smell

    Flame me if I'm wrong, but in Java an instance variable's scope is bound to the scope of the owning object. The lifespan of an object is initiated when the code creates the object, and ends when the garbage collector collects the object. So the temporal scope of an instance variable is implicit, meaning the programmer can't control it, so by your logic instance variables are a code smell.

    Anyway, if you have a complex object graph, especially one with cyclical references, and you don't set references to null when you're done with the object graph, then the garbage collector has to work really hard to figure out that those objects in the object graph are unreachable.

    Of course, in order to break apart the object graph for easy collection, you have to know when you're done with it, and you have to write a method that will break it apart. So you end up writing a lame destructor (lame because it doesn't deallocate the objects) and calling it at an explicit point in your program. Furthermore, the pseudo-destructed objects should go into an invalid state, so now all the methods on the object (unless they restore it to a valid state) should throw an IllegalStateException if they are called. In C++ you'd get a segfault if the memory was no longer allocated, or glorious undefined behavior if it was deallocated and then the same address was used for something else.
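    A minimal sketch of that "lame destructor" pattern (hypothetical class, not from the article): dispose() breaks the references apart, and later use fails fast with an IllegalStateException instead of undefined behavior.

```java
public class DisposableNode {
    private DisposableNode next;          // link in a possibly cyclic graph
    private int[] payload = new int[256];
    private boolean disposed = false;

    void setNext(DisposableNode n) { next = n; }

    // The pseudo-destructor: clears references so the graph falls apart,
    // and marks this object invalid. It cannot actually deallocate anything.
    void dispose() {
        next = null;
        payload = null;
        disposed = true;
    }

    int payloadLength() {
        if (disposed) throw new IllegalStateException("already disposed");
        return payload.length;
    }
}
```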
  13. setting refs to null is code smell

    Flame me if I'm wrong, but in Java an instance variable's scope is bound to the scope of the owning object.

    If I understand you correctly, you are wrong, or at least offering a very incomplete model for garbage collection. But I will not flame you.
    Anyway, if you have a complex object graph, especially one with cyclical references, and you don't set references to null when you're done with the object graph, then the garbage collector has to work really hard to figure out that those objects in the object graph are unreachable.

    This is not the case. If you want to make this claim, you need to explain why.

    The simplest type of GC (that I am aware of) is a mark and sweep collector. So let's limit the discussion to that for the time being.

    In a basic mark and sweep GC scheme, the mark phase consists of going through and finding all reachable Objects. It starts from its root set of references and follows the references until it has marked all reachable Objects. Then, the sweep phase consists of collecting all the Objects that are not marked.

    Here's why cyclical graphs are not difficult to GC. The cyclical references are all reachable or they are all not. If they are reachable, the mark phase walks to one, then follows the cycle around and marks all of them. If none of the Objects in the cycle are reachable, the mark phase will never look at them. What's difficult about that?
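    That mark phase can be sketched in a few lines (toy collector over a hand-built graph; all names here are mine): the "already marked" check is what makes reachable cycles terminate, and unreachable cycles are simply never visited.

```java
import java.util.*;

public class MarkSweepDemo {
    static class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked = false;
        Obj(String name) { this.name = name; }
    }

    static void mark(Obj o) {
        if (o.marked) return;          // already visited: this breaks cycles
        o.marked = true;
        for (Obj r : o.refs) mark(r);
    }

    // Returns the names of the objects that survive collection.
    static Set<String> collect(List<Obj> roots, List<Obj> heap) {
        for (Obj root : roots) mark(root);                   // mark phase
        Set<String> live = new TreeSet<>();
        for (Obj o : heap) if (o.marked) live.add(o.name);   // sweep phase
        return live;
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c"), d = new Obj("d");
        a.refs.add(b); b.refs.add(a);   // reachable cycle a <-> b
        c.refs.add(d); d.refs.add(c);   // unreachable cycle c <-> d
        // prints [a, b]: the unreachable cycle is swept despite its references
        System.out.println(collect(List.of(a), List.of(a, b, c, d)));
    }
}
```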
  14. setting refs to null is code smell

    Ok, so the mark phase marks reachable objects, not unreachable objects. That's what I get for skimming instead of reading. Now I'm having a "duh" moment because that makes a lot more sense.

    Thanks for the explanation.
  15. What do you think? Is resource management that difficult to get right?

    To be honest, yes.

    Sure, things can get inefficient if you don't make an effort to manage some resources, but the point of GC is that you don't have to manage them all. In contrast to the article, my experience is that the design issue of who owns what resource is not inescapable, and ignoring it in many cases has no repercussions and does not generally result in poor performance. I have no specific examples, just years of experience of letting the GC do its work.
  16. I have to agree as well. While I have no problems with memory management in my own code, it can be a real pain when there are many programmers and weak coding standards, in particular when some of the developers are novices.

    The smart pointers in C++ are much more error-prone than a garbage-collected language, and in particular the consequences are more fatal. A mistake with a smart pointer can blow up your program in production (not popular!), while a memory leak is more likely to require controlled restarts or more computing resources. The leaks can be flushed out using a profiler, but they are typically not quite as urgent to fix.

    If you have experienced developers, good coding standards and tight performance requirements manual memory management is fine. In most settings I would pick GC, though.

    However, I certainly miss destructors. A finalizer that may or may not be called at some unknown point doesn't come close.
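    For what it's worth, the closest Java idiom to a deterministic destructor is a finally block: the release runs at a known point whether or not the body throws. A minimal sketch (Resource is a hypothetical stand-in, not a real API):

```java
public class CleanupDemo {
    static class Resource {
        boolean open = true;
        void close() { open = false; }   // deterministic release
        int use() {
            if (!open) throw new IllegalStateException("closed");
            return 42;
        }
    }

    static int withResource() {
        Resource r = new Resource();
        try {
            return r.use();
        } finally {
            r.close();   // runs as the block exits, like a C++ destructor
        }
    }
}
```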
  17. "However, I certainly miss destructors. "

    Deterministic destructors and stack-based objects would make a world of difference.
  18. I think the author is correct when he says, "Some programmers seem to think that because the language has a GC they simply can turn a blind eye to resource management. I think this accounts for why so many programs written in Java I come across consume staggering amounts of resources, and have miserable performance, despite all the research which says how efficient Java is compared to C++."

    You definitely cannot turn a blind eye to memory management. A lot of Java programmers don't bother looking at their resource usage under a debugger and get caught with memory leaks from things like the ever so common static collection resources not being released. If the only time you ever look at your code under a debugger is when there's a bug, then you're definitely vulnerable to resource bugs. Don't worry though...they'll find them in production!

    However, I agree with you that the author overstates the problem. Most memory leaks are actually pretty easy to find and you certainly do not need to set all your objects to null. I find it ironic that this guy clearly doesn't understand the intricacies of Java memory management while at the same time he derides the common Java coder who he feels doesn't seem to understand that his code can get memory leaks.
  19. I more than agree. You don't go far enough though...

    If you want to be able to safely execute untrusted code, you cannot allow type safety violations. If you allow explicit memory management, you open up the possibility of orphaning and leaking memory. If you allow C-style typecasting (memory re-interpretation), then you allow type violations. And for garbage collection to work, you need inviolable type safety (so that the heap can be walked).

    So, type safety, security, and garbage collection all go together. This is not just about preventing mistakes, but about preventing abuse.
  20. Anyone who's run into malloc errors in C code would say that memory management is difficult. Not to mention pointer errors; these were major nightmares for me. We had to be exceedingly careful in order to keep resources correctly allocated and deallocated all the time, and it used to take just one small mistake to make the whole program go wild, and sometimes these bugs were very hard to find. I'd trade all that for an automatic GC anytime. Better to risk some performance impact than fry brain cells over things that can be automated.
  21. Read the author's self description before worrying about his blog...


    About the Blogger
    -----------------
    Christopher Diggins is a software developer and freelance writer. Christopher loves programming, but is eternally frustrated by the shortcomings of modern programming languages. As would any reasonable person in his shoes, he decided to quit his day job to write his own ( www.heron-language.com ). Christopher hopes to find a company willing to hire him for development on Heron so that he can stop writing about software development techniques in C++. Christopher can be reached through his home page at www.cdiggins.com.
  22. Given that a programmer in Java or C# should be setting their pointers to NULL...
    FUD at its best... Not only is it unnecessary in most situations (read any Java book), but it has lately been discouraged to rely on such a technique at all. Anyone who spent years dealing with C/C++ development would agree that the productivity leap from manual memory management in languages like C/C++ to an automatic one like in Java or C# is very dramatic.

    Regards,
    Nikita, GridGain Systems.
  23. Given that a programmer in Java or C# should be setting their pointers to NULL...
    FUD at its best... Not only is it unnecessary in most situations (read any Java book), but it has lately been discouraged to rely on such a technique at all. Anyone who spent years dealing with C/C++ development would agree that the productivity leap from manual memory management in languages like C/C++ to an automatic one like in Java or C# is very dramatic. Regards, Nikita, GridGain Systems.

    Finding resource leaks (or performance) isn't so hard with the tools that are available. That being said, being a middleware developer, I do miss the performance benefits of C/C++, specifically, allocating objects on the stack.
  24. Given that a programmer in Java or C# should be setting their pointers to NULL...
    FUD at its best... Not only is it unnecessary in most situations (read any Java book), but it has lately been discouraged to rely on such a technique at all. Anyone who spent years dealing with C/C++ development would agree that the productivity leap from manual memory management in languages like C/C++ to an automatic one like in Java or C# is very dramatic. Regards, Nikita, GridGain Systems.
    Finding resource leaks (or performance) isn't so hard with the tools that are available. That being said, being a middleware developer, I do miss the performance benefits of C/C++, specifically, allocating objects on the stack.

    I'm not sure how this compares to stack-based allocation, but are you aware that Object allocation in modern JVMs is much faster than heap-based allocation in C++? For short-lived Objects, the collection and deallocation is extremely quick and efficient. Could it be that stack-based allocation is only necessary given the slowness of heap-based allocation in C++?
  25. That being said, being a middleware developer, I do miss the performance benefits of C/C++, specifically, allocating objects on the stack.
    I'm not sure how this compares to stack-based allocation, but are you aware that Object allocation in modern JVMs is much faster than heap-based allocation in C++? For short-lived Objects, the collection and deallocation is extremely quick and efficient. Could it be that stack-based allocation is only necessary given the slowness of heap-based allocation in C++?

    Can you actually back that up? I'm sure you're aware that different libc and stdc++ libs use different algorithms, different compilers produce different code, and that there's a hell of a lot more hardware to run C++ on than Java, so which are you referring to when claiming that?
    I for one have written a small benchmark that:
    - declares and defines a class "object" that contains an int and a string
    - the constructor takes an int and a string parameter (by reference in C++)
    - the ctor initialises its int and string members with the provided values and then allocates an int[8192]
    - C++ only: destructor deletes[] the array
    - the object class provides a void foo() call that increments the member int
    - main() loops 1000*k (where k = argv.length > 0 ? Integer.parseInt(argv[0]) : 1000)
    - C++ : conditional compilation to test stack versus heap allocation:
    #ifndef ALLOC_DYNAMIC
        object o(i, s);
        o.foo();
    #else
        object* o = new object(i, s);
        o->foo();
        delete o;
    #endif
    - Java :
          object o = new object(i, s);
          o.foo();
          o = null;
    Well, guess what, java (IBMJava2-142 IBMJava2-amd64-142 j2sdk1.4.2 j2sdk1.4.2_04 jdk1.5.0 jdk1.5.0_03 jdk1.5.0_03.64) does 44s/39s/0.3s real/user/sys and C++ (GCC 4.0.1, glibc 2.3.5) does 0.3s/0.2s/0.00s real/user/sys. Java runs with -Xmx512m -Xms256m. Also, when declaring a finalizer on the "object" class that decrements a public static _deleted member, the Java code compiled to native code with GCJ kicks all the VMs' asses collectively by running almost twice as fast and using half the memory size (have not figured out a way (yet) to tweak GCJ's built-in (Hans Boehm-based) GC).

    Same test run with an array of just 512 elements gives:
    0.146s c++/stack, 0.270s c++/heap, 11.158s Java/modern JVM ;) and 6.67s GCJ/native.

    Care to prove the benchmark's flawed? Care to try it yourself?
    Can you actually back that up? I'm sure you're aware that different libc and stdc++ libs use different algorithms, different compilers produce different code, and that there's a hell of a lot more hardware to run C++ on than Java, so which are you referring to when claiming that?

    It's from the article hosted by IBM that I linked above. Generational GC allows the free heap space to always be contiguous, meaning that the VM never needs to search for fragmented memory. My understanding is that this is not the case in C.
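    The "never needs to search" point can be illustrated with a toy bump allocator (a hypothetical sketch, not how any real VM is written): when the free region is contiguous, an allocation is one bounds check plus one addition, with no free-list walk.

```java
public class BumpAllocator {
    private final byte[] heap;
    private int top = 0;   // next free offset in the contiguous region

    BumpAllocator(int size) { heap = new byte[size]; }

    // Returns the offset of the new block, or -1 when a collection is needed.
    int allocate(int bytes) {
        if (top + bytes > heap.length) return -1;  // would trigger a minor GC
        int result = top;
        top += bytes;      // the entire allocation: a single pointer bump
        return result;
    }
}
```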

    I for one have written a small benchmark

    I hope you realize that small benchmarks are almost always inherently flawed. I don't have any idea how you are timing this. If I had to guess, I'd wager your benchmark is basically timing how long it takes the JVM to start.
  27. I just tested how long it takes to allocate a Object and a small String ("test") on my laptop in Java using an existing speed tester.

    The methods are called via interface references and I call them a million times per test and 100 tests per run to ensure I get a non-zero result for each trial (tester fails if result is less than 10 ms)

    empty-method: ~1.3*10^-8 sec.
    Object alloc: ~4.5*10^-8 sec.
    String alloc: ~1.0*10^-7 sec.

    Unfortunately, I don't have a C or C++ compiler on my laptop and cannot get one. Do these numbers match up with the results you are seeing? I don't have enough context from your post to determine whether that is the case.
    Can you actually back that up? I'm sure you're aware that different libc and stdc++ libs use different algorithms, different compilers produce different code, and that there's a hell of a lot more hardware to run C++ on than Java, so which are you referring to when claiming that?
    It's from the article hosted by IBM I linked above. Generational GC allows the free heap space to always be contiguous meaning that the VM never needs to search for fragmented memory. My understanding is that is not the case in C.
    I for one have written a small benchmark
    I hope you realize that small benchmarks are almost always inherently flawed. I don't have any idea how you are timing this. If I had to guess, I'd wager your benchmark is basically timing how long it takes the JVM to start.

    OK, you'll definitely have to do better than that.
    First of all, the developerWorks article does not make the same claims you did in the same terms.
    Second of all, you seem to conveniently ignore the difference between 44s and 0.3s, putting it down to VM startup time. Riight.
    Yes, I do know that most microbenchmarks are flawed. Sadly, there's this inconvenient thing called proof. The article and the discussion were about memory allocation, right? Well, the code does just that: allocates memory, repeatedly, and tries to make sure that the calls are not optimised away.
    Here's the code, try it yourself. Ran on AMD Athlon(tm) 64 Processor 3000+, running Fedora Core 4, kernel 2.6.12 x86_64, GCC 4.0.1.
    Code is compacted to save space.
    #include <iostream>
    #include <string>
    #include <cstdlib>
    using namespace std;
    class object {
      public: object(int i, string& s) : _i(i), _s(s) { _arr = new int[512]; _new++; }
        void foo() { _i++; } ~object() { delete [] _arr; _delete++; }
        static int _new, _delete;
      private: int _i; string _s; int* _arr;
    };
    int object::_new = 0; int object::_delete = 0;
    int main(int argc, char** argv) {
      int k = argc > 1 ? atoi(argv[1]) : 1000; string s("ala bala portocala");
      cout << "doing " << 1000*k << " iterations" << endl;
      for (int i = 0; i < 1000*k; i++) {
    #ifndef ALLOC_DYNAMIC
        object o(i, s); o.foo();
    #else
        object* o = new object(i, s); o->foo(); delete o;
    #endif
      }
      cout << object::_new << "/" << object::_delete << endl;
    }
    And Java:
    public class memtest {
      public static class object {
        public object(int i, String s) { _i = i; _s = s; _arr = new int[512]; _new++; }
        void foo() { _i++; }
        public void finalize() { _delete++; }
        private int _i; private String _s; private int[] _arr; public static int _new=0, _delete=0;
      }
      public static void main(String[]argv) {
        int k = argv.length > 0 ? Integer.parseInt(argv[0]) : 1000; String s = new String("ala bala portocala");
        System.out.println("doing "+(1000*k)+" iterations");
        for(int i=0; i<1000*k; i++) { object o = new object(i, s); o.foo(); } System.out.println(object._new + "/" + object._delete);
      }
    }

    This one's for smaller arrays (512). Bump it up to 8192 and see what happens.

    So, please, prove me wrong. And next time don't "wager" and don't grossly misrepresent other people's articles, cool ?
  29. And next time don't "wager" and don't grossly misrepresent other people's articles, cool ?

    "Because a copying collector is used for the young generation, the free space in the heap is always contiguous so that allocation of a new object from the heap can be done through a simple pointer addition, as shown in Listing 1. This makes object allocation in Java applications significantly cheaper than it is in C, a possibility that many developers at first have difficulty imagining."

    What did I misrepresent about this? And you are acting like an ass. Does my claim personally offend you? What's your problem?
  30. Not quite the expert on the matter, but I seem to remember that Strings are pooled in the JVM.
    Also, I think it's fair to assume that applications that care about memory allocation speed usually crunch data for a living. While it's fair to say that these applications usually use pools of memory and don't just allocate whenever they feel like it, I think it would make more sense to test allocating struct-like classes with lots of public members and/or memory chunks as arrays, as these seem more fit for the scenario.
    I don't want this to turn into a debate on the fine art of building performant data-crunching applications, so please excuse my potential ignorance. I do know a couple of things on the subject, but there's always more knowledgeable people around, innit? So please, no "that's not the way it's done in a real top-performance app" comments, if possible; I know that's not how it's done, but Mr. Watson's claim was strictly about memory allocation, so there you go, there's my take on it.
    Oh, and yes, I do know timing is done with System.currentTimeMillis(), but since the tests take so long I figured I could be lazy and ignore it. Also, the number of runs (at least 1,000,000) is quite enough (if I remember correctly) for the JVM to JIT everything.

    Just saw your latest reply. Thank you, you're very kind. My problem is basically with your wagering about, as pointed out. You've clearly either misread the numbers or have tried to imply that I've no idea what I'm doing. And I stand corrected, the article's claims are indeed on par with yours - I read this a long time ago and have not revisited it today, so I might have been a bit fuzzy on the specific claims.
    there's my take on it

    As Erik (whose brain is clearly working faster than mine) points out, your test doesn't represent a real case. You avoid the exact issue the article discusses by clearing the heap on each call.
    so long I figured I could be lazy and ignore it

    The number of iterations is probably fine, though I consider that a single test as anything less doesn't even register a non-zero execution time in my VM. However, your test is bloated with a lot of things that aren't directly pertinent to Object allocation. Could you run a control and see what result you get when the method body is commented out?
    You've clearly either misread the numbers or have tried to imply that I've no idea what I'm doing

    I don't remember saying you have no idea what you are doing. All I said is I didn't know what you are doing. I have no problem with you disputing my claim or even proving me wrong. I've been wrong before, I'll be wrong again (probably.) Ask my wife. I just don't see the need for the hostile, defensive tone. I don't beat around the bush, if my bluntness caused offense, it was not intentional.

    BTW, is anyone else having problems with quoting posts? A lot of my attempts to post fail until I edit all the punctuation out of the quote.
    [Same test run with an array of just 512 elements gives:
    0.146s c++/stack, 0.270s c++/heap, 11.158s Java/modern JVM ;) and 6.67s GCJ/native.

    Care to prove the benchmark's flawed? Care to try it yourself?]

    I actually tried, using this test:
    public static void main (String[] args) {
            class Foo {
                public Foo(int i, String s) {
                    this.i = i;
                    this.s = s;
                    int array[] = new int[8192];
                }

                int i=0;
                String s = null;
                public void foo() {
                    i++;
                }
            }
            long start = Calendar.getInstance().getTimeInMillis();
            for (int i=0; i<512; i++) {
                Foo o = new Foo (0, "FOO");
                o.foo();
                o = null;
            }
            long end = Calendar.getInstance().getTimeInMillis();
            System.out.println("Time elapsed in sec.:" + (end-start)/1000.0);
        }
    I got the following result:
    Time elapsed in sec.:0.078
    Am I missing something in your scenario ?
  33. Am I missing something in your scenario ?

    Yes. You need to also count the time it took to start the JVM, etc. That way he can win his argument ;-)

    Peace,

    Cameron Purdy
    Tangosol Coherence: Now available in PDF
  34. Given that a programmer in Java or C# should be setting their pointers to NULL...
    FUD at its best... Not only is it unnecessary in most situations (read any Java book), but it has lately been discouraged to rely on such a technique at all. Anyone who spent years dealing with C/C++ development would agree that the productivity leap from manual memory management in languages like C/C++ to an automatic one like in Java or C# is very dramatic. Regards, Nikita, GridGain Systems.
    Finding resource leaks (or performance) isn't so hard with the tools that are available. That being said, being a middleware developer, I do miss the performance benefits of C/C++, specifically, allocating objects on the stack.

    I just noticed this in the IBM article above:

    "The JIT compiler can perform additional optimizations that can reduce the cost of object allocation to zero. Consider the code in Listing 2, where the getPosition() method creates a temporary object to hold the coordinates of a point, and the calling method uses the Point object briefly and then discards it. The JIT will likely inline the call to getPosition() and, using a technique called escape analysis, can recognize that no reference to the Point object leaves the doSomething() method. Knowing this, the JIT can then allocate the object on the stack instead of the heap or, even better, optimize the allocation away completely and simply hoist the fields of the Point into registers. While the current Sun JVMs do not yet perform this optimization, future JVMs probably will. The fact that allocation can get even cheaper in the future, with no changes to your code, is just one more reason not to compromise the correctness or maintainability of your program for the sake of avoiding a few extra allocations."
  35. It's a false statement

    Given that a programmer in Java or C# should be setting their pointers to NULL, what is the difference between that and explicitly calling a delete? The answer is more or less nothing.

     It's a false statement. GC works when the object is out of scope, when it has no references. If an object goes out of scope, all the objects it references go out of scope too, if they are only referenced by this object; a single null can send tons of objects to the GC. It's great, it's invaluable.

     I think this guy has very poor knowledge of Java (and GC-based languages) and is a nostalgic C++ programmer; it's a poor blog to publish in a Java portal.
  36. It's a false statement[ Go to top ]

    Agreed. In fact, to answer the author's question: the difference is that both of those methods are unnecessary, as well as flat-out dangerous.

    That the first sentence of the article's summary endorses setting references to NULL in Java (or C#, for that matter) is quite troubling to find posted here. Setting references to NULL to "help" GC is the easiest way to sign up for an NPE in other parts of the code. I think the author should get back to designing his own language rather than making uninformed statements like this.
  37. Setting pointers to null *is* useful[ Go to top ]

    Setting pointers to null *does* help the garbage collector. Regardless of the type of garbage collector you are using (generational, mark and sweep, etc...), setting a pointer to null allows the algorithm to skip a step in its implementation. If you don't set the pointer to null, the virtual machine will try to follow this pointer and will, later, realize that the pointed object only has one reference, and is therefore up for garbage collection. If you set the pointer to null, you save one step. It's as simple as that.

    Now, while this practice helps the garbage collector, it doesn't help it by much, and I would argue that for most applications (JSE and JEE), you should never bother using it. It's a bit different for JME, though, where every byte counts and where I recommend (and personally use) this technique with lazy initialization to limit the amount of heap used by my applications.
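    The lazy-initialization-plus-release technique described above might be sketched roughly like this (the `ImageCache` class and its names are hypothetical, not from the post):

    ```java
    // Hypothetical sketch of lazy initialization plus explicit release,
    // as described for memory-constrained (JME-style) applications.
    class ImageCache {
        private byte[] data;

        byte[] get() {
            if (data == null) {
                data = load();      // allocate only on first use
            }
            return data;
        }

        void release() {
            data = null;            // let the GC reclaim the buffer when it is no longer needed
        }

        private byte[] load() {
            return new byte[1024];  // stands in for an expensive resource
        }
    }
    ```

    The point is that the reference is nulled as part of the application's own lifecycle logic, not scattered everywhere "to help the GC".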

    --
    Cedric
  38. Setting pointers to null *is* useful[ Go to top ]

    Setting pointers to null *does* help the garbage collector. Regardless of the type of garbage collector you are using (generational, mark and sweep, etc...), setting a pointer to null allows the algorithm to skip a step in its implementation. If you don't set the pointer to null, the virtual machine will try to follow this pointer and will, later, realize that the pointed object only has one reference, and is therefore up for garbage collection.

    This post seems to demonstrate a misunderstanding of how GC works. First of all, let's be clear that 'having only one reference' has nothing to do with whether an Object will be garbage collected. An Object is GC-able when it is not reachable from the root set of references. It doesn't matter how many references point to the Object.

    The garbage collector doesn't follow pointers to see what Objects it can GC. In a nutshell, a mark-and-sweep GC follows all the paths of references from the root and marks those Objects as reachable. Everything else is swept.
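    The mark phase just described can be sketched as a simple reachability traversal (a toy illustration of the idea, not a real collector):

    ```java
    import java.util.*;

    // Toy object graph node; refs plays the role of an Object's fields.
    class Node {
        final List<Node> refs = new ArrayList<>();
    }

    class ToyMarkSweep {
        // Mark phase: everything reachable from the roots is live.
        // In a real collector, anything not in this set is then swept.
        static Set<Node> mark(Collection<Node> roots) {
            Set<Node> marked = new HashSet<>();
            Deque<Node> stack = new ArrayDeque<>(roots);
            while (!stack.isEmpty()) {
                Node n = stack.pop();
                if (marked.add(n)) {
                    stack.addAll(n.refs);  // follow references only from live objects
                }
            }
            return marked;
        }
    }
    ```

    Note that if the head of a linked list is not reachable from the roots, the traversal never visits the rest of the list at all, which is why nulling out its internal links does nothing for a tracing collector.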

    Here's an article you should read:

    "Consider the code in Listing 4, which combines several really bad ideas. The listing is a linked list implementation that uses a finalizer to walk the list and null out all the forward links. We've already discussed why finalizers are bad. This case is even worse because now the class is doing extra work, ostensibly to help the garbage collector, but that will not actually help -- and might even hurt. Walking the list takes CPU cycles and will have the effect of visiting all those dead objects and pulling them into the cache -- work that the garbage collector might be able to avoid entirely, because copying collectors do not visit dead objects at all. Nulling the references doesn't help a tracing garbage collector anyway; if the head of the list is unreachable, the rest of the list won't be traced anyway."

    http://www-128.ibm.com/developerworks/java/library/j-jtp01274.html
  39. Setting pointers to null *is* useful[ Go to top ]

    Setting pointers to null *does* help the garbage collector.

    Here's a classic article that's a little long in the tooth but should explain how GC works:

    http://java.sun.com/developer/technicalArticles/ALT/RefObj/index.html
  40. I think all of you folks who are saying that setting a variable to null won't help the garbage collector are assuming that the referencing object is also about to head out of scope.

    Memory pinning is a common cause of memory hogging in Java.

    Ex.

      public static void main(String[] args) {
          SomeLargeMemoryHog hog = ...
          SomeValue val = hog.getSomeValue();

          someProcessThatRunsForAWhile(val);
      }

    In this case hog is pinned by the call stack, which retains a reference to it. If the value val can exist without hog, then setting hog = null will make it available for GC. Leaving a reference will pin it until the long-running method exits.
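    The fix the poster describes might look like this, with hypothetical stand-in types (none of these class names are from the thread):

    ```java
    // Hypothetical types standing in for the example above.
    class SomeValue {
        final int answer = 42;
    }

    class SomeLargeMemoryHog {
        final byte[] payload = new byte[8 * 1024 * 1024];  // stands in for a large allocation

        SomeValue getSomeValue() {
            return new SomeValue();                        // val does not hold onto hog
        }
    }

    class PinningDemo {
        static int someProcessThatRunsForAWhile(SomeValue val) {
            return val.answer;                             // long-running work that only needs val
        }

        static int run() {
            SomeLargeMemoryHog hog = new SomeLargeMemoryHog();
            SomeValue val = hog.getSomeValue();
            hog = null;  // drop the stack reference so the payload is collectible during the call below

            return someProcessThatRunsForAWhile(val);
        }
    }
    ```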
  41. Leaving a reference will pin it until the long running method exits.

    We had this exact issue, though in our case it wasn't being pinned in the call stack but rather stuck in a member variable of a class that was stuffed in a Servlet Session. The correct fix would have been to rewrite the code and eliminate those semi-work variables implemented as member variables, but in practice it was simpler to just NULL them out.

    This, however, highlights something missing in this discussion. The GC helps push memory management from the ground zero design time problem towards a later, application tuning problem.

    Certainly you have to worry about memory management, allocations, and use, but you don't necessarily have to worry about it on day one. You can develop your applications at a reasonably high level and make a tuning sweep later (ideally not a horribly invasive sweep).
  42. Keep in control of your design[ Go to top ]

    Setting references to NULL to "help" GC is the easiest way to sign up for a NPE in other parts of the code.
    This strikes me as a code smell. If you are keeping a pointer alive just in case some other part of your code uses it, you have a bigger problem than memory management. Regardless of your implementation, you should always be able to answer the question: "Is this object still needed after this point or not?". Sometimes, the answer is "maybe" (multithreaded code for example), but in most cases, the answer to this question should be a clear "yes" or "no".

    --
    Cedric
    http://testng.org
  43. Keep in control of your design[ Go to top ]

    <quote>Setting references to NULL to "help" GC is the easiest way to sign up for a NPE in other parts of the code.</quote>

    I tend to think that it's one of the nicest non-memory-related side effects of setting pointers to null: you can be sure that when a reference shouldn't be used anymore, it isn't used. Better to get an NPE than to modify something that shouldn't be modified.
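    The fail-fast effect being described can be sketched as follows (a hypothetical `Report` class, not from the thread):

    ```java
    // Nulling a reference turns accidental late use into an immediate
    // NullPointerException instead of a silent modification.
    class Report {
        private StringBuilder body = new StringBuilder("entries");

        String render() {
            String xml = "<report>" + body.toString() + "</report>";
            body = null;  // the report must not be touched after rendering
            return xml;
        }
    }
    ```

    A second call to render() blows up immediately at the nulled reference rather than quietly re-serializing stale state.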
  44. That's true.[ Go to top ]

    I've already seen some Java programmers abuse this habit by setting references to null everywhere in the code.

    There is no reason to set things to null to help GC do its work unless it's part of your application logic or flow.

    Maybe renaming GC to “unreferenced non-active object collector” would make more sense; otherwise some will still think that anything their code will not use is already garbage.
  45. Editors out to lunch?[ Go to top ]

    In every piece of non-trivial software, irregardless of the language

    IRREGARDLESS?! Come on. Seriously.
  46. Java Memory Managment[ Go to top ]

    When I started working with Java it had only the 'stop the world' garbage collector, and we had a graphical data mining tool that leaked 8 MB a minute!
    Alternative GC methods are now available - thank god!

    Ignoring the destruction of your objects does not guarantee that Java will automatically clean them up. Java is more like Microsoft in this respect. It will only GC when it is completely and absolutely starved of resources - and then only maybe!

    What I found in Java was the 'new' operator, but what I missed was the 'delete' operator. What I couldn't understand was that even when I called System.gc(), sometimes it did nothing!?!

    Java is a runtime system that continues to be a resource hog. I believe that it is a resource hog in favour of Sun's strategy of using Java to sell servers. If your Java app runs out of memory, buy more memory ...
  47. Java Memory Managment[ Go to top ]

    When I started working with Java it had only the 'stop the world' garbage collector, and we had a graphical data mining tool that leaked 8 MB a minute! Alternative GC methods are now available - thank god! Ignoring the destruction of your objects does not guarantee that Java will automatically clean them up. Java is more like Microsoft in this respect. It will only GC when it is completely and absolutely starved of resources - and then only maybe! What I found in Java was the 'new' operator, but what I missed was the 'delete' operator. What I couldn't understand was that even when I called System.gc(), sometimes it did nothing!?! Java is a runtime system that continues to be a resource hog. I believe that it is a resource hog in favour of Sun's strategy of using Java to sell servers. If your Java app runs out of memory, buy more memory ...

    I don't see what Java has to do with your application not releasing references. The GC is not psychic; if you keep strong references to Objects, it will not GC them. If you stopped assuming the garbage collector was the problem, you might find the bug in your code that is causing the issue.
  48. Java Memory Managment[ Go to top ]

    ...What I couldn't understand was that even when I called System.gc(), sometimes it did nothing!?! Java is a runtime system that continues to be a resource hog. I believe that it is a resource hog in favour of Sun's strategy of using Java to sell servers. If your Java app runs out of memory, buy more memory ...

    If you were to actually read the JavaDoc, it says quite clearly (http://java.sun.com/j2se/1.5.0/docs/api/java/lang/System.html#gc()):

    Calling the gc method suggests that the Java Virtual Machine expend effort toward recycling unused objects.

    Notice the keyword "suggests": the JVM is not required to run a collection even if you call System.gc(). In fact, calling the gc manually is not recommended.
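    A minimal sketch of the point being made: System.gc() is only a hint, so code must never depend on a collection actually happening (the `GcHint` class and its method names are illustrative only):

    ```java
    // System.gc() is a suggestion, not a command: the JVM may collect now,
    // defer, or (with -XX:+DisableExplicitGC) ignore the call entirely.
    class GcHint {
        static long usedBytes() {
            Runtime rt = Runtime.getRuntime();
            return rt.totalMemory() - rt.freeMemory();
        }

        static long demo() {
            byte[] garbage = new byte[16 * 1024 * 1024];
            garbage = null;   // drop the only reference to the 16 MB array
            System.gc();      // merely *suggests* a collection; no guarantee it runs
            return usedBytes();
        }
    }
    ```

    Whether the reported usage drops after the call is entirely up to the collector, which is exactly why relying on it is a bug.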

    JVM tuning is becoming a lost art. Most people run their application with the default settings and expect good performance. Unfortunately this is not the case; sadly, you actually have to work to get good performance out of your application.
  49. First hand experience[ Go to top ]

    I enjoy reading technical articles about JVM and GC implementations, but putting all that aside, I've seen cases where setting objects to null helps.

    On several occasions I've profiled webapps that saw a gradual increase in heap under constant load. In one of the applications, the request produced a large object tree which was returned in the response. In the original implementation, I used Tomcat and JMeter to test the application. Once I saw the gradual increase in heap, I profiled Tomcat and the app using OptimizeIt. What I saw was that under constant load, Sun JDK 1.4.2 would continue to increase the heap. I decided to set the object graph to null once I had transformed it to XML. When I re-ran OptimizeIt, the heap remained constant under constant load.

    In most cases, developers do not need to set a reference to null, but if the application has to handle quite a bit of load, it may help. The key is to stress test and profile to make sure the application behaves correctly. Even if an application performs "ok", it's still a good idea to profile it and make sure it performs as expected.

    peter
  50. First hand experience[ Go to top ]

    I enjoy reading technical articles about JVM and GC implementations, but putting all that aside, I've seen cases where setting objects to null helps.

    'don't always null' != 'never null'

    Basic logic, people. I shouldn't have to point this out all the time. 'Not always' is not the same as 'never'. Sometimes it is necessary to set references to null. No one said otherwise. The point is that it's not something we should be doing across the board. It's an exceptional case.
    On several occasions I've profiled webapps that saw a gradual increase in heap under constant load. In one of the applications, the request produced a large object tree which was returned in the response. In the original implementation, I used Tomcat and JMeter to test the application. Once I saw the gradual increase in heap, I profiled Tomcat and the app using OptimizeIt. What I saw was that under constant load, Sun JDK 1.4.2 would continue to increase the heap. I decided to set the object graph to null once I had transformed it to XML. When I re-ran OptimizeIt, the heap remained constant under constant load. In most cases, developers do not need to set a reference to null, but if the application has to handle quite a bit of load, it may help. The key is to stress test and profile to make sure the application behaves correctly. Even if an application performs "ok", it's still a good idea to profile it and make sure it performs as expected. - peter

    I'm not discounting your experience here because I have been surprised before but what do you mean when you say you 'set it to null'? Did you walk the graph setting every reference to null? Or did you set a reference to the root to null? Was what you were setting to null reachable?

    All that matters to the VM is whether an Object is reachable, and there's nothing magical about setting things to null. You can set a reference to anything else and get the exact same effect. All that matters is whether the reference is a member of a reachable Object (or on the stack) and what Object it points to.
  51. First hand experience[ Go to top ]

    I enjoy reading technical articles about JVM and GC implementations, but putting all that aside, I've seen cases where setting objects to null helps.
    'don't always null' != 'never null'. Basic logic, people. I shouldn't have to point this out all the time. 'Not always' is not the same as 'never'.

    I probably shouldn't have directed this response at you in particular, Peter. It was convenient because I wanted to reply to the rest of your post. Laziness on my part.
  52. First hand experience[ Go to top ]

    Good questions...
    I enjoy reading technical articles about JVM and GC implementations, but putting all that aside, I've seen cases where setting objects to null helps.
    'don't always null' != 'never null'

    Basic logic, people. I shouldn't have to point this out all the time. 'Not always' is not the same as 'never'. Sometimes it is necessary to set references to null. No one said otherwise. The point is that it's not something we should be doing across the board. It's an exceptional case.
    On several occasions I've profiled webapps that saw a gradual increase in heap under constant load. In one of the applications, the request produced a large object tree which is returned in the response. In the original implementation, I used Tomcat and JMeter to test the application. Once I saw the gradual increase in heap, I profiled tomcat and the app using OptimizeIt. What I saw was that under constant load, sun jdk 1.4.2 would continue to increase the heap. I decided to set the object graph to null once I transformed it to XML. when I re-ran optimizeIt, the heap remained constant under constant load.

    In most cases, developers do not need to set a reference to null, but if the application has to handle quite a bit of load, it may help. The key is to stress test and profile to make sure the application behaves correctly. Even if an application performs "ok", it's still a good idea to profile it and make sure it performs as expected.
    peter

    I'm not discounting your experience here, because I have been surprised before, but what do you mean when you say you 'set it to null'? Did you walk the graph setting every reference to null? Or did you set a reference to the root to null? Was what you were setting to null reachable? All that matters to the VM is whether an Object is reachable, and there's nothing magical about setting things to null. You can set a reference to anything else and get the exact same effect. All that matters is whether the reference is a member of a reachable Object (or on the stack) and what Object it points to.

    In my case, it was a compliance application which generates a huge report. I tried a variety of ways: clearing all the lists in the object graph and setting them to null, setting the root to null, traversing the entire graph recursively from the bottom up. The one that produced a constant heap size was recursively setting refs to null. In this specific case it was safe, since they were data objects. During the benchmark, my system was running at 100% for about 5-12 hours. Basically, I ran benchmarks with 500K, 1 million, 2 million and 5 million transactions.

    If I reduced the constant load so the utilization was below 30%, setting refs to null didn't really matter from what I saw. I probably ran 50+ benchmarks using Tomcat, JMeter, and OptimizeIt to really quantify the runtime behavior. It's rather tedious and took me about 3 weeks, but I wanted to make sure the entire application performed exactly as I needed. Most applications don't have these kinds of extreme performance requirements.

    For a normal application that doesn't have high performance requirements, there's really no point in setting references to null, from my experience.

    The key for me is understanding the performance requirements and figuring out exactly what an app does through profiling. Often performance and stress testing is left off the schedule, so I generally don't recommend setting refs to null. It's a darn good way to screw up an application and create all sorts of unexpected NPEs.

    I don't know about others, but I personally wouldn't set everything to null as a standard practice. Only after some thorough profiling would I consider that approach an option. Even then, I'd be conservative and try to make sure it wasn't a flaw in the design first.

    peter
  53. This site is annoying me[ Go to top ]

    I'm still struggling with quoting, so please reference Peter's post above for context.

    If you have a continual memory usage increase such that you eventually get an OOM, there is a leak in your code. I think I've seen this called a space leak. If it finally cleans up once it bangs up against its limit, then I have to say I'm a little incredulous. Every time I've seen one of these scenarios, it was a bug.

    The problem with really complex or cyclical references is that a single strong reference to one of the elements can cause the entire graph to stay alive. Were there any SoftReferences in this app?

    I once had a problem with listeners. The listeners were implemented as anonymous inner classes on GUI components (I know, I know, not my design) and added to our data Objects. The Swing window would be closed and disposed but was still hanging out in memory. Finally we figured out it was the anonymous listeners. As a quick fix, I made the data objects use WeakReferences around the listeners. Then I had to explain (repeatedly) that you need an explicit strong reference to keep your listener alive.

    Is it possible you had that kind of situation? In that scenario, walking the graph and nulling every reference will resolve it.
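    The quick fix described, wrapping listeners in WeakReferences, might look roughly like this (the `Listener` interface and `DataObject` class are hypothetical names):

    ```java
    import java.lang.ref.WeakReference;
    import java.util.*;

    interface Listener {
        void onEvent(String event);
    }

    // Data object that holds its listeners only weakly: once nothing else
    // strongly references a listener, the GC may clear it, so disposing a
    // window no longer keeps the whole GUI graph alive through its listeners.
    class DataObject {
        private final List<WeakReference<Listener>> listeners = new ArrayList<>();

        void addListener(Listener l) {
            listeners.add(new WeakReference<>(l));
        }

        void fire(String event) {
            // Notify live listeners and prune the ones the GC has cleared.
            listeners.removeIf(ref -> {
                Listener l = ref.get();
                if (l == null) {
                    return true;
                }
                l.onEvent(event);
                return false;
            });
        }
    }
    ```

    As the poster notes, the flip side is that the caller must keep an explicit strong reference to each listener for as long as it should stay alive.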
  54. This site is annoying me[ Go to top ]

    I'm still struggling with quoting, so please reference Peter's post above for context.

    If you have a continual memory usage increase such that you eventually get an OOM, there is a leak in your code. I think I've seen this called a space leak. If it finally cleans up once it bangs up against its limit, then I have to say I'm a little incredulous.

    Every time I've seen one of these scenarios, it was a bug. The problem with really complex or cyclical references is that a single strong reference to one of the elements can cause the entire graph to stay alive. Were there any SoftReferences in this app?
    The data model was generated from XML Schema, so there weren't any soft references.
    I once had a problem with listeners. The listeners were implemented as anonymous inner classes on GUI components (I know, I know, not my design) and added to our data Objects. The Swing window would be closed and disposed but was still hanging out in memory. Finally we figured out it was the anonymous listeners. As a quick fix, I made the data objects use WeakReferences around the listeners. Then I had to explain (repeatedly) that you need an explicit strong reference to keep your listener alive. Is it possible you had that kind of situation? In that scenario, walking the graph and nulling every reference will resolve it.

    In my case, the compliance engine adds itself as a listener to the transaction object, but the transaction is retracted from the engine at the end of the evaluation cycle before the result is transformed and sent to the recipient. I doubt it is the listener, but there's a .00001% chance it might be. Even though I stepped through the code to make sure the compliance engine was correctly removed as a listener, I may have missed something.

    Having said that, I'm pretty sure the cause of the gradual increase in memory consumption is the extreme load. Just to give a little more information: I had JMeter sending requests non-stop without any think time, to simulate really extreme non-stop stress. Obviously, most applications don't get anywhere near that.

    I also tried gradually increasing the load in an attempt to figure out exactly what was happening. When the system had idle time, the JVM correctly GC'd the large object graph. After 2 weeks of triple-checking everything and debugging to make sure it wasn't a bug, it was just easier to set the references in the report object to null recursively.

    The object graph is a compliance report, which contains details about which regulatory rules failed. The structure is rather simple, but very wide in moderate cases.

    In this specific case, every single transaction generated 20-30 report entries, so batch transactions of 25, 50, 100, 150, 200 transactions could potentially create a report with 500-6000 entries. Each report entry contains anywhere from 20-50 rows of data. This is a simple example. The more extreme tests use batch transactions of 500, 1K, 2K and 4K. Once you multiply that out, it means a single batch transaction of 20 can produce (20 x 20 x 20) 800 rows of data. Do that non-stop with batch tranx of 200 and you easily get 8K rows of data for each concurrent request. I tested with 1-20 concurrent requests, so under full load, the system was producing around 150K rows of data.

    I'm not an expert in JVMs or GC, but I have spent a lot of time reading up on the topic. The interesting thing I observed in this exercise is that if I set JMeter to produce 4-10 transactions per second, the JVM had plenty of time to garbage collect the large object graph. Once I hit 200 tranx per second on my laptop, the CPU utilization was at 100%. I'm guessing the JVM just wasn't able to keep up. The compliance engine is very CPU-intensive, so my uneducated guess is that the compliance engine was starving other processes. By setting the refs to null, it reduced the cost of GC.

    Keep in mind I could be totally wrong. At the end of the day, reaching the performance requirement is what mattered to me. I still agree that in most cases, setting refs to null is bad practice. It should only be done with a profiler and stress-testing tools.

    peter
  55. oops typos[ Go to top ]

    doh! I should proofread. That should have been 20 x 20 x 20 = 8000. So despite my typos and bad math, it's a lot of data for the JVM to GC with 100% load.

    peter
  56. oops typos[ Go to top ]

    doh! I should proofread. That should have been 20 x 20 x 20 = 8000. So despite my typos and bad math, it's a lot of data for the JVM to GC with 100% load. - peter

    Yes, but nulling the references doesn't change the amount of memory that the VM needs to GC.
  57. oops typos[ Go to top ]

    see this reference

    http://www.daimi.au.dk/~beta/Papers/Train/train.html

    Large highly connected object graphs may require more work to collect if the GC is using the Train algorithm.

    Sun's GC uses the Train algorithm for incremental collection.
  58. thanks for the link[ Go to top ]

    interesting link. thanks for posting it.

    peter
  59. This site is annoying me[ Go to top ]

    I'm guessing the JVM just wasn't able to keep up. The compliance engine is very CPU-intensive, so my uneducated guess is the compliance engine was starving other processes. By setting the refs to null, it reduced the cost of GC. Keep in mind I could be totally wrong. At the end of the day, reaching the performance requirement is what mattered to me. I still agree that in most cases, setting refs to null is bad practice. It should only be done with a profiler and stress-testing tools. - peter

    That the VM could not keep up makes perfect sense. If you are running at a very high load, the VM isn't going to waste time with garbage collection unless it has to. This is a benefit of garbage collection (assuming you eventually give the VM time to clean up).

    What doesn't make sense to me is that nulling the references was making the GC work. If the GC wasn't running, it wouldn't matter that the references were null. So the GC must have been running. I think you said you were running in a profiler. If you were, you should be able to force a full GC and see what drops. If it doesn't drop, it's still reachable. If you didn't have a profiler, you could run in verbose mode and watch the major and minor collections. If a major collection occurs and the memory is not GCed, it's still reachable.

    Let's assume that you are correct. What are the possibilities?

    1. There was a reference on the stack that is out of scope, but the VM doesn't distinguish this (shouldn't be the case, but it used to be a problem in some cases).
    2. Nulling the references somehow made the memory available to a minor collection.

    The IBM-hosted article above states that what you are doing can actually interfere with the copying collector. It's absolutely possible that this is not correct in your case. It's just that what you are saying flies in the face of what I have read about GC in authoritative sources.

    P.S. What version of the JRE are we talking about here?
  60. This site is annoying me[ Go to top ]

    I'm guessing the JVM just wasn't able to keep up. The compliance engine is very CPU-intensive, so my uneducated guess is the compliance engine was starving other processes. By setting the refs to null, it reduced the cost of GC. Keep in mind I could be totally wrong. At the end of the day, reaching the performance requirement is what mattered to me. I still agree that in most cases, setting refs to null is bad practice. It should only be done with a profiler and stress-testing tools.
    peter

    That the VM could not keep up makes perfect sense. If you are running at a very high load, the VM isn't going to waste time with garbage collection unless it has to. This is a benefit of garbage collection (assuming you eventually give the VM time to clean up).
    What doesn't make sense to me is that nulling the references was making the GC work. If the GC wasn't running, it wouldn't matter that the references were null. So the GC must have been running.

    I agree that it doesn't make sense on the surface. I really wish I knew more about the internals of the JVM. In OptimizeIt, I was able to click "GC now" and see it do a full GC. For what it's worth, when the load was below 50%, the heap remained constant. This was with JDK 1.4.2_03, 04, and 05.
    I think you said you were running in a profiler. If you were, you should be able to force a full GC and see what drops. If it doesn't drop, it's still reachable. If you didn't have a profiler, you could run in verbose mode and watch the major and minor collections. If a major collection occurs and the memory is not GCed, it's still reachable.

    Let's assume that you are correct. What are the possibilities?

    1. There was a reference on the stack that is out of scope, but the VM doesn't distinguish this (shouldn't be the case, but it used to be a problem in some cases).

    2. Nulling the references somehow made the memory available to a minor collection.

    The IBM-hosted article above states that what you are doing can actually interfere with the copying collector. It's absolutely possible that this is not correct in your case. It's just that what you are saying flies in the face of what I have read about GC in authoritative sources.

    P.S. What version of the JRE are we talking about here?

    I completely agree that this goes against some articles about JVMs. In my case, though, once the report has been transformed to XML, no other application is supposed to use that data. In this specific case, it was safe to clear the report. If the situation were different, and the data needed to last longer than a single request life cycle, I would have dug deeper.

    Given the complexity of the application, stepping through the application under load isn't that practical on a laptop. If I had a serious workstation with 4 GB of RAM, it might be more feasible. I'm definitely not recommending people do this kind of thing casually, or even consider the option. I resorted to this option after spending several weeks debugging. If I had an infinite amount of time to debug and track down the exact cause, it would have been nice. Since the reality is "get it done yesterday", it's always a balancing act. I've always wanted to do some low-level benchmarks to simulate constant load and GC, but haven't had time.

    peter
  61. This site is annoying me[ Go to top ]

    I completely agree that this goes against some articles about JVMs. In my case, though, once the report has been transformed to XML, no other application is supposed to use that data. In this specific case, it was safe to clear the report. If the situation were different, and the data needed to last longer than a single request life cycle, I would have dug deeper. Given the complexity of the application, stepping through the application under load isn't that practical on a laptop. If I had a serious workstation with 4 GB of RAM, it might be more feasible. I'm definitely not recommending people do this kind of thing casually, or even consider the option. I resorted to this option after spending several weeks debugging. If I had an infinite amount of time to debug and track down the exact cause, it would have been nice. Since the reality is "get it done yesterday", it's always a balancing act. I've always wanted to do some low-level benchmarks to simulate constant load and GC, but haven't had time. - peter

    I hope you don't think I'm criticizing. I only want to expand my own understanding.

    Having said that, did you consider the possibility that the API that generated the XML could hold onto the root node? I don't think you mentioned what library you were using (if any).
  62. just guessing[ Go to top ]

    I completely agree that this goes against some articles about JVMs. In my case, though, once the report has been transformed to XML, no other application is supposed to use that data. In this specific case, it was safe to clear the report. If the situation were different, and the data needed to last longer than a single request life cycle, I would have dug deeper. Given the complexity of the application, stepping through the application under load isn't that practical on a laptop. If I had a serious workstation with 4 GB of RAM, it might be more feasible. I'm definitely not recommending people do this kind of thing casually, or even consider the option. I resorted to this option after spending several weeks debugging. If I had an infinite amount of time to debug and track down the exact cause, it would have been nice. Since the reality is "get it done yesterday", it's always a balancing act. I've always wanted to do some low-level benchmarks to simulate constant load and GC, but haven't had time.

    peter

    I hope you don't think I criticizing. I only want expand my own understanding.

    Having said that, did you conisider the possiblity that the API that generated the XML could hold onto the root node? I don't think you mentioned what library you were using (if any.)

    I don't take things personally, so it's all good. I used XStream to serialize the request into objects and back out to XML. The input is basically transactionSets and the output is reports. There's a slight chance (0.001%) the compliance engine might get added as a listener more than once, which would mean that even though I call removeListener at the end of the evaluation cycle, one instance may still have a reference to the compliance engine. The compliance engine gets returned to a pool of engines, so maybe it sticks around longer because something is still referencing it. I remember debugging it, but I'm human and could have missed it. I've definitely seen cases where this kind of odd behavior doesn't happen.
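    The duplicate-listener suspicion is easy to illustrate. This is a hypothetical sketch (ReportSource and its listener API are invented for illustration, not the actual compliance engine's code): if the same listener is registered twice and removeListener is called once, one reference survives and keeps the listener reachable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical event source with the classic duplicate-registration pitfall:
// addListener does no duplicate check, and removeListener removes only the
// first matching entry, so "add twice, remove once" leaks one reference.
class ReportSource {
    private final List<Object> listeners = new ArrayList<>();

    void addListener(Object l)    { listeners.add(l); }      // no duplicate check
    void removeListener(Object l) { listeners.remove(l); }   // removes first match only

    int listenerCount() { return listeners.size(); }
}
```

    Registering the engine twice and removing it once leaves listenerCount() at 1, so the engine (and everything it references) stays reachable even after the "cleanup" call.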

    For example, I stress test JMeter on a regular basis to make sure new plugins behave properly and do not have any memory leaks. The only cases where I've seen memory leaks were caused by JMS client libraries that didn't clean up after themselves correctly. I won't say which JMS provider it was, but I did come across one provider that created a thread for a JMS subscriber but failed to clean it up correctly. The result in this specific case was that if I ran a test with 10 threads repeatedly, zombie threads were left behind at the end of each run, which produced a slow leak.

    The only conclusive thing I can say is that my knowledge of the JVM is limited. One of these days I'll know slightly more. If I'm lucky, in 10 years I'll know it inside out. Yeah, right.

    peter
  63. just guessing[ Go to top ]

    I used XStream to serialize the request into objects and back out to XML.

    I don't know anything about XStream, but it could hold a reference to the root node of the objects. It could even put it in a static reference. It could be doing a lot of funky things behind the scenes.
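    For illustration only -- this is a hypothetical marshaller, and there is no claim that XStream actually does this -- a library that stashes the last root node in a static field would keep the whole object graph reachable even after the caller nulls its own reference:

```java
// Hypothetical sketch: a static field inside a library retains the last
// root object passed in, so the caller's graph stays reachable until the
// next call, no matter what the caller nulls out on its side.
class HypotheticalMarshaller {
    static Object lastRoot; // static reference outlives the request

    static String toXml(Object root) {
        lastRoot = root;      // retained behind the scenes
        return "<root/>";     // placeholder output, not real serialization
    }
}
```

    A heap dump in this situation would show the graph reachable from the library's class, which matches the "funky things behind the scenes" worry.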
  64. Kodo[ Go to top ]

    This website is a poor advertisement for Kodo. I'll probably give it a thumbs down if someone asks about it.
  65. Kodo[ Go to top ]

    This website is a poor advertisement for Kodo. I'll probably give it a thumbs down if someone asks about it.

    James - that's definitely the wrong conclusion. You should talk to Floyd before deciding on whom to blame. I'll give you a hint that neither KODO nor anything below KODO uses RMI.

    Peace,

    Cameron Purdy
    Tangosol Coherence: Clustered Coherent Caching
  66. Kodo[ Go to top ]

    This website is a poor advertisement for Kodo. I'll probably give it a thumbs down if someone asks about it.

    James - that's definitely the wrong conclusion. You should talk to Floyd before deciding on whom to blame. I'll give you a hint that neither KODO nor anything below KODO uses RMI.

    Peace,

    Cameron Purdy
    Tangosol Coherence: Clustered Coherent Caching

    All I know is that I'm frequently getting Kodo exceptions instead of web pages here, and for some reason this site doesn't like my apostrophes. My guess is that they are not being encoded properly, which is frightening.
  67. 99.999% sure it's not XStream[ Go to top ]

    I used XStream to serialize the request into objects and back out to XML.

    I don't know anything about XStream, but it could hold a reference to the root node of the objects. It could even put it in a static reference. It could be doing a lot of funky things behind the scenes.

    I'm pretty positive it had nothing to do with XStream or Tomcat. I've profiled both a dozen times and neither has memory leaks from what I can see.

    Having said that, the exact cause of the behavior I saw is still a bit of a mystery to me. I'm just glad I found a fix, which produced constant heap size for me.

    peter
  68. 99.999% sure it's not XStream[ Go to top ]

    I used XStream to serialize the request into objects and back out to XML.
    I don't know anything about XStream, but it could hold a reference to the root node of the objects. It could even put it in a static reference. It could be doing a lot of funky things behind the scenes.

    I'm pretty positive it had nothing to do with XStream or Tomcat. I've profiled both a dozen times and neither has memory leaks from what I can see.

    Having said that, the exact cause of the behavior I saw is still a bit of a mystery to me. I'm just glad I found a fix, which produced a constant heap size for me.

    peter

    Well, you should never get true memory leaks in Java, but I think what you mean is that it's not holding onto things that it shouldn't. This is pretty subjective, though: it's only a "leak" if you believe it is. It's entirely possible that there are SoftReferences in XStream to the object model (perhaps as an optimization attempt).
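    A minimal sketch of the SoftReference point (SoftModelCache is an invented example, not XStream's actual code): entries held this way look retained in a heap dump, yet the GC is free to clear them under memory pressure, which is exactly why whether it counts as a "leak" is subjective.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical cache that holds parsed object models via SoftReferences:
// entries appear "retained" in a profiler, but the GC may reclaim them
// when the JVM runs low on memory.
class SoftModelCache {
    private final Map<String, SoftReference<Object>> cache = new HashMap<>();

    void put(String key, Object model) {
        cache.put(key, new SoftReference<>(model));
    }

    // Returns null if the key was never cached OR if the GC cleared it.
    Object get(String key) {
        SoftReference<Object> ref = cache.get(key);
        return (ref == null) ? null : ref.get();
    }
}
```

    Before any memory pressure, get() returns the cached model; after the GC clears the soft reference, it returns null and the caller must rebuild the model.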
  69. My two cents - GC for many is a boon. There are three categories of developers one encounters in everyday life - the Good, the Bad, and the Ugly. GC is for the latter two. The science of GC is enough to keep the "good" happy, who are glad to hack away at the JVM parameters to get the best results. Anyway, we live in a real world where what most developers want is: the less I do, the better.
  70. ALL CAPS ATTACK: NULL NULL NULL[ Go to top ]

    Given that a programmer in Java or C# should be setting their pointers to NULL

    OK, the fact that he capitalized "NULL" makes me wonder where he's coming from .. is it the same odd-world that capitalizes TRUE and FALSE?

    #ifdef PEACE

    #define CAMERON PURDY
    #include <stdtagline.h>
    #endif
  71. Microbenchmark[ Go to top ]

    deallocate them, because then the heap will never get fragmented because everything will be allocated at the end

    I'd suggest creating a bunch of objects of different sizes, ranging from a few bytes to maybe 256 KB. Make an array of, say, 5,000 elements and store pointers/references to the allocated objects in it. Then start randomly (or according to some pattern, making sure the objects have a wide variety of lifespans) deleting objects or making them unreachable. Run it for a while. You might want to vary the array size from 1,000 to 10,000 or something like that, depending on how much memory you have.

    Whatever you do, don't push your computer to start using virtual memory, because that will end up dominating the performance.
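    The recipe above can be sketched roughly as follows. The slot count and iteration count are placeholders, and the object sizes are scaled down from the suggested 256 KB maximum so the sketch stays well away from virtual memory:

```java
import java.util.Random;

// Sketch of the suggested microbenchmark: allocate objects of varying
// sizes into a fixed-size slot array, then randomly overwrite slots so
// the objects get a wide spread of lifespans (each overwrite makes the
// previous occupant unreachable and eligible for GC).
public class ChurnBenchmark {

    // Returns the number of bytes still reachable when the churn finishes.
    static long churn(int iterations, int slotCount, long seed) {
        byte[][] slots = new byte[slotCount][];
        Random rnd = new Random(seed);
        for (int i = 0; i < iterations; i++) {
            // sizes from 16 bytes up to 8 KB; old object becomes garbage
            slots[rnd.nextInt(slotCount)] = new byte[16 << rnd.nextInt(10)];
        }
        long live = 0;
        for (byte[] b : slots) if (b != null) live += b.length;
        return live;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long live = churn(200_000, 5_000, 42L);
        long ms = (System.nanoTime() - start) / 1_000_000;
        System.out.println(live + " bytes live after churn in " + ms + " ms");
    }
}
```

    Timing several runs with different heap settings (-Xmx, collector flags) shows how the allocator and GC cope with this mixed-lifespan churn.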
  72. Why can't Java include a free() function?

    This would reduce the footprint of Java programs a lot.
  73. I use Orion 2.0 and the 1.4.2 Sun JRE.
    One of my apps throws this OutOfMemoryException and is then no longer usable (I am thinking of clustering it, which could prolong its life beyond the two weeks it runs now; I have applied various memory management parameters to the JVM blindly, but to no avail).
    I wish there was a declarative way of handling OutOfMemoryException, using which I could reboot the server. Does any J2EE server have this capability?
  74. I use Orion 2.0 and the 1.4.2 Sun JRE. One of my apps throws this OutOfMemoryException and is then no longer usable (I am thinking of clustering it, which could prolong its life beyond the two weeks it runs now; I have applied various memory management parameters to the JVM blindly, but to no avail). I wish there was a declarative way of handling OutOfMemoryException, using which I could reboot the server. Does any J2EE server have this capability?

    OutOfMemoryError is a Throwable that you can catch. Whether the J2EE server has a high-level handler that allows you to add custom handling of this is another matter.

    It's best to eliminate the OOMErrors if possible.
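    A minimal sketch of the "OutOfMemoryError is catchable" point (OomeDemo is illustrative, not any server's actual handler): an OOME raised by a failed allocation can be caught like any other Throwable, though the heap may be in a degraded state afterwards, which is why elimination beats handling.

```java
// Minimal sketch: OutOfMemoryError extends Error extends Throwable, so it
// *can* be caught -- but program state after an OOME is suspect, so
// "catch, log, restart" is usually the most that can safely be done.
public class OomeDemo {

    // Tries one allocation; reports whether it succeeded instead of dying.
    static boolean tryAllocate(int bytes) {
        try {
            byte[] block = new byte[bytes];
            return block.length == bytes;
        } catch (OutOfMemoryError oome) {
            // Caught, but the JVM may already be in a degraded state.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("small allocation ok: " + tryAllocate(1024));
    }
}
```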
  75. OutOfMemoryError is a Throwable that you can catch. Whether the J2EE server has a high-level handler that allows you to add custom handling of this is another matter. It's best to eliminate the OOMErrors if possible.

    In general, as with most errors, it is not possible to guarantee the correct handling of OOMEs. Sun's standard Java libraries, for example, are not written to survive OOMEs -- their behavior after an OOME is nondeterministic, since their state is non-transactional.

    To avoid OOMEs causing nondeterministic behavior, all classes must be coded to work entirely with local variables, then "commit" their activities to fields (object state) only after everything has succeeded. Furthermore, the state types should all be immutable, or none of this works at all.
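    A small sketch of the discipline just described (Counters is an invented example, not Coherence code): all fallible work happens on locals, the state type is immutable, and the only mutation is a single reference assignment at the end, so a failure mid-computation can never leave half-updated state visible.

```java
// "Work on locals, commit at the end": the object's state is an immutable
// snapshot, and an OOME thrown during the computation leaves the previous
// snapshot fully intact -- there is no intermediate state to corrupt.
final class Counters {
    // immutable snapshot of the state
    static final class Snapshot {
        final long processed;
        final long failed;
        Snapshot(long p, long f) { processed = p; failed = f; }
    }

    private volatile Snapshot state = new Snapshot(0, 0);

    void record(boolean ok) {
        Snapshot s = state;
        // all computation on locals; an OOME here cannot corrupt 'state'
        Snapshot next = ok ? new Snapshot(s.processed + 1, s.failed)
                           : new Snapshot(s.processed, s.failed + 1);
        state = next; // single reference assignment is the "commit"
    }

    Snapshot snapshot() { return state; }
}
```

    The key design choice is that readers only ever see a complete Snapshot, never a partially updated pair of counters.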

    At any rate, it's an interesting conversation, but not one that a lot of people have thought through completely. We ended up selecting the following approach in Coherence to deal with OOMEs:

    1. Coherence assumes that an OOME that occurs anywhere inside Coherence is unrecoverable.
    2. Since it is unrecoverable, the node (cluster member) that receives an OOME must leave the cluster, because its behavior can no longer be proven to be deterministic.
    3. The result is that an internal soft-restart occurs, allowing the Coherence instance to re-instantiate and join the cluster.
    4. The reason that we can do this (give up and exit the cluster instead of attempting to recover) is that there is never an instant in which a member is a SPOF, so if a member leaves and rejoins, no clustered data is lost.

    Peace,

    Cameron Purdy
    Tangosol Coherence: Clustered Shared Memory for Java

    p.s. apologies if this posts two times .. it didn't get picked up on the first attempt
  76. OutOfMemoryError is a Throwable that you can catch. Whether the J2EE server has a high-level handler that allows you to add custom handling of this is another matter. It's best to eliminate the OOMErrors if possible.

    In general, as with most errors, it is not possible to guarantee the correct handling of OOMEs. Sun's standard Java libraries, for example, are not written to survive OOMEs -- their behavior after an OOME is nondeterministic, since their state is non-transactional. To avoid OOMEs causing nondeterministic behavior, all classes must be coded to work entirely with local variables, then "commit" their activities to fields (object state) only after everything has succeeded.

    I work in a specialized field, but what you describe here is the general case for our transactions. Data is defined in XML on a JMS queue. Once processed, the queue receives the confirmation. If the processing fails for any reason, including OOM or even a server crash, the message is still on the queue and is reprocessed.
  77. I agree that resource management[ Go to top ]

    I agree that resource management is an important topic. When a resource is no longer needed, you need to know that it has been returned to the system. I know that in most cases, developers do not need to set a reference to null.

    Paul - http://www.connetu.com