Slobodan Celenkovic: Object Count Impact on GC Performance


  1. Slobodan Celenkovic has written up a fine document talking about "Object Count Impact on Garbage Collection Performance," saying that "we Java programmers can be real pigs and allocate excessive numbers of objects that can overwhelm any garbage collection algorithm, regardless of how fast it is."

    In an example using a HashMap to look up properties of an entity (i.e., a map containing "height," "weight," etc.), he points out:
    We should have used only 2 objects for a person, yet ended up using 9!!! To make things worse, most programmers simply don't pay attention to such nasty little side effects of using a map. Most other map implementations use just as many objects, if not more. As we use maps often in many places, an extra 7 objects quickly becomes a major problem when you reach an extra half a million objects in the server software!!!

    One way to approach this problem is simply to declare appropriate classes and try to use primitives as much as possible. At the same time, there is no denying that maps are handy tools and we don't want to punish programmers for using them. Therefore, I decided to create another map implementation that would use the minimum number of objects internally. That way we could keep using maps without creating a huge number of objects. In particular, if I could eliminate the entry objects, the person map above would use 4 fewer objects and the difference would not be as bad.
    The "map implementation" he refers to is called MocMap (for "minimal object count map") and is intended to be fleshed out in his blog over time.

    What techniques do you use to track or tune object creation counts in server-side applications? What practices do you typically use to help tune runtime performance?

    Threaded Messages (44)

  2. Linky to original article

    Object Count Impact on Garbage Collection Performance
  3. Linky to original article

    Indeed, and thank you. I was actually updating the article (again!) to add the link when you posted yours - it wasn't there when I went to edit the message, and it was when I committed the update. Even worse, I'd done the update earlier and it got lost. :(

    Good eye, though.
  4. Couple questions

    First, nice article. I think you do a good job of focusing on the issue with HashMaps where appropriate, without going overboard and arguing "Maps is bad, mmkay?!"

    My first question is: how do you get those instance counts? Sounds useful.

    Second, is the problem of garbage collection on a given number of heap objects an O(n) problem? Or is it O(n log n) or O(n^2)? I'm thinking it ought to be O(n), but I really don't know. If it is O(n), does this problem really rank up there in the list of causes of Java slowness?
  5. Couple questions

    My first question is: how do you get those instance counts? Sounds useful.

    One way is to use JRockit and type something like: "jrcmd [pid] print_object_summary" and you get something like:

    --------- Detailed Heap Statistics: ---------
    18.7% 2708k 26022 +2708k [C
    12.3% 1778k 38895 +1778k [B
     7.5% 1081k 34620 +1081k java/util/LinkedHashMap$Entry
     6.4% 928k 39626 +928k java/net/InetSocketAddress
     6.4% 922k 39349 +922k java/net/Inet4Address
     4.2% 615k 26240 +615k java/lang/String
    --------- End of Detailed Heap Statistics ---

    Regards,
    /Staffan - yes, I work for BEA
  6. Couple questions

    jhat - Java Heap Analysis Tool (OQL - Object Query Language -- a SQL-like language to query your Java heap):
    http://blogs.sun.com/roller/page/sundararajan?entry=what_s_in_my_java

    For 5.0, ./jmap -histo <pid>
    http://java.sun.com/j2se/1.5.0/docs/tooldocs/share/jmap.html
  7. Nice article! I don't think that many people know that Long.valueOf might return a cached Long.

    -- Andreas
  8. *.valueOf

    Of course, the crime here isn't that "valueOf" may use cached instances, which is completely appropriate, but the fact that I can't do:

    Integer i = Integer.valueOf(1);

    when I simply want to wrap the primitive.

    Then there is the question of what the new autoboxing in Java 5 does. Does it also use the cache, or simply call "new Integer(1)"?
  9. *.valueOf

    AFAIK it's using constants for -128 .. 127 (preallocated on startup), assuming that low numbers are used more often than huge numbers.

    Holger
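
    For the curious, a quick way to see that cache in action under Java 5 (the guaranteed range is -128 to 127):

    public class CacheDemo {
        public static void main(String[] args) {
            Integer a = 100, b = 100;     // autoboxed within the cached range
            Integer c = 1000, d = 1000;   // autoboxed outside the cached range
            System.out.println(a == b);   // true: same cached instance
            System.out.println(c == d);   // false: two distinct objects
        }
    }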
  10. From my experience, both short- and very long-lived objects cause minimal pain. What does hurt (== causes long GC delays) is 'survivors', the sort of objects that survive several sweeps and then get promoted from the 'new generation' area. Think of e.g. user session data, a window that was open for a while, and so on. When these objects eventually get disposed, they are not swept by the quick 'new generation' collector, because they are too old for it. So they remain, pollute the heap, and when it's full the 'old GC' comes into play, and because it's _a lot_ of work, it takes up to a few seconds for heaps of ~1GB.

    The point I was trying to make is that object count for short-lived objects != object count for the survivors in terms of GC load. It's relatively cheap to create a lot of stuff that gets disposed immediately after the request/transaction is done.
  11. From my experience, both short- and very long-lived objects cause minimal pain. What does hurt (== causes long GC delays) is 'survivors', the sort of objects that survive several sweeps and then get promoted from the 'new generation' area. Think of e.g. user session data, a window that was open for a while, and so on. When these objects eventually get disposed, they are not swept by the quick 'new generation' collector, because they are too old for it.

    The real problem isn't so much those middle age objects, but it's the speed that you run through them.

    The GC object scan is an O(n) process, as it pretty much just walks the tree (cognizant of cycles), copies those objects into the "new" heap, and then obliterates the old one.

    After an object ages in the "young" generation, it moves towards middle-age. And so on until the objects eventually get promoted to "ancient".

    In simplistic terms, you GC each generation when they get full, and promote following whatever aging algorithm you're using.

    What you want to avoid is a popular generation getting "filled up" often enough to trigger lots of scanning, but not freeing up (through death or promotion) enough to give you some breathing room until the next GC.

    This is why low memory can kill you, as you continue to scan live objects. And if you have a lot of live objects, that's a lot of scanning.

    This is also why throwaway objects are less expensive than many think. Their primary cost is that they accumulate in the young generation and cause more GCs, but if there are few live objects in the generation, the scans are cheap. For example, if I have 1000000 objects in my heap, but only 10 are live, GC is very fast, but when you reverse that, with only 10 dead, GC will thrash and kill you.

    But this is why it's important to have a grasp of your object lifetimes. You want the objects to settle from the younger generations to the older generations, but you don't want to keep the top layers "stirred" with a lot of migration back and forth, because that promotes scanning of your live objects in the older generations.

    Obviously, the fewer objects you persist, the faster your scans will be. That's simple math. But it's more complicated than that. Tuning your memory, the generations, and lifespans has as much to do with GC performance as simple object count.

    I'm no expert on it, but I've seen the GC on our application go into degenerate behavior where all it does is Full GC (bad!). It moves along happily with its incrementals and then, pow, Full GCs over and over. It was never clear to me what was triggering that behavior. It wasn't a memory leak, but more a generational leak. I've seen the GC free up 300MB on a 1GB heap and still stop doing incrementals.

    It's a black art, agreed, but new HashMaps or Integer caching are only a part of the solution when tuning your GC (and I'd rather tune a GC with a couple of command-line parameters than manually manage my dynamic objects in 1000000 lines of code, thankyouverymuch).
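
    A toy illustration of that point (a sketch, not from the thread): run something like this with -verbose:gc, and the scratch allocations cost almost nothing per collection, while everything parked in the static list has to be scanned or promoted on every pass.

    import java.util.ArrayList;
    import java.util.List;

    public class Lifetimes {
        // Long-lived: statically referenced objects survive every collection
        // and add to the cost of each scan.
        private static final List CACHE = new ArrayList();

        static void handleRequest(int id) {
            byte[] scratch = new byte[1024];    // short-lived: dead on return, cheap to collect
            scratch[0] = (byte) id;
            if (id % 1000 == 0) {
                CACHE.add(new Integer(id));     // survivor: promoted and retained
            }
        }

        public static void main(String[] args) {
            for (int i = 0; i < 1000000; i++) {
                handleRequest(i);
            }
            System.out.println("live objects retained: " + CACHE.size());
        }
    }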
  12. Minimal object count collections

    Strangely enough, I was looking for exactly such a Map implementation the other day for an application that makes heavy use of HashMaps for storing object properties and cached data. The closest I found was the Jakarta Commons Collections Flat3Map, which represents maps of up to three entries with an overhead of just one object, producing very good garbage collection characteristics.

    It is also worth using java.util.Collections to generate empty and single-member maps, lists and sets. The resulting objects are considerably more compact than their HashMap, HashSet or ArrayList equivalents, and in the case of empty collections, have effectively zero overhead as only a single instance is required per VM, even after serialization.
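
    For example, using the java.util.Collections factories (the emptyMap()/emptyList() factory methods arrived in Java 5; the EMPTY_* constants and singleton methods are older):

    import java.util.Collections;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;

    public class CompactCollections {
        public static void main(String[] args) {
            Map noProps = Collections.EMPTY_MAP;                 // shared instance, zero per-use overhead
            List oneItem = Collections.singletonList("only");    // one small immutable object
            Map onePair = Collections.singletonMap("height", new Integer(180));
            Set oneMember = Collections.singleton("me");
            System.out.println(onePair.get("height"));           // prints 180
        }
    }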
  13. Trove4j

    Have you looked at Trove4j?

    http://trove4j.sourceforge.net/

    I am not completely sure about their object allocation numbers/strategy, but in the last 2 projects where I used it, just switching from the built-in collections to this improved performance quite noticeably.
  14. Commons Flat3Map

    Keith,
    I did examine Flat3Map. Clearly the maximum capacity of 3 mappings is not sufficient, and I wanted something that scales far beyond a size of 3.
  15. pigs

    "we Java programmers can be real pigs and allocate excessive numbers of objects that can overwhelm any garbage collection algorithm, regardless of how fast it is."

    What else am I supposed to do with 3.8 GHz and 2GB of RAM?

    Clinton
  16. pigs

    "we Java programmers can be real pigs and allocate excessive numbers of objects that can overwhelm any garbage collection algorithm, regardless of how fast it is."
    What else am I supposed to do with 3.8 GHz and 2GB of RAM? - Clinton

    Tibco products can burn through 2GB of RAM without breaking a sweat. You should have 2GB on the developers' boxes. The servers should have at least 4GB; 6 or 8 would be better.
  17. Object Count Impact on GC Performance

    Just wanted to point out that it is not the number of objects allocated that counts (pun intended), but just the number of long-lived objects, for instance elements of Collections that are statically referenced. So the article's title is a bit misleading.

    Also read "Java theory and practice: Urban performance legends, revisited" by Brian Goetz.

    Cheers, Lars
  18. Hi,

    A nice feature we offer customers is our Tracer API, which allows custom trace extensions to be plugged into our management console, as well as access to our JVMPI agent counters (high-resolution clock time, CPU time, GC time, thread blocking and waiting, object allocation size, and adjusted clock).

    http://www.jinspired.com/products/jdbinsight/api/com/jinspired/jxinsight/trace/Tracer.html

    import com.jinspired.jxinsight.trace.Tracer;
    import java.util.*;

    public class Main {
        public static void main(String[] args) {

            long[] start, end;

            Object[] array;

            start = Tracer.getThreadCounters();
            array = new Object[0];
            end = Tracer.getThreadCounters();
            System.out.println("new Object[0] = " + (end[Tracer.ALLOCATION] - start[Tracer.ALLOCATION]) + " bytes");

            start = Tracer.getThreadCounters();
            array = new Object[1];
            end = Tracer.getThreadCounters();
            System.out.println("new Object[1] = " + (end[Tracer.ALLOCATION] - start[Tracer.ALLOCATION]) + " bytes");

            start = Tracer.getThreadCounters();
            array = new Object[2];
            end = Tracer.getThreadCounters();
            System.out.println("new Object[2] = " + (end[Tracer.ALLOCATION] - start[Tracer.ALLOCATION]) + " bytes");

            start = Tracer.getThreadCounters();
            array = new Object[20];
            end = Tracer.getThreadCounters();
            System.out.println("new Object[20] = " + (end[Tracer.ALLOCATION] - start[Tracer.ALLOCATION]) + " bytes");

            start = Tracer.getThreadCounters();
            array = new Object[20][0];
            end = Tracer.getThreadCounters();
            System.out.println("new Object[20][0] = " + (end[Tracer.ALLOCATION] - start[Tracer.ALLOCATION]) + " bytes");

            start = Tracer.getThreadCounters();
            array = new Object[20][1];
            end = Tracer.getThreadCounters();
            System.out.println("new Object[20][1] = " + (end[Tracer.ALLOCATION] - start[Tracer.ALLOCATION]) + " bytes");

            start = Tracer.getThreadCounters();
            array = new Object[20][2];
            end = Tracer.getThreadCounters();
            System.out.println("new Object[20][2] = " + (end[Tracer.ALLOCATION] - start[Tracer.ALLOCATION]) + " bytes");

            start = Tracer.getThreadCounters();
            array = new Object[40];
            end = Tracer.getThreadCounters();
            System.out.println("new Object[40] = " + (end[Tracer.ALLOCATION] - start[Tracer.ALLOCATION]) + " bytes");

            new HashMap(); // preload classes
            start = Tracer.getThreadCounters();
            new HashMap();
            end = Tracer.getThreadCounters();
            System.out.println("new HashMap() = " + (end[Tracer.ALLOCATION] - start[Tracer.ALLOCATION]) + " bytes");

            new Hashtable(); // preload classes
            start = Tracer.getThreadCounters();
            new Hashtable();
            end = Tracer.getThreadCounters();
            System.out.println("new HashTable() = " + (end[Tracer.ALLOCATION] - start[Tracer.ALLOCATION]) + " bytes");

            new TreeMap(); // preload classes
            start = Tracer.getThreadCounters();
            new TreeMap();
            end = Tracer.getThreadCounters();
            System.out.println("new TreeMap() = " + (end[Tracer.ALLOCATION] - start[Tracer.ALLOCATION]) + " bytes");

            start = Tracer.getThreadCounters();
            array = new Object[100];
            end = Tracer.getThreadCounters();
            System.out.println("new Object[100] = " + (end[Tracer.ALLOCATION] - start[Tracer.ALLOCATION]) + " bytes");

            new ArrayList(); // preload classes
            start = Tracer.getThreadCounters();
            new ArrayList(100);
            end = Tracer.getThreadCounters();
            System.out.println("new ArrayList(100) = " + (end[Tracer.ALLOCATION] - start[Tracer.ALLOCATION]) + " bytes");

            new Vector(); // preload classes
            start = Tracer.getThreadCounters();
            new Vector(100);
            end = Tracer.getThreadCounters();
            System.out.println("new Vector(100) = " + (end[Tracer.ALLOCATION] - start[Tracer.ALLOCATION]) + " bytes");

            start = Tracer.getThreadCounters();
            synchronized(Tracer.class) {
                try {
                    Tracer.class.wait(10000);
                } catch (InterruptedException e) {}
            }
            end = Tracer.getThreadCounters();
            System.out.println("Thread.sleeep(10000) = " + (end[Tracer.WAITING] - start[Tracer.WAITING]) + " microseconds");
        }
    }


    Results JDK 1.4.2
    ========================
    new Object[0] = 16 bytes
    new Object[1] = 16 bytes
    new Object[2] = 24 bytes
    new Object[20] = 96 bytes
    new Object[20][0] = 416 bytes
    new Object[20][1] = 416 bytes
    new Object[20][2] = 576 bytes
    new Object[40] = 176 bytes
    new HashMap() = 120 bytes
    new Hashtable() = 96 bytes
    new TreeMap() = 40 bytes
    new Object[100] = 416 bytes
    new ArrayList(100) = 440 bytes
    new Vector(100) = 440 bytes
    Tracer.class.wait(10000) = 9581829 microseconds

    Please note you will have to run the Java process with our agent loaded using -Xrunjdbinsight:a=t


    William Louth
    JXInsight Product Architect
    CTO, JInspired

    "J*EE tuning, testing and tracing with JXInsight"
    http://www.jinspired.com
  19. Unnatural acts of Java

    Having to tune your object counts to match the way garbage collection works is an unnatural act that developers should not have to go through.

    The promise of Java's automatic GC is that you can express your algorithms in a natural Java fashion, and the GC will be smart and efficient enough to take care of memory issues.

    Unfortunately this has not been the case, and so we periodically have articles like Celenkovic's telling you how to write your programs to be more efficient. This may work for the <1% of Java developers who actually read such articles, but the vast majority of developers are busy writing application software, not learning GC internals.

    It's the VM & GC engineers who need to make their products indifferent to idiomatic choices made by the developer.
  20. Unnatural acts of Java

    i share your wish that gc algorithms be perfected to make up for our occasional numskullery, but based on my very limited understanding of the means of gc, i'm not sure things will ever get to the point that the gc can completely forgive needlessly bloated heap use.

    >> busy writing application software, not learning GC internals

    yep... until said app software is put to a load test and is found to be a complete slug (speaking from personal experience). slobodan's article reminds me that java's promises give us some breathing room, letting us focus more attention on the app code; but those services do not (and, arguably, should not) let us off the hook completely.

    keeping somewhere in the back of my mind the things java is doing for me almost for free makes for smarter/better design/code. is java saying, "help me help you!"? (sorry, that movie was on tv the other night)

    those near-freebie services also make for good tech interview items. been asked a time or 2 about methods of gc.
  21. Unnatural acts of Java

    yep... until said app software is put to a load test and is found to be a complete slug (speaking from personal experience). slobodan's article reminds me that java's promises give us some breathing room, letting us focus more attention on the app code; but those services do not (and, arguably, should not) let us off the hook completely.

    I'm going to throw a slightly cynical view of GC out there...

    The better the GC engineers make the GC, the harder it gets to understand exactly what the GC is doing, and what the programmer can do to "help" the GC.

    The question is: What happens if the GC becomes so effective that it works for 99.999% of the situations, but is also completely incomprehensible by 99.999% of Java programmers?

    The result is that while the number of cases that the GC can effectively handle is increased, the number of cases that can be redesigned (as in making thought-out, intentional changes, not randomly changing things until the code works) by a mere mortal to work with the GC drops.

    Consequently, Java gets better at handling more situations, but the total number of problems that can be effectively solved in Java drops.

    I'm not sure if this is really an issue, but spend some time (if you haven't already) really looking into the documentation on how the GC in the JVM works. It's complicated. It's not just a simple mark-sweep algorithm. It does all sorts of things depending on all sorts of factors.

    I don't know if this is really an issue. But I do know that discussions among smart people about how the GC handles given situations rarely seem to show any sort of true understanding of the way the JVM's GC works, even if they have a good grasp of the theoretical aspects of GC.
  22. Unnatural acts of Java

    The thing missing from this article is any discussion about tweaking the GC parameters. Perhaps making Objects move to the old generation faster would help.
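
    For the record, the standard HotSpot knobs for that kind of experiment look something like this (illustrative values only; "MyServer" is a placeholder, and Sun's tuning guide covers the details for each VM version):

    java -verbose:gc -XX:+PrintGCDetails \
         -Xms512m -Xmx512m \
         -XX:NewRatio=3 \
         -XX:SurvivorRatio=8 \
         -XX:MaxTenuringThreshold=2 \
         MyServer

    Lowering -XX:MaxTenuringThreshold promotes survivors to the old generation after fewer minor collections, which is the "move to the old generation faster" idea above.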
  23. Unnatural acts of Java

    Lars,
    I did mention that short-lived objects (maps) are not a problem.

    Bob,
    The answer is yes and no. Yes, GC is supposed to take care of memory management and we, programmers, are supposed to simply create objects as needed without any worries. That is in fact what we do 99% of the time. No, we cannot completely ignore and forget memory management issues. For that 1% of the time where we allocate a great quantity of certain types of objects, we may need to take a close look and ensure that these objects are reasonably efficient (both in terms of execution speed and memory use).

    I believe that the "VM & GC engineers" have already improved Java's algorithms a great deal, and they are good enough to handle huge heaps very well. Nevertheless there are real physical limitations of the machines the Java VM is executing on that the VM/GC engineers cannot do anything about.

    Erick, you read my mind :)

    James,
    I am more interested in the general issue of object count, which is applicable to Java as well as C# and other languages/platforms that use GC. The class I'll present in future posts could easily be translated to C#. Java GC tuning is covered very well by Sun, and I leave it to their engineers, who have a much better knowledge/understanding of the Java heap internals, to explain GC issues.
  24. Primitives as Objects

    I tried to create a JSR (I don't know if it's in Limbo or what) to change the language spec to treat primitives as first-class Objects (no autoboxing). I don't have much hope for it happening in Java 2 or whatever the numbering scheme is now. Interestingly, autoboxing means that most (if not all) of the required syntax changes are already implemented in the compiler.

    Such a change would make all the primitive wrappers that plagued your example system unnecessary. It would also make the language a lot more natural. Have you ever wanted to store a counter as a map value pre-1.5? It takes a lot of contortions (and Objects) to achieve something pretty trivial.
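
    For reference, the pre-1.5 idiom being alluded to, next to its autoboxed equivalent (a sketch; the map and key are made up):

    import java.util.HashMap;
    import java.util.Map;

    public class Counters {
        public static void main(String[] args) {
            Map counts = new HashMap();

            // Pre-1.5: unwrap, test, rewrap -- one throwaway Integer per increment.
            Integer current = (Integer) counts.get("hits");
            counts.put("hits", new Integer(current == null ? 1 : current.intValue() + 1));

            // Java 5: autoboxing hides the wrapping, but still allocates
            // (or pulls from the cache) behind the scenes.
            Integer boxed = (Integer) counts.get("hits");
            counts.put("hits", boxed == null ? 1 : boxed + 1);

            System.out.println(counts.get("hits"));   // prints 2
        }
    }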

    Anyway, it'd be nice if I could get people on board with this idea. It might actually be implemented if that were to occur.
  25. Primitives as Objects

    James,
    I am not sure I follow you. "Primitives as first-class Objects (no autoboxing)" sounds as if you want an integer value that is a primitive and an object at the same time. I am not sure if it is possible or how it would work. If it is an object then it has to descend from java.lang.Object and be managed by the GC, etc., which is in fact what java.lang.Integer is today. Primitives don't have any sort of identity, which is in direct conflict with the way objects work. So you cannot have an object/primitive hybrid, at least I don't see how.
  26. Primitives as Objects

    James, I am not sure I follow you. "Primitives as first-class Objects (no autoboxing)" sounds as if you want an integer value that is a primitive and an object at the same time. I am not sure if it is possible or how it would work. If it is an object then it has to descend from java.lang.Object and be managed by the GC, etc., which is in fact what java.lang.Integer is today. Primitives don't have any sort of identity, which is in direct conflict with the way objects work. So you cannot have an object/primitive hybrid, at least I don't see how.

    Let me clarify. Primitives would not change. They would be represented the same way as they are now.

    The difference would be that you could assign a primitive to a reference. In theory, this should not be difficult. References are the 'other' primitive. From a low-level perspective, all they contain is an address.

    The VM would have to be able to 'know' whether a value is a primitive or an Object reference. In the case of a primitive, when you call toString() on an int (for example), the VM would run the method lookup procedure that already exists for Objects, 'see' that it was an int, and call the toString() method for int that is defined in the VM (probably per a JLS definition).

    The hard part is that the VM must detect what the type of the variable really is. This could be accomplished with a few bits on the variable. This might incur some overhead, of course, as there would be an extra step required to determine whether the reference is a primitive or not.
  27. Primitives as Objects

    I guess the point is to unify the Java type system. I think the decision to create a dual type system was a flaw in the original language design.
  28. Unifying the Type System

    I guess the point is to unify the Java type system. I think the decision to create a dual type system was a flaw in the original language design.

    It depends. Languages with a unified type system are much more elegant to program in, but there's a cost to pay in terms of memory consumption and execution time.

    Personally, I would like Java to be more like C++ (templates, operator overloading, optional explicit scope for objects, optional explicit deletion of objects w/destructors), but it also should borrow some from Python (particularly metaclasses and an improved version of descriptors).
  29. Unifying the Type System

    I guess the point is to unify the Java type system. I think the decision to create a dual type system was a flaw in the original language design.
    It depends. Languages with a unified type system are much more elegant to program in, but there's a cost to pay in terms of memory consumption and execution time.

    You can make that argument about object-oriented languages in general. I've never had a problem where an optimal program was 'too slow' in any language. You are also assuming this would be significantly slower. I'm not convinced that is true.
    Personally, I would like Java to be more like C++

    templates: pass
    operator overloading: please no
    optional explicit scope for objects: not sure what this is
    optional explicit deletion of objects w/destructors: I would prefer an option for deterministic GC. Most of the time the VM is smarter than the developer when it comes to GC but there are certain (rare) cases where the developer can do better.
  30. Unifying the Type System

    optional explicit scope for objects: not sure what this is

    In C++, destructors are automatically invoked for objects allocated on the stack when the object goes out of scope. This means that for resources such as database connections or file streams, you don't have to worry about closing them when an exception gets thrown or something like that. When they go out of scope, they are automatically closed. I can think of lots of examples where I'd like to be able to say "If this scope is exited, clean up this object."

    In Java, you get those wonderful try/catch/finally blocks that have to check whether (1) a resource was even created, (2) was it ever opened, and then (3) close it. In a more generalized sense, you have to ensure that every possible path through your code contains code to free the resources that it has allocated.
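
    The Java idiom being described, for comparison (a generic sketch using a file stream):

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;

    public class Cleanup {
        public static void main(String[] args) throws IOException {
            InputStream in = null;                     // (1) may never be created
            try {
                in = new FileInputStream("data.txt");  // (2) may fail to open
                int first = in.read();
                System.out.println("first byte: " + first);
            } finally {
                if (in != null) {                      // (3) close on every exit path
                    in.close();
                }
            }
        }
    }
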
    optional explicit deletion of objects w/destructors: I would prefer an option for deterministic GC. Most of the time the VM is smarter than the developer when it comes to GC but there are certain (rare) cases where the developer can do better.

    The problem is memory is probably the least scarce type of resource in a more general class of resources (files, database connections, network connections, etc) that need to be carefully managed. The GC is great for objects that just consume memory, but it doesn't really work for anything else.

    I also think if your code uses an object after you thought your code should be done with it, that's a serious error. With explicit deallocation, it will cause a serious crash.
    templates: pass
    I think Java generics serve about 90% of the comprehensible uses for templates in a superior fashion, and that much of the template usage in C++ isn't very good. But there are great potential usages for templates that generics won't satisfy; it just takes some mind-bending thought (kind of like Python metaprogramming).
    operator overloading: please no
    I think the most important part of an OO language is being able to make your classes seem like part of the language, and then to bring your code as close to your problem domain as possible. A good OO language should let you write code (once the underlying classes are written) that looks almost like a DSL. That's hard to do without operator overloading.
  31. Unifying the Type System

    optional explicit scope for objects: not sure what this is
    In C++, destructors are automatically invoked for objects allocated on the stack when the object goes out of scope. This means that for resources such as database connections or file streams, you don't have to worry about closing them when an exception gets thrown or something like that. When they go out of scope, they are automatically closed. I can think of lots of examples where I'd like to be able to say "If this scope is exited, clean up this object." In Java, you get those wonderful try/catch/finally blocks that have to check whether (1) a resource was even created, (2) was it ever opened, and then (3) close it.

    I don't worry about that much, because I use an approach that IMO is better than destructors. Even if you have destructors, you are still dependent on the Object being deallocated. What I do is specify an interface and let my class determine when the resource is to be deleted.

    For example, instead of iterating over the rows returned from a resultset in a hundred different classes and hoping every developer calls the cleanup method in a finally block, I provide a RowHandler interface. Then I have a single class that processes queries and passes the rows to the RowHandler one at a time. That way I can guarantee that in all cases, once the rows have all been processed, the resources are cleaned up.

    I kind of see all the syntactic tricks and shortcuts in C++ as a bit of a crutch. Destructors, to me, are bad OO. They make cleanup the side effect of a deallocation of memory. I prefer a more direct approach.
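
    A minimal sketch of that idiom (the interface and class names are hypothetical, not a published API):

    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    // The callback owns only the per-row logic; the runner owns the resources.
    interface RowHandler {
        void handleRow(ResultSet row) throws SQLException;
    }

    class QueryRunner {
        public void run(Connection conn, String sql, RowHandler handler)
                throws SQLException {
            Statement stmt = conn.createStatement();
            try {
                ResultSet rs = stmt.executeQuery(sql);
                try {
                    while (rs.next()) {
                        handler.handleRow(rs);   // callers never see cleanup
                    }
                } finally {
                    rs.close();
                }
            } finally {
                stmt.close();                    // released on every code path
            }
        }
    }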
  32. Off topic on C++ vs Java

    Destructors, to me, are bad OO. They make cleanup the side effect of a deallocation of memory.

    Yes and no. I think destructors should be more like the "close" method on many objects, and probably shouldn't be called destructors. I've seen soooo many problems with incorrectly managed scarce resources, and I think it would help to provide explicit syntax to say "when this reference to this object goes out of scope, invoke this method to clean up the object."
    I kind of see all the syntactic tricks and shortcuts in C++ as a bit of a crutch.

    Yes and no. Not many people really know how to use templates beyond leveraging the STL.

    Don't get me wrong, there's a lot about Java that I really like, and C++ always seems good in theory and awful in practice. But Java (as a language, not its libraries and frameworks) frequently feels "dumbed down" compared to both C++ and dynamic languages like Python.

    How's this: the engineer in me finds the idea that a language should omit features because 90% of programmers misuse them offensive. The manager in me knows that I can't have only the top 10% of programmers working with me, so giving them one less way to shoot themselves in the foot is a good thing. I think Java has enjoyed much of its success because it is a very accessible language.
  33. Off topic on C++ vs Java

    How's this: the engineer in me finds the idea that a language should omit features because 90% of programmers misuse them offensive. The manager in me knows that I can't have only the top 10% of programmers working with me, so giving them one less way to shoot themselves in the foot is a good thing. I think Java has enjoyed much of its success because it is a very accessible language.

    I'm with you here. I think there are some features that could be added to Java to make it a better language. Hell, I just proposed one. But I think it needs to be done slowly.

    I think Java is popular because (as you say) it's accessible, but I think there's more to it than that. The advantage I see in Java is that the syntax is pretty sparse. There's not a lot of overlap in ways to code things. Sometimes this can be a burden, but it also drives developers (myself included) to find better approaches. 'Necessity is the mother of invention', to quote the old saw.

    The other thing I like about Java is that it distills out the basics of OO. I have a degree in Computer Science but I credit developing in Java for making me really understand it. Not having all the options of C++ forces a developer to sink or swim with OO. That's my opinion, anyway.

    I really should look at Python more. I hear a lot of good things but I haven't gotten unlazy enough to play around with it.
  34. Off topic on C++ vs Java

    I really should look at Python more. I hear a lot of good things but I haven't gotten unlazy enough to play around with it.

    It took me about 6 or 8 months to start grokking Python enough to feel like it offered enough advantages over languages like Java to overcome its disadvantages (I'm a big believer in static typing).

    The first thing to win me over is that Python programs do tend to "just work." I can't explain why. However, they also tend to have hidden bugs (IMHO due to the lack of static typing). So time to "functional" is extraordinary, but time to "production ready" isn't any better than other languages.

    What really got me was when metaclasses and descriptors really sank in. I found myself replacing 100s of lines of code with 10s by identifying patterns in my code (and patterns within those patterns) and coding them into metaclasses and descriptors. I could even make my code validate that patterns were being correctly applied.

    The result is very dense code (meaning very few lines, but the lines are very logically intensive, and get executed very frequently).

    Anyway, I think many of the positive aspects of Python could be ported into the Java compiler w/o changing the JVM.
  35. Off topic on C++ vs Java

    What really got me was when metaclasses and descriptors really sank in. I found myself replacing 100s of lines of code with 10s by identifying patterns in my code (and patterns within those patterns) and coding them into metaclasses and descriptors. I could even make my code validate that patterns were being correctly applied. The result is very dense code (meaning very few lines, but the lines are very logically intensive, and get executed very frequently). Anyway, I think many of the positive aspects of Python could be ported into the Java compiler w/o changing the JVM.

    Thanks, you've piqued my curiosity. I'll have to check it out.
  36. Unifying the Type System

    I think Java generics serve about 90% of the comprehensible uses for templates in a superior fashion, and that much of the template usage in C++ isn't very good. But there are great potential usages for templates that generics won't satisfy; it just takes some mind-bending thought

    I think generics are OK, but I find that too many people look at them as a puzzle to solve. They are thinking up all these fancy ways to use them and writing crazy methods to avoid compiler warnings. They aren't worried about making the code sensible and maintainable, or even accomplishing a meaningful goal. They are just goofing around with a toy. Most of the value of generics comes from the simplest Collection declarations. I don't need more over-complications.
    I think the most important part of an OO language is being able to make your classes seem like part of the language, and then to bring your code as close to your problem domain as possible. A good OO language should let you write code (once the underlying classes are written) that looks almost like a DSL. That's hard to do without operator overloading.

    I can go along with the sentiment but I've seen too much bad code. The nice thing about Java is that I can look at a snippet of code and know exactly what it does. Operator overloading casts doubt on that. I'm not arguing that it's perfect, but Java has a Zen quality to it. C++ feels more banal. Kind of brutish and unrefined. Operator overloading can be achieved with pre-compilers. I'm always surprised that more projects to create some haven't taken off.
  37. OT: Java Precompilers

    Operator overloading can be achieved with pre-compilers. I'm always surprised that more projects to create some haven't taken off.

    I'm not. Let's say you write a precompiler for Java that implements operator overloading. Now what happens when people try to use it in their favorite IDE? Maybe they can make it work by manually editing the Ant script, but now their powerful IDE has been reduced to an overweight text editor.
  38. OT: Java Precompilers

    Operator overloading can be achieved with pre-compilers. I'm always surprised that more projects to create some haven't taken off.
    I'm not. Let's say you write a precompiler for Java that implements operator overloading. Now what happens when people try to use it in their favorite IDE? Maybe they can make it work by manually editing the Ant script, but now their powerful IDE has been reduced to an overweight text editor.

    True. Theoretically, you could implement an Eclipse layer for it, I would think. I'm not saying Eclipse is the only IDE, but it seems to be becoming the de facto Java IDE, despite its issues.
  39. Primitives as Objects

    James,
    If I follow what you're saying, you want to associate type information with primitives so that you can treat them as objects with value semantics instead of reference semantics, rather than emulating it using compile-time boxing/unboxing between primitives and immutable objects.

    It seems to me that doing this will make you pay a fairly significant memory consumption penalty, because even though you only need a "few bits," you're going to add a minimum of 8 and probably 16 or even 32 in order to maintain good memory alignment. Consequently, software that makes heavy usage of primitives (other than references) would pay a significant storage penalty. Auto-(un)boxing only imposes a penalty when it is used, and the GC is supposed to be good at cleaning up small short-lived objects.

    My gut tells me that you'd get more bang for the buck by more heavily optimising the JVM to treat primitive wrappers in a special manner, and continuing to use auto-(un)boxing to convert between primitive types and their wrappers.
  40. Primitives as Objects

    James, if I follow what you're saying, you want to associate type information with primitives so that you can treat them as objects with value semantics instead of reference semantics, rather than emulating it using compile-time boxing/unboxing between primitives and immutable objects. It seems to me that doing this will make you pay a fairly significant memory consumption penalty, because even though you only need a "few bits," you're going to add a minimum of 8 and probably 16 or even 32 in order to maintain good memory alignment. Consequently, software that makes heavy usage of primitives (other than references) would pay a significant storage penalty.

    I didn't make this clear (partly because it wasn't completely clear in my own mind), but I don't think it would be necessary to add type info to the primitive variables. If the VM is working with an int type, it could work as it does now. The difference would only come into play when the int was assigned to a reference. The reference would need to have some bits set to tell it that this value is a literal, and its type. The VM already stores type info in references.

    One issue with this is that references would be forced to be able to hold 64-bit values, which may not be the case on a lot of platforms. Perhaps there could be a work-around that allows this to be done without making all references larger.
    Auto-(un)boxing only imposes a penalty when it is used, and the GC is supposed to be good at cleaning up small short-lived objects. My gut tells me that you'd get more bang for the buck by more heavily optimising the JVM to treat primitive wrappers in a special manner, and continuing to use auto-(un)boxing to convert between primitive types and their wrappers.

    Along the lines of this thread, I'm not concerned about short-lived Object wrappers but rather the medium-lived Object wrappers. As the author points out, the GC's algorithm is dependent on the number of Objects, not their size.
  41. Primitives as Objects

    I didn't make this clear (partly because it wasn't completely clear in my own mind), but I don't think it would be necessary to add type info to the primitive variables. If the VM is working with an int type, it could work as it does now. The difference would only come into play when the int was assigned to a reference. The reference would need to have some bits set to tell it that this value is a literal, and its type. The VM already stores type info in references.

    That last sentence is bogus. Maybe I haven't thought it through completely. Maybe the idea still has some potential. I don't know.
  42. Unnatural acts of Java

    Very valid. You cannot expect the hundreds of programmers in your company to keep all these things in their heads while programming.
    Those days of low-level programming (like in C/C++) are gone now (unless you work for some system software company).

    With this in mind, the article is technically interesting but of little practical effect.
  43. ,

    Kiran
  44. practically little effective

    Kiran,

    At the end of the series I will present performance measurements to show that it is effective. It certainly does solve the problem in my server program.
    It may not be applicable to your software, and you could question its wider applicability. That is a different issue from effectiveness.
  45. Nevertheless there are real physical limitations of the machines the Java VM is executing on that the VM/GC engineers cannot do anything about.

    in fact, that limit has now been pushed out a couple of orders of magnitude -- from typical heap sizes of up to 2 gig to 96 gig -- by building hardware that has support for GC.

    indeed, some of the assumptions made by Slobodan are categorically not true for Azul's Pauseless GC. For example, Pauseless GC doesn't need to suspend all threads to mark the heap.

    Also, I find it appalling that he had a server with several hundred meg of heap and was getting pauses. I spend a fairly large part of my time talking with enterprise architects and developers about GC, and most of them are running 1-2 gig heaps with no pauses.

    Now I'm wondering if perhaps he didn't have a different problem, like heap paging. That can certainly cause pauses even on smaller heaps.

    --bob, who works for www.azulsystems.com