JSR 201: Public Review of Enumerations, Autoboxing, for loops

  1. The JCP has announced the Public Review of JSR 201: Extending the Java Programming Language with Enumerations, Autoboxing, Enhanced for loops and Static Import.

    Grab the Public Review

    View the detail page for JSR 201

    NOTE: The close of Public Review is 21 February 2004, so get your comments in.

    Draft specs for enumerations, autoboxing, the enhanced for loop, and static import are being made available concurrently with the release of this JSR.

    Threaded Messages (27)

  2. On the for loop, it appears that there is some confusion in the spec between SimpleIterator and ReadOnlyIterator ... are they perhaps intended to be one and the same?

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Clustered JCache for Grid Computing!
    Here's an interesting albeit excruciatingly detailed thread from the
    Java Forums on the subject:
    http://forum.java.sun.com/thread.jsp?forum=316&thread=436884&start=0&range=15&hilite=false&q=

    Derek Kaczmarczyk
    http://www.strayneuron.com

    > On the for loop, it appears that there is some confusion in the spec between SimpleIterator and ReadOnlyIterator ... are they perhaps intended to be one and the same?
    >
    > Peace,
    >
    > Cameron Purdy
    > Tangosol, Inc.
    > Coherence: Clustered JCache for Grid Computing!
  4. If you can figure out that thread (the link you posted), please let me know ;-)
  5. This issue has been resolved.

    Download the public review draft from Sun. The links above are outdated.
  6. Understanding the thread

    If you can figure out that thread (the link you posted), please let me know ;-)

    Well, I'm not Derek, but I'm the one responsible for the thread.

    The crux of the matter is that Java implements covariance (and certain features of generics) using compiler-generated bridge methods. For example, take these two classes:

    public class A {
      public A clone() {
        return new A();
      }
    }

    public class B extends A {
      // covariant override of A.clone
      public B clone() {
        return new B();
      }
    }

    The 1.5 compiler would generate extra bridge methods to ensure that an invokevirtual call to clone() on a B ref would actually call B.clone().
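
    If you want to see the bridge for yourself, here's a quick sketch (assuming the A/B classes above are compiled with the 1.5 compiler) that dumps B's declared methods via reflection:

    import java.lang.reflect.Method;

    public class ShowBridge {
      public static void main(String[] args) {
        for (Method m : B.class.getDeclaredMethods()) {
          // the compiler-generated bridge shows up as a second clone() whose
          // return type is A and whose isBridge() flag is true
          System.out.println(m + " bridge=" + m.isBridge());
        }
      }
    }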

    Because covariance relies on these generated bridge methods, you can't retrofit covariance into a class hierarchy that was compiled prior to 1.5 - unless you went in and did something tricky like rewriting the bytecode (which is what I did for a while).

    Apparently, there were concerns about the feasibility of fixing the general problem, so Sun decided to just drop SimpleIterator and the changes they had made in the new API. What you should take from this is that you have to be very careful with covariance and generics in the face of binary compatibility. Their implementation is not a black box.

    God bless,
    -Toby Reyelts
  7. Too much review...

    ...maybe I am missing something, but at JavaOne we had sessions and sessions telling us about autoboxing, generics, enhanced for loops, enumerations (the latter being something pretty much broken, IMHO, due to its very nature). We were told we would have all that in Java 1.5 "just around the corner" in fall 2003. Now it is almost spring 2004, and we are getting a JCP review for it?? In my opinion, this need not be part of the JCP; it is a language feature, not an API, and I don't want much-wanted language features held up for the sake of some review board (see the JSF fiasco). Also, things like autoboxing, generics, for loops etc. should be well enough understood by now for someone (not some rookie, of course) to just write the specification down and sign it off internally at Sun.
  8. Re: Too much review...

    just write the specification down and sign it off internally at Sun.


    You do realize that this would've led to everyone going bananas over Sun not caring about the community's opinion, don't you :)
  9. Re: Too much review...

    You do realize that this would've led to everyone going bananas over Sun not caring about the community's opinion, don't you :)


    Yeah, you might be right. On the other hand, at the last JavaOne there were thousands of developers who were very, very anxious to get their hands on a production version of this stuff. If you don't harvest when the fruit is ripe, it will go foul (or on some occasions become excellent ice wine ;-)).
  10. My biggest fear is that with autoboxing, people will get lazy and Java programs will get slower. People might start using ints as keys to their maps, for example, not fully realizing that every time you call get/remove/put with an int, an Integer object is instantiated.
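
    For example, here's a small sketch of what the compiler quietly inserts (the class and variable names are just for illustration):

    import java.util.HashMap;
    import java.util.Map;

    public class BoxingCost {
      public static void main(String[] args) {
        Map<Integer, String> byId = new HashMap<Integer, String>();
        // the compiler turns this into roughly byId.put(Integer.valueOf(12345), "foo"),
        // allocating an Integer just so the int can be used as a key
        byId.put(12345, "foo");
        // and the lookup boxes the int key all over again
        String name = byId.get(12345);
        System.out.println(name);
      }
    }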

    I'd much rather have seen, for example, the Map interface expanded to include a 'put' method that takes each of the primitives as the value, and a different map implementation for each primitive... IntHashMap, IntTreeMap, FloatHashMap, FloatTreeMap, etc...

    Every other feature that is just for lazy typers doesn't impact performance, but autoboxing does.
  11. I'd much rather have seen, for example, the Map interface expanded to include a 'put' method that takes each of the primitives as the value, and a different map implementation for each primitive... IntHashMap, IntTreeMap, FloatHashMap, FloatTreeMap, etc...


    Then you would need 64 map implementations!!!
    7 primitives + 1 object for keys and the same for values!!!

    I think that the solution for the primitive types problem in some future Java version or a completely new platform/language lies in smart compilers/VMs. From the programming language and developer point of view, all primitive types should be treated like full Objects, like Integer, Long etc. But from the VM side, primitive types should be treated like value types. Instead of storing an object reference to an Integer instance, the compiler and VM would store the four-byte value itself. As all primitive types are immutable value objects, there would be no problems with parameter-passing semantics.

    This way we will get the best of both worlds: transparent primitive types and good performance. I think that the autoboxing syntax change is one step in that direction.

    Mileta
  12. Then you would need 64 map implementations!!!

    7 primitives + 1 object for keys and the same for values!!!

    Sorry, it's 81 map implementations.
    (8+1)*(8+1) (8 primitives + 1 object)
  13. Different math

    I don't think it would be 81. For Maps, the keys are the ones with special calculations -- hashCode(), equals(). The values don't matter. So, I would propose 9 Map interfaces (8 primitives, 1 Object) each with 9 put(xxx,yyy) methods where xxx is the Map's primitive type and yyy is each of the 8 primitive types, plus the Object. For example:

    IntMap ints = new IntHashMap(); // some class implementing the IntMap interface
    ints.put(0, 0); // put(int, int)
    ints.put(0, "0"); // put(int, Object)
    ints.put(0, 0.0f); // put(int, float)
    // etc...

    Still, that's 8 new interfaces, and each would have 8 new methods! It's not a nice solution, and I like the concept of a smart-compiler better.

    Still, I once wanted to use 'long' as a key, so I did make a LongMap, which was tremendously more efficient than a Map using Longs as keys. I'd prefer not to have autoboxing at all and just force developers to make their own primitive collections.

    Or how about a Map interface that took a Key interface that somehow told the Map how to treat a primitive as a Key (basically, implementing a special hashCode() and equals() as a third party)?
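
    Purely as a hypothetical sketch of that last idea (none of these types exist anywhere; the names are made up):

    // a third-party "Key" strategy: a hashCode()/equals() for a primitive,
    // so the map never has to box the int into an Integer
    interface IntKeyStrategy {
      int hash(int key);
      boolean equals(int key1, int key2);
    }

    // a map that is handed the strategy when it is constructed,
    // e.g. new IntHashMap(new DefaultIntKeyStrategy())
    interface IntMap {
      Object put(int key, Object value);
      Object get(int key);
    }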
  14. Different math

    Brian,

    The real question is related to the cost of autoboxing. For a typical business app, it makes zero performance difference. In other words, the cost of boxing and unboxing a couple thousand ints or whatever is measured in the low single milliseconds, which is usually a rounding error for performance (especially compared to sending a single JMS message or doing a single simple JDBC query.)

    OTOH, in certain libraries, a few millis here and there are much more valuable, and specialized data structures are called for. I think that burdening the typical application developer with this type of responsibility is pointless, and would tend to cause many more application errors and code bloat.

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Clustered JCache for Grid Computing!
  15. Cost of autoboxing

    In other words, the cost of boxing and unboxing a couple thousand ints or whatever is measured in the low single milliseconds

    I think you understate the cost.

    1) You are allocating an actual Java object which (at least on Sun's VM) has an object header that takes an additional eight bytes of space above and beyond the original four bytes of integer data.

    2) Garbage collection overhead - Every single time you box a primitive, you create a new object that has to be collected.

    What this really means is that the cost isn't bad if you're just boxing some here and there, but it can add up really fast when you're dealing with millions of objects. And guess what? If you're using a collection, you'll be performing a box every single time you access an element in the collection. Add to this the fact that autoboxing was also the solution for creating generic containers of primitives, and you can start to see where a problem may start to arise. The performance is so poor as to be unusable for most applications. I say this from experience.

    What Java developers need is a means of creating generic containers that operate on primitive values efficiently. How that occurs - through new special "value types", through specialization of generics for primitives, through VM magic (ala Smalltalk), is less of an issue, but it needs to happen.

    God bless,
    -Toby Reyelts
  16. Cost of autoboxing

    Toby,

    Testing on my lowly P3/1GHz laptop with JDK 1.4.2 (HotSpot server), I see an average of 70ms per 1 million boxing+unboxing operations. That means I am seeing roughly 15,000 boxing+unboxing operations per millisecond, including GC time (because I run the entire thing in a big loop to force it to include GC times). Keep in mind that my notebook is a bit slower than the typical modern server ;-)
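
    For reference, the test is nothing fancy; it's roughly this shape (not the exact code I ran, and since 1.4.2 has no autoboxing the box and unbox are spelled out by hand):

    public class BoxBench {
      public static void main(String[] args) {
        // repeat the measurement so GC time gets folded into the numbers
        for (int run = 0; run < 10; ++run) {
          long start = System.currentTimeMillis();
          int sum = 0;
          for (int i = 0; i < 1000000; ++i) {
            Integer boxed = new Integer(i); // "box"
            sum += boxed.intValue();        // "unbox"
          }
          long elapsed = System.currentTimeMillis() - start;
          System.out.println("1M box+unbox: " + elapsed + "ms (sum=" + sum + ")");
        }
      }
    }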

    I happen to agree with you that blindly accepting boxing/unboxing overhead is foolish, but I stand by my claim that for most business logic it will not make a noticeable (if even measurable) difference in performance.

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Clustered JCache for Grid Computing!
  17. Cost of autoboxing

    Hi Cameron,

    I can't confirm your timings, but 70ms is a very long time in the kind of software I work on (routing and scheduling), and the norm is that you perform many accesses to collections per second. In fact, the core of the algorithms involved revolves around efficient access to collections of data.

    We also work with millions of objects in memory. (Roughly 100 million). Multiply 100 million times 8 bytes, and you get 800M of overhead. Anything but trivial.

    Routing and scheduling may not be "average business logic", but that kind of logic runs in a large number of businesses.

    God bless,
    -Toby Reyelts
  18. Cost of autoboxing

    Toby,

    We also work with millions of objects in memory. (Roughly 100 million). Multiply 100 million times 8 bytes, and you get 800M of overhead. Anything but trivial.

    That is a specialized case, and calls for specialized data structures. On that point, you and I are in complete agreement. We have developed several such specialized data structures ourselves for high-volume and latency-sensitive parts of our software, and it results in 96% lower memory utilization (under load) than other packages that we've compared to.

    What I am saying, though, is that a combination of a box and an unbox of an int/Integer that costs 70 nanoseconds on a low-end notebook (i.e. 15 million box and unbox operations per second) is probably sufficient for the majority of business logic processing, particularly when the average business application is spending the overwhelming majority of its time in network, database or file I/O.

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Clustered JCache for Grid Computing!
  19. Special cases

    Cameron,

    I think we agree vehemently on most things, yet...

    That is a specialized case

    I don't think so, or I think you exaggerate "specialized". Perhaps it depends upon what you believe Java is suitable for as a language. Languages that require developers to write hundreds of identical yet specialized containers to avoid memory and cpu bottlenecks are not appropriate for a very large number of "not-so-special" tasks. It is provincial to state that this problem is somehow unimportant.

    God bless,
    -Toby Reyelts
  20. Special cases

    Toby, you said you had one hundred million business objects in memory that your app is holding onto and working with. Even if they were extremely small / simple / tight objects, you're still talking about close to 2GB heap just to get that many tiny tiny objects instantiated!!! That's why I said it's a special case ... most of us still use these stupid 32-bit OS's with 1.6GB JVM heap limits ;-)

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Clustered JCache for Grid Computing!
  21. Special cases

    Actually, we run this in under 1.3G, which is the VM heap limit on non-server Windows (at least W2K). Using 1.3G of memory shouldn't be a very special case in days when people run IDEs that chew up 512M without blinking.

    Unfortunately, we're having to look at ways of stretching the limit a bit. I'm in the midst of considering a general heap allocator based on direct byte buffers....
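
    The basic idea looks something like this (a bare-bones sketch, not our actual allocator):

    import java.nio.ByteBuffer;

    public class OffHeapInts {
      // a direct buffer lives outside the normal Java heap, so it doesn't count
      // against -Xmx (it does still consume process address space, and is subject
      // to the VM's direct-memory limit)
      private final ByteBuffer buf;

      public OffHeapInts(int capacity) {
        buf = ByteBuffer.allocateDirect(capacity * 4); // 4 bytes per int
      }

      public void set(int index, int value) {
        buf.putInt(index * 4, value);
      }

      public int get(int index) {
        return buf.getInt(index * 4);
      }
    }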

    God bless,
    -Toby Reyelts
  22. Special cases

    Hi Toby,

    As an FYI, on Windows the NIO buffers appear to come from the same 2GB that the JVM is already chewing up ... that is my analysis at any rate.

    Have you considered the 64-bit Solaris JVMs? Or Itanium? Or Opteron?

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Clustered JCache for Grid Computing!
  23. NIO limits

    As an FYI, on Windows the NIO buffers appear to come from the same 2GB that the JVM is already chewing up ... that is my analysis at any rate.

    I'm not sure I understand you. I can allocate about 1.3G of space max in the Java heap, and I can allocate about 500M of direct byte buffer space (with some other crap running). That gets us what we want - beyond the 1.3G contiguous memory space limit in the VM. I haven't tried this out on a 3GB Windows Server machine yet, but I expect to be able to allocate 1.5G of direct byte buffer space there.

    Have you considered the 64-bit Solaris JVMs? Or Itanium? Or Opteron?

    Yes, we scale with the hardware. A W2K Prof 32-bit x86 box would be the low-end for routing with the largest networks. I don't know the heap limits on Sun's current AMD64 VMs, but I hope they have improved. For example, the heap limits on 64-bit Solaris are still pretty low.

    God bless,
    -Toby Reyelts
  24. Cost of autoboxing

    I think Cameron's point is still valid. For most business logic the extra overhead isn't going to matter, and if there is some gain in maintenance, cleaner-looking code, and fewer "custom Collections", all the better.

    In any system where performance is the biggest concern, you'd hope to have developers who understand the autoboxing overhead, along with the other 1000 performance concerns when using Java.

    I'm wondering if future JVMs will be able to negate this altogether? Maybe a special optimized GC for autoboxed values? That stuff is above my head.
  25. Custom Collections

    custom Collections

    That is the point. We shouldn't need custom collections to obtain acceptable performance when using primitives. That's the current crappy solution - and boy is it ever crappy.

    We all want more maintainable code, but it shouldn't come at the cost of acceptable performance. Autoboxing is fine, because it does not take away from acceptable performance. Collections of primitive-based objects are not fine, because they do take away from acceptable performance.

    Many people (myself included) were joyfully awaiting the arrival of generics, because we hoped that it would also entail the end of custom collections. Unfortunately it did not, and here we still are.

    God bless,
    -Toby Reyelts
  26. This reminds me of...

    I think that the solution for the primitive types problem in some future Java version or a completely new platform/language lies in smart compilers/VMs. From the programming language and developer point of view, all primitive types should be treated like full Objects, like Integer, Long etc. But from the VM side, primitive types should be treated like value types. Instead of storing an object reference to an Integer instance, the compiler and VM would store the four-byte value itself.


    Your idea reminds me strongly of how generics and collections work in C# 2.0.
  27. This reminds me of...

    Your idea reminds me strongly of how generics and collections work in C# 2.0.


    I suppose you mean how they WILL work in C# 2.0, as it is not yet released and is AFAIK scheduled for release slightly after J2SE 1.5.

    But handling primitives in two different ways (from the developer and from the VM perspective) is somewhat different from avoiding type casts in C# 2.0 generics.
    AFAIK C# still holds object references for autoboxed primitive types when they are stored in collections.

    Mileta
  28. I suppose you mean how they WILL work in C# 2.0, as it is not yet released and is AFAIK scheduled for release slightly after J2SE 1.5.


    I stand corrected: I mean how it works in C# Beta 1, currently installed and running on this very machine (Miguel de Icaza also has generics working in Mono on Linux). As far as I know, there is no defined release date for Whidbey, so no comments here (alas, if standard conventions have been followed, I guess Whidbey Beta 1 is more mature than Tiger Alpha ;-)).

    > But handling primitives in two different ways (from the developer and from the VM perspective) is somewhat different from avoiding type casts in C# 2.0 generics.
    > AFAIK C# still holds object references for autoboxed primitive types when they are stored in collections.

    If you use the classic C# collections you are right, but if you use the C# generic collections, the compiler is smart and doesn't box the values into objects but uses the values themselves. More info at http://www.artima.com/intv/generics.html.