Discussions

News: Opinion: Object Identity and how JDK 1.2+ removed a core feature

  1. Sam Pullara of BEA has written a history lesson about Object Identity. He explains how Object Identity is "at the core of all persistence containers and most distributed object systems", and how Java had a built-in solution which was removed in JDK 1.2 and above. He explains why he thinks it was removed, and how he thinks a workaround could have changed the life of Java.

    "Object identity is something that is at the core of all persistence containers and most distributed object systems. Finding the "same" object in another system or loading the "same" object that was referenced before is at the heart of all entity based containers. Java had a built-in solution to this but it was removed."

    He even ties in EJB: "Instead we had to come up with yet another specification, EJB, to handle object identity through the definition of a primary key for entities separate from the normal Java contracts."

    Read Sam Pullara on Object Identity and why JDK 1.0.2/1.1 was better than 1.2+

    Threaded Messages (29)

  2. Hear ye, hear ye...

    So true.

    And I still don't get why - with all the other itty bitty classes we've already got - we don't have a GUID generator as part of the JDK. Or failing that, how about a timestamp+IP+high+low OID generator at least (see the rough sketch at the end of this post)?

    I'm aware of some "undocumented features": in our current project we use a class (wrapped in a utility class, of course) from RMI's distributed garbage collection package, but we need something portable.
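
    Something along these lines is what I have in mind - a rough sketch only, with a made-up class name (nothing that actually ships in the JDK):

    import java.net.InetAddress;

    // Hypothetical timestamp+IP+counter OID generator, purely illustrative.
    public class SimpleOidGenerator {
      private static String ip;
      private static int counter = 0;

      static {
        try {
          ip = InetAddress.getLocalHost().getHostAddress();
        } catch (Exception e) {
          ip = "127.0.0.1"; // fall back if the local address cannot be determined
        }
      }

      public static synchronized String nextOid() {
        return ip + "-" + System.currentTimeMillis() + "-" + (counter++);
      }
    }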
  3. Re: Hear ye, hear ye...

    Try downloading JDK 1.5 alpha and using it. You might be happy (wish I could say more, but the NDA limits me).
  4. when did javalobby or whoever give access to this version? silly NDAs.

    besides, aren't the java libraries a publicly discussed and decided specification that is merely _implemented_ by the jdk? it would seem odd to me that we'd find something unexpected in sun's alpha 1.5 jdk.
  5. "Object identity is something that is at the core of all persistence containers and most distributed object systems. Finding the "same" object in another system or loading the "same" object that was referenced before is at the heart of all entity based containers. Java had a built-in solution to this but it was removed."

    >
    This "built-in solution" was and is designed for hash maps, hashCode is not a object identity. I do not understand how "old" way is better for persistence and distributed object.
  6. I agree that hashCode is not the same as object identity. There is nothing in the contract that would indicate that the hashCode could be used as object identity, and it was never meant to be used as one. Two different objects (in content) can have the same hashCode. You must always perform the 'equals' test when the hash codes match to be sure of the 'identity'. (A small example follows below.)
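
    For example (the class name here is just for illustration):

    public class HashCollision {
      public static void main(String[] args) {
        String a = "Aa";
        String b = "BB"; // "Aa" and "BB" happen to have the same String hashCode (2112)
        System.out.println(a.hashCode() == b.hashCode()); // true
        System.out.println(a.equals(b));                  // false: the equals test is still needed
      }
    }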
  7. hashCode and equals provider missing

    Indeed, the .net platform imposes an Object.GetHashCode() contract similar to the "old" java Object.hashCode() contract:
    <quote>The hash function must return exactly the same value regardless of any changes that are made to the object.</quote>
    I don't know which is better, but I know for sure that the hash-based collections in the jdk certainly miss the flexibility that the .net counterparts get from the interface System.Collections.IHashCodeProvider. Rather than providing such flexibility, jdk1.4 has two different kinds of hashtables: the old HashMap/Hashtable, and another one, IdentityHashMap, that breaks all the hashtable contracts. If Jdk 1.4 breaks its own contracts, how is the casual java programmer encouraged to respect them?
    Let's suppose we have some entity beans, and want them as keys in a hashtable, without any wrapper. Well, this is impossible using the standard HashMap/Hashtable/IdentityHashMap implementation. If we had an IHashcodeProvider, we could simply implement hashCode() in terms of EJBObject.getPrimaryKey(), and equals() in terms of EJBObject.isIdentical(EJBObject obj). Much simpler I think.
    Another equality/hashcode feature that I miss from .net is the static Object.Equals(object, object) method. The construct x == null ? y == null : x.equals(y) appears all over the place in the jdk, but no one bothered to make it a public static method usable by everyone (a sketch of such a helper is at the end of this post).
    Anyway, the jdk collections are overall better structured than the .net counterparts. Especially the collection names are more suggestive. I think the worst .net name for a collection is described here:
    <quote>A SortedList is a hybrid between a Hashtable and an Array.</quote>
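
    A sketch of the kind of helper I mean (the class name is made up; nothing like this exists in the jdk today):

    public final class ObjectUtils {
      private ObjectUtils() {}

      // Null-safe equality test: the construct mentioned above, written once.
      public static boolean equals(Object x, Object y) {
        return x == null ? y == null : x.equals(y);
      }
    }

    You would then just write ObjectUtils.equals(a, b) instead of repeating the ternary everywhere.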
  8. Hashtables/Hashmaps

    Doesn't this rule change also break Sun's implementation of hashtables and hashmaps (if you change the hashcode of something used as a key after it has been added)? Don't we just need to be aware that things used as keys for persistence (or even hashtables) should be immutable?
  9. Does not break hashtable

    Doesn't this rule change also break Sun's implementation of hashtables and hashmaps

    It does not break them, because of the "equals" part of the contract. Store something to hashmap with key k1. Then change the key so that hashcode changes (you get k2). According to the contract this means that k1.equals(k2) == false. And if the two keys are not equal, they should not return the same object from the map.

    Personally I don't see how a hashcode could be used as a key, since nobody is saying that it should be unique. How can you use a non-unique key to find "the same object in another system"? Also, the JDK 1.1 contract says that the hashcode may change between executions. Does not sound like an ultimate way of retrieving persisted objects.
  10. Does so break Hashtable

    The java.util.Hashtable implementation uses the hashcode to figure out which of many buckets a key object belongs in. If the hashcode of a key changes after it's placed in its bucket, you're not likely to ever find it again. More than likely, with the new hash code you'd expect to look in a different bucket to find it, and it won't be there. If you look in the old bucket you won't find it there either, because it won't be equal any more (since it has a different hash code than it did). Thus Hashtable and many other containers optimized with calls to the hashcode method don't work well if you change the hashcode of keys after the objects have been inserted.

    Of course, if it hurts when you do that, then don't do that. Don't use mutable objects as hashtable keys (or at least don't change them), and the spec (or at least the javadoc for Hashtable et al) probably should point that out.
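
    A short demonstration, using a hypothetical mutable key class (purely illustrative):

    import java.util.HashMap;
    import java.util.Map;

    class MutableKey {
      int value;
      MutableKey(int value) { this.value = value; }
      public int hashCode() { return value; } // depends on mutable state
      public boolean equals(Object o) {
        return (o instanceof MutableKey) && ((MutableKey) o).value == value;
      }
    }

    public class LostKeyDemo {
      public static void main(String[] args) {
        Map map = new HashMap();
        MutableKey key = new MutableKey(1);
        map.put(key, "payload");
        key.value = 2; // hashCode changes while the key is in the map
        System.out.println(map.get(key));               // null: with the new hash code we look in a different bucket
        System.out.println(map.get(new MutableKey(1))); // null: old bucket, but the stored key is no longer equal
      }
    }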
  11. I can't understand why the author thinks "hashCode" should serve the purpose of determining the identity of an object. It's perfectly acceptable for the hashCode method of two non-identical objects to return the same value. For any object that has more than 2^32 possible states (e.g. String, Long, Double, ...) you can't avoid this even if you want to.

    Imposing the restriction that hash-codes should never change, along with the contracts of the hashCode and equals method, would mean that the hash-code can never depend on any fields that may change. This would force all non-immutable objects (which probably make 95% of the JDK, for instance) to leave a big part of their identity (if not all of it) out of the hash-code. This means poor performance in hashtables, which are the main users of the hashCode method.

    If you want identity to be content-independent (an object retains identity even after its content is changed) you need to specify some other means of determining identity, such as a "special" identity field (e.g. the PK field of an entity bean). In such objects, the equals and hashCode methods can depend on the identity field instead of the content. But this only makes sense in such objects, which are a very small part of the spectrum of Java objects. Changing the general equals and hashCode contracts to accommodate a few persistence engines doesn't make any sense, IMO. Engines can provide interfaces that override and redefine these contracts, if they wish. Although I don't think hashcode is the right place to do that anyway.

    Gal
  12. It makes sense

    The reason for this change is simple: most of the time, you don't need object identity in Java, even when you store your object in a collection.

    Take a look at your code and see how many of your objects are anonymous... I would bet, most of them. Therefore, it's only logical to relax the restriction on the identity of objects since it makes it easier to reuse existing objects (e.g. the GridBagConstraints object, or more generally, pooled objects).

    --
    Cedric
  13. Object Identity

    I have tried to parse Sam's rather confused entry to see what he is getting at.

    As others noted, the pre-1.2 definition of hashcode() never would have yielded the object identity concept that Sam describes.

    I think Sam wants a function that presumes immutability and generates an ID that is unique and consistent for each combination of parameters that are passed into the constructor.

    So, what datatype would such a function return? What if my object's constructor contains 20 Strings, each 65,000 characters in length? A long or an int will be slightly inadequate for encompassing all possible unique states of the object.

    Constructing Sam's object identity, and implementing all necessary operations on the identity, is going to be very expensive for objects of arbitrary complexity. And presumably, the expensive key computation will have to occur at the time that the constructor is invoked to preserve immutability.

    Sam's function would be useful in some scenarios, but it is not at all clear to me that it belongs in the JDK.
  14. Absolute rubbish!

    The original definitions of hashcode and equals were not logically consistent: if the 'value' of the object changes then equals must take this into account, but the original definition states that the hashcode must NOT change. Then how can the hashcode be the same for all objects that are 'equal'? In general, it can't.

    The spec was updated to reflect that the hashcode MUST be consistent with equals.

    The hashcode is not for object identity, as hashcodes are not meant to be unique across objects that are unequal. The hashcode is for when you want a HASH of the object's identity. There has never been a way to obtain an object's identity (in a value sense) in Java, pre- or post-1.2.

    - Michael.

    PS. A method like getPrimaryKey or getIdentity is what Sam is after. Like a primary key, the value would not strictly need to be unique across all objects in the VM, but would need to be unique for all unequal objects of a particular class (or perhaps superclass, if inheritance gets involved). The value would need to be consistent across VM restarts, and therefore based on the data inside (the state of) the object. There are tricky issues here, therefore I think Java is correct in not mandating such a concept for all objects.

    Also, does .Net really provide such a feature, or is Sam confused about what .Net really provides with its hashcode? If .Net has a hashcode method, then I am positive that it's not uniquely the object's value's identity.
  15. I'm going to try and simplify what I was trying to say in my blog post, as it seems that it might have been a little less clear than I would have liked. Let's start with why hashCode and equals exist and why they are important methods on objects. Hashcode was created to allow objects to be used as keys into hashtables. Equals is designed to determine if two objects are equivalent. If we look at the default implementations of these two methods, we see that they generate completely anonymous objects but still satisfy both contracts. Equals always compares the references, and hashCode always returns a value based off of that reference that never changes no matter how the data in the object changes. This is not very useful when you want to use this object as a key into a hashtable, because you will only be able to use that *exact* anonymous object to find its value in the table. This is why people override equals and hashCode for things that are going to be used as keys into hashtables.

    Let's look at String. Strings are equal if their characters are equal; that means that I can make a String and you can make a String and they are equal even if it came from another VM. If I use it to look up something in a hashtable, I can be assured that if someone used the same characters in constructing their string I will find the data that they put in it.

    The difference comes when someone creates a mutable object but has implemented equals and hashCode much like String, i.e. underlying field equivalency determines equality and also generates the same hashcode for equal objects. Allowing the equality to mutate over the lifetime of the object ensures that the object cannot be easily used as a key. Let's say the object was Customer and there are some fields on that Customer like Id, Name, and Address. If you use all the fields for equals, you will see that if you change the Address of the Customer they will no longer be equal to one another. If you just use the Id as the key, then you can change the Address without corrupting the object's identity. Additionally, Customers with the same Id in another VM will be equivalent, and that will allow you to load the other information if it is stored elsewhere. This translates directly into a persistence engine for Java objects if you follow this more specific (but equivalent) rule:

    All constructors must define the identity of the object. They set all the fields that are used in the computation of the equals method and the hashCode method, which depend on nothing else but those field values. They could be final because they cannot be changed after construction.

    In our example the constructor would look like:

    Customer(int id) { this.id = id; }

    I'm surprised that my blog entry generated enough interest for this forum, but happy that it got us talking about object identity. I could have been clearer if I had known the wide distribution that it was going to get. I suppose that if the community decided on some marker interface like Identifiable that would specify that your object is not anonymous and plays by these rules, it would be easy for most containers to take that hint and generate the appropriate persistence and distribution code, whether they be EJB, Hibernate, etc. This, of course, does not answer the questions about querying that need to be answered in any persistence system, but it gets further down the road to consensus. (A fuller sketch of the Customer example is below.)
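
    A slightly fuller sketch of what I mean (the field names are just for illustration):

    public class Customer {
      private final int id;    // identity: fixed in the constructor
      private String name;     // mutable, not part of the identity
      private String address;  // mutable, not part of the identity

      public Customer(int id) { this.id = id; }

      public boolean equals(Object o) {
        return (o instanceof Customer) && ((Customer) o).id == id;
      }

      public int hashCode() { return id; }

      // Changing these does not affect equals/hashCode, so the object remains a safe key.
      public void setName(String name) { this.name = name; }
      public void setAddress(String address) { this.address = address; }
    }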
  16. Sam:
    This is not very useful when you want to use this object as a key into a hashtable because you will only be able to use that *exact* anonymous object to find its value in the table. This is why people override equals and hashcode for things that are going to be used as keys into hashtables.

    I'm not so sure about that. The main reason why people override hashCode() is probably performance. Like you said, the default behavior works fine, and a hashCode() method that always returns 1 is perfectly compliant, although it will obviously yield terrible performance.


    All constructors must define the identity of the object. They set all the fields that are used in the computation of the equals method and the hashCode method, which depend on nothing else but those field values. They could be final because they cannot be changed after construction.

    This really doesn't belong in the specification. It might come in handy if you are trying to persist your objects but for all the other cases (99% of the Java code out there?), it is an unacceptable and useless requirement. As I said above, most Java objects are anonymous and have no need for an id. And if they did, hashCode() would be the worst place to store it.


    I suppose that if the community decided on some marker interface like Identifiable that would specify that your object is not anonymous and plays by these rules it would be easy for most containers to take that hint and generate the appropriate persistence and distribution code whether they be EJB, Hibernate, etc.

    I much prefer this solution: have a marker interface with a _getId() method and everybody is happy. But once again, this doesn't belong in the JDK.

    --
    Cedric
  17. It does belong in the JDK

    Well, I'll just have to disagree with this. I think it was the Java language designers' original intention that the identity of objects should be immutable, for performance and to make good keys. I think the change happened first in the class library, with broken things like Date, and then had to be propagated to the core Object semantics. If it were in the JDK we would be better off. Everyone having their own notion of this is not good. I'll just be quiet now and implement the same semantics 15 different ways for 15 different persistence systems.
  18. I don't think the reason that people override hashCode is for performance. It's to make it do what people expect.

    The default implementation of System.identityHashCode() does not, I believe, return 1. Otherwise you would be correct and it would be for performance.

    The default implementation on a Sun JDK is to return the object address. Therefore (unless you are lucky and get the same result modulo the size of the map) by instantiating a new instance of the key, you won't get the original value put in by a different instance of the semantically same key.

    For example:
    Map map = new HashMap(); // assume Key and Value do not override equals/hashCode
    Value v1 = new Value(1,1);
    map.put(new Key(1,2), v1);
    Value v2 = (Value) map.get(new Key(1,2));
    System.out.println("Are they the same?: " + (v1 == v2));

    This will say false. In fact, if the hashmap was empty, v2 will be null.

    Robert
    Let's say the object was Customer and there are some fields on that Customer like Id, Name, and Address. If you use all the fields for equals you will see that if you change the Address of the Customer they will no longer be equal to one another. If you just use the Id as the key, then you can change the Address without corrupting the object's identity.

    Customer customer1 = new Customer( "Sam", "123 Main St." );
    Customer customer2 = new Customer( "Sam", "456 Elm St." );

    If customer1.equals( customer2 ) should be true, then the problem you describe above is simply a poorly defined equals() implementation. Address should not be used in computing either equals() or hashCode().

    If customer1.equals( customer2 ) should be false, then changing the Address object will not 'corrupt the object's identity', because you have actually changed the object's identity, and the equals() method should return false.

    =====

    I suppose that if the community decided on some marker interface like Identifiable that would specify that your object is not anonymous and plays by these rules...

    Why is this necessary? If your object 'plays by these rules,' why not simply use your unique object identifier in your equals() and hashCode() implementation? The important thing is to not make JDK-level changes that enforce these rules in inappropriate contexts.
  20. I'm going to try and simplify what I was trying to say in my blog post as it seems that it might have been a little less clear than I would have liked.

    I don't think it was unclear. I understand the same things from your latest post, and I still think it's wrong.

    Let's start with why hashCode and equals exist and why they are important methods on objects. Hashcode was created to allow objects to be used as keys into hashtables. Equals is designed to determine if two objects are equivalent. If we look at the default implementations of these two methods, we see that they generate completely anonymous objects but still satisfy both contracts. Equals always compares the references, and hashCode always returns a value based off of that reference that never changes no matter how the data in the object changes. This is not very useful when you want to use this object as a key into a hashtable, because you will only be able to use that *exact* anonymous object to find its value in the table. This is why people override equals and hashCode for things that are going to be used as keys into hashtables. Let's look at String. Strings are equal if their characters are equal; that means that I can make a String and you can make a String and they are equal even if it came from another VM. If I use it to look up something in a hashtable, I can be assured that if someone used the same characters in constructing their string I will find the data that they put in it.

    Agreed up to here.

    The difference comes when someone creates a mutable object but has implemented equals and hashCode much like String, i.e. underlying field equivalency determines equality and also generates the same hashcode for equal objects. Allowing the equality to mutate over the lifetime of the object ensures that the object cannot be easily used as a key.

    That is correct. The API entry for java.util.Map makes a note of this: "Note: great care must be exercised if mutable objects are used as map keys. The behavior of a map is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is a key in the map". Still, in certain cases the performance improvements outweigh the hazards. Regardless of hashCode, the equals method is used in a variety of contexts where mutable objects are perfectly acceptable. And if equality can change with the object's content, what good does a constant hashCode do? Mutable objects are not often used as keys, but when they are I think you would want the hashCode to represent the value as well as possible, to improve performance.


    Let's say the object was Customer and there are some fields on that Customer like Id, Name, and Address. If you use all the fields for equals, you will see that if you change the Address of the Customer they will no longer be equal to one another.

    Yes, of course, that is content-wise equality. You may choose to implement it, it often makes sense.

    If you just use the Id as the key, then you can change the Address without corrupting the object's identity.

    Changing the Address does produce a *different object*. Whether you want to consider that object identical to some other object with the same Id is your choice. It's certainly not up to the JLS to *mandate* this choice over other choices, which are equally useful.
    I personally think that in most cases, even in the Customer you described above equality should be implemented content-wise. What if I want to update the database with the customer data I got from a user, and first I want to make sure it's different from the information currently in the database? (to avoid unnecessary updates). I would like to be able to check to see if the objects are "equal", and only update the DB if they are not. Of course you could add another method, say contentEquals. But I think this goes against the standard interpretation of the equals method. I'd rather add an identityEquals method to compare identities. But that's really a per-class design choice.

    Additionally, Customers with the same Id in another VM will be equivalent, and that will allow you to load the other information if it is stored elsewhere.

    Same is true for content-wise comparisons.

    This translates directly into a persistence engine for Java objects if you follow this more specific (but equivalent) rule:

    All constructors must define the identity of the object. They set all the fields that are used in the computation of the equals method and the hashCode method, which depend on nothing else but those field values. They could be final because they cannot be changed after construction.


    This may be a nice rule for certain types of objects (even though, as I mentioned, I still see a use for content-wise comparisons). But these objects are a small niche, certainly not representative of the full spectrum of Java objects. Are you seriously suggesting that the equals method of StringBuffer should not rely on its content? That two lists should be equal regardless of their elements? Your rule certainly implies that.
    I'm not saying persistence frameworks can't refine the contracts of equals and hashCode to match their needs. But you have to look beyond the niche of persistence frameworks. The rules you are suggesting are not, by any reasonable definition, sufficiently general to support the needs of all Java objects (which is exactly the job of the Object.equals and Object.hashCode methods).


    In our example the constructor would look like:

    Customer(int id) { this.id = id; }
     
    I'm surprised that my blog entry generated enough interest for this forum, but happy that it got us talking about object identity. I could have been clearer if I had known the wide distribution that it was going to get. I suppose that if the community decided on some marker interface like Identifiable that would specify that your object is not anonymous and plays by these rules, it would be easy for most containers to take that hint and generate the appropriate persistence and distribution code, whether they be EJB, Hibernate, etc. This, of course, does not answer the questions about querying that need to be answered in any persistence system, but it gets further down the road to consensus.


    I personally don't think redefining equals and hashCode in a marker interface would solve the problem. I think a better route is to define something like

    public interface Identifiable {
      Object getIdentity();
    }

    where two Identifiable objects with the same identity would return Objects that are equal.

    I think this handles the problem in a way that is much more natural for most Java classes, and still allows the classes themselves to implement equals and hashCode in a different manner, which may be useful for the designers of the class. If you wish, you can still implement identity-wise equals and hashCode, and simply "return this" in getIdentity (a small sketch follows below).
    But this is a matter for a separate debate. I think most of the criticism of the blog (at least mine) was directed not at the contract you are describing, but at the claim that it is the correct contract for java.lang.Object (rather than some more specific class).
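
    For example (Customer here is purely illustrative), a class could keep content-wise equals/hashCode and still expose a stable identity:

    public class Customer implements Identifiable {
      private final Integer id;
      private String name;

      public Customer(Integer id, String name) {
        this.id = id;
        this.name = name;
      }

      // Identity stays the same even when the content changes.
      public Object getIdentity() { return id; }

      // Equality can still be content-wise, as the class designer prefers.
      public boolean equals(Object o) {
        if (!(o instanceof Customer)) return false;
        Customer c = (Customer) o;
        return id.equals(c.id) && (name == null ? c.name == null : name.equals(c.name));
      }

      public int hashCode() { return id.hashCode(); }
    }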

    Gal
  21. I pretty much agree with everything you have said here, there is no going back at this point and I don't expect people to. The blog entry was to point out what I believe the original intent was and how it has changed over time. I guess we need Gosling and Joy on here to tell us what they were thinking when they originally designed the language versus the current implementation.

    As for your particular example of StringBuffer, I might argue, if I really felt strongly about it, that no two StringBuffers would be .equals() and you would have to compare the Strings that they produce if you wanted to compare their content for equality. There are a lot of simplifications you can make once you mandate the immutability of object identity. Certainly things like thread safety are much easier to handle if mutable content is left out of the equals/hashcode methods. If you want to see some disgusting code that tries to handle such things, look at Hashtable.hashCode():

            int h = 0;
            if (count == 0 || loadFactor < 0)
                return h; // Returns zero

            loadFactor = -loadFactor; // Mark hashCode computation in progress
            Entry tab[] = table;
            for (int i = 0; i < tab.length; i++)
                for (Entry e = tab[i]; e != null; e = e.next)
                    h += e.key.hashCode() ^ e.value.hashCode();
            loadFactor = -loadFactor; // Mark hashCode computation complete

            return h;

    Yuck! I am sure it gets much worse if your hashCodes depend on mutable objects that you can retrieve from the class's interface and manipulate while the hashcode is being calculated. Synchronization of the hashCode method won't help you here.

    Thanks for the detailed reply!
  22. Just looked up the implementation for equals/hashCode in StringBuffer. There isn't one. It uses reference semantics as I suggested that it should. If you want to compare them you would have to convert them to their immutable value object (String):

    public class TestStringBuffer {
      public static void main(String[] args) {
        StringBuffer sb1 = new StringBuffer("sam");
        StringBuffer sb2 = new StringBuffer("sam");
        System.out.println(sb1.equals(sb2));
        System.out.println(sb1.toString().equals(sb2.toString()));
      }
    }

    Output:
    false
    true

    Obviously that isn't your point, but I thought it was amusing that they don't do that, probably because it was their original intention that mutable types only be compared by reference. Looking closer at the Hashtable implementation the funny shenanigans are for recursion not thread safety, but I can easily imagine both cases.
  23. Just looked up the implementation for equals/hashCode in StringBuffer. There isn't one. It uses reference semantics as I suggested that it should. If you want to compare them you would have to convert them to their immutable value object (String):

    >
    > public class TestStringBuffer {
    >   public static void main(String[] args) {
    >     StringBuffer sb1 = new StringBuffer("sam");
    >     StringBuffer sb2 = new StringBuffer("sam");
    >     System.out.println(sb1.equals(sb2));
    >     System.out.println(sb1.toString().equals(sb2.toString()));
    >   }
    > }
    >
    > Output:
    > false
    > true
    >
    > Obviously that isn't your point, but I thought it was amusing that they don't do that, probably because it was their original intention that mutable types only be compared by reference. Looking closer at the Hashtable implementation the funny shenanigans are for recursion not thread safety, but I can easily imagine both cases.

    You got me there, that wasn't what I meant. I'm glad you still understood my point. Replace "StringBuffer" with "Vector".
    I made my last post before I saw this post, so I also pointed out the recursion-vs-multithreading bit, I hope you don't consider this plagiarism :)

    Regards
    Gal
  24. I believe that it was the original intention that equals()/hashCode() would only be overridden for immutable objects and other objects would be treated with reference semantics as reflected in the java.lang classes. Obviously that isn't the case today, nor was it even the case for some of the original standard class libraries. Java actually does have another way of specifying content comparisons -- Comparable -- but it never caught on the way equals() did since it is a java.util interface.

    I'll summarize my position by saying that it would be really nice if Java had a standard way of specifying that something is an entity and that it has an object identity separate from its reference that is immutable for that object. I just wish it had been there from the beginning.
  25. I believe that it was the original intention that equals()/hashCode() would only be overridden for immutable objects and other objects would be treated with reference semantics as reflected in the java.lang classes. Obviously that isn't the case today, nor was it even the case for some of the original standard class libraries. Java actually does have another way of specifying content comparisons -- Comparable -- but it never caught on the way equals() did since it is a java.util interface.

    >
    > I'll summarize my position by saying that it would be really nice if Java had a standard way of specifying that something is an entity and that it has an object identity separate from its reference that is immutable for that object. I just wish it had been there from the beginning.

    Definitely, I agree. I think a standardized interface would serve this purpose well.

    Gal
  26. I pretty much agree with everything you have said here, there is no going back at this point and I don't expect people to. The blog entry was to point out what I believe the original intent was and how it has changed over time. I guess we need Gosling and Joy on here to tell us what they were thinking when they originally designed the language versus the current implementation.

    Classes like java.lang.StringBuffer have been around since JDK 1.0, and as far as I know its "equals" method was never content-independent, so to me it seems highly unlikely that this was what Gosling and Joy meant. Even if they didn't design the StringBuffer class (which I doubt, since it has a language-internal use in the bytecode interpretation of String concatenation), I can't believe they meant for its equals method to return a value independent of the content. It just doesn't make any sense IMO.

    As for your particular example of StringBuffer, I might argue, if I really felt strongly about it, that no two StringBuffers would be .equals() and you would have to compare the String that they produce if you wanted to compare their content for equality.

    Making a new String instance for every comparison wastes a lot of memory. Worse, it makes the API completely counter-intuitive (at least according to my intuition).

    There are a lot of simplifications you can make once you mandate the immutability of object identity. Certainly things like thread safety are much easier to handle if mutable content is left out of the equals/hashcode methods.

    Well, yes, code that only touches final fields tends to work more smoothly with multiple threads. This has little to do with equals/hashCode: any method that only touches final fields works better with multiple threads. That's not a reason not to use non-final fields. Otherwise, by that logic, no method should use non-final fields.

    If you want to see some disgusting code that tries to handle such things, look at Hashtable.hashCode():
     
             int h = 0;
             if (count == 0 || loadFactor < 0)
                 return h; // Returns zero
     
             loadFactor = -loadFactor; // Mark hashCode computation in progress
             Entry tab[] = table;
             for (int i = 0; i < tab.length; i++)
                 for (Entry e = tab[i]; e != null; e = e.next)
                     h += e.key.hashCode() ^ e.value.hashCode();
             loadFactor = -loadFactor; // Mark hashCode computation complete
     
      return h;
     
    Yuck! I am sure it gets much worse if your hashCodes depend on mutable objects that you can retrieve from the class's interface and manipulate while the hashcode is being calculated. Synchronization of the hashCode method won't help you here.


    Not sure where you got that ugly piece of code. I only have JDK1.3 and up installed with sources here, and both have the following:

        public synchronized int hashCode() {
            int h = 0;
            Iterator i = entrySet().iterator();
            while (i.hasNext())
                h += i.next().hashCode();
            return h;
        }

    That's not too bad.

    Anyway, the "ugliness" of this code has nothing to do with multithreading. What the version you gave was trying to do is prevent cycles, where a hashtable containing itself leads to recursive hashCode calls. I don't see how assuming that there are no content comparisons makes any difference for the implementation of this method. A content-free hashCode doesn't fix this, as the following code demonstrates:

      Hashtable table = new Hashtable();
      MyClass mc = new MyClass(table);
      table.put(mc, new Object());

    By your definition, the hashCode for MyClass may depend on table, because it is passed in through the constructor. So a cycle can still emerge. Anyway, this is a highly specific edge case. The java.util.Map interface states that the results in such a case are not defined. The same thing can happen with equals, or any other method whose semantics imply recursive execution down the object graph (assuming the graph has cycles).
    If the implementer of the Hashtable class felt that this is a substantial problem, he/she could implement hashCode to be content-independent as you suggest. Or he/she could work around the problem as in the code example you gave. You have still not given any reason to restrict the equals/hashCode contract *a priori*. The current contract allows content-independent implementations, it just doesn't *mandate* them.

    Regards
    Gal
  27. misunderstanding

    This seems to be a misunderstanding of reference vs. value semantics. If you keep a default hash code / equals implementation, it's a reference semantic, i.e. pointer-based. If you override it, it's value-based, i.e. the pointer is irrelevant.

    The JDK 1.0/1.1 definition is too strict about not allowing modifiability of values; it implies that all value objects are immutable. But values can change if it is a mutable object - think of StringBuffer. Immutability was never the intent of hashcode - read the JLS. The C# definition is merely a relaxed statement of the JDK 1.2 one; the point really is "how do you calculate equality - by pointer or by contents".

    The stuff about EJB primary keys is also confused. The hashcode value does not survive VM startups, whereas an EJB primary key is a persistent identifier. Distributed systems do *not* have traditional notions of object identity because memory models do NOT cross network boundaries. Persistent identifiers are a data theory problem, not a programming model problem. As for transient objects, RMI created a very workable approach to this memory model difference with Remote vs. Serializable objects to distinguish semantics, but it took quite a few updates to get it right (which is partially why EJB doesn't really adopt the complete model - the other reason was pressure to adopt IIOP which doesn't support distributed garbage collection).
  28. System.identityHashCode() ??? That tends to provide the non-persistent single-JVM object ID for the lifetime of the object..

    There's also a JVM id, although I can't remember where it comes from. Maybe RMI?

    So the combination provides a non-persistent OID.

    There is not and never has been a Java notion for a persistable/persistent OID. Some higher level specs, such as EJB provide long-term and serializable handles (e.g. session EJB handle) and keys (e.g. entity EJB primary key).
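
    A rough sketch of that combination (nothing standard; the class name is made up, and note that identityHashCode is not guaranteed to be unique):

    import java.rmi.dgc.VMID;

    public class TransientOid {
      private static final VMID VM_ID = new VMID(); // RMI's per-VM identifier

      public static String oidFor(Object o) {
        return VM_ID + "/" + System.identityHashCode(o);
      }
    }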

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Clustered JCache for Grid Computing!
  29. If you really want it . . .

    If you really want the old functionality, get it here. The other posters are correct that hashCode was never meant to identify an object. The other interesting note here is that you can test instance equality with == and equality of two objects with .equals(). This is one of the first hard lessons you must learn in Java.
  30. Oid Implementation

    I think what you need is an identity for every single Object?
    So why not oid = f(currentTimeMillis, className)? That's enough.
    But I don't know whether this feature should be a core built-in.
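
    Taken literally, that would be something like the sketch below (note that two objects of the same class created in the same millisecond would get the same oid, so in practice you would add a counter or an address, as discussed earlier in this thread):

    public class MillisOid {
      public static String oidFor(Object o) {
        return o.getClass().getName() + "@" + System.currentTimeMillis();
      }
    }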