HIGH/LOW Singleton+Session Bean Universal Object ID Generator

J2EE patterns: HIGH/LOW Singleton+Session Bean Universal Object ID Generator

  1. Universal Object ID Generator.
    An Enterprise-ready, Simple, Performant, Scalable, Platform-independent EJB Primary Key Generator

    Forenote: this article refers extensively to Scott Ambler's article "Enterprise-Ready Object IDs" and to TheServerSide's pattern "Entity Bean Primary Key Generator". It is strongly advised to read both before going further.


    THE PROBLEM: OBJECT ID... WHY DO I NEED ONE?


    Well, as Scott Ambler writes, like it or not, most of us developing in the EJB world use relational databases to store objects. And whereas relational databases need keys, objects don't.

    Keys that have business meaning are intrinsically a bad idea since the meaning might change, therefore the database representation might need to change, which means... trouble. Hence the need for some meaningless key.


    WHY EXISTING SOLUTIONS DO NOT WORK?


    We want an Enterprise-ready, simple, performant, scalable, platform-independent (aka vendor-independent) OID generator. So what's wrong with what is already there?

    DATABASE CENTRIC SURROGATE KEYS
    Incremental keys, when they are supplied by your database, are vendor specific. Their implementation changes from vendor to vendor and therefore using them induces a certain degree of vendor lock-in, which is not always easy to overcome.

    UUIDs, GUIDs AND OTHER UNIVERSAL STUFF
    The Open Software Foundation's Universally Unique IDentifiers (UUIDs), Microsoft's Globally Unique IDentifiers (GUIDs) and other unique ID generators (such as the RMI-based UID generator) have some or all of the following drawbacks (refer to Scott Ambler's article for more info):
    - They are often predicated on the concept of your application actually being in communication with your database.
    - They break down in multiple-database scenarios.
    - Obtaining a key causes a dip in performance.
    - There are minor compatibility glitches.
    - They are not scalable. Since generation of GUIDs and UUIDs depends on a timestamp with a precision of a thousandth of a second, there is a risk of overlap, however minuscule.


    SO WHAT WORKS, THEN?


    The proposition Scott Ambler made in his article was to use a HIGH-LOW strategy. The key consists of two logical parts: a HIGH value that comes from a source common to all object ID generators, and a LOW value that comes from your local object ID generator.

    The HIGH values are more expensive to retrieve as they have to be fetched from a central source available to all, but are unique to each OID generator. The LOW value is initialised and incremented by the generator itself, locally, which makes it easy and fast to manage and obtain. The concatenation of the HIGH and the LOW makes a unique key.

    This provides an enterprise-wide unique object identifier... but not a universal one. Two different companies could use this very method to generate their keys and end up with the same keys, unique in their own little world but duplicated in the real world. To make an Object ID truly Universally Unique, we could add an identifier unique to the company, such as the company domain name.


    THE IMPLEMENTATION


    The code is available at http://www.geocities.com/ehsforward . It has been tested on WebSphere 3.5 and it works great.

    The solution provided here uses a combination of a singleton and a stateless bean:
    - the session bean fetches the high key from the database
    - the singleton is responsible for: 1) asking the session bean for the next HIGH value when necessary; 2) determining the LOW value and the unique identifier; 3) building and returning a full UOID (HIGH + LOW + identifier).

    The solution has been implemented using the following guidelines:

    - Use a key composed of a 112-bit HIGH key, a 16-bit LOW key and a unique enterprise identifier (as per Scott Ambler's article).

    - Use a byte array to represent the HIGH and LOW keys. The byte arrays are encapsulated in a Key class that implements various functions such as incrementing the key, converting back and forth to String, etc. I think this solution is acceptable performance-wise, although I did not study the problem thoroughly.

    - Store the key as a String in the database (as per Scott Ambler's article). The bytes are converted to a simplified hex format. This is a single source implementation (we avoided the multiple sources implementation and its associated algorithm).

    - The UOID generator is not class specific. It generates UOID using the same rules for all classes and does not contain class specific information (as per Scott Ambler's article).

    - Create the HIGH key automatically in the database if it is not found.

    - Use singleton/factory pattern. This means that there will be one factory per JVM.
    [ NOTE: As mentioned in various postings, some EJB servers, such as SilverStream and Gemstone, create and destroy JVMs dynamically. I took the same position as Scott Ambler when he says in his article: "Yes, this is wasteful, but when you are dealing with a 112-bit HIGHs, who cares?" If this is really an issue, decisions might need to be taken at the level of such servers' configuration, but this is a separate discussion altogether.]

    - To avoid hotspot problems in database index searches, caused by keys that start with a long sequence of identical characters, the key is incremented in reverse order. This is a first attempt to prevent the problem from appearing in the early stages of the database's life.

    This means that instead of incrementing like this:

    00000000 00000000 ... 00000000 00000000 -> 00000000 00000000 ... 00000000 00000001
    and:
    00000000 00000000 ... 00000000 11111111 -> 00000000 00000000 ... 00000001 00000000

    the Key class would increment the following way:

    00000000 00000000 ... 00000000 00000000 -> 00000001 00000000 ... 00000000 00000000
    and:
    11111111 00000000 ... 00000000 00000000 -> 00000000 00000001 ... 00000000 00000000
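
    The reverse-order increment illustrated above can be sketched as follows. This is only a minimal illustration in modern Java: ReverseKeySketch and increment are hypothetical names, not the article's Key class, and the choice to throw when the key space is exhausted is an assumption.

```java
import java.util.Arrays;

/** Hypothetical helper (not the article's Key class): increments a byte-array key from index 0. */
final class ReverseKeySketch {
    private ReverseKeySketch() {}

    /** Returns a copy of the key incremented by one, carrying from the first byte towards the last. */
    static byte[] increment(byte[] key) {
        byte[] next = Arrays.copyOf(key, key.length);
        for (int i = 0; i < next.length; i++) {
            next[i]++;              // a byte holding 0xFF wraps around to 0x00
            if (next[i] != 0) {
                return next;        // no wrap-around, so no carry into the next byte
            }
        }
        // Every byte wrapped: assumption, treat this as an error rather than restarting at zero.
        throw new IllegalStateException("key space exhausted");
    }
}
```

    Because the first byte changes on every increment, consecutive keys differ in their leading characters once converted to a String.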

    - Looking for a new HIGH key involves retrieving the value from the database, incrementing it and updating the row with the incremented value. While performing this we need to have exclusive access to the table row.

    Using an Entity Bean with a serializable transaction for this is inadequate for several reasons:
    1) to guarantee exclusive access to the table during a serializable transaction, your EJB server has to implement a pessimistic concurrency control algorithm, and some do not (Oracle for example);
    2) performance-wise, serializable transactions are costly;
    3) getting the next HIGH key looks more like a service than an object (it "provides" the next available key) and therefore should really be implemented as a Session Bean.

    The only cross-server compliant way to ensure exclusive access to the table row is to force an exclusive lock on the database using a SELECT FOR UPDATE statement and then use an UPDATE statement to store the incremented value of the key. This solution uses one simple transaction and does not require the Bean transaction to be serializable, which gives it better performance. In fact, since the isolation is taken care of at the database level, the lowest isolation level (read uncommitted) is acceptable. Also, this solution does not perform a lock promotion, meaning there is no fear of deadlock.

    We only need a stateless Session Bean as there is no state to remember.

    To make it highly available we make sure a new transaction is created when using the Session Bean.
    This makes the overall solution lightweight and efficient.

    To summarise, the alternative is:
    1) Use a stateless Session Bean.
    2) Use a SELECT FOR UPDATE clause followed by an UPDATE clause to get the HIGH key and update it.
    3) Set the transaction isolation level to TX_READ_UNCOMMITED and its attribute to TX_REQUIRES_NEW.
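
    The steps summarised above can be sketched in plain JDBC as follows. This is not the code from the download: the class name, the table UID_HIGH, its columns HIGH_KEY and KEY_NAME, and the increment callback are all assumptions made for illustration. The sketch commits on the connection itself; inside a bean with container-managed transactions the commit and rollback would normally be left to the container.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.function.UnaryOperator;

/** Hypothetical sketch of fetching and advancing the HIGH key under a row lock. */
final class HighKeyFetchSketch {
    // Table and column names are assumptions, not taken from the article's schema.
    static final String SELECT_SQL =
            "SELECT HIGH_KEY FROM UID_HIGH WHERE KEY_NAME = ? FOR UPDATE";
    static final String UPDATE_SQL =
            "UPDATE UID_HIGH SET HIGH_KEY = ? WHERE KEY_NAME = ?";

    private HighKeyFetchSketch() {}

    /** Returns the current HIGH value and stores the incremented one in the same transaction. */
    static String nextHigh(Connection con, String keyName, UnaryOperator<String> increment)
            throws SQLException {
        boolean previousAutoCommit = con.getAutoCommit();
        con.setAutoCommit(false);                       // keep the row lock until commit
        try {
            String current;
            try (PreparedStatement select = con.prepareStatement(SELECT_SQL)) {
                select.setString(1, keyName);
                try (ResultSet rs = select.executeQuery()) {
                    if (!rs.next()) {
                        throw new SQLException("No HIGH key row for " + keyName);
                    }
                    current = rs.getString(1);
                }
            }
            try (PreparedStatement update = con.prepareStatement(UPDATE_SQL)) {
                update.setString(1, increment.apply(current));
                update.setString(2, keyName);
                update.executeUpdate();
            }
            con.commit();                               // releases the lock
            return current;
        } catch (SQLException e) {
            con.rollback();
            throw e;
        } finally {
            con.setAutoCommit(previousAutoCommit);      // restore the initial setting
        }
    }
}
```

    A caller would pass the connection, the name of the key row and a function that computes the next HIGH value from the current one.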

    - Finally, in the implementation, we make sure AUTOCOMMIT is turned off and then restored to its initial setting. This is because, otherwise, an exception is thrown when using a SELECT FOR UPDATE on an Oracle 8 database, as kindly mentioned by Weicong Wang. UDB 6.1 users beware: tampering with AUTOCOMMIT causes an exception to be thrown if your isolation level is TX_SERIALIZABLE.


    HOW TO USE IT AND HOW DOES IT WORK?

    Let's suppose you want your entity bean to take advantage of the UOID generator. Your entity bean will have a field called uoid. It will have to implement the following code in its ejbCreate method:


         public void ejbCreate() throws javax.ejb.CreateException{
            // get the singleton
            UIDDispenser dispenser = UIDDispenser.getDispenser();

            try{
               uoid = dispenser.getNextId();
            } catch (org.ejbutils.uid.UIDDispenserException e) {
               throw new javax.ejb.CreateException("Problem with the UIDDispenser : " + e.getMessage());
            }
         }

    You can have a number of ejbCreate methods to initialise your entity bean differently. Simply make sure that each such method calls this one first.

    The UIDDispenser builds the UOID as follows. First it checks whether it has an assigned HIGH value.
    - If it has not, it asks the session bean (here called UIDHighKeyGenerator) to provide it with one. The UIDHighKeyGenerator gets the HIGH value from the database table and returns it to the UIDDispenser (in the process it increments the HIGH value in the table, all in a transactionally safe way). The UIDDispenser stores the HIGH value and initialises its LOW value to zero.
    - If it has, it checks that its LOW value has not reached its maximum. If it has, the dispenser gets a new HIGH value from the session bean and initialises its LOW value to zero. If it has not, it increments the LOW value.
    Finally, the UIDDispenser just has to combine HIGH + LOW + unique identifier and return the result as a String.
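
    The flow just described can be sketched as follows. This is not the UIDDispenser source: DispenserSketch is a hypothetical class, the Supplier stands in for the call to the session bean, and the 4-digit hex rendering of the LOW value is an assumption.

```java
import java.util.function.Supplier;

/** Hypothetical sketch of the dispenser logic: fetch HIGH rarely, count LOW locally. */
final class DispenserSketch {
    private final Supplier<String> highSource; // stand-in for the session bean call
    private final int maxLow;                  // largest LOW value before a new HIGH is needed
    private final String identifier;           // enterprise-wide unique identifier
    private String high;                       // null until the first HIGH has been fetched
    private int low;

    DispenserSketch(Supplier<String> highSource, int maxLow, String identifier) {
        this.highSource = highSource;
        this.maxLow = maxLow;
        this.identifier = identifier;
    }

    /** Returns HIGH + LOW + identifier, fetching a new HIGH only when needed. */
    synchronized String getNextId() {
        if (high == null || low >= maxLow) {
            high = highSource.get();           // expensive: goes to the central source
            low = 0;
        } else {
            low++;                             // cheap: purely local
        }
        return high + String.format("%04X", low) + identifier;
    }
}
```

    With a 16-bit LOW, maxLow would be 0xFFFF, so the central source is consulted once per 65536 keys.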


    POTENTIAL DRAWBACKS


    A uoid is a combination of a 112-bit (14-byte) HIGH value, a 16-bit (2-byte) LOW value and a unique identifier. The string representation of a byte used here is made of 2 chars, making the uoid a string of 32 + <length of the unique identifier> characters. Some DB administrators objected to that. I have not studied the problem thoroughly. You should probably check whether this is really a problem and study the trade-offs.
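
    The length arithmetic can be checked with a small sketch. The class name and the plain two-characters-per-byte hex encoding are assumptions (the article's "simplified hex format" may differ in detail); HIGH_KEY_BYTES and LOW_KEY_BYTES mirror the static fields the article mentions on UIDDispenser.

```java
/** Hypothetical sketch of the uoid length calculation. */
final class UoidLengthSketch {
    static final int HIGH_KEY_BYTES = 14; // 112 bits
    static final int LOW_KEY_BYTES = 2;   // 16 bits

    private UoidLengthSketch() {}

    /** Encodes each byte as exactly two lowercase hex characters. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    /** Length of a full uoid: two chars per key byte plus the identifier. */
    static int uoidLength(String identifier) {
        return 2 * (HIGH_KEY_BYTES + LOW_KEY_BYTES) + identifier.length();
    }
}
```

    For instance, with an 11-character identifier such as a short domain name, the column would need to hold 43 characters.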


    CONCLUSION


    This provides us with an Enterprise-ready, simple, performant, scalable, platform/vendor-independent UOID generator.

    I shall insist on the "Enterprise-ready" nature of this solution. If you need a uoid in your application but you do not need it to be unique in the whole world, or 128 bits long so that you never see the end of it, you can still use the implementation provided here. For the first, just change the code so that it does not add the unique identifier to the uoid. For the second, just modify the HIGH_KEY_BYTES or LOW_KEY_BYTES (not recommended for this one) static fields in the UIDDispenser to the value you feel appropriate. A forthcoming version of this implementation will have options to allow you to do all this without modifying the code. Just be careful in your decision: you do not want another Y2K disaster for your application.


    Emmanuel SCIARA

    Threaded Messages (43)

  2. Hi Emmanuel,

    After that discussion thread "Entity Bean Primary Key Generator" I was also trying to come up with a clean solution. My solution differs from yours only by using 2 EJB-s, one session and one entity bean. An entity bean is used to access the high value (in my case) and the session bean provides the low values. This approach solves the problem of transaction not being started when using only one entity bean for UID generation (a new transaction has to be started when a new high value is requested in order to store the new incremented high value).
    The pseudo-code for the entity bean is:


    public class HighKeyGeneratorBean implements EntityBean
    {
        public long high;

    ....

        /**
         * Gets a new High value
         */
        public long getNewHigh()
        {
            ++this.high;
            return this.high;
        }
    }

    This last method will have a transaction attribute: "Requires new"

    The session bean will implement a variation of the HIGH/LOW pattern by allowing the step size for the HIGH value to be set. This allows key generation to be tuned according to the request volume, in order to avoid losing too many low values (when the app. server shuts down, on redeployment ......).
    The pseudo-code for the session bean:

    public class KeyGeneratorBean implements SessionBean
    {
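        private long low;
        private long high;
        private int step;
        private int retries;
        // Home interface of the HIGH entity bean; assigned in ejbCreate()
        private HighKeyGeneratorHome highKeyGeneratorHome;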

        /**
         * Creates a new KeyGenerator
         */
        public void ejbCreate()
        {
            this.low = 0;
            this.step = ((Integer)JNDIHelper.lookup("java:comp/env/HighStep",Integer.class)).intValue();
            this.retries = ((Integer)JNDIHelper.lookup("java:comp/env/RetriesNumber",Integer.class)).intValue();
            this.highKeyGeneratorHome = (HighKeyGeneratorHome)JNDIHelper.lookup("java:comp/env/ejb/HighKeyGenerator",HighKeyGeneratorHome.class);

            this.high = this.getNewHigh(this.retries);
        }

    ......

        /**
         * Gets a new unique identifier
         */
        public long getNewKey()
        {
            ++this.low;
            if (this.low > this.step)
            {
                this.high = this.getNewHigh(this.retries);
                this.low = 1;
            }
            long key = this.high*this.step + this.low;
            return key;
        }

        /**
         * Gets a new High value
         * @param number of retries if transaction serialization problem occur
         * (Oracle jdbc driver throws ORA-08177)
         * @return new High value
         */
        private long getNewHigh(int retry)
        {
            if (retry == 0) throw new GeneralException("KeyGeneratorBean",
                "Failed to get a new High value");
            try
            {
                HighKeyGenerator highKeyGenerator = this.highKeyGeneratorHome.findByPrimaryKey("high");
                return highKeyGenerator.getNewHigh();
            }
            catch (Exception e)
            {
                return this.getNewHigh(retry - 1);
            }
        }
    }

    As you can see the getNewHigh() private method is retrying to get a new high value in case of serialization problem for Oracle database. The getNewKey() method will have a transaction attribute: "Not supported".
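
    To see why keys computed as high*step + low do not collide across HIGH values, here is a small worked sketch. The class and method names are hypothetical; only the formula comes from the pseudo-code above, where low runs from 1 to step.

```java
/** Hypothetical worked example of the key = high * step + low formula. */
final class HighStepRangeSketch {
    private HighStepRangeSketch() {}

    static long key(long high, int step, long low) {
        return high * step + low;
    }

    /** Smallest key handed out for a given HIGH (low == 1). */
    static long firstKey(long high, int step) {
        return key(high, step, 1);
    }

    /** Largest key handed out for a given HIGH (low == step). */
    static long lastKey(long high, int step) {
        return key(high, step, step);
    }
}
```

    With step = 100, HIGH 7 covers keys 701 to 800 and HIGH 8 starts at 801, so two generators holding different HIGH values never produce the same key.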

    This implementation seems to be general enough to be deployed on any app. server. It is also extremely fast because you have a pool of key generators (the session beans).

    I hope to get some feedback on this (negative if it's possible) because I haven't found any drawbacks until now.
  3. Hi Mircea,

    It seems the major difference between the two implementations is the use of entity beans. Have another look at the article: I discuss why the solution I implemented does not go for entity beans (variations across DB vendors in the implementation of serializable transactions, performance, and the fact that getting a high key is more of a service).

    Hope this helps.

    Emmanuel
  4. Why do you generate HIGH from a database?

    I have an implementation that uses a single entity bean that does everything in memory.

    I have an implementation of a UUID generator that is derived from org.w3c.util.UUID that I use for highs.

    You have the flexibility of using a single generator for all your beans or have multiple generators.

    You could also have a stateless session bean as a facade (that is what I do).

    Here is the code of the entity bean :

    ============================================================
    ============================================================
    package com.pyxis.ejb.base.pkGen;

    import java.lang.Math;
    import javax.ejb.EntityBean;
    import javax.ejb.CreateException;
    import javax.ejb.FinderException;
    import com.pyxis.ejb.base.entity.EntityBeanAdapter;
    /**
     * DESCRIPTION:
     * This is a bean managed Entity Bean used to generate unique IDs that
     * serve as primary keys for other entities.
     * It uses a high low pattern as described by Scott Ambler in the
     * following article :
     * http://www.sdmagazine.com/articles/1999/0012/0012p/0012p.htm?topic=uml
     * When a new key is needed, the bean appends the next sequential low value
     * to the current high value. When we reach the maximum for the low value,
     * a new high is generated.
     * We use a UUID generated from the network name of the machine for the high.
     * Since everything is done in memory this is a very efficient pattern.
     * You can use the same generator for multiple even all entities.
     * <p>
     * Copyright: Copyright (c) 2001
     * Company: Pyxis Technologies
     * @author Francois Beauregard
     * @version $Revision: 1.0 $ $Date: 2001/01/22 17:11:00 $
     */
    public class PKeyGeneratorBean
        extends EntityBeanAdapter
        implements EntityBean, PKeyGeneratorBusiness
    {
        /** ID of generator (primary key) */
        private String mGenID = null;
        /** High Value */
        private String mHighValue = null;
        /** Low Value */
        private int mLowValue = -1;
        /** Maximum value for the low. Default is 9999 (4 digits) */
        private int mMaxLowValue = 9999;


        /**
         * Creation of the bean. This bean is never created.
         */
        public String ejbCreate() throws CreateException
        {
            throw new CreateException(
                "This bean cannot be created. Use the findByPrimaryKey method.");
        }
        public void ejbPostCreate() {}

        /**
         * Since everything is done in memory, nothing to do to find the bean.
         * Only initialize the generator ID member and return the passed ID as
         * a primary key.
         */
        public String ejbFindByPrimaryKey(String genID)
            throws FinderException
        {
            mGenID = genID;
            return genID;
        }

        /**
         * The bean is loaded so we must generate an initial High Value.
         */
        public void ejbLoad()
        {
            generateNewHigh();
        }

        /**
         * This method is used to specify the number of digits for the low value.
         * Default is 4.
         */
        public void setLowValueDigits(byte digits)
        {
            mMaxLowValue = ((int)Math.pow(10, digits)) - 1;
        }

        /**
         * This method generates the next key value.
         * <p>
         * @return Generated key (String).
         */
        public String generateKey()
        {
            // Do we need a new high
            if (mLowValue >= mMaxLowValue)
            {
                generateNewHigh();
            }

            // Next low value
            mLowValue++;

            return mHighValue + "-" + String.valueOf(mLowValue);
        }

        /**
         * This method generates a new high value and stores it in a private member.
         */
        private void generateNewHigh()
        {
            mHighValue = (new UUID()).toString();
            mLowValue = 0;
        }

    }
    ============================================================
    ============================================================

    What do you think?

    If anyone is interested in having the full source code, send me an email at fbeauregard@pyxis-tech.com

    Regards
    Frank
  5. Hi,

     I would like to have the full source code for this singleton method; I would love to try it.
  6. Does the singleton need to be used across different beans? I ask because most app servers use custom ClassLoaders which will mean it is only a singleton within a ClassLoader.

    Dave Wolf
    Internet Applications Division
    Sybase
  7. Dave: It seems to me the singleton/class loader question is moot provided you can guarantee that a unique High value is generated whenever the class is first loaded or created. Since the objective is to avoid value collisions across invocations, you should be safe provided this guarantee holds, no?
  8. What happens when a server in which the Unique generator bean is running crashes? Does it generate unique keys each time it's started?
    I have been having problems with this.
  9. My concern was two singletons in different ClassLoaders could create the same HIGH

    Dave Wolf
    Internet Applications Division
    Sybase
  10. Yeah, I share the same concern as Dave. Using a singleton is not a safe thing to do within an Enterprise Java Bean.

    I am quoting a statement made by Joshua Fox in Javaworld (http://www.javaworld.com/javaworld/jw-01-2001/jw-0112-singleton_p.html)

    " The EJB containers' ability to spread the identity of a single EJB instance across multiple VMs causes confusion if you try to write a Singleton in the context of an EJB. The instance fields of the Singleton will not be globally unique. Because several VMs are involved for what appears to be the same object, several Singleton objects might be brought into existence. "

    Nevertheless I have to mention that, I have used Singleton within a bean and I have not faced any problems. Since we are dealing with App server nuances, IMHO it is better to be on the safe side and not use it.

    Thanks
    -Mahesh.
  11. Yes, but I think in this instance (no pun intended) the function of the Singleton doesn't require that you ACTUALLY have REALLY just one unique instance. That's the hope, but it doesn't have to be so. Put another way: the HIGH/LOW class doesn't have to be a Singleton. The only thing you need to know for certain - and this MIGHT be a problem - is that the calculated HIGH value is ALWAYS unique. If I have 5 concurrent instances of the HIGH/LOW Singleton (intentionally or otherwise), as long as the HIGH value for each of them is unique I can rest assured that they'll each generate absolutely unique keys. I mean, that's the purpose of this particular Singleton, to generate unique keys, not share data. How's that sound?
  12. Hi Gordon,

    Sounds good, except the "If I have 5 concurrent instances of the HIGH/LOW Singleton": if you have multiple instances, it is not a Singleton anymore!!! See the second message in this thread, where I propose my solution, which does what you want: a High key generator (the entity bean) that is guaranteed to generate unique values (the latest High value is in the database) and multiple instances of Low key generators (the session beans). This approach allows you to run the key generation in a cluster or in multiple VMs!
  13. Mircea, you're right, it wouldn't be a Singleton anymore. I don't think I did a very good job explaining myself. I meant to address the concerns some responders had with using a Singleton in an environment where multiple class loaders are involved (read Mahesh Nair's comments). They were worried that, in that environment, you might not effectively HAVE a Singleton, and they're right. My point was that in this particular case, it didn't much matter that the same 'Singleton' might be instantiated in multiple VM's as long as its basic function was not compromised, e.g., producing unique high and low keys. As long as that holds true - as long as any instance is guaranteed to produce a different set of keys than any other instance of the same class - it doesn't much matter whether this particular class is a Singleton or not, except to improve performance perhaps.

    Regards,

    Gordon.
  14. Hi Gordon,

    Yes, I know what you meant. My response goal was to point out that the Singleton concept is not valid in that context anymore. I think the only way you can simulate the Singleton behaviour is when the data resides in one place (the database) and the access to that data is serialized.
  15. > Sounds good, except the "If I have 5 concurrent instances
    > of the HIGH/LOW Singleton": if you have multiple instances
    > is not a Singleton anymore!!!

    I am pretty sure that the Singleton pattern does not state anything about the number of instances; it is a way to "control" the number of instances (one single instance is only the most common strategy). A pool is another possible implementation strategy of the Singleton pattern.
    Anyway, this is not very useful to the real discussion!

    uL
  16. But if the HIGH value is UNIQUE, the LOW value is not necessary.
  17. The LOW value is available, therefore useful. True, it's not necessary, but without it you are significantly reducing your "namespace", for lack of a better term. Plus you have to hit the db for HIGH values, whereas LOWs are locally generated.
  18. Dave: I was under the impression that the specification (EJB 2.0 Section 23.1.2 - Programming restrictions) disallowed the modification of static fields by EJBs or classes used by them, therefore forbidding singletons.

    [EJB 2.0]

    23.1.2 Programming restrictions

    This section describes the programming restrictions that a Bean Provider must follow to ensure that the enterprise bean is portable and can be deployed in any compliant EJB 2.0 Container. The restrictions apply to the implementation of the business methods. These restrictions also extend to the dependent classes that are used by an entity bean with container managed persistence. Section 23.2, which describes the Container’s view of these restrictions, defines the programming environment that all EJB Containers must provide.

     An enterprise Bean must not use read/write static fields. Using read-only static fields is allowed. Therefore, it is recommended that all static fields in the enterprise bean class be declared as final.
  19. To Brett, and to all that are so evangelical about Sun's spec without considering what's good and what's just plain dumb (including my friend Dave):

    A little quote from Sun's site:

    "Workaround

    As a Java programmer, you have to assume that the specification
    is meaningless."

    Is that clear enough ?
    Sure it doesn't refer to all specs, but the issue was also related to Singletons. lol :)


  20. Emmanuel,

    A simple suggestion to make the SQL database-independent. SELECT FOR UPDATE won't work for all the databases. So, why not use this approach?

    1) Set Autocommit to false on your JDBC Connection
    2) Use UPDATE <TABLE> SET HIGH_KEY = HIGH_KEY + 1 WHERE KEY_NAME = <XYZ>
    3) SELECT HIGH_KEY FROM <TABLE> WHERE KEY_NAME = <XYZ>
    4) Commit the transaction.

    This approach guarantees that the record is locked in the database and incremented to the next high value. The subsequent SELECT will fetch you the latest HIGH value.
    Let me know if there is any hole in this approach.
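
    The four steps above could look like this in plain JDBC. This is only a sketch of the suggestion: the class name, the table UID_HIGH and its columns are assumptions, and it presumes HIGH_KEY is stored in a numeric column so that HIGH_KEY + 1 is valid SQL.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical sketch of the update-first approach: the UPDATE takes the row lock. */
final class UpdateFirstSketch {
    // Table and column names are assumptions; HIGH_KEY is assumed to be numeric.
    static final String UPDATE_SQL =
            "UPDATE UID_HIGH SET HIGH_KEY = HIGH_KEY + 1 WHERE KEY_NAME = ?";
    static final String SELECT_SQL =
            "SELECT HIGH_KEY FROM UID_HIGH WHERE KEY_NAME = ?";

    private UpdateFirstSketch() {}

    /** Increments the stored HIGH and returns the incremented value. */
    static long nextHigh(Connection con, String keyName) throws SQLException {
        boolean previousAutoCommit = con.getAutoCommit();
        con.setAutoCommit(false);                          // step 1
        try {
            try (PreparedStatement update = con.prepareStatement(UPDATE_SQL)) {
                update.setString(1, keyName);
                if (update.executeUpdate() != 1) {         // step 2: locks and increments
                    throw new SQLException("No HIGH key row for " + keyName);
                }
            }
            long high;
            try (PreparedStatement select = con.prepareStatement(SELECT_SQL)) {
                select.setString(1, keyName);
                try (ResultSet rs = select.executeQuery()) {   // step 3
                    if (!rs.next()) {
                        throw new SQLException("HIGH key row disappeared for " + keyName);
                    }
                    high = rs.getLong(1);
                }
            }
            con.commit();                                  // step 4
            return high;
        } catch (SQLException e) {
            con.rollback();
            throw e;
        } finally {
            con.setAutoCommit(previousAutoCommit);
        }
    }
}
```

    Unlike the SELECT FOR UPDATE variant, the value returned here is the one already incremented by this transaction.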

    - Vasu.
    sgullipalli at hotmail dot com
  21. Yes Wen, you are right. If the HIGH key is stored as a varchar, my approach won't work. But can't we store the HIGH key as a NUMBER (either Long or Float) in the database? When the KEY (HIGH + LOW) is generated, it can be converted to CHAR or whatever to make the key length 128 bits.

    If it is not possible, a round-about way could be

    (1) UPDATE <Table> SET HIGH_KEY = HIGH_KEY WHERE KEY_NAME = <XYZ>
    (2) SELECT HIGH_KEY FROM <Table> WHERE KEY_NAME = <XYZ>
    (3) Increment HIGH_KEY (whatever algorithm is used to increment)
    (4) UPDATE <Table> SET HIGH_KEY = <New Value> WHERE KEY_NAME = <XYZ>

    The disadvantage of this approach is that we hit the database one additional time to make sure that the record is locked.

    Let me know your feedback.

    Vasu.

    ---------------------------------------------------------

    Regarding "A simple suggestion to make the SQL Database independent...",
    it is database independent but there is one drawback. I believe that the HIGH key is not an integer column in the database. Rather it can be of varchar. How can you do "plus 1" for a value of varchar?

    Wen
  22. I would also be interested in a DB-independent way to do this. Using FOR UPDATE does not work on JDataStore, the default database for the Inprise application server.

    -Mark McMillan
  23. Emmanuel,
      I am new to EJB. Could you tell me how to modify ContextUtil.java to make it work with the Inprise Application Server?
      Zhang
  24. Is Emmanuel SCIARA's email address available? I've posted a link to this article on another mailing list and I'd really appreciate his input, if possible.

    Thanks,

    Martin
    Martin.Welch@natwest.com
  25. All

    Sorry, I have not been able to reply to all your questions so far, I have been a bit busy... I will try to ASAP. This generated some quite interesting input!

    Martin, my email is ehsforward at yahoo dot com

    I'll get back on this list soon.

    Emmanuel
  26. All

    Sorry, I have not been able to reply to all your questions so far, I have been a bit busy... I will try to ASAP. This generated some quite interesting input!

    Martin, my email is ehsforward at yahoo dot com

    Will get back soon.

    Emmanuel
  27. I'm curious about the statement:

    > Using Entity Bean with serializable transaction for this is inadequate for several reasons:
    > 1) to guarantee exclusive access to the table during a serializable transaction, your EJB server has to implement pessimistic concurrency control algorithm and some do not (Oracle for example).

    It seems to me that the entity bean should work no matter if the server uses optimistic or pessimistic concurrency control. In the optimistic case, if the second transaction reads the data and tries to update the HIGH, the database should throw an exception. You'd have to retry the transaction and then the value would be correct. Am I missing something?

    I read Floyd's article on optimistic vs pessimistic locking in the first TheServerSide newsletter. It seems that to the client, pessimistic locking in the appserver should be equivalent to optimistic locking in the appserver with SERIALIZABLE isolation level on the database. Isn't this true?
  28. Hi all,

    Sorry to have been away for so long...

    Let me try to answer some of the questions and remarks posted on this thread.

    To Francois:

    I need to use the database because I do not use the UUID generator from w3c, for the reasons I explain in the article, and therefore I need a central place from which to get my HIGHs.

    To Dave/Gordon/Mahesh/Mircea:

    I think Gordon and Mircea explained the situation well. The UIDDispenser "singleton" is not a singleton per se but rather something in memory that holds the value of the HIGH and reuses it (instead of going back to the database). There will be one "Singleton" per classloader, but each of them is guaranteed to distribute truly unique keys, and that is all we are interested in.

    You might have some problems with having too many UIDDispensers in one JVM, but this depends on the way you deploy your beans: if you use fine-grained modules (the "one ejb per ejb-jar" approach) you should have one classloader per module and therefore one UIDDispenser per ejb (although I will check that this week); if you use coarse-grained modules (the "all the application's ejbs in the same ejb-jar" approach), one UIDDispenser will be used across all your application's ejbs.

    I will give more details about these issues in the next release of the ejbutils (this week).

    To Collin:

    Sorry to hear you had problems. The next release should be easier to get around with... If you still have problems, drop me a mail and I will see how I could help you.

    As for your question: you will always have unique keys. If the server crashes, the UIDDispenser will get a new HIGH key and will use it to distribute unique keys.

    To Shankar:

    Follow the links in the article and you will be able to find the code for download.

    To Vasu:

    Yes, I know. I came across the "SELECT FOR UPDATE" problem (i.e. some DBs don't support it) while trying to deploy to JBoss with Hypersonic. I will have to find an alternative to that. The HIGH should still be stored in a CHAR. I am not sure your proposition deals properly with concurrency though: first, you are updating the HIGH_KEY before using it, whereas another UIDDispenser might have already updated the value; second, how do you demarcate the boundaries of your transaction and how is your row locked?

    To Zhang:

    Sorry I cannot help: I have no experience with Inprise...

    To Brett/Costin:

    I agree somewhat with Costin although I will be more cautious: you can be defiant towards the specification... as long as you know what you are doing and what might be the consequences of your acts. In this case, I think we know what we are doing (do we?? ;) ).

    To Mark:

    Working on it, as mentioned to Vasu... although I wonder whether all production-worthy databases implement the "FOR UPDATE" feature anyway.

    To Kenneth:

    I cannot find Floyd's article right now (the search engine is not working today), so I cannot comment. One of the questions would be: what exception is thrown and how should it be handled? I need to see exactly how it happens before I can do anything about it.

    That's all folks! Sorry again for the delay! I will try to be more available.

    Emmanuel
  29. If the second transaction tries to update the HIGH, but the first transaction has already obtained a write lock, I'd expect a RollbackException, TransactionRolledBackException, or a HeuristicRollbackException, or something like that.

    I think in a distributed environment we always have to be prepared for something like this to happen. And we always need to be ready to retry transactions.
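
    The "be ready to retry" idea can be sketched as a small helper. This is a sketch under assumptions, not part of any EJB API: `TransactionRetry` and `run` are hypothetical names, the `Callable` is assumed to begin and commit its own transaction, and every `Exception` is treated as retryable for brevity, whereas real code would retry only the rollback-type exceptions listed above.

```java
import java.util.concurrent.Callable;

/** Retries a unit of work that may be rolled back under contention. */
final class TransactionRetry {
    private TransactionRetry() {}

    /**
     * Runs the work up to maxAttempts times and returns its first
     * successful result. Every Exception counts as retryable here;
     * that is a simplification made for this sketch.
     */
    static <T> T run(Callable<T> work, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.call();   // assumed to run in its own transaction
            } catch (Exception e) {
                last = e;             // assumed rolled back; try again
            }
        }
        throw last;                   // give up after maxAttempts failures
    }
}
```

    A caller would wrap the HIGH update in the `Callable` and choose a small `maxAttempts`; a short pause between attempts is a common refinement that is left out here.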
  30. Emmanuel,

    The initial Update SQL will update the HIGH value in the database with itself (SQL: Update Table Set HIGH = HIGH). Consider two transactions, (A) and (B). When Transaction (A) executes the Update query, it gets a lock on the table row (the locking strategy could be either Pessimistic Locking or Optimistic Locking; both work in this case). When Transaction (B) tries to execute the Update query, it will not be able to acquire the lock and will wait for the record lock to be released by Transaction (A). I hope this clears up your doubt.

    Vasu.
  31. Hi guys,

    In order to create a viable alternative to "FOR UPDATE", the database has to support transaction isolation levels, which is not as widespread as I initially thought.

    It seems that the databases that now support isolation levels also support "FOR UPDATE". (For instance, MySQL's latest release, 3.23.36, introduced both at the same time.)

    I am willing to support this alternative only if I know for sure that there are databases used out there that prove the above wrong.

    Any comments?

    Emmanuel
  32. Emmanuel,

    Please replace the SELECT FOR UPDATE with the following:

    UPDATE sequences
    SET max_key= max_key+1
    WHERE sequence_id= ?

    SELECT max_key FROM sequences WHERE sequence_id= ?

    Other than that, I would still recommend that one should use database specific sequence generators.
  33. Hi Costin!

    This has been discussed and is not safe in a concurrent environment...

    Have a look at this support request.

    As for using database-specific sequence generators, there are arguments for and against. This UID generator tries to solve the problems enumerated in the article at the start of this pattern! But you are quite right, sometimes one does not need more than database sequence generators!

    Emmanuel
  34. Emmanuel,

    Can you please explain what the problem is with that?

    It is true that under specific circumstances (very rare and avoidable) some transactions may roll back, but other than that it is perfectly safe: no two transactions will produce the same key.

    But it seems to me that the whole discussion is for nothing, because the database-defined generators are really a better solution from all points of view.

    The "cons" that were raised (such as dependence on database-specific issues) are really a joke from my perspective.
    Why can't someone define something in a property file (such as a class name, or a specific SQL statement) to resolve that?

    Cheers,
    Costin
    OK, everyone hold on, the world isn't coming to an end, but I'm about to agree with Costin.

    These patterns seem to spend a significant amount of time solving problems we have already solved. There are many non-RDBMS-specific solutions which use the synchronization power of the RDBMS, present no chance of duplicate keys, use such simple SQL that they are 100% portable, are painfully simple to write, and use small simple keys which hash nicely in RDBMS b-tree indexes. I'd be glad to lay one out that solves all these problems, needs only one row from one table, and is 100% portable.

    Why are we reinventing the wheel here?

    Dave Wolf
    eBusiness Division
    Sybase
  36. Yes Dave

    Please do show some examples or do show any links for more information to prove your point.

    navin
  37. You can do this very simply, safely, and portably using a stateless session bean and a single table with a single row.

    create table KEYGENERATOR
    (
    seed int not null
    );
    insert into KEYGENERATOR values (1);

    The stateless session bean, marked as transaction Required and using an isolation level of at least 1, has pseudocode like

    // assuming instance members called _current and _limit (both initially null)
    // and an environment variable storing a _blocksize

    if(_current == null || _current == _limit) // first call, or block used up
    {
        // reserve a new block: both statements run in the same transaction
        update KEYGENERATOR set seed = seed + _blocksize
        select seed from KEYGENERATOR
        _limit = seed
        _current = seed - _blocksize
    }
    return _current++ // hand out the next id of the reserved block

    Advantages

    1) Portable
    2) Needs only one table with one row
    3) Guaranteed unique to all cluster members sharing a database
    4) Small bit size number easier to index
    5) By basically grabbing "blocks" of ids from the key generator table we only ping the database when we need a new block. This is tuned by setting the block size. There is then almost no contention on the table

    Simple, easy, handles 95% of the things you need to do.
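
    In plain JDBC the block reservation might look like the sketch below. It is an illustration only: `BlockKeyGenerator` is a hypothetical name, the table and column names follow the `create table` statement above, and the transaction is demarcated by hand on the connection. Inside a container-managed transaction (the "transaction Required" bean described above) the `setAutoCommit`, `commit`, and `rollback` calls would be left to the container.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hands out ids from blocks reserved in the one-row KEYGENERATOR table. */
class BlockKeyGenerator {
    private final int blockSize;
    private long current;   // next id to hand out
    private long limit;     // first id past the reserved block

    BlockKeyGenerator(int blockSize) {
        this.blockSize = blockSize;
    }

    /** The caller supplies the connection, e.g. one obtained from a DataSource. */
    synchronized long next(Connection con) throws SQLException {
        if (current == limit) {              // nothing reserved yet, or block used up
            limit = reserveBlock(con);
            current = limit - blockSize;
        }
        return current++;
    }

    /** UPDATE then SELECT inside one transaction; returns the new seed. */
    private long reserveBlock(Connection con) throws SQLException {
        boolean oldAutoCommit = con.getAutoCommit();
        con.setAutoCommit(false);
        try (PreparedStatement update = con.prepareStatement(
                 "UPDATE KEYGENERATOR SET seed = seed + ?");
             PreparedStatement select = con.prepareStatement(
                 "SELECT seed FROM KEYGENERATOR")) {
            update.setInt(1, blockSize);
            update.executeUpdate();          // the row stays locked until commit
            long seed;
            try (ResultSet rs = select.executeQuery()) {
                if (!rs.next()) {
                    throw new SQLException("KEYGENERATOR has no row");
                }
                seed = rs.getLong(1);
            }
            con.commit();
            return seed;
        } catch (SQLException e) {
            con.rollback();
            throw e;
        } finally {
            con.setAutoCommit(oldAutoCommit);
        }
    }
}
```

    With the seed initialised to 1 and a block size of 10, the first call reserves ids 1 to 10 and leaves 11 in the table; the database is touched again only on the eleventh call.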

    Dave Wolf
    eBusiness Division
    Sybase
  38. Dave,

    In your example, if I have two transactions A and B, am I right in thinking that the following can happen:

    A updates
    B updates
    A selects
    B selects

    This means that transaction A and B could end up with the same key...

    As I replied to Colin, unless there is an easy way to avoid this, saying it happens only rarely is not a good excuse IMHO. And if it happens, it would be extremely difficult to trace.

    Emmanuel
    If A updated first, B cannot update until A commits.
    B will issue the UPDATE statement and wait for A to commit.
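
    The ordering described here can be imitated with a toy in-memory model. It is not a database and proves nothing about any particular product: `LockedRow` and `SeedReservation` are hypothetical names, and a `ReentrantLock` plays the role of the row lock that an UPDATE takes and holds until commit.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Toy model of a one-row table whose row lock is held from UPDATE until commit. */
class LockedRow {
    private final ReentrantLock rowLock = new ReentrantLock();
    private long seed;

    /** UPDATE: blocks while another transaction still holds the row lock. */
    void update(long increment) {
        rowLock.lock();
        seed += increment;
    }

    /** SELECT, issued by the transaction that performed the UPDATE. */
    long select() {
        return seed;
    }

    /** COMMIT: releases the row lock so that a waiting UPDATE can proceed. */
    void commit() {
        rowLock.unlock();
    }
}

final class SeedReservation {
    private SeedReservation() {}

    /** One transaction: UPDATE, SELECT, then commit. */
    static long reserve(LockedRow row, long blockSize) {
        row.update(blockSize);
        try {
            return row.select();
        } finally {
            row.commit();
        }
    }
}
```

    Because B's `update` cannot start until A has committed, the interleaving "A updates, B updates, A selects, B selects" cannot occur in this model, and every call to `reserve` returns a different seed.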
  40. Haaa now we are talking Colin!

    If what you say is correct then I agree that this is the way to go.

    But I am a bit puzzled here: doesn't this depend on the isolation level of the transactions?

    Also does it still work when A and B use concurrent datasources? Is this behaviour the same amongst all databases?

    Emmanuel
  41. Emmanuel, Emmanuel,

    You're like a lot of nice OO guys, make a lot of nice object designs but you forgot database basics.

    The isolation level NONE is only supported by MSSQL server as far as I know and you have to set it on purpose, otherwise the other 3 isolation levels all behave the way you need.

    Moreover, when you do a SELECT ... ... FOR UPDATE you are opening an anonymous FOR UPDATE cursor.
    Some older versions of databases may consider that an error because you cannot later say UPDATE ... WHERE CURRENT OF CURSOR_NAME.

    The behaviour you need with UPDATE is the same against all serious and decent databases.
    Even MySQL, although that is hardly a serious database considering its support for transactions.

    On the other hand you should use database generators (SEQUENCE in Oracle, GENERATOR in Interbase, IDENTITY column in MSSQL and so on) whenever possible.

    To make that portable you create an IDGenerator interface
    and configure the name of the implementation class in a property file.
    This way you're portable, and you use UPDATE .. SET VALUE=VALUE+ ... only as a last resort,
    because database generators, while they serialize access, don't hold locks for the duration of a transaction.
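
    The interface-plus-property-file idea could be sketched like this. All names are hypothetical (`IdGenerator`, `InMemoryIdGenerator`, `IdGeneratorFactory`, and the `idgenerator.class` key); a real deployment would plug in one implementation per database (sequence, generator, identity column, or the UPDATE fallback) instead of the in-memory placeholder used here to keep the sketch runnable.

```java
import java.util.Properties;
import java.util.concurrent.atomic.AtomicLong;

/** Portable contract; each database gets its own implementation. */
interface IdGenerator {
    long nextId();
}

/** Placeholder implementation so the sketch runs without a database. */
class InMemoryIdGenerator implements IdGenerator {
    private final AtomicLong counter = new AtomicLong();

    public long nextId() {
        return counter.incrementAndGet();
    }
}

/** Instantiates the implementation named in the configuration. */
final class IdGeneratorFactory {
    /** Hypothetical property key naming the implementation class. */
    static final String CLASS_KEY = "idgenerator.class";

    private IdGeneratorFactory() {}

    static IdGenerator create(Properties config) throws ReflectiveOperationException {
        String className = config.getProperty(CLASS_KEY);
        if (className == null) {
            throw new IllegalArgumentException("missing property " + CLASS_KEY);
        }
        // The class must implement IdGenerator and have a no-argument constructor.
        Class<? extends IdGenerator> type =
            Class.forName(className).asSubclass(IdGenerator.class);
        return type.getDeclaredConstructor().newInstance();
    }
}
```

    Switching databases then means changing one line in the property file, for example `idgenerator.class=InMemoryIdGenerator`, rather than touching the beans that ask for keys.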

    Cheers,
    Costin
    Well, I think I have lost the plot!!

    As far as I can see this pattern should be used to generate a unique id, to be used as a primary key (usually) for a table in a database. The pattern basically retrieves the high order bits of the unique id from a common resource (like a database) and the low order bits from a local class. When the low order bits run out, the next value of the high order bits is retrieved from the common resource.

    But why ??
    Why not just have the method (within a session bean) that retrieves the next value of the locally cached sequence and then checks if it has reached the max value in the current cached sequence numbers. If it has then read the next value of the sequence number from the database and multiply it by the cache size. Return the sequence number.


    Example

    SessionBean
    {
    cacheSize = 64; // Determines how often you will hit the database
    sequenceVal = -1; // just to force a load first time (++sequenceVal is then 0), this should be persisted with the session bean

     int getNextVal()
     {
      if (++sequenceVal % cacheSize == 0)
      {
        reload(); // Could use a sequence (reloadOracleSequence) or a table (reloadTransactionalSequence), depending on whether there is any concern with losing ids.
      }
      return sequenceVal;
     }

    // non transactional, just a good old oracle sequence. Instead of the multiplication you can set the increment in oracle i think
    reloadOracleSequence()
    {
    select seq.nextval * cacheSize from dual
    }

    // in a transactional way
    reloadTransactionalSequence()
    {
    update t_sequence set value = value + 1 where sequence_id = ?
    select value from t_sequence where sequence_id = ?
    }

    }

    OK, the code is just to show the concept. Does anyone have any thoughts? Maybe I am missing the point??
    Please note that the sequence is also an integer, for performance reasons.

    Problems: With the above the first cache of sequences will be lost.
  43. Colin,

    The problem is exactly how you describe it: it is very rare but it happens. Is that a sufficient reason to ignore it? I don't think so.

    It's like saying: "my software handles the glass of water on the table (nothing happens) and not on the table (it falls and water might be all over the place), but not when it is on the edge of the table (will it fall... or not)."

    Now if it is avoidable, please let me know how. If it can be added to the code, I will be very happy to do so.

    As for whether it is overkill, it is the same good old answer: it depends on what you want to do. It is true that for most cases, but *not* all cases, a db-generated key will suffice. But you might have situations where you need such a key generator; these are mostly described in my article.

    A very interesting discussion has taken place in this thread of the JBoss-dev mailing list. Have a look.

    Cheers

    Emmanuel
  44. Hi!

    Just to let you know that a new version of the EJBUtils UID generator is out... with quite a bit of new stuff. Have a look!

    http://ejbutils.sourceforge.net/

    Emmanuel