CMP Primary Key Sequence Generated By Bean Self Cache

  1. CMP Primary Key Sequence Generated By Bean Self Cache

    With CMP we can write far less code and do the same job as a BMP bean. But we often face the problem of how to generate a unique primary key value when we create a new entity bean. I have tried several approaches, e.g. a key generator (another session EJB that finds the latest key value in the database) or a data access object that does the same. These approaches do solve the problem. But the extra work of accessing the database every time a CMP bean is created increases network traffic and container overhead. It is a trade-off. Can we do better? The answer is “yes”. We can let the entity bean itself cache the latest key value. Only when the server is restarted does it fetch the latest key value from the persistent store, by using a stateless session bean.

    The new idea is:
    1. Design a stateless session bean that fetches the latest key value from the CMP bean's database.
    2. Declare a static int variable “idCount” in the bean class to cache the latest id value, and a static boolean variable to record whether the cache has been initialized.
    3. Whenever a bean instance is created, the container always invokes setXxxContext() first (setEntityContext() for an entity bean). So we put the code that fetches the latest key value through the stateless session bean in that method. If the id cache has not been initialized yet, the entity bean calls the session bean to get the latest key value; otherwise it does nothing.
    4. In the ejbCreate() method, increment idCount (updating the cache) and assign it to the primary key field, converting as needed for the field's type (e.g. String, Integer). If a create exception occurs, roll the key cache back.

    Here are the key parts of an example:

    public class NewsBean implements EntityBean
    {
        public Integer id; // primary key
        …

        private EntityContext context;
        private static int idCount = 1; // the cache of the latest key value
        private static boolean isInitiated = false; // whether the key cache has been initialized

        public Integer ejbCreate(int news_type, String content, String subject, … ) throws CreateException {
            id = new Integer(++idCount);
            this.news_type = news_type;
            this.content = content;
            this.subject = subject;
            this.author = author;
            …
            return null;
        }

        .
        .
        .
        public void setEntityContext(EntityContext ctx)
        {
            context = ctx;
            if (!isInitiated) { // the id key cache has never been initialized
                try {
                    Hashtable env = new Hashtable();
                    env.put(Context.INITIAL_CONTEXT_FACTORY,
                            "weblogic.jndi.WLInitialContextFactory");
                    env.put(Context.PROVIDER_URL, "t3://localhost:7001");
                    Context ic = new InitialContext(env);

                    NewsIDFinderHome home =
                        (NewsIDFinderHome) PortableRemoteObject.narrow(
                            ic.lookup("statelessSession.NewsIDFinderEJB"), NewsIDFinderHome.class);

                    NewsIDFinder finder = (NewsIDFinder) home.create();
                    idCount = finder.getMaxID(); // fetch the latest PK value
                    isInitiated = true; // mark the key cache as initialized
                }
                catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }

        public void rollbackID() {
            idCount--;
        }
    }

    The session bean class is:

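    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;
    import java.util.Hashtable;
    import javax.ejb.CreateException;
    import javax.ejb.SessionBean;
    import javax.ejb.SessionContext;
    import javax.naming.Context;
    import javax.naming.InitialContext;
    import javax.sql.DataSource;

    public class NewsIDFinderBean implements SessionBean {

        private SessionContext ctx;
        private int maxID = 1;
        private int totalRecords;

        public void setSessionContext(SessionContext ctx) {
            this.ctx = ctx;
        }

        public void ejbActivate() {
        }

        public void ejbPassivate() {
        }

        public void ejbRemove() {
        }

        /**
         * Reads max(id) and count(id) once. Note that for a stateless session
         * bean the container runs this when it creates a pooled instance, so
         * the values may be stale by the time getMaxID() is called.
         */
        public void ejbCreate() throws CreateException {
            Hashtable env = new Hashtable();
            env.put(Context.INITIAL_CONTEXT_FACTORY, "weblogic.jndi.WLInitialContextFactory");
            env.put(Context.PROVIDER_URL, "t3://localhost:7001");
            Connection con = null;
            try {
                InitialContext ic = new InitialContext(env);
                DataSource ds = (DataSource) ic.lookup("jdbc.ShuttleDB");
                con = ds.getConnection();
                String query = "select max(id), count(id) from shuttle_news";
                Statement stt = con.createStatement();
                ResultSet rst = stt.executeQuery(query);
                if (rst.next()) {
                    int max = rst.getInt(1);
                    if (!rst.wasNull()) { // max(id) is NULL when the table is empty
                        maxID = max;
                    }
                    totalRecords = rst.getInt(2);
                }
                rst.close();
                stt.close();
            } catch (Exception e) {
                System.out.println(e);
                throw new CreateException("Error in ejbCreate()");
            } finally {
                // return the connection to the pool even if the query failed
                if (con != null) {
                    try { con.close(); } catch (SQLException ignored) { }
                }
            }
        }

        /**
         * business method
         */
        public int getMaxID()
        {
            return maxID;
        }

        /**
         * business method
         */
        public int getTatolRecords()
        {
            return totalRecords;
        }
    }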

    Caching the latest key value in the bean greatly reduces network traffic: the session bean only needs to be called once to fetch the latest key value. I hope the pattern will be helpful to all EJB developers.

    I should say that this pattern may not be original.

    Any comments are welcome.

    Threaded Messages (28)

  2. What if the database is shared? Your pattern may not work, right?

    cheers,
    vijay
  3. If all the clients use this entity bean for the primary key,
    it will be all right.
  4. I could be wrong, but..

    - the static reference to idCount is valid for only a single JVM
    - entity beans on other machines would each have their own idCount
    - when ejbCreate() is called on different machines, this could result in duplicate IDs


    While your pattern may work if you run on just a single machine, running in a distributed environment is necessary for any non-trivial application.

    Comments?

    -JC
  5. Yes, I agree with Justin.

    The EJB spec suggests that all static variables of an EJB should be declared final. Otherwise, you lose the distribution feature of EJB (which might be the reason for using EJB).

    Am I right?
    Cuong
  6. Your pattern is an ANTI-PATTERN.

    Access to idCount and isInitiated is unsynchronized, so it is bad even with a single-JVM multithreaded server.
    If the app server runs on multiple JVMs, it is bad altogether.

    Even if you try to make it work, it won't.

    Why do you think database vendors introduced sequences as distinct objects in the database ??

    Just because SELECT max(id)+1 FROM ... will never work.

    Cheers,
    Costin
  7. Hi, Costin
    I don't agree with you at all. Oliver's idea is quite new. At the least, the pattern solves the CMP bean unique id sequence problem on a single JVM with an unshared db. I think the pattern can be improved: when a DuplicateKeyException occurs, force the client to re-initialize the id. That would resolve the shared db, multi-JVM and multithreaded server problems stated in the previous comments. EJBs are not allowed to use the synchronized modifier. As for the argument
    "Why do you think database vendors introduced sequences as distinct objects in the database ??
    Just because SELECT max(id)+1 FROM ... will never work.", I think you made a mistake. Whether or not the PK id is declared unique in the db, max(id)+1 is always a new pk value. I tried the modified code. When there is a DuplicateKeyException, I forced the client to re-initialize the id pk through a method call, whether the db is shared or the server is multithreaded. It is better than the pattern of initializing the id every time a new EJB is created.

    Rds

    Julie
     
  8. Hi Julie,

    Oliver's idea is not exactly new; a lot of people have probably thought of it (at least I did in the old client/server days), because it seems like the natural thing to do.
    But it's not exactly a good idea, so no responsible programmer should use it.

    It works if you consider a single JVM and add synchronization on those fields (yes, it's forbidden by the spec, but who cares?).
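
    A minimal sketch of what "add synchronization on those fields" could look like inside one JVM, assuming an AtomicInteger is used in place of the plain static int. AtomicKeyCache and generateConcurrently are hypothetical names, not part of the posted bean, and the sketch does nothing for the multi-JVM or shared-database cases.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the static idCount cache in NewsBean.
class AtomicKeyCache {
    private final AtomicInteger idCount;

    AtomicKeyCache(int latestKeyFromDb) {
        this.idCount = new AtomicInteger(latestKeyFromDb);
    }

    // Atomic equivalent of ++idCount: no two callers receive the same value.
    int nextId() {
        return idCount.incrementAndGet();
    }

    // Demo helper: draws keys from several threads and returns the distinct keys seen.
    static Set<Integer> generateConcurrently(AtomicKeyCache cache, int threads, int perThread) {
        Set<Integer> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    seen.add(cache.nextId());
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return seen;
    }
}
```

    If the set returned by generateConcurrently has threads * perThread entries, no key was handed out twice within this JVM. A second JVM running the same class would still start from its own counter.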

    But in the general case it is bad compared with the usual alternatives.

    "Select max(id +1)" (I made an error in the previous post)
    is:
    - first of all, an expensive operation;

    - second, it doesn't solve the fundamental problem of generating unique IDs:
       different clients will get the same max returned.
       Even if you retry the operation after a failure, especially in the multiple-JVM case, you're almost guaranteed to fail again under moderate to heavy load;

    - third, max(id+1) will fail if somebody else has just inserted the same max(id+1) but not committed yet (you don't see the new insertion), and depending on the particular database's locking policy and isolation level it may fail miserably.
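
    The second point can be illustrated with a toy, single-threaded simulation. This is only a sketch: ToyTable is a hypothetical in-memory stand-in for a table with a unique key, not a real database, and the interleaving is written out by hand.

```java
import java.util.TreeSet;

// Toy in-memory "table" with a unique primary key; not a real database.
class ToyTable {
    private final TreeSet<Integer> ids = new TreeSet<>();

    // Plays the role of SELECT max(id); returns 0 for an empty table.
    int selectMax() {
        return ids.isEmpty() ? 0 : ids.last();
    }

    // Plays the role of INSERT; returns false on a duplicate key.
    boolean insert(int id) {
        return ids.add(id);
    }
}

class MaxPlusOneRace {
    // Interleaving in which both clients read before either one inserts.
    static boolean[] run() {
        ToyTable table = new ToyTable();
        table.insert(41);

        int keyA = table.selectMax() + 1; // client A reads 41, computes 42
        int keyB = table.selectMax() + 1; // client B reads 41 too, computes 42

        boolean insertedA = table.insert(keyA); // succeeds
        boolean insertedB = table.insert(keyB); // rejected: duplicate key
        return new boolean[] { insertedA, insertedB };
    }
}
```

    A retry by client B would succeed here only because nobody else is inserting; under load the same interleaving can repeat on every attempt.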

    There are tons of other reasons, which is why database vendors introduced sequences, or auto-generated unique types, so that people avoid making the select max(id+1) mistake.

    You may be happy with the bad solution during development, or if the site doesn't have a lot of load.
    Then the solution appears to work perfectly, but it is nevertheless bad.

    There are at least two good solutions discussed on this site,
    so they should be used instead.

    Cheers,
    Costin
  9. Doesn't max() make a momentary table lock?

    /Theis.
  10. Not necessarily.

    If you mean that concurrent transactions will not be able to update or insert, that is true for the SERIALIZABLE isolation level.

    You also hit the problem that the behavior differs depending on whether the insert/update was executed before or after the select max().

    However, this is not always the case; it depends on your database, the isolation level of the transaction and, a little bit, on the locking policy.

    For instance, Oracle almost never lets you acquire a table lock.
  11. I just wanted to comment on "it is forbidden in the specs, but who cares?": this is a very bad approach (speaking of "responsible programmers"). It was not forbidden because Sun just didn't like the synchronized keyword; it is forbidden for a good reason: the container manages threads.
    Normally, if you use synchronized within a Java program, you can produce a deadlock with two synchronized requests: object A waiting for X while holding a lock on Y, and B waiting for Y while holding a lock on X.
    In EJB, you can produce a deadlock with a _single_ synchronized method, because the container may (in its stub/container classes) already use synchronization.
    Thus, you may indirectly call a synchronized method with every call to a bean method. If two beans are now accessed concurrently and both call a synchronized method, you could end up with a deadlock.
    We all know deadlocks are nice to find and easy to debug, but what makes it worse is that you may not notice it during development, because it will only occur sometimes in high-load situations (many beans instantiated concurrently). Even after deployment, customers will be satisfied (normally they only play around during the first days/weeks/months with the new app); only when all users "really" use the app may the deadlock occur... application "partially" crashed... some parts are stalled.
    If I develop an enterprise application, I really don't want to trust that it will "probably never happen" when I know it could... according to Murphy's law it _will_ happen ;-)

    regards

    Messi
  12. Yes, I was saying that no "responsible programmer" should use select max(id+1).

    I do contend, though, that a responsible programmer can break the spec a little bit, especially when the spec is stupid.

    You have to be a pretty stupid programmer to be able to trigger a deadlock just by using synchronized.
    Do I need to remind you that if you define an ordering among your locked resources and always acquire them in the same order, you'll never have a deadlock?
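
    A minimal sketch of the lock-ordering rule in plain Java, outside any container. OrderedLocks, Resource and rank are hypothetical names, and ranks are assumed to be unique per resource. The rule only covers locks the application itself takes; it says nothing about locks an app server may hold around the call.

```java
class OrderedLocks {
    // Hypothetical resource; rank defines the global lock order and must be unique.
    static final class Resource {
        final int rank;
        int value;

        Resource(int rank, int value) {
            this.rank = rank;
            this.value = value;
        }
    }

    // Always locks the lower-ranked resource first, whatever the argument order.
    static void move(Resource from, Resource to, int amount) {
        Resource first = from.rank < to.rank ? from : to;
        Resource second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.value -= amount;
                to.value += amount;
            }
        }
    }

    // Two threads move in opposite directions; returns true if both finish in time.
    static boolean runOpposingMoves(Resource x, Resource y, int rounds) {
        Thread a = new Thread(() -> { for (int i = 0; i < rounds; i++) move(x, y, 1); });
        Thread b = new Thread(() -> { for (int i = 0; i < rounds; i++) move(y, x, 1); });
        a.start();
        b.start();
        try {
            a.join(10_000);
            b.join(10_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !a.isAlive() && !b.isAlive();
    }
}
```

    Without the rank comparison, move(x, y) and move(y, x) would take the two monitors in opposite orders, which is the classic two-lock deadlock.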

    As per the responsibility, you can send your comments to Sun.


    "Workaround

    As a Java programmer, you have to assume that the specification
    is meaningless."

    Actual quote from developer.java.sun.com


  13. Sorry, I really didn't mean to offend you. My point was just this: in general, you won't produce a deadlock if you do everything correctly, but you cannot know what the app server is really doing, because the vendor is free to implement it as he likes. And keeping the same order in your own statements doesn't mean you will never produce a deadlock, because do you know whether the vendor of your app server does the same?
    And what I mean is: why should I break the spec if I have a different, well-working implementation for PK generation that is reasonably fast and doesn't break the spec?

    kind regards

    Messi
  14. You bet I wasn't offended.

    You're right in general, but one can devise special cases where you can be sure your synchronized doesn't produce a deadlock, no matter how your app server calls your beans.

    If we have to live by the letter of the spec, then app servers are a closed environment where you cannot integrate any decent third-party Java framework (we shouldn't use synchronized, shouldn't use IO, shouldn't use static, bla, bla, bla).

    One has to weigh whether some things in the spec were put in there just for the convenience of the spec writers.

    Anyway, the pattern was no good with or without synchronized.
  15. Here's a solution that I'm almost sure works for EJBs and Oracle 8.1.6 Enterprise DB (oops, getting platform specific); otherwise I've got a major problem.

    Make use of the built-in sequence generator in Oracle with the following SQL statement:

    CREATE SEQUENCE globalSequencer INCREMENT BY 1 START WITH 1 NOMAXVALUE CACHE 10;

    This statement creates a sequence in the db that can be accessed from some business method, e.g. getNextKey().
    The method asks the db for the next key with a SQL statement like this:

    SELECT globalSequencer.NEXTVAL FROM dual;
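
    A rough sketch of what such a getNextKey() method might look like with plain JDBC. The class name SequenceKeySource and the helper nextValSql are assumptions made for illustration, the caller is assumed to supply an open connection to the Oracle database, and try-with-resources assumes Java 7 or later.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class SequenceKeySource {
    // Builds the Oracle-style query shown above; sequenceName must be a trusted identifier.
    static String nextValSql(String sequenceName) {
        return "SELECT " + sequenceName + ".NEXTVAL FROM dual";
    }

    // Fetches the next key over an already-open connection (one round trip per key).
    static long getNextKey(Connection con, String sequenceName) throws SQLException {
        try (Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(nextValSql(sequenceName))) {
            if (!rs.next()) {
                throw new SQLException("Sequence returned no row: " + sequenceName);
            }
            return rs.getLong(1);
        }
    }
}
```

    Because the sequence lives in the database, every JVM and every other client sharing the database draws from the same counter.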

    Shouldn't this solution solve the problem?
    Best Regards
    Niclas Rothman
    nir@e-sense.dk



  16. Hi,
    Having read through the replies, I have to agree with Costin (though not with his use of "stupid"). Several issues:

    1. One important issue is that while you are reading this "max id", another record could be inserted, forcing a duplicate key value. Even if you implement a select-for-update lock, on a web system this would slow down the database.

    2. The overhead of hitting the database and finding out that your key is duplicated is double that of a simple call to a sequence generator.

    3. It is bad practice to rely on exception generation to implement a unit of work.

    4. Relying on duplicate detection means 3 hits: one for the attempted insert of a record, two to get the latest max id (which could be invalid if another record has been inserted), three to insert your record with the new id.

    Francis

  17. Actually, Cloudscape recommends that particular solution for generating PK IDs in common situations (Cloudscape does not support PK autogeneration).

    They recommend

    SELECT MAX(PK) ... FOR UPDATE;
    INSERT;
    COMMIT;

    to select the next value of a PK (I'm saying this from memory). This lock only affects one row.

    Anyway, I'm not going to use this system. It sounds like it goes against the KISS principle. Sounds weird.
  18. I think that Oracle devised the sequence as a response to the problem of

    Table A your data table
    Table B contains next key value for table A in a single row

    Table B becomes very hot in a busy database.

    This is a very simple pattern which easily improves on the max(x+1) scenario.

    Cheers
  19. Hi Costin,

    When you said :

    > There are at least two good solutions discussed on this site
    > so they should be used instead.

    Could you provide references?

    Thanks
  20. I'll have to agree with Costin here; this pattern will cause more headaches than it solves.

    1) The only way this pattern is salvageable on multi-VM or clustered environments is if you tightly bind the SSBs to a synchronous messaging update system; implementing this is not trivial, if even possible.

    2) "SELECT max()..." only works if you lock the table via select for update, but this causes serialization problems for normal table access.

    If you use an Oracle db, use a sequence! If you use a db that does not provide sequences, then you MUST tolerate a non-cached pk id fetch.



  21. Hi!

    It's probably possible to generate different keys for different JVMs just by using a different seed. I mean the first machine uses 1+3, the second 2+3, the third 3+3 (in the case of 3 JVMs). Just like an identity field in SQL Server.

    But then it becomes the responsibility of the assemblers to change the seed in the deployment descriptor.
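
    One way to read this suggestion, as a sketch only: each JVM gets its own seed and all JVMs share a stride equal to the number of JVMs. StridedKeyGenerator is a hypothetical name, and reading seed and stride from the deployment descriptor is assumed rather than shown.

```java
import java.util.concurrent.atomic.AtomicLong;

class StridedKeyGenerator {
    private final AtomicLong next;
    private final int stride;

    // seed: 1..stride and unique per JVM; stride: total number of JVMs.
    StridedKeyGenerator(int seed, int stride) {
        if (stride < 1 || seed < 1 || seed > stride) {
            throw new IllegalArgumentException("need 1 <= seed <= stride");
        }
        this.next = new AtomicLong(seed);
        this.stride = stride;
    }

    // Returns seed, seed + stride, seed + 2 * stride, ... for this JVM.
    long nextId() {
        return next.getAndAdd(stride);
    }
}
```

    With three JVMs, seed 1 yields 1, 4, 7, seed 2 yields 2, 5, 8, and seed 3 yields 3, 6, 9, so the key ranges never overlap. The sketch does not cover restarts (the counter would have to be re-seeded from the database) or changing the number of JVMs later.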

    Mike
  22. *If* you don't care about the order of keys why not just choose a random number and retry if you have a collision?

    As long as your random numbers come from a large interval (say 2^32) collisions should be very rare even for tables with many rows.
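
    A small sketch of the random-key-with-retry idea. RandomKeyAllocator is a hypothetical name, and the in-memory set only stands in for the table's unique index; a real implementation would attempt the insert and retry on the database's duplicate-key error.

```java
import java.security.SecureRandom;
import java.util.HashSet;
import java.util.Set;

class RandomKeyAllocator {
    private final SecureRandom random = new SecureRandom();
    // Stand-in for the table's unique index.
    private final Set<Integer> usedKeys = new HashSet<>();

    // Draws random 32-bit keys until one is unused, up to maxAttempts tries.
    synchronized int allocate(int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int candidate = random.nextInt(); // uniform over all 2^32 int values
            if (usedKeys.add(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("No free key after " + maxAttempts + " attempts");
    }
}
```

    With n keys already in use, a single draw collides with probability n / 2^32, so at one million rows roughly one insert in four thousand needs a retry.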

    Tom
  23. Depends on how much trust you have in your code.

    I don't think you want to compensate for something which could very easily be caused by a bug in the system...

    -JC


  24. Oliver Yang,

    Thanks for posting the code. Interesting discussion. It reflects the general tendency of web site/distributed software developers to use the database as the real source of enterprise data management. I noticed nobody commented on JNDI or EJB being able to synchronize access to the entity bean you designed. I would be surprised if EJB did not even support semaphore-like management of the EJB. The focus on using the database often means forgoing EJBs altogether. It raises the question of why use a distributed architecture like J2EE or .NET when modern RDBMSs support load balancing, failure management, etc.

    Thinking about performance, I like your design. The probable case is that you will run the whole web site on one server. The 200 ms or whatever to call over to the db server just to get a unique or sequence number should glare out as a FAILURE of good software infrastructure. It also says something about the amount of work needed to accomplish such a simple task at all.

    That said, I vote for using the easiest method. If you get to release 2 and have the luck of success to need to improve performance, then implement something that minimizes the network calls. Of course, Oracle can only have implemented a pattern similar to what you will devise: lock, get, increment, unlock. Basic computer science stuff made exceptionally more complex by EJB infrastructure. Well, tradeoffs.

    I'd still be interested in how installing the EJB on one server and using JNDI would make this work/fail.

    Here's a copy of the post for the easy/first out solution:
    (i assume a separate computer runs ORCL - what I always see - so this is costly in terms of network overhead)

    Good discussion,
    Tim Jowers

    ----- post ----
    Posted By: Niclas Rothman on April 15, 2001 in response to this message.
    Here's a solution that I'm almost sure works for EJBs and Oracle 8.1.6 Enterprise DB (oops, getting platform specific); otherwise I've got a major problem.

    Make use of the built-in sequence generator in Oracle with the following SQL statement:

    CREATE SEQUENCE globalSequencer INCREMENT BY 1 START WITH 1 NOMAXVALUE CACHE 10;

    This statement creates a sequence in the db that can be accessed from some business method, e.g. getNextKey().
    The method asks the db for the next key with a SQL statement like this:

    SELECT globalSequencer.NEXTVAL FROM dual;

    Shouldn't this solution solve the problem?
    Best Regards
    Niclas Rothman
    nir@e-sense.dk

    --- /post -----

     
  25. Oliver,

    I don't know what others think about this, but for my part, yes, I'm proposing the database as the "ultimate" source for sequence numbers.
    The reason is that it is proven to be reliable, fast and stable, and you get "sequential" numbers (I already mentioned this).
    I don't agree with what you call a "FAILURE of good software infrastructure", because the "database way" doesn't involve any serious network traffic if you adhere to the "high-low" pattern.
    I already said my company has a small component which works this way (without violating the J2EE specs). Simply use a SB for retrieving the keys, which uses an entity bean as a "cache". The SB requests a high key from the EB and then adds its low keys. If the server fails there is no problem: you'll only lose some high keys (but how often does your app server fail???).
    You can see that the database round-trip overhead involved is very small: you'll only have to make a DB round trip every, e.g., 2^16 keys... that is not exactly a lot ;-)
    I admit there is still some overhead involved in the call to the session bean, but with the app servers I'm using (e.g. BAS) this is also negligible, because after the first calls it is very fast.
    I also compared our approach with the "PK by random" approach (i.e. trying to generate a unique key without accessing the DB), which seems better to me than the approach proposed here, and ours is definitely faster.
    So I cannot understand what objections some of you have regarding the DB approach(?)
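
    The high-low idea can be sketched in plain Java as follows. This is not the component described above: HiLoKeyGenerator and highSource are hypothetical names, the LongSupplier stands in for whatever hands out the next high value (the entity bean, a sequence, a key table), and the synchronized method would have to be replaced by container-managed access inside an EJB.

```java
import java.util.function.LongSupplier;

class HiLoKeyGenerator {
    private final LongSupplier highSource; // assumed: one call costs one database round trip
    private final int blockSize;           // number of low keys per high value
    private long high;
    private int low;

    HiLoKeyGenerator(LongSupplier highSource, int blockSize) {
        if (blockSize < 1) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        this.highSource = highSource;
        this.blockSize = blockSize;
        this.low = blockSize; // forces a fetch of the first high value on first use
    }

    // Returns high * blockSize + low; fetches a new high only when the block is used up.
    synchronized long nextKey() {
        if (low == blockSize) {
            high = highSource.getAsLong();
            low = 0;
        }
        return high * blockSize + low++;
    }
}
```

    With a block size of 2^16 the high source is consulted once per 65,536 keys. Two generators sharing the same high source never overlap, and a crash loses at most the unused remainder of one block.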

    kind regards

    Messi
  26. Messi,

    You're right. That's a great approach. Probably the fastest and most robust. My objection was hitting the database every time - I hadn't thought about your "high-low" pattern. It combines the best of both worlds.

    Tim Jowers
  27. The generalised approach, as used in databases, is to grab a certain number of keys each time. Thus the CACHE declaration in the Oracle DDL above. This can be rewritten in EJB's as Messi has, but you've got to wonder why you have to. Treating rdB's as dumb data holders seems, well, dumb.
  28. Messi

    Your DB solution clearly works when you need to generate unique pk keys, but it doesn't if you need to generate sequential numbers.

    The reason is simple: once you fetch a value from an Oracle sequence, it cannot be rolled back, even if your transaction is. It follows that you may end up with some keys never being used (whenever some situation occurs that causes a rollback in your application).

    Besides, I would like to see a solution that is not tied directly to a particular DBMS. Maybe future versions of the EJB spec should include a sequence interface to be implemented by the J2EE server vendor...

    Cheers,

    Luís Fernando
    ===============
  29. I tried using this pattern, but I am getting the following error. Do you have any thoughts on why?

    Bean : statelessSessionNewIDFinderEJB
    Method : public abstract NewIDFinder create() throws CreateException, RemoteException
    Section: 7.10.6
    Warning: The method return values in the home interface must be of valid types for RMI/IIOP.

    2004-01-08 11:29:39,625 ERROR [org.jboss.deployment.MainDeployer] could not create deployment: file:/C:/Java/jboss-3.2.1_tomcat-4.1.24/server/default/deploy/clone.jar
    org.jboss.deployment.DeploymentException: Verification of Enterprise Beans failed, see above for error messages.
    at org.jboss.ejb.EJBDeployer.create(EJBDeployer.java:487)
    at org.jboss.deployment.MainDeployer.create(MainDeployer.java:784)
    at org.jboss.deployment.MainDeployer.deploy(MainDeployer.java:639)
    at org.jboss.deployment.MainDeployer.deploy(MainDeployer.java:613)
    at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at org.jboss.mx.capability.ReflectedMBeanDispatcher.invoke(ReflectedMBeanDispatcher.java:284)
    at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:549)
    at org.jboss.mx.util.MBeanProxyExt.invoke(MBeanProxyExt.java:177)
    at $Proxy7.deploy(Unknown Source)
    at org.jboss.deployment.scanner.URLDeploymentScanner.deploy(URLDeploymentScanner.java:302)
    at org.jboss.deployment.scanner.URLDeploymentScanner.scan(URLDeploymentScanner.java:458)
    at org.jboss.deployment.scanner.AbstractDeploymentScanner$ScannerThread.doScan(AbstractDeploymentScanner.java:200)
    at org.jboss.deployment.scanner.AbstractDeploymentScanner$ScannerThread.loop(AbstractDeploymentScanner.java:211)
    at org.jboss.deployment.scanner.AbstractDeploymentScanner$ScannerThread.run(AbstractDeploymentScanner.java:190)
    2004-01-08 11:29:39,635 ERROR [org.jboss.deployment.scanner.URLDeploymentScanner] Failed to deploy: org.jboss.deployment.scanner.URLDeploymentScanner$DeployedURL@c95b738b{ url=file:/C:/Java/jboss-3.2.1_tomcat-4.1.24/server/default/deploy/clone.jar, deployedLastModified=1073578940466 }
    org.jboss.deployment.DeploymentException: Verification of Enterprise Beans failed, see above for error messages.
    at org.jboss.ejb.EJBDeployer.create(EJBDeployer.java:487)
    at org.jboss.deployment.MainDeployer.create(MainDeployer.java:784)
    at org.jboss.deployment.MainDeployer.deploy(MainDeployer.java:639)
    at org.jboss.deployment.MainDeployer.deploy(MainDeployer.java:613)
    at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at org.jboss.mx.capability.ReflectedMBeanDispatcher.invoke(ReflectedMBeanDispatcher.java:284)
    at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:549)
    at org.jboss.mx.util.MBeanProxyExt.invoke(MBeanProxyExt.java:177)
    at $Proxy7.deploy(Unknown Source)
    at org.jboss.deployment.scanner.URLDeploymentScanner.deploy(URLDeploymentScanner.java:302)
    at org.jboss.deployment.scanner.URLDeploymentScanner.scan(URLDeploymentScanner.java:458)
    at org.jboss.deployment.scanner.AbstractDeploymentScanner$ScannerThread.doScan(AbstractDeploymentScanner.java:200)
    at org.jboss.deployment.scanner.AbstractDeploymentScanner$ScannerThread.loop(AbstractDeploymentScanner.java:211)
    at org.jboss.deployment.scanner.AbstractDeploymentScanner$ScannerThread.run(AbstractDeploymentScanner.java:190)