Safe bulk updates for EJBs

J2EE patterns: Safe bulk updates for EJBs

  1. Safe bulk updates for EJBs (33 messages)

    I would like to start by saying that someone may already have published a similar idea. If not, I think this trick might help you. In our system we use BMP EJBs for Auction, Bids, etc.

    Usually there is no need to handle more than 2-10 EJBs in a single transaction. Each EJB takes 30-40 ms to load (I load properties from a single row produced by joining the main table with its one-to-one related tables, and I use proxy stubs for expensive properties modeled as one-to-many in the database), so we are satisfied with performance. But sometimes we need to mark or change thousands of records, and it is really expensive to load 10,000 EJBs just to increment some int field.

    The correct approach is to use JDBC calls for bulk updates. But there is a small problem: when the db-is-shared flag is set to false (WebLogic), the container effectively caches the EJB by not calling ejbLoad on every method invocation (once the bean is loaded, ejbLoad is called again only when a transaction fails). So when you update a record directly in the DB, you have to notify the cached EJB that it should refresh itself.

    Here are simple classes and interfaces that allow your EJB to decide if it should refresh itself:

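    import java.rmi.RemoteException;
    import java.util.Collection;

    public interface SingleDBChangeMonitor {
      /** Marks the id as changed (wasChanged == true) or removes the mark (wasChanged == false). */
      void setChanged(Object id, String propertyName, boolean wasChanged) throws RemoteException;
      /** Convenience method: applies setChanged to every id in the collection. */
      void setChanged(Collection ids, String propertyName, boolean wasChanged) throws RemoteException;
      /** Returns null if nothing was changed for this id. */
      ChangeInfo getChangeInfo(Object id) throws RemoteException;
    }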

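    import java.rmi.RemoteException;

    public class ChangeInfo {
      public ChangeInfo(String changedPropertyName, long changeTime){
        myChangedPropertyName = changedPropertyName;
        myChangeTime = changeTime;
      }
      /** Name of the changed property; null means the whole EJB must be refreshed. */
      public String getChangedPropertyName() throws RemoteException {
        return myChangedPropertyName;
      }
      /** Time of the change in milliseconds. */
      public long getChangeTime() throws RemoteException {
        return myChangeTime;
      }
      public boolean equals(Object o){
        if ( !(o instanceof ChangeInfo) ) {
          return false;
        }
        ChangeInfo other = (ChangeInfo)o;
        return ( myChangedPropertyName == null ? other.myChangedPropertyName == null : myChangedPropertyName.equals(other.myChangedPropertyName))
        && (myChangeTime == other.myChangeTime);
      }
      /** Kept consistent with equals so instances behave correctly in hash-based collections. */
      public int hashCode(){
        int result = ( myChangedPropertyName == null ) ? 0 : myChangedPropertyName.hashCode();
        return 31 * result + (int)(myChangeTime ^ (myChangeTime >>> 32));
      }
      private String myChangedPropertyName;
      private long myChangeTime;
    }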

    ChangeInfo encapsulates information about a change to a particular EJB. You can extract the update time and the name of the changed property (propertyName == null means the change monitor wasn't given all the details, so you have to refresh the whole EJB). Here is sample code for a typical getXXX method in your EJB:

      public PropertyMap getPropertyMap() throws RemoteException{
        ensureFreshness();
        return super.getPropertyMap();
      }

      protected void ensureFreshness() throws RemoteException{
        ChangeInfo changeInfo = InternalAccess.getDBChangeMonitor().getLineItemMonitor().getChangeInfo(myId());
        if ( changeInfo == null ) {
          return;
        }
        // Somebody has changed the DB since we loaded the data;
        // myLoadTime is set to the current time in load().
        if ( myLoadTime < changeInfo.getChangeTime() ) {
          load(); // loads the data (ejbLoad just calls it)
        }
        // Remove this id from the list of outdated objects.
        InternalAccess.getDBChangeMonitor().getLineItemMonitor().setChanged(myId(), null, false);
      }

    I have an interface for getting the SingleDBChangeMonitor instance for each type of EJB.

    public interface DBChangeMonitor {
      SingleDBChangeMonitor getAuctionMonitor() throws RemoteException;
      SingleDBChangeMonitor getBidMonitor() throws RemoteException;
    }

    The implementation of SingleDBChangeMonitor can vary. You can keep the updated ids in memory and call SingleDBChangeMonitor.setChanged from Java to mark changed data explicitly, or you can have the SQL statements for your bulk updates insert the modified ids into a dedicated table and implement SingleDBChangeMonitor to read those ids from the DB. The choice really depends on your system. Here is an in-memory implementation of SingleDBChangeMonitor:

    import java.rmi.RemoteException;
    import java.util.Collection;
    import java.util.HashMap;
    import java.util.Iterator;

    public class SingleDBChangeMonitorImpl implements SingleDBChangeMonitor {
      public void setChanged(Collection ids, String propertyName, boolean wasChanged) throws RemoteException {
        Util.assertNotNull("collection of ids is null", ids);
        Iterator i = ids.iterator();
        while(i.hasNext()){
          setChanged(i.next(), propertyName, wasChanged);
        }
      }

      public void setChanged(Object id, String propertyName, boolean wasChanged) throws RemoteException {
        Util.assertNotNull("id is null", id);
        if ( wasChanged ) {
          // Record the change together with the time it was reported.
          ChangeInfo changeInfo = new ChangeInfo(propertyName, System.currentTimeMillis());
          synchronized(this){
            myIdChangeInfoMap.put(id, changeInfo);
          }
        } else {
          // The bean has refreshed itself; forget the change.
          synchronized(this){
            myIdChangeInfoMap.remove(id);
          }
        }
      }

      public synchronized ChangeInfo getChangeInfo(Object id) throws RemoteException{
        return (ChangeInfo)myIdChangeInfoMap.get(id);
      }

      // id -> ChangeInfo for every id marked as changed and not yet refreshed
      private HashMap myIdChangeInfoMap = new HashMap(100);
    }


    A final word about ChangeInfo and SingleDBChangeMonitor. The implementation I gave supports a change of only one property or of all of them (if propertyName == null). That is fine for our system. If you plan to update more than one property and would rather not use propertyName == null (for example, you don't use proxy stubs for value objects and really load *everything* in ejbLoad), then you will have to modify getChangeInfo to return a collection of ChangeInfo objects (and rename SingleDBChangeMonitor ;) ).
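
    The multi-property variant described above could look roughly like the following. This is only a sketch, not the poster's code: the class name MultiPropertyChangeMonitor, its methods and the ALL_PROPERTIES marker are hypothetical, and it uses generics and lambdas that were not available to the original J2EE code.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/**
 * Sketch of a monitor that remembers several changed properties per id.
 * All names here are hypothetical and chosen for illustration only.
 */
class MultiPropertyChangeMonitor {
    /** Marker key meaning "details unknown, reload the whole bean". */
    static final String ALL_PROPERTIES = "*";

    // id -> (property name -> time of the latest change to that property)
    private final Map<Object, Map<String, Long>> changes = new HashMap<>();

    /** Records a change; a null propertyName marks the whole bean as stale. */
    synchronized void markChanged(Object id, String propertyName, long changeTime) {
        String key = (propertyName == null) ? ALL_PROPERTIES : propertyName;
        changes.computeIfAbsent(id, k -> new HashMap<>())
               .merge(key, changeTime, Long::max);
    }

    /** Returns a snapshot of the changed properties; empty if none are recorded. */
    synchronized Map<String, Long> getChanges(Object id) {
        Map<String, Long> perId = changes.get(id);
        if (perId == null) {
            return Collections.emptyMap();
        }
        return new HashMap<>(perId);
    }

    /** Forgets the changes a bean has already picked up (those not newer than loadTime). */
    synchronized void clearUpTo(Object id, long loadTime) {
        Map<String, Long> perId = changes.get(id);
        if (perId == null) {
            return;
        }
        perId.values().removeIf(t -> t <= loadTime);
        if (perId.isEmpty()) {
            changes.remove(id);
        }
    }
}
```

    A bean would reload only the properties returned by getChanges and then call clearUpTo with its new load time, so a change that arrives during the reload is not lost.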

    I would like to hear your comments. Once again, I don't claim this is an original idea, so if you have read it somewhere, please let me know where the original article is.

    Threaded Messages (33)

  2. Just found practically the same thing in weblogic.developer.interest.ejb by Dmitri Rakitine published on Dec 2000.
  3. Safe bulk updates for EJBs

    Here is the actual URL - http://dima.dhs.org/misc/readOnlyUpdates.html

    ben
  4. Safe bulk updates for EJBs

    Would this pattern work for a cluster with multiple servers?

    ben
  5. Safe bulk updates for EJBs

    The current implementation, no. In a cluster there is no need to change the interfaces, but you will have to provide a cluster-safe implementation of them (that's why all the methods declare RemoteException). You can use RMI to provide a singleton, or you can use JMS (as Dmitri does) and implement setChanged to publish the corresponding event to all servers. In the latter case there will be multiple ChangeMonitor instances in the cluster, but they will be kept in sync through JMS.
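
    To illustrate the JMS option, here is a minimal sketch of one monitor per server kept in sync by published events. InvalidationBus is a hypothetical in-process stand-in for a JMS topic (no real JMS API is used), and LocalChangeMonitor and its methods are likewise assumed names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** One monitor per server; holds the ids that local beans still have to refresh. */
class LocalChangeMonitor {
    // id -> time of the latest change reported for that id
    private final Map<Object, Long> changeTimes = new HashMap<>();

    /** Called when an invalidation event arrives from the bus. */
    synchronized void onChanged(Object id, long changeTime) {
        changeTimes.merge(id, changeTime, Long::max);
    }

    /** Returns the change time for the id, or null if nothing is pending. */
    synchronized Long getChangeTime(Object id) {
        return changeTimes.get(id);
    }

    /** Called by a local bean after it has reloaded; affects this server only. */
    synchronized void clear(Object id) {
        changeTimes.remove(id);
    }
}

/** Hypothetical stand-in for a JMS topic: delivers each event to every subscriber. */
class InvalidationBus {
    private final List<LocalChangeMonitor> subscribers = new ArrayList<>();

    synchronized void subscribe(LocalChangeMonitor monitor) {
        subscribers.add(monitor);
    }

    /** The bulk-update code publishes here instead of calling a single monitor. */
    synchronized void publishChanged(Object id, long changeTime) {
        for (LocalChangeMonitor monitor : subscribers) {
            monitor.onChanged(id, changeTime);
        }
    }
}
```

    Because each server clears only its own monitor, a bean refreshed on one server does not hide the change from a cached copy on another server.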
  6. In a cluster this trick makes sense only for RO beans (and once you say RO it becomes 100% Dmitri's solution), because ejbLoad will be called automatically every time you call a method on an EJB.
  7. Roman, this pattern is about updating cached entity beans only, and from this point of view there is no difference between read-only and db-is-shared=false ones (even in the cluster) - the choice between read-mostly (RO and RW pair - if transactional read-modify-update behaviour is needed) and db-is-shared=false (if updates are strictly write-only or done by some other means, bypassing entity beans) depends only on the application needs.

    Thanks for mentioning it ;-)

  8. I have to admit, I haven't used RO beans yet ;), but as far as I understand there is a difference between throwing a RuntimeException from an RO entity bean and from an RW one: for an RO bean the data will be reloaded, but for an RW bean the whole transaction will also be rolled back. Is this correct, or will the tx be rolled back for an RO bean too? Please correct me if I am wrong.
  9. Where can I find good example of read-mostly pattern? Is WL 6.0 example a good one?
  10. As far as I can tell, read-mostly example didn't change from 5.1 to 6.0. The new one illustrating updateable read-only beans in 6.1 will be much cooler - http://newsgroups.bea.com/cgi-bin/dnewsweb?cmd=article&group=weblogic.developer.interest.61beta.caching&item=10&utag=
  11. Where can I find good example of read-mostly pattern? Is WL 6.0 example a good one?


    I was playing with 6.1 caching and updated read-mostly example to use invalidation-target to automatically update read-only part instead of read-timeout-seconds (it sets it to 0): http://dima.dhs.org/misc/readMostlyImproved.jar
  12. Thank you.
  13. Couldn't see where you do invalidate. Is it a complete example? Should invalidation be in updateStock?
  14. Invalidation happens automatically - see weblogic-ejb-jar.xml - invalidation-target:

    <!--
    The invalidation-target element specifies a Read-Only Entity EJB which
    should be invalidated when this Container-Managed Persistence Entity
    EJB has been modified.

    Note that the target ejb-name must be a Read-Only Entity EJB, and this
    tag may only be specified in an EJB 2.0 Container-Managed Persistence Entity
    EJB.

    Example:

    <invalidation-target>
      <ejb-name>StockReaderEJB</ejb-name>
    </invalidation-target>

    Since: WebLogic Server 6.1

    -->
  15. See [17.3.1] in the EJB spec.

    If non-app exception is thrown from session or entity bean business method, container has to

    - Log the exception or error [Note B].
    - Rollback the transaction/Mark it for rollback.
    - Discard instance [Note C].
    - Throw RemoteException/TransactionRolledBackException to the client.

    [C] Discard instance means that the Container must not invoke any business methods or container callbacks on the instance.

    So, this refresh method is simply based on the fact that container discards bean instance - for the next client new one will be created and ejbLoad() called.
  16. Hi Dmitri,

    Thank you for the pattern link. But as I thought, an exception inside an RO bean will discard the whole tx, which I guess is not the desired behavior. My approach silently reloads the data without an exception that results in a tx rollback (I mean there will be no exception if there is no data conflict). What do you think about it?
  17. Hi Roman!

    I was thinking about pattern very similar to yours, because seppuku is very inefficient in case of bulk updates. The problems I ran into were cluster-related - maintaining replicated repository of invalid pk's in the cluster becomes a bit complicated. Even on a single server it is complicated - for how long do you keep invalidated pk's in memory? Of course, you can implement DBChangeMonitor to read from the database, but this overhead can be comparable to ejbLoad'ing, thus negating benefits of caching.
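
    One possible answer to the "how long" question is to prune old entries and treat any bean whose load is older than the retained history as stale. The sketch below assumes this policy; ExpiringChangeMonitor, maxAgeMillis and the method names are hypothetical, and times are passed in explicitly so the behavior is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of an in-memory monitor whose entries expire after maxAgeMillis. */
class ExpiringChangeMonitor {
    private final long maxAgeMillis;
    // id -> time of the latest change reported for that id
    private final Map<Object, Long> changeTimes = new HashMap<>();

    ExpiringChangeMonitor(long maxAgeMillis) {
        this.maxAgeMillis = maxAgeMillis;
    }

    synchronized void markChanged(Object id, long now) {
        changeTimes.put(id, now);
    }

    /** Drops entries older than maxAgeMillis so the map cannot grow without bound. */
    synchronized void prune(long now) {
        changeTimes.values().removeIf(t -> now - t > maxAgeMillis);
    }

    /**
     * A bean is stale if a recorded change is newer than its load, or if it was
     * loaded so long ago that a change to it may already have been pruned.
     */
    synchronized boolean isStale(Object id, long loadTime, long now) {
        if (now - loadTime > maxAgeMillis) {
            return true; // history for this period may be gone, so reload to be safe
        }
        Long changed = changeTimes.get(id);
        return changed != null && changed > loadTime;
    }
}
```

    The trade-off is that beans idle for longer than maxAgeMillis are always reloaded on their next use, which bounds memory at the cost of some extra loads.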

    I think that, on average, the 6.1 invalidation mechanism is quite adequate: if you update one or a few records, then broadcasting the invalid pk's is OK; if the update touched a sizeable portion of the table, then invalidateAll() will do.
  18. It may be a bit off topic, but here is an alternative pattern for the problem. I will call it the "Simple Simon Bulk Update" pattern.

    - Go stateless. Scrap the Entity Beans. This will eliminate the clustering/cache synch problems. It will also make the system more simple and faster. If you need caching, either give your database more memory and let it do the caching, or let the client (web app) do the caching, since the client knows exactly what needs/does not need to be cached.

    - Have a stateless session bean with a method that takes the 10000 records as a single bulk value object.

    - Have the session bean perform the update using JDBC. One transaction for the entire update. Very simple and high performance.

    Disclaimer: This pattern will not earn you any sexy awards, but you will have a happy client:-)
  19. If you decide to model the monitors as an RMI singleton, you will have to change the methods: once you update an EJB on one server in the cluster, you don't know whether the same EJB was updated on another server, so you can't just call setChanged(false) on a singleton. The JMS solution is more flexible and nicer.
  20. Safe bulk updates for EJBs

    I am not sure this is really that safe. Your pattern needs to consider EJB writes(setters) not just reads(getters). Or maybe just something in the ejbStore method itself.

    What happens if the EJB (CMP) has deferred writes waiting to be committed while your JDBC process blows through all of the supporting rows in the database? The EJB will eventually overwrite the values without knowing that the batch process changed them.

    For instance, I use bulk update methods where I send an EJB an update(Dictionary) call. This updates (setters) many fields on the EJB. When this call exits the EJB the container will construct the appropriate SQL update statement and commit the changes to the database.

    Maybe a check on the change info during ejbStore would help.

    I think there are options on the container that might help: something that will lock the database records on read for update. That would seem to be a requirement for this pattern to be safe.
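
    The ejbStore check suggested above could be sketched as follows. StoreGuard and its methods are hypothetical names; a real bean would call checkBeforeStore at the start of ejbStore with its id and load time. The sketch throws an unchecked exception, which a container would normally treat as a non-application exception and answer with a rollback.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper that refuses to store data loaded before an out-of-band change. */
class StoreGuard {
    // id -> time of the latest change made behind the bean's back (e.g. by bulk JDBC)
    private final Map<Object, Long> changeTimes = new ConcurrentHashMap<>();

    /** Called by the bulk-update code for every id it touched. */
    void markChanged(Object id, long changeTime) {
        changeTimes.put(id, changeTime);
    }

    /**
     * Throws if the row was changed after the bean loaded its data.
     * Returns normally when the bean's copy is at least as new as the last change.
     */
    void checkBeforeStore(Object id, long loadTime) {
        Long changed = changeTimes.get(id);
        if (changed != null && changed > loadTime) {
            throw new IllegalStateException(
                "Row " + id + " was changed in the DB after the bean was loaded");
        }
    }
}
```

    This only narrows the race between the check and the actual SQL update; it does not lock the row.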
  21. Safe bulk updates for EJBs

    Of course your mutators will first call ensureFreshness(), and only if there is no data conflict will they actually accept the new data. Otherwise the mutator methods throw an application exception (OutdatedPropertyMap in my application) so that the caller has to react in the proper manner:

      public PropertyMap setPropertyMap(PropertyMap newMap) throws RemoteException, CompoundVetoException, OutdatedMapException {
        ensureFreshness();
        return super.setPropertyMap(newMap);
      }


    I forgot to mention it. Thank you for highlighting this important point.
  22. Safe bulk updates for EJBs

    But there is still a window of time between the mutator's ensureFreshness and the moment the container submits the SQL update statement, during which the bulk JDBC update can take place.

    No matter how small the window is, I don't think it is completely closed unless you take a "read-for-update" approach in the container.
  23. Safe bulk updates for EJBs

    Yes, you are correct. There is such a window. And you can only throw a RuntimeException from ejbStore because you don't know when it will be invoked.

    But in my application the trick is still safe because I write only *changed* data (which is a recommended strategy anyway). If your ejb just dumps its data to the DB (changed and unchanged), described 'pattern' will present a problem as you noticed.
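
    The "write only changed data" strategy can be sketched like this. DirtyTrackingRow is a hypothetical helper, not the poster's code, and the table and column names in the usage note are made up. It builds an UPDATE that names only the columns whose values were set, so a column modified by a bulk JDBC update is not overwritten by a later store of unrelated fields.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

/** Tracks which columns were modified and builds an UPDATE for those only. */
class DirtyTrackingRow {
    private final String table;
    // column -> new value, in the order the columns were first modified
    private final Map<String, Object> dirty = new LinkedHashMap<>();

    DirtyTrackingRow(String table) {
        this.table = table;
    }

    /** Records a new value for a column and marks the column as dirty. */
    void set(String column, Object value) {
        dirty.put(column, value);
    }

    boolean isDirty() {
        return !dirty.isEmpty();
    }

    /** Builds a parameterized UPDATE that mentions only the dirty columns. */
    String buildUpdateSql() {
        if (dirty.isEmpty()) {
            throw new IllegalStateException("nothing to store");
        }
        StringJoiner assignments = new StringJoiner(", ");
        for (String column : dirty.keySet()) {
            assignments.add(column + " = ?");
        }
        return "UPDATE " + table + " SET " + assignments + " WHERE id = ?";
    }

    /** Values in placeholder order; the caller appends the id as the last parameter. */
    List<Object> parameters() {
        return new ArrayList<>(dirty.values());
    }

    /** Called after a successful store. */
    void clear() {
        dirty.clear();
    }
}
```

    For example, after set("status", "CLOSED") on a row for a hypothetical auction table, buildUpdateSql() yields an UPDATE that touches only status, leaving a bid counter incremented by a bulk update intact.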
  24. Safe bulk updates for EJBs

    I think the "window" is not that important: the later ejbStore()/tx will be rejected and rolled back by the DB. It is a matter of the DB's transaction control, something like repeatable read, the third of the four tx isolation levels.

    As for using JMS/RMI for the monitor in a clustered system, it will slow down each call because of the information passed back and forth across the cluster for synchronization. In a clustered system it may be better to let the DB handle all these concurrency problems by setting db-is-shared=true.

    By the way, your application is using fine-grained entity beans, right?

    minjiang
  25. Safe bulk updates for EJBs

    Yes, I use fine-grained entity beans.

    People, I am curious, what is your performance? I have a Compaq PIII 800 laptop with WLS 6.0 and Oracle8i installed, and with all the logging on (which takes about 30% of the overall EJB time) my EJB with 10 initially loaded properties takes, as I mentioned, 30-50 ms to load and to prepare all value (proxy and non-proxy) objects.

    How is it compared to your system (similar data volume + logging on)?
  26. What about making Entity beans also Message Driven beans and using JMS to publish the "update event"? The Message Driven Entities could examine the event and reload() if necessary. This could also work in a cluster.
  27. Ah! I see that's just what Dmitri's pattern is all about. Interesting.
  28. Yes, this is exactly what it does.

    To invalidate a cached (read-only or db-is-shared=false) entity bean instance, a method that throws a non-application exception is invoked on it, which results in the container discarding this instance. In a cluster, JMS or multicast can be used to propagate invalidations (I used JavaGroups in 5.1).

    WebLogic 6.1 implements very similar functionality (more efficiently), so starting with 6.1 seppuku pattern is no longer needed.

    Note that this is non-transactional caching, meaning that it is possible that during some very short interval client will see stale data.
  29. Safe bulk updates for EJBs

    I didn't understand your explanation here. Do I have to extend these classes in an entity bean? Is there a place where I can look at the whole example? I would greatly appreciate a detailed explanation.
  30. Safe bulk updates for EJBs

    No, you do not extend this class. Your BMP bean uses it to detect changes in the DB in order to decide whether it should reload its data (there is a description of the behavior for ejbLoad/ejbStore and other methods). My pattern is used on a single server box for write beans with db-is-shared=false when you need tx behavior. Dmitri's pattern uses RO beans, which do not have tx behavior. If you don't understand something, just see Dmitri's site; he has practically the same thing, explained.
  31. Safe bulk updates for EJBs

    Just wanted to clarify: the difference between Dmitri's pattern and mine is not really RO versus write beans but how you *propagate* a change in the DB to your entity bean and how your bean *reacts* to a change in the DB. Propagation should be taken from Dmitri (it will work in a cluster). The rest should be taken from me, because IMO it is more flexible than throwing a RuntimeException in the case of write beans (you actually *try* to resolve the stale data problem yourself without discarding the transaction). Finally, most likely you will want to use the WL 6.1 read-mostly approach.
  32. Safe bulk updates for EJBs

    Thanks for your reply. Could you please give me the URL of Dmitri's web site? The URL of the article on his site would also be helpful.
  33. Safe bulk updates for EJBs

    http://dima.dhs.org/misc/readOnlyUpdates.html

  34. Safe bulk updates for EJBs

    Is it possible to do the same thing in VisualAge/WebSphere?