Discussions

J2EE patterns: Fat Key Pattern: BMP Bulk Load and Data Cache

  1. Fat Key Pattern: BMP Bulk Load and Data Cache (52 messages)

    In early EJB servers, a call to ejbFindByPrimaryKey results in a database hit to retrieve a primary key, and each subsequent business method invocation on the returned EJBObject results in additional db hits, via ejbLoad, to "flesh out" the bean. Many vendors, including Weblogic, have since realized that for CMP, since the container controls the generated sql as well as the transactional boundaries, this find-and-load process can be optimized so that only a single database hit occurs within a transaction. This can also be true for Collection findByXXX() methods; only a single database query is required to find all of the relevant rows and "flesh out" the retrieved beans with these rows.

    Unfortunately, the container cannot do the same sort of sql optimization for BMPs, for the simple reason that the container has no control over the sql. In order to implement a BMP, the user has to implement two distinct sql calls, one for ejbFindByXXX and one for ejbLoad.

    I have devised a simple pattern called "Fat Keys" which will allow BMP beans to mimic the caching and bulk-loading properties of CMP. I devised and implemented this pattern on my own back in August 1999 and first publicized it on ejb-interest at java dot sun dot com. Subsequently another developer, Ana Bhattacharyya, emailed me and said she had come up with a very similar pattern. This not only demonstrates development convergence but also somewhat validates the effectiveness of this pattern by two independent sources. As a caveat, both of us tested this solely on BEA Weblogic 5.1; hence I cannot guarantee its effectiveness on any other server.

    In a nutshell, this is how Fat Key works: when the BMP container calls ejbFindByPK or ejbFindByXXX, instead of doing a sql select for just the existence of the row(s) in the db and returning "slim" primary keys, why not sql select for the full row(s) and use them to populate "fat" primary keys? Under Weblogic, these primary keys returned from ejbFinders are stored in a primary key cache and are subsequently retrieved again in ejbLoad() by calling getEntityContext().getPrimaryKey(). Because you instantiated "fat" keys, you can take the data from your keys to populate your bean in ejbLoad, instead of having to look up the db again for row data!

    The Fat Key pattern is greatly simplified by another pattern known as Value Objects or Bulk Accessor Objects. The key is to implement both the primary key and the entity bean classes so that BOTH share the data of the Value Object. This can be done in one of two ways, either using inheritance or a has-a relationship. My example below, showing BeanData (ValueObject), PrimaryKey and EntityBean classes, will use a has-a relationship. I have simplified my bean to contain only one field: data1.

    public class BeanData implements Serializable {
    private String data1;

    public String getData1(){return data1;}
    public void setData1(String data1){this.data1 = data1;}
    }

    public class PrimaryKey implements Serializable {
    // typical implementation for a simple pk
    private int ID;

    public int getID(){return ID;}
    public boolean equals(Object other){return other instanceof PrimaryKey && ID == ((PrimaryKey)other).ID;}
    public int hashCode(){return ID;}

    transient BeanData beanData = null; // <== This is what makes this key "Fat"!

    public PrimaryKey(int ID){this.ID = ID;}
    }

    public class Bean implements javax.ejb.EntityBean {
    private BeanData beanData;

    public String getData1(){return beanData.getData1();}
    public void setData1(String data1){beanData.setData1(data1);}
    }

    So far nothing unusual with the 3 classes, EXCEPT that my PrimaryKey class has a transient reference to BeanData. Notice that the PK class does NOT take this transient reference into account when calculating hashCode() or equals(); the PK class should pretend it's not even there, especially from the client's perspective, where this field will be nulled by serialization. PrimaryKey.beanData is stored and remains in the WL container and is used by Bean.ejbLoad, Bean.ejbFindByPrimaryKey and Bean.ejbFindByXXX, as shown below:

    public class Bean implements javax.ejb.EntityBean {
    private BeanData beanData;

    public String getData1(){return beanData.getData1();}
    public void setData1(String data1){beanData.setData1(data1);}

    //All of my BMPs delegate sql calls to DataManager, which is something I implemented
    private DataManager dataManager;

    /**
    * ejbFindByPK does the typical sql select given a row primary key id; however
    * for my FatKey pattern the ResultSet is a fully populated row, and not just the id.
    * This is used to "fill up" the returned PrimaryKey
    */
    public PrimaryKey ejbFindByPrimaryKey(PrimaryKey pk)
    {
    //sql call here returns all fields of the single row with pk.ID
    ResultSet rs = dataManager.getRow(pk.getID());
    List justOnePK = constructPrimaryKeys(rs);
    return (PrimaryKey)justOnePK.get(0);
    }

    /**
    * ejbFindByXXX also does the typical sql select, this time with field criteria. One or
    * more fully populated row(s) is returned in the ResultSet, and each instantiated
    * PrimaryKey will be filled with a populated row
    *
    * @param criteriaMap is a Map representation of name-value pair comparisons for
    * the SQL WHERE clause
    **/
    public PrimaryKey ejbFindByXXX(Map criteriaMap)
    {
    ResultSet rs = dataManager.getRows(criteriaMap);
    return constructPrimaryKeys(rs);
    }

    /**
    * helper function used by ejbFindByPK and ejbFindByXXX; takes a ResultSet and
    * returns a List of Fat PrimaryKeys
    */
    protected List constructPrimaryKeys(ResultSet rs)
    {
    List rtn = new ArrayList(rs.getFetchSize());
    while(rs.next()){
    int ID = rs.getInt("id");
    String data1 = rs.getString("data1");

    PrimaryKey pk = new PrimaryKey(ID);
    pk.beanData = new BeanData();
    pk.beanData.setData1(data1); // <== Fill the Fat Key with data (data1 is private in BeanData)
    rtn.add(pk);
    }
    return rtn;
    }

    public void ejbLoad()
    {
    PrimaryKey pk = (PrimaryKey)getEntityContext().getPrimaryKey();

    //The following check is possibly not necessary, unless the WL pk cache,
    //when full, begins to serialize least-used pks
    if(pk.beanData != null){
    this.beanData = pk.beanData;
    }else{
    //populate this.beanData manually from the database
    }
    }

    }
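
    The claim that the transient field disappears on the client side can be checked outside any container. A minimal sketch, assuming hypothetical stand-in classes FatKey and RowData (not the classes from the post), that round-trips a key through Java serialization the way a remote call would:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for the value object carried by the key.
class RowData implements Serializable {
    private static final long serialVersionUID = 1L;
    String data1;
}

// Hypothetical stand-in for a "fat" primary key.
class FatKey implements Serializable {
    private static final long serialVersionUID = 1L;
    private final int id;
    transient RowData rowData; // skipped by serialization

    FatKey(int id) { this.id = id; }

    // Identity depends on the id only, never on the cached row.
    @Override
    public boolean equals(Object other) {
        return other instanceof FatKey && ((FatKey) other).id == id;
    }

    @Override
    public int hashCode() { return id; }
}

class FatKeySerializationDemo {
    // Serializes the key to bytes and reads it back.
    static FatKey roundTrip(FatKey key) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(key);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (FatKey) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        FatKey key = new FatKey(42);
        key.rowData = new RowData();
        key.rowData.data1 = "cached row";

        FatKey copy = roundTrip(key);
        System.out.println(copy.equals(key));     // the copy is still the same key
        System.out.println(copy.rowData == null); // the cached row did not travel
    }
}
```

    The copy compares equal to the original key while carrying no row data, which is the behaviour the pattern relies on for keys that leave the server.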

    Finally, this pattern works under the assumption that both the initial findByXXX call and the subsequent business method call(s) all occur within a single transaction. Both CMP and BMP finder/loader caching need to work within a transactional scope to prevent the risk of "dirty data". Hence, aside from Value Objects, this Fat Key Pattern is also nicely complemented by the Stateless Session Bean Facade Pattern. By batching calls to entity beans within SSB methods, you can guarantee transactional integrity and pick up the intended performance gain when you implement Fat Keys.

    Threaded Messages (52)

  2. Hi Gene,
    what's the difference between "Fat Key" and the db-is-shared deployment parameter which is provided by Weblogic?

    best regards
    Darek Cebernik
  3. Hi Darek,

    <db-is-shared> is an EXTREME form of WL's cache optimization. If this flag is set to false, WL assumes only a single client (itself) uses the underlying db, allowing it to call ejbLoad only once per primary key during the entire lifetime of the server instance.

    Fat Key optimizes calls to ejbFindByXXX and ejbLoad for BMP, but only WITHIN a single transaction context.

    Hope that answers your question!

    Gene
  4. Fat Key Pattern: BMP Bulk Load and Data Cache

    The same article was written in MSDN review (1997, I think). It was an application with data objects with
    auto expense properties...
    4 years ago ...
  5. Fat Key Pattern: BMP Bulk Load and Data Cache

    The same article was written in MSDN review (1997, I think). It was an application with data objects with

    auto expense properties...

    I don't think it was "the same article", since I don't remember ever writing for MSDN review, nor do I believe MSDN deals with Java, nor did ejbs exist back in 1997! ;-)

    If you meant an article with similar caching concepts was written 4 years ago, then I don't doubt that at all! I am in no way claiming that my concept is original. Like what Mark Grand's Patterns in Java is to GoF's Design Patterns, the purpose of my exposition is to apply specific implementation to abstract patterns.

    Gene
  6. Hi Gene,

    That's a very good pattern.
    You said that you posted it on ejb-interest at java dot sun dot com.
    But I think the proper place for this pattern to stay is in the EJB spec itself!
    So would you try to send your comments to ejb-spec-comments at sun dot com ?

    What happens is that they made a wrong decision, that is they optimized for the least common case.

    Normally, in common applications, when you search the database for objects on specific criteria you almost always retrieve the whole data, and only in very rare circumstances do you retrieve only the primary key.
    At least I never did it this way.
    But because they put it in the spec, now it looks like it is a good pattern to make database roundtrips.

    What you did looks to me like the pattern that I define "design a workaround for a bad design".
  7. Hi Costin,

    > What you did looks to me like the pattern that I define "design a workaround for a bad design".

    I wholeheartedly agree. If one goes through the trouble of constructing a complex sql query in BMP (or EQL WHERE clause for CMP) to invoke a finder method, I think it's pretty safe s/he wants EVERY bean returned to be fully loaded and ready for use/examination.

    Hence I think there needs to be an ejbFinderLoad or ejbBulkLoad as well as an ejbLoad. We still need fine-grained ejbLoad for out-of-transaction business method calls, but ejbBulkLoad would be invoked in conjunction with ejbFindByXXX to do exactly what the method name describes. My BMP FatKey pattern is an intermediate solution until a better architecture springs up!

    Gene
  8. I think it's pretty safe s/he wants EVERY bean returned to be fully loaded and ready for use/examination


    Even if the client doesn't iterate through the whole collection, the pattern is still safe, much safer than you thought initially.

    Then there's no need, I think, for "ejbFinderLoad / ejbBulkLoad", because there really is no other way to reach an entity bean but through ejbFindByPrimaryKey or ejbFindByXXX, so the simplest fix is to make both of them return instances/collections of the full entities.

    The drawback is that the app server will not be able to choose a lazy instantiation policy.
    This means the rare case where the client inspects only some of the references returned by calling ...Home.findByXXX(...), and completes the transaction.
    However this is the rarest case, for which the wise guys from Sun chose to optimise.

    The correct decision is always to optimise for the common case, so if they feel like they did a smart job, they could have an ejbFindPKByXXX, mapped to findRefByXXX on the Home interface, which would use a lazy instantiation technique leading to the N+1 database calls disaster.
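
    The "N+1 database calls" cost can be made concrete with a toy model. A minimal sketch, assuming a hypothetical in-memory FakeTable that only counts queries (no real JDBC or EJB container is involved, and all class and method names are made up for illustration):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory table that counts how many queries it serves.
class FakeTable {
    private final Map<Integer, String> rows = new LinkedHashMap<>();
    int queryCount = 0;

    FakeTable(int size) {
        for (int id = 1; id <= size; id++) {
            rows.put(id, "row-" + id);
        }
    }

    // Models "SELECT id FROM t": a slim finder that returns keys only.
    List<Integer> selectIds() {
        queryCount++;
        return new ArrayList<>(rows.keySet());
    }

    // Models "SELECT data1 FROM t WHERE id = ?": one ejbLoad-style hit.
    String selectRow(int id) {
        queryCount++;
        return rows.get(id);
    }

    // Models "SELECT id, data1 FROM t": a fat finder, full rows in one hit.
    Map<Integer, String> selectAllRows() {
        queryCount++;
        return new LinkedHashMap<>(rows);
    }
}

class QueryCountDemo {
    // Slim keys: one finder query plus one load per bean.
    static int slimFinderQueries(int beans) {
        FakeTable table = new FakeTable(beans);
        for (int id : table.selectIds()) {
            table.selectRow(id);
        }
        return table.queryCount;
    }

    // Fat keys: the finder query already carries every row.
    static int fatFinderQueries(int beans) {
        FakeTable table = new FakeTable(beans);
        table.selectAllRows();
        return table.queryCount;
    }

    public static void main(String[] args) {
        System.out.println("slim: " + slimFinderQueries(100)); // 101 queries
        System.out.println("fat:  " + fatFinderQueries(100));  // 1 query
    }
}
```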

  9. It seems that your simplest fix won't work, because of the need to instantiate entities from within the entity, which is not a problem when returning a single one (you can return this) but is a problem when returning a collection of them. An alternative would be to use a separate object which implements the home factory methods.

    regards
      Evgen
  10. <quote>Finally, this pattern works under the assumption that a both the initial findByXXX call and subsequent business method call(s) all occur within a single transaction.</quote>

    You need to use size 96 font for this statement! All bets are off if you don't wrap the finder method and subsequent business method calls in the same method in the facade session bean.

    Also, how come your findByXXX method returns a PrimaryKey object, not a Collection? I assume it is a typo.
  11. Eric,

    I think your criticism is a bit unfair.
    The typo is absolutely forgivable and it happens to all of us from time to time.

    ejbFindByXXX() and, later on, setEntityContext( ec /* with the bulk PK */) and ejbLoad() will happen in the same transaction context.
    And it works also for ejbFindByPrimaryKey().

    Always happens this way, if you read the specs more carefully :)
    It doesn't matter what pattern or anti-pattern you use.

  12. Sure. These methods you listed are executed in the same TX context, but not any business method. If you call a business method in a different TX context (i.e., from a different method in the facade session bean) than the original one in which the finder method is called, there is no guarantee that the data retrieved by the finder method and cached in the entity bean is in sync with the database, and the ejbLoad() call that precedes the business method execution has to hit the database, not the cache.
  13. If you have a handle to the same Entity in a different Transaction Context, the problem is: how did you get it?

    If you think of it, a decent application server will always call ejbFindByPrimaryKey or ejbFindByXXX... in a transaction context before handing you any form of reference to an Entity (we're talking about existing entities, not new entities created by ejbCreate(...)).

    If they have another mechanism, they risk breaking the transactional semantics of the database (after all, another transaction may delete the object).

    One case that I can think of, where the app server will have a chance not to go through the ejbFind... is when you serialize an EJBHandle.
    Even in this case, the app server should always call ejbFindByPrimaryKey before allowing you to get the reference to the remote object or before allowing you to call any business method.

    So I would be curious if you can find an example where an app server will cache the Primary Key, let you have it via setEntityContext() and call ejbLoad() therefore skipping ejbFind... .

    Even if there is an app server that does this, the pattern can be adjusted by simple things:
       - either set Primary Key's additional data to null in ejbLoad ()
       - or store a transient handle to the UserTransaction in the Primary Key, and check for it.

    Still, the problem that is addressed by the pattern is very real and is one of the several flaws in the EJB design.

  14. The EJB container does not call ejbFindXXX, the client does, before any entity bean business method can be called. The business method can be called in the same client (facade session bean) TX context as the finder, and here the "Fat Key pattern" can obviously avoid a database trip. However, if the client called ejbFindXXX in one method, saved the remote interface in its instance variable, and invoked the business method on the remote object reference in another TX context, then you run the risk of having data out-of-sync.
  15. Eric,

    As I told you it depends on the app server implementation, also on the design of the client.

    The session bean for instance, has no business of keeping references to an entity outside a transaction context.
    It's like in old client-server days where you know that data outside transaction scope has no value.

    But a sure way to prevent this from happening is to slightly modify the pattern:

    you declare a
    transient UserTransaction transaction;
     in the Primary Key class and set it in the ejbFind...

    Then in ejbLoad() you compare it with the current UserTransaction and decide to reload data from the database.

    So, now the pattern covers everything
  16. I'm reading through the forum, so sorry if my posts are broken into pieces :)
    Again, I'd like to point out a few errors in this message:
    With non-transactional clients, each business method call is independent. You don't have any such thing as a session. I assure you that all EJB servers will not call ejbFindByPrimaryKey before each business method. This is simply a violation of spec rules.
    There also seems to be some confusion regarding the use of setEntityContext. setEntityContext is called only once, and with no transaction context. I suppose you meant ejbActivate.
    A UserTransaction object is not available for entity beans at all. It can only be used in session beans with bean-managed transactions. Besides, one UserTransaction object may represent many transactions. UserTransaction is, in fact, a singleton. That is, in most EJB servers and in the JTA spec.
    As for the second correction option, it is possible (with some changes) and I will cover how I think it can be implemented in a later post.

    Gal
  17. Hi Eric,

    > If you call a business method in a different TX context (i.e., from a different method in the facade session bean)other than the original one in which the finder method is called...

    But like I said in my pattern, use a Stateless Session Bean Facade (I'm not a big believer in Stateful Session Beans anyway, but that's a different story!). So a client request can only involve a single SSB method, and there's no "entity bean leakage", as I like to put it. IF you do need to "leak" an entity to the client, such as storing it in HttpSession etc., then leak the primary keys. Subsequent client requests would pass these keys back to the SSB, which would involve a findByPK lookup all over again. As long as you are careful and remain consistent, you won't experience dirty data!

    And thanks for catching my PrimaryKey ejbFindByXXX typo; it was indeed late in the night! ;-)

    Gene
  18. Gene:

    I am not against the pattern at all - I think it is useful for a well-defined situation, namely, an entity bean business method is always called within the same TX context as the finder call. I just want to highlight the important statement you made.
  19. However, if the client called ejbFindXXX in one method, saved the remote interface in its instance variable, and invoked the business method on the remote object reference in another TX context, then you run the risk of having data out-of-sync.


    This is one thing about the EJB spec that worries me: it allows one to find and save an EJBObject reference in one txn and invoke its business method in a subsequent txn. Even without my Fat Key pattern this sounds like a bad idea, because what if this bean/row was deleted by another client in between the 2 transactions? Then when the first client invokes a business method in the second txn, I believe the container will call ejbLoad and throw an ObjectNotFoundException (at least that's what WL does). I can guarantee you most of us do NOT expect to handle ONFE when invoking a business method!

    Hence whenever I need to save an EJBObject reference for possible inter-transaction usage, I always store either its Primary Key or getHandle(). If I store the PK, I will need to call findByPK in a subsequent invocation. If I store the Handle, I will need to call Handle.getEJBObject() (and I think it's HERE where Costin hints a server may call ejbFindByPK on the PK serialized within the Handle to retrieve the EJBObject). In either case, I will be prepared to catch and handle a FinderException if this bean was deleted by another client.

    Costin, tying UserTxn to the FatKey is a good way to track transactions and prevent dirty data, but since I'm using WL Container Managed Txn, exposing UserTransaction would involve the WL API. I think if one always stores Handle or PK references instead of EJBObjects, txn tracking would not be necessary at all.

    Finally, Eric, I don't mind your constructive criticisms at all; they just spur further intellectual discussions which benefit everyone.
  20. Gene, getting the transaction will always work.

    You get it from the context object.
    context.getUserTransaction()

    It will work for any compliant app server

  21. OK, guys. Now the pattern seems to be complete. The exact circumstance under which the pattern can be safely used is properly defined, and more logic is added to make the pattern more robust. This whole practice really is Extreme Programming at its best.
  22. Unfortunately EJBContext.getUserTransaction works only for BMT Session Beans. If you tried this in entity or CMT beans, it will throw an IllegalStateException.

    http://java.sun.com/j2ee/j2sdkee/techdocs/api/javax/ejb/EJBContext.html#getUserTransaction()

    Like I said previously, WL does provide "workaround" accessor:

    weblogic.transaction.utils.TxUtils.getTransaction()

    But this will break portability.

    I believe my pattern, as it is, will never result in dirty data because of the way Weblogic's primary key cache works: it is transaction specific, and is either reset or cleared once a transaction is over. Since my Fat Key is stored in this cache, its lifespan will last just that of the txn. But once again, this would also make my pattern non-portable because it depends on WL caching behavior. But I'm sure with some slight modifications this pattern can be coaxed to work on other servers.


  23. You're right.

    Then it is useful to set the Primary Key's data to null just before ejbLoad() exits, and test for null when you enter ejbLoad().

    I think in this case you can't have any other problem. The only problem would be if some app server out there would be able to use lazy instantiation techniques so it would manufacture Remote References or EJBHandle solely based on primary keys, deferring the ejbLoad() until the first call by a client.
    And if the client doesn't call any method on the instance , saves the handle and calls it later in another transaction context, then we've got a problem.

    However this is a very unusual case, so is always better to optimize for the common case.

    On the other hand, the data is transient, so it can only be kept live and cannot be serialized; the only place an app server would be able to let you keep dirty data is in the
    first primary key you created, probably in a hashmap.
    So the PK exists while the object does not, but a smart reference that will trigger the activation of the object exists.

    While I was thinking about how to solve this case I found another gray area of the EJB Spec: developers are not mandated to override Object.equals() in the PK class, they are only recommended to do so.


  24. Hi there.
    I'd like to point out that if you read the specs more carefully, you will see that setEntityContext executes with an *unspecified transaction context*.
    Also note that in most cases, the client is not what EJB calls a "transactional client" and so the container *must* always first commit the finder transaction, then start a new transaction for each business method.
    As stated in the former message, one common exception to this is when wrapping the bean with a facade session bean.

    Gal
  25. Hi
    Could u please mail me code for a sample bean (BMP) using this pattern along with the DataManager class too.
    thanking u in anticipation
    Neeraj...
  26. sorry my mail id is
    neeraj_nargund at yahoo dot com
    thanx
  27. Hi there :)
    I already posted a couple of corrections/comments and I didn't get a chance to say I think this is a very useful pattern. I was only trying to make sure people who use it won't get *surprising* results :)
    Here is a slightly different implementation strategy for this pattern that I think would comply better with the specs:
    Load the bean data in the finder methods into transient fields, just as you do now.
    In ejbActivate(), check the PrimaryKey class to see if the data field is not null. ejbActivate is called whenever an entity instance is associated with an "entity". This is where you want to use your preloaded entity data. If the primary key has a non-empty data field, the finder must have been called in the same transaction context. Otherwise, the EJB server will have gotten the PK from the client, meaning it got a serialized form which doesn't have this field.
    If you did load the data from the PK, turn a "usedCache" flag on.
    In ejbLoad, check if the "usedCache" flag is on. If it is, then you are currently up-to-date. Turn off the flag and return.
    If the flag is turned off, simply load data. The container may have reasons to call ejbLoad twice or more in the same transaction context, and you wouldn't want to interfere with it. Some containers use multiple ejbLoad calls to deal with local diamonds, for instance.
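
    The steps above can be sketched in plain Java. This is a minimal model under stated assumptions: ActivationBean, ActivationKey and CachedRow are hypothetical names, the container is simulated by the caller, activate() and load() stand in for ejbActivate() and ejbLoad(), and the "database" is a string built in place.

```java
// Hypothetical value object and key, standing in for BeanData and PrimaryKey.
class CachedRow {
    final String data1;
    CachedRow(String data1) { this.data1 = data1; }
}

class ActivationKey {
    final int id;
    transient CachedRow row; // filled by the finder, absent after serialization
    ActivationKey(int id) { this.id = id; }
}

// Plain-Java model of the bean; no javax.ejb types are involved.
class ActivationBean {
    private ActivationKey key; // what getPrimaryKey() would return
    private CachedRow row;
    private boolean usedCache;
    int databaseLoads = 0;

    // Stand-in for ejbActivate(): the instance receives its identity.
    void activate(ActivationKey key) {
        this.key = key;
        if (key.row != null) {
            row = key.row;    // the finder left a row on the key: reuse it
            usedCache = true;
        }
    }

    // Stand-in for ejbLoad().
    void load() {
        if (usedCache) {
            usedCache = false; // first load after activation is already fresh
            return;
        }
        row = new CachedRow("db-row-" + key.id); // simulated SELECT
        databaseLoads++;
    }

    String getData1() { return row.data1; }
}

class ActivationDemo {
    // Runs activate() followed by `loads` calls to load(); returns database hits.
    static int databaseLoadsFor(ActivationKey key, int loads) {
        ActivationBean bean = new ActivationBean();
        bean.activate(key);
        for (int i = 0; i < loads; i++) {
            bean.load();
        }
        return bean.databaseLoads;
    }

    public static void main(String[] args) {
        ActivationKey fat = new ActivationKey(1);
        fat.row = new CachedRow("finder-row");
        System.out.println(databaseLoadsFor(fat, 1));                  // 0
        System.out.println(databaseLoadsFor(fat, 2));                  // 1
        System.out.println(databaseLoadsFor(new ActivationKey(2), 1)); // 1
    }
}
```

    Only the first load after an activation with a fat key is skipped; every later load, and every load after activation with a slim key, goes to the simulated database.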

    This approach depends on implementation only for the optimization part. It will always work, though it might not be as optimized with different implementation.
    Also, beans written in this manner can be located using a handle, can be used without a single transaction context for finder and methods, and can generally do anything the spec supports.

    Any comments are welcome :) Take it easy.

    Gal
  28. Hi Gal,

    I see your concern, but I wonder why you need to involve ejbActivate or have a "usedCache" flag?

    In ejbLoad, just check if the pk's transient reference to ValueMap exists: 1) If it exists, load the bean with this map and set the reference to null. 2) If it is null, load the old-fashioned way. I've had this condition check in my original pattern posted 5 months ago, I guess I left it out here... Besides, I think Costin filled me in by proposing this check in one of his posts.
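
    The two branches can be sketched without a container. A minimal sketch, assuming hypothetical classes ConsumableKey, KeyData and ConsumingBean, a load() method standing in for ejbLoad(), and a container that hands the bean the very same key instance the finder produced (which the spec does not promise):

```java
// Hypothetical value object and key for this sketch only.
class KeyData {
    final String data1;
    KeyData(String data1) { this.data1 = data1; }
}

class ConsumableKey {
    final int id;
    transient KeyData data; // set by the finder, cleared by the first load
    ConsumableKey(int id) { this.id = id; }
}

class ConsumingBean {
    private final ConsumableKey key;
    private KeyData data;
    int databaseLoads = 0;

    ConsumingBean(ConsumableKey key) { this.key = key; }

    // Stand-in for ejbLoad().
    void load() {
        if (key.data != null) {
            data = key.data;  // 1) take the row the finder already fetched
            key.data = null;  //    and consume it so it cannot go stale
        } else {
            data = new KeyData("db-row-" + key.id); // 2) simulated SELECT
            databaseLoads++;
        }
    }

    String getData1() { return data.data1; }

    public static void main(String[] args) {
        ConsumableKey key = new ConsumableKey(5);
        key.data = new KeyData("finder-row");
        ConsumingBean bean = new ConsumingBean(key);

        bean.load();
        System.out.println(bean.getData1() + ", db loads: " + bean.databaseLoads);
        bean.load();
        System.out.println(bean.getData1() + ", db loads: " + bean.databaseLoads);
    }
}
```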

    Yes, this pattern is absolutely useless outside a transaction scope. But then again, one should not expect ANY caching/bulk-loading pattern to work outside of a transaction scope, unless you don't mind dirty data or are willing to invest in a JMS cache-refresh mechanism! ;-) For this exact reason, Weblogic's CMP cache implementation also works only within a txn.

  29. Hi.
    I will first list a note on methodology and then a couple of practical problems with the suggested implementation.

    In my experience, the best methodology to use when writing "patches" that were not intended by the spec is trying to keep as many spec rules as possible. This is generally true, and especially true with EJBs, which are highly managed components.
    Although your implementation keeps all the "plain" spec rules, it does break the semantic use of the callback methods. You intend to load the data once, after the bean has been located. I.e, when the bean is being activated. That is why I think that semantically, the loading should be performed in ejbActivate.

    As for practical problems with the existing spec, here are a few:
    1. You cannot assume that the EJB container passes you a direct reference to the bean's primary key. The container needs to be sure that its copy of the PK doesn't change, and since it cannot be sure you don't have setters that alter the PK, it might pass you a replica. Actually, since there is no place in the spec that directly forbids modifying the result of EntityContext.getPrimaryKey() (to my knowledge), a fully compliant spec *should* give you a copy. Of course, this would cause the value object reference to never be set to null.
    2. Consider the following scenario:
    Client A calls findByPrimaryKey and starts invoking business methods.
    Client B calls findByPrimaryKey to find the same bean.
    Client A calls another method.
    If all bean instances share the same primary key in the container (which is probable), and the container calls ejbLoad on client A's instance after client B called the finder, ejbLoad will see a non-null reference. It will then take the data from the PK, which is the data as seen by client B's transaction, resulting in corruption of data.
    3. Again, assuming the container keeps only one instance of the PK for multiple bean instances representing the same entity, a race condition can occur. If two ejbLoads are running concurrently when the value object reference is not null, then it is possible for both of the if statements (the ones verifying the reference isn't null) to return true. Then one thread will set the PK field to null and the other will attempt loading the cache from the null field. This may not happen (but could happen) if the field is not volatile, and can be prevented by synchronizing the accessors (which I did not see in your implementation).

    Note that the two latter problems are only true for containers using multiple bean instances to provide shared access to a bean (usually commit options B and C).
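
    The race in point 3 comes from checking the field and clearing it in two separate steps. A minimal sketch of the synchronized-accessor remedy, assuming a hypothetical SharedKey class whose takeData() method does the check and the clear as one step; it addresses only the race, not the cross-transaction visibility issue in point 2:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical key shared by several bean instances.
class SharedKey {
    private Object data;

    SharedKey(Object data) { this.data = data; }

    // Check-and-clear as one synchronized step: at most one caller gets the data.
    synchronized Object takeData() {
        Object taken = data;
        data = null;
        return taken;
    }
}

class SharedKeyRaceDemo {
    // Starts threadCount threads that all try to take the data at once and
    // returns how many of them received a non-null value.
    static int winners(int threadCount) {
        SharedKey key = new SharedKey("finder-row");
        AtomicInteger winners = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] threads = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                try {
                    start.await(); // line all threads up on the same gate
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (key.takeData() != null) {
                    winners.incrementAndGet();
                }
            });
            threads[i].start();
        }

        start.countDown();
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return winners.get();
    }

    public static void main(String[] args) {
        System.out.println("threads that got the cached row: " + winners(8));
    }
}
```

    Every other thread sees null from takeData() and would fall back to loading from the database, so no thread ever reads a field that another thread has just cleared.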

    I'm sure there are other such cases, but the point is that when you go out-of-spec, things can very easily go wrong. That's why I think you should try to preserve the spec's semantics as much as you can.
    by the way, I also think using ejbActivate leads to a more readable code, and makes the use of the pattern clearer.

    Best regards
    Gal
  30. Gal,

    For the app server to give you a copy of the Primary Key, it would have to serialize and deserialize it, since the specification doesn't require the PK to be cloneable.

    This is an effort I think no App Server would undertake. The EJB spec is already too much overhead.


    Besides:
    1.)

    - it's stupid to modify the primary key; once one does it, he's doomed anyway, and he can't put the blame on the App Server.
    - the spec doesn't prevent that
    - it doesn't need to be sure that the PK doesn't change, it needs only to be sure that equals() and hashCode() behave consistently, and this is a job left by the spec to the developer.

    Therefore a fully compliant spec need not give you a copy.
    I'd be very much surprised if they do.


    2. and 3.) Client A and client B calling the findByPrimaryKey, are in different transactions.
    So in general, application server need to provide separate transactions with totally separated data.

    The pattern is built on the assumption that the PK returned in ejbFindByPrimaryKey() will be the PK returned in the EntityContext, which is a natural assumption, but not granted by the specs (although they would better do that).
    If the app server doesn't honour this, then the pattern doesn't do anything.
    If the app server is honouring this assumption, but is also able to mix PK objects between transactions, well, you're right, the pattern is doomed (or you can go back to the transaction ids issue).

    Last, the ejbActivate is executed in unspecified transaction context, and it is guaranteed to be called only in passivation/activation scenario.
    I'm not sure even if you are guaranteed to find the primary key there! Anyway activation/passivation is best left alone with its controversial issues.

    Overall, I think the pattern is still valuable, because the n+1 database calls generated by the existing spec is pure and simple an anti-pattern.

    As to what regards the spec semantics, I'll quote again from SUN:

    "Workaround

    As a Java programmer, you have to assume that the specification
    is meaningless."
  31. 1) I did not say App servers will *do* that. I said the spec doesn't prevent them from doing that, and I don't see any reason for implementing something which is not compatible with the spec unless you have to. It's simply a matter of methodology. Violate the spec enough times, and you'll bump into incompatibilities.
    Besides, I don't think you can safely cover all the implementation strategies and say servers will never pass you a copy. Maybe in a cluster, the server uses some central repository to direct requests to branches where the entity is already loaded, and serializes the PK from the repository to the branch when invoking. Or god knows what.

    "it doesn't need to be sure that the PK doesn't change, it needs to only to be sure that the equals() and hashCode() behave consistently, and this is a job left by the spec to the developer."
    If you want to go to such formal lines, equals is defined to return true for two semantically identical objects. The container can't be sure that after any modifications you may have made, the objects are semantically identical.

    2,3) There is absolutely no problem with mixing the PK instances. The container has the assurance that they will not change anyway.
    If the container uses something like a Hashtable to store references to bean instances, it'll probably pass both beans the same PK. Again, I don't see why you have to take that risk.
    As I mentioned before, the transaction IDs solution is not good either. First of all, you can't do it in the entity. And even if you do it in a session bean, UserTransaction can be (and probably will be) a Singleton. This means you cannot use it as a reference to the transaction. What you want is the Transaction object itself, which is inaccessible for session beans.
    However, I don't think that this pattern is doomed. I think this implementation is doomed, but different implementations can implement this pattern.

    ejbActivate is called in an activation scenario, which means the picking of an instance from the pool and the assignment of an entity identity to it. The primary key is available from EntityContext there, as you can see in Table 4, section 9.1.6 of the EJB1.1 specification.
    IMHO, there are no "controversial" issues with ejbActivate. I have never heard of such an issue, and can't see how one could exist (considering the spec). If there is a specific issue to which you are referring, please list it so we can all know what to look out for in the future.

    I definitely agree with you on the last paragraph. The spec's efficiency is a very big issue, and this pattern does a good job of working around it. I'm merely suggesting a different implementation strategy, while it's obvious that the idea of the pattern remains the same.

    Regards
    Gal
  32. 1. You didn't get my point.
       If the app server does use serialization (although within the same VM that would be stupid), the pattern is absolutely safe, because the cached data is transient.
       If it does not, then it has no way of getting a copy.
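
    The serialization argument above can be checked with a minimal sketch. The FatKey class below is hypothetical (its field names and layout are assumptions, not the pattern author's code): identity rests on the id alone and the row data is marked transient, so a copy made through standard Java serialization is still equal to the original but arrives without the fat data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical fat key: identity is the id only; the row is a transient passenger.
final class FatKey implements Serializable {
    private static final long serialVersionUID = 1L;

    final String id;
    transient Object[] row; // cached row data, skipped by Java serialization

    FatKey(String id, Object[] row) {
        this.id = id;
        this.row = row;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof FatKey && ((FatKey) other).id.equals(id);
    }

    @Override
    public int hashCode() {
        return id.hashCode();
    }
}

final class FatKeyCopyDemo {
    // Round-trips a key through serialization, as a container making a copy might.
    static FatKey serializedCopy(FatKey key) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(key);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (FatKey) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        FatKey original = new FatKey("42", new Object[] {"Smith", 100});
        FatKey copy = serializedCopy(original);
        System.out.println("equal keys: " + copy.equals(original));
        System.out.println("fat data dropped: " + (copy.row == null));
    }
}
```

    With a copy like this, ejbLoad would see a null row and would have to fall back to the database, which is slow but not stale.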

    About activation, you're right, the pattern works also in the ejbActivate.
    But I don't see how moving the code in ejbActivate is going to solve any problem.

    As regards the controversial issue with activation/passivation: it doesn't work for its intended purpose.
    If you think of activation/passivation only as an object recycling feature (initialize/clear as opposed to constructor/finalizer), it works OK.

    The problem is when the spec sells us the story that under heavy load the container might passivate the EJB and reactivate it when another call comes, thus saving memory and so on and so forth.
    This doesn't work. Just IMHO. If you want, maybe we can have a separate discussion, although theserverside.com doesn't have an anti-pattern section yet.

    What is also wrong in the spec on this issue is that it says an Entity Bean (referring here to the PK identity) has a pool of instances, suggesting that there could be several instances all corresponding to the same entity.

    Luckily, both these issues are probably avoided by good app servers. But they look quite ugly in the spec.

    In the end, the best solution is that the pattern should be incorporated in the spec itself.
    My interest in the subject is purely theoretical because practically, I'm very happy to stay away from the EJB issues, until Sun hopefully manages to get them right.

    Maybe Gene cares to send it to ejb-comments at sun dot com.


  33. I think we can all agree that "optimization" and "standardization" are almost always diametrically opposing goals. Vendors are always guilty of striving towards the former while dismissing the latter. I'll give 2 examples, as there are many more:

    1) Weblogic (and some others) by default use parameter pass-by-reference in EJB calls within its container. This is in blatant violation of the RMI specs, which state that ALL calls need to be pass-by-value. Yet we not only tolerate but appreciate this optimization because it's a time and resource saver!

    2) Borland App Server uses GIOP byte chunk serialization for doing PrimaryKey equals and hashCode comparisons; users are not allowed to override these functions at all. Advanced users may gasp at this irreverence to the EJB, if not Java, specs, but beginners may find not having to deal with complex hashCode algorithms a blessing.

    Hence if vendors are at liberty to overlook the specs every once in a while to produce the fastest and most optimized containers, I don't see why we bean creators can't do the same.

    Bean-portability is a holy grail, just like Java's original mantra of "write-once-run-everywhere". I'm sure we all have optimizations in our beans that cater to the container it was developed and tested in. We implement these optimizations because they are practical and germane to the project.

    I say my Fat Key is relevant because it works on Weblogic, which currently holds a significant market share over other vendors, and because my pattern CAN be tweaked slightly to work on other vendors. And I'm glad my pattern is spurring all these wonderful debates; in the end we are all owners of the EJB specs, and their evolution is directly affected by our contributions.

  34. "in the end we are all owners of the EJB specs"
    Maybe indirectly, if you hold Sun's stock.
    Which is not an enviable position anyway.

    Just kidding. But I would be very curious what happens if you send this pattern to Sun.

    Why don't you do that?
  35. Hey Gene,

    I've used your Fat Key Pattern in a system I'm developing. The pattern seems to work quite well, but I am having one problem with it. The first time I call findByPrimaryKey() everything works fine: the PK object is created with the data and I can extract the data in ejbLoad(). The problem occurs when I call findByPrimaryKey again later on. The method correctly retrieves the data from the database and populates the fat PK. However, when ejbLoad is called again, it obtains a reference to the original PK object! So I've now got the old data for the entity bean. From what I can work out, it seems that WebLogic caches the PK object in its PK cache on the first finder call, and then on subsequent calls to another finder it realises that it already has that PK object in its cache, so it does not replace its cached value with the newly created PK object.

    Have you noticed this happening to you?
    Do you have a workaround for this?

    I'm using WebLogic 5.1

    Cheers,
    Keith.
  36. Hi Keith,

    That's right, as of WL 5.1 sp8 or sp9, BMP primary keys are finally cached like CMP pks for finders. Hence, to prevent stale data, you will need to sever the ValueObject from, or "skinny-fy", the FatKey after an ejbLoad call.

    Furthermore, I believe that even with the new WL BMP cache, ejbFindByPrimaryKey will still be called at the beginning of each new txn. This is to ensure the row itself isn't removed by another server in the cluster in between txns. Hence if you return a newly refreshed FatKey for each ejbFindByPrimaryKey, WL should stick this new key in its cache.

    Gene
  37. Thanks for replying Gene.

    What you said is not happening. We don't call ejbFindByPrimaryKey, we've got another finder method that returns a collection of primary keys. We populate these primary keys with the fat data, but these primary keys are NOT being replaced by WebLogic. Hence when ejbLoad is called we get the previously cached primary key object, which has already been "skinny-fied". Have you got any ideas for a solution?

    Cheers,
    Keith.
  38. Seems like CR29007, introduced in sp9, a request many people, including myself, asked for, has ironically nullified my pattern!

    http://www.weblogic.com/docs51/classdocs/README2.html#CR29007

    Not only does the container now optimize the BMP cache for findByPK, but it also no longer switches the primary keys in the cache after a finder method! Hence with this new cache behavior, my pattern is not necessarily broken, but just not optimized after the initial finder call.
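
    One way a cache can end up keeping the first key instance is shown in the sketch below. It assumes the key cache behaves like a plain java.util.HashMap keyed by the PK, which is only an assumption for illustration and not a statement about WebLogic's internals; RowKey and KeyCacheSketch are hypothetical names. When put() is called with a key equal to one already present, HashMap rewrites the value but keeps the original key object, so the older fat data stays in the cache.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical fat key: equality and hash use the id only, so two keys for
// the same row are equal even when they carry different row snapshots.
final class RowKey {
    final String id;
    final String snapshot; // fat data, ignored by equals/hashCode

    RowKey(String id, String snapshot) {
        this.id = id;
        this.snapshot = snapshot;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof RowKey && ((RowKey) other).id.equals(id);
    }

    @Override
    public int hashCode() {
        return id.hashCode();
    }
}

final class KeyCacheSketch {
    // Finds the key instance the map is really holding for an equal probe key.
    static RowKey heldKey(Map<RowKey, String> cache, RowKey probe) {
        for (RowKey key : cache.keySet()) {
            if (key.equals(probe)) {
                return key;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<RowKey, String> cache = new HashMap<>();
        RowKey first = new RowKey("42", "balance=100");
        RowKey second = new RowKey("42", "balance=250");

        cache.put(first, "bean");
        cache.put(second, "bean"); // equal key: the value is rewritten, the key object is kept

        RowKey held = heldKey(cache, second);
        System.out.println("cache still holds first key: " + (held == first));
        System.out.println("snapshot seen by a later load: " + held.snapshot);
    }
}
```

    If a container works along these lines, refreshing the fat data means mutating the cached key object or removing the entry first, not just returning a new equal key from the finder.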

    I am currently evaluating 6.1 and porting over our application. I am pretty sure this version has a new cache mechanism, and I will see what modifications I will need to make with my pattern to comply.

    My primary goal with this pattern is to allow BMP developers the same caching luxury as CMP developers. However, if CMP 2.0 and its vendor implementations deliver what they promise, most BMP'ers will migrate to CMP and allow the container to worry about performance issues. Until then, I will continue to fine-tune this pattern, so stay tuned!

    Gene
  39. Gene:

    Did you ever get the chance to look into using your pattern with WebLogic 6.1? I would like to use it for my current project but want to make sure I know all the traps, if there are any. Your insight on this is greatly appreciated.
  40. I'm currently developing purely SLSBs and MDB/JMS on WL 6.1, so I have not tested entity beans of any sort on this version. I would suppose the BMP PK cache hasn't changed much between 5.1 and 6.1, since most of the development attention was focused on CMP 2.0; hence my pattern should still work for 6.1 BMP.

    I'm afraid you will have to be the 6.1 beta guy for my pattern! :-) Let me know how it goes.
  41. I think that if the container is caching pks by way of serialization/deserialization, which means that your cached fat-key data is always nulled when pks are reused, then you should have no problem.
    Otherwise, some simulation that nulls pks before they are reused should be implemented.

    Another point about the pattern: it is a great pattern, but I see that it is trying to improve things that could always be done better from the container's side.
    What do I mean?
    The preface of this pattern says that the container can't have this behaviour by default because it has no control over the sqls. That's OK. But if we consider that most of the benefit comes from reducing the transactional overhead, then (always from the container's perspective) nothing is stopping the container from executing an ejbLoad() behind the original findByXXX() transaction.
  42. Gene said: "Hence if you return a newly refreshed FatKey for each ejbFindByPrimaryKey, WL should stick this new key in its cache."

    That is not correct in WLS sp8. You can compare the pk in ejbLoad() and ejbFind(): they are different, unless you change equals() to include the beanData.
    It works because the last ejbStore() will refresh the pk. But in a clustered server it will break. With DB-is-shared set to true, ejbFind() is called, but in ejbLoad() the pk is actually a different one, not the one you populated in ejbFind(). So the data will be stale.

    And can you tell me how many times this pattern hits the DB for ejbStore() if findByXXX() initially returns n rows of data and all n rows are modified in one Tx?
    Can you post the ejbStore() of your pattern? Do you still have a dataManager for that? It seems you are mixing this with the DAO pattern?

    minjiang
  43. We have also used Fat Key with WebSphere 4.0 and are facing the same out-of-sync problem because the container does not replace the "new fat keys" in its cache. As a result, we keep getting the old data even though it has changed in the database.

    I think setting the "fat key data" to null will solve this problem, but it will then take away all the benefits of fat keys. Any future finder methods that are supposed to return a collection will again go through the inefficient Entity Bean mechanism of going to the database over and over again for each row.

    And, if we keep in mind that the fat key will work only for the first "client" then in a high-traffic application, it will very quickly become useless.

    If there were some way of refreshing the primary key object, this would definitely solve our problem.

    Any thoughts?
  44. I implemented a similar pattern on my own. The major difference with my implementation is that I expose the data directly to the client via the primary key class; i.e. the data is not transient and it has a public accessor method. I do not use the data if/when the bean is loaded, it always goes back to the database. So I not only avoid the overhead of n database calls, but also n calls to the EJB server, as all the data is returned in one shot back to the client. Another optimization I've made is the client is able to ask for specific fields (the default is none). The only drawback I can think of is the added complexity of the API to the client. It doesn't have the problem where the fat key always stays the same, at least on our container (JBoss 2.3 / 2.4).
  45. CMP optimizes the ejbFindByXXX and ejbLoad into a single operation. But you can easily do the same with primary key finds: just return the requested primary key from ejbFindByPrimaryKey and don't bother with database lookups. If the requested object is not found during the ejbLoad phase, simply throw a NoSuchEntityException to the container. This doesn't address the problem with ejbFindBySomeParameters but for menu/selection driven UIs the most common seeks are with primary keys anyway.
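
    A minimal sketch of the lazy finder described above, written in plain Java so it runs without an EJB jar. Everything here is a stand-in: AccountBeanSketch only imitates a BMP bean, assignIdentity() models the container handing the key to the instance, the Map models the database table, and RowMissingException takes the place of javax.ejb.NoSuchEntityException.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for javax.ejb.NoSuchEntityException, so the sketch has no EJB dependency.
final class RowMissingException extends RuntimeException {
    RowMissingException(String key) {
        super("no row for primary key " + key);
    }
}

final class AccountBeanSketch {
    private final Map<String, String> table; // hypothetical stand-in for the database
    private String primaryKey;
    String owner;
    int dbHits;

    AccountBeanSketch(Map<String, String> table) {
        this.table = table;
    }

    // Finder: no database access at all, the requested key is simply handed back.
    String ejbFindByPrimaryKey(String key) {
        return key;
    }

    // Models the container assigning an identity to the instance before ejbLoad.
    void assignIdentity(String key) {
        this.primaryKey = key;
    }

    // The only database hit; a missing row is reported here instead of in the finder.
    void ejbLoad() {
        dbHits++;
        String row = table.get(primaryKey);
        if (row == null) {
            throw new RowMissingException(primaryKey);
        }
        owner = row;
    }

    public static void main(String[] args) {
        Map<String, String> table = new HashMap<>();
        table.put("1", "alice");

        AccountBeanSketch bean = new AccountBeanSketch(table);
        bean.assignIdentity(bean.ejbFindByPrimaryKey("1"));
        bean.ejbLoad();
        System.out.println(bean.owner + " loaded with " + bean.dbHits + " database hit(s)");
    }
}
```

    The trade-off is that a lookup for a missing row no longer fails at find time; the caller only learns about it on the first business method call that triggers the load.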
  46. Iqbal, we have the FatKey Pattern working with WAS V4. You have to make sure that (esp. when using complex primary keys) the hashCode() method always returns the same value for the same primary key. For our application, having the FatKey Pattern working is a must, as we access backends (CICS and IMS) and cannot afford to access them twice (once in findByPrimaryKey and again in ejbLoad) to instantiate a BMP.
  47. I agree with wang minjiang about the need to refresh the pk in ejbStore(). When that is done, there is no longer any possibility of inconsistent data when we call a business method outside the transaction that ran the find. I tested it and found no drawback with this method (except for clusters), but I haven't tried it in production yet. Is there any problem I haven't seen?
  48. Simulate a nulling of the fat data after the first time (or before the second time).
    You can always force it by adding:
    ...................
    pk.fat_row =null;
    ...................

    at the end of ejbLoad() (better still, inside a finally clause).
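
    A minimal sketch of that one-shot behaviour, in plain Java with hypothetical names (FatRowKey, CustomerBeanSketch, assignIdentity and loadNameFromDatabase are all assumptions for illustration). The first ejbLoad consumes the fat row, the finally block clears it, and any later ejbLoad falls back to the database instead of reusing stale data.

```java
// Hypothetical fat key carrying one row; only the id is part of its identity.
final class FatRowKey {
    final String id;
    transient Object[] fatRow; // filled by a finder, consumed once by ejbLoad

    FatRowKey(String id, Object[] fatRow) {
        this.id = id;
        this.fatRow = fatRow;
    }
}

final class CustomerBeanSketch {
    private FatRowKey pk;
    String name;
    int dbHits;

    // Models the container handing the (possibly cached) key to the bean instance.
    void assignIdentity(FatRowKey key) {
        this.pk = key;
    }

    void ejbLoad() {
        try {
            if (pk.fatRow != null) {
                name = (String) pk.fatRow[0]; // use the row fetched by the finder
            } else {
                name = loadNameFromDatabase(pk.id); // key is skinny: do a real load
            }
        } finally {
            pk.fatRow = null; // the fat data is good for one load only
        }
    }

    // Stand-in for a real SELECT; counts hits so the effect is visible.
    private String loadNameFromDatabase(String id) {
        dbHits++;
        return "db-" + id;
    }

    public static void main(String[] args) {
        CustomerBeanSketch bean = new CustomerBeanSketch();
        bean.assignIdentity(new FatRowKey("7", new Object[] {"Ann"}));
        bean.ejbLoad();
        System.out.println(bean.name + ", database hits so far: " + bean.dbHits);
        bean.ejbLoad();
        System.out.println(bean.name + ", database hits so far: " + bean.dbHits);
    }
}
```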
     
  49. Gene,

    I had this idea in November 2000, and posted an article then in the "Performance and Scalability Forum" called "Cache-based solution to N+1 database calls problem". At first I thought about using the primary key to store the data (your "fat" primary key), but then I changed my mind and used a cache. My article was posted on November 28, 2000.

    I think it confirms this is a good idea.

    Best regards,

    Guilherme
  50. Hi Guilherme,

    I originally posted my FatKey idea in Sun's ejb newsgroup. Almost immediately I got a response from another person, Ana, who had also come up with a similar pattern on his (her?) own. As it turns out, this pattern was quite controversial and spawned a spate of interesting discussions among various bean and container developers. You can see all the posts here:

    http://archives.java.sun.com/cgi-bin/wa?S2=ejb-interest&q=FatKey&s=&f=&a=&b=

    This one big weakness of BMP has bitten many bean developers, and some, like you and me, have come up with an interim solution until the spec can be fixed so that this caching responsibility is pushed back to the container!

    Gene
  51. I realize the original posting was done a while ago but perhaps there is still some interest in this thread.

    Gene, thanks for posting the pattern, I find it very interesting. The pattern combined with the resulting discussions from everyone else made for very stimulating reading.

    I have a question that I was hoping that some of the previous posters could address. The question is, since it is so obvious that the BMP Entity Bean spec is flawed, and that the workaround presented, though thorough, might not be completely portable or might even 'break' as a result of application server vendors optimizing their container implementations, why use BMP at all?

    Instead of using BMP one could use Session Facades (using stateless Session Beans) and some derivation of the DAO / Value Object patterns. Using this approach one should be able to create a portable solution with a similar feature set of BMP.
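
    A rough sketch of that alternative, in plain Java. All names are hypothetical (CustomerValue, CustomerDao, InMemoryCustomerDao, CustomerFacadeSketch); the in-memory DAO stands in for a JDBC-backed one, and the facade would be a stateless session bean in a real deployment. The point it illustrates is that one DAO call returns every matching row as a value object, so there is no per-row load to optimize away.

```java
import java.util.ArrayList;
import java.util.List;

// Immutable value object carrying one row to the caller.
final class CustomerValue {
    final String id;
    final String city;

    CustomerValue(String id, String city) {
        this.id = id;
        this.city = city;
    }
}

// Data access contract; a real implementation would run a single SELECT.
interface CustomerDao {
    List<CustomerValue> findByCity(String city);
}

// In-memory stand-in for the database, counting queries so the cost is visible.
final class InMemoryCustomerDao implements CustomerDao {
    private final List<CustomerValue> table = new ArrayList<>();
    int queries;

    void insert(CustomerValue row) {
        table.add(row);
    }

    @Override
    public List<CustomerValue> findByCity(String city) {
        queries++;
        List<CustomerValue> result = new ArrayList<>();
        for (CustomerValue row : table) {
            if (row.city.equals(city)) {
                result.add(row);
            }
        }
        return result;
    }
}

// Plays the role of the session facade: one coarse-grained call per use case.
final class CustomerFacadeSketch {
    private final CustomerDao dao;

    CustomerFacadeSketch(CustomerDao dao) {
        this.dao = dao;
    }

    List<CustomerValue> customersIn(String city) {
        return dao.findByCity(city);
    }

    public static void main(String[] args) {
        InMemoryCustomerDao dao = new InMemoryCustomerDao();
        dao.insert(new CustomerValue("1", "Oslo"));
        dao.insert(new CustomerValue("2", "Lima"));
        dao.insert(new CustomerValue("3", "Oslo"));

        List<CustomerValue> found = new CustomerFacadeSketch(dao).customersIn("Oslo");
        System.out.println(found.size() + " rows from " + dao.queries + " query");
    }
}
```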

    I am not asking this question to be antagonistic. I am just genuinely interested to hear other people's arguments for why BMP should be used at all.

    It would seem to me that by using session facades, one would get access to all the services provided by the container. If one has to resort to the kind of workarounds presented by the Fat Key pattern just to make BMP useful at all, then why bother?

    Again, I fully appreciate the thought and effort that has gone into the pattern. I am not saying it is not a good pattern; it is. But if this is what we have to resort to in order to use BMP, what other benefits does BMP provide, compared to using Session Facades and DAO / Value Objects, to make it worthwhile?
  52. I guess some people still have faith in the evolution of the EJB specs, and their thinking is "If I stick with BMP, even though it's implemented badly right now, I will be rewarded down the line when Sun gets it right".

    I also like a homogeneous framework philosophy: if I'm architecting in EJB, I want to use all EJB, and not an EJB/DAO hybrid.

    But that's just my 2 cents...


  53. I implemented a bean-managed entity bean to load bulk data. The funny thing is that the content of the member variable is erased every time the container calls ejbLoad. Does anyone know why? How can I keep the state of a BMP between ejbLoad() calls?

    Appreciated.