Performance and scalability: Coarse Grained Entity Beans vs. Fine Grained Entity Beans

  1. All right, it's time for the showdown. Let's battle it out and find the truth behind the debate: which is faster, coarse-grained entity beans or fine-grained entity beans? :)

    First of all, I define fine-grained entity beans as small objects that do not qualify as dependent objects. That is, they have their own lifecycle outside of other entity beans and have business value specific to your deployment. An example is a MessageBean for a forum messaging system: it has business value for my needs, whereas an "Address" is just a dependent object. I further refer only to the use of fine-grained entity beans together with other performance-enhancing patterns, such as wrapping them with session beans and using details objects.

    I think people have gotten too gung ho about the coarse-grained pattern. I don't like the overuse of this pattern because it is:

    1) harder to implement
       - you can't find objects with finders
       - the fine-grained objects' business logic has to be bunched up in the coarse-grained parent

    2) in some situations not as fast as fine grained
       - there is a reason ejbLoad is called for all N entity beans after a finder is called: it lets the app server use cached instances of the entity beans rather than go to the database. With a coarse-grained approach, data is gathered from the database every time and there is no caching going on (except in the DB).

    3) just plain unintuitive. Mapping a data-model object to an entity bean is easy.

    I like the fine-grained entity bean approach because:
    1) it is intuitive and easy to implement
    2) when properly used (wrapped by session beans, and using details objects), it can actually be a lot faster than the coarse-grained approach, thanks to app-server caching.

       The performance issue with coarse-grained vs. fine-grained comes down to one thing:

    - is the data used in your system often read, or even read-only?

       80% (my arbitrary estimate) of all web transactions are just listing data, often the same data. For applications like this, a fine-grained approach is always faster than a coarse-grained approach, because it is always faster to get data from an in-memory entity bean cache than to make a call to a database that usually sits on another machine. The coarse-grained approach requires frequently accessed data to be read and re-read all the time.

       For business cases where the same data is not frequently read, then yes, fine-grained entity beans can be slow, because finding n entity beans requires n+1 database calls (see the problem with loading many entity beans), and it would make sense to use the bulk-loading ability of a coarse-grained strategy. But for reading frequently accessed data (which IMHO is the more common situation), fine-grained entity beans are faster when app-server caching is used.
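       To make the read-mostly argument concrete, here is a minimal plain-Java sketch (no EJB container; the MessageBean/details-object idea follows the post, all other names are assumed) of serving repeated reads from an in-memory cache instead of the database:

```java
import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Immutable "details object": one call returns all fields instead of N remote getters.
final class MessageDetails implements Serializable {
    final String id;
    final String subject;
    final String body;
    MessageDetails(String id, String subject, String body) {
        this.id = id; this.subject = subject; this.body = body;
    }
}

// Session-facade-style wrapper: serves reads from an in-memory cache,
// touching the (simulated) database only on a cache miss.
class MessageFacade {
    private final Map<String, MessageDetails> cache = new ConcurrentHashMap<>();
    int databaseCalls = 0; // instrumentation: counts simulated DB round trips

    MessageDetails getMessage(String id) {
        return cache.computeIfAbsent(id, this::loadFromDatabase);
    }

    private MessageDetails loadFromDatabase(String id) {
        databaseCalls++; // stands in for a JDBC call to a remote database
        return new MessageDetails(id, "subject-" + id, "body-" + id);
    }
}
```

       Reading the same message twice hits the "database" only once, which is the whole point of the caching argument above.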

        What do you guys think? I haven't read the EJB 2.0 spec; does anyone know whether the n+1 database calls associated with finding entity beans were addressed in EJB 2.0?

  2. I agree with Floyd. Well-written application servers (such as Inprise, I hear) provide declarative flags to improve performance when accessing entity beans. Vendors do this by making the point of interception a very lightweight affair, which reduces the performance issues of fine-grained entity beans.

    This is attractive to me because it means we can still use the same EJB coding paradigms irrespective of granularity. Furthermore, if caching is used (as Floyd alluded to), performance can actually increase.
  3. I don't agree on some points:

    >>1) harder to implement
    >>   - can't find objects with finders

    With coarse grain you don't use finders; instead you have a method that returns the child objects that are in the EB.

    I don't understand how the coarse-grained entity bean can be slower.

    The objects that would be entity beans in a fine-grained model (for example, the order line items of an order) would instead be normal Java classes that are attributes (in a collection) of the main EB in a coarse-grained model (for example, order line items are Java classes used inside the Order EB).

    With coarse grain, for example, when an order is loaded from the database into an Order EB, all the order line items are also loaded up as Java classes and stored inside the EB.

    This means one EB in the cache, as opposed to several (one Order bean and n OrderLineItem beans) with fine grain.
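    A plain-Java sketch of this coarse-grained idea (the Order/OrderLineItem names come from the post; the load mechanics are assumed and merely stand in for ejbLoad plus a JDBC query):

```java
import java.util.ArrayList;
import java.util.List;

// Dependent object: a plain Java class, not an entity bean of its own.
class OrderLineItem {
    final String product;
    final int quantity;
    OrderLineItem(String product, int quantity) {
        this.product = product;
        this.quantity = quantity;
    }
}

// Coarse-grained Order: owns its line items and loads them in one pass.
class Order {
    final String orderId;
    private final List<OrderLineItem> lineItems = new ArrayList<>();

    Order(String orderId) { this.orderId = orderId; }

    // Stand-in for ejbLoad: one load pulls the order and all its item rows
    // together, so the cache holds one Order rather than one bean per item.
    static Order load(String orderId, List<String[]> itemRows) {
        Order o = new Order(orderId);
        for (String[] row : itemRows) {
            o.lineItems.add(new OrderLineItem(row[0], Integer.parseInt(row[1])));
        }
        return o;
    }

    // Coarse-grained accessor exposed on the bean instead of a finder.
    List<OrderLineItem> getLineItems() { return lineItems; }
}
```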
  4. >with coarse grain you don't use finders; rather you have a
    >method that returns the child objects that are in the EB

        Yes, but how do you search for particular attributes of those child objects? With entity beans you just call findByXXX and get the appropriate objects returned; with coarse-grained persistence it seems much more complicated. You would have to manually check each object you are storing to see whether it matches those attributes, and you would have to check both your in-memory objects and perhaps some in the datastore. For example, say you have 30,000 child objects: you should never have to store them all in a vector in the parent EJB at once; you would hold only partial sets of objects as required (lazy loading). In cases like this, simply querying for child objects seems very complicated.
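        The contrast being described might be sketched like this in plain Java (all names are assumed): a finder-style query that the store answers directly, versus the parent scanning whatever partial set of children it happens to have loaded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ChildLookup {

    static class Child {
        final String id;
        final String status;
        Child(String id, String status) { this.id = id; this.status = status; }
    }

    // Finder style: the store answers the attribute query in one
    // findByXXX-like call over everything it holds.
    static List<Child> findByStatus(Map<String, Child> store, String status) {
        return store.values().stream()
                .filter(c -> c.status.equals(status))
                .collect(Collectors.toList());
    }

    // Coarse-grained style: the parent can only scan whatever partial set it
    // has loaded; children still in the datastore must be checked separately.
    static List<Child> scanLoaded(List<Child> loadedChildren, String status) {
        List<Child> matches = new ArrayList<>();
        for (Child c : loadedChildren) {
            if (c.status.equals(status)) {
                matches.add(c);
            }
        }
        return matches;
    }
}
```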

    >The objects that would be entity beans in a fine-grained
    >model (for example order line items of an order) would
    >instead be normal Java classes that are attributes (in a
    >collection) of the main EB in a coarse-grained model (for
    >example order line items are Java classes that are used
    >inside the Order EB)

      I see, so you would basically maintain the equivalent of an in-memory cache of coarse-grained entity beans, each of which internally caches fine-grained objects.

      I suppose that in this case it would be faster than entity beans. My comments above about implementation difficulty apply to this point as well.


  5. Just as you'd have to write the ejbFindByXXX method, you'd
    have to write this other method and expose it in your
    business interface, so there's no difference (assuming
    BMP here).

    I always think of enterprise beans (both session AND
    entity beans) as CORBA services. No CORBA developer in his
    right mind would turn every line item in an order into a
    full-fledged CORBA object and make it remotely accessible.
    That was tried too often in the early DST (Distributed
    Smalltalk) days and consistently failed because it doesn't
    scale and it doesn't perform.

    An enterprise bean has as much overhead as a CORBA object,
    if not more. So you write an OrderService that deals with
    Order and OrderLineItem objects. Whether you make Order
    an entity bean or not can be debated, but I don't think
    there can be any debate about the LineItem objects not being
    entity beans, since they are completely encapsulated by Order
    and fully depend on an Order for their existence.

    Entity beans are a failed attempt to achieve cross-platform
    object-to-relational mapping, IMHO. Use them sparingly...


    Frank Sauer
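    Frank's OrderService idea, one remotely accessible service with plain value objects underneath, could look roughly like this in plain Java (the OrderService name is from the post; everything else is an assumed sketch, not a real EJB):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// One service-style entry point; line items stay plain data and are
// never individually exposed as remote objects.
class OrderService {
    private final Map<String, List<String>> lineItemsByOrder = new HashMap<>();

    void addLineItem(String orderId, String item) {
        lineItemsByOrder.computeIfAbsent(orderId, k -> new ArrayList<>()).add(item);
    }

    // Clients receive a copy of plain data in one call,
    // not a remote reference per line item.
    List<String> getLineItems(String orderId) {
        return new ArrayList<>(
                lineItemsByOrder.getOrDefault(orderId, Collections.emptyList()));
    }
}
```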
  6. We used a very similar approach in my previous project to the one described by Stuart. We had a lightweight EJB framework, since the querying and BMP persistence (the heavyweight stuff) were delegated to "JDBC data servers". We maintained polymorphism and reuse in our code. However, instead of using key-value mappings, we used model objects specific to each type of entity. We therefore had to maintain specific EJB classes, because EJBHome does not support polymorphism, but these were very small. Our database schema wasn't expected to change for many a good year! ;)

    A lot of work was spent on the BMP stuff so I'm looking forward to seeing how good the CMP is in EJB 2.


  7. I've been mulling this one over in the context of an inventory system I'm currently developing with WebLogic. The inventory items are quite large, as a lot of economic data is associated with them (200+ fields)...

    From a design perspective, we're using an XML hub to delegate business-transaction calls to session or entity beans. Our goal was to have "set-based" operations performed in session beans that delegate to stored procedures, and "row-level" operations performed in the entity beans.

    This implies that data reads were encapsulated in session beans, and data updates in entity beans. We avoided entity bean finder methods to work around the n+1 database calls problem.

    But this also implies that all row-level data updating occurs on a fine-grained entity bean. To accomplish this, we used the coarse-accessor approach to eliminate the "bursty get/set method" problem: we pass in a Map of key-value pairs for updating, and use the same mechanism for access. What's useful about this pattern is that it's relatively easy to generalize the design into a "generic" uber-bean that can leverage a data dictionary to allow transparent schema changes.
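    A minimal sketch of the Map-based coarse accessor described above (plain Java, no container; the field names in the example are assumed):

```java
import java.util.HashMap;
import java.util.Map;

// Entity-bean-style class exposing one coarse accessor pair
// instead of a "bursty" get/set method per field.
class InventoryItem {
    private final Map<String, Object> fields = new HashMap<>();

    // One call updates many fields at once.
    void setFields(Map<String, Object> updates) {
        fields.putAll(updates);
    }

    // One call reads a chosen subset of fields.
    Map<String, Object> getFields(Iterable<String> names) {
        Map<String, Object> result = new HashMap<>();
        for (String name : names) {
            if (fields.containsKey(name)) {
                result.put(name, fields.get(name));
            }
        }
        return result;
    }
}
```

    Because the interface is just string keys, a generic bean driven by a data dictionary follows naturally from the same shape.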

    The result is a very clean design where performance remains sub-second, even with dynamically generated SQL statements for inserts and updates. The trade-off is that our entity beans are "simple", i.e. there is no relationship traversal. Great for an inventory or other data-driven application.

  8. Dear Floyd,
                 I would like a clearer explanation of coarse-grained entity beans and fine-grained entity beans; please help me understand them in more detail.

  9. Sorry, but I think "somewhat fine-grained" EBs are nearly always faster than completely coarse-grained ones, not only for read operations (though especially there), and not only for caching reasons.
    The reason, as I see it, is the following:
    Consider an EB A with 100 subobjects of type B:
    A (1)->(*) B
    A also has a description (which is a dependent object).
    If I make B a plain Java object (not an immutable dependent one, which is what we are talking about), the collection of Bs will be serialized, together with all the Bs themselves.
    Now if I access A's description and it is not cached, the app server will have to load (deserialize) A, the collection of Bs, and all 100 Bs... everything deserialized. Nice.
    If I instead have a collection of Handles to Bs (and therefore make B an EB), only the handles will be deserialized; the B objects will be "lazily fetched" only if I access them, and if I never access them, A will be thrown out of the cache while the 100 Bs never get loaded/deserialized.
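    The handle-based lazy fetching being described can be sketched in plain Java (this Handle is a simplified stand-in for javax.ejb.Handle, with an assumed loader function instead of container machinery):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Simplified stand-in for an EJB Handle: a cheap reference that
// fetches the real object only when it is first accessed.
class Handle<T> {
    private final String key;
    private final Function<String, T> loader;
    private T cached;

    Handle(String key, Function<String, T> loader) {
        this.key = key;
        this.loader = loader;
    }

    T get() {
        if (cached == null) {
            cached = loader.apply(key); // lazy fetch on first access
        }
        return cached;
    }
}

// Parent "A" holding handles to its "B" children: reading A's own
// fields never deserializes the children.
class Parent {
    final String description = "a dependent description";
    final List<Handle<String>> children = new ArrayList<>();
}
```

    Reading Parent.description triggers no child loads at all; each child costs one load, and only the first time it is touched.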

    With write operations it is the same, though e.g. the Inprise app server detects which fields have changed, so it does not apply there. I wonder whether WebLogic et al. do the same?

    Yes, and BTW: don't think you can implement your own caching mechanism for objects and expect it to be better than the app server's; that is nearly impossible unless you are a god or are willing to spend a year tweaking it (and most people aren't). The app server vendors normally know what they are doing (that is, after all, their job). At least Inprise/Borland and Orion do.

    Any comments appreciated, maybe I got something wrong?

  10. I haven't understood one thing here:
    why is there no caching for coarse-grained entity beans?
  11. Because in a coarse-grained design the dependent objects are plain Java classes, independent of the entity bean (each holding its own dependent objects, if any), the container does not manage or cache them individually; only the parent entity bean is cached.