Author: Adam Browning
Applications often need to combine listing behavior with single-row modification behavior. Neither entity EJBs nor session EJBs provide a particularly good solution to this problem: session beans require developers to perform all transaction management explicitly, and entity beans perform poorly for listing behavior.
There are four primary players in this pattern:
- The librarian, a value object which holds metadata along with the primary key for the record
- The searcher, which performs the search and builds/returns the value objects
- The entity bean, which updates the data store
- The client
The client makes a request of the searcher to find all of the entries in the data store meeting a specified set of criteria. The searcher performs the query against the data store, and for each entry creates a librarian and provides it with the appropriate data and its primary key. Once the librarians are created, they are returned to the client which takes any actions appropriate for displaying the data, deciding which entries to update, etc.
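The search step above can be sketched in plain Java. This is a minimal, illustrative version: the class and field names (`BookRecord`, `BookSearcher`, `title`) are assumptions, and the in-memory rows stand in for what would really be a single bulk JDBC query.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical librarian-style value object: the record's data plus its
// primary key, as returned by the searcher.
class BookRecord {
    final int primaryKey;
    String title;
    BookRecord(int primaryKey, String title) {
        this.primaryKey = primaryKey;
        this.title = title;
    }
}

// The searcher runs one bulk query against the data store and wraps each
// row in a value object. No entity beans are touched during the search.
class BookSearcher {
    public List<BookRecord> findByTitlePrefix(String prefix) {
        List<BookRecord> results = new ArrayList<>();
        // Stand-in for a JDBC result set; a real searcher would issue
        // a single SQL query here.
        Object[][] rows = { {1, "EJB Patterns"}, {2, "Effective Java"} };
        for (Object[] row : rows) {
            String title = (String) row[1];
            if (title.startsWith(prefix)) {
                results.add(new BookRecord((Integer) row[0], title));
            }
        }
        return results;
    }
}
```

The client receives the list of value objects and decides locally which entries, if any, to update.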
When the client decides to update an entry, it calls a method on the librarian to see if it can update the entry. At this point, the librarian checks whether it already holds the remote interface to the entity bean; if so, it immediately returns an affirmative status code. If the librarian does not hold the remote interface, it acquires it ("checks it out") and refreshes its data. If any data changed during the refresh, the method returns a status code indicating that the data is dirty and should be re-presented to the user prior to updating.
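The check-out logic might look like the following sketch. The entity bean's remote interface is replaced by a simple functional interface, since a real EJB lookup needs a container; the status-code names (`OK`, `DATA_DIRTY`) and the single `title` field are assumptions.

```java
// Stand-in for the entity bean's remote interface: returns the latest
// stored value for a record.
interface RecordHome {
    String currentTitle(int primaryKey);
}

class CheckoutLibrarian {
    static final int OK = 0;
    static final int DATA_DIRTY = 1;

    private final int primaryKey;
    private String title;
    private RecordHome home;        // null until "checked out"
    private final RecordHome lookup;

    CheckoutLibrarian(int primaryKey, String title, RecordHome lookup) {
        this.primaryKey = primaryKey;
        this.title = title;
        this.lookup = lookup;
    }

    // If we already hold the remote interface, answer immediately.
    // Otherwise acquire it and refresh; report DATA_DIRTY if the stored
    // data changed since the search, so the client can re-display it.
    int checkOut() {
        if (home != null) return OK;            // already checked out
        home = lookup;                          // "check out" the bean
        String fresh = home.currentTitle(primaryKey);
        if (!fresh.equals(title)) {
            title = fresh;                      // refresh our copy
            return DATA_DIRTY;
        }
        return OK;
    }
}
```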
Once the client is ready to update the librarian, it calls setter methods on it until it is finished updating, at which point it calls save. After save is called on the librarian, it passes itself back to the entity bean to bulk update the entry. (Note that at this point, an optimization is possible which reduces unnecessary network traffic. By passing a structure indicating which fields have been modified along with those fields, we need not pass unnecessary data across the network.) The entity bean modifies its internal state appropriately and informs the EJB container that its entry should be updated in the data store. This process of viewing the data, deciding which entry to modify, locking the entry, performing actions on it, then unlocking it is very similar to the process of finding a book in a library, checking it out, reading it, then returning it, hence the name of the pattern.
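The dirty-field optimization mentioned above can be sketched as follows: each setter records the modified field in a map, and save ships only that map. The "entity bean" side is reduced to a plain map of state here, since the pattern only requires a bulk-update call that accepts the changed fields; all names are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

class DirtyTrackingLibrarian {
    private final Map<String, Object> changed = new HashMap<>();
    private String title;

    void setTitle(String title) {
        this.title = title;
        changed.put("title", title);   // remember which field was modified
    }

    // Pass only the modified fields "across the network"; unchanged
    // fields never leave the client.
    void save(Map<String, Object> entityState) {
        entityState.putAll(changed);
        changed.clear();               // clean again after the bulk update
    }
}
```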
- Use of this pattern strongly reinforces the concept that interactions with EJBs should be of a coarse-grained nature.
- Listing behavior can display acceptable performance, while still allowing the EJB container to deal with most of the issues of transactions and data consistency.
- CMP cannot be used to its fullest, since the only finder the entity beans ever use is findByPrimaryKey.
Data Access Object
I am not so happy with the mechanism describing the save. The librarian passes itself back to the entity bean, which suggests that the entity bean code must refer to the librarian code. I would rather have the entity beans in a completely independent Java package.
The librarian should be, or should use, a Data Access Object to access the entity bean. One of the methods on this DAO might have an argument signature that indicates which fields have changed and need updating.
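One possible shape for the suggested DAO, sketched under the assumption that the changed fields travel as a map keyed by field name (the interface and method names are hypothetical). The librarian then depends only on this interface, so the entity-bean code can live in its own package.

```java
import java.util.HashMap;
import java.util.Map;

// The DAO the librarian would call instead of holding the entity bean
// directly. The map argument plays the role of the "which fields changed"
// signature suggested above.
interface RecordDao {
    void update(int primaryKey, Map<String, Object> changedFields);
}

// In-memory stand-in for demonstration; a real implementation would
// delegate to the entity bean (or to JDBC) in its own package.
class InMemoryRecordDao implements RecordDao {
    final Map<Integer, Map<String, Object>> store = new HashMap<>();

    public void update(int primaryKey, Map<String, Object> changedFields) {
        store.computeIfAbsent(primaryKey, k -> new HashMap<>())
             .putAll(changedFields);
    }
}
```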
Using the Memento pattern to send data back to the entity bean for persisting was considered and decided against due to the overhead involved.
For the sake of separation, the important point to keep in mind is that the entity bean serves effectively as a gatekeeper for the data, rather than as the retrieval mechanism. When it receives the librarian to persist, it sees the librarian as nothing more than a value object, totally devoid of business logic.
You shouldn't be using remote references to entity beans, and if you are, why bother with the Librarian at all? Just use the entity bean.
This really doesn't seem much different than Data Transfer Objects anyway....
The answer to your question is that entity beans incur tremendous overhead for listing behavior. If you're returning 1000 items, for instance, you'll end up doing 1000 SQL queries for entity beans. Needless to say, that can take a long time.
On the other hand, you could use DTOs exclusively, but then you have to do a lot more work to deal with concurrent update issues. If users A and B both want to modify a single record in the database, A modifies half the fields, B modifies the other half, and B submits his changes after A, then A's changes are lost and no one is ever the wiser. The entity bean is there to provide a single authoritative (and caching) answer as to whether the data has been changed, and if so, in what way.
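The lost-update scenario described here can be made concrete with a simple version counter, one of the checks an entity bean can centralize. This is a sketch of the general optimistic-concurrency idea, not something prescribed by the pattern; the class and field names are invented for illustration.

```java
// A record guarded by a version number: a write that was based on a
// stale read is rejected instead of silently clobbering earlier changes.
class VersionedRecord {
    int version = 0;
    String firstHalf = "";
    String secondHalf = "";

    // Callers pass the version they originally read; null means
    // "I am not touching this field".
    boolean update(int readVersion, String firstHalf, String secondHalf) {
        if (readVersion != version) return false;  // conflict detected
        if (firstHalf != null) this.firstHalf = firstHalf;
        if (secondHalf != null) this.secondHalf = secondHalf;
        version++;
        return true;
    }
}
```

With this check in place, B's late write based on the stale read fails loudly instead of discarding A's changes.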
The purpose behind having the librarians check with an entity bean is to take advantage of the benefits that entity beans can provide in terms of dealing with concurrent access, without having to pay the tremendous overhead that has led many to make the blanket statement that entity beans should be avoided completely.
I like the decoupling of the search and client-side update from the persisted/shared lock and update.
I think the core concept is removing the token/lock that gives one particular client control of the data from the persistence mechanism (entity bean), which still gives transactional control within the data source. (?)
The performance advantage is not to be sniffed at either.
Now a question: can you propose a mechanism (or recommend a technology) for producing the equivalent of CMP in the searcher? CMP removed the need for basic data source interaction code in the EntityBean, but under this pattern the developer now needs to code it back into the searcher.
Is this statement always true?
"The answer to your question is that entity beans incur tremendous overhead for listing behavior. If you're returning 1000 items, for instance, you'll end up doing 1000 SQL queries for entity beans. Needless to say, that can take a long time."
Don't some containers do eager prefetching?
While some containers do eager pre-fetching, you usually have to supply a hard-coded number of rows to be pre-fetched to the container in the deployment descriptor. This approach fails to scale. It also fails to answer the problem of ad-hoc queries. A user interface that enables the user to select search criteria can produce a result set of 1, 100, or 1,000,000 rows! So eager pre-fetching is a kludge, at best, to make up for a primary deficiency of the CMP model: poor search speed.
Also consider that a pattern, like the Librarian, that presents a container-independent method of solving the CMP search issue is a worthwhile alternative to the dependency and portability issues involved with container-specific behavior.
The Fastlane Loader pattern is for doing tabular queries and operations and does not incur the speed penalties of CMP. I admit that this pattern circumvents the container a bit, but if done properly it is actually very potent. Remember that this pattern is only for tabular reads and not for updates. You still need to go through the regular session facade for updates if you want to retain the transactional aspects of the updates.
Just a thought... in trying to optimize an app that does a great deal of search -> list -> select 1 record -> update, we discovered that the SQL overhead from the container can be HUGE. The searched entities have a relationship to several others, each of which has an m-to-n relationship to several others, etc. All in all a large object graph. We count more than 80 SQL queries against the DB for each record, and are currently looking at Fastlane-style loading, as the searches often return 1000+ records.