July 28, 2004
Every body shuns CMP entity beans be either they try to blame J2EE or get some
political mileage for their unveiled agenda. This happens almost every thread
at least in community forums like TheServerSide.com or JavaLobby. However I
was surprised to see a houseful when I presented “Optimizing Container
Managed Persistence with Relational Databases” with Mike Keith in JavaOne
this year. I was expecting very few attendees with ongoing slaughter of CMP
entity beans in the community forums and the fact that EJB is having a quantum
loop in the EJB 3.0 spec. And almost everyone stayed in the room. After the
presentation, I spoke to few attendees and almost every one of them use CMP
entity beans in their production application servers such as Weblogic, OC4J
or Websphere. Most of them are happy with their the scalability requirements
of their applications.
Before I took over the responsibilities of EJB container for OC4J about one
and half years back I always thought EJBs and especially CMP entity beans are
useless and no body uses them at all. But soon realized, there are thousands
of Oracle customers who use EJBs and CMPs in production. After working with
a lot of developers those complained about performance of CMP entity beans do
not understand the impact of entity beans on the databases. In this article,
I will provide a guideline how you can tune optimize the CMP entity beans with
relational databases like Oracle.
First advice is to examine that you really need CMP entity beans for your applications.
Following are guidelines that you can use to optimize the performance of your
CMP entity beans:
- Reduce Database operation
- Tune Database/SQL Access
- Caching in the middle-tier
- Use Right Locking/Isolation Mode
Let’s examine each of these sections and see how you can effectively
use to improve the performance of CMP entity beans.
Reduce Database operation
Reducing database operation or number of SQL generated by CMP entity beans
is easier said than done! You have to look for vendor specific parameters how
your container optimizes the SQL generation.
Enforce primary key constraint at database level
Several EJB containers cannot rely on the existence of primary key constraints
in the underlying database table and executes a SELECT statement to make sure
the insert does not lead to a duplicate entry.
For example, you can disable this extra constraint checking by OC4J by setting
do-select-before-insert=”false” for
your CMP entity bean.
Defer EJB creation to end of transaction
Creating an entity bean with a relationship is a two-step procedure,
by default this involves two SQL statements: an INSERT statement in ejbCreate()
to persist the entity bean, and an UPDATE to establish the relationship. Some
containers defer persistence of the entity until the end of the transaction.
This means that the row is inserted into the database after the ejbPostCreate()
method instead of ejbCreate(), avoiding an extra UPDATE.
If you are using OC4J, make sure that you have this setting for your entity
bean data-synchronization-option="ejbPostCreate"
Use DELETE CASCADE at database level
When an entity with relationships is deleted with multiplicity ONE, the
container is responsible for deleting all dependent entities. The container
generates and executes the required SQL code to remove all of the related child
records from the database. The native cascade delete facility in the Oracle
database is much faster as this avoids multiple delete statements when removing
the dependent records. The database removes all related records when the CASCADE
DELETE option is specified for the table.
Batch update operations
Calling a set method on a bean field modifies the bean value and requires
the new state to be stored in the database. The EJB container does this by calling
the ejbStore method and issuing an UPDATE to the database. Good EJB container
implementations provide a way to batch bean field updates into a single UPDATE
operation when multiple set methods are called within a transaction. Reducing
the number of calls to the database improves application performance.
Avoid Association Tables for maintaining relationship
Association tables are maintenance nightmare and require unnecessary
SQL statements and JOIN between the entity table and association tables to maintain
relationship
Set pre-fetch size to reduce roundtrips to the database
Oracle Database JDBC drivers have extensions that let you set the number
of rows to pre-fetch to the client while a result set is being populated during
a query. This reduces the number of roundtrips to the server, and can drastically
improve the performance of finder methods that return a large number of rows.
You can set the pre-fetch size for your finder method in OC4J as follows in
the vendor specific deployment descriptor:
<finder-method query="" prefetch-size="15" >
<!-- Generated SQL: "select * from EMP where EMP.ENAME = ?" -->
<method>
<ejb-name>EmployeeBean</ejb-name>
<method-name>findByName</method-name>
<method-params>
<method-param>java.lang.String</method-param>
</method-params>
</method>
</finder-method>
Exploit eager loading of relationships
Some EJB containers allow the eager loading, or pre-loading, of related
entity beans when an entity bean is loaded. You can enable this feature by setting
a parameter for your EJB in the vendor-specific deployment descriptor. This
may improve performance if the related entity bean is used in the same transaction.
In this case, the EJB container uses a single SQL statement to load the related
entities. As a rule of thumb, use this feature only if your application needs
to use the related entity bean in the same transaction. Otherwise, you’ll
cause extra data operations that aren’t required.
Use Lazy Loading with caution
Use lazy-loading only when retrieving large number of objects or entity contains
a large field such as BLOB.
Tune Database/SQL Access
Make sure your finder methods use indexes
Queries should always use indexes for optimal performance. Avoid full-table
scans when possible. For CMP entity beans, finding the SQL statement that does
a full-table scan is a challenge, since the container generates the statements
that are executed. To ensure that the EJB container is generating efficient
queries when executing finder or select methods, make sure the tables being
accessed are indexed correctly. To review the SQL statements generated by the
EJB container, you can use open-source tools like P6spy (http://www.p6spy.com)
to intercept and log the generated SQL statements. With these statements, you
can use the Oracle database “explain plan” facility to determine
how the query is executed, and whether any indexes can be created for the tables
to improve database access.
Avoid findAll on entities based on large tables
When the findAll method is used, the container tries to retrieve all rows of
the table. Avoid using this method on entities based on large tables, as it
will slow down your database.
Exploit statement caching
The Oracle database provides a feature that caches SQL statements. This reduces
the parse time when the SQL statement is issued to the database multiple times.
For example in OC4J you can enable statement parsing for your data-source by
adding the statement-cache-size attribute:
<data-source>
...
stmt-cache-size="100"
</data-source>
Define your own SQL Finder method
If you think you are really good in SQL, you can make a difference. Several
containers like OC4J and TopLink CMP engine provide a way to define your own
SQL for the finder method or an ability to override the SQL statement generated
by your container. You have to use this in caution as this may have adverse
effect on your application. You can write your highly optimized SQL statements
for your finder method with Appropriate JOINs and SQL Hints
Use the Right Concurrency Mode and Locking Strategy
Choose the right isolation level for entity beans
EJB containers support the specification of isolation modes for entity beans.
In general, it’s better to avoid performing this work at the EJB container
level and leave isolation management to the database. Oracle databases support
COMMITTED and SERIALIZABLE [u1] isolation levels. DBAs will probably lean toward
maintaining assured data consistency; thus, they’ll want to use the SERIALIZABLE
option for mission-critical systems. However, COMMITTED performs better.
Choose the right locking mode
In addition to EJB isolation modes, containers usually provide a locking mode
for concurrency control of CMP entity beans. For better performance, choose
either an OPTIMISTIC or READ-ONLY locking mode. Using a PESSIMISTIC locking
mode, which locks a database row for the duration of a transaction, can adversely
affect performance.
Avoids database calls by using read-only patterns
For entity beans that are rarely modified, most application servers support
READ-ONLY and READ-MOSTLY patterns to avoid repetitive loading of the same beans
from the database. By identifying where you can use this approach and configuring
the specific beans accordingly, you’ll reduce database accesses and improve
performance.
Cache in the middle-tier
You have to check whether your container provides support for caching of entity
beans and whether these are applicable to your applications. You can use flexible
caching options to ensure you get maximum performance. Following are few advantages
of caching:
- Performance is boosted because frequently accessed data does not have to
be constantly built and rebuilt.
- Memory footprint and garbage collection kept low because multiple readers
share objects
About the author
Debu Panda
debabrata.panda@oracle.com
Blog: http://radio.weblogs.com/0135826
I'm Debu Panda (debabrata.panda@oracle.com) and I'm the Product Manager responsible for EJB Container in the Oracle Application Server development team. I've around 13 years of experience in IT mainly in the Database and Application Server technologies. I've authored articles in several journals and presented in several conferences.
|