Performance and scalability: CMP 2.0 vs JDBC for bulk loading
With a good CMP 2.0 implementation (JBoss), is it better to use the entity layer or direct JDBC for listing purposes?
- Posted by: Cameron Zemek
- Posted on: December 25 2002 02:50 EST
E.g., if I were listing 1000 products, 50 per page, would it be better (performance-wise) to use the entity layer or to have a session bean access the database directly with JDBC?
If I use the entity layer then the results are cached. But how much does this buy (in performance improvements) if products are being updated hourly? daily? weekly?
- CMP 2.0 vs JDBC for bulk loading by Mark Hills on December 29 2002 22:14 EST
- CMP 2.0 vs JDBC for bulk loading by E. A. Graham Jr. on December 30 2002 00:27 EST
- Investigate database specific loader instead... by Derek Ashmore on December 31 2002 09:17 EST
- EJB-QL's shortcomings are the real trouble by Thomas Nagel on January 07 2003 10:59 EST
You may want to test this yourself, but I would assume that JDBC would be much faster. Databases also cache data pages and queries, and you avoid the overhead of constructing a large number of entity beans that you may never actually manipulate (update, etc.).
Nobody can answer that for you - *you* have to do performance tests and analysis before the question can be answered. All I can say, with any certainty, is that I've gotten very good performance from JBoss with a small user group and a smallish data set using EntityBeans.
Most databases provide a loader utility that can bulk load data much faster than any application could. I'd avoid rolling your own if you can.
Derek C. Ashmore
(Architect for ThreadWorks)
I am facing the same dilemma. I have run the tests, and using JDBC from a session bean is much faster. The problem I see is that you are now coding for a specific database implementation and bypassing the descriptors. Isn't there some way to get a result set via the descriptor instead of coding JDBC in a Java object?
Why is using JDBC such a bad thing if you've established it performs better? Why is "coding for a specific database implementation" necessarily bad? How likely is it that you'll change database, or change to a database that can't run the same SQL?
I think a lot of J2EE developers assume that database portability is always essential. In my experience, it's an unusual business requirement, not a given.
Rod Johnson, author of Expert One-on-One J2EE
In my eyes, the real problem you will eventually see is this: you usually do not need to convert or print the content of the whole table at once, only a small part of it, yet the whole table will be activated as entity beans if the table's ordering is not the one you need.
So the performance of CMP is not the only problem. To be comparable to JDBC in speed, you also need to set filter and sorting criteria. Sun proposes using "finder" methods for this purpose, with the query defined in EJB-QL. This, however, does not really help.
The sorting problem is only solved in EJB-QL as of EJB 2.1, which adds the 'ORDER BY' clause. EJB-QL still has no way, however, to restrict the number of rows returned.
So the task of getting the first X rows of a table sorted by column Y cannot be fully solved with CMP without sacrificing some performance to activate beans you don't really need. With JDBC you would simply break out of the loop over the result set once you have enough data to display.
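To make the JDBC side concrete, here is a minimal sketch of the "stop reading when the page is full" idea. The PRODUCT table, its NAME and PRICE columns, and the class and method names are all made up for illustration; only the java.sql calls are real. Treat it as a rough outline, not a tuned implementation.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the name is not from any framework.
class ProductPageReader {

    // Reads rows only until the page is full, then stops.
    // Rows beyond pageSize are never fetched into Java objects.
    static List<String> readPage(ResultSet rs, int pageSize) throws SQLException {
        List<String> names = new ArrayList<>();
        while (names.size() < pageSize && rs.next()) {
            names.add(rs.getString("NAME")); // assumed column name
        }
        return names;
    }

    // Runs the sorted query and returns the first pageSize product names.
    static List<String> firstProducts(Connection con, int pageSize) throws SQLException {
        // Assumed table and columns; sorting is done by the database.
        String sql = "SELECT NAME FROM PRODUCT ORDER BY PRICE";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            // Standard JDBC row cap; rows past the limit are silently dropped.
            ps.setMaxRows(pageSize);
            try (ResultSet rs = ps.executeQuery()) {
                return readPage(rs, pageSize);
            }
        }
    }
}
```

From a session bean you would call something like `ProductPageReader.firstProducts(con, 50)` with a connection obtained from your DataSource. Later pages need either skipping rows in the loop or a database-specific paging clause, which is where the portability trade-off discussed above comes back in.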
Hope this helps.