I have read a lot about the advantages of using a CMP entity bean, such as persistence being managed by the container and hence a faster development time. Also, beans are inherently remote objects and lend themselves to a distributed architecture, etc. So we have gone ahead designing and implementing an ASP (Application Service Provider) kind of application using entity beans, session beans and servlets on IBM WebSphere Advanced Edition.
But we have some basic issues with transaction management and very painful issues related to WebSphere's performance.
Related to transactions: Since ours is an ASP-based accounting package, the clients are small to medium-sized companies. Each of them requires its own data to be kept secure. For this, we first decided on having multiple databases/schemas with a separate set of tables for each client. That means we would have to map the entity beans to each of these tables at run time, based on the company that is logging in. We found that this is not possible with the entity bean deployment features on WebSphere (I am not sure if it is possible with any other server). Since an entity bean can be mapped to only one table, we decided to merge all the companies' data into a single large table with an extra primary key column denoting the company id. This worked fine with the logic. However, when it came to transaction management, certain processes like balance sheet creation require a table-level lock (TRANSACTION_SERIALIZABLE as the isolation level) to get a consistent balance sheet. Since all the companies are mapped to the same table, until one client finishes generating its balance sheet, the other clients cannot even update the table. So we seem to be stuck in a sort of deadlock: do we use TRANSACTION_SERIALIZABLE and block all updates from other companies, or take the risk of inconsistent data?
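For readers following along, the "extra primary key denoting the company id" described above would typically be expressed as a composite primary key class. The following is only a minimal sketch, not the actual code from this project: the class name LedgerEntryKey and the fields companyId and entryId are made up for illustration.

```java
import java.io.Serializable;
import java.util.Objects;

// Hypothetical composite key: company id plus a per-company entry id.
// In a real deployment the class would be declared public; it is
// package-private here only so the snippet compiles in any file.
class LedgerEntryKey implements Serializable {
    private static final long serialVersionUID = 1L;

    // Public fields and a public no-arg constructor, as containers
    // commonly expect for a CMP composite primary key class.
    public String companyId;
    public long entryId;

    public LedgerEntryKey() {
    }

    public LedgerEntryKey(String companyId, long entryId) {
        this.companyId = companyId;
        this.entryId = entryId;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof LedgerEntryKey)) {
            return false;
        }
        LedgerEntryKey that = (LedgerEntryKey) other;
        return entryId == that.entryId && Objects.equals(companyId, that.companyId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(companyId, entryId);
    }
}
```

Two keys with the same entry id but different company ids compare as different, which is what keeps one company's rows apart from another's in the shared table.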
Because of the above problem, I have a fundamental doubt about why one should use an entity bean at all, when persistence, transaction management for concurrent users, etc. are features provided by the underlying database.
One reason could be that an in-memory object representation of data is far more efficient than making database calls. But how well is this concept implemented by the containers? Would RAM requirements not be huge when we deal with large databases? When, basically, would one go for an entity bean?
Coming to WebSphere performance: We have had a harrowing experience just trying to successfully deploy the 600 servlets and 120 EJBs we have developed (50 of which are entity beans). Only after increasing the JVM heap size of the App Server to 512 MB could I complete the deployment. Getting the App Server to start was another ordeal. Only after moving the server from a single-CPU Solaris machine to a twin-CPU machine could I start it successfully, with the App Server process alone requiring about 65% CPU at one point during startup. Once the server has started, CPU utilization and RAM usage drop considerably. That was the deployment experience; I also have a problem with its performance for more than 25 concurrent users on a 2 GB RAM Solaris machine. It seems to work in a single-thread paradigm. Could this be due to using entity beans in such large numbers?
What kind of design change would any of you suggest for such an ASP application? Should only session beans be used, with persistence and transactions managed through the JDBC API?
Regarding your original plan of having a table per company that registers with your ASP, have you:
1) Asked IBM what the problem is? You are paying for their tech support after all. Do they claim not to support your paradigm at all with CMP?
2) Considered switching from CMP to BMP? If you use BMP, you should be able to control everything. It sounds like you're running into a limitation with the WebSphere persister. Remember that you can use entity beans without CMP. Personally we use BMP for this web site (TheServerSide.com) and it was very easy once code templates were established (mostly just copy-paste work).
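To make the BMP suggestion concrete, here is a rough sketch of how a bean-managed entity bean could pick the company's schema at run time when it builds its SQL. Everything named here is hypothetical (the TenantSql helper, the LEDGER_ENTRY table, the ENTRY_ID and AMOUNT columns); it is not WebSphere API and not the code used on this site. Because a schema name cannot be bound as a JDBC parameter, the sketch validates it before concatenating.

```java
import java.util.regex.Pattern;

// Hypothetical helper that a BMP bean could call from ejbLoad()/ejbStore()
// to build SQL against the table set of the company that is logged in.
final class TenantSql {
    // Accept only plain identifiers so the schema name can be safely
    // concatenated into the statement text.
    private static final Pattern SAFE_SCHEMA =
            Pattern.compile("[A-Za-z][A-Za-z0-9_]{0,29}");

    private TenantSql() {
    }

    // Returns "<schema>.<table>" or throws if the schema name looks unsafe.
    static String qualifiedTable(String companySchema, String table) {
        if (companySchema == null || !SAFE_SCHEMA.matcher(companySchema).matches()) {
            throw new IllegalArgumentException("Unsafe schema name: " + companySchema);
        }
        return companySchema + "." + table;
    }

    // SELECT used to load one row; the key itself stays a bind parameter.
    static String selectLedgerEntry(String companySchema) {
        return "SELECT ENTRY_ID, AMOUNT FROM "
                + qualifiedTable(companySchema, "LEDGER_ENTRY")
                + " WHERE ENTRY_ID = ?";
    }
}
```

The bean would pass the resulting string to Connection.prepareStatement() and bind the entry id as usual, so the same bean class can talk to any company's tables.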
Your next problem was about having to lock up the whole table when doing a query to avoid phantoms. Here are a few questions for you:
1) Have you done benchmarks for this particular query and validated that it is indeed a bottleneck using real-world usage data?
2) Have you considered breaking apart the data contained within this table into smaller tables, where each smaller table contains a subset of the information in the larger table?
3) Have you engaged a DBA to advise you on optimizing this schema? What about using views?
Finally you speak of performance problems deploying WebSphere. A few questions for you:
1) Have you been through WebSphere training from IBM on how to tweak/tune their app server?
2) Have you engaged a WebSphere specialist from IBM to help you who has done this before?
3) Have you read some of the IBM RedBooks on optimizing WebSphere?
1. IBM India said that they do not support this (though I am not sure if they are right). But even I think that you have to map a particular bean to a particular table. Would anyone support a dynamic choice of the table to which an entity bean maps? Probably BMP would have been a better choice. But since the entire persistence would have to be managed by us, I just wondered what the advantage of using an entity bean is at all. (I am still not able to find one solid reason why we should use an entity bean, when coding with JDBC from a session bean turns out to be far simpler.)
2. It was when we considered breaking the table into smaller tables (based on companies) that I realised I need the same entity bean to talk to any of these tables, or I need to create entity beans on the fly to talk to the newer tables that are being added.
If we were to use views, how would we use entity beans? Can entity beans be mapped to views, or should we access views only through JDBC? As far as I know, WebSphere does not support mapping entity beans to views.
We are in the process of benchmarking the query with a real-world scenario.
3. I have just attended the basic WebSphere course from IBM. They have not been able to arrange the performance tuning session in India (for want of demand). Nor are any IBM WebSphere experts available here (I tried the length and breadth of IBM in my country, literally). I have been through all the IBM WebSphere Redbooks, and I find that there is only one on actual performance tuning (SG245657). Some of the other Redbooks I have gone through are sg245460, sg245471, sg245429 and sg245754 (though some of them are for version 2.0 of WebSphere).
I am extremely new at this, but why not map data pools to your schemas? Then your entity beans would just use the data pool for the company in question.
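A rough sketch of what that could look like, assuming one pooled data source is configured per company schema and registered under its own JNDI name. The class name CompanyDataSources and the JNDI names are made up for illustration. As far as I understand, a CMP bean is bound to its data source at deployment time, so this would apply to BMP beans or to session beans using JDBC directly.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry: company id -> JNDI name of that company's pool.
final class CompanyDataSources {
    private final Map<String, String> jndiNameByCompany;

    CompanyDataSources(Map<String, String> mapping) {
        // Defensive copy so later changes to the caller's map have no effect.
        this.jndiNameByCompany = Collections.unmodifiableMap(new HashMap<>(mapping));
    }

    // Returns the JNDI name to look up for the given company,
    // or throws if no pool has been configured for it.
    String jndiNameFor(String companyId) {
        String name = jndiNameByCompany.get(companyId);
        if (name == null) {
            throw new IllegalArgumentException(
                    "No data source configured for company: " + companyId);
        }
        return name;
    }
}
```

The bean would then obtain the pool with something like new InitialContext().lookup(registry.jndiNameFor(companyId)) and cast the result to javax.sql.DataSource.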