From the database a record needs to be selected and then the same record needs to be updated. This is a single record only for the whole application. I have written a method in EJB (tested with both entity and session bean), which selects the record from the database, and update it. I use two queries inside this single method. It works absolutely fine if one client repeatedly calls this method number of times. The problem starts when multiple clients access at the same time.
What happening is that, for few clients the selects are happening first and update is happening afterwards. This leads to failure in our application. We want always, select then update sequence. If I able to make this method as synchronized method, then multi user scenario there will not be any problem because at a time only one thread will be allowed to execute this method. Because of EJB restriction, I am unable to use synchronized method. I tested with all different type of database isolation level for this method. Nothing is helping me.
Simply my requirement is, I want to have a synchronized method feature in EJB. Could you please help me in this regard?
Thanks a lot in advance.
Try doing a SELECT FOR UPDATE. This will get an exclusive lock on all of the rows that match the WHERE criteria of the SELECT. You could use this with either a BMP entity bean or a session bean. The exact behavior of the locking depends on the database in use and the transaction isolation level, but the three possible behaviors are that the second (colliding) SELECT will either fail, block until the first transaction has finished, or succeed immediately but fail to commit if both transactions modify the data.
Keep in mind that this will limit the scalability of the application.
The SELECT FOR UPDATE is one way to go, although I would not want to do this, and this is not supported by all databases. A simple approach that I have used on all selects and updates is to implement a simple timestamp check:
Add a timestamp field or some uniquely system generated number field to the tables involved. Select the timestamp field or unique field, then when you update the record update where the timestamp=the timestamp selected and update the timestamp to a new timestamp. This way any other selects that retrieved the record at the same time will fail. Have your bean throw an error and make the application deal with it, by either retrying it or having the user reneter the data.
I typically do all of this in a base class and all of my DAO objects implement this class.
Hope this helps,
Both pessimistic locking (the SELECT FOR UPDATE) and optimistic locking (as suggested by Rich) will provide the transactional integrity needed. Rich is right to generally avoid SELECT FOR UPDATE, as some databases lock at the page or table level (look for the feature "row level locking").
If a lot of concurrency is expected, the extra overhead of pessimistic locking may be warranted. Also, when an optimistically locked system reaches saturation shorter transactions have a greater chance of succeeding, which can have odd effects.
There exist analytical models to predict which scheme will work best given certain parameters, but the best thing would probably be to benchmark both methods under a realistic peak load.
If you decide to use optimistic locking, you might not have to code it yourself. Borland AppServer does optimistic locking for entity beans with CMP, and my understanding is that some of the object mapping tools provide it as well.
Hi Nathan and Rich,
Thanks a lot.
The application we develop needs to support Oracle, DB2, Sybase and SQL Server databases. So, I cannot use SELECT FOR UPDATE, which is specific to Oracle and DB2.
There is lot of chances for concurrent access in our application. As suggested by Rich, I can have a Timestamp field and retry until it get succeeded. But, I fear the retry may go into loops. I will try this and give you the feedback.
May be I am missing something. If the select and update are within the same ejb method, then all you need to do is to wrap the database record with a transactional entity bean with proper transaction isolation level.
The "Serializable" isolation provided by a snapshot based database (such as Oracle) guarantees that your transaction will be consistent in itself, but it doesn't guarantee that the data you read in your transaction hasn't been changed in another transaction. The snapshot keeps you from seeing the altered or added data (which prevents nonrepeatable reads and phantom reads), but the final results of your computation may not reflect the status of the database at commit time.
To quote from "Advanced Java 2 development for enterprise applications", pages 416 and 418: "Some databases claim to have repeatable reads by implementing a technique in which they present the user with a snapshot that remains static for the duration of the transaction -- even though the data represented by the snapshot may be changed by another transaction. In effect, your snapshot represents an old version of the database. The usefulness of this way of achieving 'virtual' repeatable reads depends on whether an application needs to have up-to-date data at all times." ... "A pessimistic approach can be implemented if all transactions obtain at least read locks on data they read, and write locks on data they write, and hold those locks until they commit. This approach is not the default update mode of many products, however. For example, Oracle has two kinds of isolation modes, which they call 'Read Committed' and 'Serializabe.' Neither mode guarantees correct and serializable transactions, unless queries occurring in update transactions use the 'FOR UPDATE' construct, which enforces a write lock policy."
Consider a somewhat contrived example, with two transactions. The first transaction queries the database to determine the current "banking day" for some sort of financial transfer, then adds an entry to an "account" table to record the transfer. The second transaction is used to roll from one baking day to the next by changing the current banking day, then computing the interest and fees for the previous day. (Note that there is only a single account to keep it simple.)
SELECT b_bankingday FROM bankparams;
// result of banking day query is used to construct next
// JDBC call, which adds a transfer to the current day
INSERT INTO account VALUES ('2001-08-08', 45.20, 'transfer from ACT123');
UPDATE bankparams SET b_bankingday = '2001-08-09';
SELECT * FROM account WHERE a_day = '2001-08-08';
// result of orders query is used to construct next
// JDBC call, which adds an adjustment for interest
// and banking fees
INSERT INTO account VALUES ('2001-08-08', 0.02, 'interest');
INSERT INTO account VALUES ('2001-08-08', -45.10, 'fees');
Imagine that the first transaction (call it X) is begun just before the second (call it Y). X will have a snapshot of the database that has a banking day of '2001-08-08', so it will add a transfer whose day field is set to the 8th. Y will have a snapshot of the database that is identical to X's snapshot, because the snapshot was taken before X committed. This means that the daily fees and interest computed in Y will _not_ include the transfer that X is adding. Because there is no data that is changed by both transactions, they will not be detected as colliding, and both will commit successfully.
This is a fundamental drawback of MVCC (Multi-Version Concurrency Control). Data that you only _read_ during your transaction might have changed in the database before your commit, so to really get serializability you must lock all of the data that you read or write.
Sorry this got a little long.
Karthik, maybe you can check out JDBC2, which has the ability to flag a ResultSet as ResultSet.CONCUR_UPDATABLE. If this type of ResultSet is implemented, the returned result set has either a write lock or an automatic optimistic mechanism, and the result set can be modified directly. I haven't used it personally, and I don't know how widely it is supported.
Thanks for shedding some light in this grey area.
If the data is accessed exclusively from the entity ejb, then it should work. That is because the access to the entity beans is serialized by the ejb contanier. When the client calls an entity bean method, the following sequence of events will not be interleaved with another client call.
ejbLoad() -- the SELECT
ejbStore() -- the UPDATE
Different EJB implementations handle this differently. Many EJB containers will create two instances of the same entity bean (different Java objects, same primary key) in memory at the same time to service simultaneous remote calls, and rely on some sort of optimistic locking mechanism.
Even if you play strictly in the entity EJB world with a container that does pessimistic locking (only a single instance per entity bean) you can still have problems, because with an MVCC persistent store the results of a finder method reflect the reality at the beginning of the transaction, not the reality at commit time, and any data obtained from iterating over the results of the finder may be stale before it is ever committed to the database.
Assuming that the container uses pessimistic locking with entity beans i.e., one instance per unique key, then it should work. Because, at the beginning of the entity bean method call, the container will call ejbLoad().
However, if the container adopts the policy of fully populating the beans in the findBy..() method call and not call ejbLoad() during the method call, then there could be problems if the database could be updated from outside the container. This external application could update the database between the findBy..() and ejbLoad() calls.
If you think about the impact of local interfaces in EJB 2.0 then I think pessimisticly optimistic servers are a thing of the past. Consider this:
Remote Session Bean 1 has reference to Local Entity Bean (PK=1), on server 1.
Remote Session Bean 2 has reference to Local Entity Bean (PK=1), both on server 2.
In both cases, the local entity is on the same machine (read VM) as the session bean. It has to be, the spec says so.
So, you have got to have more than one instance of PK=1 bean, on two servers.
If you don't buy that one, then consider this.
Local Entity 1 has reference to Local Entity 2. Both on server 1.
Remote Session Bean 1 has reference to local entity 1, on server 2.
Once again, you have to have more than one instance. My own thoughts about it lead me to conclude that not supporting multiple instances, will lead very quickly to a heavy concentration on a single server.
Imagine this. You are using an entity bean to represent an entry in your database of DVDs which you sell on the net.
American Pie 2 (Excellent BTW!) is launched on DVD.
Your session bean shopping carts are load balanced across a cluster. But they all have local references to the DVD entry for AP2. One of two situations occurs:
1) All session beans are forced to be on the same server as the AP2 local entity bean. Quickly kills performance in this relatively real world example.
2) You have multiple instances of the local entity.
This is not to say that the bean will not behave pessimistically. If it sits on top of Sybase then it will be pessimistic, due to the pessimistic concurrency of Sybase. However, in effect the concurrency model becomes completely delegated to the underlying database repository. The server has little choice but to behave in a somewhat optimistic manner.
Long I know, but am I missing something huge here or does it make sense?
Ooops. First line.
^pessimistically optimistic^pessimistically concurrent
Thanks a lot for the good discussion happened here.
We found a slightly different solution what we have discussed here. Instead of doing "SELECT and UPDATE", we did "UPDATE then SELECT and UPDATE". The first update is a dummy one and not doing anything on the record. But it locks the record. After that, I select the value using "SELECT" and with the required data I update it back. Finally I do commit. Since, the record is locked by the first update, other clients are not allowed to do update immediately until one transaction (UPDATE,SELECT,UPDATE) is over. This way we solved the problem.
If you find this as a not good solution, please suggest me something more. I tried most of the above solutions mentioned but at the concurrency time many of them are not working properly.