News: Avoid Java transactions pitfalls with Spring

  1. Avoid Java transactions pitfalls with Spring (28 messages)

    Transaction processing should achieve a high degree of data integrity and consistency. This article, the first in a series on developing an effective transaction strategy for the Java platform, introduces common transaction pitfalls that can prevent you from reaching this goal. Using code examples from the Spring Framework and the Enterprise JavaBeans (EJB) 3.0 specification, series author Mark Richards explains these all-too-common mistakes.

  2. Misleading title

    Should be: "Understanding transaction pitfalls when using JPA or Spring".
  3. Very informative. Good tutorial

    I've had my problems with JPA: being unable to store objects, etc. This article explains it very well.
  4. The original title is "Understanding transaction pitfalls".
  5. great article

    everybody should read it. thanks
  6. Excellent article

    One of the best articles posted here. Very informative and I have seen people get confused over issues surrounding transactions, even in production!
  7. It's always a good thing to educate developers on the importance of transactions, and this article does a good job describing some common issues. However, I do not agree with everything Mark Richards has to say here. This part especially seems dangerous advice in general:
    "Why would you need a transaction if you are only reading data? The answer is that you don't. Starting a transaction to perform a read-only operation adds to the overhead of the processing thread and can cause shared read locks on the database (depending on what type of database you are using and what the isolation level is set to). The bottom line is that the read-only flag is somewhat meaningless when you use it for JDBC-based Java persistence and causes additional overhead when an unnecessary transaction is started."

    For some applications it might be OK to forgo consistency guarantees by doing individual reads. However, in many applications it's very important that all reads in a read-only use case are performed under a single transaction, as it ensures that you're seeing consistent data. How consistent is determined by the isolation level you're using: READ_COMMITTED will only give you cursor stability, i.e. the guarantee that you won't have dirty reads. Higher isolation levels -- REPEATABLE_READ or even SERIALIZABLE -- provide stronger guarantees: no unrepeatable reads or phantom reads, respectively. Some databases implement this by using locking, and in that case you can indeed see lots of locking occurring. 'Fixing' this by not using transactions at all is typically not the right solution; instead, you should lower your isolation level (for which you need a transaction) and handle things like phantom reads at the application level (effectively turning pessimistic locking into optimistic locking).

    BTW, notice that the default isolation level for a locally managed JDBC transaction is in fact not READ_COMMITTED as the article suggests, but is the default isolation level from your JDBC driver. That means it varies by DBMS or even driver: on DB2, for example, it's REPEATABLE_READ and not READ_COMMITTED.

    Also, the effect of using a read-only transaction with JDBC depends very much on the JDBC driver that's being used. Setting a connection to read-only mode is just a hint to the driver that it can optimize for the read-only case. It's not guaranteed to disallow updates and it's not guaranteed to have a positive effect on performance, but it never hurts and can sometimes give a small boost.

    Using read-only with Hibernate, however, can cause a significant speed increase if you've read lots of data that didn't change. The article states:
    "However, the propagation mode of REQUIRED overrides all of this, allowing the transaction to start and work as it would without the read-only flag set."
    I've used Spring and Hibernate with read-only transactions, and AFAIK this is simply not true: Hibernate will not flush its changes to the database on commit with the flush mode set to MANUAL (NEVER, as stated in the article, is the deprecated FlushMode variant).
    If this behavior is seen with JPA, it should be filed as a bug against Spring, as the whole point of having a read-only transaction with Hibernate is to skip the expensive checking for changes against snapshots of the read data.

    We (i.e. one of my SpringSource colleagues) will try to contact Mark to see if we can get these issues fixed in the article.

    --
    Joris Kuipers
    SpringSource Senior Consultant
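The JDBC-level points in the post above (the default isolation level comes from the driver, and read-only is only a hint) can be sketched with plain `java.sql` calls. This is a minimal illustration, not code from the article: the class name `ReadOnlyTxSketch` and its helper methods are hypothetical, and the `Connection` is assumed to be obtained elsewhere.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.SQLException;

// Hypothetical helper class; only the java.sql calls are real JDBC API.
final class ReadOnlyTxSketch {

    /** Maps a JDBC isolation constant to a readable name. */
    public static String isolationName(int level) {
        switch (level) {
            case Connection.TRANSACTION_NONE:             return "NONE";
            case Connection.TRANSACTION_READ_UNCOMMITTED: return "READ_UNCOMMITTED";
            case Connection.TRANSACTION_READ_COMMITTED:   return "READ_COMMITTED";
            case Connection.TRANSACTION_REPEATABLE_READ:  return "REPEATABLE_READ";
            case Connection.TRANSACTION_SERIALIZABLE:     return "SERIALIZABLE";
            default:                                      return "UNKNOWN(" + level + ")";
        }
    }

    /** Asks the driver which isolation level it uses when none is set explicitly. */
    public static String driverDefaultIsolation(Connection con) throws SQLException {
        DatabaseMetaData meta = con.getMetaData();
        return isolationName(meta.getDefaultTransactionIsolation());
    }

    /**
     * Prepares a connection for a multi-statement read under one transaction.
     * The isolation level and the read-only hint are set before auto-commit is
     * switched off, because changing them mid-transaction is not portable.
     */
    public static void beginReadOnly(Connection con, int isolation) throws SQLException {
        con.setTransactionIsolation(isolation);
        con.setReadOnly(true);      // a hint only; the driver may ignore it
        con.setAutoCommit(false);   // subsequent statements share one transaction
    }
}
```

A caller would run its queries after `beginReadOnly(...)` and then call `con.commit()` (or `rollback()`) to end the read transaction and release any locks.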
  8. I fully agree with your point. I hope you can contact Mark and correct this article. You Spring guys are good! :)
  9. I have not read the article, so I don't know what Mark has said about some of these things, but a couple of comments on some of the things that Joris said:

    If Mark said that you *never* need a transaction for reads then I would say that might be a little bit of a stretch; however, if he said that you don't need one as often then he would be correct. It is simply the exception that you need the kind of consistency across reads that a transaction gives you, and the cost of doing so can have dramatic effects on the scalability of your application. This is one of the main reasons why we added non-tx reads to JPA.

    Secondly, setting the transaction isolation to the correct level is obviously important, but I would not say that it is always the best strategy for solving every read consistency problem. Again, the transaction cost can be very high depending upon the duration, and you can often solve the problem with optimistic locking (and optimistic RR) without incurring nearly the same DB locking hit.

    Lastly, Mark is right if he said that READ COMMITTED is pretty much the de facto standard default, and is the one that makes the most sense given the consistency/performance trade-off. Last time I checked, DB2 had an isolation of Cursor Stability (more or less equiv to READ COMMITTED). Note that *WebSphere* at one point had a default of REPEATABLE READ (equiv to SERIALIZABLE), I think, but I am not sure what they do right now.

    If you have read-only data then anywhere that you can configure for read-onlyness is going to be a win. Knowing and being able to optimize for immutable data can only help your cause :)

    -Mike
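The optimistic locking mentioned in the post above is usually built on a version column that is checked at write time instead of holding locks during the read. Below is a minimal in-memory sketch of that idea; the class `OptimisticLockSketch` and its map-backed "table" are hypothetical stand-ins, and a real database would perform the check with a statement such as `UPDATE t SET value = ?, version = version + 1 WHERE id = ? AND version = ?`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of a table with a version column.
final class OptimisticLockSketch {

    /** Immutable snapshot of a row, including the version that was read. */
    static final class Row {
        final String value;
        final long version;

        Row(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<Long, Row> table = new HashMap<>();

    synchronized void insert(long id, String value) {
        table.put(id, new Row(value, 0));
    }

    /** Plain read: no lock is held after this method returns. */
    synchronized Row read(long id) {
        return table.get(id);
    }

    /**
     * Writes only if the row still has the version the caller read earlier.
     * Returns false when another writer changed the row in the meantime.
     */
    synchronized boolean update(long id, String newValue, long expectedVersion) {
        Row current = table.get(id);
        if (current == null || current.version != expectedVersion) {
            return false;
        }
        table.put(id, new Row(newValue, expectedVersion + 1));
        return true;
    }
}
```

The trade-off is that a conflicting writer is detected late, at update time, and the caller has to retry or report the conflict, but no database locks are held between the read and the write.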
  10. Thanks Mike - you are spot on in your response. To say you *never* need transactions on a read would indeed be a foolish statement. As I indicated in the article (and in my transaction book), there are certainly times when you need a transaction on read-only data, the primary example being when you need to access updated or inserted data not yet committed during a read operation (in this case SUPPORTS is used).

    You are correct - READ_COMMITTED is pretty much the de facto standard and that was my intention in writing that statement. This may vary between database vendors, but in a 3,000 word article I was not about to go into that sort of detail - it isn't a relevant point.

    Also, yes, there are subtle differences between JPA and Hibernate; maybe I should have made it more clear in the article that I was using JPA. I will make sure I indicate that in subsequent articles to avoid confusion.

    It is very unfortunate that the gentleman who did the original TSS post got the title of the article wrong. This was *not* an article about Spring; rather, it was an article about transactions with *examples* in Spring and EJB3. Just because I primarily used Spring in my examples certainly doesn't imply there is anything wrong with the way Spring handles transactions; on the contrary, it handles them very nicely. It's the pilot error I was really focusing on in this article; that and the impact of not having an effective transaction strategy.
  11. Mark, maybe this is what you meant, but this is not really what you said. To quote:

    "The odd thing about the read-only flag is that you need to start a transaction in order to use it. Why would you need a transaction if you are only reading data? The answer is that you don't. Starting a transaction to perform a read-only operation adds to the overhead of the processing thread and can cause shared read locks on the database (depending on what type of database you are using and what the isolation level is set to). The bottom line is that the read-only flag is somewhat meaningless when you use it for JDBC-based Java persistence and causes additional overhead when an unnecessary transaction is started."

    This is a pretty clear indictment of read-only transactions (and of using transactions in general for reading data), without much in the way of any qualifiers, and implies that if you are only reading data, you really don't need a transaction. In fact, if you are only reading data (in single or multiple steps), and somebody else is potentially writing at the same time, you have no ability to guarantee ACIDity on that read data (ASSUMING you do care about it) without that transaction in place.

    I do think this is an important point. It may be pretty clear to you when you need a read transaction, and that this is a subset of the various read cases, but I've had dozens of discussions during Spring trainings with course attendees who could simply not understand that a transaction was _ever_ useful if you are only reading data. Wording like the above doesn't help...

    I think the inaccuracy about the flush behaviour in Hibernate that Joris mentioned is pretty important too. If I need a transaction, and I am only reading, I definitely don't want Hibernate trying to flush for no reason, so that alone is enough reason for setting it read-only. I guess there is also the edge case of catching accidental writes via the read-only flag. I'd personally rather go for non-buggy code that never modifies the objects in the first place for the read-only use cases, but the non-flush behaviour also allows you to easily ensure read-only cases really stay read-only.

    Regards,
    Colin
  12. Colin,

    You made some good points here. While I agree with you regarding some of the use cases involving the use of transactions for read-only operations, in my experience those use cases (a majority of them being edge cases) must be weighed against the possibility of introducing shared locks and deadlocks in the database. While I certainly agree there are use cases for transactional read-only operations, in my experience it has caused throughput issues, deadlocks (depending on the database of course), and a host of other issues not even related to performance (such as the lack of a transaction strategy in the first place). That said, your point is clear; there are times when you may want to use a transactional read-only operation, but those cases are not common in my opinion.

    Regarding the Hibernate points, I stated in my article that the Hibernate flush mode will be set to NEVER. In any debate, always go to the source (no pun intended). If you look at the Spring 2.5.4 HibernateTransactionManager doBegin() method, you will see why I stated this:

        if (definition.isReadOnly() && txObject.isNewSessionHolder()) {
            // Just set to NEVER in case of a new Session for this transaction.
            session.setFlushMode(FlushMode.NEVER);
        }

        if (!definition.isReadOnly() && !txObject.isNewSessionHolder()) {
            // We need AUTO or COMMIT for a non-read-only transaction.
            FlushMode flushMode = session.getFlushMode();
            if (flushMode.lessThan(FlushMode.COMMIT)) {
                session.setFlushMode(FlushMode.AUTO);
                txObject.getSessionHolder().setPreviousFlushMode(flushMode);
            }
        }

    I may be missing something here, but if Spring sets the Hibernate flush mode to NEVER yet the update still happens, then I can only infer that Hibernate is ignoring the flush mode setting. If you execute the code in the article, you will see this behavior when using Spring 2.5.4 with JPA and Hibernate. It could very well be an issue with the combination of these frameworks (particularly JPA).

    In any event, I still don't see the optimizations or advantages to using the readOnly flag when using JPA. Good discussion Colin - it really gets into the nitty-gritty of transaction processing!

    Thanks,
    Mark
  13. Mark,

    In your last comment, I believe you are referring to Listing 7? When testing that, are you invoking the method in an already existing transaction? If so, the flush mode will not be set to NEVER since the second part of the condition will not evaluate to true: txObject.isNewSessionHolder(). The 'readOnly' setting only takes effect if the method containing it begins a new transactional boundary.

    Regards,
    Mark
  14. Mark,

    Yes, Listing 7 was what I was referring to. I am not invoking it in the context of a transaction, so that method *does* start a new transaction boundary (REQUIRED propagation mode). Thus, from what I can tell, txObject.isNewSessionHolder() will evaluate to true and therefore set the flush mode to NEVER. I may be wrong, as I am not sure of the criteria for determining the boolean value of the isNewSessionHolder() method, but as you stated, I was assuming it was when the transaction was started.

    Interestingly, looking more closely at the Spring code I pasted in the earlier post, if in fact that "if" statement we are talking about evaluates to false, it appears as though the flush mode will be left at the default value, which is AUTO. Sounds like a good case for a debug session to find out for sure...

    Thanks,
    Mark
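The branch logic debated in the last few posts can be traced with a small stand-alone model of the two `if` statements quoted from Spring 2.5.4 `doBegin()`. This is only a sketch for reasoning about the outcomes: the class `FlushModeSketch`, its three-value enum, and the `afterBegin` method are hypothetical, and Hibernate's real `FlushMode` has more values than shown here.

```java
// Hypothetical model of the quoted decision logic; not Spring or Hibernate code.
final class FlushModeSketch {

    /** Simplified flush modes, ordered from least to most eager. */
    public enum FlushMode {
        NEVER, COMMIT, AUTO;

        public boolean lessThan(FlushMode other) {
            return ordinal() < other.ordinal();
        }
    }

    /**
     * Returns the flush mode the session ends up with after the two
     * conditions from the quoted excerpt have been evaluated.
     */
    public static FlushMode afterBegin(boolean readOnly,
                                       boolean newSessionHolder,
                                       FlushMode current) {
        if (readOnly && newSessionHolder) {
            // Read-only transaction that opened its own session: never flush.
            return FlushMode.NEVER;
        }
        if (!readOnly && !newSessionHolder && current.lessThan(FlushMode.COMMIT)) {
            // Read-write transaction joining an existing session: make sure it flushes.
            return FlushMode.AUTO;
        }
        // Neither branch applies: the session keeps whatever mode it already had.
        return current;
    }
}
```

Under this model, a read-only method that joins an existing session (`readOnly = true`, `newSessionHolder = false`) leaves the flush mode untouched, so a session that was already in AUTO stays in AUTO.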
  15. Mark,

    I created a simple test based on Listing #7 and verified that the FlushMode is indeed set to NEVER, and that the update does not happen.

    In the article, you mentioned that Hibernate sets "the flush mode of the object cache to NEVER, indicating that the object cache should not be synchronized with the database during this unit of work. However, the propagation mode of REQUIRED overrides all of this, allowing the transaction to start and work as it would without the read-only flag set."

    With my example, that last part is not true; the changes are *not* written to the database (I am using Hibernate directly, and not through the JPA API).

    Are you verifying that the data is actually updated... perhaps executing a simple JDBC query immediately after the Transaction boundary is passed?

    Regards,
    Mark
  16. As I have stated *numerous* times, I am using JPA in my examples and in the article. The listing headers indicate that I am using JPA, the text indicates I am using JPA. I am not sure how I can make that point any more clear.

    Mark
  17. Sorry Mark. I definitely did not mean to imply that was an apples-to-apples comparison. The only reason that I used the direct Hibernate implementation is that I already had an example ready to go. I did not have a JPA-based equivalent ready, and I was too busy to tackle that at the time. Based on previous experience, I expected the Hibernate JPA implementation to behave in the same way.

    To verify this now that I had some time, I just refactored my example to use the JPA API, and I am using Hibernate as the implementation. Of course, with JPA 1.0 there is no FlushModeType option equivalent to 'NEVER' (only AUTO and COMMIT). However, when using the HibernateJpaVendorAdapter in Spring, it does perform the same as FlushMode.NEVER. I just verified this behavior with the refactored sample: the updates are not written to the database for a 'readOnly' transaction.

    As I mentioned (maybe not clearly) in my earlier comment, the 'readOnly' flag does only apply if a transaction is already active. This means even when using Spring's AbstractJpaTests, the flag is ignored, because those test methods are run within a transaction that is rolled back after the method completes.

    I hope this clears things up a bit.

    Regards,
    Mark
  18. Mark,

    After a bit more analysis, I have discovered what I think is triggering inserts in your example. Please let me know if this describes the situation.

    The behavior that I explained above applies for an "update" (when the Entity already exists in the database but is being modified in a 'readOnly' transaction). In those cases, when using JPA and Hibernate as a provider, the updates will not be flushed. The Hibernate flush mode is set to MANUAL internally (on the Session), and that is not overridden by the propagation setting. The flush mode does not always apply for an "insert" however.

    Whether or not the flush mode does apply for an insert depends on the ID generation strategy. I assume that you are using GenerationType.IDENTITY (or AUTO) for your primary key field? In that case, Hibernate does apply the insert in order to establish the ID to be used for maintaining that Entity in the session. If you use a different ID generation strategy such as "SEQUENCE" in Oracle or "TABLE" in MySQL, the corresponding ID value will be created in the SEQUENCE or TABLE, but no row should be inserted for the Entity in question.

    There is an open issue in Hibernate for this (http://opensource.atlassian.com/projects/hibernate/browse/HHH-2439). So, it appears as though the behavior may change in the future, but they did not want to introduce this change in a point release.

    Regards,
    Mark
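The difference described above between identity-generated and sequence-generated keys can be illustrated with a toy model. It is not Hibernate code: the class `IdStrategySketch`, its lists standing in for the database and the pending-insert queue, and the method names are all hypothetical, and the model only captures the ordering of "assign ID" versus "write row".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of a persistence session; not Hibernate or JPA API.
final class IdStrategySketch {

    enum Strategy { IDENTITY, SEQUENCE }

    private final List<String> database = new ArrayList<>(); // rows actually written
    private final List<String> pending = new ArrayList<>();  // inserts waiting for a flush
    private long sequence = 0;

    /** Registers a new entity and returns the ID assigned to it. */
    long persist(String entity, Strategy strategy) {
        if (strategy == Strategy.IDENTITY) {
            // The ID is produced by the INSERT itself, so the row must be
            // written immediately, whatever the flush mode says.
            database.add(entity);
            return database.size();
        }
        // The ID comes from a separate sequence, so the row can be deferred.
        pending.add(entity);
        return ++sequence;
    }

    /** Writes deferred inserts; in a read-only transaction this is never called. */
    void flush() {
        database.addAll(pending);
        pending.clear();
    }

    int rowsWritten() {
        return database.size();
    }
}
```

In this model, persisting with `IDENTITY` and never calling `flush()` still leaves one row written, while the same steps with `SEQUENCE` leave none, which matches the behavior reported in the post.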
  19. Mark F.,

    Excellent observations and analysis. I reran a whole battery of tests myself, and confirmed your results. However, we should probably point out for the benefit of the development community that this is entirely version and provider specific. This behavior was added in version 2.1 of Spring; prior versions will always do the insert as well as the update. I also confirmed that, even with the latest version of Spring and TopLink, TopLink will do the inserts as well as the updates when the readOnly flag is set to true.

    Regardless of these results, the point is that developers probably shouldn't be setting the readOnly flag to true for database updates anyway, particularly since no exception is thrown indicating your updates are not taking place. However, as we all know, cases like this are always fun to do in a "what if I did this..." scenario :-)

    Thanks,
    Mark
  20. Mark, would you be so kind as to post a synopsis of the tests you ran? Kyle
  21. typo correction

    As I mentioned (maybe not clearly) in my earlier comment, the 'readOnly' flag does only apply if a transaction is already active.
    Ah, the irony... in attempting to clarify I accidentally produced a typo. Sorry... what the above should have said is: "the 'readOnly' flag does *not* apply if a transaction is already active."

    Another interesting subtle point: when using SUPPORTS and not joining an existing Transaction, Spring will still scope the Hibernate Session according to the boundaries of the @Transactional annotation. Therefore, it can perform multiple reads with the same Session rather than using a new Session for each read (and it does prevent flushing as well).

    Regards,
    Mark
  22. If we are using lazy loading in Hibernate, we need to have a transaction session even if it is a read-only one, right? Otherwise you get the infamous LazyInitializationException.

    Also, do you guys know a good way of setting the flush mode to NEVER without marking the transaction read-only? We are using declarative transaction definitions. The reason is that Oracle doesn't allow the connection to be set as read-only, which causes exceptions to show up in our logs.

    Thanks,
    ali.
  23. "I do think this is an important point. It may be pretty clear to you when you need a read transaction, and that this is a subset of the various read cases..."

    Sure, I think we all agree that there are cases when you need transactions for reads, but Colin, this is what concerns me:

    If people are told disproportionately about the cases when read consistency and pessimistic transaction locking are necessary, then they get worried and just do it every time because they don't want to take any chances. The vaccine is actually putting them at a greater risk than getting the disease, since their scalability will suffer in non-trivial ways.

    "I've had dozens of discussions during Spring trainings with course attendees who could simply not understand that a transaction was _ever_ useful if you are only reading data. Wording like the above doesn't help..."

    I guess that's why they are taking courses... so they can learn why Repeatable Read was defined :). Knock 'em dead, Colin, just don't scare them too badly, please!
  24. I do think this is an important point. It may be pretty clear to you when you need a read transaction, and that this is a subset of the various read cases...

    Sure, I think we all agree that there are cases when you need transactions for reads, but Colin, this is what concerns me:

    If people are told disproportionately about the cases when read consistency and pessimistic transaction locking are necessary then they get worried and just do it every time because they don't want to take any chances. The vaccine is actually putting them at a greater risk than getting the disease, since their scalability will suffer in non-trivial ways and
    I've had dozens of discussions during Spring trainings with course attendees who could simply not understand that a transaction was _ever_ useful if you are only reading data. Wording like the above doesn't help...

    I guess that's why they are taking courses... so they can learn why Repeatable Read was defined :). Knock 'em dead, Colin, just don't scare them too badly, please!

    To tell you the truth (and this is part of the big picture, which I'll get to below), I would actually go so far as to say that I think most Java developers doing database work probably worry too little, not too much, about read consistency. For that hypothetical 10% case when they really need it, they really need it, and the potential negative consequences of not getting it may be quite a lot more serious (monetary cost, human or other danger, etc.) than the consequence of using it when it's not needed (increased resource utilization in a system which is not fully utilized anyway, so no real impact at all).

    So my whole point in this area is that Mark made the whole read-only argument too one-sided. It's not a black and white thing (i.e. "never do read-only transactions"), which is what the naive reader might think with that wording. I don't want to scare anybody, but thinking through their use cases and applying the appropriate transactionality, isolation, and locking is entirely appropriate, and the time is very well spent.

    In general (and this is what I mean by the big picture), this is not about the read-only case, but about doing what is right for each case. And hopefully here we're in agreement: people need to properly understand and think this stuff through. My pretty subjective perspective, but one formed after teaching hundreds of students, and also interviewing probably 100+ people for various dev/consultant roles over the years, is that there is a somewhat shocking level of ignorance and lack of care about transactionality/ACID principles and how all this stuff is appropriately used and applied.

    I would stand by the basic principle that people should spend the (completely reasonable) time to learn how transactions work, when they need them and when they don't, how isolation works, how locking works, etc., and then spend the (completely reasonable) time to think about the appropriate use of these for the actual software and use cases they are implementing. That time invested will be well paid back to them... Colin
  25. That's interesting. I agree that people do not give enough brain time to consistency in general, although I find it rather common for folks to just go ahead and put transactions around stuff, thinking that will guard against whatever concurrency or consistency monsters they think may be hiding ready to bite them. The result is the same, I guess. Like you say, people just don't take the necessary time to understand this stuff. It isn't that hard, and failing to do so will eventually end up hurting (if not on the one side then on the other!) PS Maybe that should be the topic of our *next* JavaOne talk together :-)
  26. I was going to point out the "a transaction is no good when read-only" claim, then I saw your reply. Yes, absolutely right. Even a single read in auto-commit mode actually requires a statement-level read-consistency lock and an implicit transaction, at the READ_COMMITTED isolation level for most databases.
    It's always a good thing to educate developers on the importance of transactions, and this article does a good job describing some common issues. However, I do not agree with everything Mark Richards has to say here. Esp. this part seems dangerous advice in general:
    "Why would you need a transaction if you are only reading data? The answer is that you don't. Starting a transaction to perform a read-only operation adds to the overhead of the processing thread and can cause shared read locks on the database (depending on what type of database you are using and what the isolation level is set to). The bottom line is that the read-only flag is somewhat meaningless when you use it for JDBC-based Java persistence and causes additional overhead when an unnecessary transaction is started."

    For some applications it might be OK to forego consistency guarantees by doing individual reads. However, in many applications it's very important that all reads in a read-only use case are performed under a single transaction, as it ensures that you're seeing consistent data. How consistent is determined by the isolation level you're using: READ_COMMITTED will only give you cursor stability, i.e. the guarantee that you won't have dirty reads. Higher isolation levels -- REPEATABLE_READ or even SERIALIZABLE -- provide stronger guarantees: no unrepeatable reads or phantom reads, respectively. Some databases implement this by using locking, and in that case you can indeed see lots of locking occurring. 'Fixing' this by not using transactions at all is typically not the right solution, you should lower your isolation level (for which you need a transaction) and handle things like phantom reads at the application level (effectively turning pessimistic locking into optimistic locking).

    BTW, notice that the default isolation level for a locally managed JDBC transaction is in fact not READ_COMMITTED as the article suggests, but is the default isolation level from your JDBC driver. That means it varies by DBMS or even driver: on DB2, for example, it's REPEATABLE_READ and not READ_COMMITTED.

    Also, the effect of using a read-only transaction with JDBC depends very much on the JDBC driver that's being used. Setting a connection to read-only mode is just a hint to the driver that it can optimize for the read-only case. It's not guaranteed to disallow updates and it's not guaranteed to have a positive effect on performance, but it never hurts and can sometimes give a small boost.

    Using read-only with Hibernate, however, can cause a significant speed increase if you've read lots of data that didn't change. The article states:
    "However, the propagation mode of REQUIRED overrides all of this, allowing the transaction to start and work as it would without the read-only flag set."
    I've used Spring and Hibernate with read-only transactions, and AFAIK this is simply not true: Hibernate will not flush its changes to the database on commit with the flush mode set to MANUAL (NEVER, as stated in the article, is the deprecated FlushMode variant).
    If this behavior is seen with JPA, it should be filed as a bug against Spring, as the whole point of having a read-only transaction with Hibernate is to skip the expensive checking for changes against snapshots of the read data.

    We (i.e. one of my SpringSource colleagues) will try to contact Mark to see if we can get these issues fixed in the article.

    --
    Joris Kuipers
    SpringSource Senior Consultant
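
    The points in the quoted post about driver-dependent defaults and the read-only hint can be checked with plain JDBC. The following is only a minimal sketch: the class and method names (IsolationReport, isolationName, describe, prepareForReads) are made up for illustration, and what a given driver does with the read-only hint is not something this code can guarantee.

```java
import java.sql.Connection;
import java.sql.SQLException;

final class IsolationReport {

    // Maps the java.sql.Connection isolation constants to readable names.
    static String isolationName(int level) {
        switch (level) {
            case Connection.TRANSACTION_NONE:             return "NONE";
            case Connection.TRANSACTION_READ_UNCOMMITTED: return "READ_UNCOMMITTED";
            case Connection.TRANSACTION_READ_COMMITTED:   return "READ_COMMITTED";
            case Connection.TRANSACTION_REPEATABLE_READ:  return "REPEATABLE_READ";
            case Connection.TRANSACTION_SERIALIZABLE:     return "SERIALIZABLE";
            default:                                      return "UNKNOWN(" + level + ")";
        }
    }

    // Asks the driver what it actually uses instead of assuming READ_COMMITTED.
    static String describe(Connection connection) throws SQLException {
        int driverDefault = connection.getMetaData().getDefaultTransactionIsolation();
        int current = connection.getTransactionIsolation();
        return "driver default=" + isolationName(driverDefault)
             + ", current=" + isolationName(current)
             + ", readOnly=" + connection.isReadOnly();
    }

    // Passes the read-only hint and lowers the isolation level for a read-only use case.
    // Both calls should be made before the transaction's first statement.
    static void prepareForReads(Connection connection) throws SQLException {
        connection.setReadOnly(true); // a hint only; the driver may ignore it
        connection.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);
    }
}
```

    Logging the result of describe(connection) once at startup is a cheap way to find out which isolation level a particular DBMS and driver combination really defaults to.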
  27. Another pitfall

    This pitfall should be avoided even when it's just simple sample code, purely for illustration. From the article:
    String stmt = "INSERT INTO TRADE (ACCT_ID, SIDE, SYMBOL, SHARES, PRICE, STATE)" + "VALUES (" + trade.getAcct() + "','" + trade.getAction() + "','" + trade.getSymbol() + "'," + trade.getShares() + "," + trade.getPrice() + ",'" + trade.getState() + "')";
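
    The pitfall meant here is presumably that the statement is assembled by concatenating values into the SQL text, which invites SQL injection and quoting mistakes. A hedged sketch of the usual alternative, a JDBC PreparedStatement with placeholders, follows. The Trade interface is a hypothetical stand-in that only mirrors the getter names from the quoted line; the column types are assumptions, not taken from the article.

```java
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class TradeInsertSketch {

    // Hypothetical stand-in for the article's Trade class; the types are assumed.
    interface Trade {
        long getAcct();
        String getAction();
        String getSymbol();
        long getShares();
        BigDecimal getPrice();
        String getState();
    }

    // Placeholders (?) keep the values out of the SQL text entirely.
    static final String INSERT_TRADE =
        "INSERT INTO TRADE (ACCT_ID, SIDE, SYMBOL, SHARES, PRICE, STATE) "
        + "VALUES (?, ?, ?, ?, ?, ?)";

    // Binds each value by position and returns the number of inserted rows.
    static int insertTrade(Connection connection, Trade trade) throws SQLException {
        try (PreparedStatement ps = connection.prepareStatement(INSERT_TRADE)) {
            ps.setLong(1, trade.getAcct());
            ps.setString(2, trade.getAction());
            ps.setString(3, trade.getSymbol());
            ps.setLong(4, trade.getShares());
            ps.setBigDecimal(5, trade.getPrice());
            ps.setString(6, trade.getState());
            return ps.executeUpdate();
        }
    }
}
```

    Because the driver sends the values separately from the statement text, a symbol or state containing a quote character cannot change the meaning of the SQL.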
  28. Setting the record straight

    For all of you reading this rather entertaining (but exhausting) thread, as the author of the article let me please set the record straight as to what I would recommend as an expert in transaction processing regarding database read operations and transactions.

    In my book "Java Transaction Design Strategies", as well as over 40 conference speaking engagements on Java transaction processing around the world, I have always stated and cautioned folks against the use of starting a transaction for database read operations. In my own professional experience at actual client sites, starting transactions for read operations has led to numerous problems, not the least of which are deadlocks, poor response times, and poor throughput. It has always been an indicator to me that the application lacks a clear and effective transaction strategy.

    As Mike Keith, JPA co-spec lead, correctly pointed out, "It is simply the exception that you need the kind of consistency across reads that a transaction gives you, and the cost of doing so can have dramatic effects on the scalability of your application. This is one of the main reasons why we added non-tx reads to JPA".

    Of course there may be cases where you may want to start a transaction for a read operation. However, these are occasional edge cases, and should only be considered if you have a darn good reason for doing so. My professional advice is *by default* you should not start a transaction for read operations. Period. End of discussion.

    If you don't agree with me, that is perfectly fine. I am only trying to help folks avoid some of the common pitfalls and increase the efficiency and throughput of their applications. In my humble opinion as a transaction expert, anyone who would advise you to always start transactions for database read operations should probably learn more about how transactions work in the Java platform. Better yet, get a free PDF copy of my transaction book on InfoQ (www.infoq.com) or read Dr. Mark Little's book "Java Transaction Processing". Both are excellent references for how transactions work in the Java platform.

    Happy committing!

    Mark
  29. jdbc transactions

    As far as I know, each JDBC statement (including SELECT statements) will trigger the start of a database transaction if there is none yet, as defined in the SQL standard.
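
    On the JDBC side this shows up as auto-commit: a new connection starts in auto-commit mode, so each statement runs in its own transaction unless auto-commit is switched off. A rough sketch of grouping several statements (reads included) into one explicit transaction might look like the following. ExplicitTransaction, SqlWork and inOneTransaction are hypothetical names used only for illustration.

```java
import java.sql.Connection;
import java.sql.SQLException;

final class ExplicitTransaction {

    // Hypothetical callback type: the unit of work to run on the connection.
    interface SqlWork<T> {
        T run(Connection connection) throws SQLException;
    }

    // Runs the work with auto-commit off so all of its statements share one transaction.
    static <T> T inOneTransaction(Connection connection, SqlWork<T> work) throws SQLException {
        boolean previousAutoCommit = connection.getAutoCommit();
        connection.setAutoCommit(false);
        try {
            T result = work.run(connection);
            connection.commit();
            return result;
        } catch (SQLException | RuntimeException e) {
            connection.rollback(); // undo everything done since auto-commit was disabled
            throw e;
        } finally {
            connection.setAutoCommit(previousAutoCommit); // restore the caller's setting
        }
    }
}
```

    Whether a read-only use case should be wrapped like this is exactly what the thread above argues about; the sketch only shows where the transaction boundaries end up in each case.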