-
Demystifying Caching in Hibernate (42 messages)
- Posted by: Alois Reitbauer
- Posted on: February 19 2009 07:56 EST
Hibernate uses various cache implementations to optimize data read/write performance. This can lead to substantial performance gains. Used the wrong way, however, it can result in serious performance problems. As seen in many real-world use cases, Hibernate caches are often used inappropriately because their inner workings and dynamic behavior are poorly understood. A current series on the dynaTrace Performance and Architecture Blog discusses this topic and provides deep insight into Hibernate cache behavior. Currently the following posts are available:
- The Hibernate Session Cache: http://blog.dynatrace.com/2009/02/16/understanding-caching-in-hibernate-part-one-the-session-cache/
- The Hibernate Query Cache: http://blog.dynatrace.com/2009/02/16/understanding-caching-in-hibernate-part-two-the-query-cache/
Upcoming topics are the second-level cache, and concurrency and cache behavior.
Threaded Messages (42)
- Not much flesh... by Karl Banke on February 19 2009 08:33 EST
- Reality tells a different story ... by Alois Reitbauer on February 20 2009 04:17 EST
- Re: Demystifying Caching in Hibernate by Billy Newport on February 19 2009 11:08 EST
- Re: Demystifying Caching in Hibernate by Billy Newport on February 19 2009 11:10 EST
-
Re: Demystifying Caching in Hibernate by Alessandro Santini on February 19 2009 11:45 EST
-
Re: Demystifying Caching in Hibernate by David McCoy on February 19 2009 04:22 EST
-
Re: Demystifying Caching in Hibernate by Tomi Tuomainen on February 20 2009 01:05 EST
-
Re: Demystifying Caching in Hibernate by Jürgen Lind on February 20 2009 04:50 EST
- Re: Demystifying Caching in Hibernate by Alessandro Santini on February 20 2009 05:01 EST
-
Re: Demystifying Caching in Hibernate by Mileta Cekovic on February 20 2009 05:42 EST
-
Re: Demystifying Caching in Hibernate by Guido Anzuoni on February 20 2009 07:00 EST
-
Re: Demystifying Caching in Hibernate by Alessandro Santini on February 20 2009 07:29 EST
- Re: Demystifying Caching in Hibernate by Guido Anzuoni on February 20 2009 08:14 EST
-
Re: Demystifying Caching in Hibernate by Karl Banke on February 20 2009 08:14 EST
-
Re: Demystifying Caching in Hibernate by Guido Anzuoni on February 20 2009 08:25 EST
- Re: Demystifying Caching in Hibernate by Karl Banke on February 20 2009 09:57 EST
-
Re: Demystifying Caching in Hibernate by Alessandro Santini on February 20 2009 08:44 EST
- Re: Demystifying Caching in Hibernate by Alessandro Santini on February 20 2009 08:47 EST
-
Caching versus a centralized database by Cameron Purdy on February 22 2009 02:42 EST
- Re: Caching versus a centralized database by Mark N on February 24 2009 10:20 EST
-
Re: Demystifying Caching in Hibernate by Tomi Tuomainen on February 20 2009 12:04 EST
-
Re: Demystifying Caching in Hibernate by Jürgen Lind on February 20 2009 12:56 EST
-
Re: Demystifying Caching in Hibernate by Guido Anzuoni on February 20 2009 06:10 EST
-
Re: Demystifying Caching in Hibernate by Jürgen Lind on February 20 2009 06:31 EST
-
Re: Demystifying Caching in Hibernate by Guido Anzuoni on February 20 2009 06:50 EST
- Re: Demystifying Caching in Hibernate by Jürgen Lind on February 21 2009 03:49 EST
-
Re: Demystifying Caching in Hibernate by Guido Anzuoni on February 20 2009 05:53 EST
-
Re: Demystifying Caching in Hibernate by Jürgen Lind on February 20 2009 06:16 EST
-
Re: Demystifying Caching in Hibernate by Guido Anzuoni on February 20 2009 06:37 EST
-
Re: Demystifying Caching in Hibernate by Jürgen Lind on February 21 2009 04:07 EST
- Simpler ORM ... by rob bygrave on February 21 2009 08:39 EST
-
Re: Demystifying Caching in Hibernate by James Watson on February 20 2009 11:11 EST
-
Re: Demystifying Caching in Hibernate by Tomi Tuomainen on February 20 2009 02:17 EST
- Re: Demystifying Caching in Hibernate by James Watson on February 20 2009 03:10 EST
- Re: Demystifying Caching in Hibernate by James Watson on February 21 2009 09:15 EST
- Re:What I would really need is lightweight... by Joe Clarke on February 20 2009 02:13 EST
- But no annotations, no xml by wal rus on February 21 2009 08:13 EST
- Re: Demystifying Caching in Hibernate by Alessandro Santini on February 20 2009 04:07 EST
- Re: Demystifying Caching in Hibernate by Luca Masini on February 23 2009 08:01 EST
- Re: Re: Demystifying Caching in Hibernate by Alois Reitbauer on February 24 2009 14:53 EST
- Re: Demystifying Caching in Hibernate by William Louth on February 24 2009 07:50 EST
- ORM - what for? by Stefan Schubert on February 24 2009 16:14 EST
-
Not much flesh...
- Posted by: Karl Banke
- Posted on: February 19 2009 08:33 EST
- in response to Alois Reitbauer
...in these articles. When using any database access layer, one should figure out how its basic paradigms work. For Hibernate this would be the session and its caching behavior. The really interesting part is second-level caching along with concurrency. And I still have my doubts whether these make sense in most real-world scenarios, where a database is neither mostly read-only nor accessed through a single channel. The only place I can see where second-level caching makes real sense is where data is either mostly read-only (for example in a product catalog) or where the (write) transaction rate is so high that storage access becomes a physical bottleneck. Most times I see second-level caching used, direct database access would not only be more consistent but also about as fast as reading from the cache.
-
Reality tells a different story ...
- Posted by: Alois Reitbauer
- Posted on: February 20 2009 04:17 EST
- in response to Karl Banke
Karl, I agree that the session and the caches are basics and should be well understood by developers. So my posts might really explain what is basic for you. However, in reality we see a lot of problems because people do not understand these principles. I have seen numerous cases where exactly a lack of knowledge in this area led to significant problems. Also, at conferences, when showing some of the effects of O/R mapper misuse, I see people who have not realized what is happening under the hood.
-
Re: Demystifying Caching in Hibernate
- Posted by: Billy Newport
- Posted on: February 19 2009 11:08 EST
- in response to Alois Reitbauer
From my selfish point of view, the main issue with caching for Hibernate and other OR mappers is that they do not have an efficient cache plugin interface. They are designed to work with local hash maps OR a fully replicated distributed HashMap. The plugin API is too fine-grained and results in too high an RPC overhead for pulling data from remote caches when you try to use them with something like WebSphere eXtreme Scale, GigaSpaces or Coherence. If someone wants to use a 50GB shared partitioned cache, then the current cache plugins are pretty unattractive from a performance point of view, mainly due to the lack of batching APIs in the cache plugin. I know we're trying to improve OpenJPA and I'm assuming, but don't know, that EclipseLink is being similarly improved for Coherence. Hibernate needs to be improved in this area too, as for now all you can do is use caches which completely fit in the free memory of a single JVM. If Hibernate does actually have SPIs for plugging in partitioned caches then I'm all ears, as obviously it's something I'd want to take advantage of.
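To make the batching point concrete, here is a minimal sketch in plain Java. It is not Hibernate's cache plugin interface: `RemoteCacheStub`, `getAll` and the round-trip counter are hypothetical names invented for illustration, and the "remote" cache is just a local map that counts how many simulated RPCs a caller would pay for.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a remote partitioned cache; it only counts simulated round trips. */
class RemoteCacheStub {
    private final Map<String, String> store = new HashMap<>();
    private int roundTrips = 0;

    void put(String key, String value) {
        store.put(key, value); // setup only, not counted as a round trip
    }

    /** Fine-grained access: one simulated RPC per key. */
    String get(String key) {
        roundTrips++;
        return store.get(key);
    }

    /** Batched access: one simulated RPC for the whole key list. */
    Map<String, String> getAll(List<String> keys) {
        roundTrips++;
        Map<String, String> found = new HashMap<>();
        for (String key : keys) {
            String value = store.get(key);
            if (value != null) {
                found.put(key, value);
            }
        }
        return found;
    }

    int roundTrips() {
        return roundTrips;
    }
}

class BatchingSketch {
    /** Returns how many round trips it took to fetch the keys one at a time. */
    static int loadOneByOne(RemoteCacheStub cache, List<String> keys) {
        int before = cache.roundTrips();
        for (String key : keys) {
            cache.get(key);
        }
        return cache.roundTrips() - before;
    }

    /** Returns how many round trips it took to fetch the keys in one batch. */
    static int loadBatched(RemoteCacheStub cache, List<String> keys) {
        int before = cache.roundTrips();
        cache.getAll(keys);
        return cache.roundTrips() - before;
    }

    public static void main(String[] args) {
        RemoteCacheStub cache = new RemoteCacheStub();
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            cache.put("k" + i, "v" + i);
            keys.add("k" + i);
        }
        System.out.println("one by one: " + loadOneByOne(cache, keys) + " round trips");
        System.out.println("batched:    " + loadBatched(cache, keys) + " round trips");
    }
}
```

With a per-key API the cost grows with the number of entities in a result; with a batch call it stays at one round trip, which is the difference that matters once each call crosses the network.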
-
Re: Demystifying Caching in Hibernate
- Posted by: Billy Newport
- Posted on: February 19 2009 11:10 EST
- in response to Billy Newport
A fully replicated distributed hash map means that a set of JVMs is caching 200MB of data in total and, whenever a member JVM changes its local data, the change is copied to all the other JVMs.
-
Re: Demystifying Caching in Hibernate
- Posted by: Alessandro Santini
- Posted on: February 19 2009 11:45 EST
- in response to Billy Newport
> A fully replicated distributed hash map means that a set of JVMs is caching 200MB of data in total and, whenever a member JVM changes its local data, the change is copied to all the other JVMs.
I am not a fan of Hibernate and ORM in general (I prefer iBatis for instance). In iBatis you can assign a SqlMap to a specific cache region (i.e. a specific "Map" with its own contents and eviction strategy). I am pretty sure that Hibernate supports this as well, but I might be wrong.
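The idea of a cache region as "a specific Map with its own contents and eviction strategy" can be sketched with nothing but the JDK. This is not the iBatis or Hibernate configuration; `LruRegion`, its name and its capacity are assumptions for illustration, built on `LinkedHashMap` in access order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical named cache region: its own map, its own capacity, LRU eviction. */
class LruRegion<K, V> extends LinkedHashMap<K, V> {
    private final String name;
    private final int maxEntries;

    LruRegion(String name, int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true keeps least-recently-used entries first
        this.name = name;
        this.maxEntries = maxEntries;
    }

    String regionName() {
        return name;
    }

    /** Called by LinkedHashMap after each insert; returning true evicts the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

A region for a product catalog could then be created as `new LruRegion<String, String>("products", 10_000)`, while a more volatile entity gets a separate, smaller instance, so each region evicts independently. Note that `LinkedHashMap` is not thread-safe, so a shared region would need external synchronization.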
-
Re: Demystifying Caching in Hibernate
- Posted by: David McCoy
- Posted on: February 19 2009 16:22 EST
- in response to Alessandro Santini
>> A fully replicated distributed hash map means that a set of JVMs is caching 200MB of data in total and, whenever a member JVM changes its local data, the change is copied to all the other JVMs.
> I am not a fan of Hibernate and ORM in general (I prefer iBatis for instance). In iBatis you can assign a SqlMap to a specific cache region (i.e. a specific "Map" with its own contents and eviction strategy). I am pretty sure that Hibernate supports this as well, but I might be wrong.
It is interesting that in 2009 the ORM debate continues...
-
Re: Demystifying Caching in Hibernate
- Posted by: Tomi Tuomainen
- Posted on: February 20 2009 01:05 EST
- in response to David McCoy
> It is interesting that in 2009 the ORM debate continues...
Yep, after struggling with Hibernate for some years now, I'm thinking of dropping it in my next project. Mapping non-trivial table structures takes too much effort and ends up with performance issues. Queries last seconds. Then we start wiring up caches. This is not just my bad skills but happens in every project at my customer, with different coders and vendors. All this feels mindless; with basic sql I could easily write fast enough queries without hibernate. Writing XML/annotations, configuring, caching, tuning takes about 10 times longer. Spring handles my connections and transactions. What I would really need is a lightweight, simple reflection-based ORM (one jar) that generates error-prone CRUD sql for simple one-row POJOs. But no annotations, no xml. I don't know if there is such a library already. If not, I'm writing it. In 2009, this is crazy.
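A rough idea of what such a reflection-based generator might look like, as a sketch only and not an existing library. The conventions are assumptions made for the example: the table is named after the class, each non-static field is a column, the key column is called `id`, and columns are sorted alphabetically so the generated SQL is stable. `ReflectiveSql` and `Customer` are hypothetical names.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ReflectiveSql {
    /** Column names are the non-static field names, sorted for a stable order (an assumption). */
    static List<String> columns(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (!Modifier.isStatic(field.getModifiers()) && !field.isSynthetic()) {
                names.add(field.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    /** Builds a parameterized INSERT; the table name is assumed to equal the simple class name. */
    static String insertSql(Class<?> type) {
        List<String> cols = columns(type);
        String placeholders = String.join(", ", Collections.nCopies(cols.size(), "?"));
        return "INSERT INTO " + type.getSimpleName()
                + " (" + String.join(", ", cols) + ") VALUES (" + placeholders + ")";
    }

    /** Builds a parameterized SELECT by primary key; the key column is assumed to be "id". */
    static String selectByIdSql(Class<?> type) {
        return "SELECT " + String.join(", ", columns(type))
                + " FROM " + type.getSimpleName() + " WHERE id = ?";
    }
}

/** Hypothetical one-row POJO: no annotations, no XML. */
class Customer {
    long id;
    String name;
    String email;
}
```

For the `Customer` class above, `ReflectiveSql.insertSql(Customer.class)` yields `INSERT INTO Customer (email, id, name) VALUES (?, ?, ?)`. Binding values to the placeholders and mapping result rows back to fields would follow the same field list; relationships and joins are deliberately left out.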
-
Re: Demystifying Caching in Hibernate
- Posted by: Jürgen Lind
- Posted on: February 20 2009 04:50 EST
- in response to Tomi Tuomainen
I fully agree, as 80% of the persistence code could be covered with very simple mechanisms. Some classes for this task and maybe some helper methods for dealing with, say, collections, and that should do it. Persistence is not that difficult after all if you do not try to find the silver bullet... Please drop me a note if you find such a framework (or if you have written one yourself). J.
-
Re: Demystifying Caching in Hibernate
- Posted by: Alessandro Santini
- Posted on: February 20 2009 05:01 EST
- in response to Jürgen Lind
I do not know if I understood your requirements correctly. However, ibator (ibatis.apache.org) generates simple DAOs that fit most cases.
-
Re: Demystifying Caching in Hibernate
- Posted by: Mileta Cekovic
- Posted on: February 20 2009 05:42 EST
- in response to Tomi Tuomainen
> What I would really need is a lightweight, simple reflection-based ORM (one jar) that generates error-prone CRUD sql for simple one-row POJOs. But no annotations, no xml. I don't know if there is such a library already. If not, I'm writing it. In 2009, this is crazy.
That's why I am still using my in-house lightweight ORM framework (it is not a library, as it DOES have restrictions on how you write POJOs, but that is OK for an in-house framework) where I can enable/disable the instance and query cache per entity type. And the framework does NOT use reflection, but rather requires writing strategies to read from a ResultSet and write to a PreparedStatement. And yes, I tried Hibernate several years ago and had problems similar to those the other posters mentioned, the most notable being too much time spent on configuring and tuning complex mappings, and inefficient caching (note that these are not Hibernate-specific problems, but general problems all full-blown ORM mappers will inherently have).
-
Re: Demystifying Caching in Hibernate
- Posted by: Guido Anzuoni
- Posted on: February 20 2009 07:00 EST
- in response to Mileta Cekovic
>> What I would really need is a lightweight, simple reflection-based ORM (one jar) that generates error-prone CRUD sql for simple one-row POJOs. But no annotations, no xml. I don't know if there is such a library already. If not, I'm writing it. In 2009, this is crazy.
> That's why I am still using my in-house lightweight ORM framework (it is not a library, as it DOES have restrictions on how you write POJOs, but that is OK for an in-house framework) where I can enable/disable the instance and query cache per entity type. And the framework does NOT use reflection, but rather requires writing strategies to read from a ResultSet and write to a PreparedStatement.
> And yes, I tried Hibernate several years ago and had problems similar to those the other posters mentioned, the most notable being too much time spent on configuring and tuning complex mappings, and inefficient caching (note that these are not Hibernate-specific problems, but general problems all full-blown ORM mappers will inherently have).
I think that using ORM techniques to map single-row objects as a "bag" of attributes is nonsense. So the negative experiences posted here are no surprise. Moreover, in my experience I have spent a significant amount of time mapping "strange" relational models rather than complex object models (rather the contrary, I would say). About inefficient caching: it would be better to avoid such blanket statements. Why is caching inefficient? Are you talking about a specific cache implementation? Do you believe that doing select * from TABLE where id = ? is faster than a lookup in a Map? Do you have any benchmark showing that a SELECT statement is faster than an in-memory map lookup? Guido
-
Re: Demystifying Caching in Hibernate
- Posted by: Alessandro Santini
- Posted on: February 20 2009 07:29 EST
- in response to Guido Anzuoni
I would like to add that these bad experiences are also the consequence of an incorrect approach to using ORM technologies. I have seen many projects where ORM et similia have been used without a real reason (e.g. simple CRUD applications and highly transactional applications). Reasons can be many: resume-driven development, poor understanding of the costs/benefits of ORM, over-engineering, etc. As to caching - nobody can state that a SELECT is faster than a Map lookup. But there are also implications, like cache behaviour in a clustered environment, that must be taken into account - and that is far from trivial.
-
Re: Demystifying Caching in Hibernate
- Posted by: Guido Anzuoni
- Posted on: February 20 2009 08:14 EST
- in response to Alessandro Santini
> As to caching - nobody can state that a SELECT is faster than a Map lookup. But there are also implications, like cache behaviour in a clustered environment, that must be taken into account - and that is far from trivial.
But (good or bad) cache behaviour is that of a particular implementation. Don't forget that ORMs normally have two levels of caching: one used to guarantee the uniqueness of a persistent object in a certain scope (Hibernate Session, JDO PersistenceManager) and a so-called level 2 cache to reduce DB round trips when an object is not in the level 1 cache. The level 1 cache is not clustered and it is hard to believe it is inefficient. I agree that level 2 is hard in a clustered environment, but this has nothing to do with ORM. Guido
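The two levels described above can be sketched in a few lines of plain Java. This is a simplified model, not how Hibernate or JDO implement it: `UnitOfWork`, `SharedCache`, `FakeDatabase` and `Item` are hypothetical names, the "database" just counts SELECTs, and invalidation, concurrency and clustering are left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical entity. */
class Item {
    final long id;
    final String name;

    Item(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

/** Hypothetical level 2 cache: shared by all units of work, stores state rather than instances. */
class SharedCache {
    private final Map<Long, String> state = new ConcurrentHashMap<>();

    String get(long id) {
        return state.get(id);
    }

    void put(long id, String name) {
        state.put(id, name);
    }
}

/** Hypothetical database stand-in that counts how many SELECTs were issued. */
class FakeDatabase {
    int selects = 0;

    String selectNameById(long id) {
        selects++;
        return "item-" + id;
    }
}

/** Hypothetical unit of work (the role a Session or PersistenceManager plays). */
class UnitOfWork {
    private final Map<Long, Item> firstLevel = new HashMap<>();
    private final SharedCache secondLevel;
    private final FakeDatabase db;

    UnitOfWork(SharedCache secondLevel, FakeDatabase db) {
        this.secondLevel = secondLevel;
        this.db = db;
    }

    Item load(long id) {
        Item known = firstLevel.get(id);
        if (known != null) {
            return known; // level 1: the same instance within this scope
        }
        String name = secondLevel.get(id); // level 2: avoids the DB round trip
        if (name == null) {
            name = db.selectNameById(id);
            secondLevel.put(id, name);
        }
        Item item = new Item(id, name);
        firstLevel.put(id, item);
        return item;
    }
}
```

Loading the same id twice in one `UnitOfWork` returns the identical instance; a second `UnitOfWork` gets its own instance but is served from the shared cache, so only one SELECT reaches the database. The hard part in a cluster is keeping `SharedCache` consistent across JVMs, which this sketch does not attempt.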
-
Re: Demystifying Caching in Hibernate
- Posted by: Karl Banke
- Posted on: February 20 2009 08:14 EST
- in response to Guido Anzuoni
> Why is caching inefficient? Are you talking about a specific cache implementation? Do you believe that doing
> select * from TABLE where id = ?
> is faster than a lookup in a Map?
> Do you have any benchmark showing that a SELECT statement is faster than an in-memory map lookup?
> Guido
In my experience, performing the lookup in the database will not be significantly slower than using an in-memory cache. Also, the database can be tuned (indexed) while the application is running, so that lookups become faster even when they are not by primary key. I find it strange that the people who have developed some caching framework should have done that much better than the thousands of people who have optimized database query and database internal caches. Finally, caching always creates potential inconsistency with the database, as the cache is seldom the only database client. This creates a new error class that needs proper handling. Caches have a point with very high write transaction rates and when using "legacy databases" with unfriendly isolation levels. For everything else, all they will do is eat CPU cycles and increase complexity.
-
Re: Demystifying Caching in Hibernate
- Posted by: Guido Anzuoni
- Posted on: February 20 2009 08:25 EST
- in response to Karl Banke
>> Why is caching inefficient? Are you talking about a specific cache implementation? Do you believe that doing
>> select * from TABLE where id = ?
>> is faster than a lookup in a Map?
>> Do you have any benchmark showing that a SELECT statement is faster than an in-memory map lookup?
>> Guido
> In my experience, performing the lookup in the database will not be significantly slower than using an in-memory cache. Also, the database can be tuned (indexed) while the application is running, so that lookups become faster even when they are not by primary key.
So you think that network latency is negligible in issuing SQL statements. So invoking get(key) on a HashMap is not that different from doing the same using some sort of highly efficient network protocol on a remote HashMap? SQLNet is faster than a local call?
> I find it strange that the people who have developed some caching framework should have done that much better than the thousands of people who have optimized database query and database internal caches.
> Finally, caching always creates potential inconsistency with the database, as the cache is seldom the only database client. This creates a new error class that needs proper handling.
Yes, totally agree.
> Caches have a point with very high write transaction rates and when using "legacy databases" with unfriendly isolation levels.
Not sure I follow you on this. Guido
-
Re: Demystifying Caching in Hibernate
- Posted by: Karl Banke
- Posted on: February 20 2009 09:57 EST
- in response to Guido Anzuoni
> So you think that network latency is negligible in issuing SQL statements.
> So invoking get(key) on a HashMap is not that different from doing the same using some sort of highly efficient network protocol on a remote HashMap?
> SQLNet is faster than a local call?
>> Caches have a point with very high write transaction rates and when using "legacy databases" with unfriendly isolation levels.
> Not sure I follow you on this.
> Guido
No, network latency is a point of course. But this again depends on topology. The database might well be on the same physical device, with the OS bypassing the network stack, etc. Also, networks can be a lot faster now than they were 5 years ago. The local call would be faster if it is only a lookup in a hash map. But as soon as in-process concurrency issues sneak into the picture, things might start to get worse. Some badly done synchronization can ruin the entire benefit of the cache.
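The synchronization remark can be illustrated with two read-through caches that differ only in their locking. Both classes are hypothetical sketches, not code from any caching product: the first serializes every call, hit or miss, on one monitor, while the second relies on `ConcurrentHashMap.computeIfAbsent`, which confines contention to part of the map and still applies the loader at most once per key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Coarse locking: every caller, hit or miss, queues on the same monitor. */
class GloballyLockedCache<K, V> {
    private final Map<K, V> map = new HashMap<>();
    private final Function<K, V> loader;

    GloballyLockedCache(Function<K, V> loader) {
        this.loader = loader;
    }

    synchronized V get(K key) {
        V value = map.get(key);
        if (value == null) {
            value = loader.apply(key); // a slow load blocks readers of all other keys too
            map.put(key, value);
        }
        return value;
    }
}

/** Finer-grained alternative: contention is limited to part of the map, not the whole cache. */
class ConcurrentCache<K, V> {
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    ConcurrentCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // computeIfAbsent is atomic per key: the loader is applied at most once for each key
        return map.computeIfAbsent(key, loader);
    }
}
```

Both variants return the same values; the difference only shows under load, when many threads hit the globally locked version at once and a single slow load stalls every other lookup. Whether that erases the advantage over a database round trip depends on the workload, so it is worth measuring rather than assuming.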
-
Re: Demystifying Caching in Hibernate
- Posted by: Alessandro Santini
- Posted on: February 20 2009 08:44 EST
- in response to Karl Banke
I certainly concur with you for environments like z/OS where the DB/2 resides in the same LPAR as the application server. In that context, caching does not really help a lot. In other contexts network latency can introduce a performance gap. About the uniqueness of the database access point - you raise a good point which is mainly architectural - but sometimes you just cannot avoid that. I think that most ORM enthusiasts work in self-contained systems or can force external systems to access their data through the ORM.
-
Re: Demystifying Caching in Hibernate
- Posted by: Alessandro Santini
- Posted on: February 20 2009 08:47 EST
- in response to Alessandro Santini
> DB/2
Oops - this should read DB2. Apologies.
-
Caching versus a centralized database
- Posted by: Cameron Purdy
- Posted on: February 22 2009 14:42 EST
- in response to Karl Banke
Karl -
> In my experience, performing the lookup in the database will not be significantly slower than using an in-memory cache.
Databases are fast. They have been optimized for many years, have short code paths, and make great use of caching. They are also incredibly tunable. Nonetheless, from my experience (having architected, helped with, or at least examined a few hundred very large scale environments), even if the database has zero latency when you build the system, it will still often be far slower for two reasons:
1) The application has to transform the results from the database into the domain model of the application, while the cache can store the domain objects "as is", eliminating the cost of transformation. (Likewise, applications that want to access information in relational format are better off avoiding distributed object caching for those use cases.)
2) The database becomes a single point of bottleneck as load increases, because the clients of the database scale linearly while the database does not. This means that when the load scales, the queries that took no time at all running on your development system (with a single user -- you!) now take seconds or even minutes -- even when fully indexed, optimized, and tuned.
One reason given that often isn't true is that "the database is remote", which can be equally true for distributed caches (when used in a coherent fashion).
> I find it strange that the people who have developed some caching framework should have done that much better than the thousands of people who have optimized database query and database internal caches.
Your argument is perfectly valid for lightly loaded and/or single-threaded systems. Some applications have to scale, though.
> Finally, caching always creates potential inconsistency with the database, as the cache is seldom the only database client.
Very true. Among other things, that is why it is often called "caching" .. ;-)
> Caches have a point with very high write transaction rates and when using "legacy databases" with unfriendly isolation levels. For everything else, all they will do is eat CPU cycles and increase complexity.
Going back to a presentation that I gave a few years ago, I stated that the goal of an architect for large scale systems was to design the systems to bottleneck on CPU and/or memory, and to do so in the earliest possible tier (i.e. as far in front of the system of record as possible). In other words, you are correct, and that is why Cache-Based Architectures work so well, because CPU and memory scale incredibly cost-effectively. I know that you think that "eating CPU cycles" is derogatory, but having worked on large scale systems, I appreciate the value of low-cost, low-power commodity hardware solving problems that the largest, fastest databases simply cannot handle. (And remember, I work for Oracle, where we make the largest, fastest, most scalable and -- by far -- the most popular databases in the industry .. and like you mentioned, we have literally 10,000 engineers working to improve our database :-)
Peace,
Cameron Purdy
Oracle Coherence: Data Grid for Java, .NET and C++
-
Re: Caching versus a centralized database
- Posted by: Mark N
- Posted on: February 24 2009 22:20 EST
- in response to Cameron Purdy
Nice!
-
Re: Demystifying Caching in Hibernate
- Posted by: Tomi Tuomainen
- Posted on: February 20 2009 12:04 EST
- in response to Guido Anzuoni
> I think that using ORM techniques to map single-row objects as a "bag" of attributes is nonsense. So the negative experiences posted here are no surprise.
> Moreover, in my experience I have spent a significant amount of time mapping "strange" relational models rather than complex object models (rather the contrary, I would say).
I'm not sure you get the point. I'm talking about an ORM just like Hibernate, but simplified. I like object models, and I want my ORM to handle one. But not the way Hibernate does it for me. I have a remote client, so I have limited possibilities for lazy strategies. I know about eager fetching; sometimes it works, often not. We have to struggle with mappings between parent and children because Hibernate generates circular queries. Hibernate instantiates its own Collection classes and my Swing client suddenly needs hibernate.jar, cglib plus all other shit. Maybe Gavin King knows all the answers to my customer's problems immediately, but usually we just have to struggle for too long. I just want a JavaBean that may include a List of other JavaBeans. Then I just want to fill in attributes and say select/insert/update/delete. The ORM generates the sql and knows all about joins. And returns pure JavaBeans and Collections. Nothing else.
-
Re: Demystifying Caching in Hibernate
- Posted by: Jürgen Lind
- Posted on: February 20 2009 12:56 EST
- in response to Tomi Tuomainen
Again +1. Especially the lazy/eager loading approach, with limited possibilities to tune it on a per-use-case basis, is sometimes very annoying. A few years back, everybody talked about a clean layering where the UI does not know about the business logic or even the database. Today, we are using a fancy pattern called "Open Session in View" and then everything is fine again. I do not see that simply giving things a cool name justifies a break of layering. I want my Pojos to be real Pojos that can be handed across layer boundaries and not Pretend-to-be Pojos that are bytecode enhanced derivatives of my objects.
-
Re: Demystifying Caching in Hibernate
- Posted by: Guido Anzuoni
- Posted on: February 20 2009 18:10 EST
- in response to Jürgen Lind
> Again +1.
> Especially the lazy/eager loading approach, with limited possibilities to tune it on a per-use-case basis, is sometimes very annoying. A few years back, everybody talked about a clean layering where the UI does not know about the business logic or even the database. Today, we are using a fancy pattern called "Open Session in View" and then everything is fine again. I do not see that simply giving things a cool name justifies a break of layering. I want my Pojos to be real Pojos that can be handed across layer boundaries and not Pretend-to-be Pojos that are bytecode enhanced derivatives of my objects.
I would say that layering for the purpose of shielding the upper layer from the internals of the lower layer is a good thing. But if you mean layering as an upper layer that must not use anything of the lower layer, I cannot agree. I don't see anything wrong in a storage layer working with an externally opened transaction. Transaction boundaries are business decisions, so the responsibility to start/complete a transaction lies in the business layer. Does it mean that there is a layer break? Really, I don't care to be so purist. And why should "Open Session in View" (really terrible name) break layering while your real Pojos flowing across will not? And what does bytecode enhancement have to do with a layering break? Guido
-
Re: Demystifying Caching in Hibernate
- Posted by: Jürgen Lind
- Posted on: February 20 2009 18:31 EST
- in response to Guido Anzuoni
> I would say that layering for the purpose of shielding the upper layer from the internals of the lower layer is a good thing.
> But if you mean layering as an upper layer that must not use anything of the lower layer, I cannot agree.
> I don't see anything wrong in a storage layer working with an externally opened transaction.
> Transaction boundaries are business decisions, so the responsibility to start/complete a transaction lies in the business layer.
> Does it mean that there is a layer break?
> Really, I don't care to be so purist.
Obviously, the upper layer must use the abstractions provided by the lower layer. However, to tie the upper layer to the technical restrictions of the lower layer is in my view a break of abstraction, since it is the idea of layering to shield upper layers from those technical aspects. Of course, if you see transaction boundaries as part of the business case, one could argue that this does not break the technical encapsulation. I tend to see it differently.
> And why should "Open Session in View" (really terrible name) break layering while your real Pojos flowing across will not?
Well, the Pojos are part of the interface between the presentation, business and database logic and clearly marked as such, whereas the technical details of accessing a database are not.
> And what does bytecode enhancement have to do with a layering break?
> Guido
This kind of semi-transparent manipulation of the objects I think that I am dealing with when looking at the code leads to the problems we are discussing here. These objects are no longer the POJOs I have built; they are something different that needs to be treated differently. For example, accessing a property of such an enhanced POJO may cause a database access and thus need a transaction around it. Or, if I decide to use such a semi-POJO outside of my transaction context, I have to make sure that all relevant parts are loaded before leaving the defining context (another example of a horrible pattern name: have you heard of the "Preload" pattern?)
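The behaviour under discussion, a getter that may trigger a load and therefore needs its defining context to be open, can be modelled without any ORM. The sketch below is an assumption-laden simplification: `LoadingContext`, `LazyList` and `Order` are hypothetical names, and the failure is modelled with a plain `IllegalStateException` rather than any framework-specific exception.

```java
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for a persistence context that must be open to load data. */
class LoadingContext {
    private boolean open = true;

    void close() {
        open = false;
    }

    boolean isOpen() {
        return open;
    }
}

/** Hypothetical lazy association: looks like a plain getter, but the first call may load. */
class LazyList<T> {
    private final LoadingContext context;
    private final Supplier<List<T>> loader;
    private List<T> loaded;

    LazyList(LoadingContext context, Supplier<List<T>> loader) {
        this.context = context;
        this.loader = loader;
    }

    List<T> get() {
        if (loaded == null) {
            if (!context.isOpen()) {
                throw new IllegalStateException("context closed before the association was loaded");
            }
            loaded = loader.get(); // stands in for the database access
        }
        return loaded;
    }
}

/** Hypothetical domain object handed from the business layer to the UI. */
class Order {
    final LazyList<String> lines;

    Order(LazyList<String> lines) {
        this.lines = lines;
    }
}
```

If the business layer calls `order.lines.get()` before closing the context (the "preload" step), the UI can read the lines later without trouble; if it does not, the same innocent-looking call fails once the context is gone. That is the coupling between layers being argued about here.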
-
Re: Demystifying Caching in Hibernate
- Posted by: Guido Anzuoni
- Posted on: February 20 2009 18:50 EST
- in response to Jürgen Lind
> Well, the Pojos are part of the interface between the presentation, business and database logic and clearly marked as such, whereas the technical details of accessing a database are not.
You can always think (not pretend) of opening a session/managing a transaction as a setup of the storage layer rather than as knowledge of the internal technical details.
>> And what does bytecode enhancement have to do with a layering break?
>> Guido
> This kind of semi-transparent manipulation of the objects I think that I am dealing with when looking at the code leads to the problems we are discussing here.
No, bytecode manipulation (AFAIK, used only in JDO as an optional feature) is not the source of the problem (if any).
> These objects are no longer the POJOs I have built; they are something different that needs to be treated differently. For example, accessing a property of such an enhanced POJO may cause a database access and thus need a transaction around it.
Well, you must have a way to get the data from the source when you need it and you don't have it yet. How would you do it in this case?
> Or, if I decide to use such a semi-POJO outside of my transaction context, I have to make sure that all relevant parts are loaded before leaving the defining context (another example of a horrible pattern name: have you heard of the "Preload" pattern?)
And your JavaBeans? Are they preloaded if you want to use them outside the defining context? What I'm trying to say is that what might appear as a not always desirable complication that can be avoided is, in reality, something that you would otherwise do in your application. Guido
Re: Demystifying Caching in Hibernate
- Posted by: Jürgen Lind
- Posted on: February 21 2009 03:49 EST
- in response to Guido Anzuoni
And what has to do bytecode enhancement with layering break ?
Guido
This kind of semi-transparent manipulation of the objects I think that I am dealing with when looking at the code leads to the problems we are discussing here.
No, bytecode manipulation (AFAIK, used only in JDO as an optional feature) is not the source of the problem (if any).
Well, you must have a way to get the data from the source when you need it and you don't have it yet.
How would you do in this case ?
If I want some sort of lazy loading, then yes. However, if I could decide on a per-use-case basis what I want loaded and what not (a kind of "preload" pattern, but one that is not built around a "feature" of the framework) before handing the POJOs out of my business layer, I would not have these kinds of problems in the first place.
Or, if I decided to use such a semi-POJO outside of my transaction context, I have to make sure that all relevant parts are loaded before leaving the defining context (another example for a horrible pattern name, have you heard
of the "Preload" pattern?)
And your JavaBean ? Are they preloaded if you want to use outside the defining context ?
What I'm trying to say is that what might appear as a not always desirable complication that can be avoided, in the reality is something that you would otherwise do in your application.
Guido -
Re: Demystifying Caching in Hibernate
- Posted by: Guido Anzuoni
- Posted on: February 20 2009 17:53 EST
- in response to Tomi Tuomainen
Hmm, the discussion is slowly shifting away from the initial post, but OK, let's play the game ("you cannot control where a topic goes", as was, more or less, recently said - laugh). I am a little suspicious when I read so many "just"s, even if I agree that Hibernate has some peculiarities because of certain architectural choices. But I have to say that if you use a product, it is obvious that you have to bring all its dependencies with you (maybe with Hibernate there are quite a number of them).
You say you want a JavaBean that may include a List of other JavaBeans, and a means to set attributes in a JavaBean and say select etc. Why a List of other JavaBeans? To manage a one-to-many relationship? And what about a many-to-many? And why a List? Why not a Set? And when you ask for a select, what is the query language? SQL? If so, what is the relationship between the column names in the query and the JavaBean?
As you can see, it is easy to say "I don't need all this silly stuff, I need a simpler thing", but it is just as easy to realize that a simpler thing is only able to manage a simpler abstraction. Nothing else. Guido
I think that using ORM techniques to map single row objects as "bag" of attributes is a nonsense. So no surprise for negative experiences posted here.
More, in my experience I have spent a relevant amount of time in mapping "strange" relational models rather than complex object models (rather the contrary I would say).
I'm not sure you get the point. I'm talking about an ORM just like Hibernate, but simplified. I like object models, and I want my ORM to handle one, but not the way Hibernate does it for me. I have a remote client, so I have limited possibilities for lazy strategies. I know about eager fetching; sometimes it works, often not. We have to struggle with mappings between parents and children because Hibernate generates circular queries. Hibernate instantiates its own Collection classes, and my Swing client suddenly needs hibernate.jar, cglib plus all other shit.
Maybe Gavin King knows all the answers to my customers' problems immediately, but usually we just have to struggle for too long.
I just want JavaBean, that may include List of other JavaBeans. Then I just want fill attributes and say select/insert/update/delete. ORM generates the sql and knows all about joins. And returns pure JavaBeans and Collections. Nothing else. -
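For the simplest case discussed here (one JavaBean, one row, no annotations, no XML), the SQL generation can be sketched with nothing but the JDK's bean introspection. This is only an illustration under loudly stated assumptions: the table name is taken to be the lower-cased class name, every read/write property is taken to be a column of the same name, and the key column is taken to be called "id". Customer and ReflectiveSqlSketch are hypothetical names, not part of any library mentioned in this thread.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class ReflectiveSqlSketch {

    /** A pure JavaBean: no annotations, no XML, no framework types. */
    public static class Customer {
        private long id;
        private String name;
        public long getId() { return id; }
        public void setId(long id) { this.id = id; }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    /** Readable and writable bean properties, sorted so the generated SQL is deterministic. */
    static List<String> columns(Class<?> beanClass) {
        try {
            List<String> names = new ArrayList<>();
            // Object.class as stop class keeps the "class" pseudo-property out of the result.
            for (PropertyDescriptor pd
                    : Introspector.getBeanInfo(beanClass, Object.class).getPropertyDescriptors()) {
                if (pd.getReadMethod() != null && pd.getWriteMethod() != null) {
                    names.add(pd.getName());
                }
            }
            Collections.sort(names);
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalArgumentException("not a usable JavaBean: " + beanClass, e);
        }
    }

    /** Assumed convention: table name equals the lower-cased simple class name. */
    static String table(Class<?> beanClass) {
        return beanClass.getSimpleName().toLowerCase(Locale.ROOT);
    }

    static String insertSql(Class<?> beanClass) {
        List<String> cols = columns(beanClass);
        String marks = String.join(", ", Collections.nCopies(cols.size(), "?"));
        return "insert into " + table(beanClass)
                + " (" + String.join(", ", cols) + ") values (" + marks + ")";
    }

    /** Assumed convention: the key column is named "id". */
    static String selectByIdSql(Class<?> beanClass) {
        return "select " + String.join(", ", columns(beanClass))
                + " from " + table(beanClass) + " where id = ?";
    }

    public static void main(String[] args) {
        System.out.println(insertSql(Customer.class));
        System.out.println(selectByIdSql(Customer.class));
    }
}
```

The sketch stops where the questions raised above begin: it says nothing about one-to-many or many-to-many relationships, List versus Set, or columns whose names differ from the property names. Each of those needs either more conventions or some form of metadata.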
Re: Demystifying Caching in Hibernate
- Posted by: Jürgen Lind
- Posted on: February 20 2009 18:16 EST
- in response to Guido Anzuoni
Guido,
I totally agree with you that simpler frameworks only manage simpler abstractions. The point is that in many cases the simpler abstractions are more than sufficient. I have only rarely seen a case where I needed all the features of a large ORM framework. Still, to use such a framework effectively, I have to learn about all those features.
And since not all frameworks follow the rule of least surprise, I sometimes have to dig really deep to find out what I need to know even for simple problems.
On the other hand, I can understand the guys who take on the burden of writing and maintaining such a framework and want to support as many use cases and platforms as possible.
So there is no simple solution to this dilemma of abstractional power and user needs. Maybe a first step would be to move away from hiding the internals of a framework behind XML configuration files or, as is currently en vogue, annotations, and to provide the framework users with an accessible API for using the framework in a more customizable manner.
And before everybody starts beating me up: the last paragraph was not aimed at a particular framework (not even an ORM framework). I think this problem applies to any framework for a sufficiently broad problem base. -
Re: Demystifying Caching in Hibernate
- Posted by: Guido Anzuoni
- Posted on: February 20 2009 18:37 EST
- in response to Jürgen Lind
Guido,
I totally agree with you that simpler frameworks only manage simpler abstractions. The point is, that in many cases, the simpler abstractions are more than sufficient. Only have I rarely seen a case where I need all features of a large ORM framework. Still, to use such a framework effectively, I have to learn about all those features.
Not so sure. It depends on the framework. JDO, for example, has no such drawback.
And since not all frameworks follow the rule of least surprise, I sometimes have to dig really deep to find out what I need to know even for simple problems.
On the other hand, I can understand the guys who take the burden to write and maintain such a framework that they want to support as many use cases and platforms as possible.
So there is no simple solution to this dilemma of abstractional power and user needs. Maybe a first step would be to move away from hiding the internals of a framework behind XML configuration files or, as it is currently en vogue, annotations, and to provide the framework users with an accessible API for using the framework in a more customizable manner.
It's a nice statement, but I cannot completely figure out what you are talking about. Yet in JDO, now (spec 2.3), you have the possibility to specify metadata using an API, without any XML or annotations. But I don't know if this solves your problem. Guido -
Re: Demystifying Caching in Hibernate
- Posted by: Jürgen Lind
- Posted on: February 21 2009 04:07 EST
- in response to Guido Anzuoni
On the other hand, I can understand the guys who take the burden to write and maintain such a framework that they want to support as many use cases and platforms as possible.
In fact, there is always someone who need more and is ready to blame your framework that is unable to do such and such.
So there is no simple solution to this dilemma of abstractional power and user needs. Maybe a first step would be to move away from hiding the internals of a framework behind XML configuration files or, as it is currently en vogue, annotations, and to provide the framework users with an accessible API for using the framework in a more customizable manner.
It's a nice statement but I cannot completely figure out what you are talking about.
Yet in JDO, now (spec 2.3) you have the possibility to specify metadata using an API, without any XML or annotation.
But I don't know if this solves you problem
Guido
I do not want only to specify the metadata, but also to manage the object graph itself. For example, if the framework provides me with some methods to manage collections of dependent objects (say, "loadList" and "saveList"), it is easy for me to build my own DAOs on that API without having to rely on the inner workings of a framework. I think my main point is control. *I* want to decide how to do things in my particular context. And I do not mind if some dirty details are exposed to me as a programmer; I do not require absolute transparency in e.g. ORM. J. -
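One possible reading of the "loadList" and "saveList" idea, sketched in plain Java. Only the two method names come from the post. The ListStore interface, its signatures, the in-memory implementation and OrderDao are assumptions made up for this illustration and do not correspond to any existing framework API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ListApiSketch {

    /** Hypothetical low-level API that a framework could expose to its users. */
    interface ListStore {
        <T> List<T> loadList(Class<T> type, Object parentId);
        <T> void saveList(Class<T> type, Object parentId, List<T> children);
    }

    /** In-memory stand-in for the framework; a real one would issue SQL here. */
    static class InMemoryListStore implements ListStore {
        private final Map<String, List<Object>> data = new HashMap<>();

        private static String key(Class<?> type, Object parentId) {
            return type.getName() + "#" + parentId;
        }

        @Override
        public <T> List<T> loadList(Class<T> type, Object parentId) {
            List<T> result = new ArrayList<>();
            for (Object child
                    : data.getOrDefault(key(type, parentId), Collections.<Object>emptyList())) {
                result.add(type.cast(child));
            }
            return result; // a plain ArrayList, no framework collection class
        }

        @Override
        public <T> void saveList(Class<T> type, Object parentId, List<T> children) {
            data.put(key(type, parentId), new ArrayList<Object>(children));
        }
    }

    /** Plain dependent object. */
    static class OrderLine {
        final String product;
        final int quantity;
        OrderLine(String product, int quantity) {
            this.product = product;
            this.quantity = quantity;
        }
    }

    /** A hand-written DAO: the programmer decides when children are loaded and saved. */
    static class OrderDao {
        private final ListStore store;
        OrderDao(ListStore store) { this.store = store; }

        List<OrderLine> linesOf(long orderId) {
            return store.loadList(OrderLine.class, orderId);
        }

        void replaceLines(long orderId, List<OrderLine> lines) {
            store.saveList(OrderLine.class, orderId, lines);
        }
    }

    public static void main(String[] args) {
        OrderDao dao = new OrderDao(new InMemoryListStore());
        dao.replaceLines(42L, Arrays.asList(new OrderLine("book", 2), new OrderLine("pen", 5)));
        System.out.println("lines for order 42: " + dao.linesOf(42L).size());
    }
}
```

The trade-off is the one named in the post: the caller gets explicit control and plain collections, and in exchange has to call the load and save methods deliberately instead of relying on transparent navigation.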
Simpler ORM ...
- Posted by: rob bygrave
- Posted on: February 21 2009 20:39 EST
- in response to Jürgen Lind
Well I have been working on IMO a simpler ORM called Ebean at http://www.avaje.org ... so feel free to have a look at that and push for features that you would like to see etc. It uses JPA annotations for mapping... but IMO has a simpler API (no sessions) and simpler query language ... plus partial object support and better raw SQL support (ala Ibatis). Not everyone's cup of tea ... but some might like it. Cheers, Rob. -
Re: Demystifying Caching in Hibernate
- Posted by: James Watson
- Posted on: February 20 2009 11:11 EST
- in response to Tomi Tuomainen
What I would really need is lightweight, simple reflection based ORM (one jar), that generates error-prone CRUD sql for simple one row POJOs. But no annotations, no xml. I don't know if there is such library already. If not, I'm writing it.
In 2009, this is crazy.
I started a project a while back that might address what you are looking for. It's not an ORM per se, but one could easily build one around it. Its scope is a little broader: basically, it allows you to execute queries and updates and get data without having to deal with managing resources and all the boilerplate of standard JDBC. It could also decouple your code from the persistence such that the source of the data could potentially be anything. It's based on some ideas I've been working on for years. I got started but have been distracted by some other projects, work, etc. But if you are interested in this and have any comments, or can offer any improvements or other work, I'd find some time to get back into it. http://sourceforge.net/projects/serene/ -
Re: Demystifying Caching in Hibernate
- Posted by: Tomi Tuomainen
- Posted on: February 20 2009 14:17 EST
- in response to James Watson
I started a project a while back that might address what you are looking for. It's not an ORM per se but one could easily build one around it. Its scope is a little broader: basically allow you to execute queries and updates and get data without having to deal with managing resources and all the boilerplate of standard JDBC. It also could decouple your code from the persistence such that the source of the data could potentially be anything.
Thanks James. I am not sure if this is what I'm looking for since I am a big fan of Spring concerning JDBC resource management. But I am checking the code. Do you have any documentation or examples of how to use Serene?
It's based on some ideas I've been working on for years. I got started but have been distracted by some other projects, work, etc. But if you were interested in this and have any comments or can offer any improvements or other work I'd find some time to get back into it.
http://sourceforge.net/projects/serene/ -
Re: Demystifying Caching in Hibernate
- Posted by: James Watson
- Posted on: February 20 2009 15:10 EST
- in response to Tomi Tuomainen
Thanks James. I am not sure if this is what I'm looking for since I am a big fan of Spring concerning JDBC resource management.
The overall approach *should* allow the resources to be managed by Spring without any change to the core apis. The way you work with the queries and updates is completely independent from how the underlying connection (or whatever) is managed. If you are a fan of IOC, the way I normally use this is as if the queries (for example) are injecting data into my code. That is, I don't define my code based on the database, I design the class and then craft my queries to provide the data. There are a number of parallels to some features of SpringJDBC and iBatis but the basic approach comes from a different direction, IMO. I have some test code/examples at home that I failed to get into SVN. I'll get something out there that shows the basic approach and post here. If you are really interested in contributing (ideas are welcome) you can contact me through the forums on sourceforge. Thanks for your interest.
But I am checking the code. Do you have any documentation or examples of how to use Serene? -
Re: Demystifying Caching in Hibernate
- Posted by: James Watson
- Posted on: February 21 2009 21:15 EST
- in response to Tomi Tuomainen
Thanks James. I am not sure if this is what I'm looking for since I am a big fan of Spring concerning JDBC resource management.
Here's a simple example of using Serene. If this seems interesting let me know.

    import java.math.BigDecimal;
    import java.sql.SQLException;
    /* for brevity */
    import net.sf.serene.*;

    public class BasicTest {
        public static void main(String[] args) throws SQLException {
            Source source = new DerbyEmbeddedSource("test");
            Manager manager = new PooledManager(source);
            Query query = manager.createQuery("select * from firsttable");
            query.setWrapper("id", FieldWrapper.BIGDECIMAL_CONVERTER);
            final Column id = query.getColumn("id", BigDecimal.class);
            final Column name = query.getColumn("name", String.class);
            Handler h = new Handler() {
                BigDecimal total = new BigDecimal(100);

                public boolean process(Row row) {
                    String n = row.getValue(name);
                    BigDecimal i = row.getValue(id);
                    total = total.multiply(i.divide(new BigDecimal(100)));
                    System.out.println(total + ": " + n);
                    return true;
                }
            };
            query.execute(h);
        }
    }

But I am checking the code. Do you have any documentation or examples of how to use Serene? -
Re:What I would really need is lightweight...
- Posted by: Joe Clarke
- Posted on: February 20 2009 14:13 EST
- in response to Tomi Tuomainen
Mr. Tuomainen, you've described ThinkInSql, available under LGPL at http://www.independentreach.com/thinkinsql.htm
- it doesn't try to traverse relations for you
- xml is not used for config - only to externalize ad-hoc sql -
But no annotations, no xml
- Posted by: wal rus
- Posted on: February 21 2009 20:13 EST
- in response to Tomi Tuomainen
But no annotations, no xml. I don't know if there is such library already. If not, I'm writing it.
In 2009, this is crazy.
You may want to take a look at http://www.jdbcpersistence.org then; it's iBatis-like, but has interesting stuff of its own in store. -
Re: Demystifying Caching in Hibernate
- Posted by: Alessandro Santini
- Posted on: February 20 2009 04:07 EST
- in response to David McCoy
It is interesting that in 2009 the ORM debate continues...
I sincerely did not want to start another debate about ORM :) It was only to explain why I was giving iBatis as an example. -
Re: Demystifying Caching in Hibernate
- Posted by: Luca Masini
- Posted on: February 23 2009 08:01 EST
- in response to Alois Reitbauer
"From my point of view, whoever wrote these two articles doesn't know very well what an ORM is and how it is implemented by Hibernate, because:
1) The Session Cache is not really a cache; it is one of the invariants of an ORM, used to maintain object identity between calls to the API during a Session and to compute the difference against the initial snapshot during flush. Calling it a cache is really not correct. Also, the query example is not good: you should notice that the identity is the same, and not the fact that the database is hit twice!
2) The second-level cache is always active because, starting from Hibernate 3, EhCache is the default cache provider and is active by default. If you look at the cache code you can see that the cache is the same (the second-level cache) and that it works only because, if not specified otherwise, it is convention-based, and all queries are cached in the same cache region.
I think that TSS should review articles better before publishing." -
Re: Re: Demystifying Caching in Hibernate
- Posted by: Alois Reitbauer
- Posted on: February 24 2009 14:53 EST
- in response to Luca Masini
Hello Luca, even the Hibernate documentation refers to the Session as a cache (see http://www.hibernate.org/hib_docs/reference/en/html_single/#performance-cache). By default EhCache is used here - I agree. The second database statement in the example shows that query results are not cached by default, forcing the application to hit the database again. Explicitly using entity keys helps here - I agree, and that was what I wanted to show. -
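Both observations in this exchange (identity is preserved within a session, yet a repeated query still goes to the database) can be reproduced with a small identity map. This is a deliberately simplified sketch in plain Java and not Hibernate's implementation; Session, FakeDatabase and Person are hypothetical stand-ins, and the hit counter exists only to make the database round trips visible.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SessionCacheSketch {

    static class Person {
        final int id;
        final String name;
        Person(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Hypothetical stand-in for the database; counts round trips. */
    static class FakeDatabase {
        int hits = 0;
        private final Map<Integer, String> rows = new TreeMap<>();

        FakeDatabase() {
            rows.put(1, "Ann");
            rows.put(2, "Bob");
        }

        String selectNameById(int id) {
            hits++;
            return rows.get(id);
        }

        Map<Integer, String> selectAll() {
            hits++;
            return new TreeMap<>(rows);
        }
    }

    /** Keeps one object per key for its lifetime (an identity map). */
    static class Session {
        private final FakeDatabase db;
        private final Map<Integer, Person> identityMap = new HashMap<>();

        Session(FakeDatabase db) { this.db = db; }

        /** Lookup by key: answered from the identity map when possible. */
        Person get(int id) {
            Person known = identityMap.get(id);
            if (known != null) {
                return known;
            }
            String name = db.selectNameById(id);
            return name == null ? null : resolve(id, name);
        }

        /** Query: always executed, but rows are resolved to already known instances. */
        List<Person> listAll() {
            List<Person> result = new ArrayList<>();
            for (Map.Entry<Integer, String> row : db.selectAll().entrySet()) {
                result.add(resolve(row.getKey(), row.getValue()));
            }
            return result;
        }

        private Person resolve(int id, String name) {
            Person known = identityMap.get(id);
            if (known == null) {
                known = new Person(id, name);
                identityMap.put(id, known);
            }
            return known;
        }
    }

    public static void main(String[] args) {
        FakeDatabase db = new FakeDatabase();
        Session session = new Session(db);

        Person a = session.get(1);
        Person b = session.get(1);
        System.out.println("same instance: " + (a == b) + ", hits: " + db.hits);

        List<Person> first = session.listAll();
        List<Person> second = session.listAll();
        System.out.println("same instance: " + (first.get(0) == second.get(0)) + ", hits: " + db.hits);
    }
}
```

Two lookups by key cost one round trip, while two runs of the same query cost two, even though both return the very same objects. In Hibernate itself, caching query results is a separate opt-in: the hibernate.cache.use_query_cache property has to be enabled and the individual query marked with setCacheable(true).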
Re: Demystifying Caching in Hibernate
- Posted by: William Louth
- Posted on: February 24 2009 07:50 EST
- in response to Alois Reitbauer
There is a much more efficient and less laborious way of performing a similar analysis, without the need to journal every call and then click and step through each call sequence: Efficient Runtime Analysis of Hibernate http://williamlouth.wordpress.com/2009/02/24/efficient-runtime-analysis-of-hibernate/ William -
ORM - what for?
- Posted by: Stefan Schubert
- Posted on: February 24 2009 16:14 EST
- in response to Alois Reitbauer
I just want to give you an example of where you would need an ORM and would implement second-level caches. If
a) The domain layer is rather thin (because most of what you do is CRUD, searches and alerts, as with many web sites) and you want to avoid doubling its size by writing SQL
b) You have many dozens of entities and hundreds of DB tables, and generating or copying boilerplate code from here to there just introduces much more code (hundreds of additional classes) as a source of errors, where you could instead have just one technology to understand and get right
c) The app server environment is clustered, the Oracle database answers millions of queries a day, and management considers consistency much less important than availability
d) You have some dozens of developers producing code and aim for smart rather than big solutions, to reduce the time spent on code reviews, sprint planning and implementation - and to avoid sources of bugs
e) You want to reduce total cost of ownership, because each release brings so many bug fixes, todos, features or projects extending your platform that you just don't want to do everything yourself, neither the tools nor the technology. You are happy if you only have to struggle with understanding some concept or a thousand lines of configuration instead of adding another 100'000 lines of code by triggering some code generation tool
f) You want to reduce your code size in order to reduce build time, because it takes an hour or longer to build your whole software, depending on which project needs a rebuild
I know this rarely happens in many many projects. And many of your arguments just show why Java is so often the second choice for serious web sites. Just look at all the modern Java web frameworks that are stateful!! Hardly anyone tries to implement smart pluggable frameworks. Everything is written by hand. But you can do it with Java.
And the way you do it is to try to keep your code base small and manageable and to use technologies to scale. The PHP guys use memcache. We are using an ORM and second-level caches. And it just works fine. We are not programming cruise missiles, you know :-) We just have to deliver millions of dynamic web pages, each in a few hundred milliseconds. Kind regards Stefan