Discussions

News: Transparent persistency - Pains, Illusions and a fresh approach

  1. Early architecture and design decisions influence the whole application life cycle. Data access design in particular is crucial for virtually every business-critical application - and a prime source of scalability, load and performance problems, high refactoring and integration efforts, project delays and, finally, lost business and customers. Funny how many people still base their projects' fate on solutions claiming to magically shield enterprise developers from the complexity of enterprise data access. We have seen too many projects run out of time and budget due to the unquestioned use of data access frameworks, especially O/R mappers.

    We have faced the following problems with O/R mappers:

    * The architects and developers have to know exactly when and how data is processed. It's dangerous to simply use OO data structures and let something handle the data transfer for them
    * O/R-mappers come with extensive configuration models and one has to become an expert to use them effectively
    * O/R-mappers are not located in the data access layer. They introduce aspects spanning many layers
    * The API abstraction idea conceals these problems: a specification may define interfaces, but behind the interface/API are the dynamics and semantics. Different mappers behave completely differently, e.g. in case of a commit or rollback. The common façade doesn't help you here
    * For all those reasons, O/R mappers tend to be intrusive and lock you in. Have you seen many projects replacing the O/R mapper in the middle of a project?
    * One of the main O/R mappers' advantages - object graph state, change tracking and transaction handling - does not work across tiers, i.e. in SOA systems. You have to keep track of the state or the transaction bracket yourself.
    * You can't use O/R-mappers for processing non-persistent objects in a transactional manner
    * Most O/R-mappers are targeted at request-response applications, e.g. web applications, but not at rich or smart clients

    So why don't we choose the most adequate data access for our respective project? Because of the lock-in and the lack of alternatives. If we need change tracking and transactions, what else should we use? A SQL-abstraction layer like Spring-JDBC or a pure row-object mapping like iBATIS? Or our home-grown data access objects? If we did so, we would still have to provide the change tracking of our domain objects. Why don't we separate the concerns of change tracking, transaction handling and data access? Then we could choose the most suitable data access and transaction handling framework. And we could also replace it in case we need a more scalable architecture, for instance. http://www.vramework.org/ tackles those requirements and provides the separation of concerns: it's a change tracker for object graphs, provides in-memory transactions and pluggable data access strategies.
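
    The separation of concerns argued for here can be sketched in a few lines of Java. This is a hypothetical illustration of the idea only - none of the names below are vramework's actual API: the tracker knows nothing about storage, and the storage strategy knows nothing about tracking.

```java
import java.util.*;

/** Tracks which registered objects changed; knows nothing about storage. */
interface ChangeTracker<T> {
    void register(T obj);
    List<T> changed();               // the "change set"
}

/** Pluggable storage strategy; knows nothing about change tracking. */
interface DataAccess<T> {
    void store(List<T> changeSet);   // could be JDBC, a web service, ...
}

/** A toy tracker that detects changes by comparing toString() snapshots. */
class SnapshotTracker<T> implements ChangeTracker<T> {
    private final Map<T, String> snapshots = new IdentityHashMap<>();

    public void register(T obj) {
        snapshots.put(obj, obj.toString());    // remember the "before image"
    }

    public List<T> changed() {
        List<T> result = new ArrayList<>();
        for (Map.Entry<T, String> e : snapshots.entrySet()) {
            if (!e.getKey().toString().equals(e.getValue())) {
                result.add(e.getKey());        // state differs from snapshot
            }
        }
        return result;
    }
}
```

    Swapping the persistence mechanism then means swapping the DataAccess implementation; the tracker (and any transaction handling built on it) stays untouched.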

    Threaded Messages (12)

  2. Granted that the only aim seems to be the announcement of vramework, there is some FUD spread around, along with a moderate dose of obviousness.

    * The architects and developers have to know exactly when and how data is processed. It's dangerous to simply use OO data structures and let something handle the data transfer for them
    * O/R-mappers come with extensive configuration models and one has to become an expert to use them effectively

    I don't think that is possible to go far beyond a demo/prototype without professionally knowing the technologies we want to use.

    * O/R-mappers are not located in the data access layer. They introduce aspects spanning many layers
    Right, O/R mappers find their place mainly in IPC and UI.
    Maybe the data access layer is normally just badly layered.

    * The API abstraction idea conceals these problems: a specification may define interfaces, but behind the interface/API are the dynamics and semantics. Different mappers behave completely differently, e.g. in case of a commit or rollback. The common façade doesn't help you here
    Do you mean different implementations of the SAME specification?
    AFAIK persistence arena is dominated by 5 technologies:
    1. JDBC-based data access
    2. JDO
    3. JPA
    4. Hibernate
    5. Pure OODBMS

    Now the only specification that can be named so is JDO, and claiming that different implementations behave differently is completely false.
    Products that fall within categories 1 and 5 are not specification-based and hibernate too is a product.
    JPA is a mystery; version 1.0 (and to some extent, 2.0) of the spec has many holes that allow different implementations to behave differently in crucial aspects.
    But can a spec that leaves such freedom to the providers still be called a spec?
    Maybe you have simply selected the wrong tool.

    * One of the main O/R mappers' advantages - object graph state, change tracking and transaction handling - does not work across tiers, i.e. in SOA systems. You have to keep track of the state or the transaction bracket yourself.
    This approach looks like the one used to design corba systems with IDL interfaces defining tons of setter and getters.
    Wise designers obviously ignored that each method was executed remotely, but they blamed the technology rather than their own built-in neural network.
    SOAP like SQL*Net. Fantastic.

    * You can't use O/R-mappers for processing non-persistent objects in a transactional manner
    Well, ORM stands for object-relational mapping.
    Now, to achieve the result you can use an in-memory rdbms or write a memory datastore for JDO that is storage agnostic.
    If you insist on using the top-scoring Google search result, you are likely to get into trouble.

    * Most O/R-mappers are targeted at request-response applications, e.g. web applications, but not at rich or smart clients
    Maybe they are not O/R mappers.
    You should take more care in your tool selection.

    Guido
  3. Hi Guido,

    thanks for your comment which shows you have a sound knowledge in data access technologies.
    First, you're right, the term "specification" is ambiguous. We should re-phrase it as "leaky specification". I keep hearing "why bother with lock-in, we have JPA".

    I agree with many of your statements: A wise architect/designer/developer would know when to use which tool, and there are various tools or technologies covering parts of the aspects we mean. That's exactly our point: Many technologies and/or frameworks can help in some areas. But why so many different approaches overlapping in many areas? We want to factor out the overlaps and separate the concerns of change tracking, identity mapping, transactions and data access. Let's take Alex's post: He has written his own O/R mapper. And we understand that, because we have seen it in many projects: They implement their own data access. The easy part is the pure object-row mapping and wrapping JDBC in more convenient wrappers (or using iBATIS). But wherever you want to store your objects (SQL-oriented or web-service-oriented, e.g.), something has to do the change tracking. At some point, we need the "change set" to determine how to store the data. So let's take that Unit of Work part and use it as an independent piece without tying it to other concerns.
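
    The "change set" idea can be made concrete independently of any storage technology. Below is a toy unit of work of my own, using maps as stand-ins for domain objects - illustrative only, not vramework's (or any other framework's) API.

```java
import java.util.*;

/** A standalone unit of work: records before-images, computes a change set. */
class UnitOfWork {
    private final Map<String, Map<String, Object>> before = new HashMap<>();
    private final Map<String, Map<String, Object>> live = new HashMap<>();

    /** Register an object (here: a map of field -> value) under an id. */
    void register(String id, Map<String, Object> obj) {
        before.put(id, new HashMap<>(obj));   // defensive copy = before-image
        live.put(id, obj);
    }

    /** id -> (field -> [old, new]) for every field that changed. */
    Map<String, Map<String, Object[]>> changeSet() {
        Map<String, Map<String, Object[]>> cs = new HashMap<>();
        for (String id : live.keySet()) {
            Map<String, Object> old = before.get(id);
            Map<String, Object> cur = live.get(id);
            Map<String, Object[]> diff = new HashMap<>();
            for (String field : cur.keySet()) {   // removed fields ignored in this sketch
                if (!Objects.equals(old.get(field), cur.get(field))) {
                    diff.put(field, new Object[]{old.get(field), cur.get(field)});
                }
            }
            if (!diff.isEmpty()) {
                cs.put(id, diff);
            }
        }
        return cs;
    }
}
```

    Whether that change set ends up as an SQL UPDATE, a web-service call or a serialized message to the business tier is then a separate, pluggable decision.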

    Let's look at some details:

    I fully agree with the SQL*Net-like SOA. That's what many projects are still developing. For the sake of simplicity, let's call it "CRUD" or even setter/getter-oriented SOA, degenerating to "tons of getters and setters or CRUDs" and ignoring the dislocation. That's exactly what we want to avoid. The "Concepts" page on our web site explains it in detail with a real-life sample. And again: Many SOA projects are re-implementing their own version of a "change set" which is serialized and transferred to the business tier. (We could also mimic a pure request/response and just send around "screens" like in an HTML-based app.)

    Non-persistent objects: That's also a real use case in high-volume business apps. Calculate "something" in a transactional way with in-memory objects only. Yes, I could use an in-memory RDBMS or mock a DB, but do we really want to do that just for transactionally processing POJOs? And what happens in a real high-volume multi-user app? We multiply the number of objects (POJOs, O/R mapper proxies, in-memory rows and many other DB objects). What happens to the CPU, the garbage collector, the memory usage? I've seen it: GC thrashing due to in-mem DBs.
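
    For the non-persistent case, transactional semantics do not require an in-memory RDBMS at all. A minimal sketch of my own (not any framework's API) for immutable values:

```java
/** A minimal transactional reference: commit/rollback for in-memory values. */
class TxRef<T> {
    private T committed;   // last committed value
    private T working;     // value visible inside the current "transaction"

    TxRef(T initial) {
        committed = initial;
        working = initial;
    }

    T get()         { return working; }
    void set(T v)   { working = v; }
    void commit()   { committed = working; }   // working copy becomes durable
    void rollback() { working = committed; }   // discard uncommitted changes
}
```

    No proxies, no in-memory rows, no extra garbage to collect: one extra reference per value.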

    O/R mappers and non-request-response applications, e.g. rich clients: At first it looks straightforward to have an O/R mapper handle a transactional GUI. But if we look at projects (or the mailing lists and forums of O/R mappers) we see what it means in terms of multi-threading (multiple threads involved for a single user and potentially a single UI transaction), worker processes, "invokeLater()" patterns and lazy loading.

    Thanks for posting!
    Kind regards
    Thomas
  4. Personally, I solved the problem by writing my own ORM, narrowly focused on our own tasks. And that was only after we were fed up with Hibernate issues.

    <rant_mode>
    Hibernate is a #*$*&#@^$*&@#^$*&^$#@. It's virtually unmaintained - look at its Jira! Bugs stay open for years ( even trivial ones like http://opensource.atlassian.com/projects/hibernate/browse/HHH-465 ).

    The internals of Hibernate are a total mess, it simply tries to do a lot and fails. Some dark corners of code (like lazy no-proxy mode) never really worked and have blatant bugs.

    And on top of it, it's SLOW. And I'm not talking about 'use fine-tuning for performance-critical SQL, blah blah blah'. Hibernate is slow in mapping results of simple "select blah from Some blah where blah.date<:dt" statements, which is no wonder if you trace all the stuff Hibernate has to do to instantiate one object.

    My ORM is about 10 times faster than Hibernate in result set mapping. And I don't think it's the limit.
    </rant_mode>

    So, IMHO, custom ORM for a large project is a way to go. It'd be nice to have a common set of building tools, though. For example, QueryDSL allowed me to write type-safe query support for my framework in just a couple of days. QueryDSL simply rocks - try it!
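
    For readers who haven't seen it, the type-safe query style that Querydsl popularized can be illustrated with a toy builder over Java collections. This is not Querydsl's actual API - just a sketch of why typed accessors catch errors at compile time instead of at runtime.

```java
import java.util.*;
import java.util.function.*;
import java.util.stream.*;

/** A toy type-safe query over an in-memory collection. */
class Query<T> {
    private final List<T> source;
    private Predicate<T> where = x -> true;

    Query(List<T> source) { this.source = source; }

    /** Filter on a typed accessor; a typo or type mismatch fails to compile. */
    <V> Query<T> where(Function<T, V> field, Predicate<V> condition) {
        Predicate<T> previous = where;
        where = t -> previous.test(t) && condition.test(field.apply(t));
        return this;
    }

    List<T> list() {
        return source.stream().filter(where).collect(Collectors.toList());
    }
}
```

    For example, new Query<>(words).where(String::length, len -> len == 2).list() - misspelling the accessor or comparing a length to a String would be a compile error rather than a runtime SQL error.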
  5. the love of my it life is ibatis.   i've tried hibernate - and it worked.  but i (and n number of other developers) had to know hibernate.   i've tried jpa (openjpa) and it worked - but i (and n number of other developers) had to know openjpa.

    i hate relational databases.   like i really hate them.  by that i mean i would love it if i woke up one day and all relational databases had disappeared.   it would be a day of joy.   i actively plot their destruction.   but as long as they're around, sql is required.

    ibatis is clean and simple and doesn't dance around the problem of relational databases.
  6. Capitals?

    Just wondering, you do know that sentences ought to start with a capital letter right?
  7. Capitals?

    Just wondering, you do know that sentences ought to start with a capital letter right?
    Just wondering, replies should stay in the context of the thread, right?
  8. Capitals?

    Maybe his caps key is broken.
    Maybe he is using a "chiclet" keyboard.
    Maybe he idolizes e. e. cummings. 
  9. Nice

    That was awesome.
  10. I'm currently working on a prototype storage engine behind Software Transactional Memory. Since the STM already knows everything about the internal structure and already deals with concurrency control, persisting it is not that hard.

    http://pveentjer.wordpress.com/2010/03/21/simplifying-enterprise-applications-with-durable-stm/

    At the end of the post there is a small Spring bean example that makes sure that the bank (and all its dependencies) is stored on disk when an update transaction commits.

    I hope to release a good working storage solution in Multiverse 0.6 (planned after the summer). The goal is to make everything related to ACID much more transparent for the developer, so he can focus on writing the business logic instead of getting swamped in unnecessary complexity. After the persistence issues have been solved, a distributed STM is the next logical step. I already have a vector-clock-based prototype up and running.

    Peter Veentjer
    Multiverse: Software Transactional Memory for Java
    http://multiverse.codehaus.org
  11. Funny to see...

    I was a Smalltalk developer in the mid-90s (thank God I learned my OO lessons then) and changed to the Java world at the beginning of the new millennium.

    It's funny to see how the same (very old but important) questions come to the Java community so many years later. In our Smalltalk applications we naturally used an O/R-mapper framework, and yes, we also had image transactions (a commit/rollback stack on non-persistent objects).

    Strange to me that it took about 15 years for some of these problems to arise in the Java world. Maybe this is because there is too much use of frameworks that are too biased toward certain problems - especially web applications with a page-per-page style and simple transaction needs.

    So maybe frameworks like Vaadin and the thoughts in this topic will lead to a profound and sound application architecture on the JEE server in the near future.
  12. Trying to reinvent the wheel?

    Reading Thomas' original statement and the documents at vramework.org, I'm getting the impression that the concept you propose is not really a "fresh approach".

    The main idea is to have an in-memory transaction manager that allows one to handle units of work based on POJOs.
    On commit of a transaction, all changes made to registered objects are persisted via a pluggable persistence strategy.
    (On rollback, all registered objects are rolled back to the initial state they had when registered with the UOW.)

    ODMG- and JDO-based O/R implementations have worked according to this design for many years!

    Comparing your concepts, for instance, to those of the JDO reference implementation or the Apache OJB ODMG implementation, I don't see anything new.

    regards,
    Thomas
  13. Querydsl details

    Querydsl provides a syntactic alternative for querying using JPA, JDO, JDBC, Lucene and Java collections.

    It is type-safe, concise and auto-complete friendly.

    For more information see the project page: http://source.mysema.com/display/querydsl/Querydsl

    Having a unified query layer makes testing alternatives and developing new ones much easier.

    In our own projects we use Hibernate and RDFBean, our own Object/RDF mapping layer: http://source.mysema.com/display/rdfbean/RDFBean

    RDFBean naturally uses Querydsl for querying ;)