News: Using the JDO Query Facility and JDOQL to Find Your Data

  1. The JDO query facility consists of an API that can be used to programmatically locate information and a query language called the JDO Query Language (JDOQL) that defines the grammar for structuring queries. The 'Finding Your Data' chapter from the Core JDO Book-in-Review looks at the query facility and how to retrieve and query the underlying datastore using the API and JDOQL.

    Download and Review 'Finding Your Data'

  2. Why not?

    I am using a meta-description mechanism, so I do not use JDO, but it is a good step for this technology.
  3. I'm (probably not the only one) a little bit frustrated with CMP entity beans as a persistence mechanism. Don't get me wrong, I think it's great technology that provides a lot of functionality and separation. The frustration is only the typing work and the maintenance of the deployment descriptors.

    That is why I was delighted when I learned about JDO: an excellent standard for all O/R mappers that provided an alternative to CMP entity beans. In the beginning there was a lot of hype about it. Now I see only smaller companies supporting and pushing this technology. The big players don't seem to be keen on a JDO breakthrough. At least that is my humble opinion.

    Is there anybody who can explain this lack of enthusiasm from the big players for JDO?

    Tom Baeyens.
  4. There are many O/R mapping tools that have been around for years and are supported by big vendors. Most of these tools rely on either reflection, or a base class that must be extended by domain model classes.

    JDO proposes a new (arguably better and more transparent) paradigm for detecting reads/writes to application classes, and delegating reads/writes to a vendor-implemented PersistenceManager. This "automagically" keeps fields from the application consistent with the datastore. The paradigm is based on the "enhancement" of the application classes by a special compiler that inserts Java bytecodes into the application classes. The new bytecodes replace any putfield or getfield bytecodes with calls to the vendor-provided PersistenceManager.
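
    To make the idea concrete, here is a rough, hand-written approximation of what an enhanced class behaves like. It is only a sketch: FieldTracker, EnhancedCustomer and EnhancementDemo are hypothetical names invented for illustration, not JDO API types, and a real enhancer rewrites bytecode rather than source.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for the vendor's runtime; not a JDO API type.
class FieldTracker {
    private final Set<String> dirtyFields = new LinkedHashSet<>();
    private int readCount = 0;

    // Called before a persistent field is read (a real runtime could load it here).
    void beforeRead(String fieldName) { readCount++; }

    // Called before a persistent field is written; remembers the field as dirty.
    void beforeWrite(String fieldName) { dirtyFields.add(fieldName); }

    Set<String> dirtyFields() { return dirtyFields; }
    int readCount() { return readCount; }
}

// Hand-written approximation of a domain class after enhancement.
class EnhancedCustomer {
    private final FieldTracker tracker;
    private String name;

    EnhancedCustomer(FieldTracker tracker) { this.tracker = tracker; }

    // Source before enhancement: return name;
    String getName() {
        tracker.beforeRead("name");   // inserted where the bare getfield used to be
        return name;
    }

    // Source before enhancement: this.name = newName;
    void setName(String newName) {
        tracker.beforeWrite("name");  // inserted where the bare putfield used to be
        this.name = newName;
    }
}

class EnhancementDemo {
    public static void main(String[] args) {
        FieldTracker tracker = new FieldTracker();
        EnhancedCustomer customer = new EnhancedCustomer(tracker);
        customer.setName("Ada");
        System.out.println(customer.getName() + " dirty=" + tracker.dirtyFields());
    }
}
```

    The point of the sketch is that the domain class never extends a base class or calls a persistence API in its own source; the tracking calls are added afterwards.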

    Because most big vendors have inertia behind a fundamentally different paradigm for object persistence they have spread a great deal of FUD about JDO. Speaking as one who has used EJB 2.0 extensively, I can tell you that JDO is a huge leap forward for server-side Java over CMP. As a developer I prefer the elegance of JDO over any scheme that intrudes into my domain model.

    JDO has to clear a double hurdle to win the hearts and minds of developers. The first is to convince developers that JDO is better than CMP. This should not be too hard for anyone who has used CMP, but the inertia behind present implementations of container-managed persistence is huge. Secondly, JDO has to beat all the FUD being spread by entrenched O/R mapping vendors who feel the JDO spec does not let them leverage what they've got. Entrenched O/R mapping vendors also fear that a specification for transparent object persistence destroys the proprietary lock-in that they achieve.

    That said, there are a number of great JDO implementations commercially available.
  5. Geoff,

    I agree 100%. You've put the case for JDO very well. If the technologies were considered purely on their merits, I can see very little reason to choose CMP over JDO. And I feel more comfortable with an open specification than a proprietary O/R mapping product, no matter how good it is.

    Rod Johnson, author of Expert One-on-One J2EE Design and Development
  6. <rod>
    And I feel more comfortable with an open specification than a proprietary O/R mapping product, no matter how good it is.
    </rod>

    That's one strange statement, Rod, considering that both CMP and JDO

    - Have an open specification
    - Have various implementations that are both open source and commercial

    Besides, why are you comparing a specification with a mapping product?

    --
    Cedric
    http://beust.com/weblog
  7. <cedric>
    That's one strange statement, Rod, considering that both CMP and JDO

    - Have an open specification
    - Have various implementations that are both open source and commercial

    Besides, why are you comparing a specification with a mapping product?
    </cedric>

    I may be reading this wrong, but I thought Geoff's comment made two separate points:
    Promote usage of JDO over CMP simply on merit
    Promote usage of JDO-compliant tools over proprietary, non-JDO O/R mapping tools, out of a desire to conform to open specifications.
  8. <Tom>
    I may be reading this wrong, but I thought Geoff's comment made two separate points:
    Promote usage of JDO over CMP simply on merit
    Promote usage of JDO-compliant tools over proprietary, non-JDO O/R mapping tools, out of a desire to conform to open specifications.
    </Tom>

    I agree completely with Geoff's first point. And yes, there are 2 points ;-) There's hardly any reason to choose CMP over any lightweight O/R mapping solution, not only over JDO, for that matter.

    Hence, the second point: There isn't just FUD being spread by established persistence toolkit vendors. Gavin defends Hibernate wholeheartedly because he's convinced that reflection-based persistence is a viable approach too, and I have to agree with him. JDO's bytecode manipulation approach has its merits, but it's not the only viable approach. Efficient dirtiness detection (the main argument for JDO's method intercepting) isn't such a big deal in typical applications. I don't know how often I've already had to repeat that.
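
    For comparison, dirtiness detection without bytecode manipulation can be sketched as a snapshot comparison done through reflection. This is a minimal illustration under my own assumptions, not Hibernate's actual code: SnapshotDirtyChecker, Account and DirtyCheckDemo are hypothetical names, only fields declared directly on the class are examined, and values are compared shallowly with equals.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: detects changed fields by comparing against a saved snapshot.
class SnapshotDirtyChecker {

    // Copy every declared instance field of the entity into a map keyed by field name.
    static Map<String, Object> snapshot(Object entity) {
        Map<String, Object> state = new HashMap<>();
        for (Field field : entity.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue; // skip anything that is not ordinary instance state
            }
            try {
                field.setAccessible(true);
                state.put(field.getName(), field.get(entity));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("cannot read " + field.getName(), e);
            }
        }
        return state;
    }

    // Names of the fields whose current value differs from the loaded snapshot.
    static Set<String> dirtyFields(Object entity, Map<String, Object> loadedState) {
        Set<String> dirty = new TreeSet<>();
        for (Map.Entry<String, Object> current : snapshot(entity).entrySet()) {
            if (!Objects.equals(current.getValue(), loadedState.get(current.getKey()))) {
                dirty.add(current.getKey());
            }
        }
        return dirty;
    }
}

// Plain domain class: no base class, no enhancement.
class Account {
    String owner = "tom";
    int balance = 10;
}

class DirtyCheckDemo {
    public static void main(String[] args) {
        Account account = new Account();
        Map<String, Object> loaded = SnapshotDirtyChecker.snapshot(account);
        account.balance = 25;
        System.out.println(SnapshotDirtyChecker.dirtyFields(account, loaded));
    }
}
```

    The cost is one comparison pass over the loaded objects at flush time, which is the overhead that field interception avoids and that, as argued above, rarely matters in typical applications.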

    <Rod>
    And I feel more comfortable with an open specification than a proprietary O/R mapping product, no matter how good it is.
    </Rod>

    Same here, although I'd rate an alive-and-kicking open source product with tons of nicely working features (Hibernate) equal to a mid-price commercial solution that adheres to an open specification and works nicely too - but has to offer proprietary addons to provide necessary functionality (e.g. Kodo JDO).

    IMHO, the value of the standardization isn't that great if the specification is flawed to a certain (debatable) degree. In any scenario with existing data models, you'll have to rewrite and re-tune the mapping descriptors when moving to a different JDO implementation - and all the proprietary addons too, of course. I'm convinced that this isn't a drop-in replacement thing in real life. Let's see if JDO 2.0 will address the latter issues (whenever it arrives).

    All things considered, I'd love to see all viable lightweight persistence approaches compete. Migrating a well-architected application from JDO to Hibernate or vice versa isn't too hard anyway, if you ever need to. In reality, this will probably be decided on a per-project level in most cases. The biggest efforts would probably be migrating the queries and tuning the mappings (cascading update/delete, lazy loading, etc).

    Let's accept all contenders and see how they evolve.

    Juergen

    P.S.:
    I don't intend to turn this thread into another JDO vs reflection-based persistence discussion, actually I'm a bit tired of it. I've stated my points in nearly every JDO thread at TSS. And again, no, I'm not affiliated with any O/R toolkit vendors. But I'm quite a Hibernate fan, admittedly.
  9. queries

    >> I don't intend to turn this thread into another JDO vs reflection-based persistence discussion, actually I'm a bit tired of it. <

    Me too ;)

    But more importantly, "reflection" isn't Hibernate's main selling point anyway, particularly not now with Hibernate2. The #1 reason our users are choosing Hibernate over JDO is the query language. Our users actually DO need a full-featured object/relational query language with support for notions like

    * outer joins
    * subqueries
    * grouping, counting, aggregating
    * returning multiple objects, or property values, or aggregates,
      in a single row of results

    and Hibernate delivers that with far, far less verbosity than the typical (trivial) JDO query.

    I can see a number of reasons why the JDO spec did NOT mandate OQL as the query language (OQL is very complex and has some notions that really do not map well into the object/relational world), but it really would have been better to take some kind of OQL subset and adjust the semantics slightly so that it could be efficiently implemented in both OODBMS and ORM worlds.

    I read the linked book chapter the other day and I just can't see how it could be argued that the JDO query API is nonugly! Take a look at how many lines of code are required to accomplish what would be doable in 1-4 LOC with the ODMG or Hibernate APIs.

    peace...

    P.S. No flames or reflexive accusations of "FUD" please. Let's debate this as adults, sticking to technical arguments. The Hibernate project (like other projects such as Cayenne) is composed of Java enthusiasts sharing their ideas and work with the community. Criticism of JDO is *not* just coming from businesses with vested interests, so please let's just drop that pointless ad hominem.
  10. queries

    I didn't read the other "JDO vs reflection-based persistence discussions"
    So thanks guys for making the effort of writing the arguments another time :-)
  11. queries

    If your application is just manipulating data (rows from tables), then you don't need JDO (in fact I'm not sure you need Java at all). JDO is useful when you need to deal with the business model through Java Objects.

    This is why, IMHO, there is no need to deal with all the SQL possibilities at the JDOQL level: the only result you can get from a JDOQL query is a Collection of Objects.

    Now, if you sometimes need to manipulate raw data in your application, most JDO implementations (at least LiDo) allow you to use pure SQL-92 syntax instead of JDOQL, from within your JDO transaction.
    From my point of view, I don't really see much added value in reproducing within JDOQL all the low-level querying possibilities from SQL (aggregate...). SQL is perfect for data analysis and so on, there is no debate on this, so why introduce a new query language?

    The goal of JDOQL is not to perform data analysis or to mimic SQL; it is just to identify root objects in your business model, from which you'll be able to navigate and apply Java methods.

    In the same way, if you just need to update a few rows in the DB, without any intelligence (e.g. business methods), just use a typical SQL UPDATE statement; no need for JDO there.

    That said, it is possible that JDOQL will evolve in the future: Gavin you're welcome if you want to participate and promote your vision.

    Best Regards,
  12. queries

    >> This is why, IMHO, there is no need to deal with all the SQL possibilities at the JDOQL level: the only result you can get from a JDOQL query is a Collection of Objects. <

    Correct, but that is a limitation of the JDO API and its Extents. ODMG OQL and HQL do not suffer this limitation.

    >> I don't really see a big added-value in reproducing within JDOQL all the low-level querying possibilities from SQL (aggregate...). SQL is perfect for data analysis and so on, there is no debate on this, why introducing a new query language. <
    Aggregation is "low level"?? Almost all applications need to do some kind of analysis which means sorting, grouping, ordering (all at the same time). It is NOT efficient to do this in the business tier.

    Why introduce a new query language? Well, quite simple really. I want my mappings defined in One Place. If I need to write too many queries in SQL, instead of a higher-level object/relational language, I end up replicating the mappings in each of the queries. This is much less maintainable. Secondly, the HQL query is much, much more readable than the translated SQL. It talks about objects and properties of objects. It understands associations, so I can express things using path expressions instead of having to explicitly write down the join condition (every time). Etc.

    In fact, UPDATE, INSERT and DELETE are the easy part. You aren't going to really appreciate the enormous advantages of ORM until you come to retrieve data. Now, outer-join fetching configured on a per-association level can realize about 10% of the benefit. But what I've observed is that as the application grows, and classes become more re-used, you end up having to disable outerjoining on most associations. So the only way to take proper advantage of joining ends up being some kind of "fetching" API, or a decent QL.

    >> The goal of JDOQL is not to perform data analysis, and to mimic SQL, it is just to identify root objects in your business model, from which you'll be able to navigate and apply Java methods. <
    This is horribly inefficient and is just not how relational databases are intended to be used. This kind of approach causes killer n+1 problems once applications become sufficiently complicated. Relational data MUST be accessed using joins if high performance is required in a scalable (ie. distributed) application.
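
    The n+1 effect is easy to see with a toy model that counts round trips. The sketch below uses an in-memory stand-in rather than a real database, and every name in it (CountingDatabase, FetchStrategies, the order/item data) is hypothetical; it only illustrates how the number of calls grows with the number of root objects.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for a remote database; each method call counts as one round trip.
class CountingDatabase {
    private final Map<Integer, List<String>> itemsByOrderId = new LinkedHashMap<>();
    private int roundTrips = 0;

    CountingDatabase(int orderCount, int itemsPerOrder) {
        for (int orderId = 1; orderId <= orderCount; orderId++) {
            List<String> items = new ArrayList<>();
            for (int i = 1; i <= itemsPerOrder; i++) {
                items.add("order" + orderId + "-item" + i);
            }
            itemsByOrderId.put(orderId, items);
        }
    }

    // Roughly "SELECT id FROM orders".
    List<Integer> findOrderIds() {
        roundTrips++;
        return new ArrayList<>(itemsByOrderId.keySet());
    }

    // Roughly "SELECT * FROM items WHERE order_id = ?".
    List<String> findItems(int orderId) {
        roundTrips++;
        return itemsByOrderId.get(orderId);
    }

    // Roughly "SELECT ... FROM orders LEFT JOIN items ...", all in one call.
    Map<Integer, List<String>> findOrdersWithItems() {
        roundTrips++;
        return itemsByOrderId;
    }

    int roundTrips() { return roundTrips; }
}

class FetchStrategies {
    // Find the root objects, then navigate to each collection: 1 + N round trips.
    static int countItemsByNavigation(CountingDatabase db) {
        int items = 0;
        for (int orderId : db.findOrderIds()) {
            items += db.findItems(orderId).size();
        }
        return items;
    }

    // Fetch roots and collections together: a single round trip.
    static int countItemsByJoin(CountingDatabase db) {
        int items = 0;
        for (List<String> orderItems : db.findOrdersWithItems().values()) {
            items += orderItems.size();
        }
        return items;
    }
}
```

    With N orders, the navigation strategy makes N + 1 calls and the joined fetch makes one; whether the single joined query is actually cheaper on a given server is a separate question that this toy model does not answer.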

    Remember, joining is what relational databases *do*. It is essential to the future success of ORM that tools are able to take full advantage of the underlying relational technology. Hibernate's approach to this is an O/R query language designed as a "minimal extension to SQL" (which happens to be very close to ODMG OQL for many queries, but then adds features like LEFT JOIN, FULL JOIN, etc). This is not the only possible solution, of course.

    I have often wondered if JDO has aimed for a too-high level of abstraction here. The JDO API tries to abstract common notions from ORM and object databases. But these are fundamentally different technologies! However, I think there *are* ways forward on this issue.

    >> Gavin you're welcome if you want to participate and promote your vision. <
    Oh, I'd certainly *like* to :) Is there room for input from the open source community?
  13. Queries - Aggregations

    > Aggregation is "low level"?? Almost all applications need to do some kind of analysis which means sorting, grouping, ordering (all at the same time). It is NOT efficient to do this in the business tier.


    Eric>>>>I don't say aggregation is a low-level need but aggregation is low-level from the Object perspective.
    In which business objects would you put the result of an aggregation query? If you need aggregated results, you don't need objects, so stick with simple, well-known SQL.

    Best Regards, Eric.
  14. Queries - Aggregations

    > Aggregation is "low level"?? Almost all applications need to do some kind of analysis which means sorting, grouping, ordering (all at the same time). It is NOT efficient to do this in the business tier.


    Eric>>>>I really think that real, non-trivial, object models cannot deal with aggregation using SQL Group BY clauses. Aggregation is a business need that could be addressed in a better way by methods defined on Objects.
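
    As a small illustration of aggregation expressed as a method on objects, here is a sketch with hypothetical classes (Invoice and InvoiceLine are invented for the example). Note the trade-off discussed above: every line has to be loaded into memory before the total can be computed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain class: one line of an invoice.
class InvoiceLine {
    private final int quantity;
    private final long unitPriceCents;

    InvoiceLine(int quantity, long unitPriceCents) {
        this.quantity = quantity;
        this.unitPriceCents = unitPriceCents;
    }

    long subtotalCents() {
        return quantity * unitPriceCents;
    }
}

// Hypothetical domain class: the aggregate lives on the object that owns the lines.
class Invoice {
    private final List<InvoiceLine> lines = new ArrayList<>();

    void addLine(int quantity, long unitPriceCents) {
        lines.add(new InvoiceLine(quantity, unitPriceCents));
    }

    // Business-tier equivalent of SUM(quantity * unit_price) for this invoice.
    long totalCents() {
        long total = 0;
        for (InvoiceLine line : lines) {
            total += line.subtotalCents();
        }
        return total;
    }
}
```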

    Best Regards,
  15. Performance

    > disable outerjoining on most associations. So the only way to take proper advantage of joining ends up being some kind of "fetching" API, or a decent QL.

    Eric>>>>I agree with the need for intelligent group read APIs (or alternate external mechanisms). This is something that will be added in JDO 2.
    The real need is to be able to define use cases where you define how groups of objects can be efficiently loaded into memory (whatever the underlying storage technology is).

    Best Regards, Eric.
  16. Performance

    > This is horribly inefficient and is just not how relational databases are intended to be used. This kind of approach causes killer n+1 problems once applications become sufficiently complicated. Relational data MUST be accessed using joins if high performance is required in a scalable (ie. distributed) application.

    Eric>>>>If you need performance, simply forget RDBMS!
    E-R models provide a simple and elegant way to build information from atomic data, but they are well known as one of the less efficient data technologies.

    Best Regards, Eric.
  17. Performance

    > This is horribly inefficient and is just not how relational databases are intended to be used. This kind of approach causes killer n+1 problems once applications become sufficiently complicated. Relational data MUST be accessed using joins if high performance is required in a scalable (ie. distributed) application.

    >
    > Remember, joining is what relational databases *do*. It is essential to the future success of ORM that tools are able to take full advantage of the underlying relational technology. Hibernate's approach to this is an O/R query language designed as a "minimal extension to SQL" (which happens to be very close to ODMG OQL for many queries, but then adds features like LEFT JOIN, FULL JOIN, etc). This is not the only possible solution, of course.


    Eric>>>>>JOINS are efficient??? This is what kills the CPU on DB servers.
    It really depends on your model. With a really complex object model using inheritance, collections, and interfaces, you will simply kill your RDBMS optimizer if you rely only on JOINS.
    It only works for simple existing schemas, designed before object technologies.

    Best Regards,
  18. JOIN JDO Expert Group

    >> Gavin you're welcome if you want to participate and promote your vision.

    > Oh, I'd certainly *like* to :) Is there room for input from the open source community?

    Eric>>>>I suppose you can do what we did.
    Just join the JCP then request to be part of the JDO Expert Group.
    But don't forget that JDO is not an O/R mapping standard, it will remain a unified standard to access any datastore.

    Best Regards, Eric.
  19. Vision on future of ORM

    > Remember, joining is what relational databases *do*. It is essential to the future success of ORM that tools are able to take full advantage of the underlying relational technology. Hibernate's approach to this is an O/R query language designed as a "minimal extension to SQL" (which happens to be very close to ODMG OQL for many queries, but then adds features like LEFT JOIN, FULL JOIN, etc). This is not the only possible solution, of course.



    Eric>>>>It is your vision.
    Another vision is that RDBMS will become commodities, limited to storage, and most of their "advanced" features will migrate to business layers.
    The first reason is that SQL syntax is too limited to express very complex constraints on rich business models. I don't believe declarative queries can address the needs of future complex applications.
    The second reason is that most advanced RDBMS features (triggers, unique, Foreign Keys...) are limited to a single-server vision, that doesn't fit with the need for applications accessing different distributed databases.

    RDBMS features are nice for client-server applications used locally.
    How can you use a unique index on 2 distributed Oracle databases ?
    How can you define a foreign key from an Oracle row to a Sybase one ?
    How can you imagine that non-trivial business cases are simple enough to manage "On Cascade delete" business rules simply by adding a facet on a Foreign Key?

    To me, the question is not the future of O/R mapping, but the future of Enterprise Information Access.


    Best Regards, Eric.
  20. JDO / OR Mapping / ODBMS


    > I have often wondered if JDO has aimed for a too-high level of abstraction here. The JDO API tries to abstract common notions from ORM and object databases. But these are fundamentally different technologies! However, I think there *are* ways forward on this issue.

    Eric>>>We can discuss this. But before that, do you have real experience with ODBMS ? Why do you say it is so different ?
    The only big difference between an ODBMS and RDBMS is that an object reference is an atomic type within an ODBMS and you can retrieve an object by its reference, with no need to perform a query on its PK (that is much more efficient).

    The goal of JDO is not to make JDBC easier to use, it is to define how Java objects can be loaded into memory from any data source including RDBMS.
    It tries to decouple the Business Objects from the underlying datastore.
    It defines how objects can be loaded without any intrusion in the source code, and how updates are automatically tracked. This applies to any datastore technology, including RDBMS.

    For instance you can deploy a LiDO application into Oracle and then redeploy it later (or on a different site) on Versant (an ODBMS), just changing the mapping file.
    Or you can have the same application, accessing XML files, Mainframes and a Sybase DB, just using the same Java business objects.
    This is at least as important as simple O/R mapping, for big enterprise IS applications.

    Not all data are stored within Oracle or MySQL.
    A lot of data is still stored in mainframes, files (csv, proprietary...).

    Best Regards,
  21. Performance

    > This is horribly inefficient and is just not how relational databases are intended to be used. This kind of approach causes killer n+1 problems once applications become sufficiently complicated. Relational data MUST be accessed using joins if high performance is required in a scalable (ie. distributed) application.


    Eric>>>>Performance is not limited to SQL tuning and JOINS.
    This is obviously an important point, but performance is also a question of caches, Java coding guidelines, architecture choice, design considerations, culture, internal knowledge of how database engines and associated network layers are designed...

    Relational optimizers are much more limited than one might suppose.
    Trying to execute overly complicated SQL queries can result in nightmares, from the DB server's point of view.
    You should not be so confident in RDBMS technologies.
    Just download an open source RDBMS and spend a lot of time examining how it works. You'll see the key point is not using JOINS (or you'll understand why they can kill your application).

    Using JOINS or other RDBMS features can be efficient in some cases, and not so efficient in others.
    The more complex your model is, the less efficient advanced RDBMS features are.
    What is true is that you must be able to configure all these options from a good O/R mapping tool.

    Best Regards, Eric.
  22. Performance

    > Just download an open source RDBMS and spend a lot of time examining how it works. You'll see the key point is not using JOINS (or you'll understand why they can kill your application).

    Great idea... PostgreSQL RedHat Edition (formerly RedHat Database) includes a visual query analyzer (Visual Explain) that allows you to see what the server is doing when you execute a query. There you find your CPU killers.

    http://sources.redhat.com/rhdb/tools.html


    Best regards, Raul.
  23. joins

    It is simply not true to say that you will see better performance by using many round-trips to the database, particularly in a typical kind of clustered environment where the database server resides on a different machine to the application.

    It is just absurd to suppose that an enterprise-class relational database cannot perform joins faster, in its own process, than the Java application can by using many interprocess or even remote calls to the database.

    Yes, we certainly need human interaction to ensure that the generated joins are efficient ones - hence the need for the kind of query language or fetching API I am talking about.
  24. queries

    I'm a pretty simple guy. And I don't know what an "ad hominem" is, but I do know what a "flame" is, and I do know FUD when I see it. To the great credit of all the open source guys, the open source community has *not* been the source of the anti-JDO noise. It's mainly been from commercial O/R vendors, as I said.

    I think the diversity of object persistence projects is proof that people on the front lines want an alternative to CMP. And, I'd encourage people to try projects like Hibernate because they are relatively mature and performant.

    Now, returning to the technical merits...

    This is a bit on the outside, but I can see another really nice reason for JDO binary compatibility that I have not heard any discussion around. I can imagine tools from application server or DBMS vendors that could allow really slick deployments of applications that used JDO.

    A big barrier to deploying persistent "components" is that a deployer still has to go through the decidedly tedious step of setting up your schema based on DDL scripts that you provide. Because a JDO enhanced class contains a great deal of information encoded in static fields, it could be possible to do a "drag and drop" JDO deployment where the appserver could initialize a default schema.

    Maybe in JDO 1.x or 2.x the amount of information embedded in the static fields could be beefed up. Correct me if I'm wrong but I don't think the element-type meta-data for an enhanced Collection can be extracted from the enhanced class. I think with the addition of element-type information there would be enough information in the enhanced bytecodes that at least a simple default schema could be generated with no reference to the XML meta-data used at enhancement time.

    Anyway, I haven't thought it through too well, but it sure would be nice to be able to deploy a WAR file and just instruct the app server to "generate default schema".
  25. ad hominem

    An "argumentum ad hominem" is a fallacy where the speaker attacks the person making an argument or claim by questioning their motives/qualifications/intelligence/etc rather than addressing the argument or claim itself.
  26. eg

    So, for example, Ward Mullins might make all kinds of negative claims about the JDO specification, purely out of self interest. But that doesn't make the claims necessarily untrue.
  27. Profits

    > Is there anybody that can explain this lack of enthusiasm from the big players for JDO?

    Money. IBM & BEA are pushing EJB and if they support JDO, that would make a large part of their application servers not so useful. Then customers would re-evaluate purchasing the appserver if they're not going to use CMP. Many customers will do as I have and skip the appserver altogether, in favor of a lightweight Tomcat/JDO solution.

    It's a shame really, because the big players are motivated by profits and are pushing their products while a technically superior solution goes unnoticed. I think an open source JDO implementation will help but we're not quite there yet. Word of mouth has to be used as much as possible to get JDO the exposure it deserves.

    Michael
  28. Profits

    OTOH, whichever one supports JDO _for_ their CMP (i.e. builds JDO in and makes it interop beautifully as part of their EJB solution) will definitely benefit. I figure you either fear good technology because of what it _could_ do to your business, or embrace it before your competition does. IMHO, WebLogic + JDO would be a pretty big win for developers, and it would force IBM and Oracle to follow suit. JDO isn't going away, and it's only going to become more dangerous and expensive for BEA, Oracle and IBM to ignore.

    $.02

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Easily share live data across a cluster!