News: Object To XML Persistence Frameworks

  1. Object To XML Persistence Frameworks (15 messages)

    TheServerSide.com interviews Ilya Sterin on the need for an XML persistence framework. Sterin is working on XQOM, a new open-source library that maps Java objects to XML documents, which are then stored in a native XML database.
    What are the application domain models where XML databases make sense? In addition to the scenarios I mentioned above, it makes sense to think in terms of persisting your data in its own native format. What is the native format, one might ask? Well, if you have an SOA application that integrates with some industry standard XML schema, that becomes its native storage format, in my opinion. I mean, if you spend 90% of your time dealing with that format, reading it, rendering it, etc. then why fight and store your data in some other representation? I see three scenarios where XML databases make sense:
    • You have a dynamic, hierarchical data model and a schema that might change with time.
    • Your current application domain is closely tied to some industry standard XML schema, where you have to support the data in the XML format for interoperability.
    • You have an XML based SOA architecture and the majority of your use cases involve moving the domain state in some XML format.
    What are the advantages to using a native XML database over simply storing XML data in a relational database? Well, the main advantage is getting rid of one level of impedance mismatch. I mean, you already have to worry about your domain model and how to marshal/unmarshal it to and from your XML schema. The last thing you want to do is add another level of complexity by also representing this domain model in a rectangular format. I also think that the impedance mismatch of XML to OO is less problematic than it is in the OR world. There is the big issue of cyclical dependencies, but for the most part it's a lot easier to represent your OO model in XML than it is in a relational schema.
    What do you think about XQOM?
  2. We're looking at storing our object model using XML with the help of XStream serialization. The database itself could be anything that supports id->blob mappings, such as BLOBs in an RDBMS or Jisp/BerkeleyDB/JDBM. If Fast Infoset is used as the actual format it should be pretty speedy and compact as well. For querying it seems like RDF is the way to go. The model is good for handling strange queries over large graphs of connected objects. I kinda like what you can do with SPARQL, and with some basic OWL for inferencing on top it becomes really really neat. Being able to say "Field X in class A is a dc:title and Field Y in Class B is a dc:title" and then query for "all objects with dc:title='Hello World'", without having to know what the class of the object is, is very powerful. Since we use AOP introductions for our domain model the whole XML mindset with namespaces fits well, as each aspect can be mapped to a namespace. Having the data in XML also makes it trivial to do schema migration using XSL, either lazily or eagerly. So, using XML for persistence looks very promising to me, and as above, especially if you choose to use RDF indexing for the queries (and definitely NOT as the storage format!).
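
To make the "schema migration using XSL, lazily" idea concrete, here is a minimal sketch that uses only the JDK's built-in XSLT support. Everything in it is an assumption made for illustration: the `LazyMigratingStore` class, the in-memory map standing in for an id->blob store, the version numbers, and the `<name>` to `<title>` rename. It does not use XStream or Fast Infoset.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.HashMap;
import java.util.Map;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical id->XML store that upgrades old documents the first time they are read. */
final class LazyMigratingStore {
    static final int CURRENT_VERSION = 2;

    // Hypothetical v1 -> v2 change: the <name> element was renamed to <title>.
    // The first template copies everything else through unchanged.
    private static final String V1_TO_V2 =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='@*|node()'>"
        + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
        + "</xsl:template>"
        + "<xsl:template match='name'>"
        + "<title><xsl:apply-templates select='@*|node()'/></title>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** A stored document together with the schema version it was written in. */
    private static final class Blob {
        final int version;
        final String xml;

        Blob(int version, String xml) {
            this.version = version;
            this.xml = xml;
        }
    }

    // Stand-in for a real id->blob store (RDBMS BLOB column, embedded key/value store, ...).
    private final Map<String, Blob> blobs = new HashMap<>();

    void put(String id, int version, String xml) {
        blobs.put(id, new Blob(version, xml));
    }

    int storedVersion(String id) {
        return blobs.get(id).version;
    }

    /** Returns the document, migrating and re-storing it first if it is out of date. */
    String get(String id) {
        Blob blob = blobs.get(id);
        if (blob == null) {
            return null;
        }
        if (blob.version < CURRENT_VERSION) {
            blob = new Blob(CURRENT_VERSION, migrate(blob.xml));
            blobs.put(id, blob);
        }
        return blob.xml;
    }

    private static String migrate(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(V1_TO_V2)));
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Migration failed", e);
        }
    }

    public static void main(String[] args) {
        LazyMigratingStore store = new LazyMigratingStore();
        store.put("42", 1, "<book><name>Hello World</name></book>");
        System.out.println(store.get("42"));
    }
}
```

An eager migration would simply run the same transform over every stored entry up front instead of waiting for the first read.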
  3. Rickard, it sounds almost like what you're looking for is already in JCR? It has XPath support like that, and can use RDF-style namespacing if you so desire...
  4. JCR XPath functionality is misleading

    We have been investigating both JCR and native XML databases. I have found that the so-called XPath support in JCR is very limited. It works only at the storage node level, so you do not have any XPath capability on the actual content (the XML data). In fact you are usually limited to text-based queries on content. Another limitation of JCR is that the results are coarse-grained. With a native XML database, and more importantly XQuery, you can pick and choose data at a very fine-grained level. We may end up implementing both to satisfy our customers. An Object to XML persistence framework would be wonderful to have, and ideally it should be able to persist to both a JCR and a native XML database.
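
As a small illustration of what "fine-grained" access to the XML content itself looks like, the sketch below runs XPath 1.0 expressions directly against a document using the JDK's `javax.xml.xpath` package. It is not JCR, XQuery, or a native XML database; the `ContentXPathSketch` class and the sample order document are made up for the example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

/** Hypothetical helper: evaluates XPath against the XML content rather than storage nodes. */
final class ContentXPathSketch {
    // Made-up sample document.
    static final String ORDER =
          "<order id='7'>"
        + "<customer><name>Ada</name></customer>"
        + "<item sku='A1'><price>10.50</price></item>"
        + "<item sku='B2'><price>4.25</price></item>"
        + "</order>";

    /** Returns the string value of the expression evaluated over the given XML text. */
    static String select(String xml, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Pick out a single value deep inside the document.
        System.out.println(select(ORDER, "/order/item[@sku='B2']/price"));
        // Aggregate over part of the document.
        System.out.println(select(ORDER, "sum(/order/item/price)"));
    }
}
```

A text-only search could tell you that this order mentions "B2"; the path expressions can return just the price of that one item, or a total across items.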
  5. Rickard, it sounds almost like what you're looking for is already in JCR? It has XPath support like that, and can use RDF-style namespacing if you so desire...
    Well, JCR is not an object store really, so it would be kind of forcing it to do something it wasn't supposed to do. Just doing a straight Object->XML with XStream and then RDFizing it into Sesame2 for querying would be enough. What I like about RDF is that, unlike XPath and XQuery and SQL and most other QLs, it is better at allowing me to *describe* what I am after rather than having to explicitly know what I'm after. That is immensely powerful, especially in a dynamic environment where data and objects are merged from many different sources.
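
The dc:title example from earlier in the thread ("Field X in class A is a dc:title and Field Y in Class B is a dc:title") can be sketched without any RDF tooling. The code below is only a plain in-memory index standing in for a real RDF store such as Sesame, with no SPARQL and no OWL inferencing; the `PropertyIndexSketch` class and all of its method names are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical index: query objects by a shared property without knowing their class. */
final class PropertyIndexSketch {
    // "ClassName#fieldName" -> shared property name such as "dc:title".
    private final Map<String, String> fieldToProperty = new HashMap<>();
    // property -> value -> ids of the objects carrying that value.
    private final Map<String, Map<String, Set<String>>> index = new HashMap<>();

    /** Declares that a field of a class represents the given shared property. */
    void mapField(String className, String fieldName, String property) {
        fieldToProperty.put(className + "#" + fieldName, property);
    }

    /** Indexes one object; fields without a declared mapping are ignored. */
    void indexObject(String id, String className, Map<String, String> fields) {
        for (Map.Entry<String, String> field : fields.entrySet()) {
            String property = fieldToProperty.get(className + "#" + field.getKey());
            if (property == null) {
                continue;
            }
            index.computeIfAbsent(property, p -> new HashMap<>())
                 .computeIfAbsent(field.getValue(), v -> new TreeSet<>())
                 .add(id);
        }
    }

    /** Finds the ids of all objects, of any class, whose property has the given value. */
    Set<String> find(String property, String value) {
        Map<String, Set<String>> byValue =
                index.getOrDefault(property, Collections.emptyMap());
        return byValue.getOrDefault(value, Collections.emptySet());
    }

    public static void main(String[] args) {
        PropertyIndexSketch idx = new PropertyIndexSketch();
        idx.mapField("A", "x", "dc:title");
        idx.mapField("B", "y", "dc:title");
        idx.indexObject("a1", "A", Collections.singletonMap("x", "Hello World"));
        idx.indexObject("b1", "B", Collections.singletonMap("y", "Hello World"));
        System.out.println(idx.find("dc:title", "Hello World"));
    }
}
```

The caller of `find` never mentions class A or B, which is the "describe what I am after" point; a real RDF store adds graph traversal and inferencing on top of this.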
  6. I wonder if there is a possible role for JDO here, to sit above projects like XQOM as a sort of 'commons-persistence'? After all, JDO is a technology-neutral specification for object persistence. Indeed, there is at least one JDO implementation - Xcalia - that enables the use of XML for object storage. JDO allows pluggable query languages, so something more appropriate for an XML database could be included if JDOQL was considered inadequate in this context. There is also a great open source JDO - JPOX (the JDO 2.0 RI), that could be a basis for such a project. (Also, it seems kind of ironic that, so soon after the arrival of JPA, which - in contrast to JDO - focussed in on relational stores, interesting uses for non-relational persistence are turning up).
  7. interesting uses for non-relational

    Also, it seems kind of ironic that, so soon after the arrival of JPA, which - in contrast to JDO - focussed in on relational stores, interesting uses for non-relational persistence are turning up.
    How's that ironic? There have been use cases for XML storage all the time, and plenty of solutions as well. The question you are referring to was whether it makes sense to define a persistence abstraction as a common denominator of all the conceivable mechanisms (which differ widely in capabilities, just look at XML vs. relational). I for my part am extremely happy that JPA focuses on by far the most important persistence technology, and supports almost all its fundamental features. This really is history by now; I'm not out to start a 20-page flame thread. Christian
  8. Also, it seems kind of ironic that, so soon after the arrival of JPA, which - in contrast to JDO - focussed in on relational stores, interesting uses for non-relational persistence are turning up.

    How's that ironic? There have been use cases for XML storage all the time, and plenty of solutions as well.
    It is ironic because these are apparently now becoming more mainstream, so there might be a need for a more general persistence mechanism. As you say, the question is - is this reasonable and useful?
    The question you are referring to was whether it makes sense to define a persistence abstraction as a common denominator of all the conceivable mechanisms (which differ widely in capabilities, just look at XML vs. relational). I for my part am extremely happy that JPA focuses on by far the most important persistence technology, and supports almost all its fundamental features.
    The fact is that different developers have different answers to that question. Some think it does make sense to have a common abstraction. And, it is certainly possible: there are products that show that this common abstraction works without losing any ability to handle relational persistence well. Hopefully future versions of JPA will be adapted to make such versatility possible, just as JDO has been adapted to make it much more suitable for relational use (even though this has always been the main use case for JDO).
  9. Re: interesting uses for non-relational

    Hopefully future versions of JPA will be adapted to make such versatility possible, just as JDO has been adapted to make it much more suitable for relational use (even though this has always been the main use case for JDO).
    I don't think it would be that hard to morph JPA. The EntityManager and Query are generic enough to be layered upon other storage. What you would need is different mapping metadata (JAXB?) and a query language (XPath or others that people mentioned here). Personally, I'm not a big fan of one-size-fits-all persistence solutions, specifically generic query languages. Bill
  10. Re: interesting uses for non-relational

    What you would need is different mapping metadata (JAXB?) and a query language (XPath or others that people mentioned here). Personally, I'm not a big fan of one-size-fits-all persistence solutions, specifically generic query languages.

    Bill
    I think a good approach could be to allow pluggable query languages, as JDO 2.0 allows in the PersistenceManager#newQuery method. Perhaps the JPA createNativeQuery could be generalised for more than just SQL.
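
JDO 2.0's `PersistenceManager#newQuery` has an overload that takes a query language name alongside the query itself. The sketch below mimics only that dispatch idea in plain Java. It uses neither the JDO nor the JPA API, and the `PluggableQuerySketch` class, the `QueryLanguage` interface, and the two toy languages are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical store whose query languages are plugged in by name. */
final class PluggableQuerySketch {
    /** Plug-in point: turns query text into a filter over stored items. */
    interface QueryLanguage {
        Predicate<Map<String, String>> compile(String queryText);
    }

    private final Map<String, QueryLanguage> languages = new HashMap<>();
    private final List<Map<String, String>> items = new ArrayList<>();

    void register(String languageName, QueryLanguage language) {
        languages.put(languageName, language);
    }

    void store(Map<String, String> item) {
        items.add(item);
    }

    /** Looks up the named language, compiles the query with it, and filters the items. */
    List<Map<String, String>> query(String languageName, String queryText) {
        QueryLanguage language = languages.get(languageName);
        if (language == null) {
            throw new IllegalArgumentException("No query language registered as " + languageName);
        }
        Predicate<Map<String, String>> filter = language.compile(queryText);
        List<Map<String, String>> result = new ArrayList<>();
        for (Map<String, String> item : items) {
            if (filter.test(item)) {
                result.add(item);
            }
        }
        return result;
    }

    /** Toy language with queries of the form "field=value". */
    static QueryLanguage equalsLanguage() {
        return text -> {
            int eq = text.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("Expected field=value: " + text);
            }
            String field = text.substring(0, eq).trim();
            String value = text.substring(eq + 1).trim();
            return item -> value.equals(item.get(field));
        };
    }

    /** Toy language with queries of the form "field~substring". */
    static QueryLanguage containsLanguage() {
        return text -> {
            int tilde = text.indexOf('~');
            if (tilde < 0) {
                throw new IllegalArgumentException("Expected field~substring: " + text);
            }
            String field = text.substring(0, tilde).trim();
            String part = text.substring(tilde + 1).trim();
            return item -> item.get(field) != null && item.get(field).contains(part);
        };
    }
}
```

Usage would look like `store.query("equals", "title=Hello World")` after registering `equalsLanguage()` under the name "equals". An XPath or XQuery engine could be registered the same way without changing the store's API.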
  11. XQuery support is on the roadmap of JPOX + persistence of XML data and aggregation of multiple data sources including RDBMS and XML. JPOX will enable persistence with the JPA and JDO APIs in the short term in RDBMS databases and enable projection of relational/XML data in multidimensional format (ROLAP). We are always open for new developers or teams like XQOM to join efforts in the XML area or others :)
  12. XQuery support is on the roadmap of JPOX + persistence of XML data and aggregation of multiple data sources including RDBMS and XML
    Excellent. Having this available via FOSS would be extremely useful.
  13. XQuery support is on the roadmap of JPOX + persistence of XML data and aggregation of multiple data sources including RDBMS and XML.

    JPOX will enable persistence with the JPA and JDO APIs in the short term in RDBMS databases and enable projection of relational/XML data in multidimensional format (ROLAP).

    We are always open for new developers or teams like XQOM to join efforts in the XML area or others :)
    This sounds great. Do you have a roadmap or some other documentation outlining the extent of this support? Also, you can contact me outside of this thread if you want to discuss a bit more. We can use email, or an XQOM/JPOX forum or mailing list. Ilya
  14. Recently John Davies and his team at C24 announced support for XPath expressions that can be executed on the in-memory IO objects without being transformed to an intermediate result set first. This also leads me to believe that XQuery, which is a superset of XPath, might be rich enough to facilitate querying of object models
    As Frank points out, C24 provide the ability to execute XPath on in-memory objects. And C24's product Integration Objects (IO) is proof that XQuery is most definitely rich enough to query object models. C24 IO gives you the ability to execute XPath, XQuery and XSLT on in-memory, bound Java beans. And what's more, it's not just for XML. The same applies to proprietary file formats (bound to manually built models) and industry standards (bound to pre-built models such as SWIFT, FIX, CREST, FpML, ISO 20022 / UNIFI, TWIST, ACCORD, EDIFACT, HL7, etc.). So within minutes you can be generating highly performant Java code, capable of executing XQuery expressions on complex XML structures, running an XSLT against in-memory data extracted from a plain text file, or converting SWIFT messages into XML with associated Schema for downstream processing by other systems. Simon Heinrich, Product Development Director, C24
  15. Recently John Davies and his team at C24 announced support for XPath expressions that can be executed on the in-memory IO objects without being transformed to an intermediate result set first. This also leads me to believe that XQuery, which is a superset of XPath, might be rich enough to facilitate querying of object models


    As Frank points out, C24 provide the ability to execute XPath on in-memory objects. And C24's product Integration Objects (IO) is proof that XQuery is most definitely rich enough to query object models.

    C24 IO gives you the ability to execute XPath, XQuery and XSLT on in-memory, bound Java beans. And what's more, it's not just for XML. The same applies to proprietary file formats (bound to manually built models) and industry standards (bound to pre-built models such as SWIFT, FIX, CREST, FpML, ISO 20022 / UNIFI, TWIST, ACCORD, EDIFACT, HL7, etc.).

    So within minutes you can be generating highly performant Java code, capable of executing XQuery expressions on complex XML structures, running an XSLT against in-memory data extracted from a plain text file, or converting SWIFT messages into XML with associated Schema for downstream processing by other systems.


    Simon Heinrich
    Product Development Director, C24
    Simon, the reason I mentioned C24 is that it becomes very interesting when dealing with caching and optimizations. John Davies mentioned in another thread that databases are on their way out and in-memory models are going to be more prevalent. I don't completely agree that databases are on their way out; rather, I think that different storage models are going to become a lot more prevalent (in-memory data objects, XML databases, etc.). Executing XQuery on an object model is a great optimization technique, though, since it allows you to communicate transparently with the underlying XML schema, whether that is represented internally as an in-memory object model or persisted in an XML store. What were some of the biggest reasons for adding this feature to C24? Was it the optimization of querying different schemas without having to transform them to a common format? Or were there some other underlying benefits? Ilya
  16. What were some of the biggest reasons for this feature add for C24, was it the optimization of querying different schemas without having to transform them to a common format? Or were there some other underlying benefits?
    C24 IO is all about abstracting data and data access. Some of our clients are happy to use our API to access the abstracted data, but others prefer to abstract the access mechanism as well. XPath, XQuery and XSLT allow them to do this. A common requirement we see from our clients (most of the biggest investment banks around the world) is to transform one messaging standard to another. Sometimes these standards are XML, sometimes not. For XML to XML transformations most people would use XSLT or XQuery. We realised that being able to use the same technology to transform to or from a CSV, flat file, SWIFT or FIX message as well would be seriously powerful. ... and it is! By abstracting the XML Schema meta-model into something that can be used to model (literally) any data structure, we've given our clients the ability to use an abstract, open language (XPath, XQuery, or XSLT) on an abstracted piece of data. ... in fact there's an open edition (free to download) that can do a lot of this stuff on flat files and XML.
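
To give a feel for using an XML query language on data that is not XML, here is a rough JDK-only sketch that binds a CSV file to a tree and then runs XPath 1.0 over it. It is not how C24 IO works: as described above, IO queries bound Java beans directly, whereas this sketch takes the simpler route of building an intermediate DOM. The `CsvXPathSketch` class, the `rows`/`row` element names, and the sample data are assumptions, and the header cells are assumed to be valid XML element names.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical helper: exposes simple CSV data to XPath by binding it to a DOM tree. */
final class CsvXPathSketch {
    /**
     * Builds rows/row/column elements from CSV text whose first line is a header.
     * Only handles plain comma-separated cells (no quoting or escaping).
     */
    static Document toDom(String csv) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("rows");
            doc.appendChild(root);
            String[] lines = csv.trim().split("\\r?\\n");
            String[] header = lines[0].split(",");
            for (int i = 1; i < lines.length; i++) {
                String[] cells = lines[i].split(",", -1);
                Element row = doc.createElement("row");
                for (int c = 0; c < header.length && c < cells.length; c++) {
                    Element cell = doc.createElement(header[c].trim());
                    cell.setTextContent(cells[c].trim());
                    row.appendChild(cell);
                }
                root.appendChild(row);
            }
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    /** Returns the string value of an XPath expression evaluated over the bound data. */
    static String select(Document doc, String expression) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(expression, doc);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        Document trades = toDom("symbol,qty,price\nIBM,100,95.5\nMSFT,50,30.25");
        System.out.println(select(trades, "/rows/row[symbol='MSFT']/qty"));
    }
}
```

The same `select` call would work unchanged on a document parsed from real XML, which is the appeal of putting one query language over several formats.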