Discussions

News: Roy Fielding explains JSR 170 content repository; Revolutionary?

  1. Roy Fielding of REST fame has written an in-depth overview of the new content repository API (pdf) defined by JSR 170.

    The Content Repository for Java API (JCR) is a standard API that abstracts access to content repositories such as those exposed by Portal/CMS products or Source Control systems. It also defines a generic, hierarchical data model based on extensible node types and content properties.

    Its goal is "to abstract the details of application data storage and retrieval such that many different applications can use the same interface, for multiple purposes, without significant performance degradation. Content services can then be layered on top of that abstraction to enable software reuse and reduce application development time."

    Roy compares JCR with the underlying infrastructure of the web in that they both form a content-centric interface focused on the uniform nature of content rather than the specific controls of any given application. As a result, Roy concludes that JCR "is poised to revolutionize the development of J2SE/J2EE apps in the same way that the Web has revolutionized the development of network-based apps."

    JSR 170 is currently in proposed final draft. It was submitted by Day Software, makers of the Communique portal product. Roy is co-founder of Day, as is David Nuescheler, who is the JSR 170 spec lead.

    By having a standard data model for content as well as a healthy vendor and open source marketplace in support of it, it is possible that use of this API could go far beyond Portal/CMS scenarios and into mainstream application development as an alternative to the typical data-centric approach. Will this be as revolutionary as Roy claims?
  2. It has been quite a while since I have looked at JSR170, but it seemed complex. Hardly what I would compare to a REST interface. Also, it seemed to me that many of the areas that would need to be defined in order to be able to simply swap content management system back ends are not. Specifically I mean the organization of the content and metadata in general.

    Admittedly, I did not look at it in detail -- it scared me off right away. Perhaps my initial impression was off? I would be interested to hear what others (others that have read it in detail and are not a content management vendor) have to say.
  3. And why would you compare it to a REST interface? It has nothing to do with it. First, REST is an architectural style, not an API, and second, it solves completely different issues.

    But I agree that the spec lacks a description of the motivation behind its features. Thus it is a hard read.

    I also agree that it is a pity that some parts have been left out. One area that hits us at Magnolia [1] (an open source CMS that has leveraged JSR-170 since its inception) is that security aspects are out of scope - in effect that means each repository has its own way of dealing with who is accessing which information. Now, while some repositories will deliver added value, e.g. user management, others will not. In effect your app either gets locked into a repository, or you will have to provide your own user management - which is what we do, or else it would not be possible to switch repositories.


    On the other hand, the uniform API allows us and our templaters to access any business information via the same API once it's ubiquitous. That's quite a nice outlook.

    And, speaking of enterprise integration aspects, if you need to integrate n systems, you will have to write n*(n-1)/2 connectors. With a standard API, this can be drastically simplified to writing n connectors (i.e. each system provides the JSR-170 API, and that's it).
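
    To put rough numbers on that, here is a minimal Python sketch. It assumes one connector per pair of systems in the point-to-point case (n*(n-1)/2) and one adapter per system when everything speaks a shared API; the function names are made up for illustration.

```python
def point_to_point_connectors(n: int) -> int:
    """Connectors needed when every pair of n systems is linked directly.

    Assumption: one (bidirectional) connector per pair, i.e. n*(n-1)/2.
    """
    return n * (n - 1) // 2


def shared_api_connectors(n: int) -> int:
    """Connectors needed when each system only implements one shared API."""
    return n


if __name__ == "__main__":
    for n in (2, 5, 10, 20):
        print(n, point_to_point_connectors(n), shared_api_connectors(n))
```

    The gap grows quadratically, so the saving only becomes interesting once more than a handful of systems are involved.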

    Sounds good to me.

    Regards
    Boris Kraft
    ---------------------------
    Magnolia Content Management
    http://www.magnolia.info/
    ---------------------------

    [1] http://www.magnolia.info
  4. I have looked at JSR170, but it seemed complex. Hardly what I would compare to a REST interface.

    +1.

    RSS seems better.

    .V
  5. I have looked at JSR170, but it seemed complex. Hardly what I would compare to a REST interface.
    +1. RSS seems better. .V

    I disagree, I think tunafish is better...

    WTF are you talking about? These aren't the same at all...
  6. REST and RSS are simple.

    JSR 170 is complex.
    It seems that REST and 170 were done by a different guy.

    Tunafish tastes good.
    .V
  7. *blank stare*

    Well my hello world example is simpler than both REST and RSS, so clearly it's superior.
  8. Well my hello world example is simpler than both REST and RSS, so clearly it's superior.

    Some people believe this is the goal of programming; you can see names you recognize:
    http://www.sandrasf.com/kiss

    The REST author argues for simplicity in one case.

    My belief is that RSS derivatives will be the basis for CMS.

    .V
  9. First Roy Fielding writes:

    "A traditional application uses multiple data stores during its operation. For example, a typical email application will store its configuration in a property list, its address book in a table, messages within indexed files (folders), message properties in separate tables, and search indices in a binary hash. In most cases, each of those storage formats would have their own interface. The application developer would spend a significant portion of the development effort designing, creating, and maintaining those interfaces.

    In contrast, a content repository API separates the issues of content storage and efficient retrieval. The application developer defines how the content is identified via the interface, writes the content, and then uses the built-in services of the API to perform efficient retrieval in a variety of modes: individual reads, traversals of related data, hierarchical search, and full database query. The real storage format is separated from the application interactions, allowing the most appropriate storage subsystem to be selected based on observing the actual performance of the application, rather than by making a premature decision early in the application’s design. The application developer doesn’t have to worry about parsing file formats, maintaining search indices for text content, managing transactions, or exporting data between applications; content services like those can be provided by a repository API without being specific to the application."

    A very nice description of how some parts of a database management system should work.

    But then he goes on to say:
    "Designing an API such that it can be independent from both applications and underlying data stores is a challenge. JCR has met that challenge through a generic, hierarchical data model..."

    Before about 1970 we had hierarchical and network-based data models implemented in the database management systems of the time.
    Then came a rather smart man called Codd and showed us all that a relational storage model for data was far superior to both of the previous models (and all others known at the time). This was later backed up by a lot of academic work as well as practical products.

    Now in 2005 comes a Roy Fielding and writes that the hierarchical data model (albeit with some additional sugar) is revolutionary.
    Strange.
    Is it?
  10. JSR-170 is not about a new way of storing _data_ - it's about storing _content_.

    Though content could indeed be stored in relational data storage on the layer below the content repository, it differs from generic relational data in some aspects.

    JCR API is geared towards working with, you guessed it, content-rich websites and similar document-oriented systems (wikis, groupware, knowledge bases, issue trackers, CRMs).

    - Hierarchies in JCR map 1:1 to URLs.
    - Workspaces and their links are a representation of the staging vs. production paradigm.
    - It contains a built-in search engine language.
    - Relaxed typing and mix-ins (multiple inheritance à la Java interfaces) allow you to define various document types and put constraints on them.

    Personally, I found JCR API to be quite convenient (as opposed to pure relational storage) for these typical content-management tasks:
    - given a /kind/of/structured/type/of/URL, retrieve a document
    - given a document, find among its children all documents corresponding to given criteria (e.g. last additions to a section of website)
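
    A rough sketch of those two tasks, using a toy in-memory tree in Python rather than the real JCR API. The Node class, its methods, and the sample paths and properties are all hypothetical and only illustrate the idea of path lookup plus filtering children.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Callable, Dict, List, Optional


@dataclass
class Node:
    """Toy content node (NOT the JCR Node interface): name, properties, children."""
    name: str
    properties: Dict[str, object] = field(default_factory=dict)
    children: Dict[str, "Node"] = field(default_factory=dict)

    def add(self, child: "Node") -> "Node":
        """Attach a child under this node and return the child."""
        self.children[child.name] = child
        return child

    def get(self, path: str) -> Optional["Node"]:
        """Resolve a URL-style path such as /news/jsr170, or return None."""
        node: Optional["Node"] = self
        for segment in path.strip("/").split("/"):
            if not segment:
                continue  # tolerate "/" and empty segments
            node = node.children.get(segment)
            if node is None:
                return None
        return node

    def find_children(self, predicate: Callable[["Node"], bool]) -> List["Node"]:
        """Return the direct children that satisfy the given criteria."""
        return [child for child in self.children.values() if predicate(child)]


# Hypothetical sample content.
root = Node("")
news = root.add(Node("news"))
news.add(Node("jsr170", {"added": date(2005, 5, 1)}))
news.add(Node("older-story", {"added": date(2004, 1, 15)}))

# Task 1: given a structured URL, retrieve a document.
doc = root.get("/news/jsr170")

# Task 2: given a document, find children matching some criteria
# (here: additions to the section since the start of 2005).
recent = news.find_children(lambda n: n.properties["added"] >= date(2005, 1, 1))
```

    With plain relational storage, the path lookup alone would usually mean one join or one query per path segment, which is the convenience being pointed at here.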
  11. ...Before about 1970 we had hierarchical and network-based data models implemented in the database management systems of the time. Then came a rather smart man called Codd and showed us all that a relational storage model for data was far superior to both of the previous models (and all others known at the time)....

    Relational is superior to hierarchical? In what way? It's efficient in terms of storing relationships, but piss poor in terms of navigation. I can't go from one object to its child without going through a relation. If relational were superior in every way, then instead of having objects with pointers to other objects in memory, we'd have relational memory models.

    I think the point of the revolution in the hierarchical data model is that the abstraction is hierarchical. When I work with PurchaseOrder 123, I should know it has a list of items. I shouldn't have to think "Let me find all items in the Item collection that have a PurchaseOrder id of 123".

    I don't think there's anything that says this can't be backed by relational databases, but hierarchical abstraction is powerful.
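
    A small Python sketch of the PurchaseOrder point, contrasting the two ways of reaching the same items. The Item and PurchaseOrder classes and the sample data are hypothetical; this is not JCR or SQL, just the shape of the two abstractions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    sku: str
    order_id: int  # foreign-key style back-reference, used by the relational view


@dataclass
class PurchaseOrder:
    order_id: int
    items: List[Item] = field(default_factory=list)  # children, used by navigation


# Relational-style view: one flat Item collection, filtered by order id.
item_collection = [Item("pen", 123), Item("ink", 123), Item("pad", 456)]
items_by_query = [item for item in item_collection if item.order_id == 123]

# Hierarchical-style view: the order simply owns its items.
order = PurchaseOrder(123, [Item("pen", 123), Item("ink", 123)])
items_by_navigation = order.items
```

    Both views yield the same two items; the difference is only in what the caller has to think about, and the hierarchical view could still be backed by the flat collection underneath.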
  12. Relational is superior to hierarchical? In what way? It's efficient in terms of storing relationships, but piss poor in terms of navigation. I can't go from one object to its child without going through a relation. If relational were superior in every way, then instead of having objects with pointers to other objects in memory, we'd have relational memory models. I think the point of the revolution in the hierarchical data model is that the abstraction is hierarchical. When I work with PurchaseOrder 123, I should know it has a list of items. I shouldn't have to think "Let me find all items in the Item collection that have a PurchaseOrder id of 123". I don't think there's anything that says this can't be backed by relational databases, but hierarchical abstraction is powerful.

    When you have independent documents related only for navigation purposes, the hierarchical model is good, but when you have a real relation in the data, like book <-> author, then you need the relational model.
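
    To illustrate the book <-> author case, here is a minimal Python sketch of a many-to-many relation kept as a plain link table. The names and sample rows are invented for illustration; nothing here is taken from JSR 170.

```python
from typing import List, Tuple

# Hypothetical link table: one (book, author) row per relationship.
book_author: List[Tuple[str, str]] = [
    ("jcr-intro", "alice"),
    ("jcr-intro", "bob"),
    ("rest-notes", "alice"),
]


def authors_of(book: str) -> List[str]:
    """All authors of a book, found by scanning the relation."""
    return sorted(author for b, author in book_author if b == book)


def books_by(author: str) -> List[str]:
    """All books by an author, found by scanning the same relation."""
    return sorted(b for b, a in book_author if a == author)
```

    Neither direction is the "parent" of the other, which is why a single tree has trouble expressing it: a book would have to live under each of its authors, or the reverse.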

    Unfortunately, in many applications you need both. If you start with a relational database you may spend a lot of time implementing all the standard CMS features, but if you start with a CMS, then you go right into the most advanced and difficult part of the documentation to implement relations.


    So, the real issue is easy JSR 170/Relational mapping.

    Nebojsa
  13. Roy is a Day Founder?

    Wow. Since when did anyone get a way to change history?

    Being an ex-Day employee I can safely say Roy was brought in as a Chief Scientist to help David Nuescheler (CTO of Day) to be taken seriously by everyone.

    Roy pulls his weight in the Apache Foundation, hence Day needs Roy.

    As for the API, well, I have yet to work with it, so I reserve my judgement.