Opinion: What is this new Service Data Objects thing?


  1. Many people have been talking about the new Service Data Objects (SDO) specification that IBM and BEA recently released, and many questions have come up: "Does it mean we don't need JDO?", "How does it fit in with JSR 227?", and more. James Strachan thinks he has grokked it, and has written up his thoughts on what SDO is and how it integrates with other technologies (rather than making them obsolete).

    James says, "SDO is an ideal abstraction to represent blobs ('documents') of data that pass around a distributed system (e.g. the Data Transfer Object pattern often used in EJB / MOM). It's also a nice simple abstraction above XML Schema types, SQL DDL, UML and MOF stuff and the like - I'm surprised at how simple & clean the API is - nice job!"

    His view on how it fits into JSR 227 is: "JSR 227 could be a layer on top of this, providing standard UI controls & MVC layer for dealing with UI related issues but using SDO as the underlying model for data."

    This starts to make sense now, doesn't it? James also takes us tip-toeing through the tulips as he explains how SDO can fit quite nicely with REST: "So SDO could work quite nicely together with servlets + REST as well as with web services & EJBs & SQL etc."

    He also has to mention his cool dynamic Java-ish language Groovy :)

    Resources

    Read James' first thoughts in: SDO rocks! Great job IBM & BEA
    Then read his response to others' reactions to the spec in: SDO follow up
    Finally check out Groovy, the dynamically typed language for the JVM.

  2. Some parts of SDO are very similar to CarrierWave (http://carrierwave.sourceforge.net/). I will show how CarrierWave may be used on top of Hibernate in my presentation next week at JavaPolis, and also talk about the "Detached Objects" support that Hibernate had from the start. Not sure if I can also get some comparison or overview about SDO in...
  3. Agreed. Quite a few folks have done similar things before: DynaBeans at Jakarta Commons, PropertySet at OpenSymphony etc. The cached RowSet is similar in concept - though lower level & SQL specific.

    Hopefully SDO can take off as the standard API to all of this stuff (arbitrary structured graphs of data). SDO could well become the 'data DOM' for those who don't need to access the XML InfoSet yet want to work with typed XML data.

    Then hopefully UI frameworks and maybe even web app frameworks can all build standard connectors/adaptors to SDO...

    James
    Core Developers Network
  4. my take:

    Here is my blog on the subject, pre-written for when our new blog goes live. (Sorry the tone is not appropriate for a TSS comment.)

    We are taking a close look at SDO. It's an interesting spec that comes a bit out of left field. My reading is that it provides a mechanism for manipulating and especially for externalizing graphs of objects or things that look sufficiently close to objects to be meaningfully represented as a "graph". For example, XML.

    This is naturally very important to us, since one of the significant things we have tried to achieve with Hibernate is to get away from the notion of "location transparency", and reinvent "distributed objects" as object graphs which may be moved between different processes. Especially, we are interested in the idea that a graph could be retrieved from the persistent store in one process, modified in another, and then have those modifications propagated back to the database in the first process, all with optimistic semantics.

    So, our biggest problem in all this is that tracking modifications to typesafe objects precisely is extremely difficult in Java without significant bytecode tricks (which we have been so far unwilling to adopt). SDO bypasses this problem by representing objects in a nontypesafe way (contrary to our notion of transparency). I'm not yet convinced that this is worth dropping the advantages of typesafeness for, though I recognize that the authors of the SDO spec are looking for an approach that abstracts away from POJOs, EJBs, DOMs, whatever.

    It has been suggested that Jakarta DynaBeans are a useful analogy when looking at this stuff, but I don't find this very useful at all. We are trying to figure out some kind of relationship to CarrierWave, which is also all about working with object graphs (Christian got excited about CarrierWave a while ago). Our biggest stumbling block so far is that SDO doesn't seem to address one of the main problems solved by CarrierWave: namely specifying where the object graph ends. My understanding is that this is left as an exercise for the reader, and for whatever native query language is in use, e.g. HQL, XPath, etc. Anyway, whereas DynaBeans are a workaround for a specific limitation in the design of Struts, this is an approach to solving some of the more difficult problems in building distributed systems using domain models.
  5. my take:

    Thanks Gavin for your thoughtful comments on this. This is so much more useful than mindless "SDO Rocks!" posts with no substance. Looking forward to your blog.

    /T
  6. my take:

    Hi Gavin

    It's good to hear you're interested in SDO too.


    > I'm not yet convinced that this is worth dropping the advantages of typesafeness

    FWIW SDO is typesafe - it's just dynamically typesafe rather than statically typesafe, in the same way that Ruby/Python/Groovy are dynamically typesafe whereas Java/C++ are statically (compile time) typesafe.

    As an interesting aside; inside a dynamically typed language for the JVM like Groovy / Jython / JRuby, there would be no difference between using SDO or using POJOs. They'd have the same look & feel and have the same typesafety - it'd just be a runtime rather than compile time check.
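
    To make the static versus dynamic distinction concrete, here is a minimal sketch. It is not the SDO API: DynamicRecord and its methods are hypothetical names invented for illustration, and the schema is assumed to be a simple map from property name to Java class. The only point it tries to show is that the type check moves from the compiler to the moment a value is set.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a dynamically typed data object; NOT the real SDO API.
final class DynamicRecord {
    private final Map<String, Class<?>> schema;               // property name -> declared type
    private final Map<String, Object> values = new LinkedHashMap<>();

    DynamicRecord(Map<String, Class<?>> schema) {
        this.schema = new LinkedHashMap<>(schema);
    }

    // The type check happens here, at runtime, instead of in the compiler.
    void set(String property, Object value) {
        Class<?> declared = schema.get(property);
        if (declared == null) {
            throw new IllegalArgumentException("No such property: " + property);
        }
        if (value != null && !declared.isInstance(value)) {
            throw new ClassCastException(property + " expects " + declared.getSimpleName()
                    + " but got " + value.getClass().getSimpleName());
        }
        values.put(property, value);
    }

    Object get(String property) {
        if (!schema.containsKey(property)) {
            throw new IllegalArgumentException("No such property: " + property);
        }
        return values.get(property);
    }

    // Typed convenience accessor, in the spirit of the getInt-style helpers mentioned in this thread.
    int getInt(String property) {
        return (Integer) get(property);
    }
}

class DynamicTypingSketch {
    public static void main(String[] args) {
        Map<String, Class<?>> schema = new LinkedHashMap<>();
        schema.put("name", String.class);
        schema.put("age", Integer.class);

        DynamicRecord person = new DynamicRecord(schema);
        person.set("name", "James");
        person.set("age", 30);
        System.out.println(person.getInt("age"));

        try {
            person.set("age", "thirty");   // compiles fine, fails when executed
        } catch (ClassCastException e) {
            System.out.println("Rejected at runtime: " + e.getMessage());
        }
    }
}
```

    A statically typed bean would reject the bad assignment at compile time; here the same mistake is still caught, just later.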


    > for, though I recognize that the authors of the SDO spec are
    > looking for an approach that abstracts away from POJOs, EJBs, DOMs, whatever.

    It gives people a choice. Use static or dynamic typing. Choice is good. Sometimes having a static set of beans for your data makes sense - particularly if you have heaps of business logic dependent on your persistent schema. Sometimes having a dynamically typed blob of data makes sense - where you have a relatively small amount of business logic but you have heaps of UI code or are working in a very distributed and loosely coupled system.

    Sometimes using both POJOs and SDO on the same system makes sense too (using SDO for the query/reporting part where there's little business logic and it's mostly generic queries, and using POJOs for your core domain model).

    SDO is useful when the exact schema is not that relevant or when the schema is dynamic. e.g. most UI code, when you're calling stored procedures, when you're using some legacy database or when you're using some external web service.

    To use Martin Fowler's patterns, SDO is useful when you need a Data Transfer Object pattern, a Table Module or Record Set abstraction or when you put most of your business logic in a Transaction Script. Where SDO really comes into its own is that a standard set of UI widgets can display & edit the data, rather like MS's ADO. So if you want a table/tree/master/detail UI, you'll get it for free with SDO - rather than having to write your own custom UI models/adapters for your POJOs.

    Another way to look at SDO is that it promotes simple Java value objects without requiring code generation or O/R mapping. See this interesting discussion on representation versus encapsulation.


    > Our biggest stumbling block so far is that SDO doesn't seem to address one of the main problems solved by CarrierWave: namely specifying where the object graph ends

    But SDO is an API. The amount of state read from disk/file/persistent store per navigation is an implementation issue. I don't see how or why a 'fetch size' should be exposed to an API? It's like with many OODBMS or O/R mapping tools; you can configure how deeply objects are loaded & traversed as they are navigated to & what the 'fetch' size should be & what 'read ahead' should be set to etc.

    Configuring this stuff properly typically depends on the user's code. Different people traverse data in different ways for different reasons and the fetch policy should really reflect that for optimal performance. So which part of the graph is fetched may well depend on what navigations the user does (i.e. which kind of user you are).

    So this kind of thing is configuration detail - not something the programmer needs to worry about in their application code, so it shouldn't be part of the API. If a developer ever needs to know what the fetch-depth is, you could make this available as metadata on the SDO API.


    > It has been suggested that Jakarta DynaBeans are a useful analogy when looking at this stuff, but I don't find this very useful at all

    Really? I think the design of DynaBeans is very close to SDO; similar dynamically typed model, similar meta-data ability, similar navigation language etc. There's really not much difference between the two.

    e.g. the DynaBean interface maps pretty closely to SDO's DataObject interface, the DynaClass interface is close to Type in SDO, and DynaProperty is very close to SDO's Property interface.

    They are both very similar in design; the main differences are that SDO provides typesafe helpers (getInt etc) and that SDO supports a ChangeSummary for capturing the changes to an entire graph. Also SDO adds DataGraph which is just a wrapper for the root of a tree. Other than that they are very similar.

    Be that as it may they both solve the same kinds of problems and have comparable designs.


    > Anyway, whereas DynaBeans are a workaround for a specific limitation in the design of Struts

    Not really; DynaBeans came around as a workaround for a limitation in the JavaBeans spec. Namely that you cannot create beans with an arbitrary set of property names at runtime (without bytecode manipulation). i.e. JavaBeans is statically typed - not dynamically typed like SDO / DynaBeans.

    To be precise, it's actually a limitation of the JavaBeans Introspector, which is class based rather than instance based. i.e. the introspector just takes a class to determine what properties are available and their types, so you must create a new class to have a bean with different properties. If this limitation were lifted then DynaBeans maybe wouldn't exist - or at least DynaBeans would be JavaBeans.
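
    The class-based nature of the Introspector can be seen with a few lines against the standard java.beans package. This is only a sketch: the Customer bean and the propertyNames helper are made up for the example, while Introspector.getBeanInfo(Class, Class) and PropertyDescriptor are the standard JDK API.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Set;
import java.util.TreeSet;

class IntrospectorSketch {
    // Illustrative bean; the class and its properties are invented for this example.
    public static class Customer {
        private String name;
        private int age;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public int getAge() { return age; }
        public void setAge(int age) { this.age = age; }
    }

    // The Introspector is handed a Class, never an instance, so the set of
    // properties is fixed per class and cannot vary from object to object.
    static Set<String> propertyNames(Class<?> beanClass) {
        try {
            // Stop at Object.class so the inherited "class" property is not reported.
            BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
            Set<String> names = new TreeSet<>();
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                names.add(pd.getName());
            }
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(propertyNames(Customer.class));
    }
}
```

    Giving one particular Customer an extra property at runtime is not expressible here without defining another class, which is the gap DynaBeans (and SDO) fill with per-instance metadata.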

    James
    Core Developers Network
  7. my take:

    > Not really; DynaBeans came around as a workaround for a limitation in the
    > JavaBeans spec. Namely that you cannot create beans with an arbitrary set of
    > property names at runtime (without bytecode manipulation). i.e. JavaBeans is
    > statically

    Sounds like a problem much better addressed with a solid AOP framework rather than adding yet another useless API to the J2EE platform.
  8. SDO v DynaBeans

    Hi James :)

    >> FWIW SDO is typesafe - its just dynamically type safe rather than statically typesafe. e.g. in the same way that Ruby/Python/Groovy are dynamically typesafe rather than Java/C++ which are statically (compile time) typesafe <
    Sorry I was speaking imprecisely. This is a question of dynamic vs static typing, not strong vs weak typing. I love scripting languages, and I love Smalltalk even more. But the trouble with me is I have a shockingly bad memory for detail and I just *need* my autocomplete these days ;)

    Of course there are many times when dynamic models are great but, for now, I still prefer POJOs for domain models. I'm open-minded however.




    >> But SDO is an API. The amount of state read from disk/file/persistent store per navigation is an implementation issue. I don't see how or why a 'fetch size' should be exposed to an API? Its like with many OODBMS or O/R mapping tools; you can configure how deeply objects are loaded & traversed as they are navigated to & what the 'fetch' size should be & what 'read ahead' should be set to etc. <
    You are missing the fact that SDO promotes the passing of graphs between tiers and processes. In this context it is absolutely /critical/ that we are able to specify exactly where the fetched graph ends. (This is what is offered by CarrierWave.)

    In the context of ORM tools (thanks for picking that one hehe), it is absolutely NOT enough to specify lazy/eager fetching at the "configuration" level. This is why the state of the art tools provide very sophisticated ways of specifying the depth of the fetched graph in the API itself. You are thinking of something more primitive like .... ooohh .... entity beans ;) For example, Hibernate exposes this via the query language. Check it out ;)
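
    As a rough illustration of "specifying where the fetched graph ends", here is a sketch that detaches a subgraph by following only an explicit list of association paths. It is neither Hibernate's nor CarrierWave's API; GraphNode, FetchPlanSketch and the dotted path convention are all assumptions made for the example, and only single-valued references are modelled.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical node in an object graph: an id plus named references to other nodes.
final class GraphNode {
    final String id;
    final Map<String, GraphNode> references = new LinkedHashMap<>();

    GraphNode(String id) {
        this.id = id;
    }
}

final class FetchPlanSketch {
    // Copies the root plus only those references whose dotted path from the root
    // (for example "account" or "account.bank") is listed in fetchPaths.
    static GraphNode detach(GraphNode root, Set<String> fetchPaths) {
        return copy(root, "", fetchPaths);
    }

    private static GraphNode copy(GraphNode source, String prefix, Set<String> fetchPaths) {
        GraphNode result = new GraphNode(source.id);
        for (Map.Entry<String, GraphNode> ref : source.references.entrySet()) {
            String path = prefix.isEmpty() ? ref.getKey() : prefix + "." + ref.getKey();
            // Anything not named in the plan is cut off: this is where the graph ends.
            if (fetchPaths.contains(path)) {
                result.references.put(ref.getKey(), copy(ref.getValue(), path, fetchPaths));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        GraphNode customer = new GraphNode("customer");
        GraphNode account = new GraphNode("account");
        customer.references.put("address", new GraphNode("address"));
        customer.references.put("account", account);
        account.references.put("bank", new GraphNode("bank"));

        Set<String> plan = new HashSet<>(Arrays.asList("account"));
        GraphNode detached = detach(customer, plan);
        System.out.println(detached.references.keySet());
    }
}
```

    The boundary is chosen by the caller for each request rather than fixed in mapping configuration, which is the distinction being argued for above.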




    >> Really? I think the design of DynaBeans - it is very close to SDO; <
    I don't know anyone who is using DynaBeans to externalize graphs of data between processes. And if I did, I would tell them to *stop* - they are not appropriate for this problem ;) In particular they offer no good way to "remember" modifications made in a different tier and allow those modifications to be propagated back to the persistent store.

    Furthermore, I don't see any functionality in beanutils for transforming a DOM graph, or a graph of POJOs or entity beans to a graph of DynaBeans. Isn't that the whole point of SDO?

    So how are they similar, except on a superficial level?

    >> SDO supports a ChangeSummary for capturing the changes to an entire graph <
    Exactly. To me this seems to be the crux of what SDO offers.

    >> Be that as it may they both solve the same kinds of problems <
    Really? Moving graphs of objects around a distributed architecture is the same kind of problem as displaying and capturing data to/from a JSP page?

    I dunno, perhaps I misinterpret the intention of one or the other of these....


    >> Not really; DynaBeans came around as a workaround for a limitation in the JavaBeans spec. Namely that you cannot create beans with an arbitrary set of property names at runtime (without bytecode manipulation). <
    The JavaBeans spec "supports" java.util.Map. I do not see any reason on earth why Struts Action validation and Struts tag libraries could not also support Map. We recently ripped DynaBean support out of Hibernate because the community concluded that DynaBeans didn't offer anything useful that Maps don't.

    Now, OTOH, if, as is the intent with SDO, DynaBeans offered some functionality for representing DOMs, POJOs, EJBs via a unified interface, /that/ would be useful!


    peace
    Gavin
  9. Maybe both?

    >> FWIW SDO is typesafe - its just dynamically type safe rather than statically typesafe. e.g. in the same way that Ruby/Python/Groovy are dynamically typesafe rather than Java/C++ which are statically (compile time) typesafe <

    > Sorry I was speaking imprecisely. This is a question of dynamic vs static typing, not strong vs weak typing. I love scripting languages, and I love SmallTalk even more. But the trouble with me is I have a shockingly bad memory for detail and I just *need* my autocomplete these days ;)
    >
    > Of course there are many times when dynamic models are great but, for now, I still prefer POJOs for domain models. I'm open-minded however.

    Why not have a JavaBean that implements the SDO spec? That way you get strongly typed code plus all the fancy SDO graph features and additional dynamic properties. (I did something vaguely like this before at joda.org, the problem was my implementation was poor and performed really badly :-()

    Stephen
  10. Maybe both?

    Stephen

    It should be fairly simple to create an SDO facade to any tree of POJOs. In fact this sounds like a useful helper class to me.

    The only problem with this approach is folks could change the POJO underneath the SDO facade; though some bytecode jiggery pokery could be used to ensure changes at the POJO layer are reflected up to the SDO wrapper.
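
    A facade of this kind can be sketched with nothing more than java.beans and reflection. PojoFacade and the Account bean below are hypothetical names, not part of SDO or any library; the sketch only shows name-based get/set over an ordinary bean and ignores nested objects, read-only properties and change tracking.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.InvocationTargetException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical facade: exposes any JavaBean through get/set by property name.
final class PojoFacade {
    private final Object target;
    private final Map<String, PropertyDescriptor> properties = new HashMap<>();

    PojoFacade(Object target) {
        this.target = target;
        try {
            for (PropertyDescriptor pd : Introspector
                    .getBeanInfo(target.getClass(), Object.class)
                    .getPropertyDescriptors()) {
                properties.put(pd.getName(), pd);
            }
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    Object get(String name) {
        try {
            return descriptor(name).getReadMethod().invoke(target);
        } catch (IllegalAccessException | InvocationTargetException e) {
            throw new IllegalStateException(e);
        }
    }

    void set(String name, Object value) {
        try {
            descriptor(name).getWriteMethod().invoke(target, value);
        } catch (IllegalAccessException | InvocationTargetException e) {
            throw new IllegalStateException(e);
        }
    }

    private PropertyDescriptor descriptor(String name) {
        PropertyDescriptor pd = properties.get(name);
        if (pd == null) {
            throw new IllegalArgumentException("No such property: " + name);
        }
        return pd;
    }
}

class PojoFacadeSketch {
    // Illustrative bean, invented for this example.
    public static class Account {
        private String owner;
        public String getOwner() { return owner; }
        public void setOwner(String owner) { this.owner = owner; }
    }

    public static void main(String[] args) {
        Account account = new Account();
        PojoFacade facade = new PojoFacade(account);
        facade.set("owner", "Gavin");
        account.setOwner("James");               // change made underneath the facade
        System.out.println(facade.get("owner")); // visible, but the facade was never told
    }
}
```

    Because the facade reads through to the bean, a direct change to the POJO is visible but unobserved, which is exactly the caveat raised above: any change log kept by the wrapper would miss it without bytecode help.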

    James
    Core Developers Network
  11. SDO v DynaBeans

    Hey Gavin

    > Of course there are many times when dynamic models are great but, for now, I still prefer POJOs for domain models. I'm open-minded however.

    Agreed. It depends on your application & requirements over how much POJO or SDO you use.


    > You are missing the fact the SDO promotes the passing of graphs between tiers and processes. In this context it is absolutely /critical/ that we are able to specify exactly where the fetched graph ends. (This is what is offered by CarrierWave.)

    Agreed - though I repeat: is it an API requirement? I'm not sure it is - though with the O/R mapping use case, maybe it is :).

    An implementation could decide how much of a graph to marshall from tier to tier irrespective of the API. e.g. the same graph could be marshalled in different ways depending on which tier you're in. Sometimes you might want to marshall more of a graph to certain clients for performance; so I can't help thinking that there might not be 'one' marshalling depth for all users of the same SDO graph.

    Though I guess having an SDO-standard way to describe this stuff is a good thing (maybe SDO's metadata could be used?). An end user (application programmer) probably doesn't care too much about this - maybe there's a need for 'service providers' API for folks writing the marshallers - which (rather like JNDI) could be a separate, lower level API?


    >> Really? I think the design of DynaBeans - it is very close to SDO; <
    > I don't know anyone who is using DynaBeans to externalize graphs of data between processes.

    I was really just talking about the API of DynaBeans and that it's very close to SDO. SDO can be used for other things than to just externalize graphs of data between processes. e.g. it can be used like ADO from Microsoft or a facade to JDBC or web service client etc.

    Folks are using DynaBeans as an API to a dynamically typesafe 'bean' graph. How that graph is fetched & marshalled is an implementation detail.

    However the reason I'm so keen on SDO is (i) that there's a change notification API which is good for O/R mapping tools and (ii) that I suspect there may be more momentum behind it to actually create standard marshallers / facades for EJB, SQL, XSD, WS etc.


    > And if I did, I would tell them to *stop* - they are not appropriate for this problem ;) In particular they offer no good way to "remember"
    > modifications made in a different tier and allow those modifications to be propagated back to the persistent store.

    Agreed. For O/R mapping-like problems (like you look at quite a bit :) then no they don't currently have any change-tracking features. Hence the reason why I'm so glad SDO came along :)

    However if you just want a dynamically typed graph of data (say from an XML document or bunch of SQL queries) DynaBeans are fine. Though now we have SDO I expect it to replace DynaBeans.


    > Furthermore, I don't see any functionality in beanutils for transforming a DOM graph, or a graph of POJOs or entity beans to a graph of
    > DynaBeans. Isn't that the whole point of SDO?

    Again - implementation detail :). DynaBeans comes with a helper class to turn JDBC result sets into DynaBeans and to provide a DynaBean API on top of POJOs / EJBs. commons-sql provides a way to use DynaBeans to proxy a relational database.

    No one's done a DOM / XSD to DynaBean tool yet AFAIK.


    > So how are they similar, except on a superficial level?

    The API is practically identical (apart from no API to track changes).


    >> SDO supports a ChangeSummary for capturing the changes to an entire graph <
    > Exactly. To me this seems to be the crux of what SDO offers.

    Agreed


    >> Not really; DynaBeans came around as a workaround for a limitation in the JavaBeans spec. Namely that you cannot create beans with an arbitrary set of property names at runtime (without bytecode manipulation). <
    > The JavaBeans spec "supports" java.util.Map. I do not see any reason on earth why Struts Action validation and Struts tag libraries could not also support Map.

    Map does not offer metadata introspection of the properties & types available. That's what DynaBeans offers. If you don't need typesafety or metadata introspection, sure just use a Map.


    > Now, OTOH, if, as is the intent with SDO, DynaBeans offered some functionality for representing DOMs, POJOs, EJBs via a unified interface, /that/ would be useful!

    DynaBeans does offer this interface; it just lacks some implementations :) It can handle POJOs, EJBs and SQL today. It just needs work on the DOM/XSD interface.

    Anyways, with the change notification API, SDO does look a better API for us all to use.

    We just need a good open source implementation! :)
  12. Death to DynaBeans?

    Spotted this link today...

    http://sixlegs.com/blog/java/death-to-dynabeans.html

    looks like an interesting alternative to DynaBeans. I guess this could be coupled with an SDO implementation too; so you could have static & dynamic type safety with both real bean & SDO facade.

    James
    Core Developers Network
  13. my take:

    It would be very exciting if Hibernate extended its support for non-typesafe objects.
    The Java world usually uses some kind of JavaBeans for data mapping (EJB, JDO, JAXB).
    I have used Hibernate for a custom application where the data model does not change, and it works perfectly. It is nice to have typesafe objects then, along with code completion in the IDE.
    But this approach simply fails for large systems where we want data discovery. It is not possible to create a dynamic form for a newly created table. It is not possible to write intelligent code which will discover a new table and perform operations on it. And these features are very useful.
    When there are hundreds of tables, such an approach speeds up development. Any large database-dependent application would need this.
    With a typesafe data framework I can't create an engine which takes some XML file describing a form and binds the controls of this form to my dataset. It is simply impossible, because I have to create interfaces for this new table.
  14. In many ways SDO looks a lot like some RDF API. Well, any DLG probably does... After reading the spec, I think there are some good points in it and I hope it will be adopted.

    But apart from the very limited type system, my biggest problem is their notion of closure, meaning all references must eventually be resolvable within the graph. If I consider a particular SDO graph conceptually as a subgraph of a larger, probably even distributed, graph, then it cannot be assumed that this is always possible. The most important intended use-case of SDO is shoving data back and forth between data stores and some front end or remote system. Just as you sometimes get only foreign key values as part of a SQL query result you will have unresolved references in a graph as the result of most queries. I would find it extremely useful to add a notion of resolved/unresolved references.
  15. Except for the API, SDO looks pretty much like what hywy's PE:J has had since 2001.

    Cheers
    Gopalan
  16. It looks rather like something C24's Integration Objects (IO) could do. It already creates Java models from a number of different sources (GUI, XML Schema, Databases etc.) and has XPath navigators, regardless of original source. Without having read the SDO spec in its entirety I guess it would just need to implement an SDO interface.

    -John-
  17. http://www.c24.biz/products.htm

    How many times does "open source" have to be mentioned on a product page?

    "used to generate open Java source code"
    "The deployed components are open source Java components"
    "Deploy financial services integration solutions using Java open source code generation and components based on industry standard"
    "using open technologies." <insert>they forgot to say open source technologies</insert>
    "We provide this as an open source component generated by the C24 IO toolkit"

    What is the difference between "generated Java source code" and "generated open Java source code"? Intelligent answers please and not the obvious, ;-).

    Regards,

    William
  18. Hi William,

    Re: http://www.c24.biz/products.htm

    > What is the difference between "generated Java source code" and "generated open Java source code"? Intelligent answers please and not the obvious, ;-).

    I'm not sure there is a "non obvious" answer to your question. In our market place (wholesale banking) most of the competition sell "engines" that work on proprietary data. This means that when something goes wrong or you're not sure how it works you've got to pay a lot of money to find someone who either knows the product inside out or get the supplier to tell you what you need.

    Companies like Mercator, Tibco, IBM, Helograph, SeeBeyond, Mint etc. all work this way. We get over 80% of our service revenue from supporting products like Mercator; our product (IO) generates a lot of questions for the first few days and then often nothing for months once they get going. We therefore see producing "open" source code as an important distinction from our competitors. We produce a tool for generating Java Bound components and provide the source code with it; the engine can be J2EE, pure J2SE or even Jini's JavaSpaces as we used in MarketConnect.

    You also point out that we have failed to mention the "open source technologies" used in our product. Well I can tell you that we use JAXB, Castor, quite a bit of Apache and a few others. The reason we don't mention it (sort of obvious) is because someone who doesn't know these tools might decide to do some research only to find that Castor, for example, will do just the job they need and we lose the sale before they've even asked about the other things we do.

    I appreciate the feedback about the page and I will have another read through its readability and "openness".

    We have thought about making the tool free for open source projects. It's already used by Robin Roos for modelling XML in the JDO Expert Group, so if anyone's interested I'd be very interested in looking into this SDO stuff (for want of a better word).

    -John-
    C24.biz
  19. Bridging the Gap Between Java Clients and EJBs - UserObject

    Best regards,

    William Louth
    Product Architect
    JInspired - "Tune and Test with Insight"
  20. Dynamic, Adaptive Object Models

    All BEA and IBM did was add marketing and hype to this concept, plus some standards like XPath and XQuery. The concept itself is not new.

    Take a look:

    Dynamic Object Model

    The User Defined Product Framework

    Adaptive Object Model
  21. Giedrius

    I don't think anyone was claiming this was a brand new idea, (the opposite in fact, on this thread we've been highlighting many similar efforts).

    However the neat thing about SDO is it's looking like we'll finally get a standard API to this kinda thing. It's nice when standards follow established practices, rather than inventing something brand new & untried :)

    James
    Core Developers Network
  22. Dynamic, Adaptive Object Models

    Well that's a relief, finally the powers-that-be have realized the value of Adaptive Object Models.

    People seem to forget, standards are not about inventing something new.

    Carlos
  23. Performance

    SDO uses a "Data Graph" to represent a graph of "Data Objects". Data Objects represent the native data (elements, attributes, rows, etc.) plus references to other data objects.

    On update, the "Data Graph" is responsible for recording all changes to its data objects (including new, modified or removed data objects). The "Data Mediator Service" is responsible for applying those changes in Data Graphs to the underlying data source.
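
    The division of labour described above can be sketched in a few lines. This is an illustration under stated assumptions, not the SDO ChangeSummary or Data Mediator Service API: TrackedRecord, ChangeMediatorSketch and the use of a plain Map as the "data source" are all hypothetical. The record remembers the value each property had before its first change, and the mediator applies the changes only if the store still holds those original values (an optimistic check).

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical data object that logs the original value of every changed property.
final class TrackedRecord {
    private final Map<String, Object> values = new LinkedHashMap<>();
    private final Map<String, Object> originals = new LinkedHashMap<>();
    private boolean logging;

    void beginLogging() {
        originals.clear();
        logging = true;
    }

    void set(String property, Object value) {
        // Remember only the value seen before the FIRST change since logging began.
        if (logging && !originals.containsKey(property)) {
            originals.put(property, values.get(property));
        }
        values.put(property, value);
    }

    Object get(String property) {
        return values.get(property);
    }

    // Property name -> value it had when logging began.
    Map<String, Object> changedProperties() {
        return new LinkedHashMap<>(originals);
    }
}

// Hypothetical mediator: pushes logged changes into a store with an optimistic check.
final class ChangeMediatorSketch {
    static void apply(TrackedRecord record, Map<String, Object> store) {
        Map<String, Object> originals = record.changedProperties();
        for (Map.Entry<String, Object> e : originals.entrySet()) {
            if (!Objects.equals(store.get(e.getKey()), e.getValue())) {
                throw new IllegalStateException("Concurrent update on " + e.getKey());
            }
        }
        for (String property : originals.keySet()) {
            store.put(property, record.get(property));
        }
    }

    public static void main(String[] args) {
        Map<String, Object> store = new HashMap<>();
        store.put("price", 10);

        TrackedRecord record = new TrackedRecord();
        record.set("price", 10);      // state as loaded from the store
        record.beginLogging();
        record.set("price", 12);      // change made in another tier

        apply(record, store);
        System.out.println(store.get("price"));
    }
}
```

    Only the changed properties and their original values need to travel back to the tier that owns the data source, which is part of what the overhead question below has to be weighed against.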

    The pros of such an abstraction are quite evident to anybody who has tried to integrate disparate data sources. However, one aspect that would become clear on implementation is the overhead of this abstraction.

    Can I build a heavy transactional system (trading application) or a system that requires me to load large data sets in memory (huge product catalogs) using this abstraction and still meet my performance objectives? Also, the notion of "optimistic concurrency" is limiting to an extent.

    If I have an XML-enabled relational database, I would think about leveraging it from a system design perspective before jumping on the SDO bandwagon.

    --Naren
  24. IBM and BEA voted No.

    If IBM and BEA came out with this spec, why then did they - as the only ones - vote No to the current spec? That just doesn't make sense, unless of course, they didn't write it. But then, who did?
  25. IBM and BEA voted No.

    > If IBM and BEA came out with this spec, why then did they - as the only ones - vote No to the current spec? That just doesn't make sense, unless of course, they didn't write it. But then, who did?


    To the best of my knowledge Oracle submitted JSR 227. IBM and BEA thought the scope was too wide and voted No. Reference: http://web1.jcp.org/en/jsr/results?id=2045
  26. The concept of SDO evolves from the Microsoft .NET Framework. .NET has an extensive data trapping and caching mechanism called DataSet. SDO is also a sort of in-memory database for remote resources. It's not a new feature for J2EE, but it can also be used as a generic in-memory data structure catalog for applications.

    -Sadhasivam.j
  27. WebSphere 6.0, SDO

    Will IBM WebSphere 6.0 implement the Service Data Objects specification?
  28. WebSphere 6.0, SDO

    According to an IBM Software Group presentation, WebSphere 6 will support SDO.
  29. JSR-235

    JSR 235
    Service Data Objects

    http://www.jcp.org/en/jsr/detail?id=235

    http://www.jcp.org/en/jsr/results?id=2346