TSS Article: Using the Digester Component

  1. TSS Article: Using the Digester Component (20 messages)

    In this article, Harshad Oak looks at the Jakarta Commons Digester component, which helps to reduce the complexity involved in parsing XML. He shows you how Digester works on the simple concept of element matching patterns and how you can define rules in Java code as well as in a separate XML file. You'll also see some examples that reflect common XML parsing requirements.

    Read Using the Digester Component

    Threaded Messages (20)

  2. Deployment

    I am using Commons Digester to parse XML into objects. At first everyone was happy with it. Then we deployed to Solaris, parsed the same XML file there, and got a different object. We then simplified our test cases to use exactly the same XML parser, the same code and the same JDK. The result is strange: Commons Digester generates different objects on Windows and Solaris.
  3. Just missing mixed content handling ;-)
    But I'm happy to use Simon Kitching's patch to do it!
    More info at:
    http://www.mail-archive.com/commons-dev at jakarta dot apache dot org/msg38889.html
    Mixed content can handle this type of XML:
    Hi, this is an example of some bold text.
  4. Simpsons & TSS

    Like the Simpsons, TSS looks like it has no stories or news left to offer J2EE readers. Surely there are many more complex issues to discuss than how to use XDoclet, Ant and most of the Jakarta Commons libs.

    No harm intended!
  5. In this article, Harshad Oak looks at the Jakarta Commons Digester component, which helps to reduce the complexity involved in parsing XML.
    I am totally at a loss to understand how this differs from JAXB, which is already the simplest way to map XML to objects. Why should one use this instead of JAXB? Very surprisingly, this article does not use the word JAXB. Some JAXB implementations like EXOLAB CASTOR (I don't work for it :-) already have good enough features to do the mapping, without even needing the minimal XML knowledge that Digester expects.

    -Sanjay
  6. It's been a year since I last looked at Castor, JAXB and Digester.
    Last used Castor version 0.79.

    My observations at that time were :

    JAXB is no doubt the easiest way to map XML to objects.
    However, the generated classes are tightly coupled to the XML schema, thereby providing less flexibility if you want to use them for tasks such as application configuration and add application-specific functionality to those classes.

    Castor allows you to map XML to pre-existing objects in the easiest possible way.

    However, neither JAXB nor Castor provided good support for binding multiple implementations of a base class (for example, creating a collection typed to an abstract base class that holds instances of its subclasses).

    Back then, Digester was the only way out with the above mentioned scenario. Have things changed now?

    -Srikanth
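
    To make the scenario above concrete, here is a minimal sketch of filling a collection of an abstract base class from XML, where an attribute on each element picks the concrete subclass. It uses only the JDK's SAX parser, not Digester itself; the class names, the <shape> element and its type/size attributes are all invented for illustration. As far as I know, Digester's object-create rule can likewise take the class to instantiate from an attribute, but this sketch swaps in a small factory registry instead of class names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.DoubleFunction;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration; none of these names come from Digester, JAXB or Castor.
class PolymorphicListSketch {

    abstract static class Shape {
        abstract double area();
    }

    static class Square extends Shape {
        final double side;
        Square(double side) { this.side = side; }
        @Override double area() { return side * side; }
    }

    static class Circle extends Shape {
        final double radius;
        Circle(double radius) { this.radius = radius; }
        @Override double area() { return Math.PI * radius * radius; }
    }

    // Builds a List<Shape> whose elements are concrete subclasses chosen by the "type" attribute.
    static List<Shape> parse(String xml) {
        Map<String, DoubleFunction<Shape>> factories = new HashMap<>();
        factories.put("square", Square::new);
        factories.put("circle", Circle::new);

        List<Shape> shapes = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if (!qName.equals("shape")) {
                    return;
                }
                DoubleFunction<Shape> factory = factories.get(attrs.getValue("type"));
                if (factory != null) { // unknown types are silently skipped in this sketch
                    shapes.add(factory.apply(Double.parseDouble(attrs.getValue("size"))));
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return shapes;
    }

    public static void main(String[] args) {
        String xml = "<shapes><shape type='square' size='2'/><shape type='circle' size='1'/></shapes>";
        for (Shape shape : parse(xml)) {
            System.out.println(shape.getClass().getSimpleName() + " area=" + shape.area());
        }
    }
}
```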
  7. For 3 or 4 years now, I have been using Digester for reading XML and JDOM for writing my object structures out to XML. I never had the impression that JAXB was "easier" or even the "easiest mapping". However, maybe I should look into it again, though I am still quite happy with the Digester/JDOM combination.

    The only thing that bothered me with Digester some time ago was the handling of multiple and interleaved Namespaces (e.g. when elements have different namespaces than some of their attributes, etc.)
  8. What about XML Beans

    I currently prefer XML Beans:
    http://xml.apache.org/xmlbeans/
  9. The main disadvantage of using XmlBeans is that your business logic gets locked into XmlBeans. The interfaces generated by XmlBeans are not very easy to use: too many method calls....

    Based on Digester I have programmed my own XmlMapper class. This class maps properties of a Java class to attributes and properties. Each Java class then is an XML element.

    Using this method you just define your interface as JavaBeans and let the XmlMapper create/parse the corresponding XML. As far as I understand, this is also how Digester handles the issue.
  10. If you really _must_ use XML for configuration (which you probably don't actually need at all) then XStream is an excellent choice. Whilst it isn't as flexible as Castor when dealing with node names, its performance and simplicity are incredible. Add into the equation that it isn't Apache code, which means that APIs won't be changed on a whim (filling your build logs with deprecation warnings), the performance will be acceptable and the code may actually be production quality, and you have a very compelling solution to a common problem.
  11. Hee, that looks exactly like what I am doing too. Seems like XStream is a simplified version of Digester, like my XmlMapper.

    Indeed, serializing/deserializing is a better naming convention. This is also what I call it myself. The fact that it is XML in the end is just an aspect of an interface (in case you use XML as an interface, like SOAP). And so if you use XML for this purpose, you don't want to end up with code generation like JAX or XmlBeans.
  12. What about JiBX?

    What about JiBX?

    It has some nice mapping functions, lets you map existing objects to existing DTDs/schemas (unlike JAXB), and is very fast and small. Looks like a good alternative!

    http://jibx.sourceforge.net/

    Stephan
  13. What about JiBX?

    Kewl!!
  14. For more info about XML mapping, see http://www.theserverside.com/news/thread.tss?thread_id=21312#94975
  15. Easiest as in less work for "Just in Time" Mapping.
    That does not mean it is easier to understand !!!
  16. Yes. I agree that Castor is tied to schema. It still is. But I can't imagine a scenario (even if it is config. files) where I would want to just parse without bothering about adherence to a schema. Perhaps I have not come across such a scenario. In my experience, not having a schema to validate against messes things up "in the long run". I found JAXB easier because I use tools like XML Spy to just write the schema and VOILA! I have all my classes generated. I know they won't be the best-performing classes, but they met most of my performance requirements.
  17. I see XML just as a portable representation of some complex data structure, mainly used in remote procedure calls like SOAP, JMS messages etcetera.
    XML is therefore just one aspect of these RPC calls. The fact that XML is used should not influence the business logic in which the actual data is maintained. The reason to keep this influence to an absolute minimum is to keep the code generation in your MDA process as clean as possible.
    As far as I understand, JAX(P/B), Castor and XmlBeans all more or less influence your business logic. To use the generated code you have to write an adapter to 'talk' to the generated classes.
    On the contrary, XStream / Digester / JiBX and also my XmlMapper (I stole this idea from Digester) use late binding in the XML (de)serialization process, and therefore fit better in a model-driven software development process.
  18. Yes. I agree that Castor is tied to schema. It still is. But I can't imagine a scenario (even if it is config. files) where I would want to just parse without bothering about adherence to a schema. Perhaps I have not come across such a scenario.
    You might try considering how to write a DTD or schema that describes Tomcat's server.xml file. You will soon find out that it is not technically feasible ... the set of attributes that are relevant for a particular <Valve> element, for example, are totally dependent on the Valve implementation class. And, the set of possible Valve implementation classes is infinite (since individual sites can easily provide their own implementations). The "set properties" rule in Commons Digester handles this scenario elegantly. It's a lot harder to accomplish with most other XML->Object mapping tools.

    Craig
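
    The "set properties" idea described above can be sketched with plain reflection: every attribute on the element is copied onto the matching JavaBean setter of whatever object was created for it, so no schema has to list the attributes in advance. This is only an illustration of the concept, not Digester's actual implementation; AccessLogValve, its properties and the setProperties helper are made up for the example, and only String, int and boolean setters are handled.

```java
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a "set properties" style step; not taken from Digester's source.
class SetPropertiesSketch {

    // Stand-in for a site-specific Valve implementation (invented for this example).
    public static class AccessLogValve {
        private String directory;
        private int rotateDays;

        public void setDirectory(String directory) { this.directory = directory; }
        public String getDirectory() { return directory; }
        public void setRotateDays(int rotateDays) { this.rotateDays = rotateDays; }
        public int getRotateDays() { return rotateDays; }
    }

    // Copies each attribute onto the setter with the matching name; unknown attributes are ignored.
    static void setProperties(Object bean, Map<String, String> attributes) {
        for (Map.Entry<String, String> attr : attributes.entrySet()) {
            String name = attr.getKey();
            String setter = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
            for (Method method : bean.getClass().getMethods()) {
                if (!method.getName().equals(setter) || method.getParameterCount() != 1) {
                    continue;
                }
                Object value = convert(attr.getValue(), method.getParameterTypes()[0]);
                if (value == null) {
                    continue; // parameter type not supported by this sketch
                }
                try {
                    method.invoke(bean, value);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Could not call " + setter, e);
                }
            }
        }
    }

    // Converts the attribute text to the setter's parameter type.
    private static Object convert(String raw, Class<?> type) {
        if (type == String.class) return raw;
        if (type == int.class) return Integer.parseInt(raw);
        if (type == boolean.class) return Boolean.parseBoolean(raw);
        return null; // other types are left out of this sketch
    }

    public static void main(String[] args) {
        // Attributes as they might arrive from <Valve directory="logs" rotateDays="7" .../>
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("directory", "logs");
        attributes.put("rotateDays", "7");
        attributes.put("somethingElse", "ignored");

        AccessLogValve valve = new AccessLogValve();
        setProperties(valve, attributes);
        System.out.println(valve.getDirectory() + " " + valve.getRotateDays());
    }
}
```

    A different Valve class with different setters would work with the same helper unchanged, which is the point being made about an open-ended set of implementations.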
  19. the set of attributes that are relevant for a particular <Valve> element, for example, are totally dependent on the Valve implementation class. And, the set of possible Valve implementation classes is infinite (since individual sites can easily provide their own implementations). The "set properties" rule in Commons Digester handles this scenario elegantly. It's a lot harder to accomplish with most other XML->Object mapping tools. Craig
    I understand. But in such a case, my dumb mind says I would rather use a map data structure for this than try to plug the whole thing in with setters and getters. I would then write a generic getter like getConfigAttribute that gets me the value. I still get the feeling that Digester will make my Java files hard-code my XML tags and properties, which I then need to maintain manually (of course, JAXB also does this hard-coding, but at least I am "unaware" of it, as it happens during code generation). Just my 2 cents. Of course, Craig, you have much better credibility to enlighten us on this :-)
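
    A rough sketch of the map-based alternative suggested here, using only the JDK's SAX parser: the attributes of one element are collected into a map and read back through a generic getter. Only the getter name getConfigAttribute comes from the post; the class name, the parse helper and the sample XML are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: configuration kept in a map instead of typed setters and getters.
class MapConfigSketch {
    private final Map<String, String> attributes = new HashMap<>();

    // Collects all attributes of the first element named elementName.
    static MapConfigSketch parse(String xml, String elementName) {
        MapConfigSketch config = new MapConfigSketch();
        DefaultHandler handler = new DefaultHandler() {
            private boolean seen;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if (seen || !qName.equals(elementName)) {
                    return;
                }
                seen = true;
                for (int i = 0; i < attrs.getLength(); i++) {
                    config.attributes.put(attrs.getQName(i), attrs.getValue(i));
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return config;
    }

    // Generic getter; returns null when the attribute was not present.
    String getConfigAttribute(String name) {
        return attributes.get(name);
    }

    public static void main(String[] args) {
        String xml = "<Server><Valve className='com.example.MyValve' directory='logs'/></Server>";
        MapConfigSketch config = parse(xml, "Valve");
        System.out.println(config.getConfigAttribute("directory"));
    }
}
```

    The trade-off is that every value stays a String and typos in attribute names only show up at lookup time, whereas the setter-based approach pushes type conversion into the mapping step.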
  20. I am totally at a loss to understand how this differs from JAXB, which is already the simplest way to map XML to objects.
    Digester does not specifically do data binding the way JAXB, Castor and XMLBeans do. It simply allows triggers to be executed at specific XPath-like patterns. I don't think it allows for the serialisation/deserialisation of objects. Also, it's probably faster than binding systems as it works with SAX events.
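
    To illustrate the "triggers at path-like patterns on top of SAX events" idea, here is a minimal sketch that uses only the JDK. It tracks the stack of open elements and fires a callback when the current path equals a registered pattern. It is not Digester's code and supports only exact paths (no wildcards); PathRulesDemo, addRule and the catalog/book sample are invented for the example.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of Commons Digester.
class PathRulesDemo extends DefaultHandler {
    // Maps an element path such as "catalog/book" to the action to run when it opens.
    private final Map<String, Consumer<Attributes>> rules = new HashMap<>();
    // Names of the currently open elements, outermost first.
    private final Deque<String> path = new ArrayDeque<>();

    void addRule(String pattern, Consumer<Attributes> action) {
        rules.put(pattern, action);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        path.addLast(qName);
        Consumer<Attributes> action = rules.get(String.join("/", path));
        if (action != null) {
            action.accept(attrs); // the rule fires only on an exact path match
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        path.removeLast();
    }

    // Collects the title attribute of every <book> that is a direct child of <catalog>.
    static List<String> titles(String xml) {
        List<String> found = new ArrayList<>();
        PathRulesDemo handler = new PathRulesDemo();
        handler.addRule("catalog/book", attrs -> found.add(attrs.getValue("title")));
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return found;
    }

    public static void main(String[] args) {
        String xml = "<catalog><book title='A'/><shelf><book title='B'/></shelf><book title='C'/></catalog>";
        System.out.println(titles(xml)); // prints [A, C]; the nested book does not match the pattern
    }
}
```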

    Some rambling about different binding systems (it's been a while since I've evaluated these technologies):

    Castor: Doesn't follow the JAXB spec (but who cares if it works). I use it for small docs, but I had problems with large schemas like (http://www.opentravel.org/prdoc.cfm?Name=648)

    JAXB: Works fine, but I had the following problem with support for xml:lang attributes, that was my showstopper (might be a parser issue though): http://forums.java.sun.com/thread.jsp?thread=509617&forum=34&message=2440821

    XMLBeans: I don't like the way it functions (compiling schema into a jar file), but it handles my schema correctly. The API is somewhat counter-intuitive, but I've learned to use it.
  21. Also, it's probably faster than binding systems as it works with SAX events. Some rambling about different binding systems (it's been a while since I've evaluated
    Do you have an idea of how technologies like JAXB work, and how slow a SAX-based API is in comparison with e.g. XML pull parsing (XPP)? In theory JAXB can get even better than XPP, as it can construct a dedicated parser for your particular grammar.


    Michal
    (xstream really rocks!)