Using The Digester Component


Introducing the Digester Component

Adapted from:
Pro Jakarta Commons, by Harshad Oak
Publisher: Apress
ISBN: 1590592832

The emergence of Extensible Markup Language (XML) has led to a complete transformation of the application development world. All development seems to revolve around XML these days. In fact, it is difficult to find any new development that does not directly or indirectly rely on XML. For instance, Web services are unimaginable without XML, and with the usage of Web services projected to boom over the next few years, there is no escaping XML. In this article, you will look at the Jakarta Commons Digester component and how it can make working with XML a simple task.

Table 7-1 shows the component details.

Table 7-1. Component Details

Name      Version  Package
Digester  1.5      org.apache.commons.digester

One problem that has plagued XML development is the complexity of parsing and using XML. Everybody knows the advantages of using XML, but I doubt many people are able to write a piece of code that parses an XML file and picks up the value of a certain XML tag. Writing a piece of Java code to parse a piece of XML directly using the two core Application Programming Interfaces (APIs)—the Document Object Model (DOM) and Simple API for XML (SAX)—is anything but simple. APIs such as JDOM are relatively simple, but considering how often you have to encounter and tackle XML, Digester provides an easier option. You can be parsing and using XML in your Java code in less than the time it will take you to read this article. (No, I will not eat my hat if you do not manage to accomplish the task.)

To quickly get up and running with Digester, you will see an example first. Do not worry about the syntax because you will look at that in detail later in this article. The scenario for this example is that you are presented with an XML file containing the details of all the students attending the various courses at your training institute. What you are expected to do is to pick up all the details present in the XML file, and for each student detail, populate an instance of a class Student, which you create. You will then store all the Student instances created in an instance of the java.util.Vector class for further processing.

You first need to create a Student class that will hold the details of a student (see Listing 7-1).

Listing 7-1. Student Class

package com.commonsbook.chap7;

public class Student {
    private String name;
    private String course;

    public Student() {
    }

    public String getName() {
        return name;
    }

    public void setName(String newName) {
        name = newName;
    }

    public String getCourse() {
        return course;
    }

    public void setCourse(String newCourse) {
        course = newCourse;
    }
    public String toString() {
        return("Name="+this.name + " & Course=" +  this.course);
    }
}

Apart from the overridden toString method, there is nothing special about this class. It has just two properties with getter and setter methods for each. You want to create instances of this class based on the data you retrieve from an XML file.

Listing 7-2 shows the XML file contents. The number of student tags is not relevant; you could very well introduce more students if you like.

Listing 7-2. students.xml

<?xml version="1.0"?>
<students>
        <student>
                <name>Java Boy</name>
                <course>JSP</course>
        </student>
        <student>
                <name>Java Girl</name>
                <course>EJB</course>
        </student>
</students>

NOTE In Listings 7-1 and 7-2 you can see that the names of the tags and properties match exactly. So, for a tag course, you have a property named course in the Student class. However, you can have different tag names and property names. No mapping of the XML and the Java class is required; you could very well store the value of a tag ABC into a property XYZ. The matching names merely keep things simple.

The Java class DigestStudents, shown in Listing 7-3, will pick up the contents of the various XML tags and create a Vector class instance that can hold many instances of the class Student.

Listing 7-3. DigestStudents

package com.commonsbook.chap7;

import java.util.Vector;
import org.apache.commons.digester.Digester;

public class DigestStudents {
    Vector students;

    public DigestStudents() {
        students= new Vector();
    }

    public static void main(String[] args) {
        DigestStudents digestStudents = new DigestStudents();
        digestStudents.digest();
    }

    private void digest() {
        try {
            Digester digester = new Digester();
            //Push the current object onto the stack
            digester.push(this);

            //Creates a new instance of the Student class
            digester.addObjectCreate( "students/student", Student.class );

            //Uses setName method of the Student instance
            //Uses tag name as the property name
            digester.addBeanPropertySetter( "students/student/name");

            //Uses setCourse method of the Student instance
            //Explicitly specify property name as 'course'
            digester.addBeanPropertySetter( "students/student/course", "course" );

            //Pass each Student to addStudent on the object below it on the stack
            digester.addSetNext( "students/student", "addStudent" );

            DigestStudents ds = (DigestStudents) digester.parse(this.getClass()
                                .getClassLoader()
                                .getResourceAsStream("students.xml"));

            //Print the contents of the Vector
            System.out.println("Students Vector "+ds.students);
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }

    public void addStudent( Student stud ) {
        //Add a new Student instance to the Vector
        students.add( stud );
    }
}

In very few lines of code you have managed to create the Vector of Student instances. The output of the program is as follows, displaying the tag values in the file students.xml:

Students Vector [Name=Java Boy & Course=JSP, Name=Java Girl & Course=EJB]

Pretty cool, eh? I would have loved to write the corresponding DOM and SAX code to compare and illustrate the advantage of using the Digester component, but writing DOM and SAX code is something I forgot a long time ago and am not very keen on learning again. So you will just continue with the Digester experiments. Specifically, you will next look at some Digester fundamentals and learn how the example in Listing 7-3 works.
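For a rough sense of the comparison, here is a minimal sketch of what a hand-written SAX version of the same task could look like, using only the JAXP classes that ship with the JDK. It is not from the book: the class name SaxStudents and its parse method are hypothetical, it reads the XML from a String instead of the classpath, and it collects the same strings that Student.toString() produces rather than Student instances.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical hand-written SAX counterpart of Listing 7-3 (illustration only).
class SaxStudents {

    // Returns one "Name=... & Course=..." string per student tag.
    static List<String> parse(String xml) {
        final List<String> students = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();
            private String name;
            private String course;

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                text.setLength(0); // start collecting text for this element
                if ("student".equals(qName)) {
                    name = null;
                    course = null;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if ("name".equals(qName)) {
                    name = text.toString().trim();
                } else if ("course".equals(qName)) {
                    course = text.toString().trim();
                } else if ("student".equals(qName)) {
                    students.add("Name=" + name + " & Course=" + course);
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception ex) {
            throw new IllegalStateException("Could not parse XML", ex);
        }
        return students;
    }

    public static void main(String[] args) {
        String xml = "<students>"
            + "<student><name>Java Boy</name><course>JSP</course></student>"
            + "<student><name>Java Girl</name><course>EJB</course></student>"
            + "</students>";
        System.out.println("Students " + parse(xml));
    }
}
```

Even in this small case, the handler has to track the current text and the current student by hand, which is the bookkeeping that the Digester rules in Listing 7-3 take over.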

Understanding Digester Concepts

The Digester component has its origins in the Struts framework project. It began its life as a tool to quickly parse the struts-config.xml file without having to directly interact with SAX. Because the Digester functionality can be useful to all kinds of applications, it later moved to the Commons project.

The Digester is not an XML parser but just a high-level interface that uses SAX underneath to accomplish the actual XML parsing. So a requirement for Digester is the presence of an XML parser conforming to Java API for XML Processing (JAXP) version 1.1 or later. The Digester also depends on the following Commons components:

  • The BeanUtils component
  • The Collections component
  • The Logging component

Because Digester uses SAX to do the parsing, XML processing with Digester happens in an event-driven manner: events are triggered while the document is being parsed, and you provide handlers for these events. That is the way SAX works. SAX events are fired on occurrences such as starting tags, ending tags, and so on. DOM works differently: the whole document is first read into an in-memory object tree, which you then traverse. However, when using the Digester, you do not need to understand how SAX or DOM works, and you do not need to do any SAX-specific tasks in your code. Just stick to Digester’s rules, and you should soon be parsing XML documents with ease.

Digester uses a stack to store or retrieve objects as the XML file is being parsed. If you are not familiar with what a stack is, just think of it as a box in which you keep putting items and can remove them only on the basis of Last In First Out (LIFO). Java provides a stack implementation with java.util.Stack.

Based on the rules defined and the XML encountered, the Digester component pushes objects on the stack. Upon encountering the start of a tag, the associated object is pushed onto the stack, and it is popped only after all the nested contents of that tag are processed. So, in Listing 7-3 upon the student tag being encountered, an instance of Student class will be pushed onto the stack and will be popped once the processing of its child tags name and course is complete.
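The push-on-start, pop-on-end behavior can be sketched with plain SAX and a java.util.Deque. This is only an illustration of the idea, not Digester code: the class ElementStackSketch is hypothetical, and it pushes the name of every element, whereas Digester pushes only the objects that rules such as ObjectCreate create.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Illustration only: traces a LIFO stack driven by SAX start and end events.
class ElementStackSketch {

    static List<String> trace(String xml) {
        final Deque<String> stack = new ArrayDeque<String>();
        final List<String> log = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                // Start of a tag: push, so nested content sees this entry on top.
                stack.push(qName);
                log.add("push " + qName + " depth=" + stack.size());
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                // End of a tag: all nested content is done, so pop the entry.
                String popped = stack.pop();
                log.add("pop " + popped + " depth=" + stack.size());
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception ex) {
            throw new IllegalStateException("Could not parse XML", ex);
        }
        return log;
    }

    public static void main(String[] args) {
        for (String line : trace("<student><name>Java Boy</name></student>")) {
            System.out.println(line);
        }
    }
}
```

The trace shows student being pushed first and popped last, with name pushed and popped in between, which is the Last In First Out order described above.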

Using Matching Patterns

The big advantage of using the Digester component instead of other APIs is the presence of element matching patterns. Unlike other APIs where you have to worry about parent/child relationships among tags, what is important with Digester is the matching pattern specified. For example, in Listing 7-3, you used the matching patterns students/student, students/student/name, and students/student/course. This is an easy and developer-friendly usage to precisely convey the tag to which you want to refer. If you have to map the tags in Listing 7-2 to the corresponding matching pattern, the mapping will be as shown in Table 7-2.

Table 7-2. Tag Pattern Mapping

Tag Pattern
<students> students
<student> students/student
<name> students/student/name
<course> students/student/course

You can also use the wildcard * if you want to have a more generalized matching. So the pattern */name would have matched all name tags within the document.
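To make the idea of matching concrete, the following sketch compares a pattern against the slash-separated path of the element currently being parsed. It is a simplified illustration written for this article, not Digester's own matching code; the class PatternSketch and the method matches are hypothetical names, and only exact patterns and a leading */ wildcard are handled.

```java
// Simplified illustration of element matching patterns (not Digester's code).
class PatternSketch {

    // path is the nesting of the current element, for example
    // "students/student/name" for a name tag inside student inside students.
    static boolean matches(String pattern, String path) {
        if (pattern.startsWith("*/")) {
            String tail = pattern.substring(2);
            // The tail must line up with whole trailing path segments.
            return path.equals(tail) || path.endsWith("/" + tail);
        }
        return pattern.equals(path);
    }

    public static void main(String[] args) {
        System.out.println(matches("students/student/name", "students/student/name")); // true
        System.out.println(matches("*/name", "students/student/name"));                // true
        System.out.println(matches("*/name", "students/student/course"));              // false
    }
}
```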

Using Rules

With element matching patterns you convey the exact location of the tag in the XML structure. However, to tell the Digester component what needs to be done upon finding that tag, you need to define processing rules. These rules fire when the matching pattern is found. All rules are expected to extend the abstract class org.apache.commons.digester.Rule and define specific actions that need to be taken when a certain element occurs.

You can define your own rules to handle application-specific cases. The Digester component comes with a set of rule implementations that extend the Rule class; you can find them in the package org.apache.commons.digester. As you move along, you will see some of these rules in the examples. In Listing 7-3 you used ObjectCreateRule to create an instance of the Student class, and you used BeanPropertySetterRule to set the properties of the class.

Before getting into a more complex XML example than the one you saw in Listing 7-2, you will look at the steps you need to perform for Digester to successfully retrieve data from XML:

  1. You need to create a new instance of org.apache.commons.digester.Digester and configure it using the various setXxx methods provided by the class. Among other properties, you can define whether the XML should be validated, define the logger to be used, and define the Rules implementation object.
  2. You push any initial objects on the object stack using the Digester’s push method before you define the patterns and the rules to be used. In Listing 7-3, you pushed the current object on the stack using the keyword this. You push this initial object because Digester keeps pushing and popping objects as it encounters tags: an object created for a tag is pushed when the tag starts and popped when the tag ends, so by the time parsing finishes, every object created along the way has left the stack. The initial object sits at the bottom of the stack, receives the created objects through rules such as SetNext, and is what the parse method returns, so it retains the references you need.
  3. Register element matching patterns and the rules you want to be fired for each case. In Listing 7-3 you register four rules of three kinds (ObjectCreate, BeanPropertySetter, and SetNext) against three patterns.
  4. Finally, you parse the XML file using the parse method of the Digester instance you created.

    NOTE The order in which you do things is important for Digester. You cannot randomly move around statements before the call to the parse method. For example, in Listing 7-3, you cannot move the call to addObjectCreate to after the call to addSetNext.

You will now look at a more complex XML example and try to process it using Digester. You will also see how you can move the specifying of Digester patterns and rules from code to a configuration XML file.

Following XML Rules

In Listing 7-3, most of the code is dedicated to configuring the Digester instance. Hardly any of the code can be termed as action-oriented code. The most common usage of Digester is to process XML-based configuration files. The reason why these configuration files are used is to keep code free of configuration information and make changes possible without having to change the code and recompile it. It would be unfair if you placed Digester configuration information within Java code. Even this bit has to move to a configuration XML file.

The package org.apache.commons.digester.xmlrules deals with this issue, and the DigesterLoader class that is present in this package makes it possible to create a Digester instance using just the information in an XML file.

In the following example, you will first look at Java code that will accomplish the task along somewhat similar lines as the example in Listing 7-3 and then move to an XML-based configuration file for the same example.

Listing 7-4 shows the XML file from which you want to fetch information. The XML stores information about an academy, its students, and its teachers. The Digester code picks up these details and makes them manageable within Java code.

Listing 7-4. academy.xml

<?xml version="1.0"?>
<academy name="JAcademy" >
        <student name="JavaBoy" division="A">
                <course>
                    <id>C1</id>
                    <name>JSP</name>
                </course>
                <course>
                    <id>C2</id>
                    <name>Servlets</name>
                </course>
        </student>
        <student name="JavaGirl" division="B">
                <course>
                    <id>C3</id>
                    <name>EJB</name>
                </course>
        </student>

        <teacher name="JavaGuru">
                <certification>SCJP</certification>
                <certification>SCWCD</certification>
        </teacher>
        <teacher name="JavaMaster">
                <certification>OCP</certification>
                <certification>SCJP</certification>
                <certification>SCEA</certification>
        </teacher>
</academy>

NOTE With Listing 7-4 I have tried to address the many scenarios you might encounter when parsing XML files. The code from this example can get you started in no time.

Because you have to hold the XML data in Java objects, you need to decide which classes you have to create. Instances of these classes will hold the data for you. Looking at this example, you should see four classes that together can do a good job of holding the data in a properly structured format. These classes are Academy, Student, Course, and Teacher. You could very well create more classes, such as Certification. The most important thing is that you cannot have these as just separate classes; you also need to maintain the relationships among them as depicted in the XML file. So, you will first put down the Java classes.

An instance of the Course class is meant to store just the name and the ID of the course. The Course instance will not be maintaining its relation to the Student; this will be done by the Student instance. Listing 7-5 shows the Course class; it has two properties and the corresponding get and set methods. Note that the package name for classes used in this example is com.commonsbook.chap7.academy.

Listing 7-5. Course Class

package com.commonsbook.chap7.academy;
import org.apache.commons.beanutils.PropertyUtils;

import java.util.Vector;

public class Course {
    private String id;
    private String name;

    public Course() {
    }

    public String getId() {
        return id;
    }

    public void setId(String newId) {
        id = newId;
    }

    public String getName() {
        return name;
    }

    public void setName(String newName) {
        name = newName;
    }

    public String toString() {
        StringBuffer buf = new StringBuffer(60);
        buf.append("\n\tCourseId>>> " + this.getId() + "\t");
        buf.append("CourseName>>> " + this.getName());

        return buf.toString();
    }
}

Next you will define the Student class that not only has to hold information about the student but also about the courses the student attends. As shown in Listing 7-6, the student details are stored using properties, and the courses will be stored as a Vector of Course instances.

Listing 7-6. Student Class

package com.commonsbook.chap7.academy;
import java.util.Vector;

public class Student {
    private Vector courses;
    private String name;
    private String division;

    public Student() {
        courses = new Vector();
    }

    public void addCourse(Course course) {
        courses.addElement(course);
    }

    public String getName() {
        return name;
    }

    public void setName(String newName) {
        name = newName;
    }

    public String getDivision() {
        return division;
    }

    public void setDivision(String newDivision) {
        division = newDivision;
    }

    public void setCourses(Vector courses) {
        this.courses = courses;
    }

    public Vector getCourses() {
        return courses;
    }

    public String toString() {
        StringBuffer buf = new StringBuffer(60);

        buf.append("\nStudent name>> " + this.getName());

        Vector courses = this.getCourses();

        //Iterate through vector. Append content to StringBuffer.
        for (int i = 0; i < courses.size(); i++) {
            buf.append(courses.get(i));
        }

        return buf.toString();
    }
}

Listing 7-4 shows that, for a teacher, you are expected to store the name and the list of certifications held by the teacher. The Teacher class, shown in Listing 7-7, does this by using a String property for the name and a Vector holding String instances for the certifications list.

Listing 7-7. Teacher Class

package com.commonsbook.chap7.academy;
import org.apache.commons.beanutils.PropertyUtils;

import java.util.Vector;

public class Teacher {
    private String name;
    private Vector certifications;

    public Teacher() {
        certifications = new Vector();
    }

    public void addCertification(String certification) {
        certifications.addElement(certification);
    }

    public String getName() {
        return name;
    }

    public void setName(String newName) {
        name = newName;
    }

    public void setCertifications(Vector certifications) {
        this.certifications = certifications;
    }

    public Vector getCertifications() {
        return certifications;
    }

    public String toString() {
        StringBuffer buf = new StringBuffer(60);
        buf.append("\nTeacher name>> " + this.getName());

        Vector certs = this.getCertifications();

        //Iterate through vector. Append content to StringBuffer.
        for (int i = 0; i < certs.size(); i++) {
            buf.append("\n\tCertification>> " + certs.get(i));
        }

        return buf.toString();
    }
}

The academy tag is the root tag shown in Listing 7-4. So the Academy class not only has to store the name of the academy but also references to the data held by the child tags of the academy tag. Therefore, the Academy class, shown in Listing 7-8, has two Vectors, one that will store instances of Student classes and another that will store instances of Teacher classes. So directly or indirectly you should be able to access all the data depicted in Listing 7-4 using a reference to a properly populated Academy class instance. The overridden toString method will be used later in the article to print the data held by an Academy instance.

Listing 7-8. Academy Class

package com.commonsbook.chap7.academy;
import org.apache.commons.beanutils.PropertyUtils;

import java.util.Vector;

public class Academy {
    private Vector students;
    private Vector teachers;
    private String name;

    public Academy() {
        students = new Vector();
        teachers = new Vector();
    }

    public void addStudent(Student student) {
        students.addElement(student);
    }

    public void addTeacher(Teacher teacher) {
        teachers.addElement(teacher);
    }

    public Vector getStudents() {
        return students;
    }

    public void setStudents(Vector newStudents) {
        students = newStudents;
    }

    public Vector getTeachers() {
        return teachers;
    }

    public void setTeachers(Vector newTeachers) {
        teachers = newTeachers;
    }

    public String getName() {
        return name;
    }

    public void setName(String newName) {
        name = newName;
    }

    public String toString() {
        StringBuffer buf = new StringBuffer(60);

        buf.append("Academy name>> " + this.getName());

        Vector stud = this.getStudents();
        Vector teach = this.getTeachers();
        buf.append("\n\n**STUDENTS**");

        //Iterate through vectors. Append content to StringBuffer.
        for (int i = 0; i < stud.size(); i++) {
            buf.append(stud.get(i));
        }

        buf.append("\n\n**TEACHERS**");

        for (int i = 0; i < teach.size(); i++) {
            buf.append(teach.get(i));
        }

        return buf.toString();
    }
}

Now that you are done with the classes that will store the data for you, you will move to the Digester code that will actually parse the XML. You will first see how you specify Digester instructions within the Java code. Next you will move out these instructions to an easily configurable XML file, making your Java code short and simple. Listing 7-9 shows the Java code to specify Digester rules and parse the XML accordingly. The thing to note in this piece of code is the usage of the following rules:

  • ObjectCreate: This rule creates a new instance of the classes Academy, Student, Teacher, and Course on a matching pattern being found.
  • SetProperties: The SetProperties rule sets the properties of the class using the attribute values. Because the names of the attributes and the properties in the class match exactly, you did not specify those details; however, if the attribute names in XML and property names in Java differ, you have to specify that mapping.
  • BeanPropertySetter: This rule sets the properties of the bean using the values of the child tags. For example, the id and name properties of the instance of the class Course are set using this rule.
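  • SetNext: The SetNext rule links each newly created object to its parent. When the end of a course, student, or teacher tag is reached, Digester calls the method you specify (addCourse, addStudent, or addTeacher) on the object just below the top of the stack, passing the top object as the argument.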
  • CallMethod: The CallMethod rule specifies the method to be called upon a certain pattern being found. You also specify the number of parameters that this method expects.
  • CallParam: The CallParam rule specifies the parameter value to be passed to the method call defined using the CallMethod rule.
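As a rough sketch of what a SetNext rule amounts to, the following plain-Java code calls a named method on the object just below the top of a stack and passes the top object as the argument. It uses reflection from the JDK only and is not Digester's implementation; SetNextSketch and its nested Student and Course classes are hypothetical, simplified stand-ins for the classes in this article.

```java
import java.lang.reflect.Method;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Illustration only: mimics the effect of a SetNext rule with plain reflection.
class SetNextSketch {

    static class Course {
        final String name;
        Course(String name) { this.name = name; }
    }

    static class Student {
        final List<Course> courses = new ArrayList<Course>();
        public void addCourse(Course course) { courses.add(course); }
    }

    // Calls parent.methodName(child), where child is the top of the stack
    // and parent is the object directly below it. The stack is left unchanged.
    static void setNext(Deque<Object> stack, String methodName) {
        Iterator<Object> it = stack.iterator(); // iterates from the top down
        Object child = it.next();
        Object parent = it.next();
        try {
            Method m = parent.getClass().getMethod(methodName, child.getClass());
            m.invoke(parent, child);
        } catch (ReflectiveOperationException ex) {
            throw new IllegalStateException("Could not call " + methodName, ex);
        }
    }

    public static void main(String[] args) {
        Deque<Object> stack = new ArrayDeque<Object>();
        Student student = new Student();
        stack.push(student);               // created when the student tag starts
        stack.push(new Course("JSP"));     // created when the course tag starts
        setNext(stack, "addCourse");       // fired when the course tag ends
        stack.pop();                       // the Course leaves the stack
        System.out.println(student.courses.size()); // 1
    }
}
```

Seen this way, the method name you give to a SetNext rule is simply the "add to parent" method of whichever object sits one level up in the XML nesting.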

Listing 7-9. DigestJavaAcademy Class (Digester Rules Defined in Java Code)

package com.commonsbook.chap7.academy;
import org.apache.commons.beanutils.PropertyUtils;
import org.apache.commons.digester.Digester;

import java.util.Vector;

public class DigestJavaAcademy {
    public static void main(String[] args) throws Exception {
        DigestJavaAcademy d = new DigestJavaAcademy();
        d.digest();
    }

    public void digest() throws Exception {
        Digester digester = new Digester();
        digester.addObjectCreate("academy", Academy.class);

        //Set the attribute values as properties
        digester.addSetProperties("academy");

        //A new Student instance for the student tag
        digester.addObjectCreate("academy/student", Student.class);

        //Set the attribute values as properties
        digester.addSetProperties("academy/student");

        //A new Course instance
        digester.addObjectCreate("academy/student/course", Course.class);

        //Set properties of the Course instance with values of two child tags
        digester.addBeanPropertySetter("academy/student/course/id", "id");
        digester.addBeanPropertySetter("academy/student/course/name", "name");

        //Add the Course to its Student by calling addCourse
        digester.addSetNext("academy/student/course", "addCourse");

        //Add the Student to the Academy by calling addStudent
        digester.addSetNext("academy/student", "addStudent");

        //A new instance of Teacher
        digester.addObjectCreate("academy/teacher", Teacher.class);

        //Set teacher name with attribute value
        digester.addSetProperties("academy/teacher");

        //Call Method addCertification that takes a single parameter
        digester.addCallMethod("academy/teacher/certification",
            "addCertification", 1);

        //Set value of the parameter for the addCertification method
        digester.addCallParam("academy/teacher/certification", 0);

        //Add the Teacher to the Academy by calling addTeacher
        digester.addSetNext("academy/teacher", "addTeacher");

        //Parse the XML file to get an Academy instance
        Academy a = (Academy) digester.parse(this.getClass().getClassLoader()
                               .getResourceAsStream("academy.xml"));

        System.out.println(a);
    }
}

The order in which you define rules is important. You have just represented what was obvious to you in the XML in a form that Digester can understand.

To execute this piece of code, you need to have the academy.xml file present in the CLASSPATH. Listing 7-10 shows the output upon executing this piece of code.

Listing 7-10. Output Upon Executing the Code in Listing 7-9

Academy name>> JAcademy

**STUDENTS**
Student name>> JavaBoy
  CourseId>>> C1  CourseName>>> JSP
  CourseId>>> C2  CourseName>>> Servlets
Student name>> JavaGirl
  CourseId>>> C3  CourseName>>> EJB

**TEACHERS**
Teacher name>> JavaGuru
  Certification>> SCJP
  Certification>> SCWCD
Teacher name>> JavaMaster
  Certification>> OCP
  Certification>> SCJP
  Certification>> SCEA

Looking at Listing 7-9, it is obvious that almost all the code is dedicated to configuring the Digester. Were we not all taught to move configurable items, wherever possible, to a file that can be easily managed and manipulated? So why not do that in this case?

The org.apache.commons.digester.xmlrules package provides for an XML-based definition of Digester rules. Defining Digester rules in XML is quite simple once you get the hang of the various rules and what they do for you. Considering the more widespread nature of XML, your Digester rules are now more easily understandable to a wide variety of people involved. Even your manager might understand a thing or two!

Listing 7-11 shows the rules from Listing 7-9 defined in XML instead of Java.

Listing 7-11. academyRules.xml Digester Rules Defined in XML

<?xml version="1.0"?>
<digester-rules>
  <pattern value="academy">
      <object-create-rule classname="com.commonsbook.chap7.academy.Academy" />
      <set-properties-rule />
      <pattern value="student">
          <object-create-rule classname="com.commonsbook.chap7.academy.Student" />
          <set-properties-rule />

          <pattern value="course">
            <object-create-rule classname="com.commonsbook.chap7.academy.Course" />
            <bean-property-setter-rule pattern="id"/>
            <bean-property-setter-rule pattern="name"/>
            <set-next-rule methodname="addCourse" />
         </pattern>
         <set-next-rule methodname="addStudent" />
     </pattern>

     <pattern value="teacher">
         <object-create-rule classname="com.commonsbook.chap7.academy.Teacher" />
         <set-properties-rule />
         <call-method-rule pattern="certification" methodname="addCertification"
             paramcount="1" />
         <call-param-rule pattern="certification" paramnumber="0"/>
         <set-next-rule methodname="addTeacher" />
     </pattern>
 </pattern>
</digester-rules>

In Listing 7-11, the rules defined in XML map almost directly to the methods called in the Java code in Listing 7-9. Each rule is now defined using a tag named after it. The easiest way to check the usage of these tags is to open the digester-rules.dtd file. You can easily find this file in the source download of the Digester component. However, even with the binary download, you can extract this file from the commons-digester.jar file, where it is present in the org.apache.commons.digester.xmlrules package. You can also look at the file and Digester code using ViewCVS at http://jakarta.apache.org/site/cvsindex.html.

Document Type Definition (DTD) files define the syntax and structure of XML files, and although they take some getting used to, understanding them is not difficult.

Once you are done defining the rules in XML, the Java bit left is simple. Listing 7-12 shows the Java code where you just define the rules file to be used to create a Digester instance and then parse the XML file using that Digester instance.

Listing 7-12. DigestXMLJavaAcademy Class (Java Code Using Rules Defined in XML)

package com.commonsbook.chap7.academy;
import java.io.File;
import java.util.Vector;
import org.apache.commons.beanutils.PropertyUtils;
import org.apache.commons.digester.Digester;
import org.apache.commons.digester.xmlrules.DigesterLoader;

public class DigestXMLJavaAcademy {

    public static void main( String[] args ) {
        DigestXMLJavaAcademy xmlDigest= new DigestXMLJavaAcademy();
        xmlDigest.digest();
    }

    public void digest(){
        try {
           //Create Digester using rules defined in academyRules.xml
           Digester digester = DigesterLoader.createDigester(
               this.getClass().getClassLoader().getResource("academyRules.xml"));

           //Parse academy.xml using the Digester to get an instance of Academy
           Academy a = (Academy)digester.parse(
           this.getClass().getClassLoader().getResourceAsStream("academy.xml"));

           Vector vStud=a.getStudents();
           Vector vTeach=a.getTeachers();

           for (int i = 0; i < vStud.size(); i++) {
               System.out.println("Student>> " + PropertyUtils.describe(vStud.get(i)));
           }

           for (int i = 0; i < vTeach.size(); i++) {
               System.out.println("Teacher>> " + PropertyUtils.describe(vTeach.get(i)));
           }
        } catch( Exception e ) {
            e.printStackTrace();
        }
    }
}

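The two files academy.xml and academyRules.xml have to be present in the CLASSPATH. Upon execution, the code retrieves the same data that the Java code in Listing 7-9 produced (shown in Listing 7-10). Because this listing prints each Student and Teacher through PropertyUtils.describe rather than through the toString method of Academy, each one appears as a map of property names to values.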

Introducing Other Digester Features

Apart from the features covered so far, Digester offers a few others worth mentioning:

  • The Logging capability of Digester can be useful while troubleshooting. Digester uses the Commons Logging component and the Digester class even provides a setLogger method with which you can define the exact logger to be used.
  • The org.apache.commons.digester.rss package provides an example usage of Digester to parse XML in the Rich Site Summary (RSS) format, which is widely used by news sites to provide news feeds. Most of the popular content providers support RSS, and you can find more information about RSS at http://blogs.law.harvard.edu/tech/rss/.
  • You can configure Digester to validate XML using a DTD file. You should register the DTD using the register method, and you can switch on validation using the setValidating method of the Digester class.
  • You can configure Digester to match patterns based on namespaces. You use the methods setNamespaceAware and setRuleNamespaceURI so that the Digester does not confuse a name tag in a namespace X with a similar name tag in a namespace Y.
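The namespace point in the last item can be illustrated with the JAXP SAX parser that Digester builds on. The sketch below is plain JDK code, not Digester configuration; the class NamespaceSketch and the urn:x and urn:y namespace URIs are made up for the example. With namespace awareness switched on, the parser reports a namespace URI alongside each local tag name, which is the information needed to tell two name tags apart.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Illustration only: shows how a namespace-aware SAX parser reports tags.
class NamespaceSketch {

    // Returns "{namespaceURI}localName" for every name tag in the document.
    static List<String> nameTags(String xml) {
        final List<String> seen = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("name".equals(localName)) {
                    seen.add("{" + uri + "}" + localName);
                }
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // report URIs and local names
            factory.newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception ex) {
            throw new IllegalStateException("Could not parse XML", ex);
        }
        return seen;
    }

    public static void main(String[] args) {
        String xml = "<root xmlns:x=\"urn:x\" xmlns:y=\"urn:y\">"
            + "<x:name>first</x:name><y:name>second</y:name></root>";
        System.out.println(nameTags(xml));
    }
}
```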

Summary

In this article, you looked at the Digester component, which drastically cuts down on the complexity involved in parsing XML. You saw how Digester works on the simple concept of element matching patterns and how you can define rules in Java code as well as in a separate XML file. You also saw some examples that reflected common XML parsing requirements.

Using Digester and defining the rules in a separate XML file gets a big thumbs-up from me. I highly recommend Digester for all your XML parsing requirements.


About the Author

Harshad wrote the books Pro Jakarta Commons (Apress, 2004), Oracle JDeveloper 10g: Empowering J2EE Development (Apress, 2004) and also coauthored Java 2 Enterprise Edition 1.4 Bible (Wiley & Sons, 2003).

Harshad Oak has a master's degree in computer management and is a Sun Certified Java Programmer and a Sun Certified Web Component Developer. He is the founder of Rightrix Solutions (http://www.rightrix.com) that is primarily involved in software development and content management services. Harshad has earlier been part of several J2EE projects at i-flex Solutions and Cognizant Technology Solutions.

Furthermore, he has written several articles about Java/J2EE for CNET Builder.com (http://www.builder.com/). He is also a guest lecturer on Java and J2EE. He can be reached at [email protected]
