Schema design question

  1. Schema design question (5 messages)

    Hi,

    I'm new to XML + Web Services and have been trying to use JAXB to unmarshal XML data from an online source. I think I understand JAXB well enough to use it properly, but I am having "problems" with the schema design in general.

    I have 3 different XML files (from the same source) that I want to unmarshal; call them xml1, xml2 and xml3. xml2 is just like xml1 but with an additional element. Half of xml3 uses the same elements as xml1; the other half is entirely different. Here is an example:
    ###########################
    XML1:
    <artist>
     <album>
      <song/>
     </album>
     <album>
      <song/>
     </album>
     ...
    </artist>

    XML2:
    <artist>
     <album>
      <song/>
      <time/>
     </album>
     <album>
      <song/>
      <time/>
     </album>
     ...
    </artist>


    XML3:
    <artist>
     <album>
      <title/>
      <producer/>
     </album>
     <album>
      <title/>
      <producer/>
     </album>
     ...
    </artist>
    ############################

    Here you can see that XML2 is like XML1 but with the additional time element. XML3 is also like XML1 but with different subelements under the album element--the point being it uses the same artist root element and album subelement as that of XML1 and XML2.


    What I have done so far is to write 3 different schemas, XML1.xsd, XML2.xsd and XML3.xsd, and have JAXB generate the classes for me. My problem here is that when I need to access data from either XML1, 2, or 3 using JAXB, I always have to use package names because of the namespace conflict. For example:

    com.recordcompany.xml1.Artist artist1 = <package>unmarshal()
    com.recordcompany.xml2.Artist artist2 = <package>unmarshal()
    com.recordcompany.xml3.Artist artist3 = <package>unmarshal()


    This is becoming a major pain, especially with long package names. Is there a way to "properly" design the schema according to the xml data provided above, i.e. into 1 schema? The problem I'm seeing here is that since the root element and its direct subelement (artist/album) are the same throughout the xml data, I will always encounter namespace conflicts and therefore will always need to use the full package name as an identifier.

    Any help on designing this schema would be appreciated. Thanks!

    -los

    Threaded Messages (5)

  2. Schema design question

    Design one schema that covers all three XML files, with elements like "song", "title" and "producer" marked as optional.
  3. Schema design question

    Hi, Thanks for the reply.

    I considered that, but doesn't making elements optional "break" the schema in terms of validation? For example, suppose I take your advice and make the elements song, title, and producer optional using minOccurs="0". Now suppose all the incoming data are of type XML3. I want to make sure that each album always includes a title and a producer. The *combined* schema no longer checks for these mandatory elements, since I just made them optional. Is there some way around this? Thanks!

    -los
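
The concern can be checked outside JAXB with the JDK's javax.xml.validation API. This is only a minimal sketch: it assumes a cut-down combined schema in which every child of album is an optional xsd:string (the real element types are not given in this thread), and the class name OptionalElementsCheck is made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class OptionalElementsCheck {
    // Assumed combined schema: every child of album is individually optional.
    static final String XSD =
          "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
        + "<xsd:element name='album'><xsd:complexType><xsd:sequence>"
        + "<xsd:element name='song' type='xsd:string' minOccurs='0'/>"
        + "<xsd:element name='time' type='xsd:string' minOccurs='0'/>"
        + "<xsd:element name='title' type='xsd:string' minOccurs='0'/>"
        + "<xsd:element name='producer' type='xsd:string' minOccurs='0'/>"
        + "</xsd:sequence></xsd:complexType></xsd:element>"
        + "</xsd:schema>";

    // Returns true if the XML string validates against XSD.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the schema or the instance document was rejected
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A complete XML3-style album is accepted ...
        System.out.println(isValid("<album><title/><producer/></album>"));
        // ... but so is one that is missing its producer.
        System.out.println(isValid("<album><title/></album>"));
    }
}
```

Both calls report the document as valid, which is exactly the lost check described above: once each element is optional on its own, the schema can no longer insist that an XML3 album carries both a title and a producer.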
  4. Schema design question

    You can make elements optional by group rather than individually. In DTD notation, the content model for your <album> tag might be:

    <!ELEMENT album (song?,(time|(title,producer))?)>

    Or maybe:

    <!ELEMENT album (song?,time?,(title,producer)?)>

    Either way, you can't have title without producer.

    In XML Schema notation:

    <xs:element name="album">
      <xs:complexType>
        <xs:sequence>
          <xs:sequence minOccurs="0" maxOccurs="1">
            <xs:element ref="title"/>
            <xs:element ref="producer"/>
          </xs:sequence>
          <xs:element minOccurs="0" maxOccurs="1" ref="song"/>
          <xs:element minOccurs="0" maxOccurs="1" ref="time"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

    I am not sure how these more complex schemas will interact with JAXB, but it ought to work.
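
The grouped schema can be tried without JAXB by running it through the JDK's javax.xml.validation API. The sketch below is an illustration under assumptions, not a definitive setup: the four global xs:string declarations are added only so the ref attributes resolve (the thread does not show the real types), and the class name AlbumGroupCheck is hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class AlbumGroupCheck {
    // The album declaration from the post, plus assumed global declarations
    // for the referenced elements.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='title' type='xs:string'/>"
        + "<xs:element name='producer' type='xs:string'/>"
        + "<xs:element name='song' type='xs:string'/>"
        + "<xs:element name='time' type='xs:string'/>"
        + "<xs:element name='album'><xs:complexType><xs:sequence>"
        + "<xs:sequence minOccurs='0' maxOccurs='1'>"
        + "<xs:element ref='title'/><xs:element ref='producer'/>"
        + "</xs:sequence>"
        + "<xs:element minOccurs='0' maxOccurs='1' ref='song'/>"
        + "<xs:element minOccurs='0' maxOccurs='1' ref='time'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    // Returns true if the XML string validates against XSD.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the schema or the instance document was rejected
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<album><song/><time/></album>"));      // no title/producer group
        System.out.println(isValid("<album><title/><producer/></album>")); // complete group
        System.out.println(isValid("<album><title/></album>"));            // title without producer
    }
}
```

The first two documents should validate and the third should not, since the inner sequence is optional only as a whole. Note that the outer xs:sequence still fixes the element order, so an album with time before song is rejected as well.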
  5. Schema design question

    Thanks for the reply again!

    I see what you are saying and have not tried your solution yet. I encountered another situation which is puzzling me so far--the xs:sequence tag. I'm assuming this is required when defining complex types (all examples of schemas I have seen use this sequence when defining complex types). Therefore the order of the elements is essential. Going back to the solution where I combine everything into one schema and make elements optional, suppose I encounter this situation:

    XML1
    ----
    <artist>
     <album>
      <song/>
      <time/>
     </album>
     ..
    </artist>

    XML3
    ----
    <artist>
     <album>
      <time/>
      <producer/>
      <song/>
     </album>
     ..
    </artist>


    Here, with the combined schema, the producer element would be optional. However, the time and song elements in XML3 are swapped around when compared to XML1. When I define my combined schema like this:

    ...
     <xsd:complexType name="album">
       <xsd:sequence>
         <xsd:element name="song" type="xsd:string"/>
         <xsd:element name="time" type="xsd:string"/>
         ...
       </xsd:sequence>
     </xsd:complexType>
    ...

    This will work with XML1 data but NOT XML3 data. The error goes something like "element song not expected..." when parsing XML3. As weird as it may be, the actual data I use is not consistent, i.e. although XML1, XML2 and XML3 practically use the same elements, they are not all in 'order'.

    So with this scenario, what should I do here? Thanks!

    -los
  6. Schema design question

    Hmm. You can use the <xsd:all> model group, which does not require elements to be in any particular order. Unfortunately, the <xsd:all> group does not allow nested model groups, so the following would not be allowed:

    <xsd:element name="album">
        <xsd:complexType>
            <xsd:all>
                <!-- BAD! NESTED GROUPS NOT ALLOWED BY xsd:all! -->
                <xsd:sequence minOccurs="0" maxOccurs="1">
                    <xsd:element ref="title"/>
                    <xsd:element ref="producer"/>
                </xsd:sequence>
                <xsd:element minOccurs="0" maxOccurs="1" ref="song"/>
                <xsd:element minOccurs="0" maxOccurs="1" ref="time"/>
            </xsd:all>
        </xsd:complexType>
    </xsd:element>

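    You might try to model it as a choice between two <xsd:all> groups, but this goes beyond what XML Schema 1.0 allows: <xsd:all> must be the top-level group of a content model and cannot appear inside <xsd:choice>, so schema processors (and JAXB) should reject it: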

    <xsd:element name="album">
        <xsd:complexType>
            <xsd:choice>
                <xsd:all>
                    <!-- Without title/producer -->
                    <xsd:element minOccurs="0" maxOccurs="1" ref="song"/>
                    <xsd:element minOccurs="0" maxOccurs="1" ref="time"/>
                </xsd:all>
                <xsd:all>
                    <!-- With title/producer -->
                    <xsd:element ref="title"/>
                    <xsd:element ref="producer"/>
                    <xsd:element minOccurs="0" maxOccurs="1" ref="song"/>
                    <xsd:element minOccurs="0" maxOccurs="1" ref="time"/>
                </xsd:all>
            </xsd:choice>
        </xsd:complexType>
    </xsd:element>

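    If the above doesn't work, maybe the best you can do is this, which accepts the four elements in any order but no longer enforces that title and producer appear together: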

    <xsd:element name="album">
        <xsd:complexType>
            <xsd:all>
                <xsd:element minOccurs="0" maxOccurs="1" ref="title"/>
                <xsd:element minOccurs="0" maxOccurs="1" ref="producer"/>
                <xsd:element minOccurs="0" maxOccurs="1" ref="song"/>
                <xsd:element minOccurs="0" maxOccurs="1" ref="time"/>
            </xsd:all>
        </xsd:complexType>
    </xsd:element>
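
Both of the last two schemas can be checked with the JDK's javax.xml.validation API before handing them to JAXB. What follows is a rough sketch under assumptions: the elements are declared inline as xsd:string instead of by ref (the real types are not shown in this thread), and the class and helper names (AllGroupCheck, el, compiles, isValid) are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class AllGroupCheck {
    static final String HEAD =
          "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
        + "<xsd:element name='album'><xsd:complexType>";
    static final String TAIL = "</xsd:complexType></xsd:element></xsd:schema>";

    // Builds a local xsd:string element declaration (assumed type).
    static String el(String name, boolean optional) {
        return "<xsd:element name='" + name + "' type='xsd:string'"
             + (optional ? " minOccurs='0'" : "") + "/>";
    }

    // Last schema from the post: one xsd:all, every element optional.
    static final String ALL_XSD = HEAD + "<xsd:all>"
        + el("title", true) + el("producer", true)
        + el("song", true) + el("time", true)
        + "</xsd:all>" + TAIL;

    // The "pair of choices" variant: xsd:all nested inside xsd:choice.
    static final String CHOICE_XSD = HEAD + "<xsd:choice>"
        + "<xsd:all>" + el("song", true) + el("time", true) + "</xsd:all>"
        + "<xsd:all>" + el("title", false) + el("producer", false)
        + el("song", true) + el("time", true) + "</xsd:all>"
        + "</xsd:choice>" + TAIL;

    // Compiles the schema text; throws SAXException if the schema is invalid.
    static Schema compile(String xsd) throws SAXException {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        return factory.newSchema(new StreamSource(new StringReader(xsd)));
    }

    // Returns true if the schema itself is accepted by the processor.
    static boolean compiles(String xsd) {
        try {
            compile(xsd);
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    // Returns true if the XML string validates against the given schema text.
    static boolean isValid(String xsd, String xml) {
        try {
            compile(xsd).newValidator()
                        .validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(compiles(CHOICE_XSD)); // xsd:all inside xsd:choice
        System.out.println(isValid(ALL_XSD, "<album><time/><producer/><song/></album>"));
        System.out.println(isValid(ALL_XSD, "<album><title/></album>"));
    }
}
```

With the JDK's built-in schema processor, the nested variant should fail to compile, while the flat xsd:all schema accepts the elements in any order. It also accepts an album with a title and no producer, so that pairing would have to be checked in application code after unmarshalling.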