Tim Bray: Choose RELAX Now

  1. Tim Bray: Choose RELAX Now (21 messages)

    Tim Bray has posted "Choose RELAX Now," referring to the XML schema language RELAX-NG. This is in comparison to W3C XML Schema, previously more or less perceived as the way to specify XML schemas after DTDs.
    W3C XML Schemas (XSD) suck. They are hard to read, hard to write, hard to understand, have interoperability problems, and are unable to describe lots of things you want to do all the time in XML. Schemas based on Relax NG, also known as ISO Standard 19757, are easy to write, easy to read, are backed by a rigorous formalism for interoperability, and can describe immensely more different XML constructs.
    There's a significant difference between the two schema languages. RELAX NG uses a sort of "Russian Doll" form and provides a compact syntax as well; XML Schema is rather verbose, to say the least, and uses a more XSL-like maze of twisty little definitions, all alike. That style makes XML Schema more appropriate for some people and applications, in that this is how many language grammars are put together. W3C Schema also has a larger number of built-in datatypes. However, writing XML Schema does seem to be a bit of a black art; it's certainly doable, but few seem to care to. RELAX NG, on the other hand, has a certain degree of user-friendliness that encourages its use in many applications; for example, many editors seem to use RELAX NG internally to suggest completions in DTD-less documents, and many applications and formats (OpenOffice, Atom, etc.) use RELAX NG rather than XML Schema.
    What do you think? A complete and formal specification can be nice, but if nobody likes it, it's not worth very much in the real world. Do you agree with Tim, and say "The time has come to declare it [XML Schema] a worthy but failed experiment, tear down the shaky towers with XSD in their foundation, and start using RELAX for all significant XML work."?

    Threaded Messages (21)

  2. put up or shut up

    I have not seen any substantive criticism of XSD. I have been using it productively for five years, apart from tooling issues and a couple of hard stops such as the unusability of chameleon schemas. No one seems to be able to say quite what fault they find with it. (Note that I do NOT use it for runtime validation and do not ever expect to do runtime validation against any kind of schema. My use case for a schema language is to specify families of types, both by composition and by inheritance. XSD is a seamless fit for this.)
  3. finally!

    Hooray for Tim Bray. We've been using Relax-NG (compact notation) for years now. Since the Sun specifications all use XSchema, which is entirely unreadable, it's generally easiest to just translate the XSchema to Relax-NG to figure out what the new syntax is supposed to be. Trying to understand the syntax from XSchema is useless. The general question is why everyone thought XML was the best representation for everything? I mean, you wouldn't want to program Java using XML notation, and in the case of grammars regular expression notation is well understood, even by beginning Perl programmers.
  4. Re: finally!

    Hooray for Tim Bray. We've been using Relax-NG (compact notation) for years now.

    Since the Sun specifications all use XSchema, which is entirely unreadable, it's generally easiest to just translate the XSchema to Relax-NG to figure out what the new syntax is supposed to be. Trying to understand the syntax from XSchema is useless.

    The general question is why everyone thought XML was the best representation for everything? I mean, you wouldn't want to program Java using XML notation, and in the case of grammars regular expression notation is well understood, even by beginning Perl programmers.
    Mostly because very efficient XML parsers/processors already existed, so instead of reinventing the wheel with a DSL, XML was used. I mean, it only makes sense to write XML types using XML, right? Or maybe not; in that case, can we also use a DSL for some programming idioms that Java doesn't represent very well? I'm not against RELAX, and am rather a big proponent of DSLs; I'm just giving a bit of background on why XML was used. Either way, what Tim says has been known and blogged/written about for years, so I don't know what the hype is about now. Ilya
  5. reinventing the wheel

    BNF predates XSchema by decades. yacc is something like 30 years old. BNF is a long-standing, well-tested, usable syntax for describing grammars. It's not hard to parse. So if someone is to be blamed for reinventing the wheel, it would be the XSchema group, tossing out decades of experience describing grammars just because XML was cool. I mean, DTDs used BNF and were somewhat readable, and XSchema threw that away.
  6. Re: finally!

    Mostly because very efficient XML parsers/processors already existed, so instead of reinventing the wheel with a DSL, XML was used.
    Please repeat after me 1000 times: XML is not a language, XML is not a language.... So, XML was used to define the syntaxes and grammars of various DSLs, supposedly to avoid writing custom parsers.... But that does not alter the fact that we invent and use DSLs because they are valuable and necessary. XML was and is used to describe DSLs because it allows any schema-aware XML editor (XSD, DTD, or whatever) to act like a poor man's IDE. That kind of "IDE" does not, of course, support all the richness of true language awareness. I think that 95% of that XML craze will be replaced by true DSLs with proper tooling support from products like MPS.
  7. Re: finally!

    Mostly because very efficient XML parsers/processors already existed, so instead of reinventing the wheel with a DSL, XML was used.

    Please repeat after me 1000 times:
    XML is not a language, XML is not a language....
    Dude, we have discussed this numerous times before, and I never disagreed with you. I'm one of the biggest proponents of using a DSL; I just explained why XML is mostly used in such cases. Say I'm writing a framework and, besides limited time, I also have limited resources to train folks to use a DSL. My team is already familiar with XML semantics, and most tools out there support auto-completion and validation based on XSD. So one choice is to create a DSL, which besides development time (trivial in some cases) also adds a lot of training overhead. The one-hour solution is to create an XML schema, which is actually pretty good at representing certain OO domains, plus a configuration parser that parses it into an OO model and/or uses existing tools such as JiBX (notice no JAXB here, because it's horrible at representing custom domain models). In many instances XML, though some might say it is used as a programming language, is actually used to describe things. So I'm all for, say, using a DSL for a build process, but I also don't see too many things wrong with the current Ant build in XML. It all depends on how you look at it. Are you defining a programming language, or simply describing your build process for a processor to manage? You can argue both ways. Ilya
  8. Re: finally!

    Dude, we discussed this before numerous times and I never disagreed with you.
    You appear to be on the same page, but IMO you have used the wrong words, which imply using XML as a language. Because the words we use shape our minds and our world views, I would like us to use words as precisely as possible. In this case we do not use XML instead of a DSL; we _describe_ a DSL using the XML alphabet, so let's honestly say so. That instantly makes way more sense and suggests that we should discuss which syntax is more beneficial for defining a DSL, simplifying tooling, making it easier to transfer knowledge, etc. IMO: use of XML obscures DSL definitions for no good reason, and therefore XML should not be used to describe real languages and slangs.
  9. putting up...

    Here is one then. There is no equivalent to SubstitutionGroup for attributes as there is for elements. This limits the things you can do with attributes quite badly. I wrote a long article on this here http://blog.nominet.org.uk/tech/2006/03/09/why-no-substitutiongroup-for-attributes-in-xml-schema/
  10. Re: put up or shut up

    I have not seen any substantive criticism of XSD. I have been using it productively for five years-... No one seems to be able to say quite what fault they find with it. (Note that I do NOT use it for runtime validation and do not ever expect to do runtime validation against any kind of schema. My use case for a schema language is to specify families of types, both by composition and by inheritance. XSD is a seamless fit for this.)
    Hmmh... Are you legally blind or something? For the past 5 years, everyone in the field has been complaining about problems with Schema: not so much its expressiveness or lack thereof (there are things it does miss, like co-constraints, but that's not its biggest problem), but about its goddamn awkwardness for almost any use imaginable. Using W3C Schema for validation or data binding is like pushing a load of bricks with a rope. Its readability is worse than that of DTDs; its strange type system covers neither hierarchical nor struct (OO) cases very well; implementing Schema processing is a very complicated thing to do (so relatively speaking it's as hard to implement as it is to use -- distinctly different from, say, RNG)... and in the end, very few people really understand how W3C Schema is even supposed to work. It is true, though, that for data binding uses W3C Schema does its job. It is inconvenient to use, verbose, ugly and generally fubar (as much as it is when used for validation), but there is not much competition in this area. Whereas RNG runs circles around Schema for validation (it is arguably superior in every way, for validation use cases), it is not meant for data binding.
  11. Re: Tim Bray: Choose RELAX Now

    The write-up is almost OK (though I'm not sure it's something everybody didn't already know). The problem is the title, "Choose RELAX Now"???? I wish I could, if there were any viable libraries for using it in Java, and if it actually integrated with the other SOA-based technologies out there. Unfortunately, until the vendors and OSS projects actually start supporting it, the title should rather point to a comparison (i.e. RELAX vs. XSD). Ilya
  12. No magic bullet

    All these validation schemes are only half-solutions and don't properly account for an inevitable case in business: dynamic validation. Sure these schemes all provide a means to check a tag value against a set of static values. But what happens when you need to check a tag value against a database table? Then you need a programming language, data access, etc. That means a true schema validation language has to have a full-power programming language. Sure, which one? If someone said XSLT, please shoot thyself. Perl? Python? JSP-EL? (sorry bad joke)? Java would be nice except it requires compilation. Groovy or JRuby or JPython? Well, anyway, when you get down to it, in most complicated schemas, structural validation is only a minor part of validation. RELAX-NG vs Schematron vs Schema vs DTD? Whatever. I say pick the simplest one that only processes structure, and then use XPath and your language of choice to perform real validation.
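    A minimal sketch of that last suggestion, assuming the document has already passed structural validation: XPath selects the values, and plain Java checks them against a lookup set. The class and method names, the `order`/`sku` element names, and the in-memory set standing in for a database table are all hypothetical; only the JDK's built-in DOM and XPath APIs are used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: names are invented for illustration only.
final class DynamicValidationSketch {

    /**
     * Returns the text of every node selected by xpathExpression whose value
     * is not contained in knownValues (a stand-in for a database lookup).
     */
    static List<String> findUnknownValues(String xml, String xpathExpression, Set<String> knownValues) {
        try {
            // Parse the XML text into a DOM tree.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // Select the nodes to check with a plain XPath 1.0 expression.
            NodeList nodes = (NodeList) XPathFactory.newInstance()
                    .newXPath()
                    .evaluate(xpathExpression, doc, XPathConstants.NODESET);
            List<String> unknown = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                String value = nodes.item(i).getTextContent().trim();
                if (!knownValues.contains(value)) {
                    unknown.add(value);
                }
            }
            return unknown;
        } catch (Exception e) {
            // Malformed XML or a bad XPath expression is a caller error here.
            throw new IllegalStateException("Could not evaluate " + xpathExpression, e);
        }
    }

    public static void main(String[] args) {
        // In a real system this set would come from a database query.
        Set<String> skusInDatabase = new HashSet<>(Arrays.asList("A1", "B2"));
        String xml = "<order><sku>A1</sku><sku>Z9</sku></order>";
        System.out.println(findUnknownValues(xml, "/order/sku", skusInDatabase));
    }
}
```

    The schema language stays out of the lookup entirely; swapping the set for a real query changes nothing about the structural check that ran before it.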
  13. Re: No magic bullet

    All these validation schemes are only half-solutions and don't properly account for an inevitable case in business: dynamic validation.

    Sure these schemes all provide a means to check a tag value against a set of static values. But what happens when you need to check a tag value against a database table?
    What happens if I need to validate a database column against data in another non-relational data source? Hmmm, do you need a programming language to do that as well? You're mixing two distinct data storage concepts and want them to be integrated. XSD is there to support XML data, just as DDL is there to support relational data.
  14. Re: No magic bullet

    No, I don't want them to be integrated. The point is, just like all the other XML "technologies", this consistently extends outside of its domain. Once you start doing datatype validations, it's over, you're screwed by the dynamic validation slippery slope. XML validators should only provide structural validation, that's it. That will restrict the problem space to a more compact, reliable language. DTDs weren't powerful enough. XSDs are unfocused, verbose, and ultimately a failure. RELAX-NG: Crappy name. Wow, super-crappy name. Supports two syntaxes (XML and DTD-ish)? And XSDs support some structural validation features that RELAX-NG doesn't? I don't sense progress. I sense desperation by XML Tool vendors to justify their existence. I sense the need for each mainstream programming language to implement each of these (DTD, XSD, RELAX-NG, Schematron) and still provide the means for doing real validation above and beyond what the validation frameworks can, and I'm not talking just data validation. It seems that each and every one of the validation schemes falls short in structural validation ability, which should be their primary mission. Type checking/default values/whatever that's all icing on the cake.
  15. Re: No magic bullet

    No, I don't want them to be integrated. The point is, just like all the other XML "technologies", this consistently extends outside of its domain.

    Once you start doing datatype validations, it's over, you're screwed by the dynamic validation slippery slope.

    XML validators should only provide structural validation, that's it.
    What? What about when you're storing your data as XML, what about RMI purposes, etc., each needing type validation? Structural validation alone limits what you can then use XML for. If you're looking into things like NXD, type validations are a must. Ilya
  16. XSD really difficult?

    I don't think that XSD is so difficult. I started using W3C Schemas after 2-3 hours of reading PowerPoint slides, and it seems to me like programming in an OO fashion. If you have to write a really detailed definition, would you prefer a "relaxed" solution or a very well-defined one?
  17. who cares?

    Well, the problem is that if you have to work with one of the zillions of XML-based standards, you have no choice _and_ don't care: the spec is defined in XML Schema and the schema is already written for you, so you don't even need to read the XSD; you just need to read their spec in PDF. For the rest of us, we don't care. I have never seen anyone in any shop writing any kind of formal grammar definition for their XML documents: no Schema, no RELAX-NG, no DTD. The structure is interpreted by the code, and the only "declaration" found in the XML documents is "<?xml encoding='utf-8'?>". There is no need to read/write any schema. If you can't decode the structure from some Word document or example XML files, go read the code! (I'm sure some people do write schemas, but they are a rare species.) So who cares?
  18. Re: who cares?

    For the rest of us, we don't care.
    90% of people do not care, so they would use whatever the creative 1% come up with and caring 9% support.
    I have never seen anyone in any shops writing any kind of formal grammar definition for their XML documents: no Schema, no RELAX-NG, no DTD.
    I have seen people writing schemas and cursing along the way. Of course I have seen schema-less documents, but that was mostly because people gave up on schemas for various reasons. In general they wish they could write one, if it were not soooo cumbersome.
  19. waste of time

    XML Schema is adequate. Stop wasting people's time.
  20. Re: waste of time

    XML Schema is adequate. Stop wasting people's time.
    And DTDs were perfectly adequate before XML Schema. So why did people waste time with W3C Schema? If you want to clank along with your square tires, go ahead, but don't whine if others want a smoother ride: just go back to wasting your own time and let us stop wasting ours. Truth is, W3C Schema is as good and sensible a choice for XML validation as C++ is for programming. Schema may have its place in data binding (after all, most of its cumbersome features were added to deal with OO mapping)... but for other uses it's woefully inefficient. RelaxNG handles structural validation (and, by borrowing the core W3C Schema datatype library, basic value validation as well) quite nicely. Plus its compact syntax is pure bliss compared to ugly XML representations (although the RNG XML serialization itself is already a big improvement over XSD). One missing piece is still the data binding part: it'd be nice to have a companion to RelaxNG that tackled Object/XML mapping.
  21. JAXB 2 and RELAX-NG

    fyi: JAXB 2.1 EA2 has partial support for RELAX-NG:
    -relaxng : treat input as RELAX NG (experimental,unsupported)
    -relaxng-compact : treat input as RELAX NG compact syntax (experimental,unsupported)
    @see http://forum.java.sun.com/thread.jspa?threadID=790085&tstart=0
    @see http://wiki.java.net/bin/view/Javapedia/RELAX-NG
  22. Relax but not too much....

    If W3C Schemas are so badly defined, why are we talking about them? Maybe because W3C Schemas are used as the principal schema language by most of the products currently on the market? Why has the JAXB committee chosen W3C Schemas...???
    People are dissatisfied with DTDs....
    * It's a different syntax. You write your XML (instance) document using one syntax and the DTD using another syntax --> bad, inconsistent.
    * Limited datatype capability. DTDs support a very limited capability for specifying datatypes. You can't, for example, express "I want the element to hold an integer with a range of 0 to 12,000".
    * Desire for a set of datatypes compatible with those found in databases. DTD supports 10 datatypes; XML Schema supports 44+ datatypes.
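    As a rough illustration of the range constraint mentioned above (an integer from 0 to 12,000), the sketch below declares it as an `xs:integer` restriction and checks instances with the JDK's built-in `javax.xml.validation` API. The element name `quantity` and the class and method names are assumptions made up for this example; they do not come from any real specification.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical example class; "quantity" is an invented element name.
final class RangeTypeSketch {

    // An integer restricted to the inclusive range 0..12000.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "  <xs:element name='quantity'>"
        + "    <xs:simpleType>"
        + "      <xs:restriction base='xs:integer'>"
        + "        <xs:minInclusive value='0'/>"
        + "        <xs:maxInclusive value='12000'/>"
        + "      </xs:restriction>"
        + "    </xs:simpleType>"
        + "  </xs:element>"
        + "</xs:schema>";

    /** Returns true if the XML text is valid against the schema above. */
    static boolean isValid(String xml) {
        Schema schema;
        try {
            // Compile the schema; a failure here is a bug in the sketch itself.
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("Schema did not compile", e);
        }
        try {
            // With no ErrorHandler set, the validator throws on the first error.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // malformed or schema-invalid instance
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<quantity>12000</quantity>"));
        System.out.println(isValid("<quantity>12001</quantity>"));
    }
}
```

    A DTD can only say that `quantity` contains character data, so the same check would have to live in application code.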