Where XML goes astray... and when you should or shouldn't use it

News: Where XML goes astray... and when you should or shouldn't use it

  1. Derek Denny-Brown has been involved with XML since before XML existed. He has taken a look back and written up his thoughts on where he would like to go back in time and change things.

    He discusses allowed characters, whitespace, XML namespaces, and more.
    First, some background: XML was originally designed as an evolution of SGML, a simplification that mostly matched a lot of then-existing common usage patterns. Most of its creators saw XML as evolving and expanding the role of SGML, namely text markup. XML was primarily intended to support taking a stream of text intended to be interpreted as a human-readable document, and delineating portions according to some role. This sequence of characters is a paragraph. That sequence should be displayed with a link to some other information. Et cetera, et cetera. Much of the process of defining XML was based on the assumption that the text in an XML document would eventually be exposed for human consumption. You can see this in the rules for what characters are allowed in XML content, what are valid characters in Names, and even in "</tagname>" being required rather than just "</>".

    Read Where XML goes astray...

    Dare Obasanjo also got into the mix, and posted criteria on when you should or shouldn't be using XML.
    XML is the appropriate tool for the job if the following criteria are satisfied by choosing XML as the data representation format for a given application.

    1. there is a need to interoperate across multiple software platforms
    2. one or more of the off-the-shelf tools for dealing with XML can be leveraged when producing or consuming the data
    3. parsing performance is not critical
    4. the content is not primarily binary content such as a music or image file
    5. the content does not contain control characters or any other characters that are illegal in XML

    If the expected usage scenario does not satisfy most or all of the above criteria then it doesn't make much sense to use XML as the data representation format for the situation in question.
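
Criterion 5 can be checked mechanically. The sketch below tests a string against the `Char` production of the XML 1.0 specification (tab, line feed, carriage return, and the ranges U+0020 to U+D7FF, U+E000 to U+FFFD, and U+10000 to U+10FFFF). The function names are illustrative and are not taken from either article.

```python
from typing import List, Tuple


def is_xml10_char(ch: str) -> bool:
    """Return True if the single character is allowed by the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def find_illegal_xml_chars(text: str) -> List[Tuple[int, str]]:
    """Return (index, character) pairs for every character XML 1.0 forbids."""
    return [(i, ch) for i, ch in enumerate(text) if not is_xml10_char(ch)]
```

Data that fails this check (a NUL byte or an ESC character, for instance) cannot appear in an XML 1.0 document even as a character reference, so it would need some other encoding, such as Base64, before being embedded.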

    The XML Litmus Test: Understanding When and Why to Use XML

    Threaded Messages (38)

  2. #1 there is a need to interoperate across multiple software platforms
    Let's simply fix the firewall stupidity and allow the IIOP/Naming ports (much less pain than the whole XML stack); this way we will have near-perfect interoperability between various platforms.

    The rest stands: no human consumption, no XML.
  3. #1 there is a need to interoperate across multiple software platforms
    Let's simply fix the firewall stupidity and allow the IIOP/Naming ports (much less pain than the whole XML stack); this way we will have near-perfect interoperability between various platforms. The rest stands: no human consumption, no XML.
    I see no technical problem with the "firewall" either; it looks like a political motive to sell some workaround.
  4. Plenty of CORBA ORBs handled this through smart tunneling of IIOP inside HTTP GET requests. So the GET becomes something like

    GET HIOP 1.0 /0xFFAAEE

    Blah blah. Now the firewall can't tell hex-encoded IIOP from any other GET request. Ta-da, you beat the packet filtering of the firewall.

    Dave Wolf
    Cynergy Systems
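
As a rough illustration of the tunneling idea in the post above, the sketch below hex-encodes an arbitrary binary payload into the path of an ordinary HTTP request line and decodes it again. The function names and the request-line layout are hypothetical; this is not any ORB's actual wire format, and the payload bytes simply mirror the `0xFFAAEE` placeholder from the post.

```python
import binascii


def encode_request_line(payload: bytes) -> str:
    """Wrap a binary payload as hex text inside an HTTP-style GET request line."""
    hex_payload = binascii.hexlify(payload).decode("ascii").upper()
    return f"GET /0x{hex_payload} HTTP/1.0"


def decode_request_line(line: str) -> bytes:
    """Recover the binary payload from a request line built by encode_request_line."""
    method, path, _version = line.split(" ")
    if method != "GET" or not path.startswith("/0x"):
        raise ValueError("not a tunneled request line")
    return binascii.unhexlify(path[3:])


line = encode_request_line(b"\xff\xaa\xee")
print(line)
```

To a filter that inspects only the port and the request method, such a line is indistinguishable from any other GET, which is the point the post makes.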
  5. What happened to messaging?

    I totally agree that XML has been overused to a large extent. We have somehow failed to grasp the nature of our problems and have used XML as a magic bullet for all sorts of interchanges.
    The J2EE community could have concentrated on making JMS an industry standard, but it never happened. JMS (object-based communication) ended up being an underdog to MQ and MSMQ.
  6. What happened to messaging?

    I find it obvious that JMS is a spec designed to extend Java's reach as a development platform to existing and established messaging products such as those you've mentioned.
    I also think there's nothing wrong with that (and no, I don't sell or develop any *MQ).
    It's also 98% CORBA (http://java.sun.com/products/jms/faq.html#corba_alignment), i.e. based on existing specs.
  7. What happened to messaging?

    It's also 98% CORBA (http://java.sun.com/products/jms/faq.html#corba_alignment), i.e. based on existing specs.

    XML and CORBA don't lie in the same layer; they don't compete with each other. Intra-component communication is what CORBA addresses, and inter-service communication (including component services) goes through XML.


    Regards
    Surajeet
  8. layers

    It's also 98% CORBA (http://java.sun.com/products/jms/faq.html#corba_alignment), i.e. based on existing specs.
    XML and CORBA don't lie in the same layer; they don't compete with each other. Intra-component communication is what CORBA addresses, and inter-service communication (including component services) goes through XML. Regards, Surajeet

    A CORBA remote object reference is a layer 7 (application layer) concept just like a service end point. XML and CORBA overlap at layer 6 (presentation/marshalling). SOAP/XML and CORBA overlap at layers 6 and 7.

    Paul C.
  9. layers

    Hi Paul,

    That might be true. My understanding is that there are different scenarios based on the size, scale, and purpose of the project.
    To use CORBA, I need to know which functions are to be written in IDL, and this is the first stage of development. Then I produce stubs, skeletons, proxies, etc., which are specific to the client-server scenario. So the IDL is the central thing, and the stubs and skeletons are derived from it.
    What if I am not aware of the central thing called IDL, but decide to communicate in ASCII if communication is required in the future? This is where XML can play a role.

    Regards
    Surajeet
  10. Hi Paul, That might be true. My understanding is that there are different scenarios based on the size, scale, and purpose of the project. To use CORBA, I need to know which functions are to be written in IDL
    I would say it is generally a good idea to know what you are calling :)
    and this is the first stage of development. Then I produce stubs, skeletons, proxies, etc., which are specific to the client-server scenario.
    Wasn't WSDL added exactly to catch up with CORBA and provide a machine-readable spec of the running service?

    So the IDL is the central thing, and the stubs and skeletons are derived from it. What if I am not aware of the central thing called IDL, but decide to communicate in ASCII if communication is required in the future? This is where XML can play a role. Regards, Surajeet

    Strictly speaking, it is not necessary to start from IDL and generate all those skeletons and stubs (although it is very convenient). Feel free to use CORBA's Dynamic Invocation Interface (DII) and construct your calls yourself: http://www.cuj.com/documents/s=8244/cujcexp2101vinoski/

    I have never used DII myself, but it is possible nonetheless :)
  11. layer

    XML and CORBA don't lie in the same layer; they don't compete with each other. Intra-component communication is what CORBA addresses, and inter-service communication (including component services) goes through XML. Regards, Surajeet
    Of course not! XML is just an alphabet (like Latin), not even a language. Multiple schemas define XML languages (SOAP, ebXML, etc.), which are like English, German, Italian, etc.
  12. #1 there is a need to interoperate across multiple software platforms
    Let's simply fix the firewall stupidity and allow the IIOP/Naming ports (much less pain than the whole XML stack); this way we will have near-perfect interoperability between various platforms. The rest stands: no human consumption, no XML.

    Basically you are right. A firewall that allows ONLY port 80 can't do shit against attacks based on XML-based RPC. All security threats apply to XML-based RPC just as they do to any other remoting mechanism. So the security of port 80 behind a firewall is a myth when it comes to web services and XML-based remoting. We might as well open the RMI/IIOP, CORBA, or whatever other ports and go ahead with 'classic' remoting where possible, and use XML only for the 10% of goddamn interoperability cases where it is sooo good.
  13. Two more scenarios for XML

    1) If you don't know how to write a parser;
    2) If you don't know how to use a debugger and know only how to print something to the console

    then you should use XML.

    These are THE true reasons XML became popular, IMHO.
  14. Two more scenarios for XML

    3) If you do not like overcomplicating your project by writing a parser, and you are not ignorant of others' ideas and skills
    4) If you are wondering what the hell a debugger and printing to the console have to do with using XML

    Why such a sudden bashing of XML? This keeps coming up over and over on TSS. What don't you really like about XML, huh? Anybody? Do you have any concrete facts where your project has failed by using XML as one of the implementing technologies? Please tell me, or otherwise all this is just blowing hot air.

    Regards,

    Artem D. Yegorov
    http://www.activexml.org
  15. Yeah, great...

    Yeah, you're right, let's just come back to a time when data is impossible to transform or understand because of proprietary formats you have to parse yourself. It's not that I don't know how to write a parser, I've written quite a few, it's that I don't want to write a parser anymore, because it doesn't give me anything more than parsing XML. XML is chatty? Use short tag names, zip it, do whatever you want, but don't add data formats over data formats over data formats that we'll all have to parse with custom code. Anytime you change things, you always have drawbacks, and you always have lots of people telling you how it was better before, blablabla, and how the new thing is rubbish. Of course, XML is not a golden hammer, and the rules given by Dare seem very good to me, but don't throw it away for a new war of data formats.

    My 2 cents anyway :)
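
To put rough numbers on the "zip it" suggestion above, the following sketch builds a deliberately repetitive XML document and compares its raw size with its gzip-compressed size. The element names and the record count are made up for illustration; real savings depend entirely on the data.

```python
import gzip


def build_sample_xml(record_count: int) -> bytes:
    """Build a verbose, repetitive XML document with long tag names."""
    rows = "".join(
        f"<customerRecord><customerIdentifier>{i}</customerIdentifier>"
        f"<customerStatus>active</customerStatus></customerRecord>"
        for i in range(record_count)
    )
    return f"<customerRecords>{rows}</customerRecords>".encode("utf-8")


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (smaller is better)."""
    return len(gzip.compress(data)) / len(data)


sample = build_sample_xml(500)
ratio = compression_ratio(sample)
print(f"raw={len(sample)} bytes, gzip ratio={ratio:.2f}")
```

Because the long tag names repeat on every record, a general-purpose compressor removes most of their cost on the wire, although the receiver still has to decompress and parse the full text.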
  16. Re: Yeah, great...

    I think nobody wanted to start a data formats war.

    Also, I don't understand why you bring proprietary formats into the discussion. Do you mean that CORBA/IIOP is a closed format? Well, that can't be, since it is standardized.

    About XML: I also think it is overused these days.
    I can agree with the argument that it allows communication in a heterogeneous world, and maybe even makes it easier with firewalls.

    But, what else ?

    -> XML is human readable

    Well, not necessarily. Look at a SOAP message for a complex service and tell me if it is REALLY readable. Agreed, it is more readable than an IIOP conversation. All the applications I've seen so far use machine-to-machine communication, so what's really the point of making the format human-readable? BTW, using short tag names and zipping won't make it more readable.

    -> Reusing parser

    That's true, but you still need to write some lines of code to handle your SAX stream or DOM tree. Of course, you can always use libraries like XMLBeans.

    -> Flexible

    For sure, XML is flexible. But again, is that always what you want from a computer-readable format? Here I'm referring to the way XML handles whitespace. In a machine-to-machine exchange, whitespace is most of the time garbage that wastes bandwidth and needs extra code to be skipped.

    We could discuss for hours about the pros and cons of using XML. As you said, XML is not a golden hammer. It fits some jobs but not others.

    Regards,
    Sebastien.
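
The whitespace point can be seen with Python's standard `xml.etree.ElementTree` parser, which reports the indentation between elements as character data when no DTD says otherwise. This is a small sketch; the `order` document and the helper name are invented for the example.

```python
import xml.etree.ElementTree as ET

# The same data, once pretty-printed and once with no inter-element whitespace.
PRETTY = """<order>
    <item>widget</item>
    <item>gadget</item>
</order>"""

COMPACT = "<order><item>widget</item><item>gadget</item></order>"


def whitespace_only_text_nodes(root) -> int:
    """Count text/tail chunks that consist solely of whitespace."""
    count = 0
    for elem in root.iter():
        for chunk in (elem.text, elem.tail):
            if chunk and not chunk.strip():
                count += 1
    return count


print(whitespace_only_text_nodes(ET.fromstring(PRETTY)))
print(whitespace_only_text_nodes(ET.fromstring(COMPACT)))
```

The pretty-printed form carries whitespace-only chunks that a consumer has to skip or strip, while the compact form carries none; both encode the same two items.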
  18. Yeah, great...

    XML is a nice common format for a lot of things today and is what future documents will be formatted in.

    As "data" (going back to the discussion), there are things about XML that don't seem as nice, e.g. as a database. Using SAX, you can sort of "query" an XML document (without loading the whole "database" into memory as with DOM), but you are basically searching for tags. With a database, you don't need the redundancy, and everything under a column represents the same thing; there is an "invisible structure" and very fast indexing going on that parsers and readers cannot beat.

    Much configuration is "data". Sometimes I wonder if developers should be using two databases: a configuration/data dictionary database controlled by the system, and an application database. Runtime updating is faster and a little easier (a single SQL statement) as opposed to a bunch of API calls or an unfamiliar language (probably also in XML).

    In a sense, configuration files (like properties and xml) are a kind of compromise to not giving out an admin GUI. The plain text *is* the UI for configuration. A good application won't force the users to configure a lot of unfamiliar files but provide some admin UI instead. (compare benefits of tools like webmin with old school configuration)
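
The SAX-style "query" described above might look like the following sketch, which streams through a document and collects the text of matching elements without building a tree. It uses Python's standard `xml.sax` module; the `TagCollector` class, the `find_tag_text` helper, and the sample `catalog` document are hypothetical.

```python
import xml.sax


class TagCollector(xml.sax.ContentHandler):
    """Collect the text content of every element with a given tag name."""

    def __init__(self, wanted_tag):
        super().__init__()
        self.wanted_tag = wanted_tag
        self.matches = []
        self._buffer = None  # None while outside a wanted element

    def startElement(self, name, attrs):
        if name == self.wanted_tag:
            self._buffer = []

    def characters(self, content):
        # The parser may deliver text in several pieces, so accumulate them.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == self.wanted_tag and self._buffer is not None:
            self.matches.append("".join(self._buffer))
            self._buffer = None


def find_tag_text(xml_bytes, tag):
    """Stream through the document and return the text of each matching element."""
    handler = TagCollector(tag)
    xml.sax.parseString(xml_bytes, handler)
    return handler.matches


CATALOG = b"<catalog><book><title>A</title></book><book><title>B</title></book></catalog>"
print(find_tag_text(CATALOG, "title"))
```

Every such lookup is a linear scan of the whole document, which is the contrast with an indexed database column that the post draws.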
  19. Yeah, great...

    A good application won't force the users to configure a lot of unfamiliar files but provide some admin UI instead. (compare benefits of tools like webmin with old school configuration)
    "Old school configuration" enables various UIs, including Webmin, but does not mandate them. (Thanks! Thanks! Thanks!)
    It is much easier to deal with an old config format than with XML, where the actual data has to be enclosed in CDATA. And I get burned from time to time because the system picks yet another XML parser that decides to validate something on its own, or is simply buggy.
    XML is a nice common format for a lot of things today and is what future documents will be formatted in.

    XML parsers today are slow and buggy, which probably indicates that XML as a format is not that nice (especially with all the recent additions).
  20. Yeah, great...

    "old school configuration" enables various UIs including Webmin but does not mandate it.(thanks! thanks! thanks!)

    Using XML as your configuration backend does not mandate anything. Moreover, you can much more easily manage your front-end admin tools through transformations and templating, and you at least have some way of validating the configuration (DTD or XSD).

    There are plenty of tools with front-end admin tools where configuration is stored in XML (BEA Weblogic AS is a good example).
    It is much easier to deal with old config format than with XML where actual data has to be enclosed in CDATA.

    Could you provide a configuration example where you need to use CDATA to enclose something?
    Aren't configuration files supposed to be lightweight and readable? Anything that needs to be inside CDATA does not strike me as simple and lightweight. But if you could provide an example, it may clear things up. And which of the old configuration formats are you talking about? Yours, mine, or any of the other application-specific formats? Even Properties and ResourceBundle do not really abide by any format; you can screw up and make the files behind the scenes unreadable as well.
    XML parsers today are slow and buggy, it probably indicates that XML as a format is not that nice ( especially with all the recent additions)

    Could you elaborate on which parsers are buggy? Since you make the statement, you should be able to provide some concrete examples.

    Regards,

    Artem D. Yegorov
    http://www.activexml.org
  21. Yeah, great...

    Could you provide a configuration example where you need to use CDATA to enclose something?
    iBatis config files as one example.
    Pretty much anything that may have < > inside.
    Could you elaborate on which parsers are buggy? Since you make the statement, you should be able to provide some concrete examples.

    Crimson – I had to add special provision to Ant build file to make sure it is nowhere; it cannot resolve _nonexistent_ references in config files.
  22. Yeah, great...

    I might be mistaken, but that does not seem like a lot to be so much against XML technology. Maybe you should reevaluate which frameworks you are using to work with XML.

    It is not a problem with XML; it is a problem with the implementations that work with it, and not all of them are bad. JDOM is a very well-built parser and can meet almost any requirement you may have in working with XML. I've been using it since beta 8 and even then never had problems.
    iBatis config files as one example.
    Pretty much anything that may have < > inside.

    What would you suggest as an alternative? Let's say for SQLMaps? BTW, do you have an example where you enclosed anything in the iBatis config file in CDATA?
    Crimson – I had to add special provision to Ant build file to make sure it is nowhere; it cannot resolve _nonexistent_ references in config files.

    Would you rather see Ant build scripts defined in an "old school" configuration format? What would such alternative look like? Something tells me it would be much harder to read, given all the dependencies one has to specify.

    It is interesting when one technology is bashed, nobody concerns themselves with evaluating cons of the technology that it is being compared against. I bet you had problems with old format configuration files and still do, but saying that would make the rest of the arguments against using XML obsolete.

    Regards,

    Artem D. Yegorov
    http://www.activexml.org
  23. Ant & XML

    Would you rather see Ant build scripts defined in an "old school" configuration format? What would such alternative look like? Something tells me it would be much harder to read, given all the dependencies one has to specify.

    you might want to read http://www.theserverside.com/news/thread.tss?thread_id=24864

    to quote:
    James Duncan Davidson, the creator of Ant, has written about the very beginnings of the project. This history lesson shows us how Ant started out with properties files, and grew to XML... and how in many ways he wished it hadn't.
  24. Ant & XML

    Did you read the article behind the post?

    Article excerpt:
    When there were 10 or 20 keys in the build file, this wasn't so bad. But as projects grew in complexity, editing these files became an exercise in managing the visual noise created by all the repeated initial parts of property keys. It was clear to me that using properties for the build file syntax just wasn't sustainable in the long term.

    James is not saying that XML is bad for configuration; he is saying in general that it is not the right technology to use as the backend for a scripting language. I quote:
    In retrospect, and many years later, XML probably wasn't as good a choice as it seemed at the time. I have now seen build files that are hundreds, and even thousands, of lines long and, at those sizes, it turns out that XML isn't quite as friendly a format to edit as I had hoped for. As well, when you mix XML and the interesting reflection based internals of Ant that provide easy extensibility with your own tasks, you end up with an environment which gives you quite a bit of power and flexibility of a scripting language—but with a whole lot of headache in trying to express that flexibility with angle brackets.

    Size is a concern, but then multiple build files help resolve that and structure your project's build process much better. I mean, a Java class with thousands of lines of code is as unreadable as an XML file, a property file, or even a script-based alternative of that size. Verbosity is a tradeoff for readability, and again multiple build files help.

    The property file format would have been even worse, since it would be flat and a given developer would have no clue about the dependencies and hierarchies of the project build flow.

    James points out that he should have:
     tried using a real scripting language, such as JavaScript via the Rhino component or Python via JPython, with bindings to Java objects which implemented the functionality expressed in todays tasks

    Would you like that alternative better? I wonder how much more unreadable that would have been. Given the abuse that always surfaces, it would be as disastrous as some of the multi-million-line JSP-based applications that are just impossible to read with all the Java, HTML and JavaScript mixed together.

    From what the article says, Ant's original intention was abused, turning it into a scripting environment rather than a project build structure; scripting was not an intended feature.
    Now, I never intended for the file format to become a scripting language—after all, my original view of Ant was that there was a declaration of some properties that described the project and that the tasks written in Java performed all the logic. The current maintainers of Ant generally share the same feelings. But when I fused XML and task reflection in Ant, I put together something that is 70-80% of a scripting environment. I just didn't recognize it at the time. To deny that people will use it as a scripting language is equivalent to asking them to pretend that sugar isn't sweet.

    So, why blame any technology, in this case XML, because of an obvious developer's abuse? Isn't it really up to the developer or architect to decide what technology is applicable and where?
    I think it is, and therefore there is nothing wrong with XML, be it in Ant or anywhere else where APPLICABLE. But developer abuse will always be there, and we will always have people who complain that a given technology fails to work outside of its defined paradigm! Duh! No technology is a panacea for every little problem one may have. Being a developer means being able to see how to use a healthy blend of available technologies effectively to deliver stable, scalable and usable applications.

    BTW, the article does not suggest any concrete alternative, just a variety of possible choices. It is interesting how XML gets blamed all the time for not doing this, that, and God knows what, but nobody, now or ever, has been able to present a decent alternative for doing all of the things everybody complains about. I have not seen a single concrete example, just bashing, bashing and more bashing...

    Regards,

    Artem D. Yegorov
    http://www.activexml.org
  25. Ant & XML

    No technology is a panacea for every little problem one may have.
    Sure, but some invite and encourage problematic and wrong practices.

    XML is not simple anymore, and the bugginess of XML parsers proves it. The XML promise was: you do not need to write your own parsers anymore, just use a 'standard' XML parser and be happy. What a joke!
  26. Ant & XML

    Sure, but some invite and encourage problematic and wrong practices.

    No technology invites and definitely does not encourage any bad practice. It is all in developer's head, developer makes the wrong choices and developer writes buggy implementations and bad products, the technology has nothing to do with it. If a developer does not know how to use the technology, learn, or don't use it, but do not blame it for your mistakes. What's that Russian proverb:
    "Bad dancer always blames the dancing shoes".

    Sounds a lot like it.
    XML is not simple anymore, and bugginess of XML parsers prove it. XML promise was: you do not need to write own parsers anymore, just use 'standard' XML parser and be happy. What a joke!

    Why do you write your own parsers? I mean, seriously, why? Maybe you are misusing XML for something it is not fit for? What is a situation where you need to do that? You keep saying it, but I fail to grasp the case where it is so necessary.

    And what is that about promises? They are being fulfilled, technology is advancing, and bashing it does not help. It really sounds kind of childish: "Oh, XML promised this and that, so I am not going to play with it anymore." We are all part of the developer community; if XML has not lived up to its promise yet, let's help it do so, not run it into the ground.

    Regards,

    Artem D. Yegorov
    http://www.activexml.org
  27. Ant & XML

    James Duncan Davidson, the creator of Ant, has written about the very beginnings of the project. This history lesson shows us how Ant started out with properties files, and grew to XML... and how in many ways he wished it hadn't.
    And James apologized for Jelly: http://www.jroller.com/comments/lsd?anchor=james_strachan_apologizes_for_jelly
  29. Yeah, great...

    BTW, do you have an example where you enclosed anything in the iBatis config file in CDATA?
    Taken from the iBatis tests: see the "ACC_ID < 2", which calls for CDATA for the sake of human readability.

    <select id="getFewAccountsViaResultMap"
        resultMap="account-result">
        <![CDATA[
        select * from ACCOUNT
        where ACC_ID < 2
        order by ACC_ID
        ]]>
    </select>
    Would you rather see Ant build scripts defined in an "old school" configuration format? What would such alternative look like? Something tells me it would be much harder to read, given all the dependencies one has to specify.

    Well, for a long time I was pretty happy with make. Only OS independence brought me to Ant. Of course, over time Ant became more convenient because it has specialized and optimized tasks, but initially I missed the ability to write macros.
    IMO: Ant build script could be
    1. a plain Java class – that will automatically enable attributes auto completion etc. when used from IDE.
    2. Jython script.

    PS: There are too many post duplicates these days on TSS, looks like resubmit control is gone, please click once.
  30. Yeah, great...

    Isn't it cleaner and simpler, instead of the following:
    <select id="getFewAccountsViaResultMap"
        resultMap="account-result">
        <![CDATA[
        select * from ACCOUNT
        where ACC_ID < 2
        order by ACC_ID
        ]]>
    </select>

    to have, IMO:

    <select id="getFewAccountsViaResultMap"
        resultMap="account-result">
        select * from ACCOUNT
        where ACC_ID < 2
        order by ACC_ID
    </select>

    I mean, we all have known for a long time now what the < and > are. It is not that tough to read.

    But I see your point there.

    Could have been something like:

    select.getFewAccountsViaResultMap.sql=select * from ACCOUNT where ACC_ID < 2 order by ACC_ID
    select.getFewAccountsViaResultMap.resultMap=account-result

    But this is a simple example of iBatis use, given a more complicated one, the properties alternative turns out to be quite ugly, IMO.
    a plain Java class – that will automatically enable attributes auto completion etc. when used from IDE.

    Idea of an API-based project build automation, where one would have to write a Java class to describe a flow of a build process is not bad and frankly, if API is easy to follow and use, that could be a sound alternative. So I'm in agreement here.

    Anyway, I just think that each technology has its place, and the better we use a blend of what is out there in our development efforts, the better off we will be ourselves.

    Regards,

    Artem D. Yegorov
    http://www.activexml.org
  31. Yeah, great...

    All the < > are really &lt; and &gt;

    Regards,

    Artem D. Yegorov
    http://www.activexml.org
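
The equivalence the posts above are circling can be checked directly: a CDATA section and the `&lt;` entity reference produce the same character data once parsed, while a bare `<` in element content is rejected as not well-formed. The sketch uses Python's standard `xml.etree.ElementTree`; the trimmed-down `select` elements are adapted from the iBatis snippet quoted earlier, and the helper name is made up.

```python
import xml.etree.ElementTree as ET

WITH_CDATA = '<select id="q"><![CDATA[select * from ACCOUNT where ACC_ID < 2]]></select>'
WITH_ENTITY = '<select id="q">select * from ACCOUNT where ACC_ID &lt; 2</select>'
WITH_RAW_LT = '<select id="q">select * from ACCOUNT where ACC_ID < 2</select>'


def parsed_text(document: str):
    """Return the element's text, or None if the document is not well-formed."""
    try:
        return ET.fromstring(document).text
    except ET.ParseError:
        return None


for label, doc in [("CDATA", WITH_CDATA), ("entity", WITH_ENTITY), ("raw <", WITH_RAW_LT)]:
    print(label, "->", parsed_text(doc))
```

So the choice between CDATA and entity references is purely about how the source file reads to a human; the application receives the same SQL string either way.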
  32. Yeah, great...

    All the < > are really &lt; and &gt; Regards, Artem D. Yegorov http://www.activexml.org
    We all know those stupid abbreviations for stupid XML parsers. That is the point: why should we have to study such idioms?
    Code with all those amps is just not easily readable by anybody but XML drones.
  33. Yeah, great...

    We all do know that stupid abbreviations for stupid XML parsers.

    Parsers cannot be stupid; they have no capability of thinking or making decisions. They are all just APIs, not AIs. :-) And stupid abbreviations become less stupid when you learn their full meaning. Learn is the word!
    Code with all those amps is just not easy readable by anybody but XML drones.

    You are not capable of remembering 5 embedded entity references? Do not tell me that you get so confused when you see amp or lt or gt. What is so unreadable? You are fine reading IMO or IMHO; these are abbreviations too, so what's wrong with the other ones? But you would say it is a different thing, right? :-)

    There are a ton of examples of different abbreviations, and the human brain is just fine interpreting them; you do not even notice the process after some time, IMHO. :-)

    Regards,

    Artem D. Yegorov
    http://www.activexml.org
  34. Yeah, great...

    ==You are not capable of remembering 5 embedded entity references?

    I am not if I do not see a good reason for that.

    ==Do not tell me that you get so confused when you see amp or lt or gt.
    ==What is so unreadable?

    Sorry, I am confused. I can catch the meaning of a < b at a glance, but I am really confused when I see a &lt; b; try to add more comparisons ( &lt; a &gt; ) and my brain is toast.

    ==You are fine reading IMO or IMHO, and these are abbreviations; what's wrong with the other ones? But you would say it is a different thing, right? :-)

    - IMO and the like are different because they are made by humans for humans, whereas the &lt; stuff is there to make life easier for the parser, not the human.
    - IMO/IMHO etc. do not break a text processor when written as &IMO;
    - nobody _must_ use IMO; it is very good when stuff is spelled out.
  35. Yeah, great...

    Sorry, I am confused. I can catch the meaning of a < b at a glance, but I am really confused when I see a &lt; b; try to add more comparisons ( &lt; a &gt; ) and my brain is toast.

    I am sorry that you are. Just do not forget that any format will have exceptions: characters that have to be escaped one way or another. There is a lot of stuff in programming languages that is done for the sake of the compiler/parser, not for humans. That's why we learn the syntax.

    But arguing about this stuff is the same as arguing with a person who cannot read: until you learn, the letters, words, phrases and idioms will be confusing and won't mean anything to you.

    Anyway, I think this is getting old and circular...

    Regards,

    Artem D. Yegorov
    http://www.activexml.org
  36. Yeah, great...

    All the < > are really &lt; and &gt;

    Regards,

    Artem D. Yegorov
    http://www.activexml.org
  37. Yeah, great...

    XML is a nice common format for a lot of things today and is what future documents will be formatted in.

    As "data" (going back to the discussion), there are things about XML that don't seem as nice, e.g. as a database. Using SAX, you can sort of "query" an XML document (without loading the whole "database" into memory with DOM), but you are basically searching for tags. With a database, you don't need the redundancy, and everything under a column is going to represent the same thing; there is an "invisible structure" and very fast indexing going on that parsers and readers cannot beat.
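
    To make the SAX "query" idea concrete, here is a rough sketch using Python's standard xml.sax module. The document, its element names and the TotalsByCustomer handler are all hypothetical; the point is only that the handler reacts to tags as they stream past and never holds the whole document in memory.

```python
import io
import xml.sax

# Hypothetical sample data; the element names are made up for illustration.
XML = b"""<orders>
  <order id="1"><customer>Ann</customer><total>10.50</total></order>
  <order id="2"><customer>Bob</customer><total>99.00</total></order>
  <order id="3"><customer>Ann</customer><total>5.25</total></order>
</orders>"""


class TotalsByCustomer(xml.sax.ContentHandler):
    """Sum the <total> values of one customer while streaming the document."""

    def __init__(self, customer):
        super().__init__()
        self.customer = customer
        self.total = 0.0
        self._text = []        # character data collected for the current element
        self._current = None   # customer name of the order being read

    def startElement(self, name, attrs):
        self._text = []

    def characters(self, content):
        self._text.append(content)

    def endElement(self, name):
        value = "".join(self._text).strip()
        if name == "customer":
            self._current = value
        elif name == "total" and self._current == self.customer:
            self.total += float(value)
        self._text = []


handler = TotalsByCustomer("Ann")
xml.sax.parse(io.BytesIO(XML), handler)
print(handler.total)
```

    Note that every such "query" scans the document from start to finish; there is no index to jump to the matching rows, which is the contrast with a database drawn above.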

    Much configuration is "data". Sometimes I wonder if developers should be using two databases: a configuration/data dictionary database controlled by the system, and an application database. Runtime updating is faster and a little easier (a single SQL statement) as opposed to a bunch of API calls or an unfamiliar language (probably also in XML).

    In a sense, configuration files (like properties and XML) are a kind of compromise for not providing an admin GUI. The plain text *is* the UI for configuration. A good application won't force its users to configure a lot of unfamiliar files but will provide some admin UI instead (compare the benefits of tools like webmin with old-school configuration).
  38. Plus 2 more

    #7 - When the content may be added to or its format changed in the future
    #8 - When the content needs to be rearranged or transformed into several other formats, incl. object instances.

    XML's usefulness comes from decoupling content from representation. When that is done, one parser can lexically analyse any representation into a usable graph. You can go straight to the "what" and bypass the "how" completely.

    As mentioned before, without XML, any new representation means new code to parse and transform it, code that must be maintained.

    And with care, XML needn't impose too big a performance penalty. Pull parsers are very fast.
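
    As an illustration of the pull style, here is a small sketch using xml.dom.pulldom from the Python standard library. The feed/item document is invented for the example. The caller asks for events one at a time and only builds a tree for the elements it cares about.

```python
from xml.dom import pulldom

# Invented sample document.
DOC = "<feed><item>one</item><item>two</item><other>skip</other></feed>"

titles = []
events = pulldom.parseString(DOC)
for event, node in events:
    # The application pulls events; nothing is pushed at it through callbacks.
    if event == pulldom.START_ELEMENT and node.tagName == "item":
        # Build a small DOM subtree for this one element only.
        events.expandNode(node)
        text = "".join(
            child.data for child in node.childNodes
            if child.nodeType == child.TEXT_NODE
        )
        titles.append(text)

print(titles)
```

    This sketch says nothing about speed by itself; it only shows that the same parser code works for any element vocabulary, which is the decoupling described above.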

    My 2p,
    Kit
  39. Could someone please explain in more detail the complexities introduced by XML namespaces as mentioned in the article? Specifically, what does the author mean by: "It complicates XML stores, such as DOM implementations, because the XML Namespace specification only discusses parsing XML, and introduces a number of serious complications to edit scenarios. It complicates XML writers, because it introduces new constraints and ambiguities."
    Moreover, when the author says namespaces force parsers to parse the entire start-tag before returning any text information, is he talking about the performance overhead or about something else? Please clarify.

    Thanks,
    Venkat
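
    One plausible reading of the start-tag remark, sketched with Python's xml.etree.ElementTree: a prefix can be used in an element name before its xmlns declaration appears later in the same start-tag, so the full name cannot be resolved until every attribute has been read. The prefix and the urn:example URIs below are made up, and this is an illustration of the namespace rules rather than a statement of what the article's author had in mind.

```python
import xml.etree.ElementTree as ET

# Both start-tags begin with the same characters, "<p:item id=...".
# The namespace bound to "p" is only known once the last attribute is read.
doc_a = '<p:item id="1" xmlns:p="urn:example:a"/>'
doc_b = '<p:item id="1" xmlns:p="urn:example:b"/>'

tag_a = ET.fromstring(doc_a).tag
tag_b = ET.fromstring(doc_b).tag

# ElementTree reports names in {namespace-uri}local-name form.
print(tag_a)
print(tag_b)
print(tag_a == tag_b)
```

    The two elements look identical up to the final attribute, yet they end up with different names, so a namespace-aware parser has to buffer the whole start-tag before it can report even the element name.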