New way of doing XML



  •  Recently in a school project we got the group task to develop a little java application with a defined set of functions. One of the functions was XML import/export, which, being the "project manager" in my group, i assigned to the one i didn't trust doing the more complex tasks. I left the implementation to him, thinking that XML can't be that hard to do in Java.

    Well, the resulting XML looked like this:

    <teachers>
      <id1>
        <name>Foo</name>
      </id1>
      <id2>
        <name>Bar</name>
      </id2>
    </teachers>

    I didn't say anything against using SAX for import, but this format was too much...

    (Oh, and as it is to be expected, he created the XML with

    result += "<id" + teacher.getID() + ">";
    result += "<name>" + teacher.getName() + "</name>";
    result += "</id" + teacher.getID() + ">";

    )
     



  •  Ah, group projects.... how I hated being stuck with clueless people. I suspect most of us did. And if you complained to the faculty you'd get the usual "the whole point is to work as a group, because that's what it'll be like when you get a job". The really sad part is that you get a job and it really is like that.



  • The difference is that in a real-world environment, the clueless people have often already been promoted.



  • @whoever said:

     Recently in a school project we got the group task to develop a little java application with a defined set of functions. One of the functions was XML import/export, which, being the "project manager" in my group, i assigned to the one i didn't trust doing the more complex tasks. I left the implementation to him, thinking that XML can't be that hard to do in Java.

    Well, the resulting XML looked like this:

    <teachers>
      <id1>
        <name>Foo</name>
      </id1>
      <id2>
        <name>Bar</name>
      </id2>
    </teachers>

    I didn't say anything against using SAX for import, but this format was too much...

    (Oh, and as it is to be expected, he created the XML with

    result += "<id" + teacher.getID() + ">";
    result += "<name>" + teacher.getName() + "</name>";
    result += "</id" + teacher.getID() + ">";

    )
     

     

    How would you have done it?



  • I would have changed the requirement to use a database.



  • @DOA said:

     Ah, group projects.... how I hated being stuck with clueless people. I suspect most of us did. And if you complained to the faculty you'd get the usual "the whole point is to work as a group, because that's what it'll be like when you get a job". The really sad part is that you get a job and it really is like that.

     

    Yes, I remember the group project where we met the day before to plan our "presentation" and the one guy said he was not finished because he had been working overtime but had that night off and would be finished in time. He was even asked if he wanted any help but insisted he did not need it. Not only did he not finish... and not show up for the presentation... but he actually went to the registrar that morning and DROPPED the class and didn't bother to tell any of us. The instructor was understanding enough that she graded us on what we had finished ourselves, without counting the missing pieces against us. I found out this was the second time he had done this. He got my "evil eye" every time I saw him after that.



  • @mrprogguy said:

    How would you have done it?

    <teachers>
      <teacher id="1">
        <name>Foo</name>
      </teacher>
      <teacher id="2">
        <name>Bar</name>
      </teacher>
    </teachers>
    like any other civilized human being?
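
    For anyone wanting to produce that structure from Java without gluing strings together, here is a minimal sketch using the JDK's StAX writer. The class name `TeacherXmlWriter`, the `toXml` method and the id-to-name map are assumptions made for illustration; the thread only ever shows `getID()` and `getName()` from the real project.

```java
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class TeacherXmlWriter {

    // Hypothetical helper: writes one <teacher id="..."> element per map entry (id -> name).
    static String toXml(Map<Integer, String> teachers) {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartElement("teachers");
            for (Map.Entry<Integer, String> t : teachers.entrySet()) {
                xml.writeStartElement("teacher");
                xml.writeAttribute("id", String.valueOf(t.getKey()));
                xml.writeStartElement("name");
                xml.writeCharacters(t.getValue()); // the writer escapes & and < for us
                xml.writeEndElement(); // </name>
                xml.writeEndElement(); // </teacher>
            }
            xml.writeEndElement(); // </teachers>
            xml.flush();
            xml.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not write XML", e);
        }
    }

    public static void main(String[] args) {
        Map<Integer, String> teachers = new LinkedHashMap<>();
        teachers.put(1, "Foo");
        teachers.put(2, "Bar");
        System.out.println(toXml(teachers));
    }
}
```

    The element name stays fixed and the id travels as data, so adding a teacher never changes the vocabulary of the document.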


  • I have been in a group project that was supposed to be composed of 3 members, but the third student disappeared shortly after.  We tried to contact him several times by phone and email, unsuccessfully (classes had already ended; we were working on our final projects and exams).  Then we started working without him.  Two months later, on the eve of the presentation at 7 PM, he showed up saying that he had been busy with a traineeship and asked us to put his name on our project anyway.  Our answer was ___ (left as an exercise to the reader).



  • @PSWorx said:

    @mrprogguy said:

    How would you have done it?

    <teachers>
      <teacher id="1">
        <name>Foo</name>
      </teacher>
      <teacher id="2">
        <name>Bar</name>
      </teacher>
    </teachers>

    like any other civilized human being?

     

     Ok, hopefully I'm not just being dumb, but why would you put the ID as a tag parameter, but the name as a separate tag?

    Could you do: <teacher id="1" name="foo">

    or the other way:

     <teacher><id>1</id><name>foo</name></teacher>

     

     



  • @DOA said:

     Ah, group projects.... how I hated being stuck with clueless people. I suspect most of us did. And if you complained to the faculty you'd get the usual "the whole point is to work as a group, because that's what it'll be like when you get a job". The really sad part is that you get a job and it really is like that.


    I remember feeling all smug one time, retorting with: "yeah, but in the real world this dude would have gotten fired."  Oh how naive I was.  In the real world, that guy gets promoted.


  • Preference. W3C states that you should put information in elements, but allows you to put data which can modify a tag's contents in attributes. If later on you decided you wanted to put additional stuff besides 'name' into your XML, then you would be better off having the name as an element as well. It would be horrible to have everything in as attributes.

    So I'd go for the <teacher id="1"><name>foo</name></teacher> 

     

     



  • @EJ_ said:

     Ok, hopefully I'm not just being dumb, but why would you put the ID as a tag parameter, but the name as a separate tag?

    You cannot use CDATA or newlines in attributes, also you can only use one type of quote character.

    On the other hand, attributes take less space, and they also cannot contain subnodes, which makes them preferred when you are limited to plain text or other simple data types such as numbers.



  • TRWTF is XML.



  • @Mole said:

    W3C states that you should put information in elements, but allows you to put data which can modify a tag's contents in attributes.
     

    Really? I've never seen this statement. I didn't think there was any agreed-upon standard for when to use attributes vs. tags, which is one of the reasons I hate doing integration projects - everyone has a different concept of what's "attribute-worthy", and people have absolutely put "data which can modify a tag's contents" into elements, and vice versa.

    That said, PSWorx's approach is probably how I'd do it, too.



  • @SlyEcho said:

    You cannot use CDATA or newlines in attributes, also you can only use one type of quote character.

     

    <statement content="&quot;Only one type of quote character&quot;? I believe that&apos;s just not true." />
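
    The same point can be checked from Java. This is a small sketch, assuming the JDK's built-in DOM and identity transformer: it stores a value containing both quote characters in an attribute, serializes the document, parses it back and returns what came out. The class and method names (`AttributeQuotes`, `roundTrip`) are made up for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class AttributeQuotes {

    // Hypothetical helper: writes `value` into an attribute, serializes, re-parses, reads it back.
    static String roundTrip(String value) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.newDocument();
            Element statement = doc.createElement("statement");
            statement.setAttribute("content", value); // no manual escaping here
            doc.appendChild(statement);

            // An identity transform turns the DOM tree into text; the serializer does the escaping.
            StringWriter out = new StringWriter();
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            transformer.transform(new DOMSource(doc), new StreamResult(out));

            Document parsed = builder.parse(new InputSource(new StringReader(out.toString())));
            return parsed.getDocumentElement().getAttribute("content");
        } catch (Exception e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        String original = "\"Only one type of quote character\"? That's just not true.";
        System.out.println(original.equals(roundTrip(original)));
    }
}
```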



  •  Personally, I think they mean stuff like <title lang="en">My title!</title>



  • @Mole said:

     Personally, I think they mean stuff like <title lang="en">My title!</title>

    Or <title lang="FileNotFound">Brillant!</title>. This is the kind of stuff that XML is good for.


  • @whoever said:

     Recently in a school project we got the group task to develop a little java application with a defined set of functions. One of the functions was XML import/export, which, being the "project manager" in my group, i assigned to the one i didn't trust doing the more complex tasks. I left the implementation to him, thinking that XML can't be that hard to do in Java.

    Well, the resulting XML looked like this:

    <teachers>
      <id1>
        <name>Foo</name>
      </id1>
      <id2>
        <name>Bar</name>
      </id2>
    </teachers>

    I didn't say anything against using SAX for import, but this format was too much...

    (Oh, and as it is to be expected, he created the XML with

    result += "<id" + teacher.getID() + ">";
    result += "<name>" + teacher.getName() + "</name>";
    result += "</id" + teacher.getID() + ">";

    )
     

    Let him create a schema file for his XML ;-)



  • @EJ_ said:

     Ok, hopefully I'm not just being dumb, but why would you put the ID as a tag parameter, but the name as a separate tag?

    Could you do: <teacher id="1" name="foo">

    or the other way:

     <teacher><id>1</id><name>foo</name></teacher>

     

    Typically the idea is that attributes are for data that only exists once.  Sub-elements are typically used when there's repetition.  Your first example would be correct.  A possible benefit to having it as sub-elements would be if you wanted an unlimited number of names for first, last and middle names.

    Either way, both your examples and the OP's example are still much better than what this team member came up with. 



  • @Soviut said:

    Typically the idea is that attributes are for data that only exists once.  Sub-elements are typically used when there's repetition.  Your first example would be correct.  A possible benefit to having it as sub-elements would be if you wanted an unlimited number of names for first, last and middle names.

    Either way, both your examples and the OP's example are still much better than what this team member came up with. 

     

    Another possible advantage of attributes is that most XML parsers allow quicker access to attributes than to children (I think mostly because an attribute has a 0,1 relation with a tag and a child has a 0,* relation to a tag). Furthermore, attributes obviously can't hold child elements and are slightly trickier if they can hold XML special characters (such as &, ", ' etc.).

    But, as is often the case in software development, there is no "correct" way to do it. There are incorrect ways, of which the OP is an obvious example.



  •  Group projects suck, period. In the last semester of my CSc degree I had 3 group projects (3 different classes) and a full course load, and was contracting part-time as a business analyst for a gov agency (a wtf in and of itself). I got good at time management quickly (and my liquor tolerance grew substantially).

    My useless member story:

    One of the group projects was for a system analysis course which required a large paper as the final project (as well as a presentation). System design went fairly well, though one student always lagged behind on comprehension. We got through that fine (though not w/o the prof's help.. he had to confirm a lot of our design decisions as correct with the slower student.. masters student I may add). To be fair to the student, their undergrad was not in CSc and they had no experience doing any sort of data or process design. Having even the trivial design decisions questioned was a PITA though, as it slowed everything down (good experience for real life though).

     

    Once the design was done we had to document and write up the analysis and decisions. Everything should have been straightforward, as everything was documented and just needed to be put into plain English. Initially we divided the writing up; however, the writing of the student we had issues with was so poor that re-writing their work was quicker than editing it (obviously they didn't do English or writing for their undergrad either). Luckily we caught that issue early and re-divided their assigned part. Instead we assigned them to do the index. I had already started the index, so I provided them with a link to a guide on how to create an index and some suggested topics to add. They were to add the topics I had given them and think of some more to fill it out.

     

    A week went by and we heard nothing. Two days before the due date, with some prodding, the student forwarded their work. They hadn't added any topics, but instead defined the existing topics. Basically they attempted to turn the index into a glossary. I say attempted because the definitions were incomplete, wrong or recursive. At least they left the page numbers.

     

    (made up example..)

     

    Data flow - How data flows ........ 5, 15, 25

    Data model - the model of how the datas are .... 30-34

    Full custom system -??? ............. 35, 46

     

    Not quite a paula, but pretty bad.

     

    We were left to fix the issues last minute. The student made no further contact with us after that.  In fact they showed up late for the final (after it had begun) and left before anyone else (presumably to avoid contact with us). Not sure what they are doing now. Most likely a PhD.

     

     

     



  • @EJ_ said:

     Ok, hopefully I'm not just being dumb, but why would you put the ID as a tag parameter, but the name as a separate tag?

    That's the fun of XML.  Everyone has their own rules for deciding between attributes and elements, and some people can't even follow their own rules - creating a hopeless clusterfuck mix of the two. ( e.g. name being an attribute for students, but an element for teachers ).  

     



  • @dtech said:

    But, as is often the case in software development, there is no "correct" way to do it. There are incorrect ways, of which the OP is an obvious example.

    I was using the term "correct" very loosely and in a relative sense.  I just didn't feel like beating around the bush with "your first example is one of several preferred representations".



  • @obediah said:

    @EJ_ said:

     Ok, hopefully I'm not just being dumb, but why would you put the ID as a tag parameter, but the name as a separate tag?

    That's the fun of XML.  Everyone has their own rules for deciding between attributes and elements, and some people can't even follow their own rules - creating a hopeless clusterfuck mix of the two. ( e.g. name being an attribute for students, but an element for teachers ).  

    This is what I've never understood about people saying "The real WTF is XML".  That's not really the problem; it's the people using it.  Everything can be abused: XML, HTML, code, the English language on YouTube and instant messenger, etc.

    Hell, even languages like Python that strip away a lot of ambiguity still leave room for people to invent their own rules.  This is especially true of people coming to Python from statically typed backgrounds.



  • @obediah said:

    That's the fun of XML.  Everyone has their own rules for deciding between attributes and elements, and some people can't even follow their own rules
     

    Is it really a problem? As long as the document either has an understandable schema or is otherwise well (self-)documented, it doesn't really matter (and since XML is highly self-documenting, it's usually not a problem).

    Of course, nothing can withstand this type of idiocy.



  • @dubbreak said:

    I say attempted because the definitions were incomplete, wrong or recursive. At least they left the page numbers.

     (made up example..)

     Data flow - How data flows ........ 5, 15, 25

     

     I always like it when I see that sort of thing. I once looked "accountant" up in a Dutch-English dictionary. The definition given: "Someone who makes a living out of accounting". None the wiser, I looked up accounting: "The work of an accountant".

    I actually still have that dictionary somewhere I think. Useless piece of crap it was indeed.



  • @dtech said:

    I always like it when I see that sort of thing. I once looked "accountant" up in a Dutch-English dictionary. The definition given: "Someone who makes a living out of accounting". None the wiser, I looked up accounting: "The work of an accountant".
     

    Clearly a reverse lookup:

    Accounting accounting = new Accounting();
    Accountant accountant = new Accountant();

    accountant.Profession = accounting;
    accounting.Professional = accountant;



  • @whoever said:

    (Oh, and as it is to be expected, he created the XML with

    result += "<id" + teacher.getID() + ">";
    result += "<name>" + teacher.getName() + "</name>";
    result += "</id" + teacher.getID() + ">";

    )

     

     

    sorry, but this is no WTF for me. string concatenation (i hope i wrote it correctly, i'm not a native speaker) is the fastest and easiest way to generate xml in some cases (i'm tempted to write "in all cases").

     and, maybe even faster compared to the object (/DOM)-based approach (in some languages, at least).



  •  @SEMI-HYBRID code said:

    @whoever said:

    (Oh, and as it is to be expected, he created the XML with

    result += "<id" + teacher.getID() + ">";
    result += "<name>" + teacher.getName() + "</name>";
    result += "</id" + teacher.getID() + ">";

    )

     

     

    sorry, but this is no WTF for me. string concatenation (i hope i wrote it correctly, i'm not a native speaker) is the fastest and easiest way to generate xml in some cases (i'm tempted to write "in all cases").

     and, maybe even faster compared to the object (/DOM)-based approach (in some languages, at least).

    I guess that's why I did in 30 minutes using XML serialization of Java objects what it took other people weeks to do by manually concatenating the fields of their classes.  Plus, every time a new field was added I didn't have to do anything, while they had to modify their XML code.  Obviously at some level creating XML is string concatenation, but doing the concatenation at the same level as your business logic is indeed worse than failure.



  • @SEMI-HYBRID code said:

    sorry, but this is no WTF for me. string concatenation (i hope i wrote it correctly, i'm not a native speaker) is the fastest and easiest way to generate xml in some cases (i'm tempted to write "in all cases").

     and, maybe even faster compared to the object (/DOM)-based approach (in some languages, at least).

     

    It might not be that bad for fixed classes that aren't going to change (such as with a school project). But hey, at least use StringBuilder for readability & performance ;)
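
    A rough sketch of what that could look like in Java, assuming concatenation is the chosen route. The `escape` helper and the `teacherXml` method are hypothetical names invented for this example; the helper only covers the five characters XML predefines entities for and makes no attempt to reject control characters that are illegal in XML.

```java
public class ConcatXml {

    // Hypothetical helper: replaces &, <, >, " and ' with their predefined XML entities.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Builds one <teacher> element; the numeric id needs no escaping, the name does.
    static String teacherXml(int id, String name) {
        return new StringBuilder()
                .append("<teacher id=\"").append(id).append("\">")
                .append("<name>").append(escape(name)).append("</name>")
                .append("</teacher>")
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(teacherXml(1, "Foo"));
        System.out.println(teacherXml(2, "Bar & <Baz>"));
    }
}
```

    Without the escaping step, a name containing `&` or `<` would silently produce a document that no parser accepts.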



  • @SEMI-HYBRID code said:

    sorry, but this is no WTF for me. string concatenation (i hope i wrote it correctly, i'm not a native speaker) is the fastest and easiest way to generate xml in some cases (i'm tempted to write "in all cases").

     and, maybe even faster compared to the object (/DOM)-based approach (in some languages, at least).


    Please do us all a favor and don't ever write another line of code.



  • @dtech said:

    It might not be that bad

    Yes it is bad. Every time. Always.



    Do us all a favor, and leave generating XML to the big boys if you don't understand this topic.



  • @SEMI-HYBRID code said:

    @whoever said:

    (Oh, and as it is to be expected, he created the XML with

    result += "<id" + teacher.getID() + ">";
    result += "<name>" + teacher.getName() + "</name>";
    result += "</id" + teacher.getID() + ">";

    )

     

     

    sorry, but this is no WTF for me. string concatenation (i hope i wrote it correctly, i'm not a native speaker) is the fastest and easiest way to generate xml in some cases (i'm tempted to write "in all cases").

     and, maybe even faster compared to the object (/DOM)-based approach (in some languages, at least).

     

    Dear God almighty, NO!

    Please use concatenation all you like to generate, but please generate something reasonable:

    result += "<id=" + teacher.getID() + ">";
    result += "<name>" + teacher.getName() + "</name>";
    result += "</id>";
    Although I shouldn't complain, tidying up witless developer's XML has kept me in SOA consultancy for a while now...


  • @dtech said:

    @SEMI-HYBRID code said:

    sorry, but this is no WTF for me. string concatenation (i hope i wrote it correctly, i'm not a native speaker) is the fastest and easiest way to generate xml in some cases (i'm tempted to write "in all cases").

     and, maybe even faster compared to the object (/DOM)-based approach (in some languages, at least).

     

    It might not be that bad for fixed classes that aren't going to change (such as with a school project). But hey, at least use StringBuilder for readability & performance ;)

     

    The classes might not change, that's true - but do you want to re-write the xsd schema every time a new record is added? Really? OK, suit yourself!


  • @Mr B said:

    Dear God almighty, NO!

    Please use concatenation all you like to generate, but please generate something reasonable:

    result += "<id=" + teacher.getID() + ">";
    result += "<name>" + teacher.getName() + "</name>";
    result += "</id>";
    Although I shouldn't complain, tidying up witless developer's XML has kept me in SOA consultancy for a while now...

    And how can you beat this then?

    Teacher aTeacher = new Teacher(5, "Bob"); 

    xmlSerializer.Serialize( aTeacher );

    Every time you need to add a new element, just add it as a public property to the Teacher class (which you would have to do anyway). No need to change the code for writing and reading your XML file because the serializer takes care of the mapping. Plenty of attributes out there to cater for your taste of XML.

     



  • @bjolling said:

    And how can you beat this then?

    Teacher aTeacher = new Teacher(5, "Bob"); 

    xmlSerializer.Serialize( aTeacher );

    Every time you need to add a new element, just add it as a public property to the Teacher class (which you would have to do anyway). No need to change the code for writing and reading your XML file because the serializer takes care of the mapping. Plenty of attributes out there to cater for your taste of XML.

     

    Yup, that's fine.  My example was merely demonstrating how to achieve a reasonable xml structure using concatenation. I didn't say it was the best way, just that it was *a* way.



  • @Mr B said:

    My example was merely demonstrating how to achieve a reasonable xml structure using concatenation. I didn't say it was the best way, just that it was *a* way.

    Agreed.  While the DOM way is usually safer and more flexible, from time to time just dumping raw text for something like a test isn't so bad.  Hell, I've used a Django template to generate XML to pass to a Flash app.  I just didn't feel like digging into the Python xml module just to see if Flash was parsing correctly.



  • @SEMI-HYBRID code said:

    @whoever said:

    (Oh, and as it is to be expected, he created the XML with

    result += "<id" + teacher.getID() + ">";
    result += "<name>" + teacher.getName() + "</name>";
    result += "</id" + teacher.getID() + ">";

    )

     

     

    sorry, but this is no WTF for me. string concatenation (i hope i wrote it correctly, i'm not a native speaker) is the fastest and easiest way to generate xml in some cases (i'm tempted to write "in all cases").

     and, maybe even faster compared to the object (/DOM)-based approach (in some languages, at least).

    Except when the guy screws up doing his "standard" XML. I found out the hard way that the "generated" XML output from one of our modules was marked as invalid. Why? Because the original dev forgot to put quotes on attribute values.


  • @Mr B said:

    Please use concatenation all you like to generate, but please generate something reasonable:
    result += "<id=" + teacher.getID() + ">";
    result += "<name>" + teacher.getName() + "</name>";
    result += "</id>";
    Although I shouldn't complain, tidying up witless developer's XML has kept me in SOA consultancy for a while now...

     

     

    Really? you can do that? I had no idea...

     

    <id = 1><name>WTF</name></id>

     



  • @Mr B said:

    result += "<id=" + teacher.getID() + ">";
    result += "<name>" + teacher.getName() + "</name>";
    result += "</id>";
    Although I shouldn't complain, tidying up witless developer's XML has kept me in SOA consultancy for a while now...

    Judging by your little example, I bet someone is saying the same thing about you being witless and 'tidying' up after you.



  •  @Farmer Brown said:

    Please do us all a favor and don't ever write another line of code.

    ...okay, but who will you then rant about?

    Anyway, when writing that post, the first two things I remembered were DOM-based JavaScript and DOM-based PHP XML generation, and I still feel more comfortable using string concatenation instead of that. And I don't get the people who say it's not flexible enough: modifying the code that generates the XML is much the same as modifying a (kind of) template for the XML document.

    In addition, there's a whole lot of dynamic webpages that generate HTML in this manner, and it's even considered to be a common practice. And if it's convenient for HTML, why shouldn't it be for XML?

     and, usually it's even more obvious what the code is doing, and how the final xml file will look, as reading the code is nearly the same as reading the xml itself.

    (which may be just a personal preference, but to me, it's way more comfortable to see the text of xml than to see the dom generation code and have to imagine the nodes nesting and the like...)

    ...or was this just about humiliating the person with least posts? :-)


  • @Soviut said:

    @Mr B said:

    My example was merely demonstrating how to achieve a reasonable xml structure using concatenation. I didn't say it was the best way, just that it was *a* way.

    Agreed.  While the DOM way is usually safer and more flexible, from time to time just dumping raw text for something like a test isn't so bad.  Hell, I've used a Django template to generate XML to pass to a Flash app.  I just didn't feel like digging into the Python xml module just to see if Flash was parsing correctly.

    I see where you're coming from but using the DOM is not only safer and more flexible, but also faster to implement. I've started on a pet project where I have to load a bunch of XML files for analysis. It takes less than 10 minutes (while watching TV) to:

    • Open the XML file in Visual Studio 
    • Click on Generate Schema
    • Run xsd.exe with the /classes option to generate the classes that represent the XML structure
    • Write a unit test and write 4 lines of code to deserialize the file into the above classes
    • Compile
    • ???
    • PROFIT

    Making changes to the data and saving them is now extremely easy. It takes an extra four lines of code to serialize the data to file again. Guaranteed error free. If the XML schema changes, I just need to execute the xsd.exe step again and recompile.

    A "quick test" using manual parsing or string concatenation can never be faster to implement, not even for this simple list of teachers from the OP



  • @Mr B said:

    Dear God almighty, NO!

    Please use concatenation all you like to generate, but please generate something reasonable:

    result += "<id=" + teacher.getID() + ">";
    result += "<name>" + teacher.getName() + "</name>";
    result += "</id>";
    Although I shouldn't complain, tidying up witless developer's XML has kept me in SOA consultancy for a while now...

     

     

    i wasn't talking about the XML itself, i assumed it is obvious that it's WTF, i just focused on the WTFness of the way he generated that (incorrect) xml.

    anyways, your version doesn't seem too much of an improvement to me... the original xml was at least valid... ;-)


  • @SEMI-HYBRID code said:

    i wasn't talking about the XML itself, i assumed it is obvious that it's WTF, i just focused on the WTFness of the way he generated that (incorrect) xml.

    anyways, your version doesn't seem too much of an improvement to me... the original xml was at least valid... ;-)

    No it wasn't. You can't generate a correct schema for it as already pointed out by mihi.


  •  well, okay... I ~~say something stupid~~ learn something new every day, don't I?


  • @SEMI-HYBRID code said:

     well, okay... I ~~say something stupid~~ learn something new every day, don't I?

    Happens to me all the time :o)


  • @SEMI-HYBRID code said:

     well, okay... I ~~say something stupid~~ learn something new every day, don't I?


    Which is exactly why you should not be generating XML yourself. Ever. Use proven strategies and frameworks, not things you reason will be ok.



  • I only ever use string concatenation for very simple (few element, hardly any nesting, no attributes) xml files. I'd never attempt to write a quick and dirty xml parser however.

    For everything else, I go DOM-based. String concatenation just exposes you to too many possible bugs: missing quotes, unterminated quotes, unterminated elements, etc. I know when I spit out the text from the DOM-based code that it will be well-formed. Also, I actually find that code such as teacher.AddAttribute("id",getteacherid()); makes more immediate sense than result += "<teacher id=\""+String(getteacherid())+"\">";



  • @Farmer Brown said:

    @dtech said:
    It might not be that bad
    Yes it is bad. Every time. Always.

    Do us all a favor, and leave generating XML to the big boys if you don't understand this topic.

    I know you're just trolling, but I can't let stupidity of this magnitude go uncorrected.  Some people actually try to learn here and letting you spread nonsense like this just so you can have some fun mocking people is unacceptable.

     

    Ladies and gentlemen, there is nothing wrong with using string concatenation to build XML (assuming, of course, you build valid XML).  DOM is horribly verbose and frequently inflexible and it only guarantees that the XML is well-formed and validates against a particular schema -- something that should be fairly trivial to test with concatenated XML and an XML validation tool.  DOM does not protect against retarded schemas or any other variety of WTF.  It's this same cargo-cult mentality that convinces programmers that the compiler protects them from errors rather than just catching minor problems of syntax.
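
    As a rough illustration of the "test it with a tool" step, here is a sketch that runs a concatenated string through the JDK's own parser. It only checks well-formedness; validating against a schema would be a further step that is not shown. `WellFormedCheck` and `isWellFormed` are names invented for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedCheck {

    // Hypothetical helper: true if the string parses as a well-formed XML document.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the document
        } catch (Exception e) {
            throw new IllegalStateException("parser setup or I/O failed", e);
        }
    }

    public static void main(String[] args) {
        // The OP's format: ugly, but it parses.
        System.out.println(isWellFormed("<teachers><id1><name>Foo</name></id1></teachers>"));
        // The "<id=1>" variant from earlier in the thread: rejected by the parser.
        System.out.println(isWellFormed("<id=1><name>WTF</name></id>"));
    }
}
```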



  • @morbiuswilters said:

    there is nothing wrong with using string concatenation to build XML (assuming, of course, you build valid XML)

    You are very, very wrong. You should always use a proven framework like XML serialization (which has been pointed out already). Rolling your own XML generator is what gets you a featured article on the front page.

