Bizarre file format



  • Today I had to parse some telephone bills sent by the provider, who have just changed the format in which they send out the bills (without any kind of warning or offering to keep sending us the old format, which was working fine). Their idea of a sensible format is apparently a single xml file contained in a zip, with contents of the following form:

    <z:row Version="C" CallDay="Mon" CallDate="01/10/2007" CallTime="08:03:21" CallHour="8" CallType="Outbound" DestType="LOCAL" CustCLI="..." Other_Party_No="..." Description="..." Duration="0.00130787037037037" DurationMins="1" Billed="0.04" AccountCode="UNKNOWN" CLIDescription="Switch" DDI="" OwnCLI="No"/>

    So here we have:

    • CallDate, in the usual dd/mm/yyyy format
    • CallDay, the first three letters of the weekday of that date - duplicated information, but possibly excusable since it's not trivial to compute
    • CallTime, in the usual hh:mm:ss format
    • CallHour. WTF is this doing here? Why do we have the hour in two places? Why would anybody even want to know just the hour?

    And:

    • Duration, measured in days. Who measures the length of telephone calls in days?
    • DurationMins, again, why is this even here? They bill us by the second, not by the minute.

    And finally we have Other_Party_No, the only attribute to have underscores in its name.

    I cannot imagine why anybody would deliberately create a file like this.
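
    For anyone stuck with the same format, here is a minimal sketch of reading one row and cross-checking the redundant fields against each other. It assumes the "z" prefix is bound to "#RowsetSchema" (the file's namespace declaration isn't shown above, so that URI is a guess), it wraps the row in a made-up <data> root, and it leaves out the attributes whose values are elided. The name check_row is purely illustrative.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Assumption: the "z" prefix maps to this URI; the post does not show
# the real namespace declaration.
ROWSET_NS = "#RowsetSchema"

# Trimmed copy of the sample row, wrapped in a hypothetical <data> root
# so that the prefix is declared and the fragment parses on its own.
SAMPLE = f"""<data xmlns:z="{ROWSET_NS}">
  <z:row Version="C" CallDay="Mon" CallDate="01/10/2007" CallTime="08:03:21"
         CallHour="8" Duration="0.00130787037037037" DurationMins="1"
         Billed="0.04"/>
</data>"""

# Fixed table instead of strftime("%a"), which depends on the locale.
DAY_NAMES = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")


def check_row(attrs):
    """Return a list of inconsistencies between the redundant fields."""
    problems = []
    when = datetime.strptime(
        f"{attrs['CallDate']} {attrs['CallTime']}", "%d/%m/%Y %H:%M:%S"
    )
    if DAY_NAMES[when.weekday()] != attrs["CallDay"]:
        problems.append("CallDay does not match CallDate")
    if when.hour != int(attrs["CallHour"]):
        problems.append("CallHour does not match CallTime")
    return problems


root = ET.fromstring(SAMPLE)
for row in root.iter(f"{{{ROWSET_NS}}}row"):
    print(check_row(row.attrib))
```

    An empty list means the duplicated fields at least agree with each other for that row.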



  • Two format designers who do not swap notes might account for the under_score vs UpperCamelCase.

    I'm guessing CallHour follows the same duplicate-for-easiness principle as CallDay, a principle that is flushed away by Duration being measured in days, also likely due to more than one format designer not swapping notes.

    Pray tell, what unit is Billed in? Seconds? Minutes? Swatches? Dollars? Cents? Weeks?

    But isn't it fairly common for large institutes to measure things in the equivalent of Snoots or Morks?



  •  

    Pray tell, what unit is Billed in? Seconds? Minutes? Swatches? Dollars? Cents? Weeks?

    I'm kinda hoping it's in pounds, but I'm too depressed by the general insanity to check, so I guess we'll find out if anybody complains that their total resembled their telephone number.



  • I've worked with some XML formats that have duplicate data in them (mainly LandXML and its variants). It makes the file huge and seems a bit pointless at a cursory inspection, given that many of the included attributes have values that are trivial to derive from other attributes with a quick bit of geometry.
    It does have an advantage when you are converting into another, less verbose file format, as all the attributes have already been calculated by the program generating the XML. That sounds a bit rambly, but you get what I mean, I hope.

    So i'm thinking that maybe the CallHour is used by something else that consumes that xml, possibly some sort of xslt file that needs to do some showing/hiding/ordering based on the call hour and it was easier than converting the time from xslt?  just a guess but the only remotely sane reason for doing it that way.

    As for some of their choices in units, it seems a bit crazy, but some systems do use fractions of a day to store a length of time:

    http://support.microsoft.com/kb/210276#

    Because a time value is stored as a fraction of a 24-hour day, you may receive incorrect formatting results when you calculate time intervals greater than 24 hours

    But overall it does seem like they made some strange choices in their format.
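
    If Duration really is a fraction of a 24-hour day, getting back to seconds is a one-liner. A rough sketch, assuming durations should be rounded to whole seconds (the helper name duration_from_days is made up for illustration):

```python
from datetime import timedelta

SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def duration_from_days(days_text):
    """Convert a fraction-of-a-day string to a timedelta.

    Assumption: durations are whole seconds, so the result is rounded
    to absorb the truncation in the decimal text.
    """
    seconds = round(float(days_text) * SECONDS_PER_DAY)
    return timedelta(seconds=seconds)


d = duration_from_days("0.00130787037037037")
print(d, d.total_seconds())
```

    For the sample row this comes out at 113 seconds, which would fit DurationMins="1" if that field counts only whole minutes, though one row is not enough to be sure.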


     



  • @element[0] said:

    So i'm thinking that maybe the CallHour is used by something else that consumes that xml, possibly some sort of xslt file that needs to do some showing/hiding/ordering based on the call hour and it was easier than converting the time from xslt?  just a guess but the only remotely sane reason for doing it that way.

    Which just leads us to the question of why anybody would ever want to filter or order based on the hour in which a call occurred, but not the actual time. Really can't think of any use for this. 



  • @asuffield said:

    @element[0] said:

    So i'm thinking that maybe the CallHour is used by something else that consumes that xml, possibly some sort of xslt file that needs to do some showing/hiding/ordering based on the call hour and it was easier than converting the time from xslt?  just a guess but the only remotely sane reason for doing it that way.

    Which just leads us to the question of why anybody would ever want to filter or order based on the hour in which a call occurred, but not the actual time. Really can't think of any use for this. 


    Possibly because there's some hour-grouped accumulator that shows the number of calls made in each hour interval? So that management knows that x people are needed from 8-10, but only y from 11 to 12, and so on ...
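
    Something like the following is what I have in mind; just a sketch, with made-up rows standing in for the attribute dictionaries of the real file, and calls_per_hour as an illustrative name:

```python
from collections import Counter


def calls_per_hour(rows):
    """Count calls by CallHour; each row is a dict of attribute values."""
    return Counter(int(row["CallHour"]) for row in rows)


# Made-up rows for illustration only; the real ones would come from the XML.
rows = [{"CallHour": "8"}, {"CallHour": "8"}, {"CallHour": "11"}]

counts = calls_per_hour(rows)
for hour in sorted(counts):
    print(f"{hour:02d}:00  {counts[hour]} call(s)")
```

    With a precomputed CallHour the grouping key needs no time parsing at all, which may be the whole point of the attribute.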



  • @asuffield said:

    @element[0] said:

    So i'm thinking that maybe the CallHour is used by something else that consumes that xml, possibly some sort of xslt file that needs to do some showing/hiding/ordering based on the call hour and it was easier than converting the time from xslt?  just a guess but the only remotely sane reason for doing it that way.

    Which just leads us to the question of why anybody would ever want to filter or order based on the hour in which a call occurred, but not the actual time. Really can't think of any use for this.

    Maybe they have free nights and weekends, which kind of explains the shortcut attributes for both hour and day of week.
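
    To illustrate, a billing rule like that only needs the two shortcut attributes. This is a sketch with invented boundaries: the 19:00 to 07:00 "night" window and the is_off_peak name are hypothetical, not anything the provider has documented.

```python
WEEKEND_DAYS = {"Sat", "Sun"}

# Hypothetical boundaries: "night" runs from 19:00 up to (not including) 07:00.
NIGHT_START = 19
NIGHT_END = 7


def is_off_peak(call_day, call_hour):
    """Return True if a call falls on a weekend or in the assumed night window."""
    hour = int(call_hour)
    return call_day in WEEKEND_DAYS or hour >= NIGHT_START or hour < NIGHT_END


print(is_off_peak("Mon", "8"))   # weekday morning under the assumed window
print(is_off_peak("Sat", "12"))  # weekend
```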



  • I would guess this is just some kind of dump with the same structure as their internal database schema, so it is easy for them to code but isn't necessarily optimal as a file format.  A little redundancy and inconsistency in syntax doesn't make this count as a huge WTF IMHO.



  • @seaturnip said:

    I would guess this is just some kind of dump with the same structure as their internal database schema, so it is easy for them to code but isn't necessarily optimal as a file format.  A little redundancy and inconsistency in syntax doesn't make this count as a huge WTF IMHO.

     

    A little redundancy and inconsistency in the database schema is way scarier than the 'multiple audiences' theory.



  • These files are supplied to us via an https site, with a typical login dialog on the front page.

    This morning I discovered that if you know the URL where the file is located, you don't actually need to authenticate.

    I'll just be checking the fuel level in my chainsaw...



  • @Hatshepsut said:

    @seaturnip said:

    I would guess this is just some kind of dump with the same structure as their internal database schema, so it is easy for them to code but isn't necessarily optimal as a file format.  A little redundancy and inconsistency in syntax doesn't make this count as a huge WTF IMHO.

    A little redundancy and inconsistency in the database schema is way scarier than the 'multiple audiences' theory.

    I worked on software in the telecom industry for more than 10 years (both long-distance and cellular) and I saw enough bizarre crap like this that I read my phone bills very carefully every month...



  • It would be very WTFey if CallTime was always in 12-hour format and the CallHour was required to differentiate AM/PM, i.e. CallTime="03:15:22" CallHour="15"...  I wouldn't put it past them.



  • @asuffield said:

    <z:row Version="C" CallDay="Mon" CallDate="01/10/2007" CallTime="08:03:21" CallHour="8" CallType="Outbound" DestType="LOCAL" CustCLI="..." Other_Party_No="..." Description="..." Duration="0.00130787037037037" DurationMins="1" Billed="0.04" AccountCode="UNKNOWN" CLIDescription="Switch" DDI="" OwnCLI="No"/>

    The real WTF is the XML, it should be

    <DATA>

     <CALL>

      <DAY>Mon</DAY>

      <DATE>1/10/2007</DATE>

      <TIME>08:03:21</TIME>

      <HOUR>8</HOUR>

      <TYPE>Outbound

       <DEST>LOCAL</DEST>

      </TYPE>

     </CALL>

    </DATA>
     

    Hate how all info is include in attributes and not done proper! 



  • @bugmenot1 said:

    @asuffield said:

    <z:row Version="C" CallDay="Mon" CallDate="01/10/2007" CallTime="08:03:21" CallHour="8" CallType="Outbound" DestType="LOCAL" CustCLI="..." Other_Party_No="..." Description="..." Duration="0.00130787037037037" DurationMins="1" Billed="0.04" AccountCode="UNKNOWN" CLIDescription="Switch" DDI="" OwnCLI="No"/>

    The real WTF is the XML, it should be

    <DATA>

     <CALL>

      <DAY>Mon</DAY>

      <DATE>1/10/2007</DATE>

      <TIME>08:03:21</TIME>

      <HOUR>8</HOUR>

      <TYPE>Outbound

       <DEST>LOCAL</DEST>

      </TYPE>

     </CALL>

    </DATA>
     

    Hate how all info is include in attributes and not done proper! 

    Seconded! 

