.csv.xml



  • Well, I was working on a little something, and a certain manufacturer uses encrypted config files for the software that manages their ICs. I wanted to save some time and not type out the 200+ data items in their config files by hand, so I decided to break their stuff.

    Needless to say, I opened a portal to hell.

    [code]
    <Comment> b4 format doesn't work yet in EVSW

               <Data> Saved Permanent Fail Flags,0, PF Flags, H4,0,0, x, x, b4:FBF|VSHUT|SUV|RSVD|SOCD|SOCC|AFE_P|AFE_C|DFF|DFETF|CFETF|CIM_R|SOT1D|SOT1C|SOV|PFIN|RSVD|RSVD|RSVD|RSVD|RSVD|RSVD|RSVD|RSVD|RSVD|RSVD|RSVD|RSVD|RSVD|SOT2D|SOT2C|CIM_A, -,0, FFFFFFFF,LCL:DF_Help_z60z65_RevA1.html#PF_Flags_1,0, - </Data>
    
            </Comment>
    

    [/code]



  • Dear $deity, that's some of the worst XML bastardy I've ever seen.


  • BINNED

    You say CSV... but is it? I see commas, I see colons, I see pipes...

    Is it time to play "How many separators?" now?



  • @Onyx said:

    You say CSV... but is it? I see commas, I see colons, I see pipes...

    Is it time to play "How many separators?" now?

    Clearly CSV means character-separated-values.



  • Right, it's hilarious. Commas separate the "fields". The colon in LCL:DF_Help_z60z65_RevA1.html#PF_Flags_1 is a reference into a docs folder that provides help text if required.

    The pipes are for register bit definitions. In this case b4 means 4 bytes and 32 bits are being defined.

    But wait it doesn't support the b4 type in the software! (why the fuck is it there?)

    So let's shove it into <comment> tags and then add the real definitions below it.

    [code]

               <Data> Saved Permanent Fail 1 Flags,0, Saved PF Flags 1, H2,0,0, x, x, b2:FBF|VSHUT|SUV|SOPT1|SOCD|SOCC|AFE_P|AFE_C|DFF|DFETF|CFETF|CIM_R|SOT1D|SOT1C|SOV|PFIN, -,0, FFFF,LCL:DF_Help_z60z65_RevA1.html#Saved_PF_Flags1,0, - </Data>
    
               <Data> Saved Permanent Fail 2 Flags,2, Saved PF Flags 2, H2,0,0, x, x, b2:RSVD|RSVD|RSVD|RSVD|RSVD|RSVD|RSVD|RSVD|RSVD|RSVD|RSVD|RSVD|SOPT2|SOT2D|SOT2C|CIM_A, -,0, FFFF,LCL:DF_Help_z60z65_RevA1.html#Saved_PF_Flags2,0, - </Data>
    

    [/code]
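A rough Python sketch of how one of these `b2:` fields could be unpacked, for illustration only. The function names `parse_bitfield` and `set_flags` are made up here, and the bit order (first name = most significant bit) is an assumption; nothing above says which end the list starts from.

```python
def parse_bitfield(field):
    """Split a field like 'b2:FBF|VSHUT|...' into (byte_count, bit_names)."""
    kind, sep, names = field.strip().partition(":")
    if not sep or not kind.startswith("b") or not kind[1:].isdigit():
        raise ValueError(f"not a bitfield definition: {field!r}")
    n_bytes = int(kind[1:])
    bit_names = names.split("|")
    if len(bit_names) != n_bytes * 8:
        raise ValueError("expected exactly one name per bit")
    return n_bytes, bit_names


def set_flags(field, value):
    """Return the names of the bits set in value, skipping reserved bits.

    ASSUMPTION: the first name in the list is the most significant bit.
    """
    n_bytes, bit_names = parse_bitfield(field)
    width = n_bytes * 8
    return [
        name
        for i, name in enumerate(bit_names)
        if (value >> (width - 1 - i)) & 1 and name != "RSVD"
    ]


# Bitfield definition copied from the "Saved PF Flags 1" line.
FIELD = ("b2:FBF|VSHUT|SUV|SOPT1|SOCD|SOCC|AFE_P|AFE_C|"
         "DFF|DFETF|CFETF|CIM_R|SOT1D|SOT1C|SOV|PFIN")
print(set_flags(FIELD, 0x8001))
```

If the real tool numbers bits from the other end, flipping the shift to `value >> i` is the only change needed.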

    The rest of the file is... the same. The best part is the upper-level tags:

    [code]

    <ClsDesc> Calibration

       <SubCls SubClsId= 104> Data
    
               <Data> CC Sense Resistor Gain,0, CC Gain, F4,0.9419,0.9419, 9.419/x, 9.419/x, f.fff, -,1.00E-01,4.00E+00,LCL:DF_Help_z60z65_RevA1.html#CC_Gain,0, - </Data>
    

    [/code]

    Yep, there is text outside elements in the ClsDesc tag with more elements continuing. Suddenly it's HTML.


  • BINNED

    I've seen some shit man. Hell, I did some of that shit (limitations, XML would be a godsend to me if I could use it). But this...

    And I thought Cisco's XML phone configs were a pain. You have my condolences.



  • That's not bad. There's no SQL in there. Not even an exec('rm -rf /');.



  • Funny enough, I knew I was in for shit when I noticed their VB6 app had a C compiled DLL for decrypting the config file and parsing it. I knew I was in for even more shit when I noticed that their XML parser had an awful lot of hardcoded strstr calls with the tag names in the assembly.

    On the bright side, the manufacturer has brand new software for their newest ICs. This time they took Eclipse, the code IDE, and turned it into an IC management system that has absolutely nothing to do with code! Because that's a thing now.

    They also rolled a new format, proper XML... except now data items typically sit at certain offsets in the IC memory, e.g. 0x02 or 0x04. Well, they decided to forgo an <offset> tag or attribute, so they basically must put all the item elements in the right order with the right byte counts, or else fuck you, that's what.
    Their encryption for the new format is a password-protected zip, and it took all of 5 minutes to dump the password from memory.
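To make the ordering problem concrete, here's a tiny Python sketch: with no offset stored anywhere, each item's address is just the running total of the sizes before it. The item names and sizes below are invented for illustration; none of them come from the actual new format.

```python
def implicit_offsets(items):
    """items: ordered (name, size_in_bytes) pairs. Returns {name: offset}."""
    offsets = {}
    position = 0
    for name, size in items:
        offsets[name] = position
        position += size
    return offsets


# Hypothetical layout, not taken from a real config file.
layout = [("flags_1", 2), ("flags_2", 2), ("gain", 4)]
print(implicit_offsets(layout))

# Drop one item and everything after it silently shifts.
print(implicit_offsets([layout[0], layout[2]]))
```

One missing or mis-sized element moves every later item, which is exactly why an explicit offset would have been the saner choice.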



  • Why weren't they using str_real_strstr()?



  • OOoo and here's the best part.

    [code]
    <Comment>

    This is the new format bcfg file.

    [/code]

    There is something worse.



  • @Arantor said:

    Dear $deity, that's some of the worst XML bastardy I've ever seen.

    You ain't seen shit...

    <?xml version="1.0"?>
    <session version="1" name="profiling" hscale="1.1028762841514688E-7">
    <executable file=(...)> <envs> </envs> </executable>
     <expert>(...some event ids...) </expert>
    (...some column names or something...)
     <timelinestate>(...some kernel headers...) </timelinestate>
     <pdm>
    H4sIAAAAAAAAAJTdZVwV397GYcoCWxRU7BYVsVFUDLC7QFQEUezARjCwG/Vvdxd2dyeomNhgd2C3x5mbhfO7t/vFOZ/HZx3mey3ds87McrMFtoWNhcWf/7Ow/PP/LP+M6f78qtXRq1dwh465azX3zl2+XEmrP4ey//nl19q1XI/+3ft16d19cLN+/j0D/YMDG3dq5tfMr4v1n5wVpEwS6ewf3FEBGwPoq4c6PXoH9xqQBJL9iRl1UKFHr+COrXoFd0ssfw5aaDX9n1+FC6XUPrToVxVjsRwWFtcdLSyaWOA/yRIf6tKh6E7uNmmNrim5Lx7oXTtK14ycV1f0ZKula25w9n9+jfBKrvfr01MJ14Kc10z0TDula8m/3270Spela2Vwuf782pTSVu8dRtoK501u3ih0/2R2wvkYnLZyE0ujX60oXWuDS/XnV0gV9Kce0vmSy1UdvWUN6dqQU/27l3RtyZWqja5G5doZXJo/v2Lroe/4IJ2fwdn++fXyI3rrL9K1p/Wb/hXdYkp24fxp/
    (snipped: 60KB of seemingly Base64-encoded WTF)
     </pdm>
     </session>
    

    Yep, XML, now with 99% more Base64.
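Side note: a Base64 blob that starts with `H4sI` is gzip data, since the gzip header bytes 1f 8b 08 encode to exactly those four characters. A minimal Python sketch for unpacking such a blob, assuming the whole `<pdm>` body is a single gzip stream (not verified for this particular file; `looks_like_gzip` and `decode_blob` are made-up names):

```python
import base64
import gzip


def looks_like_gzip(b64_text):
    """True if the decoded payload starts with the gzip magic bytes."""
    text = "".join(b64_text.split())        # drop line breaks and spaces
    head = base64.b64decode(text[:8])       # 8 chars decode to 6 bytes
    return head[:3] == b"\x1f\x8b\x08"


def decode_blob(b64_text):
    """Base64-decode, then gunzip. Raises if the data is not valid gzip."""
    raw = base64.b64decode("".join(b64_text.split()))
    return gzip.decompress(raw)


# Round trip with known data rather than the snipped blob from the post.
sample = base64.b64encode(gzip.compress(b"<pdm>payload</pdm>")).decode("ascii")
print(looks_like_gzip(sample), decode_blob(sample))
```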



  • That looks 90% like what I was dealing with yesterday. GBs of XML files, most of the data being Base64.



  • There are no words to express the amount of horror I experienced when looking at that code block.



  • Oh wow, the .csv.xml parser also has an arithmetic lexer!!
    [code]
    (x64)/9.419, (x9.419)/64,
    [/code]

    Yes, it reads the data items and you can specify equations it will run on data items to convert between formats.

    Is there anything this can't do?
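A conversion mini-language like this doesn't need much machinery. Here's a minimal Python sketch that evaluates the `9.419/x` style of expression from the CC Gain line earlier in the thread, using the `ast` module and allowing only numbers, `x`, and the four basic operators. This is one way it could be done, not how the vendor's DLL works, and `convert` is a made-up name.

```python
import ast
import operator

# Only these binary operators are allowed in a conversion expression.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def convert(expr, x):
    """Evaluate an arithmetic expression in the single variable x."""

    def ev(node):
        if isinstance(node, ast.Expression):
            return ev(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return float(node.value)
        if isinstance(node, ast.Name) and node.id == "x":
            return float(x)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -ev(node.operand)
        raise ValueError(f"unsupported expression: {expr!r}")

    return ev(ast.parse(expr.strip(), mode="eval"))


print(convert("9.419/x", 0.9419))  # roughly 10
print(convert(" x", 42))           # the plain "x" fields are the identity
```

Walking the syntax tree instead of calling `eval` means anything beyond plain arithmetic is rejected outright.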



  • "DefaultChemIDVerificationToolKey-32BitsShowsPatternsinTheOutputFileSoIncreasingTheSizeTo>100,LetSizeBe =107"

    Self-documenting security keys: I found this one while reversing the encryption for this file.

    I've also been pondering the 107-character length vs. the 32 bits mentioned in the key... it was an XOR cipher anyway.
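For reference, a repeating-key XOR is only a few lines of Python. This is the generic technique, not the vendor's exact routine; how the key is actually applied to the file (offsets, headers, and so on) isn't covered here, and the key below is a placeholder.

```python
from itertools import cycle


def xor_bytes(data, key):
    """XOR data against key, repeating the key as often as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Placeholder key and data, purely for demonstration.
key = b"not-the-real-key"
plain = b"<Data> CC Gain </Data>"
cipher = xor_bytes(plain, key)

# XOR is its own inverse, so the same call decrypts.
print(xor_bytes(cipher, key) == plain)
```

A longer key only stretches the period of the repetition; knowing or guessing any stretch of plaintext still gives up the matching stretch of key.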


  • ♿ (Parody)

    @Arantor said:

    Dear $deity, that's some of the worst XML bastardy I've ever seen.

    Have you guys seen (or do you recognize) this?

     <d:Dependencies>Microsoft.AspNet.WebApi.Client:4.1.0-alpha-120809:|2a486f72.Useful:1.0.4:|Microsoft.Net.Http:2.0.20710.0:|Newtonsoft.Json:4.5.11:|Microsoft.Bcl.Immutable:1.0.8-beta:</d:Dependencies> 
    

    It's the tip of the wtfberg...


  • BINNED

    Allow me to share my own personal monster then. Please note:

    • I have no luxury of XML, JSON or any such formats. Unless I want to write my own parser in a very WTF-y language
    • I have no luxury of dictionaries, or even arrays

    So, for your consideration, exhibit A:

    (09.+)|(^0[1-8].+)|([1-9][0-9]{5}.*)|(^2[0-9]{2})|(00.*):SOMETRUNK;(^6[0-9]{2}):PBX;.+@.+:Local

    Explanation of the unsightly mess: it's a "two level" CSV. First level is separated by semicolons, so this is one of the values (picking a shorter one so it's easy to see):

    (^6[0-9]{2}):PBX

    Within this, there is a second separator, a colon. This is used thusly:

    1. Grab a value from the CSV (separator: ;)
    2. Grab the first value from the new CSV (separator: :)
    3. Compare the regex you just got with the number dialed. If it matches, return the second value from the CSV in step 2 (in this example PBX)
    4. Otherwise, goto 1

    This is used to route calls to different SIP trunks depending on the dialed number.
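The four steps above can be sketched in Python like this. It is only an illustration of the lookup, not the real implementation (which has no arrays to lean on); `route` is a made-up name, and using an unanchored regex search is an assumption about how the matching works.

```python
import re

# Exhibit A, split across lines for readability only.
ROUTES = (
    r"(09.+)|(^0[1-8].+)|([1-9][0-9]{5}.*)|(^2[0-9]{2})|(00.*):SOMETRUNK;"
    r"(^6[0-9]{2}):PBX;"
    r".+@.+:Local"
)


def route(dialed, routes=ROUTES):
    """Return the trunk for the first entry whose regex matches, else None."""
    for entry in routes.split(";"):               # step 1: outer CSV
        pattern, _, target = entry.partition(":")  # step 2: inner CSV
        if re.search(pattern, dialed):             # step 3: compare
            return target
    return None                                    # step 4 ran out of entries


for number in ("612", "0912345", "user@example.com", "999"):
    print(number, "->", route(number))
```

Note that this only works because none of the regexes contain a colon or a semicolon themselves.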

    A slightly "nicer" one:

    SOMETRUNK:0(0[0-9]{6}[0-9]+|[0-9]{6}[0-9]+)_$1;OTHERTRUNK:385(.+)_0$1

    This one adjusts the caller ID based on the trunk the call came in on. Yes, one of the separators is an underscore. Running out of characters here!
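A hedged Python sketch of how that rewrite rule could be applied. The name `rewrite_caller_id` is made up, matching the whole caller ID (rather than a prefix) is an assumption, and `$1` is translated to Python's `\g<1>` group syntax.

```python
import re

RULES = "SOMETRUNK:0(0[0-9]{6}[0-9]+|[0-9]{6}[0-9]+)_$1;OTHERTRUNK:385(.+)_0$1"


def rewrite_caller_id(trunk, caller_id, rules=RULES):
    """Apply the trunk's rewrite rule; return caller_id unchanged otherwise."""
    for entry in rules.split(";"):
        name, _, rest = entry.partition(":")
        if name != trunk:
            continue
        # The last underscore separates the regex from the replacement.
        pattern, _, template = rest.rpartition("_")
        match = re.fullmatch(pattern, caller_id)  # ASSUMPTION: full match
        if match is None:
            return caller_id
        # Turn "$1" into "\g<1>" so match.expand() understands it.
        py_template = re.sub(r"\$(\d)", r"\\g<\1>", template)
        return match.expand(py_template)
    return caller_id


print(rewrite_caller_id("OTHERTRUNK", "38512345678"))
print(rewrite_caller_id("SOMETRUNK", "0012345678"))
```

Splitting on the last underscore keeps the sketch working even if a regex ever contains one, which the real rules apparently never do.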

    Compared to the stuff I see here, it doesn't look as bad. But I still cringe when I see it.


  • Discourse touched me in a no-no place

    @delfinom said:

    Yes, it reads the data items and you can specify equations it will run on data items to convert between formats.

    Is there anything this can't do?

    Paged topics?



  • Actually the software that reads the file breaks down the configuration file to a display of items. The items are broken down by class and then subclass. Classes are just tabs/pages. But then the inside of those....
    Normally you would have a single listview and just group it.
    Even if it was VB6 it shouldn't be that hard.

    Nope!

    Let's go fancy.
    It takes your window width and works out how many fixed-width listviews can fit. Then it also computes how many items there should be per listview to minimize the overall height.
    Then it starts populating the listviews. Mind you, it inserts the items and splits them across listviews, so a subclass can end up with its header in one listview and its items in the next ones if it must.
    It recomputes the whole mess every time you resize the window.
    If your window height is small, you end up with 2 listviews with split items, where there's apparently some random max number of items, plus scrollbars.

    Because you know, a listview with a single scrollbar is hard.

