Parsing JSON from C++



  • I've been learning the JSON API that comes with Qt recently. I haven't really dealt with JSON before, but I am slowly getting used to the Javascriptyness of it. But, I feel like I must be using the API mostly wrong when I start to have lines that are long enough that I get completely lost by the end of them. Sadly, this line is my fault, rather than a WTF I can blame on somebody else, but I figure some of you may enjoy poking fun at me...

    [code]
    QJsonObject jsonTrack = TimelinesArray.first().toObject().value("Master").toObject().value(trackName).toObject();
    [/code]

    And then it goes from there in about the same way to parse out all the stuff in the track...



  • @forkazoo said:

    I've been learning the JSON API that comes with Qt recently. I haven't really dealt with JSON before, but I am slowly getting used to the Javascriptyness of it. But, I feel like I must be using the API mostly wrong when I start to have lines that are long enough that I get completely lost by the end of them. Sadly, this line is my fault, rather than a WTF I can blame on somebody else, but I figure some of you may enjoy poking fun at me...

    [code]
    QJsonObject jsonTrack = TimelinesArray.first().toObject().value("Master").toObject().value(trackName).toObject();
    [/code]

    And then it goes from there in about the same way to parse out all the stuff in the track...

    It's not you. Handling JSON is usually painful but if you were working with a more modern language (like C#) you could use something like json.net or json2csharp.



  • What the heck does value("Master") return that has to be converted to an object? A proper JSON library should deserialize the JSON blob into a native object graph.

    The way I use it in C#, the equivalent code would be:

    [code]
    Track atrack = Timelines[0].Master.Track[trackName];
    [/code]


  • @Jaime said:

    What the heck does value("Master") return that has to be converted to an object? A proper JSON library should deserialize the JSON blob into a native object graph.

    The way I use it in C#, the equivalent code would be:

    [code]
    Track atrack = Timelines[0].Master.Track[trackName];
    [/code]

    Looking at http://qt-project.org/doc/qt-5.0/qtcore/qjsonobject.html#value and http://qt-project.org/doc/qt-5.1/qtcore/qjsonvalue.html it would seem that .value() returns a JSON "value" that may be any of the JSON types (string, boolean, number, object, undefined, null, array) and .toObject() is a cast to object. Objects are implemented as a dictionary of string to "value"s.
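
    As a rough illustration of that design (one wrapper type holding one of several alternatives, plus conversion methods that fall back to an empty result), here is a minimal standard-C++ sketch. `Value`, `Object`, and `valueOf` are hypothetical stand-ins invented for this example, not Qt's classes, and only a few of the JSON types are modelled.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <variant>

// Hypothetical stand-in for a QJsonValue-style wrapper; not Qt's implementation.
struct Value;
using Object = std::map<std::string, std::shared_ptr<Value>>;

struct Value {
    // Exactly one alternative is active; std::monostate plays the role of "undefined".
    std::variant<std::monostate, bool, double, std::string, Object> data;

    bool isObject() const { return std::holds_alternative<Object>(data); }

    // "Cast" to object: yields an empty object when the value is not one.
    Object toObject() const {
        if (const Object* o = std::get_if<Object>(&data)) return *o;
        return Object{};
    }

    // Same idea for strings, with a caller-supplied fallback.
    std::string toString(const std::string& fallback = "") const {
        if (const std::string* s = std::get_if<std::string>(&data)) return *s;
        return fallback;
    }
};

// Look up a key in an object; a missing key gives an "undefined" value.
inline Value valueOf(const Object& obj, const std::string& key) {
    auto it = obj.find(key);
    return it == obj.end() ? Value{} : *it->second;
}
```

    With this shape a chained lookup never throws: a wrong type or a missing key simply produces an empty object further down the chain, which is convenient but also hides mistakes.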



  • @Arnavion said:

    @Jaime said:

    What the heck does value("Master") return that has to be converted to an object? A proper JSON library should deserialize the JSON blob into a native object graph.

    The way I use it in C#, the equivalent code would be:

    [code]
    Track atrack = Timelines[0].Master.Track[trackName];
    [/code]

    Looking at http://qt-project.org/doc/qt-5.0/qtcore/qjsonobject.html#value and http://qt-project.org/doc/qt-5.1/qtcore/qjsonvalue.html it would seem that .value() returns a JSON "value" that may be any of the JSON types (string, boolean, number, object, undefined, null, array) and .toObject() is a cast to object. Objects are implemented as a dictionary of string to "value"s.

    Methods that return any one of seven types (without a common interface) are TRWTF.

    When are people going to learn: in a dynamic language, be dynamic. In a strongly typed language, be strongly typed. The job of a JSON library in C++ should be to encapsulate the blob in a type-safe wrapper.
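
    One way to read "type-safe wrapper" is to validate the loosely typed data once, at the boundary, and hand the rest of the program a plain struct. The sketch below assumes a flat record type (`Fields`) and a `Track` struct whose members are invented for illustration; neither comes from Qt or from the original poster's data.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <variant>

// Hypothetical loosely typed record, standing in for one parsed JSON object.
using Field = std::variant<bool, double, std::string>;
using Fields = std::map<std::string, Field>;

// Hypothetical strongly typed view of a track; the member names are invented.
struct Track {
    std::string name;
    double volume;
    bool muted;
};

// Fetch one field with the expected type, or nothing if it is absent or mistyped.
template <typename T>
std::optional<T> fieldAs(const Fields& f, const std::string& key) {
    auto it = f.find(key);
    if (it == f.end()) return std::nullopt;
    if (const T* p = std::get_if<T>(&it->second)) return *p;
    return std::nullopt;
}

// Validate once at the boundary; everything downstream sees only Track.
std::optional<Track> parseTrack(const Fields& f) {
    auto name = fieldAs<std::string>(f, "name");
    auto volume = fieldAs<double>(f, "volume");
    auto muted = fieldAs<bool>(f, "muted");
    if (!name || !volume || !muted) return std::nullopt;
    return Track{*name, *volume, *muted};
}
```

    The trade-off is that every struct needs its own conversion function, written by hand or generated, which is the cost the dynamic API avoids.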



  • @Jaime said:

    Methods that return any one of seven types (without a common interface) are TRWTF.

    Well the .value() method technically returns a single type - QJsonValue. It just happens to be an encapsulation of one of seven types. The fact that it is encapsulation instead of inheritance means the OP had to write

    [code]
    value.toObject()
    [/code]

    instead of

    [code]
    dynamic_cast<QJsonObject&>(value)
    [/code]

    and you'll admit the two are equivalent(ly verbose).

    @Jaime said:

    The job of a JSON library in C++ should be to encapsulate the blob in a type-safe wrapper.

    C++ doesn't have introspectable classes at runtime to be able to deserialize blobs in a type-safe manner. Admittedly this is Qt, where classes inheriting from QObject do have run-time introspection.



  • @Arnavion said:

    @Jaime said:
    The job of a JSON library in C++ should be to encapsulate the blob in a type-safe wrapper.

    C++ doesn't have introspectable classes at runtime to be able to deserialize blobs in a type-safe manner. Admittedly this is Qt, where classes inheriting from QObject do have run-time introspection.

    It can be done without reflection, good old fashioned code generation from a schema definition would work just fine.

    The library goes about solving the problem in the worst possible way. All the type-safety of JavaScript while still keeping all the bad parts of C++. If I had to write a library that fills the same role as QJsonObject, I'd just load the data into XML and use XPath to get to it. Something like:


    [code]
    // Loads the graph represented by the JSON blob into a private XML document encapsulated in a thin API.
    QJsonObject timeLines(JSONblob);

    // Internally invokes an XPath query and wraps the returned XML fragment in a QJsonObject.
    QJsonObject jsonTrack = timeLines.SelectSingle("TimeLine[0]/Master/Track[@name='trackname']");
    [/code]

    It would probably take a few hundred lines of code to write the whole library and you'd get plenty of dynamicness, a rich query syntax, and readable code. If you want speed and type safety, then build a proper deserializer.
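
    The path-query idea can be sketched without XML at all. The following assumes a hypothetical `Node` tree and a `selectSingle` helper that understands only slash-separated names; it is not XPath (no indexes or predicates) and not an existing library API.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical tree node; a real library would also model arrays, numbers, etc.
struct Node {
    std::string name;            // key under which this node is stored
    std::string text;            // leaf payload, empty for containers
    std::vector<Node> children;  // std::vector permits incomplete element types since C++17
};

// Walk a slash-separated path from root; returns nullptr if any step is missing.
const Node* selectSingle(const Node& root, const std::string& path) {
    const Node* current = &root;
    std::istringstream parts(path);
    std::string step;
    while (std::getline(parts, step, '/')) {
        if (step.empty()) continue;  // tolerate leading or doubled slashes
        const Node* next = nullptr;
        for (const Node& child : current->children) {
            if (child.name == step) {
                next = &child;
                break;
            }
        }
        if (next == nullptr) return nullptr;
        current = next;
    }
    return current;
}
```

    A call such as `selectSingle(root, "Master/track1")` replaces the chain of lookups with one string, at the price of moving errors from compile time to run time.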



  • @Jaime said:

    @Arnavion said:
    @Jaime said:
    The job of a JSON library in C++ should be to encapsulate the blob in a type-safe wrapper.

    C++ doesn't have introspectable classes at runtime to be able to deserialize blobs in a type-safe manner. Admittedly this is Qt, where classes inheriting from QObject do have run-time introspection.

    It can be done without reflection, good old fashioned code generation from a schema definition would work just fine.

    The library goes about solving the problem in the worst possible way. All the type-safety of JavaScript while still keeping all the bad parts of C++. If I had to write a library that fills the same role as QJsonObject, I'd just load the data into XML and use XPath to get to it. Something like:


    [code]
    // Loads the graph represented by the JSON blob into a private XML document encapsulated in a thin API.
    QJsonObject timeLines(JSONblob);

    // Internally invokes an XPath query and wraps the returned XML fragment in a QJsonObject.
    QJsonObject jsonTrack = timeLines.SelectSingle("TimeLine[0]/Master/Track[@name='trackname']");
    [/code]

    It would probably take a few hundred lines of code to write the whole library and you'd get plenty of dynamicness, a rich query syntax, and readable code. If you want speed and type safety, then build a proper deserializer.

    Or you use the high quality code already available.

    @Mozilla Dude said:

    Use these functions to parse a sequence of characters as JSON. The parsing rules applied by these methods are exactly those specified by ECMAScript 5. Various JSON extensions like trailing commas, unquoted property names, more generous number parsing, and so on are not supported.



  • @Jaime said:

    It can be done without reflection, good old fashioned code generation from a schema definition would work just fine.

    Yes, of course it can be done at compile-time. Generating a deserializer at compile-time is the least flexible implementation since you can only deserialize an object of a type you know about (and have the class definition of) at compile time. Dynamic deserialization at runtime (the one I mentioned) is slightly more flexible since it would work for unknown but introspectable types. The current implementation is the most flexible implementation possible at the cost of being verbose to code against.

    @Jaime said:

    If I had to write a library that fills the same role as QJsonObject, I'd just load the data into XML and use XPath to get to it.

    Ew.

    @Ronald said:

    Or you use the high quality code already available.

    That API is fundamentally equivalent to what Qt has so I don't think Jaime will like it either.



  • @Ronald said:

    Or you use the high quality code already available.

    @Mozilla Dude said:

    Use these functions to parse a sequence of characters as JSON. The parsing rules applied by these methods are exactly those specified by ECMAScript 5. Various JSON extensions like trailing commas, unquoted property names, more generous number parsing, and so on are not supported.

    Good idea... load up an entire JavaScript runtime just to deserialize the world's simplest data format. Also, it would deserialize it into a JavaScript object in the runtime you just loaded, which would require code even more cumbersome than the code that started this thread to interact with. Well, unless you just write your entire program in JavaScript and Eval() it. That would be quite an "inner platform".



  • @Jaime said:

    The library goes about solving the problem in the worst possible way. All the type-safety of JavaScript while still keeping all the bad parts of C++. If I had to write a library that fills the same role as QJsonObject, I'd just load the data into XML and use XPath to get to it. Something like:

    Ah, of course. After an afternoon spent bashing my head against JavaScript in C++, that vague sense of something missing must have been the lack of XML involved. You never can have too much XML, right?

    Anyway, I'm glad it's not just me and some other folks are finding the Qt JSON API to be... Slightly other than intuitive. That said, I did get my parser working. Well, it passes a simple test case anyhow. It needs a bit of cleanup and some tinkering so that it works with more general input. As an old-skool guy, learning all this new stuff is certainly interesting. It'll be fun when I have my native app being populated with JSON data pulled directly from a shiny new Ruby On Rails web app. Assuming I don't shoot myself in the face by then...

    Anyhow, on to figure out Boost.Python.



  • @Arnavion said:

    @Jaime said:
    It can be done without reflection, good old fashioned code generation from a schema definition would work just fine.

    Yes, of course it can be done at compile-time. Generating a deserializer at compile-time is the least flexible implementation since you can only deserialize an object of a type you know about (and have the class definition of) at compile time. Dynamic deserialization at runtime (the one I mentioned) is slightly more flexible since it would work for unknown but introspectable types. The current implementation is the most flexible implementation possible at the cost of being verbose to code against.

    Ummm..... if you deserialize an object of a type you don't know about, how would you know how to use it? Exactly what flexibility are you looking for? The flexibility to deserialize an object of an unknown type and access members that will be defined at some time in the future, using code that you wrote in the past (or present)?



  • @Jaime said:

    Ummm..... if you deserialize an object of a type you don't know about, how would you know how to use it?

    Most serializers include a type tag field of some sort that tells the deserializer what type to deserialize it as. So if I receive a JSON blob that was serialized from an object of type Foo, then the deserializer will deserialize it as an object of type Foo even if I don't tell the deserializer to deserialize it as type Foo. If I don't care about the actual type Foo but merely some interface IFoo that I know it implements, I don't even need to know that it is of type Foo. I'll simply treat the pointer as IFoo*.

    @Jaime said:

    The flexibility to deserialize an object of an unknown type and access members that will be defined at some time in the future, using code that you wrote in the past (or present)?

    No need to be snarky. This is exactly how remoting is implemented in Java etc. where entire class definitions are streamed from the server to the client runtime but the client code only accesses these deserialized instances via an interface.
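
    The type-tag dispatch described above might look roughly like this in C++. `IFoo`, `Foo`, `Bar`, and `Registry` are hypothetical names, and the tag and payload are assumed to have already been pulled out of the JSON blob; real serializers differ in how they encode the tag.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical interface the client code is written against.
struct IFoo {
    virtual ~IFoo() = default;
    virtual std::string describe() const = 0;
};

// Two hypothetical concrete types the client never names directly.
struct Foo : IFoo {
    std::string payload;
    explicit Foo(std::string p) : payload(std::move(p)) {}
    std::string describe() const override { return "Foo(" + payload + ")"; }
};

struct Bar : IFoo {
    std::string payload;
    explicit Bar(std::string p) : payload(std::move(p)) {}
    std::string describe() const override { return "Bar(" + payload + ")"; }
};

using Factory = std::function<std::unique_ptr<IFoo>(const std::string&)>;

// Maps a type tag to the factory that builds the matching concrete type.
class Registry {
public:
    void add(const std::string& tag, Factory f) { factories_[tag] = std::move(f); }

    // Returns nullptr when the tag is unknown.
    std::unique_ptr<IFoo> create(const std::string& tag, const std::string& payload) const {
        auto it = factories_.find(tag);
        if (it == factories_.end()) return nullptr;
        return it->second(payload);
    }

private:
    std::map<std::string, Factory> factories_;
};

// Registers the types this sketch knows about.
Registry defaultRegistry() {
    Registry r;
    r.add("Foo", [](const std::string& p) { return std::make_unique<Foo>(p); });
    r.add("Bar", [](const std::string& p) { return std::make_unique<Bar>(p); });
    return r;
}
```

    Note that every concrete type still has to be registered somewhere in the client, so this shows dispatch through an interface rather than the streamed class definitions mentioned for Java.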



  • @Jaime said:

    @Ronald said:

    Or you use the high quality code already available.

    @Mozilla Dude said:

    Use these functions to parse a sequence of characters as JSON. The parsing rules applied by these methods are exactly those specified by ECMAScript 5. Various JSON extensions like trailing commas, unquoted property names, more generous number parsing, and so on are not supported.

    Good idea... load up an entire JavaScript runtime just to deserialize the world's simplest data format. Also, it would deserialize it into a JavaScript object in the runtime you just loaded, which would require code even more cumbersome than the code that started this thread to interact with. Well, unless you just write your entire program in JavaScript and Eval() it. That would be quite an "inner platform".

    Or you could instantiate a browser and use its JavaScript engine.



  • @Ronald said:

    It's not you. Handling JSON is usually painful but if you were working with a more modern language (like C#) you could use something like json.net or json2csharp.

    I have recently been working on a project that involved passing complex data back and forth over a WebSocket connection. Say what you like about those folks at Google, but their GSON library does a neat and intuitive job of converting JSON to Java and back.


  • Discourse touched me in a no-no place

    @Ronald said:

    Or you could instantiate a browser and use its JavaScript engine.

    Save that for OMGWTF3…



  • @dkf said:

    @Ronald said:

    Or you could instantiate a browser and use its JavaScript engine.

    Save that for OMGWTF3…

    I have to say, the week they started posting the results I kinda stopped reading the frontpage after a while because I did not find the submissions funny. I don't even remember any of those I did read. I'm sure everyone had a laugh coding their shit but I don't think the WTF other people do on purpose is interesting.



  • Of course as with all problems in C++, this can be solved easily and neatly with macros (sadly not with templates)

    [code]
    #define OBJFIRST .first().toObject()
    #define OBJVALUE(key) .value(key).toObject()
    // add defines that do .toInt and so on

    // and now your code is just
    auto jsonTrack = TimelinesArray OBJFIRST OBJVALUE("Master") OBJVALUE(trackName);
    [/code]


  • Discourse touched me in a no-no place

    @Mo6eB said:

    Of course as with all problems in C++, this can be solved easily and neatly with macros (sadly not with templates)

    I'm not at all convinced that that's making things better.
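
    For comparison, a macro-free version seems possible with a variadic function template. This is only a sketch against a hypothetical `Obj` type with a `child` method, not the Qt classes; with Qt one would presumably wrap `.value(key).toObject()` in the same way.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a JSON object type; not a real library class.
struct Obj {
    std::map<std::string, std::shared_ptr<Obj>> members;

    // Returns the child object stored under key, or an empty object if absent.
    Obj child(const std::string& key) const {
        auto it = members.find(key);
        return it == members.end() ? Obj{} : *it->second;
    }
};

// Base case: no keys left, so the current object is the answer.
inline Obj descend(const Obj& obj) { return obj; }

// Recursive case: step into the first key, then handle the remaining keys.
template <typename... Rest>
Obj descend(const Obj& obj, const std::string& key, const Rest&... rest) {
    return descend(obj.child(key), rest...);
}
```

    The original one-liner would then read something like `auto jsonTrack = descend(root, "Master", trackName);`, with no preprocessor involved.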



  • For a period of about 3 years, the PHP documentation comments (evidently considered canon by the PHP community at large) contained this masterpiece of EvilEval for parsing JSON in older versions.

    I came across it shortly after its posting, and dutifully reported it, as well as making a comment pointing out how dangerous it was. The documentation maintainers responded by [s]promptly removing the rogue code[/s] doing absolutely nothing.

    Largely ignored, I later posted a Proof-of-Concept demonstrating how easy it was to exploit. Finally realising the grave severity of the issue, the documentation maintainers jumped into action, [s]replacing the comment with a responsible explanation as to why the code was removed[/s] deleting my PoC and prior comment warning of the danger and leaving the dangerous comment unscathed.

    Thankfully, someone later wrote a less-stupid version, and, earlier this year, the comment was finally removed for [s]its security risk[/s] being too old, and the code [s]was never put into widespread use[/s] is now used in over 30,000 places.


  • Considered Harmful

    @SamC said:

    For a period of about 3 years, the PHP documentation comments (evidently considered canon by the PHP community at large) contained this masterpiece of EvilEval for parsing JSON in older versions.

    I came across it shortly after its posting, and dutifully reported it, as well as making a comment pointing out how dangerous it was. The documentation maintainers responded by [s]promptly removing the rogue code[/s] doing absolutely nothing.

    Largely ignored, I later posted a Proof-of-Concept demonstrating how easy it was to exploit. Finally realising the grave severity of the issue, the documentation maintainers jumped into action, [s]replacing the comment with a responsible explanation as to why the code was removed[/s] deleting my PoC and prior comment warning of the danger and leaving the dangerous comment unscathed.

    Thankfully, someone later wrote a less-stupid version, and, earlier this year, the comment was finally removed for [s]its security risk[/s] being too old, and the code [s]was never put into widespread use[/s] is now used in over 30,000 places.


    Perhaps you could post your PoC here, and some grayhat vigilante might teach them a security lesson with a cluestick.



  • @joe.edwards said:

    Perhaps you could post your PoC here, and some grayhat vigilante might teach them a security lesson with a cluestick.
     

    I tried that once before.

     

