JSONx is Sexy LIESx
-
Well, if you're using Go, you just use someone else's serialization library (the standard library has json and xml support) and your own data and it works without writing any code specific to your types.
-
Well, if you're using Go
...then serialization is the least of your problems?
So instead of adding serdes methods to your objects, you just bolt friends onto the sides of them?
Not really; you just stuff that simple object into a
(De)serializeObject<T>
or its equivalent in your library and it uses its reflection magic to figure out its layout, no coding required. That's how JSON.NET works, at least. Of course, converting the object representation of JSON to your domain object might require some assembly.
-
So instead of adding serdes methods to your objects, you just bolt friends onto the sides of them? How much complexity does that save, in practice? ("That depends" is the answer I'm expecting there.)
In practice, you do virtually all of it through the properties on the POJO/POCO, guided by annotations on the fields and the class. It's actually a form of declarative programming, and that is good for this sort of thing.
You don't want to write all the serializer/deserializer by hand, as that's really dull and yet simultaneously awkward code. And yes, it's been tried. Over and over. We use declarative programming, introspection and library serialization engines for good reasons!
-
Well, you don't really have to use annotations unless you want to customize the output. For example, this is all the code I need (no annotations in my models) to generate some XML (or JSON) using XStream:
xstream.toXML(myObj, outstream);
The only annotation I'm using in this case is:
@XStreamOmitField private transient Boolean failed;
Of course,
myObj
is a standard POJO with some custom datatype fields, but those are configured on an application level:
xstream.registerConverter(new LocalDateTimeConverter());
It's not as easy as in JavaScript, but it's pretty straightforward and works with XML or JSON indistinctly.
-
Of course, myObj is a standard POJO with some custom datatype fields but those are configured on an application level:
xstream.registerConverter(new LocalDateTimeConverter());
I'd be more used to configuring that sort of thing via annotations. The aim would be to avoid having to explicitly say anything about the serialization engines anywhere in the business logic, leaving all that to the framework (which could use a very simple registration approach since the complexity would be handled via discovered declarative metadata of some nature).
Pedantry ahoy!
@Eldelshell said: It's not as easy as in JavaScript, but it's pretty straightforward and works with XML or JSON indistinctly.
I know what you're trying to say, but that use of “indistinctly” doesn't work in English; “works equally well with XML and JSON” would be better phrasing. English is funny sometimes.
-
I am aware of the JSON spec. My point is that undefined is a valid value in Javascript, it is a valid value for an object in Javascript, and it is a valid value for an array in Javascript. That "Javascript Object Notation" lacks an ability to handle it by specification should tell you something about whoever wrote the spec.
Also NaN, Infinity and FUCKING COMMENTS.
Fuck json.
EDIT:
Yes, I know it's been 4 days. Thread could be about bunnies by now, for all I know. Whatever. I wanted to say that, so there.
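For the record, here is what the built-in serializer actually does with those values (a quick Node sketch):

```javascript
// JSON.stringify quietly turns all three IEEE 754 specials into null:
console.log(JSON.stringify({ a: NaN, b: Infinity, c: -Infinity }));
// → {"a":null,"b":null,"c":null}

// And there is no comment syntax to complain about it inline:
// JSON.parse('{"a": 1} // why') throws a SyntaxError.
```

So the specials don't fail loudly; they degrade silently into null, which is arguably worse.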
-
Comments I can understand the lack of, but not being able to serialize the IEEE 754 exceptional values that have been supported since the late 80s? SHAME ON YOU, JSON!
-
Comments I can understand the lack of, but not being able to serialize the IEEE 754 exceptional values that have been supported since the late 80s? SHAME ON YOU, JSON!
You guys are picking on the inability to handle infinity, but forgot to bitch about JSON's lack of date support? None of the other problems mentioned here matter to me as much as the lack of real date support.
As far as I'm concerned, any sane serialization format should handle the data that I regularly move around. I don't send undefined or infinity all that often (actually, never). I should simply have to look at the list of formats supported by my client and server stacks, and pick one from the intersection of those two based on size, performance, and debuggability. There seems to be a bunch of people trying to tie JSON closer to JavaScript instead of using it as a way to get stuff from JavaScript to/from anywhere else. It's pretty telling that JavaScript is weak enough in this area that it's moving the entire market. Any sane platform just says "Oh, that's the new trend - here's a serializer for it".
-
You guys are picking on the inability to handle infinity, but forgot to bitch about JSON's lack of date support? None of the other problems mentioned here matter to me as much as the lack of real date support.
It's a lot easier to work around "we don't have a native date type in this serialization format" -- you just use ISO 8601 strings or Unix timestamps as a convention -- than "we have no way under the sun to serialize parts of the value space of one of our fundamental types".
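A minimal JavaScript sketch of that convention (the field name `created` is just an example):

```javascript
// Sender: encode the Date as an ISO 8601 string, by convention.
const wire = JSON.stringify({
  created: new Date(Date.UTC(2014, 11, 8, 7, 37, 37)).toISOString()
});

// Receiver: any field agreed to carry a date gets parsed back explicitly.
const msg = JSON.parse(wire);
const created = new Date(msg.created);
console.log(created.toISOString()); // → 2014-12-08T07:37:37.000Z
```

The convention lives entirely outside the format, but both ends can apply it mechanically.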
-
It's pretty telling that JavaScript is weak enough in this area that it's moving the entire market.
Is that it, or is it the prevalence of doing stuff in browsers? That's what my assumption is.
-
Is that it, or is it the prevalence of doing stuff in browsers? That's what my assumption is.
If JavaScript had enough plumbing to make a decent serializer back when it was taking off, someone would have invented an actually better format rather than inventing a format whose main design goal was to be easy to implement in JavaScript.
In other words, the only problem JSON solved is that SOAP was hard to do in JavaScript. If the real problem was that SOAP is verbose or hard to read or not standardized well enough, the end result should have looked quite different.
-
we have no way under the sun to serialize parts of the value space of one of our fundamental types
{ "NaN": "vs0yeEUYnu2RFdYEIUL3", "Infinity": "EMAyL5fyo1bmk5ra5ckK", "NegativeInfinity": "SAju182HTodq7jXcnEge" }
You're welcome.
-
It's a lot easier to work around "we don't have a native date type in this serialization format" -- you just use ISO 8601 strings or Unix timestamps as a convention -- than "we have no way under the sun to serialize parts of the value space of one of our fundamental types".
Both NaN and infinity can be serialized as strings too. Why are you OK with that solution for dates, but not those two? I'm not OK with serializing dates as undecorated strings because it leads to ambiguity. A deserializer can't tell a bad date from a good string. It also creates an extra step on both ends to properly consume the data.
-
And trailing commas in lists.
-
Both NaN and infinity can be serialized as strings too. Why are you OK with that solution for dates, but not those two? I'm not OK with serializing dates as undecorated strings because it leads to ambiguity. A deserializer can't tell a bad date from a good string. It also creates an extra step on both ends to properly consume the data.
I'm OK with them being serialized as textual content -- but as strings they create a data-dependent ambiguity (vs. "I always expect a date here").
-
I'm OK with them being serialized as textual content -- but as strings
You're going to have to give me more than this. What format can textual content have other than "string"?
-
You're going to have to give me more than this. What format can textual content have other than "string"?
It's the difference between syntactic text and quoted text.
-
Yeah, but that treats the whole data structure as an untyped mess. Instead of getting serialization errors on bad formats, you get null reference exceptions on use. That defeats the whole point of strong typing. It's deserialization to a POJO or POCO that I was referring to.
In .NET, if you use the excellent JSON.NET library, it will look at the types of the POCO's properties and decide accordingly whether an object literal should be deserialized to another strongly typed POCO, or should be treated as a
Dictionary<string,T>
where T is possibly another strongly typed POCO.
If deserialization fails at any point in the object graph, you get the equivalent of a parse error.
You don't get null-reference exceptions on use and you're not confronted with an 'untyped mess' being forced down your throat if you don't want it. (You can still get dynamic access through a special subclass of C#'s
DynamicObject
and you even get some form of LINQ operations on it for querying a complex object graph.)
Both NaN and infinity can be serialized as strings too. Why are you OK with that solution for dates, but not those two? I'm not OK with serializing dates as undecorated strings because it leads to ambiguity. A deserializer can't tell a bad date from a good string. It also creates an extra step on both ends to properly consume the data.
You can have an agreed-upon property carrying 'metadata' describing the type(s) to use for deserialization, if you really need to. Of course, if you're always deserializing to a well-defined class, you will always know the type in advance and won't need such metadata. Then it just becomes a matter of writing a proper value converter. E.g. one that knows how to interpret a string value of "NaN" or "Infinity" as a numerical NaN or Infinity in whatever your target language is.
JSON itself is meant as a basic serialization format to fulfill the base needs. And more often than not, those base needs suffice and forego a lot of the complexities, headaches and plain stupidity of something like SOAP-wrapped XML. If you need more, you are expected to bolt it on top.
-
Are you responding to me or someone else? For example:
Then it just becomes a matter of writing a proper value converter. E.g. one that knows how to interpret a string value of "NaN" or "Infinity" as a numerical NaN or Infinity in whatever your target language is.
You said this in response to me saying that an extra step is required to handle the data. You didn't refute the point, you just described what the extra step is and pretended I didn't already know what it was.
-
It's the difference between syntactic text and quoted text.
OK, still not getting your point across. How about explaining why it's OK to use regular strings for dates, but not infinity?
-
OK, still not getting your point across. How about explaining why it's OK to use regular strings for dates, but not infinity?
A string in JSON is a quoted string: {"foo": "abc"}. The syntax I am talking about is {"bar": +Infinity} -- note the lack of quotes here; it is no longer a string value, but a string syntax element treated as a numeric value. Representing it as {"bar": "+Infinity"} changes the type of the value, and is thus problematic because you now have a value with a dependent type. This doesn't come up for dates because the representation of the date as a JSON-supported type (either a string or a numeric timestamp) is independent of the value of the date.
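In JavaScript terms, the dependent type shows up as a mandatory branch on the consumer side; a sketch:

```javascript
// With {"bar": "+Infinity"}, bar is sometimes a number and sometimes a string,
// so every consumer has to type-check before doing arithmetic:
function readBar(json) {
  const bar = JSON.parse(json).bar;
  return typeof bar === 'string' ? Number(bar) : bar; // Number("+Infinity") === Infinity
}

console.log(readBar('{"bar": 42}'));          // → 42
console.log(readBar('{"bar": "+Infinity"}')); // → Infinity
```

That branch has to exist everywhere a "number" field is consumed, which is exactly the dependent-typing cost described above.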
-
Go constants are 256-bit floats without the ability to have NaN or ±Infinity.
Thanks Discourse for having ± show up in the preview but not the actual post. @discoursebot
-
±
Just type Compose+-. What's so hard?
PS: @discoursebot, did you miss the summon from (Hanzo'd) edit or did Dicsource forget to tell you?
-
A string in JSON is a quoted string: {foo: "abc"}.
You've got to quote the key too in JSON (unlike JS):
{"foo": "abc"}
-
PS: @discoursebot, did you miss the summon from (Hanzo'd) edit or did Dicsource forget to tell you?
hmm.... possibly offline....
-
Hooray! People are paying attention to me!
-
Just type Compose+-. What's so hard?
±
Actually had to hold down shift, then compose+, release shift and -. If I didn't have shift already down before, I just got +.
-
Thanks -- fixed now.
-
This doesn't come up for dates because the representation of the date as a JSON-supported type (either a string or a numeric timestamp) is independent of the value of the date.
But it's still ambiguous. +Infinity is obviously a number with your rules. "2014-12-08T07:37:37Z" could be either a string or a date. JSON needs something like #2014-12-08T07:37:37Z# to represent a date.
-
But it's still ambiguous. +Infinity is obviously a number with your rules. "2014-12-08T07:37:37Z" could be either a string or a date. JSON needs something like #2014-12-08T07:37:37Z# to represent a date.
You're asking "should my serialization format have a self-type-describing date representation?" I'm saying "that doesn't matter -- from the viewpoint of receiving software, either you're expecting a date or you're expecting a text string/number, and if you are expecting a date, you simply squawk at non-dates (admittedly, this is much easier for strings than it is for numeric timestamps)."
In contrast -- the Infinity difficulty is far harder to get out of with a string due to either a) the introduction of dependent typing into the mix, or b) having to convert the value of a fundamental type in and out of strings yourself -- which can lead to bugs if you, say, don't carry quite enough precision with you (an easy mistake for someone who isn't FP-literate to make).
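The precision trap is easy to demonstrate in JavaScript: the built-in formatter emits enough digits to round-trip, but a hand-rolled "N significant digits" formatter does not:

```javascript
const x = 0.1 + 0.2;  // 0.30000000000000004 in IEEE 754 doubles

// A hand-rolled formatter that keeps "only" 15 significant digits corrupts it:
console.log(Number(x.toPrecision(15)) === x); // → false

// String() emits the shortest digit string that round-trips exactly (17 digits here):
console.log(Number(String(x)) === x);         // → true
```

Anyone serializing doubles as text by hand has to know to keep 17 significant digits, which is precisely the FP-literacy trap mentioned above.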
-
But it's still ambiguous. +Infinity is obviously a number with your rules. "2014-12-08T07:37:37Z" could be either a string or a date. JSON needs something like #2014-12-08T07:37:37Z# to represent a date.
You don't need that. JSON gives you exactly what you need already.
{ "@typeURN": "iso:8601:2004:timestamp", "year": 2014, "month": 12, "day": 8, "hour": 7, "minute": 37, "second": 37, "timezone": 0 }
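A reviver for that (entirely made-up) convention might look like this in JavaScript, assuming `timezone` is an offset in minutes with Z as 0:

```javascript
// Rebuild a Date whenever the hypothetical @typeURN tag shows up.
// Subtracting the timezone offset (in minutes) normalizes the value to UTC.
const revive = (key, value) =>
  value && value['@typeURN'] === 'iso:8601:2004:timestamp'
    ? new Date(Date.UTC(value.year, value.month - 1, value.day,
                        value.hour, value.minute - value.timezone, value.second))
    : value;

const wire = '{"when": {"@typeURN": "iso:8601:2004:timestamp", "year": 2014, ' +
             '"month": 12, "day": 8, "hour": 7, "minute": 37, "second": 37, "timezone": 0}}';
console.log(JSON.parse(wire, revive).when.toISOString()); // → 2014-12-08T07:37:37.000Z
```

Both ends have to agree on the tag out-of-band, of course, which is the "bolt it on top" part.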
-
Where is that from? It's certainly not part of the standard.
-
Where is that from? It's certainly not part of the standard.
I invented it for the post but it's using an obvious partition of the fields and then encoding those as a JSON object. The timezone is an offset in minutes following the usual sign convention (Z corresponds to 0 of course; EST would be -300). The @typeURN is the only truly arbitrary thing in there, but I couldn't be bothered to figure out anything better. (I suppose I could write up some JSON Schema for it, but meh.)
-
Just like a string, that would require some type of step or else it would just be deserialized as an object with date-like properties. My problem isn't that it isn't possible to represent a date. My problem is that the JSON spec doesn't describe a representation, so it fails in its primary role of defining a data interchange specification.
This shows up in the real world as a half dozen different solutions built into various technology stacks that don't work and play well together.
-
and this whole date issue is why sockbot stores dates as timestamps.
(new Date()).getTime()
-
```
{
  "NaN": "vs0yeEUYnu2RFdYEIUL3",
  "Infinity": "EMAyL5fyo1bmk5ra5ckK",
  "NegativeInfinity": "SAju182HTodq7jXcnEge"
}
```
You're welcome.
Those don't *look* like MD5sums.... Oh. I see....
-
Oh. I see....
pulling a guess out of my tails.... base64 encoding of the internal state of the IEEE NAN values?
-
No. Random mashing at the keyboard. Or something akin to it.
Since Google hasn't picked up on this thread yet, none of the three turn up any results.
-
oh.
well then.
it's not often answers pulled out of my tails are right. this time was no exception.
;-)
-
My problem is that the JSON spec doesn't describe a representation
Yeah, colleague's been dealing with some Amazon API and the timestamps were represented as strings in whatever Java's default Date-or-whatever-they-used spits out by default. With textual timezone names. Pretty crappy to deal with in any other language, especially if the system does not provide locale data for the standard locale interface (so we resigned and converted it in Java first; the app is in C++, but the API is Android-specific, so there is some Java layer for connecting to the Java-only interfaces).
so it fails in its primary role of defining a data interchange specification
There are two paths. Start ambitious like XML and create a huge complicated specification that many developers won't understand and will misuse the thing, or start humble like JSON and create a simple specification which the developers will have to extend to meet their needs, often inevitably in crappy ways. Either way you end up with lot of crap.
I don't think JSON ever had the ambition for the interchange part. The intended use was primarily for serialization over the network within a project. XML was intended for the interchange part. Unfortunately, because it's too complicated and too ugly, many projects switched to JSON. There is also YAML, but apparently most developers got scared by its use of significant whitespace (it has some advantages, like a standard way to annotate the strong types).
-
```javascript
(+new Date())
```
GTFY (Golfed That For You)
Filed under: Quoting a code block still sucks
-
I don't think JSON ever had the ambition for the interchange part.
And that's why it's stupid that people are using it as such. That doesn't fix the fact that dates are a bitch in JSON - even for same-app use.
-
Erm, if the data store is filtered then I don't really see the point in your complaint...
There are two stores here: the online server memory and the flat file. These will eventually be replaced by a database. The purpose of the current system is to stick as close to the semantics the database will use as possible.
Also, my example was just one of the problems. Just imagine a JSON object, where every other entry was undefined and similar issues.
This is easy: [1,,3,,5,,7,,9]... and so on. The point @tarunik made earlier about JS arrays already having a notation for this and JSON for some reason not supporting it is very relevant here.
Done. Simply use my_array.getByKey(1) or something. I mean, even if your original proposal worked you'd still have to deal with the missing entries. Prototype replacement functions for the methods you need once and you're done.
Searching the entire array for an object's inner key is much slower than accessing that object at an element's position. Also, it doesn't reflect the semantics of the database at all. Simply keeping the nulls I'm stuck with and adding a few lines to deal with them is far less work than creating the machinery necessary to support this manual indexing proposal. With plain arrays, I just push; with your proposal, I need to keep track of the next key manually, which is another thing the DB handles that I won't have to.
Arrays, like everything else in JavaScript, are hashes.
On the bottom level, yes; top-level semantics are relevant here. For instance, you can't just arbitrarily call members of Array on any old Object.
is fine; if JSON allowed its arrays to be sparse and used the same notation as Javascript does, it wouldn't need an extra keyword.
This is true; I should have said "value" instead of "keyword". I can't off the top of my head think of a reason to include the keyword.
In that case, you could tidy things up a bit by just accepting the use of null as the JSON placeholder value for undefined JS array elements, and passing JSON.parse() a reviver function that returns undefined whenever it sees a null.
Indeed, but instead of bothering with these (which would also require me to modify the middleware here, which I have not even looked into doing), I'm just skipping nulls in the client code for array-like data "manually" just as the functions walking those arrays would have skipped undefined.
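For reference, a sketch of that reviver approach (note it can't tell a deliberate null from a genuine hole; it strips them all):

```javascript
// Returning undefined from a JSON.parse reviver deletes the property;
// for an array element that leaves an actual hole, and length is preserved:
const arr = JSON.parse('[1,null,3]', (key, value) =>
  value === null ? undefined : value);

console.log(arr.length); // → 3
console.log(1 in arr);   // → false: index 1 is a genuine hole again
```

So the round trip can be repaired, but only by a blanket convention that nulls never mean null.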
eah, I meant unsorted not unordered. Anyway, the point stands on relying on an array's index of an element to base decisions upon.
That index is the unique identifier of the data at that index. Please explain in depth how basing access to that data on its index (sort of the very definition of an array) is stupid. Also, the array is indeed sorted--everything gets pushed onto the back. The order is the order in which items were added. (Sort of like a DB table with a unique, monotonically increasing key... hrm...)
Yes, this is treating a sparse array as if it were mostly a hashmap. But in Javascript, as in many other languages that support sparse arrays, arrays implement hashmap semantics as well as the indexing, ordering and all-at-once processing you typically get with non-sparse arrays, and using them can result in cleaner and more concise code than generalizing to non-numeric keys and using straight-up hashmaps would do.
I could not have said this better.
Inner-platforming Javascript in order to avoid its perfectly usable O(1) array lookup operation in favour of a verbose O(n) EAV-style replacement? Very enterprisey.
The burn is strong with this one.
That yielded the following in Chrome on an i5 3570
In duplicating your test I found that the #1 slow thing was creating those arrays. Interestingly, reading them in from JSON was way faster.
Object size: 10^2 or ~2^7 vs 10^7 or ~2^23 is about 2-3x... looks more like O(log2(n)) to me. I wonder what searching that large object for your inner key would have been...
Yes, your point being? If you can't use the "perfectly usable" JS arrays due to deficiencies in your data source, then I don't quite see why you criticise me for posting an alternative?
You seem to have missed my original point: JSON does not support the full range of JS values, making processing arrays sent via JSON require logic that does not exist for arrays not sent via JSON. This could be fixed with a small adjustment to the spec.
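The round trip in question, for the record:

```javascript
const sparse = [1, , 3];       // a real hole at index 1
console.log(1 in sparse);      // → false

// JSON.stringify papers over the hole with null...
const wire = JSON.stringify(sparse); // '[1,null,3]'

// ...so the hole does not survive the round trip:
const back = JSON.parse(wire);
console.log(1 in back);        // → true (it's a null now, not a hole)
```

Every array-walking function downstream now has to special-case those nulls, which is the extra logic being complained about.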
But you can, and @VaelynPhi already is.
Indeed, I am. The magic code looks something like this:
map(sparseObj, function(v){ if(v===null || v===undefined) return; })
Secondly, if the stringify-method does indeed accept such functions, then I don't understand the OP's complaint at all. I mean, we're all familiar with the concept that casting from one data type to another doesn't always happen with 100% accuracy and without any data loss.
As I pointed out earlier in this reply, I do not have (easy) access to the middleware that is doing this translation to and from JSON; in fact, it's only this issue that made me aware of the fact that it is using JSON for this purpose. Also, it is highly unlikely that a developer would have access to this kind of middleware, and often would not be able to change it.
var foo = function(){};
console.log(JSON.stringify(foo)); // undefined
I wasn't going to bring up JSON's other oversights, since I think it's obvious that many of them would require significant respeccing.
A sparse array is something that probably shouldn't exist in your data interchange format.
Except that sparseness is a semantic feature of data that one would possibly like to interchange. As a rule, if you want to communicate it between parts of your program, somewhere there is someone who wants to communicate it between machines, probably for about the same reasons.
The accepted way to serialize the type of data that JavaScript can handle as a sparse array is a collection of name-value pairs.
This is a common approach, yes; whether it is good is another argument entirely.
fun fact: in C, array indexing IS an O(1) operation.
funner fact: there are three different speeds that O(1) operation can complete in.
listed in fastest to slowest speeds
Index value is in CPU cache (L1, L2 or L3 (which are different speeds, but they're very close))
Index value is in Core memory (RAM)
Index value is in RAM that has been paged to disk.
I LOLed, and scared my cat, when I originally read this. You forgot one:
- Index value is stored in a database on IIS and isn't available because the server is rebooting for updates.
It is O(n).
The data do not seem to support this.
Maybe it is, but n is something like .0000001, so who cares?
For Big-O, n is generally an integer representing operations, iterations, or data elements. A fractional value is... novel here.
Jaime:
Yeah, but that treats the whole data structure as an untyped mess.
Yes, it's an honest representation.
I LMAOed.
In fact any decent hash table implementation is likely to be closer to O(log n) than O(n), and its actual t(n) function will look something like a + b log n with b very small relative to a. The entire point of a good hash table is to achieve a nearly constant lookup time.
I tried finding something reliably showing this benchmark, but I didn't want to have to look through 140 revisions on jsperf to find the one that didn't think [{ key: 1000000, value: "bob" },{ key: 1000001, value: "sue" }] was a valid array to test lookup on.
Personally I would still strongly prefer that internal hashmap implementation details for my client language of choice did not leak into my JSON interchange formats; {"key": "0", "value": "zero"} as a proposed replacement for {"0": "zero"} just smells horrible to me.
Which itself is a poor way to represent ["zero"]. (I know, you said hashmaps... I was just saying.)
In order to end up with the problem the OP had, you almost certainly are in a situation where you should model it as a map or a key/value pair.
This is very not true. Given that clients are able to add or remove elements, the arrays are dynamic by nature. And, as I have already said, I am not using a map and everything works fine. The issue is that functions expecting true arrays (which have undefined values) require extra handling when those values are null instead. JSON simply supporting sparse arrays would completely remove this issue.
I don't understand how people end up with these JSON blobs that don't have a defined schema. How would you expect to do anything meaningful with the data if you don't even know its structure? How does a property "go missing"?
For objects it would be passing strange for there to be no schema or no idea what to expect. For an array, there are two possible "empty" values: null or undefined. One is, in effect, "this is empty because nothing goes here", while the other is "nothing is what should be here".
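In JavaScript the two "empty" flavors are genuinely distinguishable, which is exactly what gets lost in JSON; a quick sketch:

```javascript
const a = [];
a[0] = null;   // "nothing is what should be here": the slot exists, holding null
a[2] = 'x';    // index 1 was never assigned: "nothing goes here"

console.log(0 in a); // → true  (a real slot containing null)
console.log(1 in a); // → false (a hole)

// Iteration methods skip holes but do visit nulls:
a.forEach((v, i) => console.log(i, v)); // logs 0 null, then 2 'x'
```

Collapsing both flavors to null on the wire is what forces the manual skipping described earlier in the thread.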
Exactly, which is why using the untidy version smells bad.
Unnecessarily prolix data interchange formats give me hives. The entire point of text-based data interchange is easy human readability; obscuring the actual data in redundant boilerplate is an error, to my way of thinking.
And in my case, it completely bypasses the builtin methods for dealing with array objects, making them harder to work with, which was my original complaint about JSON not preserving undefined.
Well, if you're using Go, you just use someone else's serialization library (the standard library has json and xml support) and your own data and it works without writing any code specific to your types.
Okay, now you're starting to sound like a Jehovah's Witness even to me.
Also NaN, Infinity and FUCKING COMMENTS.
Fuck json.
I'm actually a little surprised these aren't represented... but I suppose portability might be an issue there... but I suppose that brings up all the other arguments in this thread....
I think the JSON spec needs to grow a little.
Go constants are
Seriously, WTH? I think we need a @lubarbot to automatically translate all his posts into lojban.
-
Except that sparseness is a semantic feature of data that one would possibly like to interchange.
I doubt it. In languages without sparse arrays, key-value pairs fill the same logical role. The true access semantics are either lookup by key or lookup by index. If you intended to create something that behaves exactly as JavaScript sparse arrays do, then you need professional help.
-
and this whole date issue is why sockbot stores dates as timestamps.
(new Date()).getTime()
Date.now()
-
It's from 2011. If your interpreter doesn't support that, you should probably update it.
-
Which itself is a poor way to represent ["zero"]. (I know, you said hashmaps... I was just saying.)
Yeah, it is, but it's about as good as JSON can do for a sparse array. Of course, now that I have finally understood that you don't get any control at all over the actual JSON stringify/parse process, it's all moot; you can't do better than the post-facto null-stripping you're already doing with map().
-
Random mashing at the keyboard. Or something akin to it.
If I'm going to generate random values with effective probability 0 of ever being generated anywhere else by any process ever again, I don't trust my keyboard. Those all came from random.org.