Postgres JSON :wtf:


  • Impossible Mission - B

    Postgres has a JSON type. This is a good thing...

    ...right up until you try to run a select distinct operation and it fails, because there's no equality operator defined for the JSON type.

    Seriously?!? How can this possibly be a problem, when JSON is textual data? The equality operator is string comparison; is there any reason at all why it needs to be more complicated than that?


  • Winner of the 2016 Presidential Election

    @masonwheeler said in Postgres JSON :wtf::

    Postgres has a JSON type. This is a good thing...

    ...right up until you try to run a select distinct operation and it fails, because there's no equality operator defined for the JSON type.

    Seriously?!? How can this possibly be a problem, when JSON is textual data? The equality operator is string comparison; is there any reason at all why it needs to be more complicated than that?

    Given JSON represents a not-necessarily-ordered serialization of an object, I could see a case being made that JSON equality should account for objects being the "same" but with fields in a different order. Or something like that.



  • @masonwheeler said in Postgres JSON :wtf::

    Seriously?!? How can this possibly be a problem, when JSON is textual data? The equality operator is string comparison; is there any reason at all why it needs to be more complicated than that?

    There is no guaranteed string order when converting to JSON.

    {"a": "b", "c": "d"} and {"c": "d", "a": "b"} are equivalent objects but not equivalent strings.

    At least that was the official explanation when I looked into it. Personally, it seems (to me) you can get around this by having a well known stringifying order, but there are probably deeper nuances that I don't grasp.


  • Impossible Mission - B

    @dreikin ...huh. That's actually a decent point.

    Doesn't apply to the scenario I discovered this in, (trying to determine whether or not two rows in a denormalized table that was created by copying values from other tables are actually duplicates,) but yeah, I guess that's a valid point.



  • I think SELECT DISTINCT works on jsonb, where the ordering is well-defined.



  • @dreikin said in Postgres JSON :wtf::

    @masonwheeler said in Postgres JSON :wtf::

    Postgres has a JSON type. This is a good thing...

    ...right up until you try to run a select distinct operation and it fails, because there's no equality operator defined for the JSON type.

    Seriously?!? How can this possibly be a problem, when JSON is textual data? The equality operator is string comparison; is there any reason at all why it needs to be more complicated than that?

    Given JSON represents a not-necessarily-ordered serialization of an object, I could see a case being made that JSON equality should account for objects being the "same" but with fields in a different order. Or something like that.

    Postgres has two types of JSON.

    There are two JSON data types: json and jsonb. They accept almost identical sets of values as input. The major practical difference is one of efficiency. The json data type stores an exact copy of the input text, which processing functions must reparse on each execution; while jsonb data is stored in a decomposed binary format that makes it slightly slower to input due to added conversion overhead, but significantly faster to process, since no reparsing is needed. jsonb also supports indexing, which can be a significant advantage.

    Because the json type stores an exact copy of the input text, it will preserve semantically-insignificant white space between tokens, as well as the order of keys within JSON objects. Also, if a JSON object within the value contains the same key more than once, all the key/value pairs are kept. (The processing functions consider the last value as the operative one.) By contrast, jsonb does not preserve white space, does not preserve the order of object keys, and does not keep duplicate object keys. If duplicate keys are specified in the input, only the last value is kept.

    In general, most applications should prefer to store JSON data as jsonb, unless there are quite specialized needs, such as legacy assumptions about ordering of object keys.

    If I'm reading the documentation correctly, the regular comparison operators should work for jsonb type...

    For json, you'd need to cast it to text first; you could do anything you'd normally do with text then.

    Another thing that might help is using ->> to extract JSON fields as text, if what you're comparing them to is text.


Log in to reply