Fear ye, all DBAs who dare to tread here (or: The Column of Doom™)


  • ♿ (Parody)

    Probably, but that's just hiding the problem under an abstraction. Maybe a materialized view would "solve" the problem, but now you're doing a less efficient method of normalizing the data on import, which would make everything much easier to work with.



  • You could write a materialized view like this, but that doesn't solve the problem of getting the text files back out of the system after you've changed them. For that, you need to write some code that can reconstruct the file-- and if you're writing the code anyway, you might as well do the database schema correctly in the first place.



  • @blakeyrat said:

    You could write a materialized view like this, but that doesn't solve the problem of getting the text files back out of the system after you've changed them.

    The only processing that we have currently planned for these records should be able to work on them in their denormalized form -- it's the reporting queries that require the renormalization, and a materialized view would be quite suitable here; normalize-on-import requires slightly fiddly special casing, and a separate processing pass for this would be more trouble than it's worth I think.



  • @tarunik said:

    The only processing that we have currently planned for these records should be able to work on them in their denormalized form -- it's the reporting queries that require the renormalization, and a materialized view would be quite suitable here;

    Oh well do that then. I think the keyword you want is PIVOT in T-SQL-speke. EDIT: or you can trick the XML parser built-into SQL Server to do it for you, that's a lot faster IIRC.

    You were the one who said earlier that you need to write the "said structured text files" back to disk.



  • @blakeyrat said:

    You were the one who said earlier that you need to write the "said structured text files" back to disk.

    Yes -- but we can do that from the denormalized form.


  • Discourse touched me in a no-no place

    @boomzilla said:

    That's awful and you should be ashamed of yourself. Still, you're joining on CLOBs. While I didn't specify, the spirit of my imagination was something worse that didn't involve CLOBs.

    RDF data, stored in a BLOB bit for bit as it's been pushed in from some shitty web server that didn't bother to store the encoding of the text data. The things to index on are stored as relative URLs inside the RDF data, and some of the critical things you need to extract need both an OWL reasoner and some ridiculous self-JOINed monstrosity to build up a sort of transitive closure of the relations described within.

    The intermediate-level encoding of the RDF data may be XML. Or TTL. Or JSON-LD (this is distinct from the low-level encoding which might be UTF-8 or UCS-16 or any number of weird things some crazy client specified).

    If you've no idea why I would propose this horrible thing, you are ever so lucky.



  • @dkf said:

    UCS-16

    ITYM UCS-2. UCS-16 would be like UTF-128.


  • Discourse touched me in a no-no place

    @ben_lubar said:

    ITYM UCS-2. UCS-16 would be like UTF-128.

    I suspect I was conflating UCS-2 and UTF-16.



  • @ben_lubar said:

    UCS-16 would be like UTF-128.

    For encoding every written language in the galaxy.



  • Update -- the materialized view thing won't work, because the query I was able to find to do all the string-splitting needed to populate the materialized view:

    1. still takes half an hour (although that's an order of magnitude improvement over the OP, it's still far too long to refresh a materialized view), and
    2. confuses Oracle so badly that it has no idea how to fast refresh the materialized view based on the query, even if it's a R/O materialized view. (The query cross joins to a subquery, which is what trips up Oracle, apparently -- having a complicated mess of subquery in your FROM clause just causes Oracle to complain and say "nope, can't fast refresh this, there's this big blob of subquery in your defining query I can't get rid of".)

    So, I'm going to have to write out the side table myself, instead of letting Oracle maintain it for me. :sigh:


Log in to reply