Representing probability set data in an SQL database (or is there a better way)?



  • My personal project involves tons of probability mappings. Basically they're fancy collections of entities (well, really strings right now, but they could be ids just as well) with a weight assigned. These are used as input when generating the actual output entities.

    Effectively, I have Generator entities and Output entities (I don't use those names, but that's the concept). Each Generator has a bunch of data in the form of these mappings, plus rules for constructing the Output entities by choosing "one from mapping A, 3 from mapping B unless X, in which case N from mapping C, etc." The Output entities are mostly just dumb, static, immutable DTOs that get displayed on screen, saved to a text format, printed, or whatever. So all the interesting parts are in the Generators and the related data.

    Currently, the raw data (the mappings) are stored in deeply nested JSON files. But that's becoming cumbersome and error-prone (it requires hand-editing JSON), and I want to upgrade how I edit and store them. I also want to learn and practice more with various technologies like Entity Framework (it's a C#/Windows/WPF project). So I'd like to re-implement the current spaghetti design backed by a "real" database (OK, probably just SQLite, because it's a personal project and doesn't need much more than that).

    How would you go about creating these mappings? Are they really just a mapping class containing something like an entity ID and a probability, which then gets translated into a crap-ton of tables? Or should I just stop worrying about that?
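
    A minimal sketch of the "entity plus weight" idea, assuming it collapses to just two tables rather than a crap-ton: one row per mapping and one row per (mapping, entity, weight). It's in Python with the standard sqlite3 module only to keep it short and runnable; the table names, column names, helper functions and sample entities are all made up for illustration and aren't from the actual project.

```python
import random
import sqlite3

# Hypothetical schema: every mapping is a named set of weighted entities.
SCHEMA = """
CREATE TABLE mapping (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE mapping_entry (
    mapping_id INTEGER NOT NULL REFERENCES mapping(id),
    entity     TEXT    NOT NULL,
    weight     REAL    NOT NULL CHECK (weight >= 0),
    PRIMARY KEY (mapping_id, entity)
);
"""


def add_mapping(conn, name, weights):
    """Store one mapping given as a dict of entity -> weight."""
    cur = conn.execute("INSERT INTO mapping (name) VALUES (?)", (name,))
    conn.executemany(
        "INSERT INTO mapping_entry (mapping_id, entity, weight) VALUES (?, ?, ?)",
        [(cur.lastrowid, entity, weight) for entity, weight in weights.items()],
    )


def pick(conn, name, count=1, rng=random):
    """Draw `count` entities (with replacement) from the named mapping,
    each with probability proportional to its weight."""
    rows = conn.execute(
        "SELECT e.entity, e.weight "
        "FROM mapping_entry AS e JOIN mapping AS m ON m.id = e.mapping_id "
        "WHERE m.name = ? ORDER BY e.entity",
        (name,),
    ).fetchall()
    entities = [row[0] for row in rows]
    weights = [row[1] for row in rows]
    return rng.choices(entities, weights=weights, k=count)


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Sample data only; "gamma" has weight 0, so it should never be drawn.
add_mapping(conn, "A", {"alpha": 5, "beta": 3, "gamma": 0})
```

    With that in place, a rule like "3 from mapping A" would be `pick(conn, "A", count=3)`. The weights don't need to sum to 1, since only their ratios matter. The "unless X, in which case N from mapping C" logic would presumably stay in code rather than in the tables.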


  • Discourse touched me in a no-no place

    @Benjamin-Hall FWIW, SQLite has stuff to work with JSON these days. (I've not used that stuff — my data model only requires storing the JSON and nothing else, as all the info I care about is in other columns — but I've read about it.) That might help you when you ingest stuff; you can put things into a table and process that to build your operational data model.
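
    As a rough sketch of that ingest step: the snippet below drops a JSON document into a staging table and uses SQLite's `json_each` table-valued function to flatten it into rows. It assumes a SQLite build with the JSON functions available, and the JSON shape (mapping name, then entity, then weight), the table names and the sample values are invented for illustration; the real files are described as more deeply nested.

```python
import sqlite3

# Hypothetical input shape: {mapping name: {entity: weight}}.
raw = '{"A": {"alpha": 5, "beta": 3}, "B": {"gamma": 1}}'

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging (doc TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE mapping_entry ("
    " mapping TEXT NOT NULL,"
    " entity  TEXT NOT NULL,"
    " weight  REAL NOT NULL,"
    " PRIMARY KEY (mapping, entity))"
)

# Step 1: store the JSON text as-is.
conn.execute("INSERT INTO staging (doc) VALUES (?)", (raw,))

# Step 2: flatten it inside SQLite. The outer json_each walks the mapping
# names; the inner json_each walks each mapping's entity/weight pairs.
conn.execute(
    """
    INSERT INTO mapping_entry (mapping, entity, weight)
    SELECT m.key, e.key, e.value
    FROM staging AS s,
         json_each(s.doc)   AS m,
         json_each(m.value) AS e
    """
)

rows = conn.execute(
    "SELECT mapping, entity, weight FROM mapping_entry ORDER BY mapping, entity"
).fetchall()
```

    Once the data is in `mapping_entry`, the staging table can be dropped or kept around for re-imports.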

