NoSQL candidate?


  • Discourse touched me in a no-no place

    I have little to no knowledge of NoSQL, but it has been proposed as an alternative to what we have at the moment:

    • We have telemetry data coming in from devices. A single device will have different 'groups' of data at different periods.
    • There is a limited (but expected to grow) number of such 'groups,' for example
    • one group of data may be GPS co-ords, speed and direction (maybe every second or 5 seconds.)
    • another group might be how much bandwidth has been used (say every minute)
    • a third may be system related stuff - CPU temp, load averages, disk space (every 5/10 minutes)
    • another may be customer usage data
    • At the moment each group is stored in its own MySQL table in the database. If a new group is required, a new table is created for the data to go live in.

    The database is meant to be the source of data for reporting tools, monitoring etc.

    Someone has suggested that due to the nature of the changing types/amounts of data being supplied, a NoSQL database would be better.

    Hoping I've supplied sufficient background/information, anyone with NoSQL experience have any thoughts before I go digging around for tutorials and the like?


  • sockdevs

    mongoDB is the current winner for nosql at the moment as far as i know.

    the docs are decent, and it'll certainly get the job done but.... well i don't like it. it's a reporting nightmare, as any nosql data store will be.



  • How are you going to query it? For stats, or per user, or olap/oltp?
    Your data model sounds like a good fit for a relational db.



  • @PJH said:

    The database is meant to be the source of data for reporting tools, monitoring etc.

    Are these doing live reporting or just generating static reports? How long is acceptable for data retrieval? I always default to structured RDMS's, so personally I'd be wary. Also, I am not sure what your current pain points are. Why are people suggesting an alterative?



  • @accalia said:

    mongoDB is the current winner for nosql at the moment as far as i know.

    the docs are decent, and it'll certainly get the job done but.... well i don't like it. it's a reporting nightmare, as any nosql data store will be.

    Mongo is the definition of mediocre IMO.


  • sockdevs

    @cartman82 said:

    Mongo is the definition of mediocre IMO.

    got a better nosql db engine?

    because mediocre defines my experience with nosql full stop. i still plan to eventually replace the nosql data store i use in @todo with a sqlite data store


  • sockdevs

    @PJH said:

    We have telemetry data coming in from devices

    I've done vehicle telematics stuff in the past; the system had a combination of live reporting and historical. And I can't think of a single reason why a NoSQL DB would be preferable to a well-designed relational DB.



  • A different team at my company had a similar use case (timed stream of events, different groups). They picked InfluxDB. It seems to be optimized for exactly what you need.

    My brief experience with it was building one of the data providers. Influx seemed nice. Communication through http. Web interface. SQL-like query language for examining the data.

    I'm not on that project any longer, but I heard grumblings from the permanent maintainers. Could be about influx, not sure. If you're interested and can wait until Monday, I can ask around.

    Other than that, IMO, there's nothing wrong with stuffing this into MySQL. The only downside is, if you want sharding and stuff like that, you need to implement this in your own code instead of have the db do it for you.



  • @accalia said:

    mediocre defines my experience with nosql full stop

    Qft.
    Also I think mediocre is being generous here...



  • @accalia said:

    got a better nosql db engine?

    because mediocre defines my experience with nosql full stop. i still plan to eventually replace the nosql data store i use in @todo with a sqlite data store

    Other document based NoSQL-s at least provide some special features, like temporal versioning in CouchDB. With mongo, other than the ease of "getting things going", there's nothing there. For real projects, I'd always go relational. Hell, even PostgreSQL is a better choice for document storage these days.

    Note that I'm excluding specialty databases, like column NoSQL-s or graph databases or caches/messaging nexuses, like redis, etc.


  • Discourse touched me in a no-no place

    @jaming said:

    Why are people suggesting an alterative?

    Because of the dynamic nature of the db schema makes querying it a pain. Apparently.

    If i could come up with a static schema for a RMDB that could cope with the requirements I'd be happier.

    What I don't want though is a huge EAV table...

    @cartman82 said:

    . If you're interested and can wait until Monday, I can ask around.

    Please. I'm not at work next week, so no rush.


  • sockdevs

    @PJH said:

    If i could come up with a static schema for a RMDB that could cope with the requirements I'd be happier.

    If you were using an objects-first ORM approach, then you could have an entire class hierarchy serialised to a single table; I know NHibernate can do this. Then you'd also be able to leave all the difficult query stuff to the ORM framework, and just work with the objects directly.

    …but I guess your system isn't designed that way. Not from what I can get out of the OP, anyway.


  • Discourse touched me in a no-no place

    @RaceProUK said:

    …but I guess your system isn't designed that way. Not from what I can get out of the OP, anyway.

    No, but something I'll look into.

    I basically have free reign to redesign.



  • @PJH said:

    If i could come up with a static schema for a RMDB that could cope with the requirements I'd be happier.

    :fa_thumbs_o_up:



  • SQLite (I use with SQLiteManger (and a few other managers I won't recommend because they are no longer supported)). I really like it's small footprint (good for transaction stuff like logging), You might have some trouble expecting it to do things it can't because you just got through writing robust queries against whatever your bigger DB is). Plus no user/pass authentication....although it can be compiled with that feature.


  • mod

    @PJH said:

    If i could come up with a static schema for a RMDB that could cope with the requirements I'd be happier.

    We had a similar issue with trying to design a system for a completely different purpose: we needed to store information that all needed to be used for a single report, but the structure of that data varied widely depending on the source of that data. Our solution ended up being along the lines of:

    • Put everything into one table table, with multi-purpose fields.
    • There would be one field that would identify the source (the group in your case), which in turn determined how the data is stored in each record.

    The structure looks something like (this is in a SQL Server, so adjust accordingly):

    Dynamic_Data             
     column                   | data type      | Notes
    --------------------------+----------------+----------------------------------------------------
     Dynamic_Data_ID          | bigint         | Primary Key
     Foreign_Table            | int            | References a master table that indexes all the
                              |                |  tables in the DB. Not my preference, but oh well.
     Foreign_ID               | bigint         |
     Received                 | datetime       |
     Source_Type_ID           | int            | We have a keyed lookup list table. This links
                              |                |  to the appropriate list in that table.
     Numeric_1                | numeric(15,10) |
     Numeric_2                | numeric(15,10) |
     Numeric_3                | numeric(15,10) |
     Numeric_4                | numeric(15,10) |
     Numeric_5                | numeric(15,10) |
    

    Admittedly the actual table is more complex, with fields for additional data types. Using this architecture, we only need to add a Source_Type to our lookup list, define how its records use the schema any time a new source type is created, and adjust the appropriate stored procs when a new source type is created. In the past 5 years, we've added about a dozen source types and we haven't had to adjust the table schema once.

    If you designed the procs with a way to automate selection of a view for each group, that would simplify maintenance even more. :P



  • Cassandra was the only NoSQL database worth looking when I was reading about this stuff.



  • @PJH said:

    NoSQL, but it has been proposed as an alternative to what we have

    <Lional Hutz>
    Oh! They got this all screwed up...
    No. SQL!
    </Lional Hutz>



  • cassandra?

    [Edit for more context] A bit of an older document related to benchmarking, but more or less still relevant today: http://www.planetcassandra.org/nosql-performance-benchmarks/#Toronto


  • sockdevs

    @Matches said:

    cassandra?

    never dealt with it personally. i'll give it a look the next time i get the brillant idea that document storage is a better solution than relational DB for a problem.


Log in to reply
 

Looks like your connection to What the Daily WTF? was lost, please wait while we try to reconnect.