Help me pick a persistence solution



  • I need a quick and dirty way to store the current state of my node.js service, so that it can recover after a crash / being shut off unexpectedly.

    Basically, I need just 3 operations:

    • save all
    • save single key (presuming the whole dataset is something like a hashmap)
    • load all

    That's it. No queries, search etc.
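    Roughly, the entire API I'm after is something like this (a stub sketch; the names are made up purely for illustration):

    // the whole surface I need; names are illustrative only
    var store = {
    	// persist every collection at once
    	saveAll: function (collections, callback) { /* ... */ },
    	// persist a single key of a single collection (the hot path)
    	set: function (collection, id, value, callback) { /* ... */ },
    	// read everything back after a restart / crash
    	loadAll: function (callback) { /* ... */ }
    };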

    Right now I'm using redis, because that's what I had available, but that's less than ideal. I'm not getting the advantages of the things redis is good at (fast in-memory access, pub/sub), but I am getting all the disadvantages (memory usage, another process to think about, crappy persistence).

    I'm definitely getting rid of the process and going with a database as a library.

    So, my first thought was SQLite. But even that seems like overkill. I don't really need any of its querying capabilities.

    My second thought was: just use simple JSON files, one per "table" (or object/hash/array) I want saved. But that doesn't give me a good way to do key saves. I'd have to rewrite the whole file each time (which might still be more efficient than an in-place update?).
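    To make that concrete, per collection it would be something like this (rough sketch; the path is made up):

    var fs = require('fs');

    // dump the entire collection into one JSON file, overwriting it every time
    function saveCollection(name, collection, callback) {
    	fs.writeFile('/tmp/data/' + name + '.json', JSON.stringify(collection), callback);
    }

    // read it back on startup
    function loadCollection(name, callback) {
    	fs.readFile('/tmp/data/' + name + '.json', 'utf8', function (err, json) {
    		if (err) return callback(err);
    		callback(null, JSON.parse(json));
    	});
    }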

    My third thought, just hack together an ad-hoc file storage system myself. Like, directories are "tables" and files inside are individual keys. Something like that. But that sounds WTF-y and not something I should be wasting my time on if there are other solutions available.

    My primary concern is write safety. I need to do "save state" and know the state is carved in stone, and will be there when I get back.

    My secondary concern is write speed; I don't want to get bogged down in the slow-ass VFS storage system I'll probably be working with.

    Right now, I'm closest to just saving the JSON every time and forgetting about keys, but I'm afraid that could turn out to be too expensive (up to something like 1000 saves per second, each a hash with like 1000 objects).

    Your thoughts?



  • You're missing some really important info, like the size of the data to be stored and the frequency of change. 1000/1000 is a great goal, but are those deeply nested objects?

    Also, concurrency. Do multiple processes write?

    RDBMSes are specifically designed for atomic commits, so something like sqlite would be a quick local data store that guarantees atomicity.


  • Discourse touched me in a no-no place

    @cartman82 said:

    So, my first thought was SQLite. But even that seems like overkill. I don't really need any of its querying capabilities.

    Maybe not yet. But they're there if you want them. However, SQLite was specifically designed to be used as a way to save some shit out easily (so much so that I've seen someone suggest putting all their program's data structures in SQLite, and not been able to figure out if they were taking the piss), and it handles all the atomic write stuff for you, which is otherwise really difficult to get right. (Seriously, getting disk syncs right is one of those things that you want someone else to code up correctly for you. There's a lot of wrong stuff out there.)
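    For reference, the key/value use really is just a two-column table. A rough sketch with the node sqlite3 module (all the names and paths here are made up, adjust to taste):

    var sqlite3 = require('sqlite3');
    var db = new sqlite3.Database('/tmp/state.db'); // made-up path
    db.serialize(); // queue statements in order

    // one "collection" = one two-column table
    db.run('CREATE TABLE IF NOT EXISTS state (id TEXT PRIMARY KEY, data TEXT)');

    // "save single key" is an upsert; SQLite does the journalling/fsync dance for you
    function set(id, value, callback) {
    	db.run('INSERT OR REPLACE INTO state VALUES (?, ?)', [id, JSON.stringify(value)], callback);
    }

    // "load all" is a single scan
    function loadAll(callback) {
    	db.all('SELECT id, data FROM state', function (err, rows) {
    		if (err) return callback(err);
    		callback(null, rows.map(function (row) {
    			return { id: row.id, value: JSON.parse(row.data) };
    		}));
    	});
    }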



  • @Matches said:

    You're missing some really important info, like the size of the data to be stored and the frequency of change. 1000/1000 is a great goal, but are those deeply nested objects?

    Hash with ~20 keys, 2 nested hashes with ~10 keys each. 1KB when saved to HDD. Nothing special.

    EDIT:

    Frequency of change: we are talking 1000 / second "key updates" across different collections (so I wouldn't be hitting the same file 1000 times per sec, and no debouncing is possible). I'd need a full collection dump much more rarely than that. That's why some kind of "per key" solution appeals to me.

    @Matches said:

    Also, concurrency. Do multiple processes write?

    No, single process owns its data.

    @Matches said:

    RDBMSes are specifically designed for atomic commits, so something like sqlite would be a quick local data store that guarantees atomicity.

    @dkf said:

    and it handles all the atomic write stuff for you, which is otherwise really difficult to get right. (Seriously, getting disk syncs right is one of those things that you want someone else to code up correctly for you. There's a lot of wrong stuff out there.)

    What's the magic sauce? Why is it better than writing a string to a single file in one go, synchronously?


  • Discourse touched me in a no-no place

    @cartman82 said:

    Why is it better than writing a string to a single file in one go, synchronously?

    It's something relating to doing a sync() on the right thing, together with the right write pattern and the right locking pattern. I was told once and I remember thinking “well that's not obvious at all” and promptly forgetting the details. 😄 And it's different on Windows.

    I vaguely remember that the actual hard part (on non-networked filesystems) is ensuring that the writes have got out of the disk's internal firmware cache. Expensive disks might have internal battery backup to ensure that things can be cleaned up even if a power outage happens at exactly the wrong moment, but most people don't have that.

    Networked filesystems are a whole 'nother level of pain.
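    For what it's worth, the usual shape of it on a local POSIX filesystem is roughly this (a sketch from memory only; don't trust it without checking your platform, and Windows is a different story):

    var fs = require('fs');
    var path = require('path');

    // sketch of an "atomic" single-file save on a local POSIX filesystem
    function atomicWriteSync(filePath, contents) {
    	var tmpPath = filePath + '.tmp';

    	// 1. write the new contents to a temp file and fsync it (as far as the OS/disk will honour it)
    	var fd = fs.openSync(tmpPath, 'w');
    	fs.writeSync(fd, contents);
    	fs.fsyncSync(fd);
    	fs.closeSync(fd);

    	// 2. atomically swap it in; readers see either the old file or the new one, never half of each
    	fs.renameSync(tmpPath, filePath);

    	// 3. fsync the directory so the rename itself survives a power cut
    	var dirFd = fs.openSync(path.dirname(filePath), 'r');
    	fs.fsyncSync(dirFd);
    	fs.closeSync(dirFd);
    }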



  • If you have a micro storage requirement for a personal app, sure, but as soon as you introduce concurrency or huge data, nope the fuck out.


  • Discourse touched me in a no-no place

    @Matches said:

    If you have a micro storage requirement for a personal app, sure, but as soon as you introduce concurrency or huge data, nope the fuck out.

    If you need lots of data (more than a few gigabytes) or you have plenty of concurrent load, use a database server. But installing a full PostgreSQL server to hold your app's settings is far into overkill territory…



  • There are LOTS of failure modes for disk writes that you generally never see: power failure, storage filling up, another process messing with your file, etc. RDBMSes create their own files, and many grow by a set size (pre-allocating space), which gives you an early alert that something is about to happen. They write log files that record the expected commit rows and details, then compare once it's done to make sure the commit was successful. The phrase "atomic commit" refers to full success, or failure with errors. You will never get partially committed data unless you've forced something very bad through configuration options AND forced an error state (think turning off the transaction log to save space, then hitting the power button during a commit).
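    In SQLite terms those knobs are pragmas. A sketch (assuming the node sqlite3 module and a made-up path):

    var sqlite3 = require('sqlite3');
    var db = new sqlite3.Database('/tmp/state.db');

    db.serialize(function () {
    	// write-ahead log: commits are appended to a separate log, the main file catches up later
    	db.run('PRAGMA journal_mode = WAL');
    	// fsync on every commit: the "carved in stone" guarantee
    	db.run('PRAGMA synchronous = FULL');
    	// the "very bad configuration option" mentioned above would be something like:
    	// db.run('PRAGMA journal_mode = OFF'); // no journal, no atomicity on crash
    });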



  • I've used this in the past:



  • @ben_lubar said:

    I've used this in the past:

    This actually looks exactly like what I need. I'll do some googling for a similar node.js lib.



  • LevelDB is key/value, like Redis, but is on-disk by definition? Then again, I know nothing about proper storage choices, so no warranty is provided for this suggestion.
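    Something like the level package wraps it for node, I believe. A rough sketch, same no-warranty clause applies:

    var level = require('level');          // LevelDB binding for node (assumed)
    var db = level('/tmp/state-leveldb');  // on-disk store; the path is made up

    // values are plain strings by default, so JSON-encode them yourself
    db.put('task:0001', JSON.stringify({ foo: 'bar' }), function (err) {
    	if (err) return console.error('write failed', err);
    	db.get('task:0001', function (err, value) {
    		if (err) return console.error('read failed', err);
    		console.log(JSON.parse(value));
    	});
    });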



  • In the meantime, I actually did a little test instead of just guessing.

    Basically, I generate about 1000 records in a hash and then write them out in a tight loop. It simulates the CPU pressure I might experience, although I'd need another layer to fully simulate the amount of data I want handled.

    You can start it with an sqlite or file adapter.

    These are the hot paths:

    file.js

    var fs = require('fs');

    // one file per key: serialize the value and overwrite the file named after the id
    exports.set = function (collection, id, data, callback) {
    	data = JSON.stringify(data);
    	fs.writeFile("/tmp/data/tasks_0001/" + id, data, function (err) {
    		//...
    	});
    };
    

    sqlite.js

    // db: an already-open sqlite database handle (setup not shown);
    // Tasks_0001 is a simple two-column (id, data) table.
    // one row per key: upsert the serialized value under the id
    exports.set = function (collection, id, data, callback) {
    	data = JSON.stringify(data);
    	db.run("INSERT OR REPLACE INTO Tasks_0001 VALUES($1, $2)", [id, data], function (err) {
    		//...
    	});
    };
    

    Code complexity in general is pretty similar.

    Results:

    https://imgur.com/tQ5VnrC

    Initially, sqlite takes about double the RAM (60MB vs 30MB) and CPU.

    After 10-20 minutes, the resident memory of the file implementation jumps up to a similar level.

    https://imgur.com/bKt192Z

    Not sure what conclusions, if any, I can draw from this. It seems the sqlite library itself takes about 5-10 MB of RAM; everything else is node and linux messing with stuff.

    CPU, on the other hand, seems to be on the file adapter's side. Although this could be a case of the file adapter offloading some of the work to linux (eg. file lookup, buffering etc.), so it doesn't show up here. Sqlite needs to do all that stuff on its own.


  • Java Dev

    The higher CPU usage is only really important if you're going to be CPU bound. How many operations per second does each mechanism accomplish? Do you expect other disk accesses (EG config reads) from the same process, and how does it perform if those are going on as well?



  • How much data are you storing? How frequently are you storing it?



  • ... wait what? What are you testing for? Are you RAM or CPU-bound, is this an issue you might face?



  • @NTAuthority said:

    LevelDB is key/value, like Redis, but is on-disk by definition? Then again, I know nothing about proper storage choices, so no warranty is provided for this suggestion.

    Not very impressed by this; it seems to have a single key space, no bulk get/set, etc. Ben's thing fits better.

    @blakeyrat said:

    How much data are you storing? How frequently are you storing it?

    Right now, I'm closest to just saving the JSON every time and forgetting about keys, but I'm afraid that could turn out to be too expensive (up to something like 1000 saves per second, each a hash with like 1000 objects).

    Hash with ~20 keys, 2 nested hashes with ~10 keys each. 1KB when saved to HDD. Nothing special.

    EDIT:
    Frequency of change: we are talking 1000 / second "key updates" across different collections (so I wouldn't be hitting the same file 1000 times per sec, and no debouncing is possible). I'd need a full collection dump much more rarely than that. That's why some kind of "per key" solution appeals to me.

    @PleegWat said:

    The higher CPU usage is only really important if you're going to be CPU bound.

    @blakeyrat said:

    ... wait what? What are you testing for? Are you RAM or CPU-bound, is this an issue you might face?

    The service itself is pressured from both ends. High RAM usage is the main reason I'm getting rid of redis. But also, I'll probably be hovering around 100% cpu, so every little reduction counts.

    RAM-bound vs CPU-bound isn't the right terminology here.

    @PleegWat said:

    Do you expect other disk accesses (EG config reads) from the same process, and how does it perform if those are going on as well?

    Yes. Other "threads" will be doing their own writes, but with lesser frequency. A loop like this will be the main pressure point.

    Can't test all that without implementing it for real.

    @PleegWat said:

    How many operations per second does each mechanism accomplish?

    It's very close, but the file adapter is slightly faster. After about 40 minutes, the file version has done 240,000 updates vs 226,000 for sqlite. Probably due to the lower CPU usage.



  • If you really are that pressured, have you considered a message queue like rabbitmq (or whatever the node equivalent is) that can redirect work to additional worker machines?

    It would provide a way to offload work and ease the access and storage requirements by reducing load across the board; see the sketch below.

    (Where you end up processing more by handling less per machine)
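    Sending work to the queue is only a few lines, e.g. with amqplib (sketch only; the URL and queue name are made up):

    var amqp = require('amqplib/callback_api');

    // push a job onto a work queue for another worker machine to pick up
    amqp.connect('amqp://localhost', function (err, conn) {
    	if (err) throw err;
    	conn.createChannel(function (err, ch) {
    		if (err) throw err;
    		ch.assertQueue('work');
    		ch.sendToQueue('work', Buffer.from(JSON.stringify({ taskId: '0001' })));
    	});
    });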


  • BINNED

    Are you sure it wouldn't be a good idea to move the heavy processing stuff to something more performant, then? I'm not saying port the whole project, but can the part of the service that crunches all that data be ported to something like C++ that would, hopefully, be more performant if done right? Or is it too much code / too hard to decouple?



  • Porting to a different language is rarely a good idea; often you would completely screw your debugging abilities and confuse your coworkers.


  • BINNED

    True, and I wouldn't even suggest porting the whole thing. But it seems there's a bit of it that does some heavy crunching. Now, if it's a tiny bit that can be implemented in not much code (like, all it does is parse some stuff and convert it to a different format for the rest of the service to consume), it might be an option. If you can't decouple it from the rest easily then yeah, you shouldn't mess with it.



  • I'm already distributed. We are talking here about optimizing a single worker machine, so we'd need to pay for fewer of them.



  • Not really applicable in my case, for multiple reasons.



  • If your single worker machine is pegged at 100% CPU and 100% memory, you are doing too much on the machine. 1000 updates to saved vars and millions of reads per second shouldn't peg it like that, so you have a problem:

    • your hardware is insufficient for your needs
    • your logic is inefficient (application)
    • your distribution isn't optimized
    • your core processing is too tightly knitted with storage and retrieval

    These all boil down to:

    • your hardware is insufficient
    • your software is poorly optimized

    I don't say this as a jab, but rather out of a need to classify the issue. Being CPU- and memory-bound while distributed often suggests you're implementing both solutions poorly (your logic isn't optimized for storage/retrieval and for sharding the work), or your hardware is simply overtaxed, so throw more hardware at it.


  • Trolleybus Mechanic

    Just outsource to a third world country. Set up terminals that will receive the GUID stored in a cookie, and the serialized session state. The workers will write it down on a whiteboard.

    When the session state needs to be loaded again, the worker will find the correct whiteboard, and retype the serialized session state into a terminal, which will be returned to your server.

    It'll be faster and less error prone than any solution you'll dream-hack up yourself.


    Filed under: And cheaper, since you won't have to pay for medical benefits



  • @Matches said:

    If your single worker machine is pegged at 100% CPU and 100% memory, you are doing too much on the machine. 1000 updates to saved vars and millions of reads per second shouldn't peg it like that, so you have a problem

    Since I'm doing more or less background processing, I wouldn't agree 100% utilization is the kind of problem it'd be in a high availability situation (eg. a web server).

    Even if it is a problem, it's never an either-or. Yes, my code can be improved. Yes, we can buy more hardware. But my job ultimately is to improve the code as much as possible, so more money can go to me instead of the servers :-)

    So with that in mind, I have to determine what to do with the time I'm given. Sure, rewriting huge swaths or rearchitecting the whole system would be great. But I might get maybe 3-4 days tops to do some refactoring / consolidation before going back to feature churn. So in the time available, this redis refactoring is basically the most bang I can get for my buck.

    @Lorne_Kates said:

    Just outsource to a third world country. Set up terminals that will receive the GUID stored in a cookie, and the serialized session state. The workers will write it down on a whiteboard.

    When the session state needs to be loaded again, the worker will find the correct whiteboard, and retype the serialized session state into a terminal, which will be returned to your server.

    It'll be faster and less error prone than any solution you'll dream-hack up yourself.

    I AM the third world.

    Stupid yanks already think we are doing that.



  • Ok, so I managed to reserve the next 2-3 days for consolidation and refactoring. So it's time to decide.

    As I haven't found anything premade I like, it's between SQLite and my own custom key-value store. I wrote down ups/downs as I see them.

    SQLite

    + already coded, tested, stable
    + single file database (easy to manage, eg. copy etc.)
    + gives more functionality, if ever needed
    - SQL syntax a poor fit (more surrounding code)
    - Slightly less performant
    - Native module, can cause problems with deployment
    

    Custom KVS

    + better fit for what I need
    + performant
    + hackable from outside
    + would like to give it a shot in terms of coding
    - needs to be coded, tested
    - potential pitfalls of designing own dbms (sync?)
    - multiple files / directories, feels dirty
    - constrained in design, can't be anything more than what's needed
    

    It was close. Very close.

    I almost started setting up the git project for my new KVS. But in the end, I think I'll just do the quick SQLite integration.

    Prevailing factor against my KVS:

    - needs to be coded, tested
    

    With 3-4 days (stretching to a full week, with some politicking), I'd have a decent shot. With just 3 tops, I'd have to code in my free time. No.

    Prevailing factor for SQLite:

    + gives more functionality, if ever needed
    

    Once I started thinking about it, I could already see how I can gain further optimizations by dumping some of the stuff I keep in RAM onto the disk. And once we're in that territory, SQLite's disadvantages in terms of syntax and higher complexity become advantages.

    So, thanks @dkf

    As always in the Help section, if things go wrong, it's all your fault.



  • @cartman82 said:

    But in the end, I think I'll just do the quick SQLite integration.

    Wise decision methinks... 👏



  • You've made the right decision. You should remove this from the Custom KVS list, too, as it would almost certainly not end up being true:
    + performant



  • @tufty said:

    You've made the right decision. You should remove this from the Custom KVS list, too, as it would almost certainly not end up being true:
    + performant

    Read further up, tests show it would.


  • FoxDev

    @cartman82 said:

    Read further up, tests show it would.

    in its current configuration, yes.

    can you imagine it staying so as you add functionality to it in the future?

    .... actually i can, so long as it's still just serializing a standard data object to JSON.... the JSON library in node is pretty optimized.

    okay... so scratch that argument.

