Mongofail
-
I know some people here deride MongoDB for claiming to be a database.
I'm astonished at the reckless attitude shown by MongoDB. My first thought was "aren't they using persistent data structures for the indexes?" That way they could grab the index when the query starts and it would be guaranteed to remain unchanged throughout the query. Then I realized I know shit-all about database engineering, just like the MongoDB team.
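To make the "grab the index when the query starts" idea concrete, here is a rough sketch. It is a toy model, not how MongoDB or any real engine stores indexes: `SnapshotIndex` and its methods are made-up names, and it copies the whole index on every write, where a true persistent data structure would share most of its nodes between versions.

```python
import bisect

class SnapshotIndex:
    """Toy copy-on-write index: every write publishes a new immutable tuple."""

    def __init__(self, entries):
        self._entries = tuple(sorted(entries))

    def snapshot(self):
        # A reader keeps this tuple for the whole query; it can never change.
        return self._entries

    def move(self, old, new):
        # A writer builds a fresh version instead of mutating the old one.
        entries = list(self._entries)
        entries.remove(old)
        bisect.insort(entries, new)
        self._entries = tuple(entries)

index = SnapshotIndex([(2, "b"), (3, "c"), (4, "d"), (5, "e"), (6, "f")])

snap = index.snapshot()
seen = []
for i, (key, doc_id) in enumerate(snap):
    seen.append(doc_id)
    if i == 1:
        # Concurrent write in the middle of the scan: "e" moves from key 5 to key 1.
        index.move((5, "e"), (1, "e"))

print(seen)                 # ['b', 'c', 'd', 'e', 'f']
print(index.snapshot()[0])  # (1, 'e')
```

The scan still sees every document because it never looks at the new version of the index; the cost is that the reader may return slightly stale keys.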
-
@gleemonk MongoDB is a disaster that shouldn't ever be used by anyone. So disregard anything these morons do.
On a side note, in pt_BR the word "mongo" is slang for something like "retard".
-
@fbmac I wouldn't care if I didn't have a few hundred lines of code I'm suddenly unsure about. Just when I thought I'd adapted to the Mongo ways of doing things, they spring that on me. But yeah, the mongo way says you just keep an aggregate field.
Funny to see how "mongoloid" was shortened to a slur in different languages.
-
a few hundred lines of code I'm suddenly unsure about
Take this chance and switch to Postgres, I heard it even has a JSON mode if you don't want to change your data structures.
-
@gleemonk Mongo is explicitly an "eventual consistency" database; this is expected behavior.
-
@blakeyrat said in Mongofail:
@gleemonk Mongo is explicitly an "eventual consistency" database; this is expected behavior.
I feel like that's somehow even worse.
"It's fine that it's broken. It's broken by design."
-
@error It's not designed to be used for what they're using it for. It's designed to be used for stuff like Twitter, where if a tweet is a couple minutes "late", nobody cares.
-
@blakeyrat If I understood correctly, it's not like you get a stale version of the object. That's kind of the expected behavior in ACID databases too. The problem is, you get NO record, as if there is no record at all matching the query. Even if the old version should satisfy it.
That's a WTF.
EDIT: Read the entire article. I was correct, well, more or less. The problem is, mongo doesn't isolate its queries, which wreaks havoc in certain types of ordered index searches.
The problem isn't so much mongo's implementation (which is kind of naive), as this whole idea that we will support SOME kind of indexes and atomicity, but do a half-assed job at it.
-
The article explains it pretty well. Queries are guaranteed to not grab multiple versions of a document. They have no guarantees about the database itself. Which means that unless you force it to do a full table scan every time, an index can change while the index scan is running. If you need full-table or full-database consistency, you shouldn't be using a document-based database.
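A toy simulation of that race, for anyone who wants to see it happen. This is only a sketch under assumptions: the "index" is a sorted Python list of `(key, doc_id)` pairs, the scan resumes after the last entry it returned, and `scan_with_concurrent_update` and `move_doc` are invented names. It is not MongoDB's actual cursor code.

```python
import bisect

def scan_with_concurrent_update(index, update_at, update):
    """Walk a sorted index of (key, doc_id) pairs, resuming after the last
    entry returned, and apply `update` once `update_at` entries were read."""
    seen = []
    last = None  # last (key, doc_id) entry returned by the scan
    steps = 0
    while True:
        # Resume just past the last entry we returned, like a cursor would.
        pos = 0 if last is None else bisect.bisect_right(index, last)
        if pos >= len(index):
            return seen
        last = index[pos]
        seen.append(last[1])
        steps += 1
        if steps == update_at:
            update(index)

def move_doc(index):
    # Hypothetical write: doc "e" changes its indexed value from 5 to 1,
    # so its index entry jumps from ahead of the cursor to behind it.
    index.remove((5, "e"))
    bisect.insort(index, (1, "e"))

index = [(2, "b"), (3, "c"), (4, "d"), (5, "e"), (6, "f")]
result = scan_with_concurrent_update(index, update_at=2, update=move_doc)

print(result)  # ['b', 'c', 'd', 'f']
```

Document "e" matched a query like "key >= 1" both before and after the write, yet the scan never returns it, because the entry moved behind the cursor while the scan was running.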
-
@ben_lubar said in Mongofail:
They have no guarantees about the database itself.
function query( ...args ) {
    return []; // no results, ever. technically compliant?
}
Filed under: O(1)
-
@error The difference is that MongoDB works in cases where it's not under heavy load, while your code never has heavy load but also doesn't work.
-
@ben_lubar so for some cases our code does the same thing?
-
@ben_lubar so for some cases our code does the same thing?
Yes. For cases where the database query has no results they do the same thing.
-
@ben_lubar Well, we both make no guarantee that there will be results, even if some records could seem to match.
-
@ben_lubar Well, we both make no guarantee that there will be results, even if some records could seem to match.
Let's say you're on page 4 of a book and someone removes page 5 and puts it between pages 2 and 3. What's the next page you'll read?
-
@ben_lubar The web page to file a complaint with the library.
-
@ben_lubar said in Mongofail:
Let's say you're on page 4 of a book and someone removes page 5 and puts it between pages 2 and 3. What's the next page you'll read?
None...
Fuck that book.
-
@ben_lubar said in Mongofail:
If you need full-table or full-database consistency, you shouldn't be using a document-based database.
Strictly, you shouldn't be using MongoDB in that case. Higher levels of consistency are possible while managing documents, though at a cost of requiring more synchronization, and hence slower writes. Sometimes that's an acceptable trade-off.
-
@error You can't have a database that's fast, scalable, and returns fully accurate results all the time. It's mathematically impossible. You have to sacrifice one thing or another. Different databases sacrifice different things.
The problem here is everyone rushed to use MongoDB because it was new and cool without understanding what it is for.
-
@anonymous234 said in Mongofail:
@error You can't have a database that's fast, scalable, and returns fully accurate results all the time. It's mathematically impossible. You have to sacrifice one thing or another. Different databases sacrifice different things.
The problem here is everyone rushed to use MongoDB because it was new and cool without understanding what it is for.
In what cases would data inaccuracy be an acceptable trade-off?
-
In what cases would data inaccuracy be an acceptable trade-off?
Twitter or Facebook timelines? Missing an entry for a while (until the index rebuilding catches up) isn't the end of the world there.
-
@WPT As they said before, Twitter, where it doesn't matter that much if a new tweet takes a while to appear or if the page says a user has 15,803 tweets but you can only see 15,801. Or any analytics system (think Google Analytics) where you need to aggregate millions of requests (that are coming in real time) into a few numbers.
-
@anonymous234 said in Mongofail:
fast, scalable, and returns fully accurate results all the time.
Mongo DB
@anonymous234 said in Mongofail:
fast, scalable, and returns fully accurate results all the time.
..... i dunno..... mariaDB? maybe?
@anonymous234 said in Mongofail:
fast, scalable, and returns fully accurate results all the time.
Postgres
@anonymous234 said in Mongofail:
fast, scalable, and returns fully accurate results all the time.
MsSQL
@anonymous234 said in Mongofail:
fast, scalable, and returns fully accurate results all the time.
Oracle.
-
@error the eventual consistency isn't mongo's worst fault. mongo used to lose data, and its performance and use of memory stink
-
@accalia
@anonymous234 said in Mongofail:
fast, scalable, and returns fully accurate results all the time.
-
@anonymous234 said in Mongofail:
fast, scalable, and returns fully accurate results all the time.
Mongo DB
FTFY
-
@anonymous234 said in Mongofail:
fast, scalable, and returns fully accurate results all the time
Herpes
-
@anonymous234 said in Mongofail:
fast, scalable, and returns fully accurate results all the time.
Oracle.
Oracle is incredible at scaling.
As in, you won't believe how much the license cost goes up.
-
@Dragnslcr said in Mongofail:
@anonymous234 said in Mongofail:
fast, scalable, and returns fully accurate results all the time.
Oracle.
Oracle is incredible at scaling.
As in, you won't believe how much the license cost goes up.
Typically, when a cost function scales exponentially, it's considered to be very poor at scaling.
Filed under: Or at least you'll be very poor., I had a problem, so I used Oracle. Now I have ~~two problems~~ no money.
-
@Adynathos said in Mongofail:
Take this chance and switch to Postgres, I heard it even has a JSON mode if you don't want to change your data structures.
That sounds like a nice option but the framework we're using (Meteor) does not support legacy interfaces such as SQL, it was done full-mongo. Actually they keep talking about adding SQL because there is a lot of pressure, but I'm not sure they're committed to actually implementing it.
-
-
@gleemonk Surely with meteor you can just wire it up to MySQL using a node-mysql module?
-
@ben_lubar the workaround they presented was pretty logical, though. Just add an indexed column for the boolean that you're testing and then query that; if you're changing a document but both the old and the new version should satisfy the condition, the boolean won't change so your query is guaranteed not to skip it.
edit: I know "column" is probably the wrong terminology for MongoDB, but ¯\(°_o)/¯
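A toy sketch of why that workaround holds up, under the same kind of assumptions as any simplified model: the index is a sorted Python list keyed on a derived boolean, and `flag`, `update_doc` and `scan` are invented names, not MongoDB API.

```python
import bisect

docs = {"b": 2, "c": 3, "d": 4, "e": 5, "f": 6}

def flag(value):
    # Hypothetical derived field: True when the document satisfies the query.
    return value >= 1

# Index on (flag, doc_id) instead of on the raw value.
index = sorted((flag(v), doc_id) for doc_id, v in docs.items())

def update_doc(doc_id, new_value):
    old_entry = (flag(docs[doc_id]), doc_id)
    docs[doc_id] = new_value
    new_entry = (flag(new_value), doc_id)
    if new_entry != old_entry:
        # The index entry only moves when the boolean itself flips.
        index.remove(old_entry)
        bisect.insort(index, new_entry)

def scan(update_at):
    seen, last, steps = [], None, 0
    while True:
        # Resume just past the last entry returned, like a cursor would.
        pos = 0 if last is None else bisect.bisect_right(index, last)
        if pos >= len(index):
            return seen
        last = index[pos]
        if last[0]:
            seen.append(last[1])
        steps += 1
        if steps == update_at:
            # Concurrent write mid-scan: the value changes, the flag does not.
            update_doc("e", 1)

result = scan(update_at=2)
print(result)  # ['b', 'c', 'd', 'e', 'f']
```

The value of "e" changes from 5 to 1 mid-scan, but its entry in the boolean index stays put, so the scan cannot skip it. If the write did flip the boolean, the entry would move and the document could legitimately drop out of the results.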
-
@cartman82 Dude. Look. Mongo is an eventual consistency database. This is not a secret.
This article just sums up to, "we used Mongo instead of a real database despite not knowing how it works or what it was designed to do because we're all hipster idiots and too stupid to learn SQL."
What's another software team who did something similar... hmm! Oh yeah, NodeBB.
Oh NOW the edit button works. I guess I just needed to wait long enough?
-
In what cases would data inaccuracy be an acceptable trade-off?
I ALREADY TOLD YOU applications like Twitter, where it's not a big deal at all if one tweet is a few minutes late getting to all followers.
-
@anonymous234 said in Mongofail:
@error You can't have a database that's fast, scalable, and returns fully accurate results all the time. It's mathematically impossible. You have to sacrifice one thing or another. Different databases sacrifice different things.
The problem here is everyone rushed to use MongoDB because it was new and cool without understanding what it is for.
I see NoSQL databases about the same way I see Git: they have features that are very useful for projects that are much larger-scale than the one you're actually working on, but in order to make them work they have to sacrifice a bunch of stuff your project actually needs, so don't bother.
-
@blakeyrat said in Mongofail:
What's another software team who did something similar... hmm! Oh yeah, NodeBB.
I still don't understand what madness led them to use redis as a backend for CRUD-heavy software like a forum. I mean, mongo, I could kind of understand, the marketing winds are strong there. But I don't think even the authors of Redis would think that was a good idea.
-
the framework we're using (Meteor)
I see mongo was just the tip of the iceberg
-
@gleemonk Surely with meteor you can just wire it up to MySQL using a node-mysql module?
We'd lose all the nice plumbing done by Meteor. And I would go for Postgres, given the choice.
@anotherusername said in Mongofail:
@ben_lubar the workaround they presented was pretty logical, though. Just add an indexed column for the boolean that you're testing and then query that; if you're changing a document but both the old and the new version should satisfy the condition, the boolean won't change so your query is guaranteed not to skip it.
I've gotten so used to that, I'm just nodding my head saying, "another day, another field on my documents". Soon they'll start to reproduce on their own. There is no schema, so nobody knows...
-
@fbmac I gather you have nice things to say about Meteor?
-
@gleemonk I tested telescopejs once, which was built on Meteor, and it performed very badly and crashed frequently.
This "update everything for all users in real time" approach is a possible cause.
There are probably ways to make it work better, but there is a serious red flag with Meteor: it looks too good, and very few people are using it. Every time something looked good and had too few people using it, if I cared to test it I ended up discovering the reason.
It's like an empty, cheap restaurant that looks very nice. Don't eat at one of those, you'll get sick.
-
@fbmac I understand the feeling because that's what I've seen over the last few years: people building small clever projects on Meteor but nothing big. On the other hand I talked to a guy using it in production for a site that does see some traffic and he didn't seem worried about scaling. Didn't mention any hiccups either, and they're running their own infrastructure.
Meteor 1.0 was released just last year, they might take a while to attract clientèle (overstretching the restaurant metaphor now).
-
@gleemonk link it to reddit with a catchy title and tell us if the server survived
-
@fbmac I'm not their marketing department, and I'm afraid of the outcome :-)
-
@blakeyrat said in Mongofail:
Dude. Look. Mongo is an eventual consistency database. This is not a secret.
Of the ACID guarantees, MongoDB provides just 'AC'.
This isn't a question of consistency at all. It's not that sometimes tweets will be unavailable for a few minutes or take time to propagate to everyone, it's that there's a race condition that causes some fairly simple queries to not return every result. If you use MongoDB to query data that's not completely at rest, your results are unreliable.
If I wanted to tweak my database schema specifically to avoid indexing problems I would just write my own database engine, because clearly that's what I find interesting.
-
@AyGeePlus said in Mongofail:
take time to propagate to everyone,
That's what eventual consistency is.
@AyGeePlus said in Mongofail:
This isn't a question of consistency at all.
It's their terminology, not mine.
-
@Adynathos said in Mongofail:
Take this chance and switch to Postgres, I heard it even has a JSON mode if you don't want to change your data structures.
That sounds like a nice option but the framework we're using (Meteor) does not support legacy interfaces such as SQL, it was done full-mongo. Actually they keep talking about adding SQL because there is a lot of pressure, but I'm not sure they're committed to actually implementing it.
Here's a bit of evidence pointing to the contrary: http://www.apollostack.com/
They currently have a developer's preview detailing how to integrate this into Meteor in order to become DB agnostic.
-
@AyGeePlus said in Mongofail:
If I wanted to tweak my database schema specifically to avoid indexing problems I would just write my own database engine, because clearly that's what I find interesting.
"If you wish to write a database engine from scratch, you must first create the universe." — Carl Sagan
Filed under: You are almost certain Carl Sagan said that.
-
@anonymous234 said in Mongofail:
Or any analytics system (think Google Analytics) where you need to aggregate millions of requests (that are coming in real time) into a few numbers.
I see you've never heard the stories of consultants who need to explain to customers why the expensive software reports 1538976 pageviews while Google reports 1543245.
-
@PleegWat Stories? Our analytics guy is in the cubicle adjacent to mine. I hear that shit just about every day.