I converted NodeBB to use a PostgreSQL database. AMA.



  • @blakeyrat said in I converted NodeBB to use a PostgreSQL database. AMA.:

    @raceprouk said in I converted NodeBB to use a PostgreSQL database. AMA.:

    You want Postgres to index something that isn't a column? I'm not sure that's possible.

    In SQL Server you can add a computed column and index that.

    @onyx said in I converted NodeBB to use a PostgreSQL database. AMA.:

    Yes, you can create indexes on expressions:

    And of course SQL Server will do that too, although I don't know how complex the expressions can be. EDIT: no, nevermind, it basically just directs you to make an index on a computed column, like I first suggested. Same thing really, just the computed column has a name and will show up in SELECT *. You can put expressions in constraints, but not in indexes.

    You can emulate a computed column in Postgres using triggers, but I'm probably just going to write an upgrade script that splits out the database into separate tables with at least somewhat less crazy schemas.
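
    (As a rough sketch, assuming the objects(data jsonb) table that shows up in the query plans further down and made-up index/function names, the two approaches look something like this:)

    -- Expression index; the jsonb expression needs its own set of parentheses:
    CREATE INDEX idx__objects__key ON objects ((data ->> '_key'));

    -- Emulating a computed column with a trigger, then indexing the real column:
    ALTER TABLE objects ADD COLUMN _key text;

    CREATE FUNCTION objects_set_key() RETURNS trigger AS $$
    BEGIN
        NEW._key := NEW.data ->> '_key';
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER objects_set_key_trigger
        BEFORE INSERT OR UPDATE ON objects
        FOR EACH ROW EXECUTE PROCEDURE objects_set_key();

    CREATE INDEX idx__objects__key_column ON objects (_key);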


  • Notification Spam Recipient

    @ben_lubar said in I converted NodeBB to use a PostgreSQL database. AMA.:

    tables with at least somewhat less crazy schemas.

    We've discussed this before, there is no schema right now, so literally anything with a schema would be less crazy.



  • postgres=# explain analyze verbose select 'post:' || value from sorted_sets where _key = 'posts:pid' order by score desc;
                                                                             QUERY PLAN                                                                         
    ------------------------------------------------------------------------------------------------------------------------------------------------------------
     Sort  (cost=930634.60..930637.73 rows=1253 width=64) (actual time=4944.430..6550.536 rows=1200692 loops=1)
       Output: (('post:'::text || (objects.data ->> 'value'::text))), (((objects.data ->> 'score'::text))::numeric)
       Sort Key: (((objects.data ->> 'score'::text))::numeric) DESC
       Sort Method: external merge  Disk: 38808kB
       ->  Bitmap Heap Scan on public.objects  (cost=37587.90..930570.13 rows=1253 width=64) (actual time=592.502..3089.097 rows=1200692 loops=1)
             Output: ('post:'::text || (objects.data ->> 'value'::text)), ((objects.data ->> 'score'::text))::numeric
             Recheck Cond: ((objects.data ->> '_key'::text) = 'posts:pid'::text)
             Filter: (objects.data ? 'score'::text)
             Heap Blocks: exact=15839
             ->  Bitmap Index Scan on uniq__objects__key__value  (cost=0.00..37587.59 rows=1252670 width=0) (actual time=571.343..571.343 rows=1200692 loops=1)
                   Index Cond: ((objects.data ->> '_key'::text) = 'posts:pid'::text)
     Planning time: 161.706 ms
     Execution time: 7135.741 ms
    (13 rows)
    
    Time: 7383.248 ms
    
    postgres=# explain analyze verbose select 'post:' || value from sorted_sets_materialized where _key = 'posts:pid' order by score desc;
                                                                                                   QUERY PLAN                                                                                               
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
     Index Only Scan Backward using idx__sorted_sets__key__score__value on public.sorted_sets_materialized  (cost=0.56..382219.03 rows=1170064 width=41) (actual time=0.079..1073.163 rows=1200692 loops=1)
       Output: ('post:'::text || value), score
       Index Cond: (sorted_sets_materialized._key = 'posts:pid'::text)
       Heap Fetches: 1200692
     Planning time: 0.151 ms
     Execution time: 1633.143 ms
    (6 rows)
    
    Time: 1633.691 ms
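
    (A rough guess at the shape of the materialized table and its covering index, reconstructed from the plan above; only the names come from the plan, the column types are assumptions:)

    CREATE TABLE sorted_sets_materialized (
        _key  text    NOT NULL,
        value text    NOT NULL,
        score numeric NOT NULL
    );

    -- Matches idx__sorted_sets__key__score__value in the plan: with _key leading
    -- and score next, the ORDER BY score DESC query can be answered by a backward
    -- index-only scan instead of an external merge sort.
    CREATE INDEX idx__sorted_sets__key__score__value
        ON sorted_sets_materialized (_key, score, value);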
    


  • @tsaukpaetra said in I converted NodeBB to use a PostgreSQL database. AMA.:

    @ben_lubar said in I converted NodeBB to use a PostgreSQL database. AMA.:

    @tsaukpaetra Now to fix that!

    How are you going to spay a database so that it performs better??? That seems counter-intuitive...

    Why wouldn't it perform better? It'll have more cycles for databasing if it's not chasing skirt


  • Notification Spam Recipient

    @hungrier said in I converted NodeBB to use a PostgreSQL database. AMA.:

    @tsaukpaetra said in I converted NodeBB to use a PostgreSQL database. AMA.:

    @ben_lubar said in I converted NodeBB to use a PostgreSQL database. AMA.:

    @tsaukpaetra Now to fix that!

    How are you going to spay a database so that it performs better??? That seems counter-intuitive...

    Why wouldn't it perform better? It'll have more cycles for databasing if it's not chasing skirt

    Are you insinuating that the database is gay?


  • ♿ (Parody)

    @onyx said in I converted NodeBB to use a PostgreSQL database. AMA.:

    @ben_lubar said in I converted NodeBB to use a PostgreSQL database. AMA.:

    Does anyone know if there's a way to tell Postgres to index something like rank() over (partition by foo order by bar asc)?

    Yes, you can create indexes on expressions:

    Ah, yes, coming from Oracland I'd call those Function Based Indexes. The view aspect of this was a red herring.


  • area_can

    @hungrier said in I converted NodeBB to use a PostgreSQL database. AMA.:

    chasing skirt

    Good ol' college days


  • BINNED

    @boomzilla I'm pretty sure you can actually set them on actual user-defined functions as well, but I can't find the docs for it for the life of me right now, and I'm not sure what limitations (if any) are set on your functions when doing so, so I didn't want to mention it without being able to back it up.


  • Java Dev

    @tsaukpaetra said in I converted NodeBB to use a PostgreSQL database. AMA.:

    We've discussed this before, there is no schema right now, so literally anything with a schema would be less crazy.

    Even full EAV?


  • Notification Spam Recipient

    @pleegwat said in I converted NodeBB to use a PostgreSQL database. AMA.:

    @tsaukpaetra said in I converted NodeBB to use a PostgreSQL database. AMA.:

    We've discussed this before, there is no schema right now, so literally anything with a schema would be less crazy.

    Even full EAV?

    I said less crazy, not less stupid.



  • @onyx said in I converted NodeBB to use a PostgreSQL database. AMA.:

    @boomzilla I'm pretty sure you can actually set them on actual user-defined functions as well, but I can't find the docs for it for the life of me right now, and I'm not sure what limitations (if any) are set on your functions when doing so, so I didn't want to mention it without being able to back it up.

    I think the function has to be declared IMMUTABLE for it to work.
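
    A minimal sketch of that, with made-up names:

    -- Postgres refuses to index a function that isn't declared IMMUTABLE
    -- ("functions in index expression must be marked IMMUTABLE"):
    CREATE FUNCTION post_key(value text) RETURNS text AS $$
        SELECT 'post:' || value
    $$ LANGUAGE sql IMMUTABLE;

    CREATE INDEX idx__objects__post_key ON objects (post_key(data ->> 'value'));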



  • Representative error message:

    ERROR: invalid input syntax for type date: "I ALREADY TOLD YOU THAT I AM NOT A COMPUTER PERSON, YOU'RE REFUSING TO HELP ME SO I'M GOING TO HANG UP"



  • @pleegwat said in I converted NodeBB to use a PostgreSQL database. AMA.:

    @tsaukpaetra said in I converted NodeBB to use a PostgreSQL database. AMA.:

    We've discussed this before, there is no schema right now, so literally anything with a schema would be less crazy.

    Even full EAV?

    It has legitimate uses for when you do stupid shit like letting your users arbitrarily add data to entities without letting them play with DDL.
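
    For anyone who hasn't run into it, a bare-bones EAV layout is roughly this (names are hypothetical):

    -- One row per (entity, attribute) pair instead of one column per attribute:
    CREATE TABLE entity_attributes (
        entity_id bigint NOT NULL,
        attribute text   NOT NULL,
        value     text,
        PRIMARY KEY (entity_id, attribute)
    );

    -- Users "add a field" by inserting a row; no DDL needed:
    INSERT INTO entity_attributes VALUES (42, 'favourite_colour', 'octarine');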


  • Impossible Mission - B

    @arantor so what you're saying is, the legitimate use for this :wtf: solution is to fix a :wtf: problem you got yourself into?

    Isn't that literally the class of scenario that the term "technical debt" was coined to describe? 🚎



  • @masonwheeler no, I'm saying that there are times when you want to let the end user add arbitrary data to your entities, and having the application run its own DDL to modify its own schema might be bad. But sure, you know best, right?

    It might be stupid from an ivory tower perspective, but sometimes you just can't model everything in advance, and you might want to move that work away from your developers to other people so the developers can get on with building other stuff. But sure, go with your original interpretation, because you do everything so perfectly and never have to deal with users who change their minds, or with changes in requirements, or with needing a solution that meets the bulk of needs and use cases despite having a supposedly WTF design... yeah, thought so.



  • Which one of these runs faster?

    A

    UPDATE "foo"
       SET "field_that_references_bar" = NULL
     WHERE "field_that_references_bar" NOT IN (SELECT "field_referenced_by_foo" FROM "bar");
    

    B

    CREATE TEMPORARY TABLE "temp_1" AS
    	SELECT "field_referenced_by_foo"
    	  FROM "bar"
    	 ORDER BY "field_referenced_by_foo";
    CREATE TEMPORARY TABLE "temp_2" AS
    	SELECT "id", "field_that_references_bar"
    	  FROM "foo"
    	 WHERE "field_that_references_bar" IS NOT NULL
    	 ORDER BY "id";
    ANALYZE "temp_1";
    ANALYZE "temp_2";
    WITH p AS (
    	SELECT t2."id"
    	  FROM "temp_2" t2
    	  LEFT OUTER JOIN "temp_1" t1
    	    ON t1."field_referenced_by_foo" = t2."field_that_references_bar"
    	 WHERE t1."field_referenced_by_foo" IS NULL
    )
    UPDATE "foo"
       SET "field_that_references_bar" = NULL
     WHERE "id" IN (SELECT "id" FROM p);
    DROP TABLE "temp_1", "temp_2";
    
    Answer: B finishes in minutes, while A does not finish within 24 hours.
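
    (A plausible reason, as a hedged guess: NOT IN (SELECT ...) can't be planned as an anti-join because of its NULL semantics, so once the hashed subplan no longer fits in work_mem it degrades to re-running the subquery for every row. A NOT EXISTS form like the sketch below, similar to what gets benchmarked further down the thread, keeps the anti-join available:)

    UPDATE "foo"
       SET "field_that_references_bar" = NULL
     WHERE "field_that_references_bar" IS NOT NULL
       AND NOT EXISTS (
            SELECT 1
              FROM "bar"
             WHERE "bar"."field_referenced_by_foo" = "foo"."field_that_references_bar"
           );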


  • @ben_lubar said in I converted NodeBB to use a PostgreSQL database. AMA.:

    Which one of these runs faster?

    How does this one compare?

    UPDATE "foo"
       SET "field_that_references_bar" = NULL
    FROM "foo" AS "f1"
    LEFT OUTER JOIN "bar" ON "f1"."field_that_references_bar" = "bar"."field_referenced_by_foo"
    WHERE "f1"."id" = "foo"."id"
      AND "bar"."field_referenced_by_foo" IS NULL;
    


  • @djls45 said in I converted NodeBB to use a PostgreSQL database. AMA.:

    @ben_lubar said in I converted NodeBB to use a PostgreSQL database. AMA.:

    Which one of these runs faster?

    How does this one compare?

    UPDATE "foo"
       SET "field_that_references_bar" = NULL
    FROM "foo" AS "f1"
    LEFT OUTER JOIN "bar" ON "f1"."field_that_references_bar" = "bar"."field_referenced_by_foo"
    WHERE "f1"."id" = "foo"."id"
      AND "bar"."field_referenced_by_foo" IS NULL;
    

    Right now I'm re-running the convert-from-a-twenty-nine-million-row-one-column-table-to-tables-with-actual-useful-columns step, and I've added ROLLBACK to the end of the foreign keys part for when I run that.

    $ for i in users categories topics posts; do ./local.sh $i & done; wait
    [1] 3812
    [2] 3813
    [3] 3814
    [4] 3815
    2017/08/21 09:38:46 starting transaction (posts)
    2017/08/21 09:38:46 creating posts table
    2017/08/21 09:38:46 starting transaction (users)
    2017/08/21 09:38:46 starting transaction (topics)
    2017/08/21 09:38:46 creating topics table
    2017/08/21 09:38:46 creating users table
    2017/08/21 09:38:46 starting transaction (categories)
    2017/08/21 09:38:46 creating categories table
    2017/08/21 09:38:46 clustering for improved performance
    2017/08/21 09:38:46 clustering for improved performance
    2017/08/21 09:38:46 clustering for improved performance
    2017/08/21 09:38:47 extracting posts
    2017/08/21 09:38:47 clustering table for improved performance
    2017/08/21 09:38:47 extracting topics
    2017/08/21 09:38:47 extracting users
    2017/08/21 09:38:47 extracting categories
    2017/08/21 09:38:48 setting next category ID
    2017/08/21 09:38:48 creating index on (order)
    2017/08/21 09:38:48 creating index on (parentCid)
    2017/08/21 09:38:48 committing transaction (categories)
    2017/08/21 09:39:34 setting next topic ID
    2017/08/21 09:39:34 creating index on (cid, pinned)
    2017/08/21 09:39:35 creating index on (uid)
    2017/08/21 09:39:35 creating index on (timestamp)
    2017/08/21 09:39:36 creating index on (lastposttime)
    2017/08/21 09:39:37 creating full text index for topics
    2017/08/21 09:39:38 committing transaction (topics)
    

    If anyone has a big NodeBB database at home and would like to follow along, I posted my code here yesterday: https://gist.github.com/BenLubar/8d43f7aac304c42e2e5c6fc728038a51

    Add tx.Rollback() to the end of the first function in step3_foreignkeys.go and adjust the giant queries in the two immediately following functions to test stuff. You'll probably also need to edit local.sh, but if you don't use Docker with a private network and dynamic IPs, you can just put the connection strings in directly.



  • @ben_lubar

    2017/08/21 10:45:08 setting next user ID
    2017/08/21 10:45:09 committing transaction (users)
    


  • @ben_lubar

    2017/08/21 10:49:43 setting next post ID
    2017/08/21 10:49:44 creating index on (tid)
    2017/08/21 10:49:47 creating index on (timestamp)
    2017/08/21 10:49:49 creating index on (uid)
    2017/08/21 10:49:51 creating index on (toPid)
    2017/08/21 10:49:55 creating index on (ip)
    2017/08/21 10:49:56 creating full text index for posts
    2017/08/21 10:52:57 committing transaction (posts)
    

    SCIENCE TIME:

    CREATE TEMPORARY TABLE "temp_1" AS SELECT "pid" FROM "posts" ORDER BY "pid";
    CREATE TEMPORARY TABLE "temp_2" AS SELECT "pid", "toPid" FROM "posts" WHERE "toPid" IS NOT NULL ORDER BY "pid";
    ANALYZE "temp_1";
    ANALYZE "temp_2";
    WITH p AS (
    	SELECT t2."pid"
    	  FROM "temp_2" t2
    	  LEFT OUTER JOIN "temp_1" t1
    	    ON t1."pid" = t2."toPid"
    	 WHERE t1."pid" IS NULL
    )
    UPDATE "posts"
       SET "toPid" = NULL
     WHERE "pid" IN (SELECT "pid" FROM p);
    DROP TABLE "temp_1", "temp_2";
    
                                                                                                                                QUERY PLAN                                                                                                                             
    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
     Update on public.posts  (cost=66139.47..66147.51 rows=1 width=957) (actual time=2883.228..2883.228 rows=0 loops=1)
       CTE p
         ->  Hash Anti Join  (cost=37200.49..66139.02 rows=1 width=8) (actual time=2277.304..2652.917 rows=15 loops=1)
               Output: t2.pid
               Hash Cond: (t2."toPid" = t1.pid)
               ->  Seq Scan on pg_temp_3.temp_2 t2  (cost=0.00..12903.66 rows=837566 width=16) (actual time=0.025..451.031 rows=837566 loops=1)
                     Output: t2.pid, t2."toPid"
               ->  Hash  (cost=17404.55..17404.55 rows=1206555 width=8) (actual time=1361.781..1361.781 rows=1206555 loops=1)
                     Output: t1.pid
                     Buckets: 131072  Batches: 16  Memory Usage: 3978kB
                     ->  Seq Scan on pg_temp_3.temp_1 t1  (cost=0.00..17404.55 rows=1206555 width=8) (actual time=0.024..643.426 rows=1206555 loops=1)
                           Output: t1.pid
       ->  Nested Loop  (cost=0.45..8.49 rows=1 width=957) (actual time=2653.135..2653.403 rows=15 loops=1)
             Output: posts.pid, posts.uid, posts.tid, NULL::bigint, posts."timestamp", posts.ip, posts.content, posts.votes, posts.upvotes, posts.downvotes, posts.replies, posts.deleted, posts."deleterUid", posts.edited, posts.editor, posts.data, posts.ctid, p.*
             ->  HashAggregate  (cost=0.02..0.03 rows=1 width=40) (actual time=2653.065..2653.087 rows=15 loops=1)
                   Output: p.*, p.pid
                   Group Key: p.pid
                   ->  CTE Scan on p  (cost=0.00..0.02 rows=1 width=40) (actual time=2277.345..2653.014 rows=15 loops=1)
                         Output: p.*, p.pid
             ->  Index Scan using posts_pkey on public.posts  (cost=0.43..8.45 rows=1 width=917) (actual time=0.016..0.018 rows=1 loops=15)
                   Output: posts.pid, posts.uid, posts.tid, posts."timestamp", posts.ip, posts.content, posts.votes, posts.upvotes, posts.downvotes, posts.replies, posts.deleted, posts."deleterUid", posts.edited, posts.editor, posts.data, posts.ctid
                   Index Cond: (posts.pid = p.pid)
     Planning time: 0.470 ms
     Execution time: 2883.425 ms
    
    UPDATE "posts"
       SET "toPid" = NULL
      FROM "posts" t1
      LEFT OUTER JOIN "posts" t2
        ON t1."toPid" = t2."pid"
     WHERE t1."pid" = "posts"."pid"
       AND t2."pid" IS NULL
       AND t1."toPid" IS NOT NULL;
    
                                                                                                                                       QUERY PLAN                                                                                                                                   
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
     Update on public.posts  (cost=651169.59..997114.06 rows=470471 width=937) (actual time=6932.368..6932.368 rows=0 loops=1)
       ->  Merge Join  (cost=651169.59..997114.06 rows=470471 width=937) (actual time=5539.097..6929.416 rows=15 loops=1)
             Output: posts.pid, posts.uid, posts.tid, NULL::bigint, posts."timestamp", posts.ip, posts.content, posts.votes, posts.upvotes, posts.downvotes, posts.replies, posts.deleted, posts."deleterUid", posts.edited, posts.editor, posts.data, posts.ctid, t1.ctid, t2.ctid
             Merge Cond: (posts.pid = t1.pid)
             ->  Index Scan using posts_pkey on public.posts  (cost=0.43..332166.06 rows=2218242 width=917) (actual time=0.010..2333.617 rows=1146876 loops=1)
                   Output: posts.pid, posts.uid, posts.tid, posts."timestamp", posts.ip, posts.content, posts.votes, posts.upvotes, posts.downvotes, posts.replies, posts.deleted, posts."deleterUid", posts.edited, posts.editor, posts.data, posts.ctid
             ->  Materialize  (cost=651169.16..653521.51 rows=470471 width=20) (actual time=4040.664..4040.711 rows=15 loops=1)
                   Output: t1.ctid, t1.pid, t2.ctid
                   ->  Sort  (cost=651169.16..652345.33 rows=470471 width=20) (actual time=4040.660..4040.685 rows=15 loops=1)
                         Output: t1.ctid, t1.pid, t2.ctid
                         Sort Key: t1.pid
                         Sort Method: quicksort  Memory: 25kB
                         ->  Merge Anti Join  (cost=5101.45..597192.48 rows=470471 width=20) (actual time=0.023..4040.614 rows=15 loops=1)
                               Output: t1.ctid, t1.pid, t2.ctid
                               Merge Cond: (t1."toPid" = t2.pid)
                               ->  Index Scan using "idx__posts__toPid" on public.posts t1  (cost=0.43..244922.86 rows=1541013 width=22) (actual time=0.015..1712.370 rows=837566 loops=1)
                                     Output: t1.ctid, t1."toPid", t1.pid
                                     Index Cond: (t1."toPid" IS NOT NULL)
                               ->  Index Scan using posts_pkey on public.posts t2  (cost=0.43..332166.06 rows=2218242 width=14) (actual time=0.006..1192.370 rows=1206552 loops=1)
                                     Output: t2.ctid, t2.pid
     Planning time: 0.719 ms
     Execution time: 6932.466 ms
    
    UPDATE "posts" t1
       SET "toPid" = NULL
     WHERE "toPid" IS NOT NULL
       AND NOT EXISTS (
    	SELECT t2."pid"
    	  FROM "posts" t2
    	 WHERE t1."toPid" = t2."pid"
           );
    
                                                                                                          QUERY PLAN                                                                                                       
    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
     Update on public.posts t1  (cost=5101.45..592012.86 rows=470469 width=931) (actual time=84336.476..84336.476 rows=0 loops=1)
       ->  Merge Anti Join  (cost=5101.45..592012.86 rows=470469 width=931) (actual time=0.093..78922.606 rows=15 loops=1)
             Output: t1.pid, t1.uid, t1.tid, NULL::bigint, t1."timestamp", t1.ip, t1.content, t1.votes, t1.upvotes, t1.downvotes, t1.replies, t1.deleted, t1."deleterUid", t1.edited, t1.editor, t1.data, t1.ctid, t2.ctid
             Merge Cond: (t1."toPid" = t2.pid)
             ->  Index Scan using "idx__posts__toPid" on public.posts t1  (cost=0.43..239744.47 rows=1541006 width=925) (actual time=0.029..74118.155 rows=837566 loops=1)
                   Output: t1.pid, t1.uid, t1.tid, t1."timestamp", t1.ip, t1.content, t1.votes, t1.upvotes, t1.downvotes, t1.replies, t1.deleted, t1."deleterUid", t1.edited, t1.editor, t1.data, t1.ctid, t1."toPid"
                   Index Cond: (t1."toPid" IS NOT NULL)
             ->  Index Scan using posts_pkey on public.posts t2  (cost=0.43..332164.92 rows=2218233 width=14) (actual time=0.060..3487.159 rows=1206552 loops=1)
                   Output: t2.ctid, t2.pid
     Planning time: 61.094 ms
     Execution time: 84396.291 ms
    

    Edit: It turns out most of the time was spent updating rows that already had null foreign keys. I've updated the benchmarks to account for that.



  • 2017/08/21 12:59:18 starting transaction (foreignkeys)
    2017/08/21 12:59:18 clearing invalid foreign keys "categories"("parentCid") -> "categories"("cid")
    2017/08/21 12:59:19 0 missing keys set to null.
    2017/08/21 12:59:19 creating foreign key on "categories"("parentCid") -> "categories"("cid")
    2017/08/21 12:59:20 clearing invalid foreign keys "posts"("uid") -> "users"("uid")
    2017/08/21 12:59:28 0 missing keys set to null.
    2017/08/21 12:59:28 creating foreign key on "posts"("uid") -> "users"("uid")
    2017/08/21 12:59:32 deleting invalid foreign keys "posts"("tid") -> "topics"("tid")
    2017/08/21 12:59:33 0 missing keys deleted.
    2017/08/21 12:59:33 creating foreign key on "posts"("tid") -> "topics"("tid")
    2017/08/21 12:59:35 clearing invalid foreign keys "posts"("toPid") -> "posts"("pid")
    2017/08/21 12:59:51 15 missing keys set to null.
    2017/08/21 12:59:51 creating foreign key on "posts"("toPid") -> "posts"("pid")
    2017/08/21 12:59:52 clearing invalid foreign keys "posts"("deleterUid") -> "users"("uid")
    2017/08/21 12:59:54 0 missing keys set to null.
    2017/08/21 12:59:54 creating foreign key on "posts"("deleterUid") -> "users"("uid")
    2017/08/21 12:59:56 clearing invalid foreign keys "posts"("editor") -> "users"("uid")
    2017/08/21 12:59:57 0 missing keys set to null.
    2017/08/21 12:59:57 creating foreign key on "posts"("editor") -> "users"("uid")
    2017/08/21 12:59:58 deleting invalid foreign keys "topics"("cid") -> "categories"("cid")
    2017/08/21 12:59:59 0 missing keys deleted.
    2017/08/21 12:59:59 creating foreign key on "topics"("cid") -> "categories"("cid")
    2017/08/21 12:59:59 clearing invalid foreign keys "topics"("oldCid") -> "categories"("cid")
    2017/08/21 12:59:59 0 missing keys set to null.
    2017/08/21 12:59:59 creating foreign key on "topics"("oldCid") -> "categories"("cid")
    2017/08/21 12:59:59 clearing invalid foreign keys "topics"("uid") -> "users"("uid")
    2017/08/21 12:59:59 0 missing keys set to null.
    2017/08/21 12:59:59 creating foreign key on "topics"("uid") -> "users"("uid")
    2017/08/21 12:59:59 clearing invalid foreign keys "topics"("deleterUid") -> "users"("uid")
    2017/08/21 12:59:59 0 missing keys set to null.
    2017/08/21 12:59:59 creating foreign key on "topics"("deleterUid") -> "users"("uid")
    2017/08/21 12:59:59 clearing invalid foreign keys "topics"("mainPid") -> "posts"("pid")
    2017/08/21 13:00:00 0 missing keys set to null.
    2017/08/21 13:00:00 creating foreign key on "topics"("mainPid") -> "posts"("pid")
    2017/08/21 13:00:00 clearing invalid foreign keys "topics"("teaserPid") -> "posts"("pid")
    2017/08/21 13:00:01 0 missing keys set to null.
    2017/08/21 13:00:01 creating foreign key on "topics"("teaserPid") -> "posts"("pid")
    2017/08/21 13:00:01 committing transaction (foreignkeys)
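
    Roughly what each pair of log lines boils down to in SQL (a sketch using the posts.toPid key as the example, not the actual script from the gist):

    -- "clearing invalid foreign keys":
    UPDATE "posts"
       SET "toPid" = NULL
     WHERE "toPid" IS NOT NULL
       AND NOT EXISTS (SELECT 1 FROM "posts" p WHERE p."pid" = "posts"."toPid");

    -- "creating foreign key":
    ALTER TABLE "posts"
        ADD FOREIGN KEY ("toPid") REFERENCES "posts" ("pid");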
    


  • Ok, fresh start. PostgreSQL is currently working in NodeBB as fake Redis, the same way MongoDB is.

    I'm planning to deprecate NodeBB's entire database API and replace it with one where objects are constructed at startup to represent data types and then all queries go through those objects.

    Here I gooooooo


  • area_can

    @ben_lubar اŮ�ŘĽŘšŮ�اŮ� اŮ�ؚاŮ�Ů�Ů� Ů�ŘŮ�Ů�Ů� اŮ�ŘĽŮ�ساŮ�



  • @ben_lubar said in I converted NodeBB to use a PostgreSQL database. AMA.:

    I'm planning to deprecate NodeBB's entire database API and replace it

    Cool. That sounds like, what, a few days worth of work? :thonking:



  • @anotherusername nah, it doesn't take more than a few seconds to edit some random wiki page to say "Oh by the way this is all deprecated, and it's documented here so you have no excuse not to know that". Of course Ben is better than that.


  • :belt_onion:

    @ben_lubar said in I converted NodeBB to use a PostgreSQL database. AMA.:

    Ok, fresh start. PostgreSQL is currently working in NodeBB as fake Redis, the same way MongoDB is.

    I'm planning to deprecate NodeBB's entire database API and replace it with one where objects are constructed at startup to represent data types and then all queries go through those objects.

    Here I gooooooo

    Is there an actual problem you're trying to solve here or are you just doing it as a proof of concept? Just curious.



  • @laoc said in I converted NodeBB to use a PostgreSQL database. AMA.:

    With XML files stored in it?

    Nope.... EDIF is the way to go!



  • @lb_ said in I converted NodeBB to use a PostgreSQL database. AMA.:

    @anotherusername nah, it doesn't take more than a few seconds to edit some random wiki page to say "Oh by the way this is all deprecated, and it's documented here so you have no excuse not to know that". Of course Ben is better than that.

    I think the "and replace it" part might take just a bit longer than that.



  • @heterodox said in I converted NodeBB to use a PostgreSQL database. AMA.:

    @ben_lubar said in I converted NodeBB to use a PostgreSQL database. AMA.:

    Ok, fresh start. PostgreSQL is currently working in NodeBB as fake Redis, the same way MongoDB is.

    I'm planning to deprecate NodeBB's entire database API and replace it with one where objects are constructed at startup to represent data types and then all queries go through those objects.

    Here I gooooooo

    Is there an actual problem you're trying to solve here or are you just doing it as a proof of concept? Just curious.

    Unless NodeBB's database calls become less tied to Redis structures, there's no way PostgreSQL (that is, PostgreSQL not pretending to be Redis) can be added to NodeBB without duplicating a lot of the code or removing support for the existing database drivers.



  • @anotherusername said in I converted NodeBB to use a PostgreSQL database. AMA.:

    @lb_ said in I converted NodeBB to use a PostgreSQL database. AMA.:

    @anotherusername nah, it doesn't take more than a few seconds to edit some random wiki page to say "Oh by the way this is all deprecated, and it's documented here so you have no excuse not to know that". Of course Ben is better than that.

    I think the "and replace it" part might take just a bit longer than that.

    Yeah, the deprecating part is easy. I've got it set up to show a warning if a deprecated function is called at runtime, but to throw an error if a deprecated function is called in a test case.


  • Discourse touched me in a no-no place

    @lb_ said in I converted NodeBB to use a PostgreSQL database. AMA.:

    it doesn't take more than a few seconds to ~~edit some random wiki page~~ post to twitter to say "Oh by the way this is all deprecated, and it's documented here so you have no excuse not to know that".

    FTFReality. :headdesk:



  • Why not convert it to SSDS? I heard it has a great index, and I know a great VB5 programmer who might lend you a hand.



  • @ben_lubar did PostgreSQL already show improvement over MongoDB?



  • @sockpuppet7 said in I converted NodeBB to use a PostgreSQL database. AMA.:

    @ben_lubar did PostgreSQL already show improvement over MongoDB?

    I don't remember where I posted this, so the numbers might be slightly off, but reading uncached data was about 50% faster with PostgreSQL and reading cached data was about 6% slower with PostgreSQL.



  • @ben_lubar deletes are a lot faster in PostgreSQL, so maybe we can get physical deletes now



  • @ben_lubar said in I converted NodeBB to use a PostgreSQL database. AMA.:

    @sockpuppet7 said in I converted NodeBB to use a PostgreSQL database. AMA.:

    @ben_lubar did PostgreSQL already show improvement over MongoDB?

    I don't remember where I posted this, so the numbers might be slightly off, but reading uncached data was about 50% faster with PostgreSQL and reading cached data was about 6% slower with PostgreSQL.

    Found the actual statistics:

    Edit: For those who prefer non-truncated statistics, the non-iframely'd link is https://gist.github.com/BenLubar/dd816c736a9a1cbd70e3469828b96e5d#file-5-results-md

