How to normalize your database. Or was it denormalize?...
-
Continuing the discussion from lost in Discourse, like tears in rain:
There appear to be multiple ways that replies are tracked. There's a "reply_to_post_number" (or similar) on the posts table. There's also a separate post_replies table that's many-to-many between posts.
It's the difference between (IIRC):
- This post is a direct reply to
post_number
in the current thread - This post contains a quote from
post_id
, which may or may not be in the current thread, but ignore if #1 covers this particular quote.
What happens when a post starts off as #1, but you go wandering else-topic and decide to deposit your post elsewhere I've not investigated, but I suspect any insert into #1 would be mutated into an insert into #2.
Test quotes:
WhyTF is this still building?
From elsewhere
@boomzilla said:Kill again. Kill again. Kill again. KILL AGAIN.
##Glossary
(posts
.)id
orpost_id
- the absolute, system wide uniqueid
of an individual post. Once created, it never changes
posts
.post_number
- the relative number of the post within its current thread when it was created/moved. This may change over time.
That (the linked post) was
id
445043 (post_number 49144
): http://what.thedailywtf.com/p/445043
In reply toid
445005 (post_number
49132)postgres@what:~$ psql -d discourse -c "select id, post_number, reply_to_post_number from posts where id in (445005, 445043)" id | post_number | reply_to_post_number --------+-------------+---------------------- 445005 | 49132 | 48733 445043 | 49144 | 49132 (2 rows) postgres@what:~$
First row, @boomzilla (cols 1,2) replying to someone (3), second row, my post (1,2) replying to @boomzilla (3)
postgres@what:~$ psql -d discourse -c "select post_id, reply_id from post_replies where post_id in (445005, 445043) or reply_id in (445005, 445043)" post_id | reply_id ---------+---------- 444441 | 445005 445041 | 445043 445005 | 445043 (3 rows) postgres@what:~$
First row @boomzilla (right) quoting @hardwaregeek (left)
Second row, my (right) quoting @Onyx
Thrid row, my (right) quoting @boomzilla's other post (left)The quote from the first table, doesn't appear in the second.
Now I'm all for normalization to reduce duplication, or denormalization if it helps things in practice, but I'm not convinced storing the same type of information (pointers to replies) in only one of two different tables and in different formats, and which table that data goes in is dictated by the method of creation of that data.
Can someone convince me why this is a reasonable schema, and why I'm wrong in starting this topic it in this category?
It seems to me like some hybrid love-child between normalization (have all reply data in one table,
post_replies
say) and denormalization (keepposts
.reply_to_post_number
but duplicate data inpost_replies
)
- This post is a direct reply to
-
Can someone convince me why this is a reasonable schema
and why I'm wrong in starting this topic it in this category?
No - it seems quite daft.
-
This is also one of the causes of "moved" notifications making no damned sense.
It seems that all notifications get generated in the same way, pointing to
/t/topic_id/slug/post_number
.This means that when a (set of) post(s) gets jeffed, you get a notification pointing to the post that got jeffed - in the original topic, using its original
post_number
. Resulting in the notification taking you nowhere near the yellow jeffing post if more than one post gets jeffed (it's off by one in the best case, I think?), after all thepost_number
s get adjusted to accommodate for the "hole" made by jeffing stuff out.
-
But why fix a genuine bug when you can paint a random UI element blue instead?
-
But why fix a genuine bug when you can
paintremove a random UI elementblue instead?Or perhaps
But why fix a genuine bug when you can paint a random UI element
bluegrey instead?
Relevant: I'm not going to bother to see how many rows were inserted into either table on this. But I have my suspicions.
-
@RaceProUK said:
But why fix a genuine bug when you can paint a random UI element
bluegrey instead?
Just wait until they discover blue-grey…
-
Just wait until they discover blue-grey…
-
-
Can someone convince me why this is a reasonable schema, and why I'm wrong in starting this topic it in this category?
So let me get this straight: If we have a thread with 50,000 posts, and someone deletes relative post #2: Does that mean that the entire posts table is renumbered for that thread to handle the relative numbering change? Because I can hardly wait to list all the 's that would likely incur.
-
Yep; that's why Jeffing causes site outages
-
-
-
If we have a thread with 50,000 posts, and someone deletes relative post #2: Does that mean that the entire posts table is renumbered for that thread to handle the relative numbering change?
I was under the impression that it left holes in the OT....
If it doesn't then it's more of a than I thought; I thought the recent slowness/outage when moving http://what.thedailywtf.com/t/semi-deleted-posts/49234 out of the cupcakes thread was coincidental...
-
Well, I don't think I need to list the 's they should be obvious to anyone !( ).
-
-
I flagged for jeffing.
-
-
I flagged for jeffing.
You what‽ How dare you use forum features for their intended purpose!;)
-
I know it was offending my OCD-esque sense of thread tidiness
-
Thing is the posts I moved were all within the last 20-or-so posts.
Unless it's an O(ntotal number of posts in thread) or some-such operation.
Off to confirm the renumbering first...
-
Off to confirm the renumbering first...
Not confirmed. Old
post_number
6 remainspost_number
6 after replacing one (with the notice) and moving one in the same move. If renumbering happens (unless it's sideqik-queued) I'd expect it to bepost_number
5.See if it stays like that after 10 mins or so.
-
Wait... how then... does the URL jump when crossing over the jeffing hole? I don't think I ever saw it do that.
Maybe I should try scrolling through the Status thread? Lots of jeffing in there.
-
What's demormalization?
(I read it as demoralize first - that seemed to apply to the schema)
-
A portmanteau of demoralise and normalise?
-
Demoronise?
No, wait, it's the opposite...
-
-
-
I've never been eaten by a bley
-
Those only appear in the light. You'd need to leave the basement first.
-
What if it's light but I'm plating the Hitchiker's Guide game and haven't worked out that I need to open my eyes yet?
-
-
He @accalia'd 'playing'
-
The @accalias are catching
-
Something something inflating your post count
-
i'm apparently infectious.
y'all might want to get yourselves tested.... :embarassed:
-
-
-
I specifically dementioned it with an ­, then made that unnecessary by adding the S
-
why? i don't mind the mentions. i find them fun!
-
-
-
-
This is also one of the causes of "moved" notifications making no damned sense.
Still no discodev acknowledgment:
https://meta.discourse.org/t/bug-on-notification-link-for-moved-topic/16707/7?u=boomzilla
-
I was under the impression that it left holes in the OT....
Yeah, and the post number you see on hover doesn't match the newlevator.
-
-
grue?
A monster with a tendency to eat UI elements at random. Found in most of the dark corners of Discourse (esp. the bike shed).
-
Does the URL match though?
-
Would that count for a cite in the Discopaedia?
-
Doesn't it? Go look at...almost any big topic. Status definitely.
-
Oh, right, don't actually have to look for a "hole". URL matches the post, not the newlevator.
Oh, shit...
Could it be... it can't be.
Are they... Is that the reason to pull the full topic JSON on every batch load? So they can sync the fucking newlevator properly?
Fuck.