What the Daily WTF?

kipthegreat

I'm working on a problem at work. I'm presenting the data here as if it were a tag cloud that I were implementing, because fundamentally it is the same many-to-many type of data model (although what I'm actually working with has nothing to do with blogs).

So I have three tables like this (greatly simplified):

Post

id
1
2
3

Link

p_id  t_id
1     1
1     2
2     1
2     2
3     1
3     3

Tag

id
1
2
3

What I would like to do is write a query to give me all posts which are tagged with all of a given set of tags. For instance, if I wanted to get all posts tagged by both tags 1 and 2, the query should give me posts 1 and 2 (but not post 3 since it doesn't have tag 2). Right now, I am doing it like this:

select Post.id from Post where
exists(select Post.id from Link where Post.id = Link.p_id and Link.t_id = 1) and
exists(select Post.id from Link where Post.id = Link.p_id and Link.t_id = 2);

I don't think this is the best way to do this, as it involves n subqueries for n tags. Once in production, I might have millions of posts and thousands of tags, and it wouldn't be uncommon to need to run this query on 100 tags or so. Right now, that would require 100 subqueries which I don't think will scale very well (I haven't populated large amounts of data to test yet).
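To make the scaling concern concrete, the sample data and the n-subquery form can be reproduced in a throwaway in-memory SQLite database. This is only a sketch under assumptions: SQLite stands in for the real database, the helper names make_db, posts_with_all_tags_exists and posts_with_all_tags_count are made up for illustration, and the GROUP BY / HAVING variant is one commonly suggested alternative that has not been measured against production-sized data. The counting variant also assumes that each (p_id, t_id) pair appears at most once in Link.

```python
import sqlite3


def make_db():
    """Build the three sample tables from the post in an in-memory SQLite DB."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE Post (id INTEGER PRIMARY KEY);
        CREATE TABLE Tag  (id INTEGER PRIMARY KEY);
        -- The composite key enforces the "one row per pair" assumption.
        CREATE TABLE Link (p_id INTEGER, t_id INTEGER, PRIMARY KEY (p_id, t_id));
        INSERT INTO Post VALUES (1), (2), (3);
        INSERT INTO Tag  VALUES (1), (2), (3);
        INSERT INTO Link VALUES (1,1), (1,2), (2,1), (2,2), (3,1), (3,3);
    """)
    return db


def posts_with_all_tags_exists(db, tag_ids):
    """The approach from the post: one EXISTS subquery per requested tag."""
    tag_ids = list(tag_ids)
    if not tag_ids:
        raise ValueError("need at least one tag id")
    clause = ("EXISTS (SELECT 1 FROM Link "
              "WHERE Link.p_id = Post.id AND Link.t_id = ?)")
    where = " AND ".join([clause] * len(tag_ids))
    sql = f"SELECT Post.id FROM Post WHERE {where} ORDER BY Post.id"
    return [row[0] for row in db.execute(sql, tag_ids)]


def posts_with_all_tags_count(db, tag_ids):
    """Alternative sketch: keep posts whose matching Link rows cover every tag."""
    tag_ids = list(set(tag_ids))  # duplicates would inflate the expected count
    if not tag_ids:
        raise ValueError("need at least one tag id")
    placeholders = ", ".join("?" * len(tag_ids))
    sql = (f"SELECT p_id FROM Link WHERE t_id IN ({placeholders}) "
           "GROUP BY p_id HAVING COUNT(*) = ? ORDER BY p_id")
    return [row[0] for row in db.execute(sql, tag_ids + [len(tag_ids)])]
```

Both helpers should agree on the sample data (tags 1 and 2 select posts 1 and 2). Which form is faster on millions of rows would still have to be checked on the real database with realistic data.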

If I were trying to run a query for all posts containing any of a given set of tags, it would be much easier:

select distinct Post.id from Post, Link where Post.id = Link.p_id and Link.t_id in (T1, T2, ...);

That query I think would scale well for 100 tags to search on. But like I said, that's not what I'm doing. :)
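For comparison, here is a minimal sketch of the "any of these tags" query with the placeholder list built from the requested tags. It assumes an in-memory SQLite database in place of the real one, and the helper names sample_db and posts_with_any_tag are invented for the example.

```python
import sqlite3


def sample_db():
    """Sample Post/Link data from the post (the Tag table is not needed here)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE Post (id INTEGER PRIMARY KEY);
        CREATE TABLE Link (p_id INTEGER, t_id INTEGER);
        INSERT INTO Post VALUES (1), (2), (3);
        INSERT INTO Link VALUES (1,1), (1,2), (2,1), (2,2), (3,1), (3,3);
    """)
    return db


def posts_with_any_tag(db, tag_ids):
    """Posts carrying at least one of the given tags, via a single IN list."""
    tag_ids = list(tag_ids)
    if not tag_ids:
        return []
    placeholders = ", ".join("?" * len(tag_ids))
    sql = ("SELECT DISTINCT Post.id FROM Post "
           "JOIN Link ON Post.id = Link.p_id "
           f"WHERE Link.t_id IN ({placeholders}) ORDER BY Post.id")
    return [row[0] for row in db.execute(sql, tag_ids)]


if __name__ == "__main__":
    print(posts_with_any_tag(sample_db(), [2, 3]))  # [1, 2, 3]
```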

Any non-WTF solutions that could be suggested would be appreciated.

kipthegreat

That's the one, thanks!

I was combining "development" and "destruction", but it's actually development/test/production. Google can't read your mind all the time I guess..

kipthegreat

I'm trying to find the original post where Alex coined a term something like "developmentstruction" environment. Googling that term doesn't turn up anything, so I think I may have it a little wrong. It's when you are editing the server-side scripts of a web app on the live, public-facing server (rather than using a test server).

Any help?

kipthegreat

I think the statement about piracy was intended to be humorous. :)

kipthegreat

An interesting tale of WTF code deliberately put in software by the CIA that they knew KGB spies were trying to steal.

Read here.

kipthegreat

I think Randall Munroe, author of the webcomic xkcd.com, must be familiar with this site. Or maybe he's just seen enough WTF code in his lifetime.

Random Number generator - xkcd.com

http://xkcd.com/c221.html

kipthegreat

That's a good idea.. I really like the textonly.site.com/page solution. It would be really easy to parse the current URL to determine which page to show. Now I've just gotta learn how to set up my server to do that...
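As a rough sketch of what "parse the current URL" could look like, the function below decides between the text-only and rich versions from the hostname alone. Everything here is an assumption for illustration: Python stands in for the site's PHP, the function name resolve_page is made up, and the only hostname taken from the discussion is textonly.site.com.

```python
from urllib.parse import urlsplit

# Prefix taken from the textonly.site.com example; adjust to the real host.
TEXT_ONLY_HOST_PREFIX = "textonly."


def resolve_page(url):
    """Return (text_only, path) for a full request URL."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    text_only = host.startswith(TEXT_ONLY_HOST_PREFIX)
    # An empty path means the site root.
    return text_only, parts.path or "/"
```

Because the choice lives in the hostname, no query string or cookie is involved in picking the version, so fetching a page has no side effect.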

kipthegreat

I'm still not 100% clear on idempotency. I have a site that is kind of heavy with graphics, and I provide a text-only version of the site, that needs to work correctly on Pocket IE (and also might be preferred for dial-up visitors). At the bottom of each page is a link that will take you to the text-only or rich version of the site (whichever one you aren't viewing). The "view text-only" link uses a query string with something like "?textonly=T" or "?textonly=F".

So here's the idempotency question: say a user goes to "page.php?textonly=T". When the page is loaded, a "textonly" cookie is set, so that the whole site will be viewed that way (the user doesn't have to pass the "?textonly" query to each page on the site). Does this violate idempotency? Will any cookies be set by a prefetch? If cookies are not set/used, there is no side-effect. But if the browser prefetches the textonly page and sets the cookie, the next page the user goes to will show up text only. But I can't imagine the browser setting a cookie on a prefetch, and no spiders use cookies, so I don't think this violates idempotency.... Am I right?

kipthegreat

To clarify, I am fully aware of the fact that this is "security through obscurity". This is for a small hobby website, and the data isn't private, sensitive, or confidential. If someone sees a page that is not meant for them, it is not a big deal. I'd just rather it not happen.

But the reason I'm doing this is beside the point... I'd just like to know if there is any way to make the database assign a unique random value to the primary key when creating a record, or if I will have to write code to generate the id and make sure that it isn't in use. I would rather not reinvent the wheel if there is already a better method built into the database.

kipthegreat

Well my last post with a database question went well, I'll try another.

Is there a way to automatically generate a unique and random primary key, in the database? I know all about auto increment, but that leaves the problem that a user can see "getinfo.php?id=122" and guess that "getinfo.php?id=121" will work also. I want the id's to be random integers (32 bit int will be more than sufficient, database will have at most about 1000 rows), so that the user can't just guess another number. But the data I'm trying to conceal isn't confidential or sensitive in any way, so requiring users to create a profile and log in is overkill.

I've been Googling the subject and there doesn't seem to be a way to do this in the database, which seems odd (I mean, it seems like it would be a common problem).

I was thinking about just taking the last 8 hex digits of the hex string returned by the PHP code "md5(microtime() . mt_rand());" and converting them to an int. There is a very small chance of generating the same value twice (theoretically, 1 in 2^32 for any given pair of values), but I can handle that programmatically, as there is only going to be one user creating rows (i.e. no concern for transactions).
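The idea above can be sketched roughly as follows. This is an illustration under assumptions, not a tested solution: Python and an in-memory SQLite table stand in for PHP and MySQL, the table name info and its payload column are hypothetical, and the function names are made up. Collisions are left to the primary-key constraint, and the insert is simply retried with a fresh value.

```python
import hashlib
import random
import sqlite3
import time


def candidate_id():
    """Last 8 hex digits of an MD5 digest, read as an unsigned 32-bit integer."""
    seed = f"{time.time()}{random.getrandbits(32)}".encode()
    return int(hashlib.md5(seed).hexdigest()[-8:], 16)


def insert_with_random_id(db, payload, make_id=candidate_id, max_tries=10):
    """Insert a row under a random primary key, retrying if the key is taken."""
    for _ in range(max_tries):
        new_id = make_id()
        try:
            with db:  # commits on success, rolls back on error
                db.execute(
                    "INSERT INTO info (id, payload) VALUES (?, ?)",
                    (new_id, payload),
                )
            return new_id
        except sqlite3.IntegrityError:
            continue  # id already in use; draw another one
    raise RuntimeError("could not find an unused id")


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE info (id INTEGER PRIMARY KEY, payload TEXT)")
    print(insert_with_random_id(db, "first row"))
```

One detail to watch when carrying this over to MySQL: eight hex digits can produce values up to 2^32 - 1, which fits an INT UNSIGNED column but not a signed INT.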

So... I have a decent programmatic solution, but I'd like to use the built-in database solution if possible. Anyone know if this is possible? Oh, and I'm using everyone's favorite database, MySQL. :)
