Building SQL queries the hard way
-
I'm in the process of updating a website for the second decade of the new millennium. The previous developer did not know about JOIN nor IN, as evidenced by this gem I found within a multi-step chain of manual joining:
$query = "SELECT id FROM pages WHERE " ; foreach ($page_ids as $pid) { $query .= "(page_id = $pid AND published=1) OR " ; } $query = substr($query, 0, strrpos($query, "OR")) ; $query .= " ORDER BY title ASC" ;
You see how he was "efficient" by querying all
$page_ids
in one query? I'm sure he was pretty proud of having thought of that trick. TRWTF? I'm not gonna clean this up. It works.
-
I'd love to see how well that would work on a table of 10+ million rows.
-
@gleemonk Other things he didn't know about: parameterization. Apparently.
-
@blakeyrat you would only need to parameterize that if
$page_ids
is a user-supplied array.
-
@anotherusername "need" to yes, but you should get into the habit of always doing it consistently because it's the right thing to do.
-
@gleemonk ...wait, what?
He has a list of page_ids and he's using them to query
pages.id
... if that doesn't result in getting the same list back, that db schema is TRWTF.
-
@blakeyrat said in Building SQL queries the hard way:
@anotherusername "need" to yes, but you
shouldought to get into the habit of always doing it consistently because it's the right thing to do.FTFY
-
@anotherusername He did say, right in the OP, that
I'm not gonna clean this up. It works.
The issue is not that it doesn't work, merely that there are better ways to do it.
-
@gleemonk said in Building SQL queries the hard way:
I'm not gonna clean this up. It works.
Oh come on, it's easy!
$query = "SELECT id FROM pages WHERE page_id IN(".implode(",", $page_ids).") AND published=1 ORDER BY title ASC";
There! Perfect! Ship it!
-
@anotherusername said in Building SQL queries the hard way:
@gleemonk ...wait, what?
He has a list of page_ids and he's using them to query
pages.id
... if that doesn't result in getting the same list back, that db schema is TRWTF.Not necessarily. If these IDs come from the URL then the user might have added some extra IDs there to see some unpublished pages that he shouldn't see.
Ordering the IDs by page title might also be important.It's not pretty nor efficient but atleast the previous developer knew that he shouldn't blindly trust the users.
-
@anotherusername said in Building SQL queries the hard way:
@gleemonk ...wait, what?
He has a list of page_ids and he's using them to query
pages.id
... if that doesn't result in getting the same list back, that db schema is TRWTF.Not a , there are multiple pages sharing the same
page_id
due to i18n. Granted, I could have chosen better field names when anonymizing the snippet.
-
@Onyx said in Building SQL queries the hard way:
Oh come on, it's easy!
$query = "SELECT id FROM pages WHERE page_id IN(".implode(",", $page_ids).") AND published=1 ORDER BY title ASC";
There! Perfect! Ship it!
Wow this is cool! Now if you could help me JOINing these:
$query = "SELECT attribute_id FROM attribute_value WHERE value='$category_id'" ; $list = dbquery($query) ; $page_ids = array() ; foreach ($list as $entry) { $query = "SELECT page_id FROM attribute WHERE id=$entry" ; $list2 = dbquery($query) ; $page_ids[] = $list2[0] ; } if (count($page_ids) < 1) return array() ; // $query = "SELECT id FROM pages WHERE " ; // foreach ($page_ids as $pid) { // $query .= "(page_id = $pid AND published=1) OR " ; // } // $query = substr($query, 0, strrpos($query, "OR")) ; // $query .= " ORDER BY title ASC" ; // TODO Ask Onyx how it works. $query = "SELECT id FROM pages WHERE page_id IN(".implode(",", $page_ids).") AND published=1 ORDER BY title ASC";
You see I speak SQL perfectly but I find advanced concepts like joins a bit tough to understand
In other news I just confirmed that
$category_id
may be supplied as a GET parameter. (It's not the first injection opportunity I found, and probably not the last.)
-
@gleemonk said in Building SQL queries the hard way:
injection
And this is why I always, always have some kind of abstraction layer that enforces proper parameter binding. Anyone bypassing them is legible to find themselves on the wrong end of my crowbar.