I SELECT f.*, c.* sometimes.


  • ♿ (Parody)

    @darkmatter said:

    As for my answer, I come from a data warehouse querying background, hence the use of the subquery to eliminate as much data as possible before joining.

    Yeah, there a zillion things that could change the performance.



  • Yeah. I agree. @boomzilla has a solid no bullshit query.



  • I didn't forget the asterisk, discourse helped and italicized my query.



  • So it was markdown trickery...Got it.


  • ♿ (Parody)

    I recommend using the triple backtick around code. You don't have to worry about that and you get highlighting and a monospace font.



  • My original reply is fine, it's his quoting of my original reply that broke it.

    I used < pre>< code> like you assholes told me to when i bitched about not being able to use just < code>



  • But triple backtick is even less typing than manually putting in HTML that can't possibly go wrong...



  • @Frank said:

    SELECT q.*,(SELECT description FROM faqs_topics WHERE id=q.category) AS catDescription
    FROM faqs_questions q
    WHERE q.id NOT IN (SELECT questionid FROM faqs_answers)
    ORDER BY q.createdate

    Can someone tell me why this solution is substandard. @boomzilla didn't like it. But why?

    I will change NOT IN to NOT EXISTS

    Like this:

    @Frank said:

    WHERE not exists (SELECT 1 FROM faqs_answers WHERE questionid=faqs_questions.id)

    But what else is wrong with it?


  • ♿ (Parody)

    @Frank said:

    Can someone tell me why this solution is substandard. @boomzilla didn't like it. But why?


    Filed Under: Subquery abuse



  • Really? But all I need is the description. Explain.



  • Is the thinking that if you can do it without a subquery then that is the best route? It must be additional overhead on the DB?



  • Nevermind I found an explanation here:

    Filed under: creepy winky picture



  • Because like my original post stated, if you ever get more than one match from your

    select q.*, (subselect dupes)

    your sql query will choke and die?

    If you're being serious,

    The goal is to do a subquery, then a join so that tables can effectively use their indexes before joins so there's less 'swallowing the ocean'

    EX:

    
    select *
    from tablea a
    left join tableb b
    on a.id = b.foreignid
    where datetime >= somedate
    

    vs

    
    select *
    from
    (
     select * from tablea where datetime >= somedate
    ) a
    left join
    (
    select * from tableb where datetime >= somedate
    ) b
    on a.id = b.foreignid
    

    Generally performs better.

    [edit] fixed some formatting between the two queries. Fuck the query formatting though.


  • ♿ (Parody)

    @Frank said:

    Is the thinking that if you can do it without a subquery then that is the best route? It must be additional overhead on the DB?

    In addition to potential performance issues, it's simply not as easy to read. YMMV, of course. When I need a subquery, I tend to pull it out into a with clause. This works well in Oracle. I have only barely touched mysql, so I can't really comment there.


  • ♿ (Parody)

    @Matches said:

    if you ever get more than one match...your sql query will choke and die?

    I thought that maybe mysql would do some sort of cross join with that, since he was doing it. But, yeah, that's another potential land mine.



  • I see. But in seriousness. I should have mentioned its a reference table. It will never change. So my comment on f'd in the a was a joke, no one will ever change this table, if they do I will kill them with fire (I actually told them that for this project this table can never change....ever). Everyone is pretty clear about it.



  • I haven't used MySQL since version 4.1ish, so I can't tell you much about that environment. I generally use sql server



  • @Frank said:

    I see. But in seriousness. I should have mentioned its a reference table. It will never change. So my comment on f'd in the a was a joke, no one will ever change this table, if they do I will kill them with fire (I actually told them that for this project this table can never change....ever). Everyone is pretty clear about it.

    Filed under: create trigger FuckYou instead of insert select 1



  • As far as I know MySQL doesn't support with ...

    but you mentioned you are Oracle specific.



  • How would you use WITH? There's probably some way to do it in MySQL these days; it's come a long way since 4.x.



  • That's a good question. As far as I know it is not an option. but then again there are a fuckton of differences between Oracle and MySQL. Starting with identity tables. You get used to it fast but you wonder. Will they ever change it to match what SQL SERVER and MySQL does with incrementing values?

    Answer is probably no...never...



  • with works with sql server as well

    ;with table as
    (
    select something from somewhere
    )
    select * from table

    Looks like mysql is about the only db without a with clause

    Oracle's identities, apparently?

    But it looks like real shit, since it's lost if the db crashes, and you have to have a process to reset it. (or you could cache the whole range, but that also looks like a terrible idea)



  • And MySQL won't get it since Oracle need to keep MySQL crippled to be able to sell the big iron stuff.

    I wonder if MariaDB will obtain it if it hasn't already.



  • Honestly? I don't feel like missing the with clause is that crippling. Even on databases that have it, with generally is pretty poor performance. @sam my enter button is broken again.

    Temp tables, table variables, etc. generally perform significantly better than with tables (at least on sql server)


  • :belt_onion:

    @Frank said:

    SELECT q.*,(SELECT description FROM faqs_topics WHERE id=q.category) AS catDescription FROM.....

    @boomzilla said:

    Filed Under: Subquery abuse

    @Frank said:
    Really? But all I need is the description. Explain.

    If you need the description, then join to the table for it.
    By subquerying in the select part, you're basically asking the dbms to run the subquery X number of times, where X is the number of results you got back. Also, you need to have a "distinct" or "group by" clause on you subquery to guarantee a single result or your query might just flat out fail altogether.

    For 2 row example tables it might work great. For multi-million or billion row tables... well good luck with that.



  • Though some tools are smart enough to resolve this to a variable if it can determine it will never change. (I don't think this can be applied to any mysql editor...)


  • :belt_onion:

    Possible, but not likely if the subquery is using a column from the selection to subselect data.
    If you did
    [code]select (select 1)
    from table[/code]
    or
    [code]select (select a from table2 where table2.a = table1.a)
    from table1 where a = 3[/code]
    then sure, it will likely optimize fine, but why would you do that unless you're trying to screw with your colleagues.



  • Oh, no, I don't recommend it. I'm in the camp of 'Join directly', or 'join via subquery' in the from clause.



  • Well there is a good reason.

    BTW I put the query @boomzilla designed in my production code.

    Everybody taught me something new about SQL today.


  • :belt_onion:

    The culmination of years and years of data warehouse database experience... I taught Frank a thing.


    Filed Under: Consulting fee now please.


  • So I think everyone is in consensus that joins are better than subqueries and that right join should be killed with fire.

    Also to make the query human readable.



  • Especially if you use an application that, in its homebrew query abstraction layer (before PDO and similar were a thing) actually tests and refuses to run any query with a subquery in it.



  • That was some serious sub query awesomeness in your query.

    I understood your thinking instantly...i don't mind subqueries actually and I think there are some cases where they can't be avoided.

    At least I think that's the case.



  • I did like the way my query looked. I didn't consider another person might have look at it. I was not concerned about DB overhead...it's not a site like this discourse forum where transactions can get intense.


  • :belt_onion:

    Subqueries can be good.
    I've even done subqueries in the select part of a statement, but they are for extremely special circumstances. One instance was a query that would have required joining a 3.5billion row table to a 500million row table of a different grain1, where the final result only ever actually gave back about 25 rows.

    In that situation, selecting the 25 final rows from the 3.5billion row table and then subquerying the 500million row table greatly improved the performance.

    Also note subqueries of that type don't even work on certain engines (Netezza comes to mind).

    1) which meant having to run a subquery on the 500mil row table to group on the same key as the 3.5billion row table so the join wouldn't duplicate the data.



  • As likely being among "some people," I should probably apologize for being a tad harsh - when code snippets get posted here, the general response seems to be that people pick them apart mercilessly.

    @Matches proposes the closest response to what I would choose. If the schema is fixed, go with his/her answer - the only thing I would add would be some stylistic modifications:

    [code]
    select fq., ft.
    from faqs_questions as fq
    left join faqs_topics as ft on
    (fq.faqs_topic_id = ft.faqs_topic_id)
    where not exists
    (
    select fa.question_id
    from faqs_answers as fa
    where fq.faqs_question_id = fa.faqs_question_id
    )
    order by fq.create_date
    [/code]

    Rationale:

    • The aliases resemble the table names, so that if this query grows (or each of the columns in the * are expanded), it's easier to see what's coming from where.
    • Starting with the base table(s) and going left to right on joins is more idiomatic
    • Prefer NOT EXISTS to LEFT JOIN WHERE NULL to NOT IN. You can get burned by NOT IN if either side of the expression is nullable. NOT EXISTS also makes your intent clearer to the query optimizer and can be more performant (at least in SQL Server).
    • Indent subqueries, derived tables and common table expressions for readability
    • All referenced columns are aliased. This is important because if someone adds a column with the same name to any of the tables participating in the join and you don't alias it, you'll get an ambiguous column error and the query is broke.
    • Primary key and foreign key columns should have the same names so the join predicate to use is obvious
    • Since the underlying convention seems to involve underscores, added them to column names. There's a religious war over under_scores vs. PascalCase (as there is in most languages); my personal preference is underscores.
    • There are religious wars over whether tables and their keys should be named singular.singular_id, plural.singular_id, plural.plural_id, etc. My personal preference is plural.singular_id because it sounds more natural when spoken (i.e. select star from orders where order_id is...). Most people I've asked seem to have little problem with any convention as long as it's CONSISTENT.

  • :belt_onion:

    At least do [code]where not exists (select 1 from....[/code]
    There is no need to get the value out to tell whether the result would exist.

    As for naming things, the request/challenge was not to rename things but to improve the actual SQL logic, that is why hardly anyone renamed anything.



  • Argh, this thread reminds me I need to write like 30 queries this weekend. At least SQL is easy. The layer between my database and my app is more of a pain in the ass.



  • @darkmatter said:

    At least do

    where not exists (select 1 from....

    There is no need to get the value out to tell whether the result would exist.

    Fair point, I left the column there mainly to emphasize the need for qualified names.



  • This is a great answer. I appreciate more explanation when it comes to 'why' certain conventions are used.

    Also, when I see code answers it tells more about people and what they deal with on a day to day basis.

    Yes my code was not great, but I come from a code environment that uses databases differrently...because I am solving front end requests. When I do composite keys with rollback deletes to other tables I get a little excited...people kind of get excited because I'm excited but it doesn't mean much. Also I don't get a whole lot of time during projects to do such things. Maybe two projects a year leverage tables in concerning ways.

    Basically there is a cost to doing things which can be argued one way or another till the cows come home but when you are in the trenches it's up to the business-side and clients to determine what is needed and what they want to pay for.

    Rule of thumb is that pain usually predicates change and do extra massaging only when told.

    With that said. This thread alone has taught me at least 10 really important things I can easily do when I am working through queries which only required a shift in thinking and concerns.

    Thanks again for the great answer!



  • @Groaner said:

    Primary key and foreign key columns should have the same names so the join predicate to use is obvious

    I agree with everything but this. I prefer calling my primary keys just id (if they are of integer / auto number type). That way I can at a glance distinguish them from foreign keys, which are formatted as something_id.

    I don't see great benefit from naming them both the same. If you get rid of aliases (in case of shorter table names), you can arrive at nice looking constructs such as table1.id = table2.table1_id.



  • @Onyx said:

    Would someone enlighten me on the necessity of WHERE 1 = 1 AND... here?

    WHERE 1=1 makes it easier to dynamically build the query - you have something like foo = "select ... where 1=1", then you can simply append other conditions - if there are any. Makes the code a bit smaller, since you don't have to check if you already have a where clause every time you want to add a condition.


  • BINNED

    @ender said:

    WHERE 1=1 makes it easier to dynamically build the query - you have something like foo = "select ... where 1=1", then you can simply append other conditions - if there are any. Makes the code a bit smaller, since you don't have to check if you already have a where clause every time you want to add a condition.

    That... makes sense. And I have a few annoying ifs I could kill that way. See, that's why I love this forum: I learn the WTFs to avoid, and learn that some WTFs are actually just a clever trick.



  • @ender is right. In coldfusion queries you will see where 1= 1 quite a bit.

    <cf query name="pullAllMonkies" datasourse="#application.dsn#">
    
    Select * from monkey q Where 1=1
    
    <cfif argument eq 1>
    And q.field = #argument#
    <cfelseif argument eq 2>
    And q.field = 2
    <cfelseif argument eq 9>
    And q.field = 9
    </cfif>
    
    </cfquery>
    

    This way I can use the raw query for something else if none of those checks match.

    Filed under: I should have explained that.



  • Why are you ignoring two of the arguments if the first one is set?



  • Because I am typing on my phone and was trying to get my post out...and I didn't think anyone would catch or care... lol. Dammit!



  • You guys don't let anything slip...I'm on the laptop now...lol.



  • There FIFM...syntactically correct..But I am digging up a real world example so you can see how hairy this can get ....for shits and gigs.



  • Here is a day in the life of a coldfusion programmer approaching queries.

    Disclaimer (this is only an example):

    <cfquery name="getAllmonkey" datasource="#application.dsn#">
    SELECT * FROM monkey LEFT JOIN monkey_categories ON monkey.category = monkey_categories.id
    WHERE 1=1 
    <cfif id neq "">
    	AND monkey.id = <cfqueryparam cfsqltype="cf_sql_integer" value="#id#">
    <cfelseif NOT archive>
    	<cfif application.dstype eq "mysql">
    		AND NOW() > startDate
    		AND ( endDate IS NULL OR NOW() < endDate )
    	<cfelse>
    		AND getdate() > startDate
    		AND ( endDate IS NULL OR getdate() < endDate )
    	</cfif>
    <cfelseif archive>
    	<cfif application.dstype eq "mysql">
    		AND NOW() > startDate
    		AND ( endDate IS NOT NULL AND NOW() >= endDate )
    	<cfelse>
    		AND getdate() > startDate
    		AND ( endDate IS NOT NULL AND getdate() >= endDate )
    	</cfif>
    </cfif>
    <cfif arguments.category>
    	AND (monkey.category = <cfqueryparam cfsqltype="cf_sql_integer" value="#arguments.category#">
    		<cfif application.settings.var('monkey.multipleCategories')>
    			OR monkey.category LIKE <cfqueryparam cfsqltype="cf_sql_varchar" value="#arguments.category#,%">
    			OR monkey.category LIKE <cfqueryparam cfsqltype="cf_sql_varchar" value="%,#arguments.category#">
    			OR monkey.category LIKE <cfqueryparam cfsqltype="cf_sql_varchar" value="%,#arguments.category#,%">
    		</cfif>
    		)
    </cfif>
    ORDER BY monkey.sortorder ASC, monkey.startDate DESC, monkey.headline ASC
    </cfquery>


  • @Frank said:

    <cfif application.dstype eq "mysql">
    AND NOW() > startDate
    AND ( endDate IS NOT NULL AND NOW() >= endDate )
    <cfelse>
    AND getdate() > startDate
    AND ( endDate IS NOT NULL AND getdate() >= endDate )
    </cfif>

    So the app is trying to be usable with both MySQL and SQL Server, with no abstraction layer.

    There's no way it's gonna end badly, right?


Log in to reply