Parse SQL query and get a list of tables



  • Evening all.

    I have been tasked with writing a function which will be passed a sql query as a string, it should parse this query prepending (is that a word? you get what I mean anyway...) dbo.[tablename] to any tables it finds. I have been given some code as an example (its simply looking for instances of "from" and "where/inner/outer/left/right" etc.. and assuming that the word in the middle is a table...
    This is being done in C#

    All Im really looking for is pointers of the best way to do this... (Although the brief is I will be passed a string and I will return a string, so please bear that in mind if you have any ideas!)

     Thanks



  •  That sounds... fun. I would start by getting more requirements and considering the possibilities. 

    Questions to ask:

    Will there always be a from and a where? 

    Are you guarenteed by the spec to have a where clause in all SQL queries passed in? 

    What are you supposed to do if there isn't one?

    Things to think about:

    What are the possible ways that a from clause can look like?  

    from ((myfoo inner join mybar on myfoo.a=mybar.a) inner join thisbar on myfoo.a=thisbar.a) ... to infinity

    from foo f, bar b

    from foo, bar

    from foo

     

    I am no expert at SQL queries so I know that I am missing some.  Get more requirements and pound out all of the possible situations that you could see first.  From there you just write a recursive parser.


       



  • Here is what I'd do:

     

    1.  Start with a list of tables.

    2.  Make sure there are no field names with the same name as the table

    3.  Find and Replace



  • Trivial. First you need to read the relevant portions of the SQL standard, which I estimate at being 1389 pages long (that is, parts [url=http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=45498]1[/url] and [url=http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=38640]2[/url]). Should be a piece of kake.

    Once you've done that, return to our children to receive additional helpful advice.



  • Presumably you are after the marginal performance gained buy adding the .DBO.

    What is stopping you from changing the source of these queries?

     



  • I've got the perfect WTF solution. In a recursive function, you prefix (I think that's the proper word) every keyword with "dbo." and then submit the query. If you get an error, you try the next combination, etc. until something returns. That will teach them.

     Anyway, for the life of me, I cannot see why the submitting module cannot be adapted to do this or why the server does not offer another option to make "dbo" the default namespace or database or whatever you're programming against. If you have identical table names in different schemas/databases/namespaces, then you're f*cked to begin with, so this sounds like a really, really bad way to patch a very uninformed change. Perhaps it's time for a CCB?



  • @Spectre said:

    Trivial. First you need to read the relevant portions of the SQL standard, which I estimate at being 1389 pages long (that is, parts <A title=http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=45498 href="http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=45498" target=_blank>1 and <A title=http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=38640 href="http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=38640" target=_blank>2). Should be a piece of kake.

    Once you've done that, return to our children to receive additional helpful advice.

     

    Cheers, yeah, Ill get right on that... :)



  • @LoztInSpace said:

    Presumably you are after the marginal performance gained buy adding the .DBO.

    What is stopping you from changing the source of these queries?

    DBA's say its for security, but I believe that its for the performance (however minimal)

    The reason that we dont want to change the queries themselves is that they are scattered through aroun 150 web apps.. however all of these use the same "Data Layer" (Quotes because its a DAL in the loosest possible way) so we can change this to parse the queries that its executing and then we can ignore (at least for now) the individual apps...



  • @wonkoTheSane said:

    DBA's say its for security, but I believe that its for the performance (however minimal)
    Guess you better write a pretty efficient process then.



  • @wonkoTheSane said:

    @LoztInSpace said:

    Presumably you are after the marginal performance gained buy adding the .DBO.

    What is stopping you from changing the source of these queries?

    DBA's say its for security, but I believe that its for the performance (however minimal)

    The reason that we dont want to change the queries themselves is that they are scattered through aroun 150 web apps.. however all of these use the same "Data Layer" (Quotes because its a DAL in the loosest possible way) so we can change this to parse the queries that its executing and then we can ignore (at least for now) the individual apps...

    If it's for security then why not just get the DBAs to implement it properly and be done with it.  I can't think of any security measure this prefix would solve that a decent DBA couldn't work out.  Care to share the details of the alledged risk?



  • @LoztInSpace said:

    If it's for security then why not just get the DBAs to implement it properly and be done with it.  I can't think of any security measure this prefix would solve that a decent DBA couldn't work out.  Care to share the details of the alledged risk?
    He can't, because the PHBs didn't have a good reason. I thought that was obvious from his previous comments. Shame the DBAs don't want to do their own work...



  • @LoztInSpace said:

    then why not just get the DBAs to implement it properly and be done with it.  I can't think of any security measure this prefix would solve that a decent DBA couldn't work out.  Care to share the details of the alledged risk?

     

    If they ever tell us what they think the risk is then Ill let you know :)

    Personally I think that its one guy quite high up that has decided this and no one wants to challenge the decision!



  •  Old post. I know but I found a solution to this:

    http://www.sqlparser.com/retrieve-refactor-table-column-name.php



  • @sproketboy said:

     Old post. I know but I found a solution to this:

    http://www.sqlparser.com/retrieve-refactor-table-column-name.php

    Or....

    Grab the SQL-99 grammer in BNF form from [url="http://savage.net.au/SQL/sql-99.bnf.html"]here[/url] and run the grammer and the query through lex or JFlex to tokenize it.  Then, search for all of the <table_primary> tokens.



  • Wow -- OLD post.

    Wouldn't running the queries through the (what must be a really convoluted) pre-processor negate any performance gains.


Log in to reply
 

Looks like your connection to What the Daily WTF? was lost, please wait while we try to reconnect.