Optional SQL parameters and SQL injection avoidance



  • Hi,

    I've been trying to find a "perfect" solution for this problems for a while and thought maybe the WTF crowd could point me in the right direction.

    The problem iswhen you are trying to assemble SQL statements with multiple clauses that are passed to the WHERE statement - and all and any are optional. At the same time you want to protect from SQL injection risks.

    In detail:

    Say, you have a SELECT query that is assembled in code, based on a query form on a web page. Let's just say 3 parameters:

    1) Firstname
    2) Lastname
    3) Age

    The form can be filled in with any combination of fields and any can be left empty.

    Ultimately that results in queries that look like various combinations.

    SELECT * FROM Table WHERE firstname=? AND lastname=? AND age=?

    or

    SELECT * FROM Table WHERE lastname=? AND age=?

    or

    SELECT * FROM Table WHERE firstname=?

    And so on, through all combinations, including possibly SELECT * FROM Table.

    Now, putting the WHERE clause together with string assembly is easy. Just check for whether a form field was filled and if so add that query element. Glue them together with AND strings. Something like:

    whereClause += (isNotEmpty(txtLastName)) ? "lastname = '" + txtLastName + "'" : "";

    And then just a list of these for each form field. Put " AND " in between, presto.

    It goes without saying that this is BAD! Alarm bells and all that. SQL Injection danger, blablabla.

    Thou shalt use parametrised queries.  But how do you put together parametrised queries elegantly without having huge ugly if-else trees? Because you have to create your query string, THEN plug in the parameters.

    So, I could do something like:

    whereClause += (isNotEmpty(txtLastName)) ? "lastname=?" : "";

    Do that for all the form fields, plug it into the query object. THEN I have to repeat that to add the parameters!

    if (isNotEmpty(txtLastName)) SetParam(1,txtLastName);

    Oh, and how do I know the position/index of the parameter in the query? ARGH! Surely there's a proper way to do this?

    Right now, I use the first method, but with something like this:

    whereClause += (isNotEmpty(txtLastName)) ? "lastname = '" + Normalise(txtLastName) + "'" : "";

    In the Normalise() method I do my darndest to munge and check that field to make sure it doesn't put me at risk for injection. But I'm worried no matter what I do, it may not be enough and parametrised queries is the way to go.

    All my googles on the topic always end up showing me a single parameter. Maybe two and they do indeed use a brief if-else block. But once you have, say 4 or more, you end up with horrible constructs if taking their advice.

    Does anybody have a suggestion on the "Right Way" for n optional parameters?

    Thanks

    Mike 



  • Unless performance prohibits it, I do it like that: 

    select * from person where (first_name=:p1 or :p1 is null) and (last_name=:p2 or :p2 is null) and (age=:p3 or :p3 is null)

    (the parameter names p1, p2, p3 have been choosen to keep it short. Normaly I'd use p_first_name etc.)

    If your language only allows for positional parameters in queries, it gets kind-of ugly... 

    select * from person where (first_name=? or ? is null) and (last_name=? or ? is null) and (age=? or ? is null)

    SetParam(1, txtFirstName); SetParam(2, txtFirstName);
    SetParam(3, txtLastName); SetParam(4, txtLastName);
    SetParam(5, txtAge); SetParam(6, txtAge);

     

    The advantage of this style is that the query is always the same, so the database system doesn't have to do a hard parse for every possible variation of the query; you might even be able to keep the query as a prepared statement to further reduce the overhead. 



  • Hi!

    That is exactly what I'm looking for! I guess I should have come up with that myself, but sometimes... forest...trees...

    I'm using DB2, so definite affirmative on named parameters (with a @ at front instead of :). I'll try this and see if it also does substitution of one parameter in multiple locations (you would think so, but I never make assumptions, burned too often).

    Many, many thanks!

    Mike



  • @marakai said:

    In the Normalise() method I do my darndest to munge and check that field to make sure it doesn't put me at risk for injection.

    Are you saying that your DB2 API has no genuine escape method, or... WTF



  • @ulzha said:

    @marakai said:
    In the Normalise() method I do my darndest to munge and check that field to make sure it doesn't put me at risk for injection.
    Are you saying that your DB2 API has no genuine escape method, or... WTF
     

    For obvious reasons, an escape method cannot be part of the database. If such a method exists, it must be part of the API. Anyway, the whole idea of escaping is fundamentally broken.



  • Another possibility (Java code because I copied it):

    int[] offsets = new int[3];
    String sql = "SELECT * FROM table WHERE true "
    if(firstname != null){
    offsets[0] = 1;
    sql += " AND firstname = ?";
    }

    if(lstname != null){
    offsets[1] = 1;
    sql += " AND lastname = ?";
    }

    if(age != -1){
    offsets[2] = 1;
    sql += " AND age = ?";

    }

    PreparedStatement pstmt = con.prepareStatement(sql);

    if(firstname != null){
    pstmt.setString(offsets[0], firstname);
    }

    if(lastname != null){
    pstmt.setString(offsets[0]+offsets[1], lastname);
    }

    if(age != -1){
    pstmt.setInt(offsets[0]+offsets[1]+offsets[2], age);
    }

    Moderator's note: fixed the formating for you;small bug fixed for clarity



  • OK, I give up. 

    Anyway, the idea is that you have an array of ints where 0 means the optional parameter is not present, and 1 means it is present. Then you can compute the index of a given parameter by adding up the contents of the array up to that index. 



  • @ammoQ said:

     

    select * from person where (first_name=:p1 or :p1 is null) and (last_name=:p2 or :p2 is null) and (age=:p3 or :p3 is null)

    (the parameter names p1, p2, p3 have been choosen to keep it short. Normaly I'd use p_first_name etc.)

    If your language only allows for positional parameters in queries, it gets kind-of ugly... 

    select * from person where (first_name=? or ? is null) and (last_name=? or ? is null) and (age=? or ? is null)

    SetParam(1, txtFirstName); SetParam(2, txtFirstName);
    SetParam(3, txtLastName); SetParam(4, txtLastName);
    SetParam(5, txtAge); SetParam(6, txtAge);

     

     

    Well, as it turns out, our . NET data provider for DB2 (DataDirect) only supports ANSI markers (... lastname = ? ...) while at the same time not understanding something like "? is null", after setting the parameter names appropriately. So that hoses me on both nice and ugly solution.

    Meaning this could go to REALLY ugly, real quick.

    Also found that, at least in DB2, a parameter marker is never allowed on the left hand side of an expression. 

    So, it looks like I will have to pursue some sort of indexed solution of parameter arrays. *shudder*

    Great, so I'll get to nominate my own code as WTF of the month...

     

    Mike 



  •   Gotta love replying to your own posts, but I was wondering if this would work without being a total mess (getting late, brain fried, so dumping it here first before I chase my tail with code...) - brazzy actually sort of gave me the idea.

    To make it reusable (I think the term on TDWTF is "enterprisey"?) I could use a hashtable of the SQL field name (as string) and the value as object, which would be CAST() appropriately. The latter may not even be needed in my case as it's all strings.

    Looping through the list of input fields on the form, only those that aren't empty get added to the hashtable. From said hashtable I could create snippets of "AND"-prefixed parts of the WHERE clause.

    As parameter names for each marker I'd simply use generic names like p1, p2, p3 within the loop.

    Or maybe I'll just go to bed and try again when I'm more conscious...

    Mike 



  • A little improvement in brazzys version, works without the ugly array.

    But of course this kind of solution is (like any version with positional parameters) implicitely dangerous in terms of maintenance... 

    String sql = "SELECT * FROM table WHERE true "
    if(firstname != null){
    sql += " AND firstname = ?";
    }

    if(lstname != null){
    sql += " AND lastname = ?";
    }

    if(age != -1){
    sql += " AND age = ?";

    }

    PreparedStatement pstmt = con.prepareStatement(sql);
    int parcnt = 0;
    if(firstname != null){
    pstmt.setString(++parcnt, firstname);
    }

    if(lastname != null){
    pstmt.setString(++parcnt, lastname);
    }

    if(age != -1){
    pstmt.setInt(++parcnt, age);
    }

     


  •  @marakai said:

    Looping through the list of input fields on the form, only those that aren't empty get added to the hashtable. From said hashtable I could create snippets of "AND"-prefixed parts of the WHERE clause.

    You don't need a hashtable, just a list of columnname-value pairs. Since you only have positional parameters, parameter names are not required at all.



  • @ammoQ said:

    You don't need a hashtable, just a list of columnname-value pairs. Since you only have positional parameters, parameter names are not required at all.

     

    Yeah, wish I knew why I thought "hashtable" - it's not like I'm doing anything with the key.

    Well, things are kinda working as long as I use a DataReader. But in the end I need to stuff it into a frakking DataAdapter (don't ask) and that's not happening. Same code, swapping Reader with Adapter. Reader returns results, Adapter remains empty.

    That's it, I'm off to bed... futz with this tomorrow. Thanks for the help, but, the bouncing off of ideas and getting input certainly made some things happen.

    Cheers

    Mike 



  • I could be wrong, but I thought DB2 was one of the few RDBMS that supported stored procedure overloading, at least I thought that was the case when I worked with DB2/400. If your using the regular version of DB2, it might have the same capabilities and might make your life easier. 


  • ♿ (Parody)

    You can alternatively use something like:

    WHERE FIRSTNAME LIKE ?

    Then set the parameter to be either '%' or the actual name.

    For numeric parameters, is could be something like:

    WHERE AGE BETWEEN ? AND ?

    And they either both get the actual value, or the smallest and largest values that the data can take. 

     



  • @boomzilla said:

    You can alternatively use something like:

    WHERE FIRSTNAME LIKE ?

    Then set the parameter to be either '%' or the actual name.

    For numeric parameters, is could be something like:

    WHERE AGE BETWEEN ? AND ?

    And they either both get the actual value, or the smallest and largest values that the data can take. 

     

    Also a possibility, but beware of the case that the value is NULL, so it doesn't even match '%' resp. falls between -INF and INF.



  •  Well, after a day of hacking away at this, I came up with a solution, but I don't think it will win any beauty prizes....

    Here's an excerpt. It works, but I can't help thinking there's got to be a better way!

     public DataSet VNCQueryAsDataSet(string refNo, string lastName, string firstName, string birthDate)
            {
                string vncWhereClause = " WHERE 1=1 ";
                List<DB2Parameter> parameterList = new List<DB2Parameter>();

                if (!String.IsNullOrEmpty(refNo))
                {
                    vncWhereClause += AppendANDSubclause("reference_no = ?");
                    parameterList.Add(AddParameterWithValue(refNo, DB2DbType.VarChar, 15));
                }
                if (!String.IsNullOrEmpty(lastName))
                {
                    vncWhereClause += AppendANDSubclause("last_name = ?");
                    parameterList.Add(AddParameterWithValue(lastName, DB2DbType.VarChar, 30));
                }
                if (!String.IsNullOrEmpty(firstName))
                {
                    vncWhereClause += AppendANDSubclause("first_name LIKE ?");
                    parameterList.Add(AddParameterWithValue(firstName+"%", DB2DbType.VarChar, 30));
                }
                if (!String.IsNullOrEmpty(birthDate))
                {
                    vncWhereClause += AppendANDSubclause("birth_date = DATE(CAST(? AS VARCHAR(15)))");
                    parameterList.Add(AddParameterWithValue(birthDate, DB2DbType.VarChar, 15));
                }

                return GetDataSet(AssembledCommand(_vncSelect, vncWhereClause, parameterList));
            }

            private DB2Parameter AddParameterWithValue(string paraValue, DB2DbType db2Type, int length)
            {
                string parameterName = "@p_" + _paramIndex++.ToString();

                DB2Parameter paraMarker = new DB2Parameter(parameterName, db2Type, length);
                paraMarker.Value = UpperCaseAndFixApostrophe(paraValue);

                return paraMarker;
            }

            private string AppendANDSubclause(string subclause)
            {
                return @" AND (" + subclause + ") ";
            }

     

    The last two methods are merely helpers as a results of refactoring for better readability, what with all the munging of strings. Some field declarations are not visible here:

     The _paramIndex is a class static field. I use it to auto-create parameter marker names.

    The base where clause is initialised with "WHERE 1=1" to always have some clause, even if all form fields are passed in empty. Then there's the ugly if{} blocks for every form field passed as parameter into the method. Any suggestion on how to improve this gratefully accepted! If a field is not empty, then its sub clause for the WHERE statement is appended in parameter marker form. The parameter and its value are kept in a simple generic list. 

    Once all is munged together, it's all dumped into a method to turn it into a Command object:

    private DB2Command AssembledCommand(string dbSelect, string query, List<DB2Parameter> paramList)
            {
                string queryCoda = " FETCH FIRST " + _maxRecords.ToString() + " ROWS ONLY";
                DB2Command assembledCommand = new DB2Command(dbSelect + query + queryCoda);
                assembledCommand.Parameters.Clear();
                assembledCommand.Parameters.AddRange(paramList.ToArray());
                return assembledCommand;
            }

    Again, queryCoda is just used as local variable for readability. Everything is pasted together, including the parameter list which is added in bulk to the DB2Command.

    I guess I achieved what I set out to do - injection proofing the code and using parametrised database queries.

    Strong doubts remain how maintainable this is...

    Mike 



  • Hi,

     If you're looking for a good guide to this, try Erland Sommarskog's site: http://www.sommarskog.se/

     The page on dynamic SQL and it's alternatives for dynamic where clauses should help: http://www.sommarskog.se/dynamic_sql.html

     The good thing with him is that he goes over all the possible methods, and gives a good comparison over all of them. Well worth a read for anyone into SQL.



  • I don't know if anybody really cares, but in the spirit of closure, I wanted to add that after some futzing on and off on this issue, I hobbled together a solution. It may not win any prizes, but it works well and the resulting code was fairly clean.

    In the end I created a simple wrapper class "ANDClause" (OK, sue me for unimaginative naming) which will hold both parametrised markers and the DB parameter. Surrounding them is a collection wrapper.

    This ultimately allowed me to reduce and refactor my web method to little more than

     

    ANDClauseCollection andClauses = new ANDClauseCollection();

    andClauses.Add(new ANDClause("reference_no = ?", DB2DbType.VarChar, 15, UpperCaseAndFixApostrophe( refNo)));
    andClauses.Add(new ANDClause("last_name = ?", DB2DbType.VarChar, 30, UpperCaseAndFixApostrophe( lastName)));
    andClauses.Add(new ANDClause("first_name LIKE ?", DB2DbType.VarChar, 30, UpperCaseAndFixApostrophe( firstName+"%")));
    andClauses.Add(new ANDClause("birth_date = DATE(CAST(? AS VARCHAR(15)))", DB2DbType.VarChar, 15, UpperCaseAndFixApostrophe( birthDate)));

    return GetDataSet(AssembledCommand(_vncSelect, andClauses));

    (The upper case et al WTF is because the legacy database this needs to talk to is so old it can't deal with lower case characters) 

    When adding clauses to the collection, checking is done whether there are valid values in each (seeing as the main use is for web search forms).

    AssembledCommand makes use of a method in the collection class, QueryCommand(), that throws out the whole shebang of all the possible AND clauses in a nice clean SELECT...WHERE...AND...AND... chunk.

     private DB2Command AssembledCommand(string dbSelect, ANDClauseCollection clauses)
            {
                string queryCoda = " FETCH FIRST " + _maxRecords.ToString() + " ROWS ONLY";
                return clauses.QueryCommand(dbSelect, queryCoda);
            }

     <shrug>I'm still a little surprised that there isn't a "standard" way of doing what I thought is a common task. So I can't help thinking I re-invented the wheel. But at least other can use this and even understand the code without running screaming.</shrug>

    Mike



  • I did the following in PHP: (mres = mysql_real_escape_string)



    $where = array("1");

    if (isset($SearchName) && $SearchName != "")

    $where[] = "Name = '" . mres($SearchName) . "'";

    if (isset($SearchJob) && $SearchJob != "")

    $where[] = "Job = '" . mres($SearchJob) . "'";



    $query = "SELECT * FROM Humans WHERE " . implode(" AND ", $where);



    Works, might not be the most 'enterpricy' sollution, but everyone seems to understand what is going on.



  • You can change the

    if (isset($SearchName) && $SearchName != "")

    to just:

    if (!empty($SearchName))

    :)


    The real WTF is that Community server ate the ]s... $where[ should be $where[]



  • Eep, post timeout :@

    I was going to say, you could extend it to easily use a few POST variables nicely:

    $where_checks = array(
    	// POST variable => corresponding field in table
    	'name' => 'Name',
    	'job' => 'Job'
    );
    $where = '1';
    
    foreach ($where_checks as $post_var => $where_var)
    {
    	if (!empty($_POST[$post_var]))
    		$where .= ' AND ' . $where_var . ' = "' . mysql_real_escape_string($_POST[$post_var]) . '"'; 
    }
    
    $query = '
    	SELECT *
    	FROM Humans
    	WHERE ' . $where;
    

    :)


Log in to reply