Get max ID



  • Our company just recently had a developer leave to head to greener pastures.  He had gotten his "Master of Science in Computer Information Systems - Database Management and Business Intelligence Concentration" and he felt he wasn't being paid what he was now worth. Shortly after he left we got into some of his code and found this function that was used to get the maximum ID from a table.  I'm not sure, but I think there's an easier way to do it.

    private long GetDocumentInfoDocID(string strDocTypeID)
            {
                long    nDocID  = 0;
                long    nRows   = 0;
                bool    bDateID = false;
                bool    bRows   = true;
                string  strSQL  = "";
                string  strMsg  = "";

                string strDocInfoFile = "DocumentInfo" + "_" + strDocTypeID;

                MySqlCommand dtCommand = null;
                MySqlDataReader dtReader = null;

                try
                {
                    strSQL = string.Format("select * from " +  strDocInfoFile);
                    dtCommand = dtConnection.CreateCommand();
                    dtCommand.CommandType = CommandType.Text;
                    dtCommand.CommandText = strSQL;

                    dtReader = dtCommand.ExecuteReader();

                    bRows = dtReader.HasRows;

                    while (bRows)
                    {
                        bDateID = true;
                        if (bDateID == true)
                        {
                            if (dtReader != null)
                            {
                                bRows = false;
                                dtReader.Close();
                            }
                            // Get the number of items in the current date
                            strSQL = string.Format("select * from " + strDocInfoFile);
                            dtCommand.CommandText = strSQL;
                            dtReader = dtCommand.ExecuteReader();
                            while (dtReader.Read())
                            {
                                nRows++;
                            }
                            nDocID = nRows;
                        }
                    }
                }
                catch (Exception ex)
                {
                    strMsg = string.Format("Error in reading DocumentInfoInfo table by function:GetDocumentInfoDocID, Exception: {0}", ex.Message);
                    LoggerInfo(strMsg);
                    throw new MyException(strMsg);
                }
                finally
                {
                    if (dtReader != null)
                        dtReader.Close();
                }
                return nDocID;
            }
     



  • @bobzilla said:

    string strDocInfoFile = "DocumentInfo" + "_" + strDocTypeID;


    Obvisouly he should've contracted the DocumentInfo and underscore...

    Shaves off a few cycles everytime you request the max ID


    </PurposelyObliviousToTheAtrocityInTotal>



  • Never heard of "select top 1", then?



  • @Mason Wheeler said:

    Never heard of "select top 1", then?

     

    ...or alternatively "SELECT MAX(IDField) FROM Table"



  • @MeesterTurner said:

    @Mason Wheeler said:

    Never heard of "select top 1", then?

     

    ...or alternatively "SELECT MAX(IDField) FROM Table"

     

    ...oh yeah.  I knew that one!  I blame cerebral flatulence.



  • Needing to know the max id is usually a sign of bad design and potential concurrency problems, especially since this function can be slow. He also made the test for bDateID rather superfluous, which makes me suspect that originally he tried to look through just the records with the current date to save time. That of course fails miserably, so the solution is (no doubt here) going through the entire table and counting the rows. So our two chief WTFs are: concurrency failure, slow speed, and a superfluous variable/test. No, our three chief WTFs are: concurrency failure, slow speed, a superfluous variable/test, and hoping that maxid == nr of rows. Our four WTFs are ...



  • Among the WTFs in this example are... I should just come in again.



  • @TGV said:

    Needing to know the max id is usually a sign of bad design and potential concurrency problems
     

    +1

     



  • @bobzilla said:

    Master of Science in Computer Information Systems - Database Management and
    Business Intelligence Concentration

    That's the whole problem right there...



  • @MeesterTurner said:

    @Mason Wheeler said:

    Never heard of "select top 1", then?

     

    ...or alternatively "SELECT MAX(IDField) FROM Table"

     

    If you don't specify an ORDER BY, select top 1 could (in theory at least) pull any one record from the table.

    In reality, this'll probably never happen, but you shouldn't rely on implementation details that aren't in the specification.



  • @blakeyrat said:

    @MeesterTurner said:

    @Mason Wheeler said:

    Never heard of "select top 1", then?

     

    ...or alternatively "SELECT MAX(IDField) FROM Table"

     

    If you don't specify an ORDER BY, select top 1 could (in theory at least) pull any one record from the table.

    In reality, this'll probably never happen, but you shouldn't rely on implementation details that aren't in the specification.

     

    Of course.  The full query was supposed to be something like

    select top 1 ID_COLUMN

    from  TABLE_NAME

    order by ID_COLUMN

     

    I just figured the rest of it was obvious enough to leave implicit. :P



  • This is one of the best posts in the sidebar in a while.

    I don't usually think about the word "atrocious", and yet it jumped right at me when I read that code.



  • @Mason Wheeler said:

    select top 1 ID_COLUMN

    from  TABLE_NAME

    order by ID_COLUMN

     

    That returns the min(ID_COLUMN). You gotta:

     

    @Mason Wheeler said:

    select top 1 ID_COLUMN

    from  TABLE_NAME

    order by ID_COLUMN desc


     Sorry, SQL nerd here.



  • Wow. Just, wow.

    @bobzilla said:

    strSQL = string.Format("select * from " +  strDocInfoFile);

    Way to use string.Format...and then concatenate and not use a format string at all.

    @bobzilla said:

    while (bRows)
                    {
                        bDateID = true;
                        if (bDateID == true)
                        {
                            if (dtReader != null)
                            {
                                bRows = false;
                                dtReader.Close();
                            }
                            // Get the number of items in the current date
                            strSQL = string.Format("select * from " + strDocInfoFile);
                            dtCommand.CommandText = strSQL;
                            dtReader = dtCommand.ExecuteReader();
                            while (dtReader.Read())
                            {
                                nRows++;
                            }
                            nDocID = nRows;
                        }
                    }

    So he instantiates a boolean to true, and then check if it's true...each time through the loop. I guess this guards against the value of "true" changing to "false" all of a sudden.

    And he instantiates a new SQL query (that's identical to the old one) and run it again. Once for each row returned. A SELECT * query, no less. Yikes.

    I guess we can give him credit for cleaning up with finally {}...



  • At the organisation where I work I encourage those with lesser SQL skills to sit in on my "Lunchtime Learnings" sessions where I go over issues like this one.

    Finding the number of rows in a table is a non-trivial problem but, after some consideration, I have discovered that the best way is to use a stored procedure that opens a cursor and then loops through each row incrementing a counter. Just return the counter and you're done!



  • In what datatabase is that stored procedure better than SELECT count(SomeIndexedColumn) FROM tablename ???

    By using the database's built-in count mechanism, you will be using all of the deeply-hidden optimisations possible for that table structure and by using an indexed column, you have a good chance of hitting something that already was counted for you and if it wasn't, the count will probably be cached for the next request.

    I would question the original system however - why use a different table (file?) for different document categories?  This suggests that it is using a file-based system and stored procedures don't exist.



  • @toth said:

    And he instantiates a new SQL query (that's identical to the old one) and run it again. Once for each row returned. A SELECT * query, no less. Yikes.

    No.  It runs the query once to check that there are rows and then it runs it again to actually count the rows.

    This is probably an artefact of the prematurely-optimised solution which probably ran the query for just the current day, checked if there were any rows and then ran the all-data query if the day came up empty.  Since it may not even be using a database on the back end, the query with the WHERE statement would probably have to inspect all rows and will take longer than just returning all the rows.



  • @blakeyrat said:

    @Mason Wheeler said:

    select top 1 ID_COLUMN from  TABLE_NAME order by ID_COLUMN

    That returns the min(ID_COLUMN). You gotta:

    @Mason Wheeler said:

    select top 1 ID_COLUMN from  TABLE_NAME order by ID_COLUMN desc

     Sorry, SQL nerd here.

    You claim to be an SQL nerd and yet didn't catch the nonexistent "select top 1" syntax? Or is this some MSSQL nonsense? Neither MySQL not PostgreSQL recognizes it, and I'm pretty sure it isn't in the standard either. The correct syntax would be SELECT id_column FROM table_name ORDER BY id_column DESC LIMIT 1. Alternatively, it might be that I'm just a grammar nazi.

    Of course, the correct way to solve the original problem would be to use a sequence for generating the ids (or an auto_increment field if your SQL server doesn't support sequences).


  • Discourse touched me in a no-no place

    @Qwerty said:

    In what datatabase is that stored procedure better than SELECT count(SomeIndexedColumn) FROM tablename ???
    They weren't serious. You do realise that, don't you?



  • Topc0der taught me everything I know.

    You sir, have been trolled.



  • @tdb said:

    You claim to be an SQL nerd and yet didn't catch the nonexistent "select top 1" syntax? Or is this some MSSQL nonsense?

    It's MSSQL's syntax.



  • @tdb said:

    You claim to be an SQL nerd and yet didn't catch the nonexistent "select top 1" syntax? Or is this some MSSQL nonsense? Neither MySQL not PostgreSQL recognizes it, and I'm pretty sure it isn't in the standard either. The correct syntax would be SELECT id_column FROM table_name ORDER BY id_column DESC LIMIT 1. Alternatively, it might be that I'm just a grammar nazi.

    Of course, the correct way to solve the original problem would be to use a sequence for generating the ids (or an auto_increment field if your SQL server doesn't support sequences).

     

    Ok, fine I'm a T-SQL nerd. And you're a pedant. I'd hardly call T-SQL "nonsense", at this point it's as well-established as any other flavor of the language, and it's a damn sight more convienent than stock SQL.

    And yes, obviously the real point here is that if you ever *need* this query, you're doing something wrong.

    Happy? Or are you going to nitpick some other meaningless detail now?



  • @Daniel15 said:

    @tdb said:
    You claim to be an SQL nerd and yet didn't catch the nonexistent "select top 1" syntax? Or is this some MSSQL nonsense?
    It's MSSQL's syntax.
     

    and just to show that it isn't just MSSQL...

    IBM's Informix system uses "select first 1 * " different keyword, but same syntax.



  • @toth said:

    So he instantiates a boolean to true, and then check if it's true...each time through the loop. I guess this guards against the value of "true" changing to "false" all of a sudden.
    It's to protect against glitching attacks on the memory bus. That's how PS3 was hacked, you know.



  •  @TGV said:

    Needing to know the max id is usually a sign of bad design and potential concurrency problems, especially since this function can be *slow*. He also made the test for bDateID rather superfluous, which makes me suspect that originally he tried to look through just the records with the current date to save time. That of course fails miserably, so the solution is (no doubt here) going through the entire table and counting the rows. So our two chief WTFs are: concurrency failure, slow speed, and a superfluous variable/test. No, our three chief WTFs are: concurrency failure, slow speed, a superfluous variable/test, and hoping that maxid == nr of rows. Our *four* WTFs are ...

    Amongst our WTFery are such elements as concurrency failure, slow speed... I'll come in again... Amongst our WTFery are such diverse elements as concurrency failure, slow speed,  a superfluous variable/test, and nice red pocket protectors.. OH DAMN!

     

     



  • @SQLDave said:

    Amongst our WTFery are such elements as concurrency failure, slow speed... I'll come in again... Amongst our WTFery are such diverse elements as concurrency failure, slow speed,  a superfluous variable/test, and nice red pocket protectors.. OH DAMN!

    I bow to your great knowledge of the Monty Python. Bring ... the comfy chair!



  • @TGV said:

    I bow to your great knowledge of the Monty Python. Bring ... the comfy chair!

    (cough) Shouldn't that be Fetch … the comfy chair?

    --
    BFN
    CAD (sometime member of CaRP: Campaign for Real Pedantry …)


Log in to reply