I've touched information_schema in its naughty place


  • Trolleybus Mechanic

    I feel dirty for having done this, but it was the best I could do with what I was given.

    There's this table, you see-- and it has, umm-- 100+ columns. It tracks a very large web form, so most of those fields I can agree with. But then comes the repeats. And I'm not just talking about the five repeats of contact info (parent1_address, parent2_city, parent3_phone...). I can live with those. It's the damned "action" columns.

    The form will be acted upon by system users several times, and the user_id of the user is recorded each time. When the form is created, saved, submitted, checked by a manager, added to a waitlist, and so forth. There's about a dozen actions.

    And really, I didn't care about it because it was someone else's headache. I work on a different part of the system. Until now. I just got the requirement for the user module. "A user cannot do certain things in the system if they have ever taken an action on any form."

    So I have to check if ANY of the dozen (and growing) insert_user_id, update_user_id, [anything]_user_id fields contain their user ID.  Had the system been designed the way I reccomended it, with an action audit table, this would be trivial. 

    IF EXISTS (SELECT 1 FROM audit WHERE user_id = @user_id).

    But not anymore. So I thought maybe I could get away with hardcoding the columns:

    IF EXISTS (SELECT 1 FROM bigass_table WHERE insert_user_id = @user_id OR update_user_id = @user_id OR approve_user_id = @user_id...).

    But no, becuase like I said, there's a dozen of them, and the number is still growing as the form is developed-- and more will be added on based on client requests long after this requirement is implemented and forgotten about. So I did the only thing I knew how to do: abuse information_schema.columns with a dildo made of dynamic SQL:

    CREATE PROCEDURE bigass_form_user_HasEntry
        @user_id int
    AS
    BEGIN

    DECLARE @csv as table(id int identity, column_name varchar(max))
    DECLARE @count as int
    DECLARE @max as int
    DECLARE @sql as varchar(max)
    DECLARE @column_name as varchar(max)

    INSERT INTO @csv(column_name)
    select column_name from information_schema.columns WHERE table_name='bigass_form' and column_name like '%user_id%'
    SELECT @max = @@rowcount

    SET @count = 1

    WHILE @count <= @max
    BEGIN
        SET @column_name = (SELECT column_name FROM @csv WHERE id=@count)
        SET @sql = CASE WHEN @sql is null THEN '' ELSE @sql + ' OR ' END + @column_name + '=' + CASE WHEN @user_id IS NULL THEN @column_name ELSE CAST(@user_id as varchar) END
        SET @count = @count + 1
    END

    IF @sql is null
    BEGIN
        SET @sql = '1=0'
    END

    SET @sql = 'SELECT ' + CAST(@user_id as varchar) + ' as [user_id], CASE WHEN SUM(bigass_form_id) > 0 THEN 1 ELSE 0 END as has_entry FROM bigass_form WHERE ' + @sql


    print @sql
    exec(@sql)

    END

    To whoever comes across this five years down the road, I'm sorry. I'll document it for you, and buy you a beer if I ever meet you.



  • If it's not MongoDB, it's the wrongo DB.



  • ALTER TABLE bigass_form ADD [null = null; drop table bigass_form; -- user_id] int


  • Garbage Person

     And here I thought the one particularly paranoid "verify the database schema" procedure I'd written once was flagrant aabuse of information_schema.

     

    (After about 2 weeks of dealing with trying tot keep that procedure updated, it got nooped out)



  • @Lorne Kates said:

    and more will be added on based on client requests long after this requirement is implemented and forgotten about

    I think you just answered your own question here...  Long after you implement this: are you REALLY planning on staying at same dev shop long after you implement this?  Besides, how many times have you performed a full head-desk because of someone elses garbage code.  Fuck 'em.

    Create the required write-only documentation outlining what, where, why, etc. and be done with it.



  • Apparently, there are still idiots out there who think building a schema that only allows an action to be performed once on a record is a good idea. I've never understood the habit of putting "modified by" in a column in the table instead of creating an audit table. A workflow actions table would have been the right answer in your case and would have avoided all of your dynamic SQL.



    BTW, had I been given this assignment, I would have written the evil WHERE clause instead of using information schema. Besides the slight risk of a bizarre bug due to future unexpected naming, the dynamic SQL route could have negative security implications.



  • @Jaime said:

    Apparently, there are still idiots out there who think building a schema that only allows an action to be performed once on a record is a good idea. I've never understood the habit of putting "modified by" in a column in the table instead of creating an audit table. A workflow actions table would have been the right answer in your case and would have avoided all of your dynamic SQL.

    Because it's standard industrial best practices that's what we've always done here we have to conform to established patterns we have more experience than you whitepaper says so stop questioning me! You know, same as the reason why we don't use more than one database user...


  • Trolleybus Mechanic

    @Xyro said:

    You know, same as the reason why we don't use more than one database user...
     

    If we used more than one database user, we'd have to hardcode a second password in the config file.


Log in to reply