Non-Unique Identifiers



  •  Confession time: I figured out why a program I've been working on for a while would sometimes crash when trying to load a particularly large save file:

    setID((int) (Math.random() * Short.MAX_VALUE));

     Looking back, when I wrote that I had just spent a long time on the object hierarchy/properties system and didn't feel like writing the proper code to get a truly unique ID. Of course, I never met my goal of fixing it the next day...
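
For context on why this only bites on large saves: the cast leaves just 32,767 possible values (0 through 32,766), so collisions follow the birthday problem. The sketch below is not the original program's code. `IdSketch`, `collisionProbability`, and `nextId` are hypothetical names, and the counter is only one common alternative, assuming IDs need to be unique within a single running process.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class IdSketch {
    // Process-wide counter; every call to nextId() hands out a fresh value.
    private static final AtomicInteger NEXT = new AtomicInteger();

    // Probability that at least two of `draws` uniformly random IDs,
    // picked from `space` possible values, are equal (birthday problem).
    static double collisionProbability(int draws, int space) {
        double allDistinct = 1.0;
        for (int i = 1; i < draws; i++) {
            allDistinct *= 1.0 - (double) i / space;
        }
        return 1.0 - allDistinct;
    }

    // Sequential ID: unique until the int range is exhausted.
    static int nextId() {
        return NEXT.getAndIncrement();
    }

    public static void main(String[] args) {
        // With only 300 objects the chance of a duplicate is already about 75%.
        System.out.println(collisionProbability(300, Short.MAX_VALUE));
        System.out.println(nextId() + " " + nextId());
    }
}
```

If IDs are written to the save file, the counter would presumably have to be restored past the highest loaded ID on load; that part is left out here.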



  •  I think the problem is that Math.random() isn't random enough.  You need to make it randomer.



  • @joeyadams said:

     I think the problem is that Math.random() isn't random enough.  You need to make it randomer.

     

    Indeed. Clearly that is the problem.



  • @Z1_Jacob said:

    setID((int) (Math.random() * Long.MAX_VALUE));
    Fixed!

    And now, the Enterprisey version!!

    @Singleton
    public class EnterpriseRandomBeanFactory {
        public EnterpriseRandomBean newEnterpriseRandomBean() {
            return new EnterpriseRandomBean();
        }
    }
    public class EnterpriseRandomBean {
        Random random = new Random();
        public Random getRandom() {
            return random;
        }
        public void setRandom(Random random) {
            this.random = random;
        }
    }

    // later in the code:
    Context ctx = JNDIUtil.getContext();
    EnterpriseRandomBeanFactory erbf = (EnterpriseRandomBeanFactory)ctx.lookup("/hurp/durp/random");
    EnterpriseRandomBean erb = erbf.newEnterpriseRandomBean();
    int i = erb.getRandom().nextInt();
    short s = (short) i; // Truncated for legacy purposes
    setID((int) s);



  • I've seen worse!!! A colleague once had the "unique" id of new customer applications based on just the current time (not date and time, just time)



    It worked fine until about three weeks in, when for some unknown and "unreproducible-by-colleague" reason (at least, until I went into the code), applications started randomly causing primary key errors.
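
A minimal sketch of why a time-only key eventually collides. The one-second resolution, the dates, and the name `timeOnlyId` are assumptions for illustration; the post does not say how the colleague actually encoded the time.

```java
import java.time.LocalDateTime;

final class TimeOnlyIdSketch {
    // ID built from the time of day alone; the date is thrown away.
    static int timeOnlyId(LocalDateTime submitted) {
        return submitted.toLocalTime().toSecondOfDay();
    }

    public static void main(String[] args) {
        // Two applications submitted three weeks apart at the same time of day.
        LocalDateTime first = LocalDateTime.of(2011, 3, 1, 12, 30, 0);
        LocalDateTime second = LocalDateTime.of(2011, 3, 22, 12, 30, 0);
        // Both print 45000, so the second insert violates the primary key.
        System.out.println(timeOnlyId(first));
        System.out.println(timeOnlyId(second));
    }
}
```

At this resolution there are only 86,400 possible keys, so the failures depend on when people happen to submit, which would explain why they looked random.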



  • @MeesterTurner said:

    I've seen worse!!! A colleague once had the "unique" id of new customer applications based on just the current time (not date and time, just time)



    It worked fine until about three weeks in, when for some unknown and "unreproducible-by-colleague" reason (at least, until I went into the code), applications started randomly causing primary key errors.

    TRWTF is counting time in cycles, amirite? Why don't we call today 4133367237772346663228654?



  • @MeesterTurner said:

    (not date and time, just time)
     

    I'm assuming time as in "12:30" instead of "milliseconds/ticks since epoch"?



  • I've seen similar using a database. The "get random id" function was basically this:

    GetMultiplierValueFromDB()
    ID = Math.random() * Multiplier
    While FindIDInDB(ID) ID = Math.random() * Multiplier
    InsertIDInDB(ID)

    Of course, FindIDInDB() created a string like "SELECT * FROM RandomDB WHERE ID = "+ID, executed that query, and returned true if the number of rows was > 0.

    I only got involved because the user who wrote this had left and the application was getting noticeably slower.
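
One plausible contributor to the slowdown, sketched under assumptions: if the random value is truncated to an integer in [0, Multiplier), each attempt succeeds with probability (space - used) / space, so the expected number of SELECT queries per new ID grows as the table fills. `RetryCostSketch` and `expectedAttempts` are hypothetical names, not part of the application described.

```java
final class RetryCostSketch {
    // Expected number of tries before a uniformly random ID from `space`
    // possible values misses all `used` IDs already in the table.
    // This is the mean of a geometric distribution: 1 / p.
    static double expectedAttempts(long used, long space) {
        if (used < 0 || used >= space) {
            throw new IllegalArgumentException("need 0 <= used < space");
        }
        return (double) space / (space - used);
    }

    public static void main(String[] args) {
        long space = 1_000_000; // assumed Multiplier value
        for (long used : new long[] {0, 500_000, 900_000, 999_000}) {
            System.out.println(used + " rows -> " + expectedAttempts(used, space) + " queries");
        }
    }
}
```

Each of those attempts is a full query, so the cost per attempt also rises with table size if the ID column is not indexed; the post does not say whether it was.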



  • @Mole said:

    I've seen similar using a database. The "get random id" function was basically this:

    GetMultiplierValueFromDB()
    ID = Math.random() * Multiplier
    While FindIDInDB(ID) ID = Math.random() * Multiplier
    InsertIDInDB(ID)

    Of course, FindIDInDB() created a string like "SELECT * FROM RandomDB WHERE ID = "+ID, executed that query, and returned true if the number of rows was > 0.

    I only got involved because the user who wrote this had left and the application was getting noticeably slower.

    OMG!  I know someone else who did exactly that.  And when I pointed it out they said that they had read a MS article that claimed that the SQL Server B-Trees would become "unbalanced" if an identity was used.  This would slow the database down over time.  Of course this mumbo-jumbo convinced the "Powers That Be" that this guy's solution was the best.

    Anyway this was a project for the US Postal Service to track container shipments during the Christmas rush season... With a bad acronym: CNTS


  • @Auction_God said:

    Anyway this was a project for the US Postal Service to track container shipments during the Christmas rush season... With a bad acronym: CNTS
    Why only during the Christmas rush? Do they not use container shipments during the normal season? Why couldn't their normal system handle intermodal shit in the first place?



  • @Weng said:

    @Auction_God said:

    Anyway this was a project for the US Postal Service to track container shipments during the Christmas rush season... With a bad acronym: CNTS
    Why only during the Christmas rush? Do they not use container shipments during the normal season? Why couldn't their normal system handle intermodal shit in the first place?


    The simple answer: Because it is the USPS...
    The longer answer:
    During most months, the USPS has enough transportation capacity on their own airplanes/trucks to handle the mail volume. During the month of December they bring up a command center (full-screen monitors, etc.) and they hire whatever additional space they can on commercial air flights, or even rent additional planes, train cars, and private truckers to carry all the extra volume. If I remember correctly, almost 50% of the USPS volume occurs in late November and December.
    They just don't need the system during the rest of the year.

    Let me tell you that the deadlines on this project were pretty much set in stone :) Nobody would move the Christmas holidays because we didn't get something done.



  • I have an application in which the ID of an object is its location in memory.  Hacky, but I leave the task of making a unique ID to the kernel.  And no, it's not like a pointer; the language doesn't even support pointers.

     

    (Luckily the information isn't ever stored in a database.  That would be terrible.)



  • @Auction_God said:

    @Mole said:

    I've seen similar using a database. The "get random id" function was basically this:

    GetMultiplierValueFromDB()
    ID = Math.random() * Multiplier
    While FindIDInDB(ID) ID = Math.random() * Multiplier
    InsertIDInDB(ID)

    Of course, FindIDInDB() created a string like "SELECT * FROM RandomDB WHERE ID = "+ID, executed that query, and returned true if the number of rows was > 0.

    I only got involved because the user who wrote this had left and the application was getting noticeably slower.

    OMG!  I know someone else who did exactly that.  And when I pointed it out they said that they had read a MS article that claimed that the SQL Server B-Trees would become "unbalanced" if an identity was used.  This would slow the database down over time.  Of course this mumbo-jumbo convinced the "Powers That Be" that this guy's solution was the best.

    Anyway this was a project for the US Postal Service to track container shipments during the Christmas rush season... With a bad acronym: CNTS

    In older versions of Sybase (which I have had the extreme misfortune to work on), huge "blocks" of identity column values were pre-allocated on server startup. This caused issues if the Sybase process ever went down or had to be killed, because the block of IDs allocated in that session would be marked as used and never become available again. If you were using integers as your identity column, this meant you would not only have massive gaps between your IDs, on the order of millions, but would also eventually (or quickly, depending on how often the Sybase process was restarted) run out of values for your identity columns.

    To solve this, we didn't use identity columns. Instead we had a table called "CONTROL_NUMBERS" which had 2 columns, the name of a table and the integer value of its highest current primary key. When you wanted to insert a new row into a table, you called a proc that checked CONTROL_NUMBERS, got the highest ID for the table you were inserting, added 1 to the result, then updated CONTROL_NUMBERS with that value and returned the result. Fun times!
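
The CONTROL_NUMBERS scheme above can be sketched in memory as follows. This is an illustration, not the actual Sybase proc: `ControlNumbersSketch` and `nextId` are hypothetical names, a map stands in for the table, and a table with no entry is assumed to start at a highest key of 0.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class ControlNumbersSketch {
    // Stand-in for CONTROL_NUMBERS: table name -> highest primary key issued.
    private final Map<String, Integer> highestKey = new ConcurrentHashMap<>();

    // Read the current highest key, add 1, store it, and return it.
    // ConcurrentHashMap.merge does all three steps atomically per key, which
    // is the guarantee the real proc would need from a transaction or lock.
    int nextId(String tableName) {
        return highestKey.merge(tableName, 1, Integer::sum);
    }

    public static void main(String[] args) {
        ControlNumbersSketch control = new ControlNumbersSketch();
        System.out.println(control.nextId("CUSTOMER")); // 1
        System.out.println(control.nextId("CUSTOMER")); // 2
        System.out.println(control.nextId("ORDERS"));   // 1
    }
}
```

Without that atomicity, two callers could read the same highest value and both return the same "new" ID, which is the usual hazard with a hand-rolled counter table.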

