Constraints on usernames?


  • ♿

    What limits would you put on usernames for a forum? So far, I have "can't be empty" and "can't contain control characters".


  • SockDev

    I'd say anything printable is OK, though possibly strip (certain) whitespace characters
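The rules suggested so far (not empty, no control characters, strip surrounding whitespace) could be sketched roughly as below. The function name `clean_username` is made up for illustration, and treating only Unicode category `Cc` as "control characters" is an assumption about what the original poster meant.

```python
import unicodedata

def clean_username(raw: str) -> str:
    """Return the trimmed username, or raise ValueError if it breaks a rule."""
    # str.strip() with no arguments removes leading and trailing whitespace.
    name = raw.strip()
    if not name:
        raise ValueError("username is empty")
    for ch in name:
        # Category "Cc" covers control characters such as "\x00", "\t" and "\n".
        if unicodedata.category(ch) == "Cc":
            raise ValueError(f"control character U+{ord(ch):04X} is not allowed")
    return name
```

Which whitespace to strip, and whether to reject other invisible characters, is still a policy decision this sketch leaves open.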



  • Ban Dragonball Z character names.



  • User[0-9]+


  • ♿

    Wouldn't that mean User001 and User01 could be different people?



  • @ben_lubar said:

    for a forum

    Depends on the intended users. I could see wanting to limit Unicode shenanigans if most of your users are non-technical. Or limiting it to characters common on a basic US keyboard if you are only targeting English speakers. It all depends on who you are trying to get on your forum (note: the system used to run forums as a general case would need to support much more than what a specific forum probably wants).



  • Yes. AS_DESIGNED. Use [Uu][Ss][Ee][Rr][0-9]+!* instead for added fun and confusion.

    Filed Under: When everyone's special, then no one is special
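For anyone who wants to see the two patterns in action, here is a quick check. The sample names are invented; the patterns are the ones from the posts above.

```python
import re

strict = re.compile(r"User[0-9]+")
loose = re.compile(r"[Uu][Ss][Ee][Rr][0-9]+!*")

names = ["User001", "User01", "uSeR01!!"]

# fullmatch requires the whole name to fit the pattern.
strict_matches = [n for n in names if strict.fullmatch(n)]
loose_matches = [n for n in names if loose.fullmatch(n)]

# The names are compared as strings, so leading zeros make them distinct
# even though the numeric parts are equal.
distinct = "User001" != "User01" and int("001") == int("01")
```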



  • If you have a 1-character minimum length, the shorter names will become more sought-after as more people join. Soon there will be pressure to sell the 1-character names of people who haven't posted in a while.

    My recommendation: ask at least $2500.


  • ♿

    Should I do Unicode normalization on usernames?
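A small illustration of why the question matters: the same visible name can be spelled with different code points. This sketch uses Python's standard `unicodedata` module; choosing NFC for storage and NFKC for comparison is only one possible policy, not a recommendation from the thread.

```python
import unicodedata

composed = "caf\u00e9"     # "é" as a single code point
decomposed = "cafe\u0301"  # "e" followed by a combining acute accent

def normalize_username(name: str) -> str:
    # NFC composes sequences into single code points where possible.
    return unicodedata.normalize("NFC", name)

def compatibility_form(name: str) -> str:
    # NFKC additionally folds compatibility characters, e.g. the "ﬁ" ligature.
    return unicodedata.normalize("NFKC", name)

same_after_nfc = normalize_username(composed) == normalize_username(decomposed)
ligature_folded = compatibility_form("\ufb01le")
```

Without normalization, `composed` and `decomposed` would register as two different usernames that look identical on screen.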


  • ♿

    If usernames are case insensitive, what do I lowercase the letter I to?



  • What layer are you making them insensitive at, and why?


  • ♿

    Okay, case-sensitive usernames it is.



  • You might want to make the authentication form insensitive, for example. But insert the user data how they give it to you.
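One way to read this suggestion: keep the name exactly as typed for display, and derive a separate folded key that is used only for lookups. The sketch below uses `str.casefold()`, which is locale-independent, so it sidesteps the "what do I lowercase I to?" question by never applying Turkish-specific rules. The `accounts` dict and function names are stand-ins for whatever storage is really used.

```python
accounts = {}  # folded lookup key -> account record

def lookup_key(name: str) -> str:
    # casefold() is a more aggressive, locale-independent lower():
    # "I" always becomes "i", and "ß" becomes "ss".
    return name.casefold()

def register(name: str) -> None:
    key = lookup_key(name)
    if key in accounts:
        raise ValueError("username already taken")
    # Store the name exactly as the user typed it.
    accounts[key] = {"display_name": name}

def find(name: str):
    # Returns the account record, or None if there is no such user.
    return accounts.get(lookup_key(name))
```

With this, logging in as `ben_lubar` finds the account registered as `Ben_Lubar`, and posts still show the original capitalization.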



  • Again, this depends on the target market. Many people have problems with case-sensitivity in different contexts (which also ties in with @Captain's "what layer" question).


  • ♿

    Well, it's the application layer, since I can't do it in the user layer and doing business logic in the database layer is a Bad Idea.


  • ♿

    I'm thinking of using the usernames as primary keys in a NoSQL database, since all the keys in NoSQL databases are strings anyway.



  • @Captain said:

    You might want to make the authentication form insensitive, for example. But insert the user data how they give it to you.

    This is the best way.



  • @ben_lubar said:

    I'm thinking of using the usernames as primary keys in a NoSQL database, since all the keys in NoSQL databases are strings anyway.

    Don't do that.

    A name isn't a person, a name is a property of a person. A person can have multiple names. It's bad data modeling to use it as a primary key.


  • ♿

    A username is a user. It doesn't make sense to have multiple accounts with the same username.



  • @ben_lubar said:

    A username is a user.

    No; it's a property of a user. It can be changed. You could potentially have multiples, as I said before. You could potentially have NONE, in the case where you let people interact with the system before creating an account.

    Now you may personally not support all these uses of the field, but that doesn't change the fact that using a username as a primary key is a bad idea.


  • ♿

    If they don't have an account, there won't be a record in the database for their account. If they have multiple accounts, they each have a different username or they wouldn't be able to log in with them.

    What do you propose for the case where someone tries to register an account that already exists? Allow it and have two accounts that only differ in registration date?


  • ♿

    @ben_lubar said:

    A username is a user. It doesn't make sense to have multiple accounts with the same username.

    But you might want them to change their username. But then again, you're using NoSQL, so best DB practice has already been tossed out the window.



  • No, @ben_lubar, you have another table that joins username to userid. That way you can tie multiple names to a single user (think merging two accounts together). Of course, differentiating which username to tie to posts is going to be messy.



  • He isn't saying usernames can't be unique. He's saying using it as a primary key is a bad idea, for reasons. For example, if you change a user's username, you have to cascade the change (which is a big, error prone pain in the ass).
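To make the cascade point concrete, here is a minimal sketch using an in-memory SQLite database. The table and column names are invented for the example. Posts reference a generated `id`, so a rename is a single-row update and nothing else has to be rewritten.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE users (
        id       INTEGER PRIMARY KEY,
        username TEXT NOT NULL UNIQUE
    );
    CREATE TABLE posts (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        body    TEXT NOT NULL
    );
""")

user_id = db.execute(
    "INSERT INTO users (username) VALUES (?)", ("oldusername",)
).lastrowid
db.execute(
    "INSERT INTO posts (user_id, body) VALUES (?, ?)", (user_id, "first post")
)

# Rename: one row in one table changes; the posts table is untouched.
db.execute("UPDATE users SET username = ? WHERE id = ?", ("newusername", user_id))

author = db.execute(
    "SELECT u.username FROM posts p JOIN users u ON u.id = p.user_id"
).fetchone()[0]
```

Had `posts` stored the username itself, the same rename would need an update in every table that mentions the user.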



  • @ben_lubar said:

    If they don't have an account, there won't be a record in the database for their account.

    That's not a given. Maybe you want to let the user "claim" posts they made anonymously in the past. Those posts have to live somewhere in the DB, but have no user name attached to them.

    @ben_lubar said:

    If they have multiple accounts, they each have a different username or they wouldn't be able to log in with them.

    A lot of forums have a concept of "moderator hat", where a user who is also a moderator is "wearing the moderator hat" when he's posting solely to perform some sort of moderation action. You could implement that in your database by literally having the "moderator hat" be a different user account with its own username.

    Or you might want to introduce the concept of impersonation, where one user can act as if they were a second user. Very helpful for debugging, especially debugging permissions issues. In this case, the user would also have two usernames.


  • ♿

    If you change a user's username, you break everything that links to that user anyway, which is why I'm disallowing changes of usernames.



  • @ben_lubar said:

    If you change a user's username, you break everything that links to that user anyway, which is why I'm disallowing changes of usernames.

    You are a Nazi.

    Don't do this. You're being an ass for no reason. Bad, bad, bad, bad, bad idea.


  • SockDev

    @blakeyrat said:

    Now you may personally not support all these uses of the field, but that doesn't change the fact that using a username as a primary key is a bad idea.

    Using a string as a PK is a bad idea yes, but why would having a UNIQUE constraint on the username field be bad?


  • ♿

    @ben_lubar said:

    If you change a user's username, you break everything that links to that user anyway, which is why I'm disallowing changes of usernames.

    Unless you didn't use the username as the PK. Natural keys will seduce you into trouble. Stay away from them.


  • ♿

    @RaceProUK said:

    Using a string as a PK is a bad idea yes

    No one said that.



  • @RaceProUK said:

    but why would having a UNIQUE constraint on the username field be bad?

    Who said it would? Not me.

    EDIT: wait, are you seriously unaware that you can have a UNIQUE constraint on a column that isn't a primary key?


  • SockDev

    @boomzilla said:

    No one said that.

    Never said they did; I was agreeing with @blakeyrat

    @blakeyrat said:

    Who said it would? Not me.

    That's OK then :smile:


  • ♿

    @RaceProUK said:

    Never said they did; I was agreeing with @blakeyrat

    Well, you said it, and you're wrong. There's nothing wrong with string PKs.


  • ♿

    @oldusername said:

    Hey, check out my profile here [link to /u/oldusername]

    ?



  • I think it's different usage. We're talking about it in terms of how databases implement keys, whereas the theory calls unique properties "primary keys".


  • SockDev

    @boomzilla said:

    Well, you said it, and you're wrong. There's nothing wrong with string PKs.

    Sure though, from an efficiency standpoint, integers are better?


  • ♿

    Oh, URL links. Well, presumably they know they changed their username. If your goal is to prevent broken links on teh intertubez, your best strategy is not to make any links.


  • ♿

    I'm not sure why looking up some 8-byte block of serialized data is harder or easier than looking up any other byte sequence in a b-tree.


  • ♿

    @RaceProUK said:

    Sure though, from an efficiency standpoint, integers are better?

    I doubt anyone would notice, unless your DB is really awful.



  • @Captain said:

    I think it's different usage. We're talking about it in terms of how databases implement keys, whereas the theory calls unique properties "primary keys".

    Whose theory?

    Look, primary keys by necessity have to be marked UNIQUE. But you can also have UNIQUE constraints on non-keys, even in some NoSQL systems (I know MongoDB supports it, for example.)

    @RaceProUK said:

    Sure though, from an efficiency standpoint, integers are better?

    He's using NoSQL, so integers are out the window. GUIDs work fine though.
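The point about UNIQUE constraints on non-key columns also answers the earlier "what if someone registers a name that already exists?" question: the duplicate is simply rejected. A sketch in SQLite (used here only because it ships with Python; the schema and the `register` helper are made up), with a generated GUID as the key:

```python
import sqlite3
import uuid

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE users (
        id       TEXT PRIMARY KEY,      -- generated GUID, never shown to users
        username TEXT NOT NULL UNIQUE   -- unique, but not the key
    )
""")

def register(username: str) -> str:
    """Insert a new user and return the generated id."""
    user_id = str(uuid.uuid4())
    try:
        db.execute(
            "INSERT INTO users (id, username) VALUES (?, ?)", (user_id, username)
        )
    except sqlite3.IntegrityError:
        # The UNIQUE constraint on username rejected the duplicate.
        raise ValueError(f"username {username!r} is already taken") from None
    return user_id
```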



  • Because 8 bytes doesn't get you much of a string? But really they aren't bad, just be careful how you use them.


  • ♿

    Why does {1F7CB49D-FDA6-4C6D-B7E1-968E63A1FF6D} identify a user better than yoloswaglord420?


  • SockDev

    @blakeyrat said:

    He's using NoSQL, so integers are out the window.

    O…K… that's just… odd.

    @blakeyrat said:

    GUIDs work fine though.

    That would have been my next suggestion :smile:



  • @ben_lubar said:

    Why does {1F7CB49D-FDA6-4C6D-B7E1-968E63A1FF6D} identify a user better than yoloswaglord420?

    Well it's a lot more dignified, for one thing.

    Look, I'm of the school that keys should always just be keys. With very very very few exceptions.

    (For example, if the USPS defines 2-letter State codes, and they never change, and you'll never need to use State codes from a different country, then maybe-- MAYBE-- MMMMMAAAAAYYYYYBBBBBEEEEE-- I'd consider using the 2-letter State code as a primary key. I still wouldn't though.)

    Your database doesn't store a person, it stores data related to a person. There is no string of numerals or characters that can uniquely identify a person. Does not exist. (Maybe if you had their full DNA sequence.) Therefore, the only alternative is to use an artificially-generated key. Like you should be using anyway.

    As an added bonus, you get the potential for implementing all the features I just mentioned.


  • ♿

    I can use any data I want in the actual document, but the key is a byte sequence.


  • SockDev

    @ben_lubar said:

    Why does {1F7CB49D-FDA6-4C6D-B7E1-968E63A1FF6D} identify a user better than yoloswaglord420?

    {1F7CB49D-FDA6-4C6D-B7E1-968E63A1FF6D} is permanent, yoloswaglord420 could change. Sometimes, usernames change; do you really want to crawl the DB, fixing all your references?
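In a key-value store where the key is just a byte sequence, the same idea might look like the sketch below. Plain dicts stand in for the real store, and every name here is hypothetical. The document key is a random UUID's 16 bytes, and a second map from username to key gives both lookup by name and uniqueness.

```python
import uuid

users = {}        # key: 16-byte UUID, value: user document
by_username = {}  # secondary index: username -> user key

def create_user(username: str) -> bytes:
    if username in by_username:
        raise ValueError("username already taken")
    key = uuid.uuid4().bytes
    users[key] = {"username": username}
    by_username[username] = key
    return key

def rename_user(key: bytes, new_username: str) -> None:
    if new_username in by_username:
        raise ValueError("username already taken")
    # Only the user document and the index change; anything that stored
    # the key (posts, permissions, ...) is unaffected.
    del by_username[users[key]["username"]]
    users[key]["username"] = new_username
    by_username[new_username] = key
```

A post that stores `key` as its author keeps pointing at the right user after any number of renames. In a real store the two writes would need to be made atomic somehow, which this sketch ignores.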


  • ♿

    UUIDs are handy because you can generate them easily. But they're an important bit of infrastructure that no one really needs to look at.



  • Um, Codd's.

    I'm not disagreeing with you. You, me, and others are talking about primary keys in terms of the fact that their columns are marked as primary keys. Other people talk about primary keys in terms of the properties of the data.

    A primary key uniquely specifies a tuple within a table. In order for an attribute to be a good primary key it must not repeat. ...


  • ♿

    @Captain said:

    OtherBad people

    Yuck.


  • ♿

    We've already shown that it's impossible to change all references to an old username, so the old username has to stay. If we put a field in the user data that says "this user was renamed/merged, look over here instead", that would make sense, and as a bonus you can actually determine what username a user had when they made a post.
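The "look over here instead" field could work roughly like this. The `renamed_to` field name and the sample records are assumptions for illustration; the lookup follows the chain of renames until it reaches a record that has not been renamed.

```python
users = {
    "oldusername": {"renamed_to": "newusername"},
    "newusername": {"renamed_to": None},
}

def resolve(username: str, records: dict) -> str:
    """Follow rename pointers and return the current username."""
    seen = set()
    while True:
        if username in seen:
            # Guard against bad data such as a -> b -> a.
            raise ValueError("rename cycle detected")
        seen.add(username)
        target = records[username].get("renamed_to")  # KeyError if unknown user
        if target is None:
            return username
        username = target
```

Old links to `/u/oldusername` can then redirect to whatever `resolve` returns, and old posts still record the name that was in use when they were written.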

