The problem with usernames

In “Let’s talk about usernames”, James Bennett, the author of django-registration, digs into something that seems simple at first: usernames, and how to keep them safe and unique.

And no, a simple string comparison won’t cut it. To do it well, you’ll have to account for more than that:

  • Casing: John_Doe vs. JOHN_DOE
  • Homographs: а (U+0430 CYRILLIC SMALL LETTER A) vs. a (U+0061 LATIN SMALL LETTER A)
  • Reserved words (especially when used in e-mail addresses and (sub)domains): Think of admin and hostmaster
  • Reserved words (especially when used in URLs): Think of login, register, and even keybase.txt
  • Confusables: paypal vs. paypa1
  • Misspellings
  • Non-ASCII Characters: ç vs. c
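To make a few of these points concrete, here is a minimal sketch in JavaScript. It is not taken from Bennett’s article or from django-registration: the helper names (canonicalUsername, isAvailable, mixesLatinAndCyrillic) and the reserved list are made up for illustration, and toLowerCase() only approximates proper Unicode case folding.

```javascript
// Hypothetical helper: reduce a username to a canonical key that is used
// only for uniqueness checks; the name as typed can still be stored for display.
function canonicalUsername(name) {
  // NFKC folds compatibility variants (e.g. fullwidth letters) into plain forms.
  // toLowerCase() is an approximation of Unicode case folding, not the real thing.
  return name.normalize("NFKC").toLowerCase();
}

// Hypothetical reserved list, built from the examples in the list above.
const RESERVED = new Set(["admin", "hostmaster", "login", "register", "keybase.txt"]);

// A name is available when its canonical key is neither reserved nor taken.
function isAvailable(name, takenCanonicalNames) {
  const key = canonicalUsername(name);
  return !RESERVED.has(key) && !takenCanonicalNames.has(key);
}

// Rough homograph heuristic (an assumption, far from a full confusables check):
// flag names that contain both Latin and Cyrillic letters.
function mixesLatinAndCyrillic(name) {
  return /\p{Script=Latin}/u.test(name) && /\p{Script=Cyrillic}/u.test(name);
}

const taken = new Set([canonicalUsername("John_Doe")]);
console.log(isAvailable("JOHN_DOE", taken));          // false: differs only in casing
console.log(isAvailable("Admin", taken));             // false: reserved word
console.log(isAvailable("jane_doe", taken));          // true
console.log(mixesLatinAndCyrillic("p\u0430ypal"));    // true: contains U+0430
```

Note that this does nothing for look-alikes within a single script (paypal vs. paypa1) or for misspellings; those need a separate check.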

Let’s talk about usernames →

(via Matthew)

To tackle the non-ASCII characters in JavaScript, Lea Verou recently tweeted this nice trick, using String.prototype.normalize:

'céçile'.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
// ~> cecile
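
Wrapped in a small function (the name stripDiacritics is mine, not from the tweet), the same trick can be reused when comparing names. A sketch, with one caveat: it only removes combining marks in the U+0300 to U+036F range, so letters without a canonical decomposition, such as ø, pass through unchanged.

```javascript
// Hypothetical wrapper around the normalize-and-strip trick shown above.
function stripDiacritics(text) {
  // NFD splits "ç" into "c" + U+0327; the regex then drops the combining mark.
  return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

console.log(stripDiacritics("céçile"));     // cecile
console.log(stripDiacritics("ç") === "c");  // true
console.log(stripDiacritics("ø") === "ø");  // true: ø has no decomposition
```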