
In “Let’s talk about usernames”, James Bennett (author of django-registration) digs deeper into something that seems simple at first: usernames, and how to keep them safe and unique.
And no, you can’t get away with just a simple comparison. You’ll have to think of more than that if you want to do it well:
- Casing: John_Doe vs. JOHN_DOE
- Homographs: а (U+0430 CYRILLIC SMALL LETTER A) vs. a (U+0061 LATIN SMALL LETTER A)
- Reserved words (especially when used in e-mail addresses and (sub)domains): Think of admin and hostmaster
- Reserved words (especially when used in URLs): Think of login, register, and even keybase.txt
- Confusables: paypal vs. paypa1
- Misspellings
- Non-ASCII Characters: ç vs. c
- …
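A few of the checks in the list above can be sketched in JavaScript. This is a minimal illustration, not the approach from the article: the names canonicalUsername, isReserved, mixesLatinAndCyrillic and the RESERVED set are hypothetical, the reserved list contains only the examples mentioned above, and toLowerCase() is only an approximation of proper Unicode case folding.

```javascript
// Hypothetical reserved list, seeded only with the examples from the list above.
// A real list would be much longer.
const RESERVED = new Set(["admin", "hostmaster", "login", "register", "keybase.txt"]);

// Hypothetical helper: build a canonical form used ONLY for comparisons
// (uniqueness and reserved-word checks), not for display.
// NFKC folds compatibility variants; toLowerCase() approximates case folding.
function canonicalUsername(name) {
  return name.normalize("NFKC").toLowerCase();
}

// True when the canonical form collides with a reserved word.
function isReserved(name) {
  return RESERVED.has(canonicalUsername(name));
}

// Very rough homograph heuristic (an assumption, not a complete defence):
// flag names that mix Latin and Cyrillic letters.
function mixesLatinAndCyrillic(name) {
  return /\p{Script=Latin}/u.test(name) && /\p{Script=Cyrillic}/u.test(name);
}

console.log(canonicalUsername("John_Doe") === canonicalUsername("JOHN_DOE")); // true
console.log(isReserved("Admin")); // true
console.log(mixesLatinAndCyrillic("p\u0430ypal")); // true (contains U+0430)
```

Note that none of this catches confusables such as paypal vs. paypa1, or misspellings; those need a separate mapping or review step.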
To tackle the Non-ASCII Characters in JavaScript, Lea Verou recently tweeted this nice trick, using String.prototype.normalize:
TIL you can convert letters to their non-accented version by just doing:
str.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
Whoa. No more 1KB lookup tables! Mind = blown! And it’s supported everywhere! https://t.co/fukRLsu3VV
— Lea Verou (@LeaVerou) November 26, 2017
'céçile'.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
// ~> cecile
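Why this works: NFD splits a precomposed letter such as é into a base letter plus a combining mark (e + U+0301), and the regular expression then removes everything in the Combining Diacritical Marks block (U+0300 to U+036F). Wrapped in a small helper (the name stripDiacritics is made up for this sketch), it looks like this:

```javascript
// Hypothetical helper wrapping the one-liner from the tweet.
function stripDiacritics(str) {
  return str
    .normalize("NFD")                 // decompose: "é" becomes "e" + U+0301
    .replace(/[\u0300-\u036f]/g, ""); // drop the combining diacritical marks
}

console.log("\u00e9".normalize("NFD").length); // 2 (base letter + combining mark)
console.log(stripDiacritics("céçile"));        // cecile

// Letters without a canonical decomposition are left untouched.
console.log(stripDiacritics("Ørsted"));        // Ørsted
```

So the trick only covers letters that decompose into a base letter plus marks; characters like ø or ł would still need a lookup table.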