Regexp for invalid characters, unicode
Posted: Fri Jan 18, 2008 3:50 am
Hey everyone! Is it just me, or did the forums just get a facelift? Looking great!
I'm having a bit of a headache at the moment, due to trying to wrap my head around Unicode (UTF-8) instead of using the old ISO-8859-1, and using regular expressions to support a wide array of languages.
The project I'm working on is a World of Warcraft "fansite" (a new community site for guilds), where users will be able to make lists of their characters, join and organize guilds, and more.
My head starts to hurt when I think about all the different character names that will pop up. As far as I can tell, the only things prohibited in character names are special characters (!@'"\/& etc.), numbers and whitespace. After going through the crash course posted here, I thought I had come up with the solution: a character class built from \W, \d and \s (the exact pattern is in the code box at the end of this post), meant to match any non-alphanumeric characters, numbers and whitespace, but without success. Also, I'm not sure how "\W" will treat Unicode characters like "é, ü, û", or names like "Meèn'ame" and so on.
Can anyone tell me what I'm doing wrong, and give me a few pointers to get me back on track?
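To make the question concrete, here is a rough sketch of the kind of check I have in mind. It is written in Python purely for illustration, because Python 3's `re` module treats `\w` and `\W` as Unicode-aware for text patterns; other regex engines may need a Unicode mode switched on to behave the same way. The names `INVALID_CHAR`, `find_invalid_chars` and `is_valid_name` are made up for this example, and the rule "letters only" is my assumption about the naming policy, not something confirmed by the game.

```python
import re
import unicodedata

# Assumed rule (from the post, not an official policy): a name may contain
# letters only. \W catches punctuation, symbols and whitespace; \d and _ are
# added because \w would otherwise let digits and underscores through.
INVALID_CHAR = re.compile(r"[\W\d_]")


def find_invalid_chars(name):
    """Return every character in `name` that the assumed rule rejects."""
    # Normalize to NFC so that "e" + combining accent becomes a single
    # precomposed letter; a bare combining mark is not a word character.
    normalized = unicodedata.normalize("NFC", name)
    return INVALID_CHAR.findall(normalized)


def is_valid_name(name):
    """True if `name` is non-empty and contains no rejected character."""
    return bool(name) and not find_invalid_chars(name)
```

With this sketch, accented letters such as "é, ü, û" are accepted as letters, while the apostrophe in "Meèn'ame" is reported as invalid. Note that the pattern searches for a single offending character anywhere in the name rather than anchoring the whole string.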
Code:
"\^[\W\d\s]$\"