Strange results when parsing accented chars
Posted: Fri Feb 17, 2006 6:32 pm
Can anyone tell me why the following matches 'fiancé' as 'fianc' (wrong) and 'idée' as 'idée' (correct) in PHP?
Bizarrely, it works perfectly in RegexBuddy!!!
I have tried using both ascii and unicode alternatives for the accented chars - no difference.
Code: Select all
\b([a-zéä]+\-?[a-zéä]*){3,}\b
I am pretty new to regex, but I think I am asking for any words (inc. French), min 3 chars in length, which may contain a hyphen...
Oh, I also use the /i (case insensitive) modifier...
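In case it helps to reproduce this, here is a minimal sketch of the call. It assumes the script file is saved as UTF-8, that the pattern is run through preg_match_all with only the /i modifier, and that PHP is using its default locale; the variable names and the sample string are just for illustration.

```php
<?php
// Assumption: this file is saved as UTF-8, so 'é' and 'ä' are two bytes each.
// $pattern, $text and $matches are illustrative names only.
$pattern = '/\b([a-zéä]+\-?[a-zéä]*){3,}\b/i';
$text = 'fiancé idée';

// Collect every full match of the pattern in the sample text.
preg_match_all($pattern, $text, $matches);

// $matches[0] holds the full matches.
print_r($matches[0]);
```

This is where 'fianc' comes back instead of 'fiancé', while 'idée' comes back whole.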
Thanks
Seppo