I'm a little confused as to whether preg with the /u switch or mb_ereg should be used for matching/validating/stripping input. Is one of these deprecated? Is one going to be kept and the other dropped? Are both going in favour of something new? The preg examples I've seen have a lot more granular control over which Unicode characters can be matched. Can this be done equally finely if I use something like Perl syntax in the mb_ereg family?
In addition to this (and I don't know which would be the best forum for this second question), are there any resources which say what a valid Unicode/IDN e-mail address looks like? The old ASCII version is so well documented that even Wikipedia has all the info anyone needs on it (including the difference between the standard and what's actually accepted), but I can't find diddly on international addresses. Has there in fact been no change other than what counts as a letter?
Finally (spot the noob, right?), as unlikely as I find it, I'd just like a little reassurance: is there any way a character stored in a string variable can interfere with the running of PHP code? I don't mean if it's passed to the command line or to a database in/as a SQL command, just if it's being manipulated in PHP and maybe output to a text file or e-mailed.
Hopefully you're not all reeling from the avalanche of stupid.