
Validating emails using an RFC-compliant regex

Posted: Sun Nov 09, 2008 9:16 pm
by alex.barylski
EDIT | I did keep the original source after all: http://iamcal.com/publish/articles/php/parsing_email

I found this regex snippet a long while ago (I've lost track of the original author, my bad):

Code:

 
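// RFC 822 building blocks, written as hex escapes so nothing clashes with quotes or the regex delimiter.
// Text inside a quoted string: anything except CR, '"', '\' and non-ASCII bytes.
$qtext = '[^\\x0d\\x22\\x5c\\x80-\\xff]';
// Text inside a domain literal: anything except CR, '[', '\', ']' and non-ASCII bytes.
$dtext = '[^\\x0d\\x5b-\\x5d\\x80-\\xff]';
// An atom: one or more characters that are not controls, space, specials or non-ASCII bytes.
$atom = '[^\\x00-\\x20\\x22\\x28\\x29\\x2c\\x2e\\x3a-\\x3c\\x3e\\x40\\x5b-\\x5d\\x7f-\\xff]+';
// A backslash followed by any ASCII character.
$quoted_pair = '\\x5c[\\x00-\\x7f]';
$domain_literal = sprintf('\\x5b(%s|%s)*\\x5d', $dtext, $quoted_pair);
$quoted_string = sprintf('\\x22(%s|%s)*\\x22', $qtext, $quoted_pair);
$domain_ref = $atom;
$sub_domain = sprintf('(%s|%s)', $domain_ref, $domain_literal);
$word = sprintf('(%s|%s)', $atom, $quoted_string);
// Both halves of the address are dot-separated sequences.
$domain = sprintf('%s(\\x2e%s)*', $sub_domain, $sub_domain);
$local_part = sprintf('%s(\\x2e%s)*', $word, $word);
$addr_spec = sprintf('%s\\x40%s', $local_part, $domain);
 
// $value is the address to test. The D modifier stops $ from matching before a trailing newline.
return preg_match(sprintf('!^%s$!D', $addr_spec), $value) ? 1 : 0;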
 
I figured I'd share that.

Now, however, I want to validate my email addresses in two distinct parts:

Code:

account@domain
I have split my email address into two parts:

account
domain

I am going to implement some kind of ping test to check that the domain exists, and maybe check that the TLD is valid as well. But before I go splitting the domain and/or testing the domain server, I would like to customize the above regex to validate the account and domain portions separately.
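For the "does the domain exist" part, a DNS lookup is probably more practical than an actual ping. Here is a minimal sketch, assuming PHP's built-in checkdnsrr() is available on the host. The function name domain_has_mail_records() and the optional $lookup parameter are hypothetical; $lookup only exists so the check can be tried without touching the network.

```php
<?php
// Hypothetical helper: true when DNS reports an MX record for the domain,
// or failing that an A record (mail can fall back to the host's address).
// $lookup is an assumed injection point; it defaults to PHP's checkdnsrr().
function domain_has_mail_records(string $domain, ?callable $lookup = null): bool
{
    $lookup = $lookup ?? 'checkdnsrr';

    // Nothing to look up for an empty domain.
    if ($domain === '') {
        return false;
    }

    return $lookup($domain, 'MX') || $lookup($domain, 'A');
}
```

Calling domain_has_mail_records($domain) with no second argument does the real lookup. A successful lookup only says the domain resolves, not that the account exists on it.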

Apparently the account is the complex part to validate, as it can literally be anything set forth by the system admin, whereas domain names are much simpler and must conform to stricter standards?

Anyway, if you can help break the regex above into two separate validations I would be forever grateful, as would my little library of code. :)
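To show what I mean, here is a rough sketch of how the split might look, reusing the same building blocks as the snippet above. The names email_patterns(), validate_local_part() and validate_domain() are hypothetical, and I anchor with \z instead of $ so a trailing newline is rejected:

```php
<?php
// Builds the two sub-patterns from the same RFC 822 pieces as the original snippet.
function email_patterns(): array
{
    $qtext = '[^\\x0d\\x22\\x5c\\x80-\\xff]';
    $dtext = '[^\\x0d\\x5b-\\x5d\\x80-\\xff]';
    $atom = '[^\\x00-\\x20\\x22\\x28\\x29\\x2c\\x2e\\x3a-\\x3c\\x3e\\x40\\x5b-\\x5d\\x7f-\\xff]+';
    $quoted_pair = '\\x5c[\\x00-\\x7f]';
    $domain_literal = sprintf('\\x5b(%s|%s)*\\x5d', $dtext, $quoted_pair);
    $quoted_string = sprintf('\\x22(%s|%s)*\\x22', $qtext, $quoted_pair);
    $sub_domain = sprintf('(%s|%s)', $atom, $domain_literal);
    $word = sprintf('(%s|%s)', $atom, $quoted_string);

    return [
        'local_part' => sprintf('%s(\\x2e%s)*', $word, $word),
        'domain'     => sprintf('%s(\\x2e%s)*', $sub_domain, $sub_domain),
    ];
}

// Validates the part before the @ (the account).
function validate_local_part(string $account): bool
{
    $patterns = email_patterns();
    return preg_match('!^' . $patterns['local_part'] . '\\z!', $account) === 1;
}

// Validates the part after the @ (the domain).
function validate_domain(string $domain): bool
{
    $patterns = email_patterns();
    return preg_match('!^' . $patterns['domain'] . '\\z!', $domain) === 1;
}
```

Note that the domain pattern here is only the RFC grammar, so it still accepts things like a single label ("localhost") or a bracketed literal ("[127.0.0.1]"). Any stricter hostname or TLD rules would have to be layered on top.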

Cheers,
Alex