E-mail address validation
Re: E-mail address validation
Apparently my regex is now the basis for filter_var(). I guess sometimes it is better to re-invent the wheel.
Last edited by MichaelR on Tue Oct 19, 2010 5:38 pm, edited 2 times in total.
Re: E-mail address validation
Okay, for those who are interested, this code now matches folding white space and arbitrarily deeply nested comments. The entire regular expression is just 777 characters long. Compare that with Perl's infamous regex, which is over 6,500 characters long and doesn't match comments or folding white space.
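To illustrate how a regular expression can match nested comments at all, here is a minimal sketch using PCRE's recursive subpattern call `(?1)`. It is not the regex from the class; the function name `isComment` is made up for this example, and the pattern is deliberately simplified (it ignores folding white space and the exact `ctext` character ranges).

```php
<?php
// Hypothetical helper for illustration only; not part of the class discussed here.
// Returns true when $text is a single parenthesised comment, which may
// itself contain further comments nested to any depth.
function isComment(string $text): bool
{
    // Group 1 is the whole comment. Inside it we allow:
    //   [^()\\]  any character except parentheses and backslash
    //   \\.      a quoted pair (backslash followed by any character)
    //   (?1)     a recursive call to group 1, i.e. a nested comment
    $pattern = '/^(\((?:[^()\\\\]|\\\\.|(?1))*\))$/s';

    return preg_match($pattern, $text) === 1;
}
```

Because the three alternatives each start with a different character, the pattern should not backtrack badly on unbalanced input.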
The class also has an option to check for MX RRs at the given domain.
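For anyone wondering what such an MX check might look like, here is a rough sketch built on PHP's `checkdnsrr()`. It is an assumption about the general approach, not the class's actual code, and `domainHasMx` is a hypothetical name. It does not handle domain literals such as `[192.0.2.1]`, and it does not fall back to address records when no MX exists.

```php
<?php
// Hypothetical helper for illustration only; not the class's own MX check.
// Returns true when the domain part of $address publishes at least one MX record.
function domainHasMx(string $address): bool
{
    // Split on the last "@", since a quoted local-part may itself contain "@".
    $at = strrpos($address, '@');
    if ($at === false) {
        return false;
    }

    $domain = rtrim(substr($address, $at + 1), '.');
    if ($domain === '') {
        return false;
    }

    // The trailing dot makes the name fully qualified, so the resolver
    // does not append a local search domain to it.
    return checkdnsrr($domain . '.', 'MX');
}
```

The result depends on live DNS, so it is best treated as an optional extra step after the syntax check rather than part of it.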
As some of you may be aware, there are differences between RFC 5322 and RFC 5321. Considering just the address itself (not the mailbox, route, and so on), the following code should be run for each:
// RFC 5322
EmailAddressValidator::SetEmailAddress('michael@example.com', false)->Validate();
// RFC 5321
EmailAddressValidator::SetEmailAddress('michael@example.com', false)->SetCFWS(false)->SetObsolete(false)->Validate();
Re: E-mail address validation
I've updated to the second version now. The code is much better separated in the class and the entire regular expression has been reduced to just 585 characters for isValid5322 and just 383 for isValid5321. Part of this is due to not checking for length limits, as per RFC 5321, which states "To the maximum extent possible, implementation techniques that impose no limits on the length of these objects should be used." Indeed, the length limit is only a SHOULD, not a MUST.
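If you do want the length limits, they can be checked separately from the regex. The sketch below assumes the limits given in RFC 5321 section 4.5.3.1 (64 octets for the local-part, 255 octets for the domain); the function name `exceedsLengthLimits` is made up for this example and is not part of the class.

```php
<?php
// Hypothetical helper for illustration only; not part of the class.
// Returns true when either part of the address is longer than the
// limits in RFC 5321 section 4.5.3.1.
function exceedsLengthLimits(string $address): bool
{
    // Split on the last "@", since a quoted local-part may itself contain "@".
    $at = strrpos($address, '@');
    if ($at === false) {
        // Not an address at all; leave that verdict to the syntax check.
        return false;
    }

    $localPart = substr($address, 0, $at);
    $domain    = substr($address, $at + 1);

    // strlen() counts bytes, which matches the RFC's limits in octets.
    return strlen($localPart) > 64 || strlen($domain) > 255;
}
```

Keeping this outside the regex lets the caller decide whether to enforce the limits or only warn about them.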
The other differences: dot-atom and domain-name domains can no longer be turned off, nor can domain names or dot-atoms be made "strict". Additionally, CFWS is always of the obsolete form. Rather than passing "true" as the optional second parameter when instantiating the object, the two options are now "5321" and "5322": the first turns on quoted strings and domain literals, and the second turns on obsolete local-parts, domain literals, and CFWS. Internationalized labels need to be explicitly allowed if required. Finally, the object can be instantiated using the "new" keyword as normal, rather than only through the static "setEmailAddress()" method.
My article, linked to in my signature, also now includes unit tests and comparisons with other popular validators/parsers.