
Need help getting email addresses using regular expressions

Posted: Sun May 14, 2006 2:38 am
by 8kobe
I am not that good (actually pretty bad) at regular expressions. I am converting over my site and I need to get all the emails from my current site. I know how to run the preg_match, just need the expression part of it.
So I basically need a way to pull out email addresses from something like this

Hey my email is joe@shmoe.com
This is my email address john@gmail.com
This is my email Gary, john@something.net

The emails can occur anywhere in the content, and there are multiple emails in the content. Thanks, and please leave your PayPal account so I can donate a little for your help.

Re: Need help getting email addresses using regular expressi

Posted: Sun May 14, 2006 6:27 am
by aerodromoi
Here you go - hope this helps.

Code:

<?php
// Sample text containing several addresses.
$string = "Lorem johndoe@hello2.com dolor amet consequitur@hello.com ipsum dolor@sit.de abcde@abcdefg.de";
// Local part, "@", one or more dot-separated labels, then a 2-6 letter TLD (case-insensitive).
$search = '/[._a-z0-9-]+@(?:[a-z0-9-]+\.)+[a-z]{2,6}/i';

echo "string: ".$string."<br />";
echo "results: ";

// preg_match_all() collects every match; $emails[0] holds the full matched addresses.
preg_match_all($search, $string, $emails);
echo implode(", ", $emails[0]);
?>
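
For the sample lines from the original question, a minimal sketch along the same lines might look like the one below. The pattern is a simplified assumption (it is not RFC compliant), the variable names are made up for illustration, and the fourth line of sample text is an invented repeat added only to show de-duplication.

```php
<?php
// Sample text from the original question, plus one invented repeat.
$content = "Hey my email is joe@shmoe.com\n"
         . "This is my email address john@gmail.com\n"
         . "This is my email Gary, john@something.net\n"
         . "Write to joe@shmoe.com again.";

// Simplified pattern (an assumption, not RFC compliant): local part, "@",
// dot-separated domain labels, then a 2-6 letter top-level domain.
$pattern = '/[._a-z0-9-]+@(?:[a-z0-9-]+\.)+[a-z]{2,6}/i';

// preg_match_all() finds every match; index 0 holds the full matches.
preg_match_all($pattern, $content, $matches);

// Lower-case the addresses, drop duplicates, and reindex the array.
$emails = array_values(array_unique(array_map('strtolower', $matches[0])));

echo implode("\n", $emails), "\n";
```

Since the addresses are being moved to a new site, de-duplicating here avoids importing the same address twice; whether lower-casing is appropriate depends on how the new site compares addresses.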
aerodromoi

Re: Need help getting email addresses using regular expressi

Posted: Sat May 27, 2006 2:25 am
by shrikant_deshpande
Hello friend,

I think you are looking for this expression:

preg_match("/^[A-Z0-9._%-]+@[A-Z0-9._%-]+\.[A-Z]{2,6}$/i", $data[$email_index])

The function returns 1 only if $data[$email_index] is, in its entirety, a well-formed email address (the ^ and $ anchors require the whole string to match); otherwise it returns 0.
Bye
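
To illustrate the difference between checking one value and searching inside text, here is a small sketch. The first pattern is the one from this post; the second, with the ^ and $ anchors removed, is a hypothetical variant for extraction, and the variable names are assumptions made up for the example.

```php
<?php
// The anchored pattern from the post: the whole subject must be an address.
$validate = '/^[A-Z0-9._%-]+@[A-Z0-9._%-]+\.[A-Z]{2,6}$/i';

// Hypothetical unanchored variant of the same pattern, for searching inside text.
$extract = '/[A-Z0-9._%-]+@[A-Z0-9._%-]+\.[A-Z]{2,6}/i';

$bare     = 'joe@shmoe.com';
$sentence = 'Hey my email is joe@shmoe.com';

$bareOk     = preg_match($validate, $bare);      // 1: the whole string is an address
$sentenceOk = preg_match($validate, $sentence);  // 0: the extra words break the anchors

// Without anchors, preg_match_all() lists the addresses found inside the sentence.
preg_match_all($extract, $sentence, $found);

var_dump($bareOk, $sentenceOk, $found[0]);
```

So the anchored form suits validating a single form field, while pulling addresses out of page content needs an unanchored pattern together with preg_match_all().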

Posted: Tue Jun 06, 2006 5:01 pm
by andym01480
The most robust email address regex that checks against the RFC822 standard is contained in this function:

Code:

function is_valid_email_address($email){
	// Character classes for quoted strings, domain literals and atoms.
	$qtext = '[^\\x0d\\x22\\x5c\\x80-\\xff]';
	$dtext = '[^\\x0d\\x5b-\\x5d\\x80-\\xff]';
	$atom = '[^\\x00-\\x20\\x22\\x28\\x29\\x2c\\x2e\\x3a-\\x3c'.
		'\\x3e\\x40\\x5b-\\x5d\\x7f-\\xff]+';
	// A backslash followed by any 7-bit character.
	$quoted_pair = '\\x5c[\\x00-\\x7f]';
	// "[...]" domain literals and "..." quoted strings.
	$domain_literal = "\\x5b($dtext|$quoted_pair)*\\x5d";
	$quoted_string = "\\x22($qtext|$quoted_pair)*\\x22";
	$domain_ref = $atom;
	$sub_domain = "($domain_ref|$domain_literal)";
	$word = "($atom|$quoted_string)";
	// Dot-separated sub-domains and words (\x2e is ".").
	$domain = "$sub_domain(\\x2e$sub_domain)*";
	$local_part = "$word(\\x2e$word)*";
	// local-part "@" domain (\x40 is "@").
	$addr_spec = "$local_part\\x40$domain";

	// Anchored match: the whole string must be a single address.
	return preg_match("!^$addr_spec$!", $email) ? 1 : 0;
}
Source is http://www.iamcal.com/publish/
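
As a side note, PHP 5.2 and later also ship a built-in check, filter_var() with FILTER_VALIDATE_EMAIL. It is not the function above and it does not claim to accept everything RFC 822 allows, but it can be combined with a loose search when the goal is to pull addresses out of text. The sketch below assumes exactly that; the candidate pattern, the sample text and the variable names are made up for illustration.

```php
<?php
// Invented sample text: two real-looking addresses and one malformed token.
$content = "Hey my email is joe@shmoe.com\n"
         . "This is my email Gary, john@something.net.\n"
         . "Not an address: foo@@bar";

// Loose candidate pattern (an assumption): non-space characters around an "@".
preg_match_all('/\S+@\S+/', $content, $candidates);

$valid = [];
foreach ($candidates[0] as $candidate) {
    // Strip trailing sentence punctuation before validating.
    $candidate = rtrim($candidate, '.,;:!?');
    // filter_var() returns the address on success and false on failure.
    if (filter_var($candidate, FILTER_VALIDATE_EMAIL) !== false) {
        $valid[] = $candidate;
    }
}

print_r($valid);
```

Splitting the work this way keeps the search pattern simple and leaves the strictness decision to the validator, which could just as well be a function like the one posted above.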

Posted: Wed Jun 07, 2006 9:58 am
by Roja
andym01480 wrote:The most robust email address regex that checks against the RFC822 standard is contained in this
"Most robust" is definitely arguable. Ignoring that issue, however, keep in mind that the code there is released under the Creative Commons Attribution-ShareAlike 2.5 License, which means that anything using that function must be released under that same license.

Alternatively, I've posted GPL-licensed code for email validation numerous times on these forums as well, and it too is RFC compliant.

* Note that the GPL likewise requires code that uses such a function to be released under the same license. The difference is that the GPL is more widely used.

Posted: Wed Jun 07, 2006 11:38 am
by andym01480
Sorry I meant "The most robust I'd seen so far"!!!

Posted: Fri Jun 16, 2006 12:54 am
by printf
Sure, the variable names have been changed, as has the way they are combined to build the regex, but it's still the same one found in Chapter 7 of the very first Perl Cook Book.

pif!

Posted: Fri Jun 16, 2006 1:41 am
by andym01480
Aaah, change them back and post it, and then it won't be under that strange licence!!!!