Need help getting email addresses using regular expressions

8kobe
Forum Newbie
Posts: 14
Joined: Sun Mar 05, 2006 11:35 am

Need help getting email addresses using regular expressions

Post by 8kobe »

I am not that good (actually pretty bad) at regular expressions. I am converting my site over and I need to get all the email addresses from my current site. I know how to run preg_match; I just need the expression part of it.
So I basically need a way to pull out email addresses from something like this:

Hey my email is joe@shmoe.com
This is my email address john@gmail.com
This is my email Gary, john@something.net

The emails can occur anywhere in the content, and there can be multiple emails in the content. Thanks, and please leave your PayPal account so I can donate a little for your help.
aerodromoi
Forum Contributor
Posts: 230
Joined: Sun May 07, 2006 5:21 am

Re: Need help getting email addresses using regular expressi

Post by aerodromoi »

Here you go - hope this helps.

<?php
$string = "Lorem johndoe@hello2.com dolor amet consequitur@hello.com ipsum dolor@sit.de abcde@abcdefg.de";
// Local part, "@", then dot-separated domain labels with an optional 2-6 letter TLD.
$search = '/([._a-z0-9-]+)@(([a-z0-9-]+\.)*([a-z0-9-]+)(\.[a-z]{2,6})?)/i';

echo "string: ".$string."<br />";
echo "results: ";

// preg_match_all() stores every full match in $emails[0].
preg_match_all($search, $string, $emails);
echo implode(", ", $emails[0]);
?>
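
For the sample lines in the original question, the same idea can be wrapped in a small function. This is only a sketch: the name extract_emails() is made up for illustration, it assumes PHP 7 or later for the type hints, and it uses a slightly stricter variant of the pattern (top-level domain required) rather than the exact one above.

```php
<?php
// Hypothetical helper: the name extract_emails() is made up for this sketch.
function extract_emails(string $text): array
{
    // Assumed variant of the pattern above: the top-level domain is required.
    $pattern = '/[._a-z0-9-]+@(?:[a-z0-9-]+\.)+[a-z]{2,6}/i';

    // preg_match_all() puts every full match into $matches[0].
    preg_match_all($pattern, $text, $matches);

    // Lower-case and de-duplicate; array_values() renumbers the keys from 0.
    return array_values(array_unique(array_map('strtolower', $matches[0])));
}

// The three sample lines from the question.
$content = "Hey my email is joe@shmoe.com\n"
         . "This is my email address john@gmail.com\n"
         . "This is my email Gary, john@something.net";

print_r(extract_emails($content));
```

Lower-casing before de-duplicating is a design choice for a site migration, where the same address typed with different capitalisation should probably count once.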
aerodromoi
shrikant_deshpande
Forum Newbie
Posts: 8
Joined: Sat May 27, 2006 1:40 am

Re: Need help getting email addresses using regular expressi

Post by shrikant_deshpande »

Hello Friend ...

I think you are searching for this pattern...

preg_match("/^[A-Z0-9._%-]+@[A-Z0-9._%-]+\.[A-Z]{2,6}$/i", $data[$email_index])

The function returns 1 if and only if the variable "$data[$email_index]" holds nothing but an email address that fits the pattern; otherwise it returns 0. Because of the ^ and $ anchors it tests a whole string, so it will not pull addresses out of longer text.
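
To show the difference between checking one string and searching inside text, here is a small sketch. The anchored pattern is the one from this post; the unanchored variant and the variable names are assumptions added for illustration.

```php
<?php
// Anchored pattern from the post: ^ and $ force the whole string to match.
$validate = '/^[A-Z0-9._%-]+@[A-Z0-9._%-]+\.[A-Z]{2,6}$/i';

// Assumption for this sketch: the same pattern with the anchors removed.
$extract = '/[A-Z0-9._%-]+@[A-Z0-9._%-]+\.[A-Z]{2,6}/i';

$line = "This is my email address john@gmail.com";

// preg_match() returns 1 on a match, 0 on no match, and false on error.
var_dump(preg_match($validate, "john@gmail.com")); // int(1)
var_dump(preg_match($validate, $line));            // int(0)

// Without the anchors, preg_match_all() finds the address inside the sentence.
$count = preg_match_all($extract, $line, $found);
var_dump($count);       // int(1)
var_dump($found[0][0]); // string(14) "john@gmail.com"
```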
bye
andym01480
Forum Contributor
Posts: 390
Joined: Wed Apr 19, 2006 5:01 pm

Post by andym01480 »

The most robust email address regex that checks against the RFC822 standard is contained in this function:

function is_valid_email_address($email){
	// Building blocks of the RFC822 addr-spec grammar, written as hex escapes.
	$qtext = '[^\\x0d\\x22\\x5c\\x80-\\xff]';
	$dtext = '[^\\x0d\\x5b-\\x5d\\x80-\\xff]';
	$atom = '[^\\x00-\\x20\\x22\\x28\\x29\\x2c\\x2e\\x3a-\\x3c'.
		'\\x3e\\x40\\x5b-\\x5d\\x7f-\\xff]+';
	$quoted_pair = '\\x5c[\\x00-\\x7f]';
	$domain_literal = "\\x5b($dtext|$quoted_pair)*\\x5d";
	$quoted_string = "\\x22($qtext|$quoted_pair)*\\x22";
	$domain_ref = $atom;
	$sub_domain = "($domain_ref|$domain_literal)";
	$word = "($atom|$quoted_string)";
	$domain = "$sub_domain(\\x2e$sub_domain)*";
	$local_part = "$word(\\x2e$word)*";
	$addr_spec = "$local_part\\x40$domain";

	// Anchored match: returns 1 only if the whole string matches addr-spec.
	return preg_match("!^$addr_spec$!", $email) ? 1 : 0;
}
Source is http://www.iamcal.com/publish/
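
A function like this validates one string at a time, so for the original question it would need to be combined with an extraction step. The sketch below shows that two-step shape, but it swaps in PHP's built-in filter_var() with FILTER_VALIDATE_EMAIL (available since PHP 5.2) for the validation instead of the function above. The helper name find_valid_emails() and the loose candidate pattern are assumptions made up for this example, and the type hints assume PHP 7 or later.

```php
<?php
// Hypothetical helper for this sketch: extract candidates loosely, then validate each one.
function find_valid_emails(string $text): array
{
    // Loose candidate pattern (an assumption): runs of characters around an "@"
    // that contain no whitespace, brackets, quotes, commas, semicolons or colons.
    preg_match_all('/[^\s<>"(),;:]+@[^\s<>"(),;:]+/', $text, $matches);

    $valid = [];
    foreach ($matches[0] as $candidate) {
        // Strip sentence punctuation that may trail the address.
        $candidate = rtrim($candidate, '.!?');

        // filter_var() returns the address on success and false on failure.
        if (filter_var($candidate, FILTER_VALIDATE_EMAIL) !== false) {
            $valid[] = $candidate;
        }
    }
    return $valid;
}

print_r(find_valid_emails("Write to joe@shmoe.com. Or john@gmail.com!"));
```

Note that filter_var() applies PHP's own rules, which are not guaranteed to accept exactly the same addresses as the RFC822 pattern above.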
Roja
Tutorials Group
Posts: 2692
Joined: Sun Jan 04, 2004 10:30 pm

Post by Roja »

andym01480 wrote: The most robust email address regex that checks against the RFC822 standard is contained in this
"Most robust" is definitely arguable. Ignoring that issue, however, keep in mind that the code there is released under the Creative Commons Attribution-ShareAlike 2.5 License, which means that anything using that function must be released under that same license.

Alternatively, I've posted GPL licensed code for email validation numerous times on these forums as well, and it too is RFC compliant.

* It should be noted that the GPL likewise requires that code using that function be released under the same license. The difference is that the GPL is more widely used.
andym01480
Forum Contributor
Posts: 390
Joined: Wed Apr 19, 2006 5:01 pm

Post by andym01480 »

Sorry, I meant "The most robust I'd seen so far"!!!
printf
Forum Contributor
Posts: 173
Joined: Wed Jan 12, 2005 5:24 pm

Post by printf »

Sure, the variable names and the way they are combined to build the regex have been changed, but it's still the same one found in Chapter 7 of the very first Perl Cookbook.

pif!
andym01480
Forum Contributor
Posts: 390
Joined: Wed Apr 19, 2006 5:01 pm

Post by andym01480 »

Aaah, change them back and post it, and then it won't be under that strange licence!!!!