What is wrong with this regex?

Any questions involving matching text strings to patterns - the pattern is called a "regular expression."

klevis miho
Forum Contributor
Posts: 413
Joined: Wed Oct 29, 2008 2:59 pm
Location: Albania

What is wrong with this regex?

Post by klevis miho »

I have this function:
preg_match_all('\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b', $econtents, $matches);

and it gives me this error:
Warning: preg_match() [function.preg-match]: Delimiter must not be alphanumeric or backslash

This regex is used to find emails.
What is wrong?

Thanks in advance
MichaelR
Forum Contributor
Posts: 148
Joined: Sat Jan 03, 2009 3:27 pm

Re: What is wrong with this regex?

Post by MichaelR »

Try this:

preg_match_all('/\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/', $econtents, $matches);
As the warning says, the pattern has no delimiters, so PHP takes its first character, the backslash (\), as the delimiter, and a backslash is not allowed there. Wrap the pattern in forward slashes (/).
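
To make the delimiter point concrete, here is a minimal sketch. The $sample string is made up for illustration, and the # variant only shows that other non-alphanumeric, non-backslash characters can also serve as delimiters.

```php
<?php
// Hypothetical input, not taken from the original post.
$sample = 'CONTACT: TEST@HOTMAIL.COM';

// No delimiters: the leading backslash is read as the delimiter, PHP raises
// the "Delimiter must not be alphanumeric or backslash" warning and the call
// returns false. The @ only silences the warning for this demo.
$bad = @preg_match_all('\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b', $sample, $m1);

// Same pattern wrapped in forward slashes.
$good = preg_match_all('/\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/', $sample, $m2);

// Same pattern again with # as the delimiter.
$alt = preg_match_all('#\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b#', $sample, $m3);

var_dump($bad);            // false: the pattern never compiled
echo $m2[0][0], "\n";      // the address found by the delimited pattern
```

Because the failed call returns false rather than 0, comparing the result with === false is a way to tell "pattern is broken" apart from "no match".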

Your regex will also reject quite a lot of valid email addresses. It only allows upper-case letters, and it caps the top-level domain at four letters, so an address ending in .museum fails. A slightly better version, though still lacking (in this example the local-part check allows only the characters Hotmail permits), would be:

preg_match_all('/\b[a-z0-9_-]+(?:\.[a-z0-9_-]+)*@(?:[a-z0-9]+(?:-[a-z0-9]+)*\.){1,127}[a-z]{2,6}\b/i', $econtents, $matches);
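
As a rough comparison of the two patterns, the sketch below runs both against a few sample addresses. The addresses are invented for illustration and the variable names are placeholders.

```php
<?php
// Hypothetical addresses chosen to exercise case and TLD length.
$text = 'a@example.com B@EXAMPLE.CO.UK D@EXAMPLE.MUSEUM';

// Original pattern with delimiters added: upper-case only, TLD of 2 to 4 letters.
$upperOnly = '/\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/';

// Revised pattern: the i modifier makes it case-insensitive, TLD of 2 to 6 letters.
$revised = '/\b[a-z0-9_-]+(?:\.[a-z0-9_-]+)*@(?:[a-z0-9]+(?:-[a-z0-9]+)*\.){1,127}[a-z]{2,6}\b/i';

preg_match_all($upperOnly, $text, $first);
preg_match_all($revised, $text, $second);

print_r($first[0]);   // only the upper-case address with a short TLD
print_r($second[0]);  // all three addresses
```

The lower-case address and the six-letter TLD are the two cases the first pattern misses here.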
Last edited by MichaelR on Thu Dec 17, 2009 6:31 am, edited 4 times in total.

Re: What is wrong with this regex?

Post by klevis miho »

Thanks, but I tried this too, only to get another error:

Warning: preg_match_all() [function.preg-match-all]: Unknown modifier 'b'

Re: What is wrong with this regex?

Post by MichaelR »

Yes, my apologies. I misplaced the "b" character. Now fixed.

Edit: And again, my lack of coffee worked against me. I forgot to escape the "b" character.
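
For anyone reading later, a small sketch of why the escaping matters. The sample string and the shortened patterns are made up for illustration, and the last call is only one assumed way to trigger the "Unknown modifier" warning, not necessarily what the earlier version of the post contained.

```php
<?php
// Hypothetical sample line.
$sample = 'Mail: joe@example.com';

// \b is a zero-width word boundary, so the address itself is matched.
$escaped = preg_match('/\b[a-z0-9_-]+@[a-z0-9.-]+\.[a-z]{2,6}\b/i', $sample, $m);

// Without the backslash, b is a literal letter that must appear in the text
// before and after the address, so nothing matches here.
$literal = preg_match('/b[a-z0-9_-]+@[a-z0-9.-]+\.[a-z]{2,6}b/i', $sample);

// Anything after the closing delimiter is read as a modifier; b is not one,
// so PHP warns "Unknown modifier 'b'" and returns false (warning silenced with @).
$modifier = @preg_match('/[a-z]+/b', $sample);

var_dump($escaped, $literal, $modifier);
```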

Re: What is wrong with this regex?

Post by klevis miho »

Thanks, but it still won't work :( . OK, here is all my code:

 
$file = fopen("prova.txt","r");
$content = fread($file,"100000");

$find = preg_match_all('/b[a-z0-9_-]+(?:\.[a-z0-9_-]+)*@(?:[a-z0-9]+(?:-[a-z0-9]+)*\.){1,127}[a-z]{1,6}b/i', $contents, $matches);
if($find) {
    echo "found";
} else {
    echo "not found";
}
prova.txt has only this string inside: Email: test@hotmail.com

The output is: not found

:(

Re: What is wrong with this regex?

Post by MichaelR »

In that specific example, you'll need to use the following. If you plan on having more than one email address in the file, this may not work. If so, show me an example with two or more email addresses stored in the file:

preg_match_all('/\x20[a-z0-9_-]+(?:\.[a-z0-9_-]+)*@(?:[a-z0-9]+(?:-[a-z0-9]+)*\.){1,127}[a-z]{1,6}/i', $content, $matches);
(Note that I've changed $contents to $content, because your example only defines $content; $contents is never defined.)
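
If the file may hold more than one address, something along these lines might be a starting point. It is only a sketch under a few assumptions: a temporary file stands in for prova.txt, the second address is invented, file_get_contents() replaces fopen()/fread(), and it uses the \b pattern from earlier in the thread instead of the \x20 one so that it does not depend on a leading space.

```php
<?php
// Temporary stand-in for prova.txt so the sketch runs on its own.
$path = tempnam(sys_get_temp_dir(), 'prova');
file_put_contents($path, "Email: test@hotmail.com\nOther: second.user@example.co.uk\n");

// Read the whole file into $content.
$content = file_get_contents($path);
if ($content === false) {
    throw new RuntimeException("could not read $path");
}

$pattern = '/\b[a-z0-9_-]+(?:\.[a-z0-9_-]+)*@(?:[a-z0-9]+(?:-[a-z0-9]+)*\.){1,127}[a-z]{2,6}\b/i';

// Pass the same variable that was read above: $content, not $contents.
$count = preg_match_all($pattern, $content, $matches);

if ($count) {
    echo "found: ", implode(', ', $matches[0]), "\n";
} else {
    echo "not found\n";
}

unlink($path); // remove the temporary file
```

Running the script with error_reporting(E_ALL) should also surface an undefined-variable notice or warning if the wrong name is passed to preg_match_all().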

Re: What is wrong with this regex?

Post by klevis miho »

Yeah, it worked, thanks, but the main problem was a misspelling of the $contents variable in the function call :$ .

Thanks man, you were a great help :)