regular expression help

Krypt
Forum Newbie
Posts: 1
Joined: Thu Oct 09, 2003 5:22 pm

Post by Krypt »

I'm trying to modify the following bit of code:

eregi_replace("((ht|f)tp://)((([a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,4}))|(([0-9]{1,3}\.){3}([0-9]{1,3})))((/|\?)[a-z0-9~#%&'_\+=:\?\.-]*)*)", "<a href=\"\\0\" target=\"_blank\">\\0</a>", $text);

What I'd like is for the above code to ignore any URL that is immediately preceded by "url=" or "]" (without the quotes, of course). Everything else should be parsed as normal.

Does anyone know how to do this?

Thanks.
m3rajk
DevNet Resident
Posts: 1191
Joined: Mon Jun 02, 2003 3:37 pm

Post by m3rajk »

umm.. i'm not overly big on the posix ones, so i'll give you the perl ones (which are supposedly slightly faster)

$text = preg_replace('%((ht|f)tp://\w+[.\w-]*(/\S*)?)%i', '<a href="\1" target="_blank">\1</a>', $text);
just some notes for your edification:
1- you need to assign the return value to something. i'm not sure you did with the snippet you gave.

2- while you've escaped most of the characters that need a \ in that pattern, i think you missed some.

3- sometimes there are shorthands (like \w or \d) that you can use to build the character set. it helps.

4- i think you used * by accident. * = 0 or more, meaning the group in front of it can be missing entirely. if a part has to appear at least once, + (1 or more) is what you want

5- some preg shorthands:
\d = [0-9]
\D = [^\d]
\s = [ \f\r\t\n] (all space characters. i think i got them all)
\S = [^\s]
\w = [A-Za-z0-9_]
\W = [^\w]

any of the following (and others i'm not mentioning here) are legal delimiters for a preg pattern, as long as you \ them wherever they appear inside the pattern and use the same one on both ends: / | %

6- specials that are both preg and posix:
| = or
^ = start of the string, or "not" when it's the first character inside []
$ = end of the string
. = any single character
[] = defines a character set
{} = repetition count, e.g. {2,4}
? = preceding item is optional
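
A quick way to check the notes above is to run a few patterns through preg_match. This is only a sketch: the sample strings and variable names are made up for illustration and do not come from the original code.

```php
<?php
// Shorthand classes: \d is a digit, \w a word character, \s whitespace.
$isYear   = preg_match('/^\d{4}$/', '2003');        // 1: four digits
$isWord   = preg_match('/^\w+$/', 'Forum_Newbie');  // 1: letters and underscore
$hasSpace = preg_match('/\s/', 'PostReply');        // 0: no whitespace in it

// Delimiters: the same pattern with / and with %.
// With / every slash inside the pattern needs a backslash; with % none do.
$withSlash   = preg_match('/^(ht|f)tp:\/\//', 'ftp://example.com');  // 1
$withPercent = preg_match('%^(ht|f)tp://%', 'ftp://example.com');    // 1

// * versus +: the dotted group may be absent with *, but not with +.
$star = preg_match('/^[a-z]+(\.[a-z]+)*$/', 'localhost');  // 1
$plus = preg_match('/^[a-z]+(\.[a-z]+)+$/', 'localhost');  // 0
```

Picking a delimiter that does not occur in the pattern (here %) keeps URL patterns readable, since none of the slashes need escaping.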

i hope that's helpful. i haven't tested the preg_replace for you. note that preg_replace already replaces every match, so php has no g modifier; the i modifier makes it case insensitive
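
For the original question (leave alone any URL that comes right after "url=" or "]"), one possible approach is a PCRE negative lookbehind. The sketch below is an assumption-laden illustration, not a tested drop-in: the function name linkify is hypothetical, and the URL pattern is deliberately simpler than the eregi one in the first post.

```php
<?php
// Hypothetical helper: wraps bare http:// and ftp:// URLs in <a> tags,
// but skips any URL that directly follows "url=" or "]".
function linkify(string $text): string
{
    // (?<!url=|\])  negative lookbehind: fail if "url=" or "]" is just before
    // (?:ht|f)tp:// scheme
    // [a-z0-9.-]+   host (simplified assumption, no validation)
    // (?:[/?]...)?  optional path or query, stopping at whitespace, [ ] < > "
    $pattern = '%(?<!url=|\])((?:ht|f)tp://[a-z0-9.-]+(?:[/?][^\s\[\]<>"]*)?)%i';

    return preg_replace($pattern, '<a href="\1" target="_blank">\1</a>', $text);
}

$sample = 'See http://example.com/a and [url=http://example.org]x[/url] '
        . 'and [url]http://example.net[/url]';

// Only the first URL gets wrapped; the two inside the [url] tags are left as they are.
echo linkify($sample), "\n";
```

The lookbehind only inspects the characters right before the match, so it does not consume them and the text in front of a skipped URL stays untouched.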
m3rajk
DevNet Resident
Posts: 1191
Joined: Mon Jun 02, 2003 3:37 pm

Post by m3rajk »

one more thing: \0 stands for the whole match (in both the posix and preg replace functions), and the parenthesized groups start at \1. posix stops at \9, preg goes up to \99
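
For reference, PHP's preg_replace accepts \0 (or $0) in the replacement string for the whole match, and \1 for the first parenthesized group. A small sketch, with a made-up sample string and variable names:

```php
<?php
$text = 'visit ftp://files.example.org now';

// \0 is the text matched by the whole pattern.
$viaZero = preg_replace('%(ht|f)tp://\S+%i', '<\0>', $text);

// \1 is the first group, so wrapping the whole pattern in () gives the same result.
$viaOne = preg_replace('%((ht|f)tp://\S+)%i', '<\1>', $text);

echo $viaZero, "\n";  // visit <ftp://files.example.org> now
echo $viaOne, "\n";   // visit <ftp://files.example.org> now
```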