regex preg_replace help: URL to a link
Posted: Thu Mar 11, 2010 1:46 pm
by ninethousandfeet
Hi,
I've come so close to a working solution to my problem, just need help to get the final little bugs out.
Users create comments, which can contain a reference to a link within the comment.
This is my current preg_replace, which detects a URL so it can be turned into a link:
Code:
$msg = preg_replace("/([^A-z0-9?])(http|ftp|https)([\:\/\/])([^\\s]+)/"," <a href=\"$2$3$4\" rel=\"nofollow\">$2$3$4</a>",$msg);
Problems:
* example 1: http://helloworld.com is a great site, check it out. --- does not convert url to a link, something to do with there not being any characters before the url
* example 2: check out http://helloworld.com, it's a great site! --- converts to link BUT the comma is part of the link so it won't work (this also happens if a parenthesis, dash, etc. are connected to the end of the url)
Any help with either or both of these problems would be great, thanks for taking a look!
Brad
Re: regex preg_replace help: URL to a link
Posted: Thu Mar 11, 2010 2:14 pm
by tr0gd0rr
In the first capturing pattern, you need to allow either whitespace or the beginning of the string: `(^|\s)`
In the last capturing pattern, you need to use only characters allowed unescaped in urls:
Something like `([a-zA-Z~!@$%&*()_+:?,./;'=#-]+)`
If you want to assume that the url does not end in certain punctuation (which it may), create another capturing pattern:
Something like `([a-zA-Z~@$%&*_+/'=])`
Also, check out this tool for testing regexes:
http://gskinner.com/RegExr/
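To show how these pieces might fit together, here is a minimal sketch, not the code anyone in the thread actually ran. The function name `linkify` and the exact character class for the final character are assumptions made for illustration. It uses `~` as the delimiter so the slashes need no escaping.

```php
<?php
// Sketch only: linkify() is a hypothetical helper, not code from the thread.
function linkify(string $msg): string
{
    // (^|\s)             start of the string, or one whitespace character
    // (?:https?|ftp)://  scheme, followed by a literal "://"
    // [^\s<]*            body of the URL (anything but whitespace or "<")
    // [A-Za-z0-9/=&_]    last character: assumed never to be punctuation
    $pattern = '~(^|\s)((?:https?|ftp)://[^\s<]*[A-Za-z0-9/=&_])~';

    // $1 puts the leading whitespace (if any) back in front of the link.
    return preg_replace($pattern, '$1<a href="$2" rel="nofollow">$2</a>', $msg);
}

echo linkify("http://helloworld.com is a great site"), "\n";
echo linkify("check out http://helloworld.com, it's a great site!"), "\n";
```

With this, the first example is linked from the very first character, and in the second example the comma stays outside the link. Note that `([\:\/\/])` in the original pattern is a character class, which matches a single `:` or `/`, so the sketch spells out `://` literally instead.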
Re: regex preg_replace help: URL to a link
Posted: Thu Mar 11, 2010 11:36 pm
by ninethousandfeet
Thanks for your help... halfway got it.
This is what I have right now and the beginning space problem seems to be fixed.
I tried various versions of your ending capture code, but came up empty each time. I can send a bunch of the variations I tried if it will help. Can you help me with placement of the code you have written for the end of this preg_replace? (I like that validator regex site... it helped explain things a bit, but couldn't quite get it on there either)
I have:
/(^[A-z0-9]?|\s)(http|ftp|https)([\:\/\/])([^\\s]+)/
I tried replacing the last capturing pattern with your code, also tried to add it to what I already had in that capture pattern in the front and back, and none of it worked.
Thank you for your help!
Re: regex preg_replace help: URL to a link
Posted: Fri Mar 12, 2010 3:15 pm
by tr0gd0rr
The following is working for me on your example text:
Code:
(^|\s)(http|ftp|https|mailto)([\:\/\/])([a-zA-Z~!@$%&*()_+:?,./;'=#-]{2,}[a-zA-Z~@$%&*_+/'=])
I may have thrown you off in my previous post because I used back-ticks instead of
Re: regex preg_replace help: URL to a link
Posted: Sun Mar 14, 2010 4:11 pm
by ninethousandfeet
For some reason, that still won't work. When I use your most recent option, the entire comment does not appear. Is it something with the replace portion maybe? Or do you think it's something else?
Code:
$msg = preg_replace("/(^|\s)(http|ftp|https|mailto)([\:\/\/])([a-zA-Z~!@$%&*()_+:?,./;'=#-]{2,}[a-zA-Z~@$%&*_+/'=])/"," <a href=\"$2$3$4\" rel=\"nofollow\">$2$3$4</a> ",$msg);
Re: regex preg_replace help: URL to a link
Posted: Mon Mar 15, 2010 11:10 am
by tr0gd0rr
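Running your code I get the error `Warning: preg_replace() [function.preg-replace]: Unknown modifier ';'`. The pattern uses / as its delimiter, so the first unescaped slash in the fourth capturing pattern ends the regex and the `;` after it is read as a modifier. You need to escape the two slashes in the fourth capturing pattern.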
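As a small illustration of the delimiter issue, the sketch below compares two ways around it: escaping every literal slash, or picking a delimiter that the pattern does not contain. The shortened patterns and the sample text are assumptions for the example, not the pattern from the post above.

```php
<?php
$text = 'see http://example.com/a/b for details';

// Option 1: keep "/" as the delimiter and escape every literal slash.
$withSlashes = '/(https?):\/\/([\w.\/-]+)/';

// Option 2: use a delimiter the pattern never contains; no escaping needed.
$withHash = '#(https?)://([\w./-]+)#';

preg_match($withSlashes, $text, $a);
preg_match($withHash, $text, $b);

// Both patterns find the same URL.
echo $a[0], "\n";
echo $b[0], "\n";
```

The second form tends to be easier to read for URL patterns, which is presumably why the phpBB function quoted later in this thread uses `#` as its delimiter.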
Re: regex preg_replace help: URL to a link
Posted: Mon Mar 15, 2010 2:31 pm
by ninethousandfeet
I've added \ in front of the two / in the 4th capture.
I then experienced a problem with the link being cut off wherever a number appeared.
Example of this problem:
http://bit.ly/r 7tPR
So, I fixed this by changing the two a-zA-Z ... to ... A-z0-9
The comma problem is fixed. I'm sure this is very rare, but what can I add to the 1st capture to ignore any characters before the http|ftp... Or maybe a better way to put it is to ignore those characters and start the link at the http...
Example user input:
- (http://me.com)
- hi, go to-http://me.com
Thanks for your help with all of this, regex is a whole new world for me.
Cheers,
Brad
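One thing to watch with the `A-z` range used above: in ASCII there are six punctuation characters between `Z` and `a`, so `[A-z]` accepts more than letters. The short check below lists them; the variable names are only for illustration.

```php
<?php
// Collect every printable ASCII character that [A-z] accepts
// but [A-Za-z] does not.
$extra = [];
foreach (range(32, 126) as $code) {
    $ch = chr($code);
    if (preg_match('/^[A-z]$/', $ch) && !preg_match('/^[A-Za-z]$/', $ch)) {
        $extra[] = $ch;
    }
}

// Prints the six characters that sit between "Z" and "a" in ASCII.
echo implode(' ', $extra), "\n";
```

If only letters and digits are wanted, `[A-Za-z0-9]` is the safer spelling.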
Re: regex preg_replace help: URL to a link
Posted: Mon Mar 15, 2010 4:56 pm
by tr0gd0rr
Maybe use a \b instead of the first capturing pattern. \b indicates it must be a word break.
I think that \b is roughly equivalent to (^|[^\w]) in this case, except that \b is zero-width and does not consume the preceding character, so you can try that too. BTW, \w is equal to [a-zA-Z0-9_] and it could be used in the regexes above.
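A quick way to see the difference between the two is to run both against one of the sample inputs. This is only a sketch; the simplified URL pattern (ending in a letter, digit or slash) is an assumption for the example.

```php
<?php
// \w covers letters, digits AND the underscore.
$underscoreIsWord = preg_match('/^\w$/', '_') === 1;

// \b is zero-width: it only asserts a position, so the "(" in front of
// the URL is not part of the match.
preg_match('~\bhttps?://\S*[A-Za-z0-9/]~', '(http://me.com)', $m);

// (^|[^\w]) consumes one character, so the "(" ends up inside the match
// and would have to be put back in the replacement string.
preg_match('~(^|[^\w])https?://\S*[A-Za-z0-9/]~', '(http://me.com)', $n);

var_dump($underscoreIsWord);
echo $m[0], "\n";
echo $n[0], "\n";
```

Because \b consumes nothing, the replacement string does not need to re-insert whatever character came before the URL.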
Re: regex preg_replace help: URL to a link
Posted: Mon Mar 15, 2010 4:59 pm
by s.dot
I've used the one that came with phpbb a while back
Code:
function make_clickable($text)
{
    $ret = ' ' . $text;
    $ret = preg_replace("#(^|[\n ])([\w]+?://[^ \"\n\r\t<]*)#is", "\\1<a href=\"\\2\" target=\"_blank\">\\2</a>", $ret);
    $ret = preg_replace("#(^|[\n ])((www|ftp)\.[^ \"\t\n\r<]*)#is", "\\1<a href=\"http://\\2\" target=\"_blank\">\\2</a>", $ret);
    $ret = preg_replace("#(^|[\n ])([a-z0-9&\-_.]+?)@([\w\-]+\.([\w\-\.]+\.)*[\w]+)#i", "\\1<a href=\"mailto:\\2@\\3\">\\2@\\3</a>", $ret);
    $ret = substr($ret, 1);
    return $ret;
}
$text = make_clickable($text);
It has never failed me.
Re: regex preg_replace help: URL to a link
Posted: Mon Mar 15, 2010 8:33 pm
by ninethousandfeet
tr0gd0rr - that worked great. the only thing i can't figure out now is that when the user submits a new comment, i use javascript to display the new comment with older comments (just like a stream of comments on facebook for example).
the problem is that the punctuation before and after the link is part of the link. then if you click refresh, it is correct and the punctuation (,.() etc.) is not part of the link. any idea why this is happening? not a huge deal, but if it's fixable i'd love to fix it. i'll keep trying and post if i come across a solution. let me know if you can think of anything or if you need to see more of my code to help.
thank you!
s.dot - everything worked fine with your code except punctuation before and after would appear as part of the link the whole time. i like the cleanliness of the function. do you know how to overcome these problems? i.e. http://example.com, check it out ... the comma is included in the link, which causes a broken link.
Re: regex preg_replace help: URL to a link
Posted: Tue Mar 16, 2010 1:35 am
by s.dot
Just wanted to point out that it's not *my* code
I guess this behavior is because commas and periods are part of valid URLs. Although, I have used this function on a popular forum before and I've never run into any issues with users adding commas or periods or any other punctuation after posting URLs. In fact, I never knew this was an issue.
But since those characters are valid parts of URLs, you cannot ignore them. E.g. http://www.example.com/page/3,1,3, may be a valid URL.
However, if you wish to not include trailing punctuation characters, I don't know how to edit the regex to avoid them LOL, I'm admittedly not very good with regular expressions.
EDIT| And, that is weird that phpbb did not link the last comma in that URL I posted. Hmm, maybe the phpbb function I am using is outdated or I edited it at some point.
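One possible way to leave trailing punctuation out without complicating the regex is to match the URL greedily and then trim the end in a callback. The sketch below is not part of phpBB; the name `linkify_trimmed` and the list of characters treated as sentence punctuation are assumptions.

```php
<?php
// Sketch only: linkify_trimmed() is a hypothetical name, not part of phpBB.
function linkify_trimmed(string $text): string
{
    return preg_replace_callback(
        '~\b(?:https?|ftp)://[^\s<"]+~i',
        function (array $m): string {
            // Assumption: punctuation at the very end of the match belongs
            // to the sentence, not to the URL.
            $url  = rtrim($m[0], '.,;:!?)');
            // Whatever was trimmed is put back after the closing tag.
            $tail = substr($m[0], strlen($url));
            return '<a href="' . $url . '" rel="nofollow">' . $url . '</a>' . $tail;
        },
        $text
    );
}

echo linkify_trimmed('http://example.com, check it out'), "\n";
echo linkify_trimmed('see http://www.example.com/page/3,1,3, now'), "\n";
```

Commas inside the URL are kept and only the punctuation at the very end is moved outside the link. The trade-off is that a URL which really does end in one of those characters will be cut short.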
Re: regex preg_replace help: URL to a link
Posted: Tue Mar 16, 2010 1:42 am
by s.dot
Actually, here is the original function from the PHPBB 2.x forum code:
Code:
function make_clickable($text)
{
    // pad it with a space so we can match things at the start of the 1st line.
    $ret = " " . $text;

    // matches an "xxxx://yyyy" URL at the start of a line, or after a space.
    // xxxx can only be alpha characters.
    // yyyy is anything up to the first space, newline, or comma.
    $ret = preg_replace("#([\n ])([a-z]+?)://([^,\t \n\r]+)#i", "\\1<a href=\"\\2://\\3\" target=\"_blank\">\\2://\\3</a>", $ret);

    // matches a "www.xxxx.yyyy[/zzzz]" kinda lazy URL thing
    // Must contain at least 2 dots. xxxx contains either alphanum, or "-"
    // yyyy contains either alphanum, "-", or "."
    // zzzz is optional.. will contain everything up to the first space, newline, or comma.
    // This is slightly restrictive - it's not going to match stuff like "forums.foo.com"
    // This is to keep it from getting annoying and matching stuff that's not meant to be a link.
    $ret = preg_replace("#([\n ])www\.([a-z0-9\-]+)\.([a-z0-9\-.\~]+)((?:/[^,\t \n\r]*)?)#i", "\\1<a href=\"http://www.\\2.\\3\\4\" target=\"_blank\">www.\\2.\\3\\4</a>", $ret);

    // matches an email@domain type address at the start of a line, or after a space.
    // Note: Only the following chars are valid; alphanums, "-", "_" and or ".".
    $ret = preg_replace("#([\n ])([a-z0-9\-_.]+?)@([\w\-]+\.([\w\-\.]+\.)?[\w]+)#i", "\\1<a href=\"mailto:\\2@\\3\">\\2@\\3</a>", $ret);

    // Remove our padding..
    $ret = substr($ret, 1);
    return($ret);
}
This does not link commas at the end, but still does periods.
Re: regex preg_replace help: URL to a link
Posted: Tue Mar 16, 2010 11:40 am
by tr0gd0rr
Yah with regexes you can go as simple or as complicated as you want. For your purposes, sticking with a widely-used method such as the phpBB function should do fine.
Re: regex preg_replace help: URL to a link
Posted: Tue Mar 16, 2010 5:00 pm
by ninethousandfeet
I somewhat combined the function and the preg_replace that is working with the characters both in the beginning and end to get this:
Code:
function make_clickable($msg)
{
    $ret = ' ' . $msg;
    $ret = preg_replace("/(\b)(http|ftp|https|mailto)([\:\/\/])([A-z0-9~!@$%&*()_+:?,.\/;'=#-]{2,}[A-z0-9~@$%&*_+\/'=])/","<a href=\"$2$3$4\" rel=\"nofollow\">$2$3$4</a>",$ret);
    $ret = substr($ret, 1);
    return $ret;
}
$msg = make_clickable($msg);
The above code works great when the user reloads the browser. The only problem is that when a new comment is submitted and displayed immediately using js, the surrounding characters appear as part of the link. Any ideas how to make the link come out the same when the js loads the comment, so the user does not have to reload the page to see it displayed properly?
I can provide additional code if this isn't sufficient. Thanks for both of your help with this!
Re: regex preg_replace help: URL to a link
Posted: Tue Mar 16, 2010 6:15 pm
by tr0gd0rr
Not sure what you mean about the JS. Is there a JS-driven preview feature? Is there a JS function that does the same thing as the PHP function?