URL hyperlinking

Posted: Tue Sep 11, 2007 12:59 pm
by mrkite
Here's a function I wrote that will hyperlink most URLs found in text (it doesn't handle ccTLDs).

Code:

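function markup($text)
{
        // Generic TLDs only; country-code TLDs are deliberately left out.
        $tlds="biz|com|edu|gov|info|mil|mobi|net|org";
        // Anonymous callback (PHP 5.3+): a named function declared here would be
        // redeclared, a fatal error, the second time markup() is called.
        $fixurl=function($matches)
        {
                $url=$matches[0];
                // Group 3 captures the '@', so a hit there means an email address.
                if (isset($matches[3]) && $matches[3]=='@')
                        return "<a href=\"mailto:{$matches[2]}\">{$matches[0]}</a>";
                // Add a scheme unless the match already starts with one.
                if (strpos($url,"http://")!==0) $url="http://$url";
                return "<a href=\"$url\">{$matches[0]}</a>";
        };
        $scheme="(?:http://|mailto:)";
        $user="(?:[a-zA-Z0-9_.]+)";
        $hostname="(?:[a-zA-Z0-9]+\.)+";
        // Path and query must end in an alphanumeric (or & = - for queries),
        // which keeps trailing punctuation such as ")" or "." out of the link.
        $path="(?:/([a-zA-Z0-9$\-_.+!*'(),;:@&=%]*[a-zA-Z0-9])?)*";
        $query="(?:\?([a-zA-Z0-9$\-_.+!*'(),;:@&=%]*[a-zA-Z0-9&=\-])?)?";
        // First alternative matches email addresses, second matches URLs.
        return preg_replace_callback(
"{($scheme?($user(@)$hostname(?:$tlds))|$scheme?$hostname(?:$tlds)$path$query)}",
$fixurl,$text);
}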

// example usage:
echo markup("This is a test.com.  Notice it doesn't include (irritating.com/closing?parens), or require  www or http prefixes. Email test bob@nowhere.com. ");
It was designed to autolink URLs for a newspaper print conversion. In articles, they almost always write "blah.com/sports": usually without the www hostname, and certainly never with a scheme.

Posted: Sat Sep 15, 2007 1:17 pm
by s.dot
A couple thoughts

1) You should make requiring the www prefix an option, as many people would only want actual links hyperlinked.
2) There are many more TLDs than the ones you included in your $tlds: .ca, .it, .co.uk, and so on.
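
Both suggestions could be sketched roughly as below. This is only an illustration, not mrkite's function: linkUrls(), $requirePrefix and $extraTlds are made-up names, email handling is left out, the input is assumed to be text that is already safe to emit as HTML, and the closure syntax assumes PHP 5.3 or later.

```php
<?php
// Hypothetical sketch: linkUrls(), $requirePrefix and $extraTlds are
// illustrative names, not part of the function posted above.
function linkUrls($text, $requirePrefix = false, $extraTlds = array('ca', 'it', 'co.uk'))
{
    $tlds = array_merge(
        array('biz', 'com', 'edu', 'gov', 'info', 'mil', 'mobi', 'net', 'org'),
        $extraTlds
    );
    // Try longer entries first so "co.uk" wins over shorter alternatives.
    usort($tlds, function ($a, $b) { return strlen($b) - strlen($a); });
    // Escape each entry so the dot in "co.uk" is matched literally.
    $tldPattern = implode('|', array_map(function ($t) {
        return preg_quote($t, '~');
    }, $tlds));

    // The scheme/www prefix is mandatory only when $requirePrefix is true.
    $prefix = '(?:https?://|www\.)' . ($requirePrefix ? '' : '?');
    $host = '(?:[a-zA-Z0-9-]+\.)+';
    // Optional path that must end in an alphanumeric or a slash, so trailing
    // punctuation such as ")" or "," stays outside the link.
    $path = '(?:/(?:[^\s<>"]*[a-zA-Z0-9/])?)?';

    // The lookbehind stops a match from starting mid-word or right after "@",
    // which is why email addresses are left untouched by this sketch.
    $pattern = '~(?<![a-zA-Z0-9@.-])' . $prefix . $host
             . '(?:' . $tldPattern . ')\b' . $path . '~';

    return preg_replace_callback($pattern, function ($m) {
        $url = $m[0];
        // Prepend a scheme for the href when the text did not include one.
        $href = preg_match('~^https?://~', $url) ? $url : "http://$url";
        return '<a href="' . $href . '">' . $url . '</a>';
    }, $text);
}
```

Calling linkUrls($text, true) links only text that starts with http://, https:// or www., while the default keeps the original "bare domain" behaviour. Short ccTLDs such as .it make false positives more likely in running prose, so the extra list is probably best kept small and chosen per site.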