regex function to test full urls
I've got a string entered by users, which can contain <a> tags. I already do all kinds of error checking and just need code for the last bit.
I need to verify that the href="" attribute of each <a> tag begins with http://. Please help me with such a function.
Thanks
- Chris Corbyn
- Breakbeat Nuttzer
- Posts: 13098
- Joined: Wed Mar 24, 2004 7:57 am
- Location: Melbourne, Australia
Code: Select all
echo preg_replace('~<a\s+[^>]*href="(?!http://).*?>(.*?)</a>~is', '$1', $string);
EDIT | If it works, that would turn a string like:
Code: Select all
Go to <a class="foo" href="ftp://bad-site.tld">my ftp site</a> and download lots of bad things
into:
Code: Select all
Go to my ftp site and download lots of bad things
*ahem*
Generally, parsing HTML is a pain in the smurf; there are lots and lots of things that can go wrong. If your situation allows it, it is better to use a "safe" replacement like BBCode.
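If BBCode is not an option, one way to dodge some of the regex edge cases is to let a parser find the tags and only test the attribute values yourself. The following is a minimal sketch under assumptions, not anyone's posted solution: all_links_are_http() is a made-up helper name, it relies on PHP's DOM extension being installed, and it treats <a> tags with no href at all as harmless.

```php
<?php
// Sketch only: all_links_are_http() is a hypothetical helper, not code from this thread.
// Assumes PHP's DOM extension (ext-dom) is available.

function all_links_are_http(string $html): bool
{
    if (trim($html) === '') {
        return true; // nothing to check, and loadHTML() rejects empty input
    }

    $doc = new DOMDocument();

    // User-supplied markup is often malformed, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('a') as $a) {
        // Assumption: anchors without an href are not links and are skipped.
        if (!$a->hasAttribute('href')) {
            continue;
        }
        $href = trim($a->getAttribute('href'));
        // Case-insensitive prefix test: the value must start with http://
        if (stripos($href, 'http://') !== 0) {
            return false;
        }
    }

    return true;
}
```

Because the parser normalises quoting and whitespace around the attribute, the same check applies whether the user typed href="...", href='...' or href= "...". It only verifies; stripping or rewriting the offending tags would need more code.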
Code: Select all
$string = 'Go to <a class="foo" href="ftp://bad-site.tld"><a href="ftp://bad-site.tld">my ftp site</ a></a> and download lots of bad things';
Ok smart-ass 
Code: Select all
<?php
$string = 'Go to <a class="foo" href="ftp://bad-site.tld"><a href="ftp://bad-site.tld">my ftp site</ a></a> and download lots of bad things';
// Keep stripping non-http links until none match, so nested <a> tags get unwrapped too.
$pattern = '~<a\s+[^>]*?href="(?!http://).*?>(.*?)</\s*a>~is';
while (preg_match($pattern, $string)) {
    $string = preg_replace($pattern, '$1', $string);
}
echo $string;
I like a game when I see one
Next!
Code: Select all
<?php
// Same loop, different input. Note the space in href= "..." on the inner link:
// the pattern requires a literal href=" and so leaves that link in place.
$string = 'Go to <a class="foo" href="ftp://bad-site.tld"><a href= "ftp://bad-site.tld">my ftp site</ a></a> and download lots of bad things';
$pattern = '~<a\s+[^>]*?href="(?!http://).*?>(.*?)</\s*a>~is';
while (preg_match($pattern, $string)) {
    $string = preg_replace($pattern, '$1', $string);
}
echo $string;
?>
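It also needs to cope with different quoting around the href value (single quotes, or none at all), and it also needs checks against javascript.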
Also, in the specific case with ftp, browsers, being the smart things that they are, will automatically use the FTP protocol with URLs like ftp.opera.com (but that's just trivia, nothing to do with the OP).
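For the javascript check and the scheme-less ftp.opera.com case mentioned above, an allowlist on the URL scheme tends to be easier to reason about than a blocklist. Here is a rough sketch, assuming the href value has already been extracted from the tag; is_allowed_url() and $allowedSchemes are hypothetical names, not part of any code posted here. Note that it only looks at the scheme, so it is a little looser than a strict "starts with http://" test.

```php
<?php
// Sketch only: is_allowed_url() and $allowedSchemes are hypothetical names, not from this thread.

function is_allowed_url(string $url, array $allowedSchemes = ['http']): bool
{
    // parse_url() gives null when there is no scheme and false on a seriously malformed URL.
    $scheme = parse_url(trim($url), PHP_URL_SCHEME);
    if (!is_string($scheme)) {
        return false;
    }

    // Schemes are case-insensitive, so compare in lower case.
    return in_array(strtolower($scheme), $allowedSchemes, true);
}
```

With the default allowlist, http://example.com/page is accepted, while javascript:alert(1), ftp://bad-site.tld and a bare ftp.opera.com are all rejected. Passing ['http', 'https'] as the second argument would let https links through as well.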
Actually, I have a similar problem with "native" HTML. I'm using an old piece of code written for the previous version of the software I'm rewriting, but it's not without problems. I've decided to try HTML Purifier, but haven't found the time for it yet.