
How to filter url out of a string?

Posted: Fri Oct 02, 2009 7:43 pm
by silverbullet
Hello, I have a string of text containing spaces and 3 URLs. The 3 URLs are different in every string I have, and I want a piece of code that extracts the 2nd URL from such a string (each URL starts with http or https).

I know of the preg_match function but I am unable to get this working. I am new to php.

Please help me, any ideas welcomed dearly

Thanks

Re: How to filter url out of a string?

Posted: Fri Oct 02, 2009 7:47 pm
by jackpf


preg_match('/(https?:\/\/.*?)(\s|$)/i', $str, $matches);
Should do the trick.

Re: How to filter url out of a string?

Posted: Sat Oct 03, 2009 11:45 am
by silverbullet
Hello, thanks so much for your advice, but the preg_match you gave me only picks up the last URL, and I need to pick up the second URL in the string.

Here's an example string:

submitted by <a href="http://www.example.com/hoopy/timetowork"> timetowork </a> <br/> <a href="http://wantthis.com/example.php?ig=666">[link]</a> <a href="http://www.anotherexample.com/news/comm ... mple_page/"

The preg_match you gave me only picks up the last URL (possibly because it's the only one in quotation marks). The link I want has [link] after it and is always second in the string.

Would dearly appreciate your help.

Thank you.
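
The detail that the wanted link is always followed by [link] can itself serve as the anchor for a pattern. Below is a minimal sketch of that idea, assuming the markup always has the form href="...">[link] with nothing between the closing quote and the link text. The variable names $str, $m, and $linkUrl are placeholders, and the sample string is shortened from the one above.

```php
<?php
// Shortened sample string based on the one in this post.
$str = 'submitted by <a href="http://www.example.com/hoopy/timetowork"> timetowork </a> <br/> '
     . '<a href="http://wantthis.com/example.php?ig=666">[link]</a>';

// Capture the href value of the anchor whose text starts with [link].
// Assumption: the closing quote is followed directly by >[link].
if (preg_match('~href="([^"]+)">\[link\]~i', $str, $m)) {
    $linkUrl = $m[1];   // capture group 1: the URL without quotes
} else {
    $linkUrl = null;    // no anchor followed by [link] was found
}

echo $linkUrl, "\n";
```

This approach does not depend on the link being second in the string, only on the [link] text following it.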

Re: How to filter url out of a string?

Posted: Sat Oct 03, 2009 11:48 am
by jackpf
Oh yeah, my bad. I can't test anything atm (not at home), but try this:


preg_match_all('/(https?:\/\/.*?)(\s|$)/i', $str, $matches);
 
print_r($matches);
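
For reference, here is a rough sketch of how the array filled by preg_match_all can be indexed to get a single URL. It is untested against real feed data and assumes a URL ends at the first whitespace, double quote, or angle bracket; that character class is an assumption, not part of the pattern above. The sample string is shortened from the one posted earlier, and $urls and $second are placeholder names.

```php
<?php
// Shortened sample string based on the one posted earlier in the thread.
$str = 'submitted by <a href="http://www.example.com/hoopy/timetowork"> timetowork </a> <br/> '
     . '<a href="http://wantthis.com/example.php?ig=666">[link]</a>';

// Assumption: a URL stops at whitespace, a double quote, or an angle bracket,
// so trailing markup such as "> is not part of the match.
preg_match_all('~https?://[^\s"<>]+~i', $str, $matches);

// With the default PREG_PATTERN_ORDER flag, $matches[0] lists every full match in order.
$urls = $matches[0];

// Index 1 is the second URL; null when fewer than two URLs were found.
$second = isset($urls[1]) ? $urls[1] : null;

echo $second, "\n";
```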

Re: How to filter url out of a string?

Posted: Sat Oct 03, 2009 12:32 pm
by silverbullet
Hello that works better but it prints all the urls, like this:

Array ( [0] => Array ( [0] => http://www.example.com/blahblah" [2] => http://www.anotherexample.com/hoopy/blahblah"> [3] => http://siteiwant.co.uk/archives/496242">[link] [4] =>
etc etc etc etc..... ....... [425 ) [2] => Array ( [0] => [1] => [2] => [3] => [4] => ) )

I would like to extract the link [3] (without the end bit ">[link] which appears on every one).

Hope you can help with that.

Thanks very much.

Re: How to filter url out of a string?

Posted: Sat Oct 03, 2009 1:08 pm
by John Cartwright
I prefer to capture everything within the quotes instead. Since you have both encoded and decoded HTML entities, though, it is best to convert everything to one or the other to simplify the regex. Something like:


$source = 'submitted by <a href="http://www.example.com/hoopy/timetowork"> timetowork </a> <br/> <a href="http://wantthis.com/example.php?ig=666">[link]</a> <a href="http://www.anotherexample.com/news/comments/2tudjg/example_page/"';
 
preg_match_all('#a href="([^"]+)#is', html_entity_decode($source), $matches);
 
echo '<pre>';
print_r($matches);
 

Output:

Array
(
    [0] => Array
        (
            [0] => a href="http://www.example.com/hoopy/timetowork
            [1] => a href="http://wantthis.com/example.php?ig=666
            [2] => a href="http://www.anotherexample.com/news/comments/2tudjg/example_page/
        )
 
    [1] => Array
        (
            [0] => http://www.example.com/hoopy/timetowork
            [1] => http://wantthis.com/example.php?ig=666
            [2] => http://www.anotherexample.com/news/comm ... mple_page/
        )
 
)
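
To get just the second URL out of that result, the capture-group array can be indexed directly. This is a small sketch built on the same pattern and sample string; $secondUrl is a placeholder name, and the isset() guard is an added precaution for strings that contain fewer than two links.

```php
<?php
// Sample string from this post.
$source = 'submitted by <a href="http://www.example.com/hoopy/timetowork"> timetowork </a> <br/> <a href="http://wantthis.com/example.php?ig=666">[link]</a> <a href="http://www.anotherexample.com/news/comments/2tudjg/example_page/"';

preg_match_all('#a href="([^"]+)#is', html_entity_decode($source), $matches);

// $matches[1] holds capture group 1 of every match, i.e. the bare URLs.
// Arrays are zero-indexed, so index 1 is the second URL.
$secondUrl = isset($matches[1][1]) ? $matches[1][1] : null;

if ($secondUrl !== null) {
    echo $secondUrl, "\n";
} else {
    echo "Fewer than two links found\n";
}
```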

Re: How to filter url out of a string?

Posted: Sat Oct 03, 2009 2:19 pm
by silverbullet
Thank you both very much.

I've integrated it into my php code and it works great.

I learnt a lot. :)