Getting an img src out of a A HREF

Posted: Mon Apr 09, 2007 1:08 pm
by Matt Phelps
Hi - need some help with some regex bits. I have to admit I try but I really struggle with this stuff.

I have...

Code:

<img src="http://stuff.tv/csfiles/blogs/future/xbox-360-elite.jpg" height="221" width="400" border="1" hspace="4" alt="Xbox-360-Elite">
and I want to extract the "http://stuff.tv/csfiles/blogs/future/xbox-360-elite.jpg" bit (without the quote marks).

I tried...

Code:

preg_match("/http:\/\/(.*)\.(.*)/i", $myurl, $matches); //get the image at $matches[0]
but that gives me the url with the opening http and jpg chopped off. Can anyone help?

Posted: Mon Apr 09, 2007 1:16 pm
by Kieran Huggins

Code:

preg_match("#\ssrc="(.*?)"\s#i", $myurl, $matches);
The parentheses are what "capture" part of the match - put them around the part you want back.

also:
php manual wrote: $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.
You'll find the URL at $matches[1], as $matches[0] contains the whole matched string.

play here: http://www.cuneytyilmaz.com/prog/jrx/

Posted: Mon Apr 09, 2007 1:19 pm
by Matt Phelps
thanks for the swift response!

Unfortunately I get

Code:

Parse error: syntax error, unexpected '('
at that line of code.

Posted: Mon Apr 09, 2007 1:21 pm
by Kieran Huggins

Code:

preg_match('#\ssrc="(.*?)"\s#i', $myurl, $matches);
oops - wrong quotes. The pattern contains double quotes, so it needs single quotes around it.
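
For anyone hitting the same parse error, here is a minimal sketch of the two ways to quote that pattern in PHP. The variable names `$single` and `$double` are only for illustration, and the sample tag is a shortened copy of the one in the first post.

```php
<?php
// Sample input: shortened version of the tag from the first post.
$myurl = '<img src="http://stuff.tv/csfiles/blogs/future/xbox-360-elite.jpg" height="221" width="400">';

// Option 1: single-quoted PHP string, so the inner double quotes need no escaping.
$single = '#\ssrc="(.*?)"\s#i';

// Option 2: double-quoted PHP string with the inner double quotes escaped.
// "\s" is not a PHP escape sequence, so the backslash reaches the regex engine intact.
$double = "#\ssrc=\"(.*?)\"\s#i";

// Both literals should produce exactly the same pattern text.
var_dump($single === $double);

// preg_match() returns 1 when the pattern matches.
$found = preg_match($single, $myurl, $matches);
var_dump($found);
```

The single-quoted form is usually easier to read for regexes, since there is less to escape.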

Posted: Mon Apr 09, 2007 1:23 pm
by Matt Phelps
That's much closer! Now I get...

Code:

src="http://stuff.tv/csfiles/blogs/future/DSC_0206.JPG"
Can the quotes and the 'src=' be excluded as well or maybe I have to just trim that off with a string function?

Posted: Mon Apr 09, 2007 1:25 pm
by Kieran Huggins
Kieran Huggins wrote: You'll find the URL at $matches[1], as $matches[0] contains the whole matched string.
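
To make the difference between the two indexes concrete, here is a small sketch using the tag from the first post as input. The names `$whole` and `$url` are just illustrative.

```php
<?php
// Sample input: the tag from the first post.
$myurl = '<img src="http://stuff.tv/csfiles/blogs/future/xbox-360-elite.jpg" height="221" width="400" border="1" hspace="4" alt="Xbox-360-Elite">';

if (preg_match('#\ssrc="(.*?)"\s#i', $myurl, $matches)) {
    // $matches[0]: everything the pattern matched, including the
    // surrounding whitespace, the src= and the quotes.
    $whole = $matches[0];

    // $matches[1]: only what the first pair of parentheses captured.
    $url = $matches[1];

    echo $url, "\n";
}
```

So no extra string trimming should be needed; reading index 1 instead of index 0 is enough.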

Posted: Mon Apr 09, 2007 1:27 pm
by Matt Phelps
Brilliant thanks so much. :)

Posted: Fri Apr 13, 2007 7:59 pm
by Gurzi
Sorry, I have one question.

I'm learning regex. Why did you use (.*?)

Why * and ? together?

Thanks :D
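
A short sketch of what the extra ? does, for readers with the same question: on its own, * is greedy and takes as much as it can, while *? is lazy and takes as little as it can. The input `$tag` is a made-up example, not from the posts above.

```php
<?php
// Hypothetical input with two quoted attributes on one line.
$tag = '<img src="a.jpg" alt="photo" >';

// Greedy: .* grabs as much as it can, so the capture runs on to the
// LAST quote that is followed by whitespace.
preg_match('#\ssrc="(.*)"\s#i', $tag, $greedy);

// Lazy: .*? grabs as little as it can, so the capture stops at the
// FIRST quote that is followed by whitespace.
preg_match('#\ssrc="(.*?)"\s#i', $tag, $lazy);

echo $greedy[1], "\n"; // a.jpg" alt="photo
echo $lazy[1], "\n";   // a.jpg
```

With the greedy version the capture swallows the alt attribute as well, which is why the lazy form is used for pulling out a single attribute value.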

Posted: Fri Apr 13, 2007 9:34 pm
by nickvd
That regex won't match this: <img src='hotstuff.jpg' alt=''/>...


Something like #src=([\'"])([^\1]*?)\1# would allow either ' or ", though I have a feeling it could be improved.. any help? :)
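
One possible tidy-up, offered as a sketch rather than a definitive answer: as far as I know, PCRE does not treat \1 inside a character class as a backreference, so [^\1] is not really excluding the opening quote. A lazy (.*?) followed by \1 seems to do the job on its own. The helper name `extract_src()` is hypothetical, and the URL ends up in group 2 because group 1 holds the quote character.

```php
<?php
// extract_src() is a hypothetical helper name, not something from the thread.
function extract_src(string $html): ?string
{
    // Group 1 remembers which quote character opened the value;
    // \1 then requires the same character to close it.
    if (preg_match('#\ssrc=(["\'])(.*?)\1#i', $html, $matches)) {
        return $matches[2]; // group 2 is the value without the quotes
    }
    return null; // no quoted src attribute found
}

echo extract_src("<img src='hotstuff.jpg' alt=''/>"), "\n";
echo extract_src('<img src="http://stuff.tv/csfiles/blogs/future/xbox-360-elite.jpg" height="221">'), "\n";
```

This still assumes the attribute value is quoted and sits on one line; for arbitrary HTML a real parser such as DOMDocument is likely the safer route.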