Getting an img src out of an A HREF

For any questions about matching text strings against patterns. Such a pattern is called a "regular expression."

Matt Phelps
Forum Commoner
Posts: 82
Joined: Fri Jun 14, 2002 2:05 pm

Post by Matt Phelps »

Hi - need some help with some regex bits. I have to admit I try but I really struggle with this stuff.

I have...

Code: Select all

<img src="http://stuff.tv/csfiles/blogs/future/xbox-360-elite.jpg" height="221" width="400" border="1" hspace="4" alt="Xbox-360-Elite">
and I want to extract the "http://stuff.tv/csfiles/blogs/future/xbox-360-elite.jpg" bit (without the quote marks).

I tried...

Code: Select all

preg_match("/http:\/\/(.*)\.(.*)/i", $myurl, $matches); //get the image at $matches[0]
but that gives me the url with the opening http and jpg chopped off. Can anyone help?
Kieran Huggins
DevNet Master
Posts: 3635
Joined: Wed Dec 06, 2006 4:14 pm
Location: Toronto, Canada

Post by Kieran Huggins »

Code: Select all

preg_match("#\ssrc="(.*?)"\s#i", $myurl, $matches);
The parentheses are what "capture" part of the match - wrap them around the part you want back.

also:
php manual wrote:$matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.
You'll find the URL at $matches[1], as $matches[0] contains the whole matched string.

play here: http://www.cuneytyilmaz.com/prog/jrx/
Last edited by Kieran Huggins on Mon Apr 09, 2007 1:20 pm, edited 2 times in total.
Matt Phelps
Forum Commoner
Posts: 82
Joined: Fri Jun 14, 2002 2:05 pm

Post by Matt Phelps »

thanks for the swift response!

Unfortunately I get

Code: Select all

Parse error: syntax error, unexpected '('
at that line of code.
Kieran Huggins
DevNet Master
Posts: 3635
Joined: Wed Dec 06, 2006 4:14 pm
Location: Toronto, Canada

Post by Kieran Huggins »

Code: Select all

preg_match('#\ssrc="(.*?)"\s#i', $myurl, $matches);
oops - wrong quotes
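
The parse error above comes from PHP's string syntax rather than from the regex: inside a double-quoted PHP string, the first unescaped " ends the string, so the pattern was cut off right after src=. Here is a minimal sketch of the two ways to quote the same pattern, using the tag from the first post as sample input. $myurl and $matches are the names used in the thread; $single and $double are names made up for this illustration.

```php
<?php
// Sample input: the <img> tag from the first post in this thread.
$myurl = '<img src="http://stuff.tv/csfiles/blogs/future/xbox-360-elite.jpg" height="221" width="400" border="1" hspace="4" alt="Xbox-360-Elite">';

// Option 1: single-quoted PHP string, so the double quotes inside the
// pattern need no escaping and \s is passed through to PCRE unchanged.
$single = '#\ssrc="(.*?)"\s#i';

// Option 2: double-quoted PHP string, with each inner double quote
// escaped as \" and each backslash doubled.
$double = "#\\ssrc=\"(.*?)\"\\s#i";

// Both literals should produce the same pattern string.
var_dump($single === $double);

if (preg_match($single, $myurl, $matches)) {
    // $matches[1] holds what the parentheses captured: the bare URL.
    echo $matches[1], "\n";
}
```

Single quotes are usually the less error-prone choice for regex patterns in PHP, since backslashes and double quotes can be written as they appear in the pattern.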
Matt Phelps
Forum Commoner
Posts: 82
Joined: Fri Jun 14, 2002 2:05 pm

Post by Matt Phelps »

That's much closer! Now I get...

Code: Select all

src="http://stuff.tv/csfiles/blogs/future/DSC_0206.JPG"
Can the quotes and the 'src=' be excluded as well, or do I just have to trim them off with a string function?
Kieran Huggins
DevNet Master
Posts: 3635
Joined: Wed Dec 06, 2006 4:14 pm
Location: Toronto, Canada

Post by Kieran Huggins »

Kieran Huggins wrote:You'll find the URL at $matches[1], as $matches[0] contains the whole matched string.
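
To make the difference between the two array entries concrete, here is a small sketch. The URL is the one from the post above; the rest of the tag (the height and width attributes) is assumed for illustration, since the full tag was not shown.

```php
<?php
// Assumed sample tag: only the src value comes from the thread.
$myurl = '<img src="http://stuff.tv/csfiles/blogs/future/DSC_0206.JPG" height="221" width="400">';

if (preg_match('#\ssrc="(.*?)"\s#i', $myurl, $matches)) {
    // $matches[0]: everything the whole pattern matched, including the
    // surrounding whitespace, the src= prefix and both quote marks.
    var_dump($matches[0]);

    // $matches[1]: only what the first pair of parentheses captured,
    // which is the URL without quotes.
    var_dump($matches[1]);
}
```

No extra string trimming should be needed: reading index 1 instead of index 0 already excludes the src= and the quotes.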
Matt Phelps
Forum Commoner
Posts: 82
Joined: Fri Jun 14, 2002 2:05 pm

Post by Matt Phelps »

Brilliant thanks so much. :)
Gurzi
Forum Commoner
Posts: 27
Joined: Wed Aug 02, 2006 4:04 pm

Post by Gurzi »

Sorry, I have one question.

I'm learning regex. Why did you use (.*?)

Why * and ? together?

Thanks :D
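
For readers with the same question: a ? placed directly after a quantifier such as * makes it "lazy" (non-greedy), so it matches as little as possible instead of as much as possible. A short sketch of the difference follows; the tag and the variable names $tag, $greedy and $lazy are made up for illustration.

```php
<?php
// Hypothetical tag with two quoted attributes.
$tag = '<img src="a.jpg" alt="A picture">';

// Greedy: .* grabs as much as it can, so the closing " in the pattern
// ends up matching the LAST double quote in the string.
preg_match('#src="(.*)"#', $tag, $greedy);

// Lazy: .*? grabs as little as it can, so the closing " in the pattern
// matches the FIRST double quote after src=".
preg_match('#src="(.*?)"#', $tag, $lazy);

echo $greedy[1], "\n"; // a.jpg" alt="A picture
echo $lazy[1], "\n";   // a.jpg
```

An alternative that avoids the question entirely is a negated character class such as ([^"]*), which cannot run past the closing quote.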
nickvd
DevNet Resident
Posts: 1027
Joined: Thu Mar 10, 2005 5:27 pm
Location: Southern Ontario

Post by nickvd »

That regex won't match this: <img src='hotstuff.jpg' alt=''/>...


Something like #src=([\'"])([^\1]*?)\1# would allow either ' or ", though I have a feeling it could be improved.. any help? :)
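
One possible tidy-up of the pattern above, offered as a sketch rather than a definitive answer. In PCRE a \1 inside a character class is not treated as a backreference, so [^\1] does not mean "anything except the opening quote"; the lazy (.*?) followed by \1 outside the class does that job on its own. The function name extract_img_src is hypothetical and not from the thread.

```php
<?php
// Hypothetical helper: returns the src value of the first match, or null.
function extract_img_src(string $html): ?string
{
    // Group 1 remembers which quote character opened the value, and \1
    // requires the same character to close it. Group 2 lazily captures
    // everything in between.
    $pattern = '#\ssrc=(["\'])(.*?)\1#i';

    return preg_match($pattern, $html, $m) ? $m[2] : null;
}

// Single-quoted attribute, as in the post above.
var_dump(extract_img_src("<img src='hotstuff.jpg' alt=''/>"));

// Double-quoted attribute, as in the first post.
var_dump(extract_img_src('<img src="http://stuff.tv/csfiles/blogs/future/xbox-360-elite.jpg" height="221">'));

// No src attribute at all.
var_dump(extract_img_src('<p>no image here</p>'));
```

Note that the URL moves to index 2 here, because the quote character now occupies the first capture group. This still does not handle unquoted attribute values or whitespace around the equals sign; for arbitrary HTML, an HTML parser such as PHP's DOMDocument is generally more robust than a regex.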