Any questions involving matching text strings to patterns - the pattern is called a "regular expression."
Matt Phelps
Forum Commoner
Posts: 82 Joined: Fri Jun 14, 2002 2:05 pm
by Matt Phelps » Mon Apr 09, 2007 1:08 pm
Hi - need some help with some regex bits. I have to admit I try but I really struggle with this stuff.
I have...
Code:
<img src="http://stuff.tv/csfiles/blogs/future/xbox-360-elite.jpg" height="221" width="400" border="1" hspace="4" alt="Xbox-360-Elite">
and I want to extract the "http://stuff.tv/csfiles/blogs/future/xbox-360-elite.jpg" bit (without the quote marks).
I tried...
Code:
preg_match("/http:\/\/(.*)\.(.*)/i", $myurl, $matches); //get the image at $matches[0]
but that gives me the url with the opening http and jpg chopped off. Can anyone help?
Kieran Huggins
DevNet Master
Posts: 3635 Joined: Wed Dec 06, 2006 4:14 pm
Location: Toronto, Canada
by Kieran Huggins » Mon Apr 09, 2007 1:16 pm
Code:
preg_match("#\ssrc="(.*?)"\s#i", $myurl, $matches);
The parentheses are what "capture" part of the match - wrap them around the part you want back.
also:
php manual wrote: $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.
You'll find the URL at $matches[1], as $matches[0] contains the whole matched string.
play here:
http://www.cuneytyilmaz.com/prog/jrx/
by Matt Phelps » Mon Apr 09, 2007 1:19 pm
thanks for the swift response!
Unfortunately I get
Code:
Parse error: syntax error, unexpected '(' at that line of code.
by Kieran Huggins » Mon Apr 09, 2007 1:21 pm
Code:
preg_match('#\ssrc="(.*?)"\s#i', $myurl, $matches);
oops - wrong quotes
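For reference, here is a minimal sketch of the corrected call run against the tag from the first post. The first version failed because the pattern sat in a double-quoted PHP string, so the inner double quotes ended the string early. The variable name $myurl follows the thread; the rest of the script is only an assumption for illustration.

```php
<?php
// The tag from the first post; $myurl is the variable name used in the thread.
$myurl = '<img src="http://stuff.tv/csfiles/blogs/future/xbox-360-elite.jpg" height="221" width="400" border="1" hspace="4" alt="Xbox-360-Elite">';

// Single quotes around the pattern, so the double quotes inside it
// do not terminate the PHP string.
if (preg_match('#\ssrc="(.*?)"\s#i', $myurl, $matches)) {
    // $matches[0] is the whole match: leading space, src=, quotes, trailing space.
    // $matches[1] is only what the parentheses captured: the URL itself.
    $url = $matches[1];
    echo $url, "\n";
}
```

Reading $matches[1] instead of $matches[0] is what drops the src= and the quote marks.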
by Matt Phelps » Mon Apr 09, 2007 1:23 pm
That's much closer! Now I get...
Code:
src="http://stuff.tv/csfiles/blogs/future/DSC_0206.JPG"
Can the quotes and the 'src=' be excluded as well or maybe I have to just trim that off with a string function?
by Kieran Huggins » Mon Apr 09, 2007 1:25 pm
Kieran Huggins wrote: You'll find the URL at $matches[1], as $matches[0] contains the whole matched string.
by Matt Phelps » Mon Apr 09, 2007 1:27 pm
Brilliant thanks so much.
Gurzi
Forum Commoner
Posts: 27 Joined: Wed Aug 02, 2006 4:04 pm
by Gurzi » Fri Apr 13, 2007 7:59 pm
Sorry, I have one question.
I'm learning regex: why did you use (.*?)
Why * and ? together?
Thanks
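A short sketch of the difference, using the tag from the first post. On its own, * is "greedy" and takes as much as it can; adding ? makes it "lazy", so it takes as little as it can. The variable names $tag, $greedy and $lazy are made up for this example.

```php
<?php
$tag = '<img src="http://stuff.tv/csfiles/blogs/future/xbox-360-elite.jpg" height="221" width="400" border="1" hspace="4" alt="Xbox-360-Elite">';

// Greedy: (.*) runs on to the LAST quote that is followed by whitespace,
// so it swallows several of the following attributes as well.
preg_match('#\ssrc="(.*)"\s#i', $tag, $greedy);

// Lazy: (.*?) stops at the FIRST quote that is followed by whitespace,
// which is the one closing the src value.
preg_match('#\ssrc="(.*?)"\s#i', $tag, $lazy);

echo $greedy[1], "\n"; // http://stuff.tv/...elite.jpg" height="221" width="400" border="1" hspace="4
echo $lazy[1], "\n";   // http://stuff.tv/csfiles/blogs/future/xbox-360-elite.jpg
```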
nickvd
DevNet Resident
Posts: 1027 Joined: Thu Mar 10, 2005 5:27 pm
Location: Southern Ontario
by nickvd » Fri Apr 13, 2007 9:34 pm
That regex won't match this: <img src='hotstuff.jpg' alt=''/>...
Something like #src=([\'"])([^\1]*?)\1# would allow either ' or ", though I have a feeling it could be improved.. any help?
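One possible tidy-up, offered as a sketch rather than a definitive answer. As far as I know, PCRE does not treat \1 as a backreference inside a character class, so [^\1] is not really excluding the opening quote; the lazy quantifier followed by \1 is what does the work. That means a plain (.*?) should be enough. The helper name extract_src is hypothetical, and this is still a regex, not an HTML parser.

```php
<?php
// Hypothetical helper (not from the thread): returns the src value, or null if none is found.
function extract_src($html)
{
    // (["\']) captures whichever quote opens the value,
    // (.*?)   lazily captures the value itself,
    // \1      requires the same quote character to close it.
    if (preg_match('#\bsrc\s*=\s*(["\'])(.*?)\1#i', $html, $m)) {
        return $m[2]; // group 1 is the quote, group 2 is the URL
    }
    return null;
}

echo extract_src("<img src='hotstuff.jpg' alt=''/>"), "\n";
echo extract_src('<img src="http://stuff.tv/csfiles/blogs/future/xbox-360-elite.jpg" height="221">'), "\n";
```

Because the quote is now group 1, the URL moves to $m[2]. The pattern also no longer needs whitespace after the closing quote, so a tag like <img src="x.jpg"> matches too.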