regex to get linked images

Any questions involving matching text strings to patterns - the pattern is called a "regular expression."

GeXus
Forum Regular
Posts: 631
Joined: Sat Mar 11, 2006 8:59 am

Post by GeXus »

I want to parse the page at a URL and, for images that are wrapped in a link to another image, get that link's image URL. For example:

// This would not be grabbed
<img src="image.jpg"/>

// This would not be grabbed
<a href="http://www.abc.com"><img src="image.jpg"/></a>

// This would be - it would return 'http://www.abc.com/image_full.jpg'
<a href="http://www.abc.com/image_full.jpg"><img src="image.jpg"/></a>

Would anyone be able to help me out with how I would do this? I really have no clue! :)

Thanks a lot!
Benjamin
Site Administrator
Posts: 6935
Joined: Sun May 19, 2002 10:24 pm

Post by Benjamin »

$link = '<a href="http://www.abc.com/image_full.jpg"><img src="image.jpg"/></a>';

if (preg_match_all('#<\s{0,2}a\s{1,3}.*?href\s{1,3}=\s{1,3}[\'"]{1}http:.*?[\'"]{1}.*?>\s{1,3}(<\s{1,3}img\s{1,3}.*?src\s{1,3}=\s{1,3}[\'"]{1}http.*?[\'"]{1}.*?>)\s{1,3}<\s{1,3}/a\s{1,3}>#im', $link, $matches))
{
    echo "<pre>" . print_r($matches, true) . "</pre>";
}
Totally untested.

Post by GeXus »

Nice...

It doesn't quite work, though. I'll try messing around with it. If you have any suggestions... :) I'm really BAD at regex.

I really appreciate this!

Post by Benjamin »

What did/didn't it match? I'm sure it needs some work.

Post by GeXus »

It just didn't match anything. I ran it exactly as you have it.
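
A likely cause, for anyone reading along: the pattern uses `\s{1,3}` in many places, which requires at least one whitespace character. The sample has none around `=`, none between `<` and `img`, and none between `>` and `<`. It also expects the `src` value to start with `http`, while the sample uses `image.jpg`. The sketch below isolates just the `href` part; `$strict` and `$relaxed` are illustrative names, and `$relaxed` is only one possible loosening, not a complete fix.

```php
<?php
// Sample input from the earlier post.
$link = '<a href="http://www.abc.com/image_full.jpg"><img src="image.jpg"/></a>';

// Opening fragment of the posted pattern: \s{1,3} demands at least one
// whitespace character on both sides of "=", which the sample lacks.
$strict = '#<\s{0,2}a\s{1,3}.*?href\s{1,3}=\s{1,3}[\'"]#i';

// Same fragment with whitespace around "=" made optional (\s*).
$relaxed = '#<\s*a\s+[^>]*?href\s*=\s*[\'"]#i';

$strictHits  = preg_match($strict, $link);   // 0: no whitespace before "="
$relaxedHits = preg_match($relaxed, $link);  // 1: optional whitespace is fine

echo "strict: $strictHits, relaxed: $relaxedHits\n";
```

Running this prints `strict: 0, relaxed: 1`, which is consistent with the full pattern matching nothing.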

Post by Benjamin »

Ok, I'll tweak it later.

Post by GeXus »

astions wrote: Ok, I'll tweak it later.
Sweet... Thanks!

Post by GeXus »

I've got an expression that seems to be doing the job!

#(?<=href=\x22)([\w:.]*/)+\w+\.jpg(?=\x22)#im
Thanks a lot for your help!

Post by Benjamin »

Ok cool. One less thing on my todo list tonight :)

Post by GeXus »

Just want to update. The expression is now:

#(?<=href=\x22)(?:[\w:.]*)+\w+\.jpg(?=\x22)#im
The previous one was not matching <a href="img.jpg"><img src="img.jpg"/></a>, where the href is relative.

Thanks again!
GeertDD
Forum Contributor
Posts: 274
Joined: Sun Oct 22, 2006 1:47 am
Location: Belgium

Post by GeertDD »

GeXus wrote:

#(?<=href=\x22)(?:[\w:.]*)+\w+\.jpg(?=\x22)#im
The previous one was not matching <a href="img.jpg"><img src="img.jpg"/></a>, where the href is relative.
But that one doesn't match full image URLs anymore.
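
To make the difference concrete, here is a small sketch that runs both of the earlier expressions against an absolute and a relative href. The helper `firstMatch` and the variable names are made up for this illustration. The first expression needs at least one `/` before the file name; the second has no `/` in its character class at all, so it cannot get past `http:`.

```php
<?php
// The two expressions posted earlier in the thread, unchanged.
$withSlash    = '#(?<=href=\x22)([\w:.]*/)+\w+\.jpg(?=\x22)#im';
$withoutSlash = '#(?<=href=\x22)(?:[\w:.]*)+\w+\.jpg(?=\x22)#im';

$absolute = '<a href="http://www.abc.com/image_full.jpg"><img src="image.jpg"/></a>';
$relative = '<a href="img.jpg"><img src="img.jpg"/></a>';

// Hypothetical helper: returns the matched text, or null when nothing matches.
function firstMatch(string $pattern, string $html): ?string
{
    return preg_match($pattern, $html, $m) === 1 ? $m[0] : null;
}

var_dump(firstMatch($withSlash, $absolute));    // the full URL
var_dump(firstMatch($withSlash, $relative));    // NULL: no "/" in the href
var_dump(firstMatch($withoutSlash, $absolute)); // NULL: "/" is not allowed
var_dump(firstMatch($withoutSlash, $relative)); // "img.jpg"
```

So each expression covers one case and misses the other.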

Try this one:

#(?<=href=\x22).+?\.jpg(?=\x22)#i
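
One caveat worth checking before relying on it: `.+?` is allowed to run past the closing quote of the href, so a link to an ordinary page that wraps a `.jpg` thumbnail can still produce a match. The sketch below shows this and a possible variant that swaps `.` for `[^\x22]` so the match stays inside the attribute value. The helper `hrefMatch` and the variable names are invented for the example, and neither pattern verifies that an `<img>` actually sits inside the link.

```php
<?php
$lazyDot  = '#(?<=href=\x22).+?\.jpg(?=\x22)#i';       // as posted
$noQuotes = '#(?<=href=\x22)[^\x22]+?\.jpg(?=\x22)#i'; // variant: never crosses a quote

$pageLink  = '<a href="http://www.abc.com"><img src="image.jpg"/></a>';
$imageLink = '<a href="http://www.abc.com/image_full.jpg"><img src="image.jpg"/></a>';

// Hypothetical helper: returns the matched text, or null when nothing matches.
function hrefMatch(string $pattern, string $html): ?string
{
    return preg_match($pattern, $html, $m) === 1 ? $m[0] : null;
}

// The posted pattern runs through the closing quote into the src attribute.
var_dump(hrefMatch($lazyDot, $pageLink));

// The variant rejects the page link but still accepts the image link.
var_dump(hrefMatch($noQuotes, $pageLink));   // NULL
var_dump(hrefMatch($noQuotes, $imageLink));  // the full image URL
```

For the page link, the posted pattern returns `http://www.abc.com"><img src="image.jpg`, which is not a usable URL, while the variant returns nothing.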