Extract all href and src attributes

Any questions involving matching text strings to patterns - the pattern is called a "regular expression."

Moderator: General Moderators

alex.barylski
DevNet Evangelist
Posts: 6267
Joined: Tue Dec 21, 2004 5:00 pm
Location: Winnipeg


Post by alex.barylski »

I need a thorough regex which extracts all href="" and src="" attributes and returns the values.

I plan on checking the status of each as a way of easily informing me of dead links, missing images, etc...

I figure this could be done in regex easily and has likely already been done (countless times) so if you know of a resource/snippet example which does this, please lemme know. :)

I will manually check for javascript: and other protocols myself, so the regex can ignore those. ;)

Cheers :)
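
For reference, a minimal sketch of the kind of extraction asked about above, written in Python purely for illustration. The pattern, the function names `extract_links` and `checkable`, and the list of skipped schemes are assumptions, not something posted in this thread. A regex like this will also match inside comments or inline scripts, so a real HTML parser is probably safer for messy pages.

```python
import re

# Matches href/src with double-quoted, single-quoted, or unquoted values.
# The lookbehind skips attributes such as data-href or data-src.
ATTR_RE = re.compile(
    r"""(?<![\w-])(href|src)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+))""",
    re.IGNORECASE,
)

# Hypothetical list of schemes to leave for manual checking.
SKIPPED_SCHEMES = ("javascript:", "mailto:")


def extract_links(html):
    """Return (attribute, value) pairs for every href/src found in html."""
    results = []
    for match in ATTR_RE.finditer(html):
        name = match.group(1).lower()
        # Exactly one of the three value groups is set for each match.
        value = next(g for g in match.groups()[1:] if g is not None)
        results.append((name, value.strip()))
    return results


def checkable(links):
    """Drop empty values and values using a skipped scheme."""
    return [
        (name, value)
        for name, value in links
        if value and not value.lower().startswith(SKIPPED_SCHEMES)
    ]
```

Each value returned by `checkable(extract_links(page_source))` could then be requested in turn to spot dead links and missing images.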
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

alex.barylski
DevNet Evangelist
Posts: 6267
Joined: Tue Dec 21, 2004 5:00 pm
Location: Winnipeg

Post by alex.barylski »

Nice. :)

I'll have to take a better look tomorrow, me tired. :)

Thanks dude :)