HTML <img tag Reg Ex. Help needed.
Posted: Thu Jun 22, 2006 3:26 am
First post, so first let me say a big hi to the community (or at least that part of it that reads this).
Anyway, here goes.
I'm having a little trouble with a regex and could use some help. If anyone could point out what I've done wrong, or suggest another way, that'd be great.
Scenario: I need to replace some legacy html (using the deprecated img property dynsrc) with new html using object embed tags for video playback on the page. The pages are created dynamically by the user in an existing interface, which has spinoff results that need to keep the dynsrc version, so I can't make the change there.
Here's my regex <img[\s].*?[\s]dynsrc=[\"\'](.*?)[\"\']+.*?[\s]\/>
What I hope to return is an array, holding 2 things for each dynsrc image it finds. The first is the whole <img ... /> and the second is the contents of the dynsrc="..." itself (path to the video)
The problem: If there is any <img /> before the one with the dynsrc, the regex returns everything from the first <img it finds to the closing > of the dynsrc tag.
This means that if you have <img src="1" /><img dynsrc="2" /> it returns <img src="1" /><img dynsrc="2" />, which is not good, as the next line of code will replace this with object embed tags, losing the first image altogether.
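To make the behaviour concrete, here is a minimal sketch in Python (the post doesn't say which language the page code uses, so that choice is an assumption). It reproduces the over-match with the pattern exactly as posted, then tries one possible alternative where the lazy .*? is swapped for [^>]*, which cannot run past the > that closes a tag. The names SINGLE_TAG and find_dynsrc_images are made up for illustration, and the alternative pattern is only a guess at a fix: it would still trip over an attribute value that itself contains a > character.

```python
import re

# The pattern from the post, unchanged. Its lazy ".*?" can still run across
# a ">" and into the next tag while looking for "dynsrc=".
ORIGINAL = re.compile(r"""<img[\s].*?[\s]dynsrc=[\"\'](.*?)[\"\']+.*?[\s]\/>""")

# Hypothetical alternative: "[^>]*" cannot cross a ">", so a match has to
# start and end inside a single <img ...> tag.
SINGLE_TAG = re.compile(
    r"""<img\b[^>]*?\sdynsrc=["']([^"']*)["'][^>]*>""",
    re.IGNORECASE,
)


def find_dynsrc_images(html):
    """Return (whole_tag, dynsrc_path) pairs for each <img> that has a dynsrc."""
    return [(m.group(0), m.group(1)) for m in SINGLE_TAG.finditer(html)]


sample = '<img src="1" /><img dynsrc="2" />'
overmatch = ORIGINAL.search(sample).group(0)  # spans both tags
pairs = find_dynsrc_images(sample)            # only the dynsrc tag
```

With the sample above, overmatch is the entire string (both tags), while pairs holds a single entry with the dynsrc tag and its path, which is the array shape described earlier.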
Hope I've explained well enough for someone to help.
Catch you anon