[SOLVED] Parsing HTML Documents
Posted: Wed Mar 10, 2010 5:55 am
Hi everyone,
I'm currently writing a script that grabs the contents of an HTML webpage, and there are certain things I'd like to do with the contents. I just can't think of the most efficient way to do it.
Here's a list of things I'd like to do:
Modifying the contents of <img src> attributes, and modifying the actual text (without affecting the text within the <> tags). I thought of using preg_replace, but I just can't work out how to write a regex to do it.
A simple example would be, the document contains:
<title>title</title>
And I want to modify the text 'title' without breaking the HTML markup... any ideas?
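For what it's worth, one way to approach this without a regex is PHP's DOM extension, which parses the page so that text and tags can be changed separately. What follows is only a minimal sketch, assuming the DOM extension is enabled; the function name rewriteHtml, its two callbacks, and the sample markup are all made up for illustration and not taken from any particular library.

```php
<?php
// Hypothetical helper (name and signature are assumptions for this sketch):
// rewrites <img src> values and visible text via the DOM instead of regex.
function rewriteHtml(string $html, callable $srcFn, callable $textFn): string
{
    $doc = new DOMDocument();

    // Real-world pages are often malformed; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // 1. Change the src attribute of every <img> element.
    foreach ($doc->getElementsByTagName('img') as $img) {
        if ($img->hasAttribute('src')) {
            $img->setAttribute('src', $srcFn($img->getAttribute('src')));
        }
    }

    // 2. Change text nodes only, so tag names and attributes stay untouched.
    //    Script and style contents are skipped because they are not prose.
    $xpath = new DOMXPath($doc);
    $query = '//text()[not(ancestor::script) and not(ancestor::style)]';
    foreach ($xpath->query($query) as $node) {
        if (trim($node->nodeValue) !== '') {
            $node->nodeValue = $textFn($node->nodeValue);
        }
    }

    return $doc->saveHTML();
}

// Sample input, invented for the example.
$html = '<html><head><title>title</title></head>'
      . '<body><img src="a.png"><p>Hello</p></body></html>';

$result = rewriteHtml(
    $html,
    function ($src) { return '/images/' . $src; }, // assumed src rewrite
    'strtoupper'                                   // assumed text rewrite
);

echo $result;
```

With this input the title text and the paragraph text are upper-cased and the image path gets the /images/ prefix, while the tags themselves are left alone. Note that loadHTML may add a DOCTYPE and normalise the markup, so the output is not guaranteed to match the input byte for byte.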