
[SOLVED] Parsing HTML Documents

Posted: Wed Mar 10, 2010 5:55 am
by timWebUK
Hi everyone,

I'm currently writing a script that grabs the contents of an HTML page. There are certain things I'd like to do with the contents, but I can't think of the most efficient way to do them.

Here's a list of things I'd like to do:

Modify the contents of <img src> attributes, and modify the actual text (without affecting anything inside the <> tags themselves). I thought of using preg_replace, but I can't work out how to write a regex that does it.

A simple example would be, the document contains:

<title>title</title>

And I want to modify the text 'title' without breaking the HTML markup. Any ideas?
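
For illustration, one way the example above could be approached is to parse the page into a DOM tree instead of matching it with a regex, so that text and markup are handled separately. This is only a minimal sketch, assuming PHP's bundled DOM extension (DOMDocument and DOMXPath) is available; the sample markup, the file names old.png and new.png, and the replacement string 'New title' are all hypothetical.

```php
<?php
// Hypothetical input; in practice this would be the fetched page contents.
$html = '<html><head><title>title</title></head>'
      . '<body><p>Hello <img src="old.png" alt="pic"> world</p></body></html>';

$doc = new DOMDocument();
// Collect libxml warnings internally; real-world HTML is rarely valid.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

// Rewrite the src attribute of every <img> element.
foreach ($doc->getElementsByTagName('img') as $img) {
    $img->setAttribute('src', 'new.png');
}

// Change text nodes only, so tag names and attributes are never touched.
$xpath = new DOMXPath($doc);
foreach ($xpath->query('//text()') as $node) {
    $node->nodeValue = str_replace('title', 'New title', $node->nodeValue);
}

// Serialize the modified tree back to an HTML string.
$result = $doc->saveHTML();
echo $result;
```

Because the replacement runs on text nodes rather than on the raw string, the <title> tag itself is left alone even though it contains the word being replaced. Note that //text() also matches text inside <script> and <style> elements, so a real page may need a narrower XPath expression.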

Re: Parsing HTML Documents

Posted: Wed Mar 10, 2010 7:37 am
by VladSun

Re: Parsing HTML Documents

Posted: Wed Mar 10, 2010 7:55 am
by timWebUK
Ah perfect, this should do fine. Thanks!