xpaths

mxb7642
Forum Newbie
Posts: 3
Joined: Mon Jun 02, 2008 8:10 pm

xpaths

Post by mxb7642 »

I'd like to get the HTML from a remote site so that I can retrieve data from it using XPath, so I need it as a valid XML DOM object. How can this be done for a website, say http://www.yahoo.com? Thanks.
Chris Corbyn
Breakbeat Nuttzer
Posts: 13098
Joined: Wed Mar 24, 2004 7:57 am
Location: Melbourne, Australia

Re: xpaths

Post by Chris Corbyn »

If the site is not valid XML then you'll struggle. Web browsers go to great lengths to deal with invalid (X)HTML, but most XML libraries want well-formed XML.
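
As a rough illustration of the difference, here is a minimal sketch using SimpleXML as the strict parser. The two markup strings are made up for this example and are not taken from any real site. The first is well-formed (every element is closed and properly nested); the second is typical browser-tolerated markup that an XML parser is expected to reject.

```php
<?php
// Hypothetical snippets: one well-formed, one typical "tag soup".
$wellFormed = '<ul><li>Melbourne</li><li>Odessa</li></ul>';
$tagSoup    = '<ul><li>Melbourne<li>Odessa</ul>'; // <li> elements never closed

// Collect libxml parse errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$ok  = simplexml_load_string($wellFormed); // SimpleXMLElement on success
$bad = simplexml_load_string($tagSoup);    // false on a parse failure

var_dump($ok !== false);
var_dump($bad === false);

// Show why the second string was rejected, then reset the error buffer.
foreach (libxml_get_errors() as $error) {
    echo trim($error->message), "\n";
}
libxml_clear_errors();
```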

SimpleXML may work, but I doubt it. What do you need to do this for?
mxb7642

Re: xpaths

Post by mxb7642 »

I'd like to pull a list of locations off a certain site so that I can keep my database up to date.
I don't believe the site I'm looking at provides regular XML, but could you please explain what makes XML regular? And are there any good packages to convert irregular XML to regular XML? Thank you.
Weirdan
Moderator
Posts: 5978
Joined: Mon Nov 03, 2003 6:13 pm
Location: Odessa, Ukraine

Re: xpaths

Post by Weirdan »

Actually, the DOM extension can load HTML:
php manual wrote: DOMDocument::loadHTML
bool DOMDocument::loadHTML ( string $source )
The function parses the HTML contained in the string source. Unlike loading XML, HTML does not have to be well-formed to load. [...]
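
A minimal sketch of how this might be combined with DOMXPath. The markup string, the `locations` id, and the XPath expression are assumptions made up for the example; in practice the string could come from something like `file_get_contents('http://www.yahoo.com')`, provided `allow_url_fopen` is enabled.

```php
<?php
// Hypothetical markup standing in for a fetched page. The unquoted attribute
// and the bare <br> would be rejected by an XML parser but are fine as HTML.
$html = '<html><body><ul id=locations><li>Melbourne</li><li>Odessa</li></ul><br></body></html>';

// Buffer libxml warnings about sloppy markup instead of printing them.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);

// Discard the buffered warnings and restore the previous error setting.
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Query the resulting DOM tree with XPath.
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//ul[@id="locations"]/li');

foreach ($nodes as $node) {
    echo trim($node->textContent), "\n";
}
```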
mxb7642

Re: xpaths

Post by mxb7642 »

Thanks, that's exactly what I needed (once I was able to suppress all the annoying warnings).
Do you know if it's possible to make the query command return an array of strings instead of objects? I know it's not necessary, but I think it would be more efficient.
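
For reference, `DOMXPath::query()` returns a `DOMNodeList`, so one possible approach is to copy the text of each node into a plain array. The sketch below assumes made-up markup and a hypothetical helper named `queryStrings()`; it is not part of the DOM extension. Whether this is actually more efficient than working with the node objects directly is not established here.

```php
<?php
/**
 * Hypothetical helper: run an XPath query and return the trimmed text
 * content of each matching node as a plain array of strings.
 */
function queryStrings(DOMXPath $xpath, string $expression): array
{
    $strings = [];
    foreach ($xpath->query($expression) as $node) {
        $strings[] = trim($node->textContent);
    }
    return $strings;
}

// Made-up markup for illustration only.
$html = '<html><body><ul><li>Melbourne</li><li>Odessa</li></ul></body></html>';

// Suppress parser warnings by buffering them inside libxml.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath   = new DOMXPath($doc);
$strings = queryStrings($xpath, '//li');
print_r($strings);

// For a single value, DOMXPath::evaluate() can return a typed result
// (here a string) when the expression uses an XPath function.
$first = $xpath->evaluate('string(//li[1])');
echo $first, "\n";
```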