XML_HTMLSax help!!!!


Post by skolar

I'm trying to implement a class that parses the links out of an HTML string. However, I'm having trouble with references inside the callbacks for the element handlers. When it's done parsing, the array I use to store the links goes back to null. Why is that?

Here is my code:

class ParseHTML {
	var $parser;
	var $links;

	// PHP 4-style constructor: create the parser and register this object's handlers.
	function ParseHTML() {
		$this->parser = &new XML_HTMLSax();
		$this->parser->set_object($this);
		$this->parser->set_element_handler('open', 'close');
		$this->parser->set_data_handler('data');
		$this->links = array();
	}

	// Called for every opening tag; stores each link's href keyed by its MD5 hash.
	function open($parser, $tag, $attr) {
		if (strtolower($tag) == 'a' && isset($attr['href'])) {
			$hashed_link = md5($attr['href']);
			$this->links[$hashed_link] = $attr['href'];
		}
	}

	// Called for every closing tag; nothing to do here.
	function close($parser, $tag) {
	}

	// Called for character data; registered in the constructor, so it has to exist.
	function data($parser, $data) {
	}

	// Parses the HTML string and returns the collected links.
	function parse($html) {
		$this->parser->parse($html);
		return $this->links;
	}
}

Can someone please tell me how to do this? I really need to return those links when the parsing is done; otherwise the parsing is useless. The referencing is getting confusing, and I can't tell whether the handlers are even accessing the right links array. How can I make sure it's the array from the actual calling ParseHTML instance and not an anonymous copy of the instance?

Thanks in advance.