simple html parsing
Posted: Fri Nov 07, 2003 3:40 am
by tintin
Hi !
I'd like to parse HTML code and get back an array that separates the text from the tags.
thefunction("<html>hello</html>") should return the array ("<html>", "hello", "</html>"), with the "<" and ">" kept on the entries that are tags.
I'm unsuccessful with preg_split( "[</|<|>]", ...) since it removes the separators and returns ("html", "hello", "/html").
I would like to avoid using a huge xml parser for that, and don't need to check syntax.
Any idea ?
Thanks for help.
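One way to keep the separators is the PREG_SPLIT_DELIM_CAPTURE flag of preg_split: when the pattern wraps the delimiter in a capturing group, the matched tags are returned along with the text between them. The sketch below is only an illustration. The name split_html is made up, and the pattern assumes that no attribute value contains a ">" and that comments and scripts do not need special handling.

```php
<?php
// Hypothetical helper (name is an assumption): splits markup into tags
// and the text between them, without checking syntax.
function split_html(string $html): array
{
    // The capturing group makes preg_split return the delimiters (the tags) too.
    // PREG_SPLIT_NO_EMPTY drops the empty strings that appear between adjacent tags.
    return preg_split(
        '/(<[^>]*>)/',
        $html,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );
}

print_r(split_html('<html>hello</html>'));
```

For the example string this yields the three entries "<html>", "hello" and "</html>". An entry is a tag exactly when its first character is "<".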
Posted: Fri Nov 07, 2003 10:20 am
by php_wiz_kid
What exactly is this for? I made a small function that I use, which replaces a placeholder such as {HTML} with content such as <html>hello</html>. Here's an example:
HTML: index.html
<html>
<head>
<title>{TITLE}</title>
</head>
<body>
{BODY}
</body>
</html>
PHP: index.php
<?php
$array = array(
'TITLE' => 'My Web Page',
'BODY' => 'Body of my Web Page.'
);
/* This function loads an HTML file and replaces each array key found in the document with its value. Will explain if you want it. */
LoadFile('index.html', $array);
?>
Output: index.php (in browser)
<html>
<head>
<title>My Web Page</title>
</head>
<body>
Body of my Web Page.
</body>
</html>
I think this is similar to what you're looking for. Some of you might recognize this from phpBB, which is where I got the idea.
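The body of LoadFile was not posted, so the following is only a guess at how such a function might work, assuming every placeholder has the form {KEY}. The name render_template is invented for this sketch, and it works on a string rather than a file so that it can be tried on its own.

```php
<?php
// Hypothetical stand-in for LoadFile(); the original implementation was not posted.
function render_template(string $template, array $values): string
{
    $replacements = [];
    foreach ($values as $key => $value) {
        // Map the key 'TITLE' to the placeholder '{TITLE}'.
        $replacements['{' . $key . '}'] = $value;
    }
    // strtr() replaces all placeholders in a single pass over the template.
    return strtr($template, $replacements);
}

$template = "<title>{TITLE}</title>\n<body>{BODY}</body>";
echo render_template($template, [
    'TITLE' => 'My Web Page',
    'BODY'  => 'Body of my Web Page.',
]);
```

To work from a file as in the example above, the first argument could be file_get_contents('index.html').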
Posted: Sun Nov 09, 2003 4:01 am
by tintin
Hi !
In a valid HTML source, I need to change the name of the file referenced by some tags.
e.g.: <IMG SRC=somewhere/file.gif ...> to <img src="anotherfile.gif">
My idea was to split the HTML code into an array, process the tags as requested, and implode the array to obtain the resulting code.
The first part of your function might help me with the first parsing step.
Thanks for help.
Posted: Sun Nov 09, 2003 3:32 pm
by php_wiz_kid
Give me a little more info and I'd be more than willing to help. What exactly are you trying to do? Can you give me some type of example, and how would the "first part" of this function help you?
Posted: Sun Nov 09, 2003 3:48 pm
by Gen-ik
If you know which part of the HTML you want to change you could do something like this...
<?php
$imgName = "theimage.gif";
$imgWidth = 100;
$imgHeight = 20;
?>
<body>
<img src="<?php echo $imgName ?>" width="<?php echo $imgWidth ?>" height="<?php echo $imgHeight ?>" />
</body>
... and then change variables to whatever you need them to be.
Posted: Mon Nov 10, 2003 4:33 am
by tintin
Hi !
How can I explain it better, with my poor English?
I have a file containing valid HTML code, with tags referring to files (like <img src=...> or <link href=...> or <table background=...>).
I need to
1) make a list of the files requested by some tags
2) change the names of those files in the source
My idea was to split the HTML code into an array, with some cells containing the tags and the others containing the text between tags. See my first example above.
After that, it will be easy for me to parse inside the tags referring to files, find the files to change and create a new HTML file with the modified code.
I'm pretty sure that someone has had to do a similar job before, and that there is a simple and reliable way to do the splitting part with existing functions available in PHP.
Thanks for your time.
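The two steps listed above (collect the referenced files, then rename them) could be sketched along these lines. This is only an illustration under several assumptions: PHP 7.4 or later, tags that contain no ">" inside attribute values, and new file names that need no HTML escaping. The function name rewrite_file_refs and its $rename callback are invented for the example, and the pattern matches src, href and background on any tag, so it would need narrowing if only some tags should be touched.

```php
<?php
// Hypothetical helper: lists and renames files referenced by src, href or background.
// $rename maps an old file name to a new one; $found collects the old names.
function rewrite_file_refs(string $html, callable $rename, array &$found = []): string
{
    // Split into tags and text, keeping the tags as array entries.
    $parts = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    // The attribute value may be double-quoted, single-quoted or unquoted.
    $attr = '/(?<![\w-])(src|href|background)\s*=\s*(?:"([^"]*)"|\'([^\']*)\'|([^\s>]+))/i';

    foreach ($parts as $i => $part) {
        if ($part === '' || $part[0] !== '<') {
            continue; // text between tags is left untouched
        }
        $parts[$i] = preg_replace_callback($attr, function (array $m) use ($rename, &$found) {
            // At most one of the three value groups is non-empty.
            $old = ($m[2] ?? '') . ($m[3] ?? '') . ($m[4] ?? '');
            $found[] = $old;
            // Emit the attribute name in lowercase and always quote the new value.
            return strtolower($m[1]) . '="' . $rename($old) . '"';
        }, $part);
    }
    return implode('', $parts);
}

$files = [];
echo rewrite_file_refs(
    '<p>Logo: <IMG SRC=somewhere/file.gif alt="x"></p>',
    fn (string $old): string => 'another' . basename($old),
    $files
);
```

After the call, $files holds the original names in document order, which covers the listing step. The tag names themselves are left as they were; only the matched attributes are rewritten.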