Confusion over installation and use of PHP Tidy

Posted: Thu Jan 11, 2007 12:08 pm
by frogg
I was using my own (primitive) function to tidy up my HTML code until a friend told me about PHP Tidy and output buffering. So I installed the beginner's .exe from the PHP Tidy website and followed this guide:
http://linuxformat.co.uk/wiki/index.php ... _extension

Anyway, I ended up with this code:

Code:

ob_start();
echo "stuff to go in document";
$htmlContents = ob_get_contents();
ob_clean();
$tidy = new tidy('');
$tidy->parseString($htmlContents);
$tidy->cleanRepair();
echo $tidy;
ob_end_flush();
This works ok, but I have absolutely no idea how to change the settings. For example, I would like to enable indentation and output XHTML instead of HTML.

So my questions are:
a) Is this the right way to go about it? Is there something more efficient / simpler? Or is it just plain wrong?
b) How do I change the settings with this or the better way of doing it? (Point me to a guide if relevant).

Posted: Thu Jan 11, 2007 12:20 pm
by Ollie Saunders
frogg wrote: "This works ok, but I have absolutely no idea how to change the settings."
Have a look at the second parameter for tidy_parse_file().
frogg wrote: "a) Is this the right way to go about it? Is there something more efficient / simpler? Or is it just plain wrong?"
If tidying your HTML is really necessary, then Tidy is probably the best tool for it. Tidy is a PECL extension (written in C and therefore efficient) for doing exactly what you seem to want to achieve.
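
To illustrate the "second parameter" hint, here is a minimal sketch. It assumes the markup is already in a string, so it uses tidy_repair_string(), a sibling of tidy_parse_file() that takes the configuration in the same position. The sample markup and the option values are placeholders, and the extension_loaded() guard is only there so the snippet still runs where Tidy is not installed.

```php
<?php
// Assumed input: a small fragment standing in for the buffered page.
$html = '<p>stuff to go in document';

// Second parameter: an array of Tidy option names and values
// (a path to a Tidy configuration file is accepted as well).
$config = array(
    'output-xhtml'  => true,
    'indent'        => true,
    'indent-spaces' => 3,
);

if (extension_loaded('tidy')) {
    // Parses, repairs and returns the markup in a single call.
    $clean = tidy_repair_string($html, $config, 'utf8');
} else {
    // Fallback so the sketch still runs without the extension.
    $clean = $html;
}

echo $clean;
```

The same array should be usable as the second argument to tidy_parse_file() or tidy_parse_string() if the document object is needed afterwards.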

Posted: Thu Jan 11, 2007 1:07 pm
by frogg
Thanks! I've now created an array to pass some settings over to the tidy constructor function:

Code:

ob_start();

// content I want
echo "this is frogg's website!!!";

$htmlContents = ob_get_contents();
ob_clean();

// settings I want
$options['output-xhtml'] = 'yes';
$options['indent'] = 'auto';
$options['indent-spaces'] = 3;

// settings go to tidy
$tidy = new tidy('', $options);
$tidy->parseString($htmlContents);
$tidy->cleanRepair();
echo $tidy;
ob_end_flush();
This is the same as suggested in the guide mentioned in my first post; however, none of these settings seem to work.
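
A possible explanation, offered as a guess rather than a confirmed diagnosis: the constructor may only apply its configuration when it actually loads the file named in its first argument, so with an empty filename the options might never take effect. parseString() accepts a configuration as its own second argument, so the sketch below passes the array there instead. It also uses booleans rather than the strings 'yes' and 'auto', on the assumption (worth verifying for your PHP version) that string values are not always converted as expected. The class_exists() guard and the 'utf8' encoding are additions for illustration.

```php
<?php
ob_start();

// content I want
echo "<p>this is frogg's website!!!";

// ob_get_clean() returns the buffered output and closes the buffer.
$htmlContents = ob_get_clean();

// settings I want
$options = array(
    'output-xhtml'  => true,
    'indent'        => true,
    'indent-spaces' => 3,
);

if (class_exists('tidy')) {
    $tidy = new tidy();
    // The configuration goes to parseString() rather than the constructor.
    $tidy->parseString($htmlContents, $options, 'utf8');
    $tidy->cleanRepair();
    $output = (string) $tidy;
} else {
    // Fallback so the sketch still runs without the extension.
    $output = $htmlContents;
}

echo $output;
```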

Posted: Thu Jan 11, 2007 3:52 pm
by feyd
You may also want to take a look at Ambush Commander's HTMLPurifier library. :)

Posted: Thu Jan 11, 2007 3:57 pm
by Ollie Saunders
It has just occurred to me that XSLT processors clean up markup and have XHTML output and indentation settings too. You need the XSL extension installed for that. The actual code to do it is relatively simple, but I'd have to dig it out... unless Kieran is about; he probably knows it, since he uses XSLT all the time.
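
For reference, a sketch of what the XSLT route might look like. This is not the code referred to above, just an identity transform with indent="yes" run through PHP's XSLTProcessor. It assumes the input is already well-formed XML (unlike Tidy, an XSLT processor will not repair broken HTML), the sample markup is a placeholder, and the nowdoc syntax needs PHP 5.3 or later.

```php
<?php
// Assumed input: markup that is already well-formed XML.
$xhtml = '<html><body><p>stuff to go in document</p></body></html>';

// Identity transform: every node is copied through unchanged, and
// xsl:output asks the processor to indent the serialized result.
$xsl = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
</xsl:stylesheet>
XSL;

if (class_exists('XSLTProcessor')) {
    $source = new DOMDocument();
    $source->loadXML($xhtml);

    $style = new DOMDocument();
    $style->loadXML($xsl);

    $proc = new XSLTProcessor();
    $proc->importStylesheet($style);
    $result = $proc->transformToXml($source);
} else {
    // Fallback so the sketch still runs without the XSL extension.
    $result = $xhtml;
}

echo $result;
```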

Posted: Fri Jan 12, 2007 7:13 am
by frogg
Thanks guys, I will look into those two.