How To Use This Function On An External File?
Posted: Tue Nov 23, 2010 2:31 am
I found this little function for cleaning Word-generated HTML code, and I know a bloke who could really use it.
Unfortunately my PHP isn't good enough yet for ME to know how to use it:
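function cleanHTML($html) {
    /// <summary>
    /// Removes all FONT, SPAN, DEL and INS tags, and all class, lang, style, size and face attributes.
    /// Designed to get rid of non-standard Microsoft Word HTML tags.
    /// </summary>
    // note: preg_replace is used here because ereg_replace is deprecated as of PHP 5.3 and removed in PHP 7
    // start by completely removing all unwanted tags (the text between them is kept)
    $html = preg_replace("#<(/)?(font|span|del|ins)[^>]*>#","",$html);
    // then run another pass over the html (twice), removing unwanted attributes;
    // each pass cuts a tag off at its last unwanted attribute, so anything after that attribute in the tag is dropped too
    $html = preg_replace("#<([^>]*)(class|lang|style|size|face)=(\"[^\"]*\"|'[^']*'|[^>]+)([^>]*)>#","<\\1>",$html);
    $html = preg_replace("#<([^>]*)(class|lang|style|size|face)=(\"[^\"]*\"|'[^']*'|[^>]+)([^>]*)>#","<\\1>",$html);
    // sample word html <p class="aaa" style="background:dot">abc</p> will return <p >abc</p>
    // hand the cleaned string back to the caller (without this the function returns nothing)
    return $html;
}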
I assume I put it in a PHP document called, maybe, 'cleanhtml.php',
and then, outside of the function definition, I call the function with the file I want to clean as an argument.
Something like: cleanHTML( myfilenameonthehardrive.html )
Then I save that file, cleanhtml.php, in the same dir as the 'dirty' file I've quoted.
Then I open it with my browser, it should run through to the end, and the output should be a cleaned file.
Yes?
Something like that?
But exactly how? Because I haven't made it work yet...
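
For what it's worth, here is a minimal sketch of one way the steps described above could be wired together. It rests on a few assumptions that are not in the quoted function: the function takes the HTML as a string rather than a file name, so the file has to be read first and the result written back out. cleanWordHtml() and cleanFile() are hypothetical names made up for this sketch, and the patterns are the quoted ones ported to preg_replace.

```php
<?php
// Sketch only: cleanWordHtml() and cleanFile() are illustrative names,
// not part of the function quoted in the post.

// Takes HTML as a string and returns the cleaned string.
function cleanWordHtml($html) {
    // Strip the unwanted tags themselves; the text between them is kept.
    $html = preg_replace('#<(/)?(font|span|del|ins)[^>]*>#', '', $html);

    // Each pass cuts every tag off at its last unwanted attribute,
    // so the pass is run twice, as in the quoted function.
    $attr = '#<([^>]*)(class|lang|style|size|face)=("[^"]*"|\'[^\']*\'|[^>]+)([^>]*)>#';
    for ($pass = 0; $pass < 2; $pass++) {
        $html = preg_replace($attr, '<$1>', $html);
    }
    return $html;
}

// Reads $inPath, cleans its contents and writes the result to $outPath.
// Returns true on success, false if the file could not be read or written.
function cleanFile($inPath, $outPath) {
    $dirty = file_get_contents($inPath);
    if ($dirty === false) {
        return false;
    }
    return file_put_contents($outPath, cleanWordHtml($dirty)) !== false;
}
```

With that saved as cleanhtml.php next to the dirty file, a line such as cleanFile('dirty.html', 'clean.html'); at the bottom (both file names are placeholders) should write a cleaned copy when the script is run, either with php cleanhtml.php on the command line or by requesting it through a web server that has PHP enabled. Opening the .php file straight from the hard drive in a browser will not execute it, because the browser itself does not run PHP.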

And then I