I would like to convert the value of every class name in an XHTML document to lowercase. I've researched this and have found a few examples, but I keep running into memory issues and I'm not sure that they would do exactly what I want, anyway. I'm trying to convert a 14,000 line file line-by-line. A class appears once on almost all of the last 13,000 lines.
Here's an example:
[text]<p class=BadClassName>[/text]
There are no quotes around the original in most of the lines--it's a Word HTML file that I didn't create. I add the quotes in later with Tidy, but there doesn't seem to be an option in Tidy to change all classes to lowercase.
I don't want to convert all text within all tags to lowercase, as there may be links to external URLs over which I have no control that are case-sensitive.
Any suggestions would be appreciated.
Convert value of class in tags to lowercase?
Re: Convert value of class in tags to lowercase?
There is an example on the preg_replace_callback() manual page that does almost what you need. You just need to change the search pattern and write the lines to an output file instead of echoing them.
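The advice above (swap in a pattern for class values and write to a file rather than echoing) might be sketched as follows. This is only a sketch: the function names lowercaseClassValues() and convertFile() and their parameters are hypothetical, and the pattern is the one from the script later in this reply. Reading with fgets() keeps only one line in memory at a time, which may help with the memory issues mentioned in the question.

```php
<?php
// Pattern borrowed from the script in this thread: it matches the value that
// follows "class=", whether double-quoted, single-quoted, or unquoted.
const CLASS_VALUE_PATTERN = '~(?<=\bclass=)("[^"]+|\'[^\']+|[^>\s]+)~i';

// Lowercase every class value found in one line of markup.
// (Hypothetical helper name, not from the original posts.)
function lowercaseClassValues(string $line): string
{
    return preg_replace_callback(
        CLASS_VALUE_PATTERN,
        function (array $matches): string {
            return strtolower($matches[1]);
        },
        $line
    );
}

// Copy the input file to the output file one line at a time, so the whole
// document is never held in memory. Both paths are hypothetical parameters.
function convertFile(string $inputPath, string $outputPath): void
{
    $in = fopen($inputPath, 'r');
    $out = fopen($outputPath, 'w');
    if ($in === false || $out === false) {
        throw new RuntimeException('Could not open the input or output file.');
    }
    while (($line = fgets($in)) !== false) {
        fwrite($out, lowercaseClassValues($line));
    }
    fclose($in);
    fclose($out);
}
```

Calling convertFile('input.html', 'output.html') (file names made up here) would write the converted copy. Attributes other than class, such as href, are left alone. A quoted class value that wraps across two lines may not be fully converted by a line-by-line pass.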
This script showcases a pattern that might be useful.
Code:
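<?php
header('Content-Type: text/plain');
// Match the value after "class=": double-quoted, single-quoted, or unquoted.
// The closing quote is not part of the match; lowercasing would not change it anyway.
$pattern = '~(?<=\bclass=)("[^"]+|\'[^\']+|[^>\s]+)~i';
$subject = '<P CLASS=BIGBADCLASS><P TITLE=""CLASS=BIG ID=NONE><P CLASS="BIG STUFF"><P CLASS=A_B><P>';
// Called once per match; $matches[1] is the captured class value.
function callback ($matches) {
    return strtolower($matches[1]);
}
echo preg_replace_callback($pattern, 'callback', $subject);
# <P CLASS=bigbadclass><P TITLE=""CLASS=big ID=NONE><P CLASS="big stuff"><P CLASS=a_b><P>

Re: Convert value of class in tags to lowercase?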
It will be more reliable if you use DOMDocument. Something like this would work great:
Code:
// $html is expected to hold the whole document as a string.
$doc = new DOMDocument();
$doc->loadHTML($html);
// '*' selects every element in the document.
foreach ($doc->getElementsByTagName('*') as $element) {
    if ($element->hasAttribute('class')) {
        $element->setAttribute('class', strtolower($element->getAttribute('class')));
    }
}

Re: Convert value of class in tags to lowercase?
Thank you. 
McInfo, that worked for me perfectly.
Tr0gd0rr, keeping in mind your point that the previously mentioned solution might be less reliable, I tried yours as well. I had some trouble implementing it--I read the information on the link you included, but I must have missed something... I don't think I defined it right. All of my HTML at this point in my script is in one gigantic string. I tried several variations to use DOMDocument (it has to be implemented as a class first, right?) and I kept getting various errors. If my variable name is $tidy (the output from when I ran it through Tidy), what do I need to do before I use the code sample you provided? Is there another line or two I need to have? If so, I can't figure out precisely what it (they) should be.
Thanks!
--Erika
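On the question of what has to come before the loop: DOMDocument is a built-in class (part of PHP's DOM extension), so it does not need to be defined first, only instantiated with new. A minimal sketch, assuming $tidy already holds the markup as a string; the sample markup and the $result variable are made up for illustration, and the libxml_use_internal_errors() call is an addition that was not in the original sample.

```php
<?php
// Sample standing in for the Tidy output (an assumption for illustration);
// in the real script $tidy would already hold the markup as a string.
$tidy = '<html><body><p class="BadClassName">Text '
      . '<a class="LinkClass" href="http://Example.com/Path">link</a></p></body></html>';

// DOMDocument ships with PHP's DOM extension; no class definition or include is needed.
$doc = new DOMDocument();

// Collect parser warnings (likely with Word-generated HTML) instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($tidy);
libxml_clear_errors();

// Lowercase the class attribute on every element that has one.
foreach ($doc->getElementsByTagName('*') as $element) {
    if ($element->hasAttribute('class')) {
        $element->setAttribute('class', strtolower($element->getAttribute('class')));
    }
}

// Serialize the modified document back to a string (hypothetical variable name).
$result = $doc->saveHTML();
```

If "Class 'DOMDocument' not found" is among the errors, the DOM extension is probably not enabled in that PHP installation.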
Re: Convert value of class in tags to lowercase?
I already have Simple HTML DOM Parser working on this file, if that helps... but I just copied and pasted a solution; I don't know how to come up with it myself. :/