Tidy: anchor-as-name "false" not working?


Post by erika

I'm having some trouble with Tidy in PHP. It isn't working as expected, and I'm not sure whether that's because I'm using it wrong, it's broken, or something else. I suspect the former, and hope someone might be able to educate me.

Here's what I'm giving it:

Code: Select all

$mystring = '<h1><a name="anchor_id"></a>Anchor Text</h1>';

$tidy = new tidy();
$tidy_config = array(
                     // Duplicate 'indent' and 'output-xhtml' keys removed;
                     // the later values (false and true) were the ones in effect.
                     'output-xhtml' => true,
                     // Was a bare `no`, which PHP treats as an undefined constant.
                     'anchor-as-name' => false,
                     'drop-empty-paras' => true,
                     'logical-emphasis' => true,
                     'quote-ampersand' => true,
                     'preserve-entities' => true,
                     'wrap' => 0,
                     'char-encoding' => 'utf8',
                     'vertical-space' => true,
                     'indent' => false,
                     'break-before-br' => true,
                     'lower-literals' => true,
                     'show-body-only' => true
                     );

$tidy->parseString($mystring, $tidy_config, 'utf8');
$tidy->cleanRepair();

// Casting the tidy object to a string yields the repaired markup.
$clean_html = (string) $tidy;
And what I get back:

Code: Select all

<h1><a name="anchor_id"></a>Anchor Text</h1>
This is what I wanted:

Code: Select all

<h1><a id="anchor_id"></a>Anchor Text</h1>
I managed to get that by running PHP Simple HTML DOM Parser on it, setting the value of "name" to null, and then putting the DOM back into a string, but then my output ends up all on one line. So I thought, "Well, I'll just run Tidy on it again to fix the line break thing." After doing so I got:

Code: Select all

<h1><a name="anchor_id" id="anchor_id"></a>Anchor Text</h1>
ARGH! Having both the name and the ID is worse!
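
For anyone wanting to reproduce the name-to-id step described above, here is a minimal sketch. It assumes PHP's built-in DOM extension (DOMDocument) in place of PHP Simple HTML DOM Parser, which is a swap made purely for illustration, and the function name anchor_names_to_ids is hypothetical. It only covers the attribute rewrite; the result still comes back without Tidy's line breaks.

```php
<?php
// Hypothetical helper (not from the original post): move name="..." on
// <a> elements to id="..." using PHP's built-in DOM extension.
function anchor_names_to_ids($html)
{
    // Collect libxml parse warnings instead of printing them for messy input.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    // Wrap the fragment in a full document so the charset is declared as UTF-8.
    $doc->loadHTML(
        '<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        . '</head><body>' . $html . '</body></html>'
    );

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('a') as $anchor) {
        if ($anchor->hasAttribute('name')) {
            // Keep an existing id if there is one; otherwise reuse the name value.
            if (!$anchor->hasAttribute('id')) {
                $anchor->setAttribute('id', $anchor->getAttribute('name'));
            }
            $anchor->removeAttribute('name');
        }
    }

    // Serialize only the children of <body>, i.e. the original fragment.
    $body = $doc->getElementsByTagName('body')->item(0);
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$fixed = anchor_names_to_ids('<h1><a name="anchor_id"></a>Anchor Text</h1>');
echo $fixed, "\n";
```

Passing a node to saveHTML() requires PHP 5.3.6 or later. As noted above, feeding this result back through Tidy is what brought the name attribute back, so this would have to be the last step in the chain.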

I have horrible HTML (Word-generated) going in, so I have to run Tidy on it. My final product has to be W3C-valid, and every day I have to parse the equivalent of 250-300 8.5" x 11" pages of formatted text that I didn't generate myself (I would have written the silly things in HTML to begin with!).

Any idea what I'm doing wrong? I thought that setting anchor-as-name to false (I've tried values of false, no, and 0) would convert "name" to "id" and eliminate "name" entirely. :-/