Posted: Mon Dec 10, 2007 5:11 am
by batfastad
Hi Mordred.
I do enclose all vars in quotes in my MySQL queries.
The actual reason for the trim() is a final check to remove any whitespace characters from the start/end of variables before they get entered into the database.
But so long as my function #2 is correct for escaping the input on every page load, then I'm happy!
Thanks for all your help!
Ben
Posted: Thu Dec 13, 2007 6:48 am
by batfastad
Ok I have one final query on this... relating more to the original question of outputting HTML to the page.
After Mordred's advice and this note in the PHP manual (http://uk3.php.net/manual/en/function.h ... .php#78509), I started doing the following on any variables that will be output to HTML:
htmlspecialchars($var, ENT_QUOTES, 'UTF-8')
I changed our MySQL / PHP installation and all our scripts to use UTF-8 now, rather than ISO whatever it was.
So in theory we can handle any strange characters that come our way.
Now when I output data from our database, it gets output correctly, even with strange Eastern European chars.
But I thought that when outputting strange characters for valid HTML, you had to use the entity code?

In that case surely htmlspecialchars() is not enough. I also thought that it was preferred to use the numeric entity code rather than the text one... meaning the output of htmlentities() is incorrect.
Is that right?
Or have those days gone?
If your variables are to be used in an <input> or <textarea> tag on say an 'edit' page, then htmlspecialchars() is the one you need.
But I thought that for valid HTML you should always use the numeric entity codes. Mind you, I did learn HTML over 10 years ago now, and many things have changed.
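For the 'edit' page case described above, a minimal sketch might look like the following. The helper name h() and the sample value are made up for illustration; the htmlspecialchars() call is the one already quoted in this post. It assumes the script and the page are both UTF-8.

```php
<?php
// Hypothetical helper (not from the thread): escape a value for HTML output,
// whether it ends up as element content or inside a quoted attribute.
function h($value)
{
    return htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
}

// Sample value with quotes, angle brackets, an ampersand and accented chars.
$name = 'Zoë "Z" <O\'Neil> & Søn';

// Inside a quoted attribute value.
echo '<input type="text" name="name" value="' . h($name) . '">' . "\n";

// Inside a <textarea>.
echo '<textarea name="notes">' . h($name) . '</textarea>' . "\n";
```

Only the characters that have a meaning in HTML markup (&, <, >, " and ') are replaced; the accented characters are passed through unchanged as UTF-8.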

Posted: Thu Dec 13, 2007 10:00 am
by John Cartwright
htmlspecialchars/htmlentities does convert the character to its "numerical entity code". Click View->Page Source to see what htmlentities actually returned.
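As an alternative to View->Page Source, here is a small sketch that prints what each function returns for the same string. The sample string is invented, and the script is assumed to be saved as UTF-8 and run from the command line so that nothing re-interprets the output.

```php
<?php
// Sample text (an assumption for illustration): a copyright sign,
// an accented letter, and a piece of markup.
$text = '© café <b>';

$special  = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
$entities = htmlentities($text, ENT_QUOTES, 'UTF-8');

echo $special, "\n";   // © café &lt;b&gt;
echo $entities, "\n";  // &copy; caf&eacute; &lt;b&gt;
```

With these arguments htmlspecialchars() leaves the copyright sign and the accented letter alone, while htmlentities() replaces them with named entities such as &copy; and &eacute;.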
Posted: Thu Dec 13, 2007 10:26 am
by batfastad
Jcart wrote:htmlspecialchars/htmlentities does convert the character to its "numerical entity code". Click View->Page Source to see what htmlentities actually returned.
I know that one
But my question was:
1) Am I correct in thinking / remembering that to have valid HTML code, all special chars must be encoded into their entities?
I'm sure that was the case when I learnt HTML... maybe some time ago now though
If that is true... surely for output on a valid HTML page you need to do htmlentities()?
Apart from within an <input> or <textarea> where htmlspecialchars() will work well enough.
This contradicts this note... http://uk3.php.net/manual/en/function.h ... .php#78509 and what was said earlier.
Or is it valid HTML nowadays to just leave special chars un-entity-ised in your HTML code?
E.g. copyright symbol, accented chars, Eastern European chars... just leaving them as is, without replacing them with entity codes?
Going one more step... I thought W3C recommendations were that numerical entity codes should be used wherever possible.
Not the text codes (even though they work just fine).
2) Is there a PHP function that returns the numeric entity codes for all the entities, rather than text ones?
Thanks
Ben
Posted: Thu Dec 13, 2007 10:30 am
by John Cartwright
I've never heard that you should always use the entities; that just doesn't make sense. Anything passed through htmlentities() will not render as HTML, so, for example, you could not render a <table> element that way.
I think what you mean is that the content (i.e. anything that is not supposed to be rendered as HTML) should always be htmlspecialchars()'d. Then yes.
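On question 2 above (numeric rather than named entities): one possible approach, assuming the mbstring extension is available, is to escape the markup characters with htmlspecialchars() and then pass the result through mb_encode_numericentity(). The wrapper name numeric_entities() below is made up for illustration, and this is only a sketch; with a UTF-8 page it should not normally be needed.

```php
<?php
// Hypothetical wrapper (not from the thread): escape markup characters, then
// turn every non-ASCII character into a decimal numeric character reference.
function numeric_entities($text)
{
    // Escape &, <, >, " and ' first. Doing this before the numeric encoding
    // means the "&#...;" references added below are not escaped a second time.
    $escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // Conversion map: start code point, end code point, offset, mask.
    // This covers everything from U+0080 up to U+10FFFF.
    $map = array(0x80, 0x10FFFF, 0, 0x1FFFFF);

    return mb_encode_numericentity($escaped, $map, 'UTF-8');
}

echo numeric_entities('© café <b>'), "\n";  // &#169; caf&#233; &lt;b&gt;
```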