Smartquotes: replaced in a test script, but not from POST
Posted: Tue Oct 21, 2008 6:34 am
Hi everyone
Ok it's yet another charset/HTML entity question. But this time with a difference!
I'm building a simple CMS on our intranet server, which adds/updates/deletes data stored in a MySQL database on our web host's server. It all works fine, and I'm in the final stages of development.
The users who have access are all decent HTML people, so my validation of the data only checks the basic structure of the XHTML: tags are closed, special chars are converted to entities, comments are closed etc. The users can type whatever HTML they like, so long as it's structurally correct. There will only be 2 or 3 users with access to this system. There's also an AJAX cURL call on the intranet page to the W3C Validator (using its SOAP response), which displays a message saying whether the code they entered was properly valid or not.
The database, HTML output and PHP5 are all set to use UTF-8.
On to my exact problem.
I don't mind what characters the users enter, but I'd like to completely get rid of any smart quotes, replacing them with their normal equivalents.
I'm leaving it up to the users to convert any other chars to their entity equivalents, like Euro sign etc.
This was the simple function I came up with:
Code:
function smartquote_conv($var) {
    // Curly quotes to look for, exactly as typed into this script file
    $search = array('‘', '’', '“', '”');
    // Their plain ASCII equivalents, in the same order
    $replace = array('\'', '\'', '"', '"');
    return str_replace($search, $replace, $var);
}
$var = <<<HTML
THIS ’IS A BIG‘ ”TEST page FROM THE ’NEW“ SYSTEM, ””””””””“““““““““““ WITH SMART QUOTES ALL OVER IT ”“
HTML;
echo smartquote_conv($var);
It works fine as a stand-alone test script. All the smart quotes get converted to their proper equivalents.
However, when I drop this function into the script that processes the CMS input and saves it to MySQL, they don't get replaced.
Also, when I do a POST to my simple test script they get replaced fine, except when they are POSTed from the CMS edit page. So there seems to be something in that CMS edit page which stops the receiving script from carrying out the replace properly.
I copied and pasted the smart quote characters from MS Word, and I've been writing these scripts in Notepad++ on Windows XP Pro... just in case there are any OS issues that I don't know about.
According to the Firefox page info, both the edit page and save page are UTF-8 and text/html.
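The page info only describes how the pages are served, not how the script file holding the quote literals was saved. One possible check, sketched below, is to compare raw bytes: a curly quote is three bytes in UTF-8 but a single byte in Windows-1252, and str_replace() only matches when the bytes are identical. The helper name hex_bytes is made up for illustration, and whether a file-encoding mismatch is actually the cause here is an assumption, not something confirmed above.

```php
<?php
// Hypothetical helper: show the raw bytes of a string as spaced hex pairs.
function hex_bytes($str) {
    return implode(' ', str_split(bin2hex($str), 2));
}

// U+2019 (right single quote) written as explicit UTF-8 bytes,
// so the result does not depend on how this file is saved.
$utf8_quote = "\xE2\x80\x99";

// The same character as the single byte Windows-1252 uses for it.
$cp1252_quote = "\x92";

echo hex_bytes($utf8_quote), "\n";   // e2 80 99
echo hex_bytes($cp1252_quote), "\n"; // 92
```

Dumping the entries of $search and the incoming POST value (for example hex_bytes($_POST['content']), where the field name is hypothetical) in both the test script and the CMS save script would show whether the two sides hold the same byte sequences.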
Anyone got any ideas on this?
It's been driving me nuts for the past couple of days.
As well as smart quotes, are there any other chars I should replace with their regular equivalents? Different dash/space lengths etc?
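For illustration, here is a minimal sketch of what a wider replacement map might look like, assuming the input is UTF-8. The function name typography_conv and the choice of characters (en and em dashes, the ellipsis, the no-break space) are assumptions for the example, not a definitive list. The keys are written as explicit byte escapes so that the script file's own encoding cannot affect the match.

```php
<?php
// Sketch only: assumes $var is UTF-8. Keys are explicit UTF-8 byte
// sequences, so the encoding this file is saved in does not matter.
function typography_conv($var) {
    $map = array(
        "\xE2\x80\x98" => "'",   // U+2018 left single quote
        "\xE2\x80\x99" => "'",   // U+2019 right single quote
        "\xE2\x80\x9C" => '"',   // U+201C left double quote
        "\xE2\x80\x9D" => '"',   // U+201D right double quote
        "\xE2\x80\x93" => '-',   // U+2013 en dash
        "\xE2\x80\x94" => '-',   // U+2014 em dash
        "\xE2\x80\xA6" => '...', // U+2026 horizontal ellipsis
        "\xC2\xA0"     => ' ',   // U+00A0 no-break space
    );
    // strtr() with an array replaces every key found in a single pass.
    return strtr($var, $map);
}

// Input: curly-quoted Hello, an en dash, a curly apostrophe and an ellipsis.
echo typography_conv("\xE2\x80\x9CHello\xE2\x80\x9D \xE2\x80\x93 it\xE2\x80\x99s\xE2\x80\xA6"), "\n";
// prints: "Hello" - it's...
```

Using strtr() with a single map keeps each character next to its replacement, which is a little easier to extend than two parallel arrays.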
Cheers, Ben