PHP Smart Quotes & Encoding

bjh5537
Forum Newbie
Posts: 2
Joined: Wed Mar 16, 2005 11:39 am

Post by bjh5537 »

I've recently been having trouble with PHP and encoding when sending e-mails. We have a form where our customers can enter a message and send it to a specific person. Its purpose is such that it makes sense to copy content from websites, such as MSNBC. MSNBC in particular uses "smart quotes," which are angled and differ between the opening and closing mark. These characters and others end up in the text box. When the send button is pressed and the message comes across, the smart quotes end up like this:
“ and â€
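
For readers hitting the same symptom: garbage of this shape usually appears when UTF-8 bytes are displayed as if they were Windows-1252. The following is a minimal sketch that reproduces the effect, assuming PHP 7 or later with the mbstring extension loaded; the variable names are illustrative only.

```php
<?php
// Assumption: the mbstring extension is available.
$utf8 = "\u{201C}";   // LEFT DOUBLE QUOTATION MARK, bytes E2 80 9C in UTF-8

// Treat those three bytes as Windows-1252 and re-encode them as UTF-8,
// which mimics a mail client guessing the wrong charset.
$misread = mb_convert_encoding($utf8, 'UTF-8', 'Windows-1252');

echo bin2hex($utf8), "\n";   // e2809c
echo $misread, "\n";         // â€œ
```

If this matches what arrives in the e-mail, the text itself is probably intact and only the declared charset is wrong.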
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

Ah... this is a big problem because of stupid Microsoft Word. Stupid Smart Quotes. I use this code to clean my output:

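function speckit($string) {
   // Start from PHP's entity table. ord() reads a single byte, so request the
   // single-byte ISO-8859-1 table explicitly (newer PHP versions default to UTF-8).
   $trans = get_html_translation_table(HTML_ENTITIES, ENT_COMPAT, 'ISO-8859-1');

   // Swap each named entity (&eacute;) for its numeric form (&#233;).
   foreach ($trans as $key => $value) {
      $trans[$key] = '&#'.ord($key).';';
   }

   // Reverse lookup for the non-breaking space and soft hyphen (not used below).
   $trap = array_flip($trans);
   $texcep1 = $trap['&#160;'];
   $texcep2 = $trap['&#173;'];

   // Extra translations for "smart" punctuation.
   $trans['–'] = "--";
   $trans['"'] = "&quot;";
   $trans["’"] = "'";
   $trans["‘"] = "'";
   $trans['“'] = "&quot;";
   $trans['”'] = "&quot;";
   $trans['…'] = "...";

   return strtr($string, $trans);
}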
It fetches PHP's HTML translation table with get_html_translation_table() and adds a few extra translations (that I found useful). Hope it helps.
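
If the goal is only to turn smart punctuation into plain ASCII, a narrower variant may be easier to reason about. This is a sketch, not the function above: ascii_punctuation() is a hypothetical name, it assumes the input is UTF-8, and it needs PHP 7 or later for the "\u{...}" escapes.

```php
<?php
// Hypothetical helper; assumes $text is UTF-8 encoded.
function ascii_punctuation(string $text): string
{
    // Keys are written as code point escapes so the result does not
    // depend on the encoding of this source file.
    return strtr($text, [
        "\u{2018}" => "'",    // left single quote
        "\u{2019}" => "'",    // right single quote / apostrophe
        "\u{201C}" => '"',    // left double quote
        "\u{201D}" => '"',    // right double quote
        "\u{2013}" => '--',   // en dash
        "\u{2026}" => '...',  // ellipsis
    ]);
}

echo ascii_punctuation("\u{201C}It\u{2019}s fine\u{2026}\u{201D}"), "\n"; // "It's fine..."
```

Because strtr() matches whole byte sequences, the multi-byte keys are replaced as units rather than byte by byte.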
bjh5537
Forum Newbie
Posts: 2
Joined: Wed Mar 16, 2005 11:39 am

Post by bjh5537 »

Unfortunately this only made things worse. It seems like some kind of conversion is happening in the background with PHP. I submit a form with the open smart quote, and I get back this after running it through your function:

�

It's as if PHP is converting that smart quote character into three characters or something and your script is picking out the first one.

It absolutely boggles my mind that seemingly no one in the PHP community has run into this issue. In our specific case, the data comes from old Macintosh computers submitting our forms (not even copied from Word or anything), and these characters still come up. I have no clue how to even begin diagnosing this.
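
One possible starting point for diagnosis is to look at the raw bytes that arrive. The "three characters" guess fits UTF-8, where a curly quote occupies three bytes and byte-oriented functions such as ord() see only the first. The sketch below assumes the mbstring extension; describe_bytes() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: report byte length, hex dump, and UTF-8 validity.
function describe_bytes(string $s): string
{
    $enc = mb_check_encoding($s, 'UTF-8') ? 'valid UTF-8' : 'not UTF-8';
    return sprintf('%d bytes [%s] %s', strlen($s), bin2hex($s), $enc);
}

echo describe_bytes("\u{201C}"), "\n"; // 3 bytes [e2809c] valid UTF-8
echo describe_bytes("\x93"), "\n";     // 1 bytes [93] not UTF-8
echo ord("\u{201C}"), "\n";            // 226, the first byte only
```

A lone 0x93 byte would suggest the browser sent Windows-1252, while e2809c would suggest UTF-8; running the submitted field through a check like this should show which case applies.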
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

It's often from odd content-encoding changes between locales. PHP doesn't, by default, process things in UTF-8; you need to use the mbstring functions for that, I believe.
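
As a rough illustration of the mbstring approach, the sketch below normalizes submitted text to UTF-8 and builds mail headers that declare that charset. Both function names are hypothetical, the Windows-1252 fallback is an assumption about what older browsers send, and the mail() call is left commented out.

```php
<?php
// Hypothetical helper: return $text as UTF-8, converting from a fallback
// single-byte encoding when the bytes are not already valid UTF-8.
function to_utf8(string $text, string $fallback = 'Windows-1252'): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }
    return mb_convert_encoding($text, 'UTF-8', $fallback);
}

// Hypothetical helper: headers telling the mail client the body is UTF-8.
function utf8_mail_headers(): string
{
    return implode("\r\n", [
        'MIME-Version: 1.0',
        'Content-Type: text/plain; charset=UTF-8',
        'Content-Transfer-Encoding: 8bit',
    ]);
}

$body = to_utf8("\x93quoted\x94");   // Windows-1252 curly quotes
echo bin2hex($body), "\n";           // e2809c71756f746564e2809d
// mail($to, $subject, $body, utf8_mail_headers());
```

Declaring the charset in the headers is what lets the receiving client decode the three-byte quotes correctly instead of showing them as separate characters.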