Double encoding and mistaken encoding???
Posted: Wed Mar 19, 2008 9:49 am
I am trying to understand how Swift is messing up my subjects. I wrote a small test, posted below, which will hopefully reproduce the problem.
My conclusions are:
- When the subject is passed straight to the Swift_Message constructor, it is somehow manipulated if it includes a "tag": a "<" followed by one or more characters followed by ">". In that case Swift seems to munge the non-alphabetic characters that come after the tag. I have also tried it without any "special" characters such as copyright or trademark, but it still seems to mess with anything "special".
- If I use setSubject it seems to both mess with the special characters when there is a tag and "double encode??" the copyright and trademark symbols. I do not know why.
- If the "<" or the ">" is removed from the subject, so that there is no well-formed tag, the subject is not munged, apart from the double encoding when I use setSubject.
I could be wrong, but I do not see anything wrong with the subject containing a <XYZ>, and previously when I just used mail() in php5 it worked no matter what I put in the subject, so I am damned if I can see why it would try to do anything with the subjects I present.
Assistance please?
Code: Select all
<?php
require_once 'spcheck.inc.php';
// version 3.3.2 of SwiftMail with PHP 5
$conn = sp_swift_mail(); // using SMTP connection
$subject = 'This is tag = abc<XYZ>~! @#$%/;:.,^&*"[]{}|\?()-_ +1234567890 and ™®© in the subject';
$body = 'This is the body';
$message = new Swift_Message($subject, $body);
$s = $message->getSubject();
echo 'Orig Subject: '.$s;
$number_sent = $conn->send($message,'tim@possibilogy.com','tchambers@schellingpoint.com');
// received subject is This is tag = abc<XYZ>fiEgQCMkJS87Oi4sXiYqIltde318XD8oKS1fICsxMjM0NTY3ODkwIGFud ™®© in the subject
$tagged_subject = 'This is a tag <XYZ>in the subject';
$message->setSubject($tagged_subject);
$s = $message->getSubject();
echo '<br/>Tagged Subject: '.htmlentities($s,ENT_QUOTES,'UTF-8'); // ENT_QUOTES is a constant, not a variable
$number_sent = $conn->send($message,'tim@possibilogy.com','tchambers@schellingpoint.com');
$tagged_subject = 'This is another tag = abc<XYZ>~! @#$%/;:.,^&*"[]{}|\?()-_ +1234567890 and ™®© in the subject';
$message->setSubject($tagged_subject);
$s = $message->getSubject();
echo '<br/>Tagged Subject: '.htmlentities($s,ENT_QUOTES,'UTF-8'); // ENT_QUOTES is a constant, not a variable
$number_sent = $conn->send($message,'tim@possibilogy.com','tchambers@schellingpoint.com');
// when actually received this subject has
// This is another tag = abc<XYZ>fiEgQCMkJS87Oi4sXiYqIltde318XD8oKS1fICsxMjM0NTY3ODkw and �®© in the subject
// try stripping the "<" from the tag (stripping either '<' or '>' seems to make all well)
// strpos() returns false when nothing is found and 0 for a match at the start, so compare strictly
$lt = strpos($tagged_subject, '<');
$gt = strpos($tagged_subject, '>');
if ($lt !== false && $gt !== false && $lt < $gt) {
    $tagged_subject = str_replace('<', ' ', $tagged_subject); }
$message->setSubject($tagged_subject);
$s = $message->getSubject();
echo '<br/>Tagged Subject: '.htmlentities($s,ENT_QUOTES,'UTF-8'); // ENT_QUOTES is a constant, not a variable
$number_sent = $conn->send($message,'tim@possibilogy.com','tchambers@schellingpoint.com');
// received subject is This is another tag = abc XYZ>~! @#$%/;:.,^&*"[]{}|\?()-_ +1234567890 and �®© in the subject
/*
on page out:
Orig Subject: This is tag = abc~! @#$%/;:.,^&*"[]{}|\?()-_ +1234567890 and ™®© in the subject
Tagged Subject: This is a tag <XYZ>in the subject
Tagged Subject: This is another tag = abc<XYZ>~! @#$%/;:.,^&*"[]{}|\?()-_ +1234567890 and ™®© in the subject
Tagged Subject: This is another tag = abc XYZ>~! @#$%/;:.,^&*"[]{}|\?()-_ +1234567890 and ™®© in the subject
*/
?>
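One further observation, offered tentatively: the garbled run in the received subject looks like plain base64 of the characters that followed <XYZ> in the subject I set. Below is a minimal check that uses only the fragment and the subject text quoted in the test above. The variable names are made up for this illustration, and nothing here calls Swift itself.

```php
<?php
// Garbled run copied from the received subject (the part right after "<XYZ>").
$received_fragment = 'fiEgQCMkJS87Oi4sXiYqIltde318XD8oKS1fICsxMjM0NTY3ODkw';

// The characters that followed "<XYZ>" in the subject as I originally set it,
// up to (but not including) " and ™®© in the subject".
$original_run = '~! @#$%/;:.,^&*"[]{}|\?()-_ +1234567890';

// If the fragment is base64 of the original run, decoding it gives the run back.
$decoded = base64_decode($received_fragment);
var_dump($decoded === $original_run); // bool(true)
```

If that is right, the text after the tag is being encoded rather than corrupted, which might point at the header-encoding step rather than at the SMTP connection. That is only my guess, and it does not explain why a "<...>" in the subject would trigger it.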