A client of mine, for some godforsaken reason, writes articles in rich text format. She wants to copy this text directly onto her webpage. This is all well and good, and I've got a system in place for her to put it into a database. Once it's in the database, the "display.php?article=name_here" file displays the text. Unfortunately, various characters are not being interpreted correctly. I've tried UTF-8 and ISO-8859-1, and neither works fully. What charset should I be using?
Also, related to this, I have a way for her to edit text that she has already loaded into the database. My edit.php page uses XMLHttpRequest to pull the data out of the database, and uses "getElementById('edit').innerHTML = data;" to insert the data into a <textarea>. This works just fine for about half of the articles in the database. For the other half, only the first ~3000 characters are loaded, and the rest gets truncated. I cannot for the life of me figure out why, because I've checked the database, and the database DOES have all the data in it. Help?!
Thanks.
Charset for Rich Text Format?
Re: Charset for Rich Text Format?
eeek! Have fun man: http://www.biblioscape.com/rtf15_spec.htm
Re: Charset for Rich Text Format?
D:astions wrote: eeek! Have fun man: http://www.biblioscape.com/rtf15_spec.htm
I read that document yesterday and was not happy about it. Surely _someone_ has had to deal with this before? I'd rather not tell my client that they need to change an ingrained habit (saving in rtf). I've been googling all over the place and can't find anything very helpful.
Re: Charset for Rich Text Format?
You might want to try an RTE such as TinyMCE or FCKeditor; those can handle text pasted from Word and RTF documents.
As for "only the first ~3000 characters are loaded, and the rest gets truncated": my guess is that there is an HTML entity in the data that breaks up the markup. You should escape those using htmlentities or htmlspecialchars in the PHP script that fetches the data.
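For illustration only, here is roughly what that kind of escaping does, written in JavaScript so it sits next to the client-side code in this thread. `escapeXml` is a made-up name, not a built-in. On the server the real call would be PHP's htmlspecialchars, whose exact output can differ slightly (for example in how it treats single quotes).

```javascript
// Hypothetical helper (not from the thread): escapes the five characters
// that are special inside XML text and attribute values.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;") // must run first, or the entities below get double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Made-up sample data containing characters that would break raw XML
const sample = 'Tom & Jerry said "x < y"';
const wrapped = "<content>" + escapeXml(sample) + "</content>";
console.log(wrapped);
```

Without the escaping, the bare `&` and `<` in the sample would make the response invalid XML, and a parser may stop or give up at that point.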
Re: Charset for Rich Text Format?
pytrin wrote: You might want to try an RTE such as TinyMCE or FCKeditor, those can handle text pasted from word and rtf documents.
Thanks, I'll look into that.
pytrin wrote: My guess is that there is an html entity in the data that breaks up the markup. You should escape those using htmlentities or htmlspecialchars in your php script that fetches the data.
I'm testing with a big long article, no html entities or anything like that, just plain text. I _think_ the issue has something to do with xml childnodes being broken up into 4k chunks: http://www.webmasterworld.com/javascript/3388031.htm
I see pretty much that same behavior when I test in FF3, however when I test in IE, the larger blocks of text won't even be input into the textarea (despite being loaded into a javascript variable). Here's the relevant bit of code... maybe I'm doing something wrong?
Code: Select all
reqsend.onreadystatechange = function()
{
    if (reqsend.readyState == 4 && reqsend.status == 200)
    {
        title = reqsend.responseXML.getElementsByTagName("title");
        content = reqsend.responseXML.getElementsByTagName("content");
        day = reqsend.responseXML.getElementsByTagName("day");
        month = reqsend.responseXML.getElementsByTagName("month");
        year = reqsend.responseXML.getElementsByTagName("year");
        alert("first alert");
        i = month[0].childNodes[0].nodeValue - 1;
        alert("second alert");
        j = day[0].childNodes[0].nodeValue - 1;
        document.getElementById("editContent").innerHTML = content[0].childNodes[0].nodeValue;
        document.forms['edit'].title.value = title[0].childNodes[0].nodeValue;
        document.forms['edit'].oldtitle.value = title[0].childNodes[0].nodeValue;
        document.forms['edit'].month.options[i].selected = "selected";
        document.forms['edit'].day.options[j].selected = "selected";
        document.forms['edit'].year.value = year[0].childNodes[0].nodeValue;
    }
    else
        document.forms['edit'].title.value = "Loading...";
};
Code: Select all
header("Content-Type: text/xml");
echo "<article>";
echo "<title>$data[0]</title>";
echo "<content>$data[1]</content>";
echo "<day>$day</day>";
echo "<month>$month</month>";
echo "<year>$year</year>";
echo "</article>";
Re: Charset for Rich Text Format?
How did you write your AJAX implementation? I highly recommend using a framework such as jQuery to handle those interactions in a cross-browser compatible way.
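Framework or not, the 4k-chunk theory is easy to check. The posted handler reads only `childNodes[0]`, so if the parser did split the text of `<content>` into several text nodes, everything after the first one is dropped. A minimal sketch, assuming that is what is happening: `collectText` is a made-up helper name, and the fake element below only stands in for what `responseXML` would return.

```javascript
// Hypothetical helper: joins the nodeValue of every text node (nodeType 3)
// and CDATA section (nodeType 4) directly under the given element.
function collectText(element) {
  let text = "";
  for (let i = 0; i < element.childNodes.length; i++) {
    const child = element.childNodes[i];
    if (child.nodeType === 3 || child.nodeType === 4) {
      text += child.nodeValue;
    }
  }
  return text;
}

// Stand-in for a <content> element whose text arrived split into chunks
const fakeContent = {
  childNodes: [
    { nodeType: 3, nodeValue: "first chunk, " },
    { nodeType: 3, nodeValue: "second chunk" },
  ],
};

console.log(collectText(fakeContent)); // prints "first chunk, second chunk"
```

In the handler this would replace `content[0].childNodes[0].nodeValue` with `collectText(content[0])`. For a textarea it is also probably safer to assign the result to `.value` rather than `.innerHTML`, since the text is then not parsed as markup.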
Re: Charset for Rich Text Format?
I will use jQuery soon, but I need to finish this project tonight, and I'm not going to go back and change everything to jQuery right now. I probably will in the future though. Anyway, I've figured out exactly what's causing the problems... these three characters:
“
”
’
Man I hate those things. I can get them to display properly when I'm just outputting to a page, but when I try to use ajax to pull them from a database, and place it into a textarea, it chokes.
None of these solutions have worked on the client side:
stringVariable.replace(/“/, "&quot;");
stringVariable.replace(/\“/, "&quot;");
and none of these have worked on the server side:
str_replace(/“/, "&quot;", $stringVariable);
str_replace(/\“/, "&quot;", $stringVariable);
Neither PHP nor JavaScript recognizes that character, so it doesn't get replaced in either place. I keep getting "�" in the textarea, which prevents the Ajax from working in IE and looks bad in FF.
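A few things may be working against the client-side attempts. `replace` returns a new string rather than changing the original, a regex without the `g` flag only replaces the first match, and a literal “ typed into the script only matches if the script file and the data agree on encoding. A minimal sketch that avoids all three by using Unicode escapes (U+201C, U+201D, U+2018, U+2019); `straightenQuotes` is a made-up name, and the sketch assumes the text reaches JavaScript correctly decoded.

```javascript
// Hypothetical helper: swaps curly quotes for their plain ASCII equivalents.
function straightenQuotes(text) {
  return text
    .replace(/[\u201C\u201D]/g, '"') // left and right double quotes
    .replace(/[\u2018\u2019]/g, "'"); // left and right single quotes
}

// Made-up sample written with escapes, so the file's own encoding does not matter
const input = "\u201CIt\u2019s fine,\u201D she said.";
const output = straightenQuotes(input); // the result has to be kept, not discarded
console.log(output); // prints: "It's fine," she said.
```

If the string already contains "�" by the time it arrives, the original character is gone and no client-side replace can bring it back. That usually means the bytes were decoded with the wrong charset somewhere between the database and the XML response. Text pasted from Word or RTF often stores these quotes as the single Windows-1252 bytes 0x93, 0x94 and 0x92, which are not valid on their own in UTF-8.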