problem with unknown characters in rss feed

mathruD
Forum Newbie
Posts: 14
Joined: Thu Jul 23, 2009 2:58 am

problem with unknown characters in rss feed

Post by mathruD »

I'm having a problem with unknown characters showing up in my RSS feed. The problem appears to be the quote symbol ("), as well as a black diamond with a ? inside it. For instance, one title might show up like this:

?Cure for the Common Font? ? A Web Designer?s Introduction to Typeface�Selection (this is the way it is displaying)
Cure for the Common Font? A Web Designer's Introduction to Typeface Selection (this is how it should display)

For some reason, the (") symbol is showing up as a (?). I can use the str_replace function to replace the (?) with a ("); however, if there is a legitimate (?) in the title, as in the example above, it also gets changed to a ("). As for the black diamond, I have no clue what is causing it, so I don't know where to begin trying to replace it.

My code is currently as follows:

<?php
// Characters to strip or swap in feed titles. The "?" => '"' pair is the
// workaround described above; it also rewrites legitimate question marks.
// "\r\n" is listed before "\n" so that no stray "\r" is left behind.
$search = array("?", "\r\n", "\n", "&#10", "&#09", "%09", "%20", "\0");
$replace = array('"', "", "", "", "&nbsp; &nbsp; ", ",", " ", "");

// Load the RSS library once, outside the loop, so it is not re-included per row
require_once('RSS/rss_fetch.inc');
?>

<?php do { ?>

<div class="resourcesFeedCntr">
    <div class="resourcesFeedTitle"><a href="<?php echo $row_resourceFeed_rs['resource_link']; ?>"><?php echo $row_resourceFeed_rs['resource_title']; ?></a></div>

    <div class="resourcesFeedContent">
      <ul>
      <?php
      $rss = fetch_rss($row_resourceFeed_rs['resource_rssLink']);
      // Keep only the first 8 listings
      $items = array_slice($rss->items, 0, 8);
      // Cycle through and display the listings
      foreach ($items as $item) { ?>
      <li><a href="<?php echo $item['link']; ?>"><?php echo str_replace($search, $replace, $item['title']); ?></a></li>
      <?php } ?>
      </ul>
    </div>
</div>

<?php } while ($row_resourceFeed_rs = mysql_fetch_assoc($resourceFeed_rs)); ?>
Can someone please give me an idea of how to correct this problem? Also, the page's charset is set to UTF-8.
pickle
Briney Mod
Posts: 6445
Joined: Mon Jan 19, 2004 6:11 pm
Location: 53.01N x 112.48W

Re: problem with unknown characters in rss feed

Post by pickle »

Is the charset of the RSS feed the same as the charset of the page you're viewing it in?
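
To illustrate why a charset mismatch would matter here, below is a minimal sketch, assuming the mbstring extension is available. It assumes that somewhere between the feed and the page the text gets converted from UTF-8 to a single-byte encoding such as ISO-8859-1. The sample title is a guess at the original characters (curly quotes, an em dash, a curly apostrophe and a no-break space), reconstructed from the garbled display above rather than taken from the feed itself.

```php
<?php
// Assumed original title, written as UTF-8 bytes: curly double quotes
// (U+201C/U+201D), an em dash (U+2014), a curly apostrophe (U+2019)
// and a no-break space (U+00A0). This is a reconstruction, not feed data.
$utf8 = "\xE2\x80\x9CCure for the Common Font\xE2\x80\x9D \xE2\x80\x94 "
      . "A Web Designer\xE2\x80\x99s Introduction to Typeface\xC2\xA0Selection";

// Convert to ISO-8859-1. Characters with no Latin-1 equivalent are
// replaced by mbstring's default substitute character, "?".
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

// The no-break space survives as the single byte 0xA0, which is not
// valid UTF-8 on its own, so a UTF-8 page cannot draw it properly.
$isValidUtf8 = mb_check_encoding($latin1, 'UTF-8');

// Make the raw 0xA0 byte visible when printing
echo str_replace("\xA0", '[0xA0]', $latin1), "\n";
var_dump($isValidUtf8);
```

If that assumption holds, the printed string should show the same pattern as the broken title: a "?" wherever a curly quote, apostrophe or dash used to be, and a stray byte where the black diamond appears. If the included rss_fetch.inc is MagpieRSS (the fetch_rss() call suggests it, but that is a guess), its MAGPIE_OUTPUT_ENCODING setting, which I believe defaults to ISO-8859-1, would be worth checking; defining it as 'UTF-8' before the include may avoid the conversion.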
Real programmers don't comment their code. If it was hard to write, it should be hard to understand.
mathruD
Forum Newbie
Posts: 14
Joined: Thu Jul 23, 2009 2:58 am

Re: problem with unknown characters in rss feed

Post by mathruD »

I just took a look at the source of the RSS feed that is causing the problem, and the first line says:

<?xml version="1.0" encoding="UTF-8"?>

My PHP page is coded as XHTML 1.0 Transitional. Would that have anything to do with it?
Also, here is a link to the actual RSS feed that I am pulling from. It is the one that is causing the problem:
http://feeds2.feedburner.com/typographica

I've also tried code along these lines, which I found online while searching for a solution, but none of it worked (I tried each snippet separately).

// if your input encoding is ISO 8859-1
htmlspecialchars(utf8_encode($string), ENT_QUOTES)

// if your input encoding is UTF-8
htmlspecialchars($string, ENT_QUOTES, 'UTF-8')

$output = htmlentities(utf8_encode($source));
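
A possible reason none of these helped, sketched under the assumption that the title has already been through a lossy conversion to ISO-8859-1 by the time this code sees it: re-encoding can repair a stray byte, but it cannot bring back a character that was already replaced by "?". The `$lossy` string below is made-up sample data, and `mb_convert_encoding()` stands in for `utf8_encode()`, which does the same ISO-8859-1 to UTF-8 conversion but is deprecated as of PHP 8.2.

```php
<?php
// Made-up sample: bytes as they might look after a lossy conversion.
// The quotes are already "?" and the no-break space is a lone 0xA0 byte.
$lossy = "?Cure? Typeface\xA0Selection";

// Same conversion utf8_encode($lossy) would perform (ISO-8859-1 -> UTF-8)
$reencoded = mb_convert_encoding($lossy, 'UTF-8', 'ISO-8859-1');

// The 0xA0 byte becomes a valid UTF-8 no-break space (0xC2 0xA0) ...
$isValidUtf8 = mb_check_encoding($reencoded, 'UTF-8');

// ... but the "?" placeholders pass through unchanged, and
// htmlspecialchars() only escapes &, <, > and quote characters.
$escaped = htmlspecialchars($reencoded, ENT_QUOTES, 'UTF-8');

var_dump($isValidUtf8);
var_dump($escaped === $reencoded);
```

So the fix most likely has to happen earlier, at the point where the feed text is first converted, rather than at output time.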
flying_circus
Forum Regular
Posts: 732
Joined: Wed Mar 05, 2008 10:23 pm
Location: Sunriver, OR

Re: problem with unknown characters in rss feed

Post by flying_circus »

mathruD wrote: here is a link to the actual rss feed that i am pulling from. it is the one that is causing the problem:
http://feeds2.feedburner.com/typographica
<META http-equiv="Content-Type" content="text/html; charset=UTF-16">
mathruD
Forum Newbie
Posts: 14
Joined: Thu Jul 23, 2009 2:58 am

Re: problem with unknown characters in rss feed

Post by mathruD »

If you don't mind, can you explain where you are seeing the line that establishes the charset as UTF-16? When I view the source for that link, it shows up as <?xml version="1.0" encoding="UTF-8"?>

Also, what would I have to do to convert it to UTF-8 so that it displays properly?
flying_circus
Forum Regular
Posts: 732
Joined: Wed Mar 05, 2008 10:23 pm
Location: Sunriver, OR

Re: problem with unknown characters in rss feed

Post by flying_circus »

mathruD wrote: If you don't mind, can you explain where you are seeing the line that establishes the charset as UTF-16? When I view the source for that link, it shows up as <?xml version="1.0" encoding="UTF-8"?>

Also, what would I have to do to convert it to UTF-8 so that it displays properly?
Interesting. I posted the above from work, and I believe I have IE9 installed there. At home, I am running IE8 and also see UTF-8. There is something weird, though: when you right-click the page and go to Encoding, it's all greyed out and "unicode" is selected, not "Unicode (UTF-8)".

You can look into PHP's mbstring extension, specifically mb_check_encoding() and mb_convert_encoding().
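
As a rough sketch of how those two functions could be combined: the helper name `to_utf8()` and its `$assumedSource` default are hypothetical, and ISO-8859-1 is only a guess at what the non-UTF-8 input might be. The idea is to leave text alone when it is already valid UTF-8 and convert it only when it is not.

```php
<?php
// Hypothetical helper: return $text as UTF-8, converting only when needed.
// $assumedSource is a guess at the real encoding; adjust it for your feed.
function to_utf8($text, $assumedSource = 'ISO-8859-1')
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($text, 'UTF-8', $assumedSource);
}

$alreadyUtf8 = "Typeface\xC2\xA0Selection"; // no-break space as UTF-8
$latin1      = "Typeface\xA0Selection";     // no-break space as ISO-8859-1

var_dump(to_utf8($alreadyUtf8) === $alreadyUtf8);
var_dump(to_utf8($latin1) === $alreadyUtf8);
```

Checking first avoids double-encoding text that is already UTF-8. Note that this cannot restore characters that were replaced by "?" before the string reached this code.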
mathruD
Forum Newbie
Posts: 14
Joined: Thu Jul 23, 2009 2:58 am

Re: problem with unknown characters in rss feed

Post by mathruD »

OK. I'll look into it and see if I can get it working.