Hi there, could someone please help? I've been trying to fix an XML feed for weeks and cannot get past this point.
The feed is generated by the following script, but I need it to read what's in the <div> tags on the site it pulls the info from. At the moment it just outputs this when you run it:
An invalid character was found in text content. Error processing resource 'http://www.**********.co.uk/rad.xml'. Line 1...
<description><div>
Here is the PHP file. Please, could someone help me sort this out?
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
<title>RAD Feed</title>
</head>
<body>
<?php
include "development/StripAttributes.php";
$username = "*********"; //Database Username
$password = "*********"; //Database Password
$hostname = "*********"; //Database Host
$dbname = "********"; //Database (or Catalog in MySQL parlance) name
$dbh = mysql_connect($hostname, $username, $password) or die("Could not connect to database server");
$selected = mysql_select_db($dbname, $dbh) or die("Database not found or problem connecting");
$result = mysql_query("SELECT position, postdate, jobref, jobid, country, description FROM jobs");
// if the file exists already, delete it first to flush data
$xmlfeedfile = "rad.xml";
$filehandle = fopen($xmlfeedfile, 'w');
$itemLink = $fullurl.'/info_jobid_'.$b[jobid].'.html';
$xmlString = '<'.'?'.'xml version="1.0" encoding="utf-8" '.'?' .'>
<source>
<publisher>******</publisher>
<publisherurl>*******</publisherurl>';
fwrite($filehandle, $xmlString);
while ($row = mysql_fetch_array($result,MYSQL_ASSOC)) {
$xmlString = '<job>
<title>' . $row{position} . '</title>
<date>' . $row{postdate} . '</date>
<referencenumber>' . $row{jobref} . '</referencenumber>
<url><a href="http://www.******.co.uk' . $fullurl . '/info_jobid_' . $row{jobid} . '.html" target="job' . $row{jobid} . '">' . $row{jobref} . '</a></url>
<country>United Kingdom</country>
<description>' . $row{description} . '</description>
</job>
';
fwrite($filehandle, $xmlString);
}
mysql_close($dbh);
fwrite($filehandle, "</source>");
fclose($filehandle);
?>
</body>
</html>
I'm way out of my coding depth and need desperate help.
Re: I'm way out of my coding depth and need desperate help.
I'm not a PHP expert, but I have worked a lot with XML, and it seems that the XML you are trying to create is invalid. Some character or quote (single or double) is causing the problem. What is the output when you echo $xmlString?
Re: I'm way out of my coding depth and need desperate help.
It outputs a feed for indexing jobs into a job search engine.
If I view the source of the XML file it looks like this, which is how it's meant to look, but it doesn't display properly in the browser unless I view the source. Any ideas?
<?xml version="1.0" encoding="utf-8" ?>
<source>
<publisher>*******.co.uk</publisher>
<publisherurl>http://www.*******.co.uk</publisherurl><job>
<title>Editor, Men's Lifestyle Magazine</title>
<date>08/06/2008</date>
<referencenumber>SS1</referencenumber>
<url><a href="http://www.*******.co.uk/info_jobid_1.html" target="job1">SS1</a></url>
<country>United Kingdom</country>
<description><div> £££ Publication OverviewITP Publishing is set to launch an all-new ground-breaking monthly magazine for men, focusing on health, fitness, leisure and lifestyle. </div>
<div> </div>
<div>The magazine will be an invaluable guide for the modern man on how to attain peak fitness, look great, and cope with the stresses and strains of work and life. A wide spread of content covers everything from exercise and working-out, to nutrition and health, along with psychology, relationships, must-have gear, travel, adventure, style and grooming. </div>
<div> </div>
<div>The PositionThis is an opportunity to launch, develop and run an exciting new Dubai-based magazine for men. As the editor you will have the opportunity to shape the magazine’s content and style to cater to the diverse male populations of the GCC region, inspiring and encouraging men to lead a better, fitter and more fulfilling life. </div>
<div> </div>
<div>The successful applicant must have held a senior editorial position on a men’s lifestyle magazine, and must be actively involved in sports and fitness activities. He must display an intimate understanding of a gym, be familiar with work-out regimes, and be an ardent proponent of healthy living and self-improvement.Additionally the ideal candidate will be a passionate and skilled writer with excellent communication abilities, can-do attitude, a packed contacts book, and a never-ending stream of great ideas with which to fill the magazine, making it a must-read for men. </div>
<div> </div>
<div>The editor will have the ability to lead a team and inspire creativity, innovation and award-winning magazine journalism. A confident self starter, the candidate will naturally understand flatplanning, commissioning, deadlines, subbing and the magazine production process, and also be able to exercise strict quality control. </div></description>
</job>
Re: I'm way out of my coding depth and need desperate help.
hmm..
First, I can't find any end tag for the <source> root tag.
Second, try to keep the XML parent tags on separate lines (I have had some bad experiences with that).
Third, I believe that the &nbsp; is causing a problem, because I think it will be treated as an entity that needs to be declared. Try removing it to see what happens.
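A quick way to try that third suggestion, offered as a rough sketch rather than a tested fix. It assumes the stray character really is the HTML entity `&nbsp;` and that a modern PHP with the DOM extension is available; `replace_nbsp_entity` and `is_well_formed` are hypothetical helper names, not part of the original script. XML only predefines `&amp;`, `&lt;`, `&gt;`, `&quot;` and `&apos;`, so the sketch swaps `&nbsp;` for the numeric reference `&#160;` and checks whether a snippet parses before and after:

```php
<?php
// Hypothetical helper: turn the HTML-only entity &nbsp; into a numeric
// character reference, which any XML parser accepts without a DTD.
function replace_nbsp_entity(string $text): string
{
    return str_replace('&nbsp;', '&#160;', $text);
}

// Hypothetical helper: true when the string parses as well-formed XML.
function is_well_formed(string $xml): bool
{
    $previous = libxml_use_internal_errors(true); // collect errors quietly
    $doc = new DOMDocument();
    $ok = $doc->loadXML($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $ok;
}

$broken = '<description>&nbsp;</description>';
$fixed  = replace_nbsp_entity($broken);

var_dump(is_well_formed($broken)); // &nbsp; is undeclared here
var_dump(is_well_formed($fixed));  // &#160; needs no declaration
```

If the second check passes on real feed data, the same `str_replace` call could be applied to each field before it is written to the file.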
Re: I'm way out of my coding depth and need desperate help.
You've put HTML tags inside XML, which treats them as part of its own markup. You need to put all HTML inside CDATA blocks:
<url><![CDATA[ ... link here ... ]]></url>
<description><![CDATA[ ... description here ... ]]></description>
Re: I'm way out of my coding depth and need desperate help.
Hi, thanks for that. I didn't have a clue. It worked for part of it, but now I get this. Any ideas?
An invalid character was found in text content. Error processing resource 'http://www.******.co.uk/rad.xml'. Line 1...
<description><![CDATA[<div>
Re: I'm way out of my coding depth and need desperate help.
Thanks all for your help.
I have put the CDATA in and that works. However, I cannot remove the &nbsp;'s at the source, as they are added by clients.
Does anyone have an idea for a workaround that would fit into this code?
Thanks in advance because at present I am
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
<title>RAD Feed</title>
</head>
<body>
<?php
include "development/StripAttributes.php";
$username = "******"; //Database Username
$password = "******"; //Database Password
$hostname = "******"; //Database Host
$dbname = "******"; //Database (or Catalog in MySQL parlance) name
$dbh = mysql_connect($hostname, $username, $password) or die("Could not connect to database server");
$selected = mysql_select_db($dbname, $dbh) or die("Database not found or problem connecting");
$result = mysql_query("SELECT position, postdate, jobref, jobid, country, description FROM jobs");
function stripchars($str)
{
$row = htmlentities("$str");
$row = ereg_replace(128, "€", $row); // Euro symbol
$row = ereg_replace(133, "…", $row); // ellipses
$row = ereg_replace(8226, "″", $row); // double prime
$row = ereg_replace(8216, "'", $row); // left single quote
$row = ereg_replace(145, "'", $row); // left single quote
$row = ereg_replace(8217, "'", $row); // right single quote
$row = ereg_replace(146, "'", $row); // right single quote
$row = ereg_replace(8220, "&quot;", $row); // left double quote
$row = ereg_replace(147, "&quot;", $row); // left double quote
$row = ereg_replace(8221, "&quot;", $row); // right double quote
$row = ereg_replace(148, "&quot;", $row); // right double quote
$row = ereg_replace(8226, "•", $row); // bullet
$row = ereg_replace(149, "•", $row); // bullet
$row = ereg_replace(8211, "–", $row); // en dash
$row = ereg_replace(150, "–", $row); // en dash
$row = ereg_replace(8212, "—", $row); // em dash
$row = ereg_replace(151, "—", $row); // em dash
$row = ereg_replace(8482, "™", $row); // trademark
$row = ereg_replace(153, "™", $row); // trademark
$row = ereg_replace(169, "©", $row); // copyright mark
$row = ereg_replace(174, "®", $row); // registration mark
$row = ereg_replace("’","’",$row);//fix SQL
$final = htmlentities("$row");
return $final;
}
// if the file exists already, delete it first to flush data
$xmlfeedfile = "rad.xml";
$filehandle = fopen($xmlfeedfile, 'w');
$itemLink = $fullurl.'/info_jobid_'.$b[jobid].'.html';
$xmlString = '<'.'?'.'xml version="1.0" encoding="utf-8" '.'?' .'>
<source>
<publisher>*******.co.uk</publisher>
<publisherurl>http://www.********.co.uk</publisherurl>';
fwrite($filehandle, $xmlString);
while ($row = mysql_fetch_array($result,MYSQL_ASSOC)) {
$pos = stripslashes(strip_tags($row['position']));
$date = stripslashes(strip_tags($row['postdate']));
$ref = stripslashes(strip_tags($row['jobref']));
$desc = stripslashes(strip_tags($row['description']));
$jid = stripslashes($row['jobid']);
$xmlString = "<job>\n\t<title>{$pos}</title>\n\t<date>{$date}</date>\n\t
<referencenumber>{$jobref}</referencenumber>\n\t<url><a href=\"http://www.******.co.uk{$fullurl}/info_jobid_{$jobid}.html\" target=\"job{$jobid}\">{$jobref}</a></url>\n\t<country>United Kingdom</country>\n\t<description>{$desc}</description>\n</job>";
fwrite($filehandle, $xmlString);
}
mysql_close($dbh);
fwrite($filehandle, "</source>");
fclose($filehandle);
?>
</body>
</html>
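One possible workaround, sketched under assumptions rather than tested against the real feed. The "invalid character" error usually means the file contains bytes that are not valid in the encoding it declares, so the guess here is that the job text (the £ signs and curly apostrophes) is stored as Windows-1252 while the feed declares utf-8. The helper names `to_utf8` and `cdata` are hypothetical, and the sample string is made up for illustration. Inside a CDATA section `&nbsp;` is plain text and is not parsed as an entity, so it can stay as the clients typed it:

```php
<?php
// Assumption: the database text is Windows-1252 (CP1252), not UTF-8.
// Hypothetical helper: make the text match the feed's utf-8 declaration.
function to_utf8(string $text): string
{
    // preg_match with the /u modifier fails on bytes that are not valid
    // UTF-8, so a match means the text can be left alone.
    if (preg_match('//u', $text) === 1) {
        return $text;
    }
    $converted = iconv('CP1252', 'UTF-8', $text);
    if ($converted === false) {
        throw new RuntimeException('Text could not be converted from CP1252');
    }
    return $converted;
}

// Hypothetical helper: wrap HTML in a CDATA section. A literal "]]>" in the
// text would end the section early, so it is split across two sections.
function cdata(string $html): string
{
    return '<![CDATA[' . str_replace(']]>', ']]]]><![CDATA[>', $html) . ']]>';
}

// Made-up sample: three pound signs and a curly apostrophe as CP1252 bytes.
$raw = "<div>\xA3\xA3\xA3 men\x92s magazine&nbsp;</div>";
$xml = '<description>' . cdata(to_utf8($raw)) . "</description>\n";

echo $xml;
```

In the script above, the same two calls could wrap `$row['description']` inside the while loop before the string is passed to fwrite. If the data turns out to be in some other encoding, the first argument to iconv would need to change.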