How to get the right charset/encoding?

QueenZ
Forum Newbie
Posts: 2
Joined: Wed Feb 22, 2012 2:18 pm


Post by QueenZ

Hello, I am trying to parse the title from a Chinese website, but the result comes out garbled. It looks like an encoding problem. What can I do about it?

I need to get the title (the text on the gray background): 我和哥哥的秘密花园

But instead it outputs this: 脦脪潞脥赂莽赂莽碌脛脙脴脙脺禄篓脭掳

What's wrong?

Code:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
	<head>
		<title>TEST</title>
		<meta charset="gbk" />
	</head>
	
	<body>
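		<?php
			// Parse the remote thread page into a DOM tree.
			$dom = new DOMDocument;
			// Collect libxml's warnings about malformed markup instead of printing them.
			libxml_use_internal_errors(true);
			$am_link = "http://tieba.baidu.com/p/21993922";
			$dom->loadHTMLFile($am_link);
			libxml_clear_errors();

			// Select the first <h1> inside the thread-title <div>.
			$xpath = new DOMXPath($dom);
			$nodes = $xpath->query('//div[@class="l_thread_title"]/descendant::h1[1]');
			foreach ($nodes as $node)
			{
				echo $node->nodeValue, "\n";
				echo "<br />";
			}
		?>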
	</body>
</html>
social_experiment
DevNet Master
Posts: 2793
Joined: Sun Feb 15, 2009 11:08 am
Location: .za

Re: How to get the right charset/encoding?

Post by social_experiment

Code:

<meta http-equiv="Content-Type" content="text/html; charset=gbk"/>
Add the http-equiv and content attributes to the meta tag and see if that works.
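
If the meta tag alone does not fix it, the garbled output looks consistent with the page's GBK bytes being read as ISO-8859-1 by the parser and then displayed as GBK. Here is a minimal sketch of one way around that, assuming the source page really is GBK and that the mbstring and dom extensions are available: convert the bytes yourself before parsing. extractThreadTitle() is a hypothetical helper name, the sample markup is made up for illustration, and the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper (not from the original post): returns the thread title
// as a UTF-8 string, or null if the XPath query matches nothing.
function extractThreadTitle(string $rawHtml, string $sourceCharset = 'GBK'): ?string
{
    // Convert the raw bytes to UTF-8 explicitly instead of relying on detection.
    // The default of GBK is an assumption about the source page.
    $utf8Html = mb_convert_encoding($rawHtml, 'UTF-8', $sourceCharset);

    // Turn every non-ASCII character into a numeric entity so the markup is
    // plain ASCII and parses the same whatever charset the page declares.
    $asciiHtml = mb_encode_numericentity($utf8Html, [0x80, 0x10FFFF, 0, 0x1FFFFF], 'UTF-8');

    $dom = new DOMDocument();
    libxml_use_internal_errors(true);   // collect markup warnings silently
    $dom->loadHTML($asciiHtml);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query('//div[@class="l_thread_title"]/descendant::h1[1]');

    // DOM strings are always UTF-8, whatever the source encoding was.
    return $nodes->length > 0 ? trim($nodes->item(0)->nodeValue) : null;
}

// Made-up sample standing in for the remote page, encoded as GBK on purpose.
$sampleUtf8 = '<html><head><meta charset="gbk"></head><body>'
            . '<div class="l_thread_title"><h1>我和哥哥的秘密花园</h1></div>'
            . '</body></html>';
$sampleGbk = mb_convert_encoding($sampleUtf8, 'GBK', 'UTF-8');

echo extractThreadTitle($sampleGbk), "\n";
```

For the real page, the raw bytes could come from file_get_contents($am_link), which needs allow_url_fopen enabled. The returned value is UTF-8, so the page that prints it should declare charset=utf-8, or the string can be converted back with mb_convert_encoding($title, 'GBK', 'UTF-8') if the page has to stay GBK.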