PHP programming forum. Ask questions or help people concerning PHP code. Don't understand a function? Need help implementing a class? Don't understand a class? Here is where to ask. Remember to do your homework!
Moderator: General Moderators
voltrader
Forum Contributor
Posts: 223 Joined: Wed Jul 07, 2004 12:44 pm
Location: SF Bay Area
by voltrader » Sun Apr 01, 2007 7:07 pm
When I fetch a UTF-8 page from yahoo.co.jp using file_get_contents, it somehow comes out as charset=eucJP-win when it's echoed.
Code: Select all
$keyword=urlencode('日本');
$file="http://search.yahoo.co.jp/search?p=$keyword&ei=UTF-8&fr=top_v2&x=wrt";
$html=file_get_contents($file);
echo $html;
Not sure why this is!
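A first diagnostic step, sketched here as an editorial aside rather than something posted in the thread: check which charset the remote server actually declares. After `file_get_contents()` on an http:// URL, PHP's HTTP stream wrapper fills `$http_response_header` with the raw response header lines. The helper name `charset_from_headers` and the sample header lines below are hypothetical.

```php
<?php
/**
 * Hypothetical helper (not from the thread): return the charset named in a
 * Content-Type response header, or null if none is declared.
 */
function charset_from_headers(array $headers): ?string
{
    $charset = null;
    foreach ($headers as $line) {
        // Matches e.g. "Content-Type: text/html; charset=EUC-JP"
        if (preg_match('/^Content-Type:.*?charset\s*=\s*"?([\w.:-]+)/i', $line, $m)) {
            $charset = $m[1]; // keep the last match, in case redirects were followed
        }
    }
    return $charset;
}

// Sample header lines, made up for illustration. After a real
// file_get_contents('http://...') call, pass $http_response_header instead.
$sample = [
    'HTTP/1.1 200 OK',
    'Content-Type: text/html; charset=EUC-JP',
];
echo charset_from_headers($sample), "\n"; // prints "EUC-JP"
```

If the declared charset is not UTF-8, the bytes in `$html` are not UTF-8 either, whatever header the echoing script sends.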
aaronhall
DevNet Resident
Posts: 1040 Joined: Tue Aug 13, 2002 5:10 pm
Location: Back in Phoenix, missing the microbrews
by aaronhall » Sun Apr 01, 2007 8:45 pm
You have to tell the browser what charset your data is encoded in, or else it will guess.
Code: Select all
header('Content-Type: text/html; charset=UTF-8');
by voltrader » Mon Apr 02, 2007 12:24 am
Thanks. Before trying any charset conversion, I tried setting the header as aaronhall suggested above:
Code: Select all
header('Content-Type: text/html; charset=UTF-8');
$keyword=urlencode('東京');
$file="http://search.yahoo.co.jp/search?p=$keyword&ei=UTF-8&fr=top_v2&x=wrt";
$html=file_get_contents($file);
echo $html;
But no dice. Somehow the page is output as charset=eucJP-win even though
http://search.yahoo.co.jp/search?p=%C5% ... p_v2&x=wrt is in UTF-8
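One detail worth checking, offered as an editorial sketch rather than a diagnosis: `urlencode()` percent-encodes whatever bytes the string literal contains, so the query depends on the encoding the script file was saved in. The truncated URL above begins with `%C5`, which matches the EUC-JP bytes of 東 (C5 EC) rather than its UTF-8 bytes (E6 9D B1). That would be consistent with the script being saved as EUC-JP while the query claims `ei=UTF-8`. The example spells the bytes out with escapes so it does not depend on how the file is saved; it assumes the mbstring extension is available, and the variable names are illustrative.

```php
<?php
// 東京 written as explicit bytes, independent of this file's own encoding.
$eucJp = "\xC5\xEC\xB5\xFE";         // 東京 in EUC-JP
$utf8  = "\xE6\x9D\xB1\xE4\xBA\xAC"; // 東京 in UTF-8

echo urlencode($eucJp), "\n"; // %C5%EC%B5%FE
echo urlencode($utf8), "\n";  // %E6%9D%B1%E4%BA%AC

// If the script is saved as EUC-JP but the query says ei=UTF-8,
// convert the keyword to UTF-8 before percent-encoding it (needs mbstring).
$keyword = urlencode(mb_convert_encoding($eucJp, 'UTF-8', 'EUC-JP'));
echo $keyword, "\n";          // %E6%9D%B1%E4%BA%AC
```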
dibyendrah
Forum Contributor
Posts: 491 Joined: Wed Oct 19, 2005 5:14 am
Location: Nepal
by dibyendrah » Mon Apr 02, 2007 1:37 am
Sometimes you have to add an extra meta tag to tell the browser that the encoding is UTF-8, even though you have already sent this header:
Code: Select all
<?php header('Content-Type: text/html; charset=utf-8'); ?>
So adding the following tag may help:
Code: Select all
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
by dibyendrah » Mon Apr 02, 2007 1:51 am
Okay, here is the modified script, which works for me:
Code: Select all
<?php
header('Content-Type: text/html; charset=euc-jp');
$keyword=urlencode('東京');
$file="http://search.yahoo.co.jp/search?p=$keyword&ei=UTF-8&fr=top_v2&x=wrt";
$html=file_get_contents($file);
?><html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=euc-jp">
</head>
<body>
<?php echo $html; ?>
</body>
</html>
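An alternative to declaring EUC-JP, sketched under assumptions and not taken from the thread: convert the fetched bytes to UTF-8 and keep serving the page as UTF-8. The helper name `to_utf8` is hypothetical, the default source encoding `eucJP-win` is assumed only because that is the charset reported earlier in the thread, and the mbstring extension is required.

```php
<?php
/**
 * Hypothetical helper: return $bytes as UTF-8, converting from $assumedFrom
 * only when the input is not already valid UTF-8. Requires mbstring.
 */
function to_utf8(string $bytes, string $assumedFrom = 'eucJP-win'): string
{
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes; // already valid UTF-8 (or plain ASCII): nothing to do
    }
    return mb_convert_encoding($bytes, 'UTF-8', $assumedFrom);
}

// Stand-in for the fetched page: 東京 as EUC-JP bytes inside a tag.
$html = "<p>\xC5\xEC\xB5\xFE</p>";
$converted = to_utf8($html);

// Compare against the same markup written as UTF-8 bytes.
var_dump($converted === "<p>\xE6\x9D\xB1\xE4\xBA\xAC</p>"); // bool(true)
```

In the scripts above this would mean sending `header('Content-Type: text/html; charset=UTF-8');` and echoing `to_utf8($html)`. Any `<meta>` charset inside the fetched markup would still name the old encoding, but browsers give the HTTP header precedence over it.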
by voltrader » Thu May 10, 2007 3:55 pm
Ah, thank you for that. I will give it a try.