
charset problem

Posted: Sun Feb 25, 2007 11:35 pm
by itsmani1
I am having a charset problem.

I am working on a crawler. I crawl 3 pages; two of them are utf-8 and one is iso-8859-1.
On the page where I show the data, if I use "utf-8" the data coming from the iso-8859-1 page gets garbled, and if I use iso-8859-1 the data from the utf-8 pages gets garbled.

Any suggestions?

thank you,

Posted: Mon Feb 26, 2007 12:21 am
by volka

Posted: Mon Feb 26, 2007 12:39 am
by itsmani1
OK, thanks. I hope this will do the trick.

Posted: Wed Feb 28, 2007 12:34 am
by itsmani1
I tried utf8_encode() to convert the ISO strings to utf-8 and then print them, but no luck.

I also tried putting both of these lines in the page, but it's not working:

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<META http-equiv="Content-Type" content="text/html; charset=utf-8">


I tried:

utf8_encode ("Pe&#351;in al&#305;&#351;veri&#351;lerinize <font color=\"ff0000\">%10 indirim</font> uygulanmaktad&#305;r.");
It gives me the following output:
Pesin alisverislerinize %10 indirim uygulanmaktadir.

I just want to know whether there is a way to display the text exactly as it is in the string when the page is utf-8.

Posted: Wed Feb 28, 2007 5:13 am
by volka
&#351; is already a numeric Unicode entity. All non-ASCII characters of your string have been encoded this way. An HTML browser understands these entities, so there is no need for utf8_encode for this string.
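
To illustrate, here is a minimal sketch (the variable names are made up for this example). It assumes the string really does contain literal numeric entities, in which case it is plain ASCII and a charset conversion has nothing to change. html_entity_decode() is only needed if you want the actual UTF-8 characters in place of the entities.

```php
<?php
// Hypothetical input: non-ASCII letters are written as numeric entities.
$text = 'Pe&#351;in al&#305;&#351;veri&#351;lerinize %10 indirim uygulanmaktad&#305;r.';

// True when the string holds only 7-bit ASCII bytes. Such a string looks the
// same in iso-8859-1 and in utf-8, so converting it changes nothing.
$isAscii = !preg_match('/[^\x00-\x7F]/', $text);

// Optional: turn the entities into real UTF-8 characters.
$decoded = html_entity_decode($text, ENT_QUOTES | ENT_HTML401, 'UTF-8');
```

If the browser still shows plain "s" and "i" with the entities intact in the page source, the cause is likely somewhere other than the conversion call.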

Posted: Wed Feb 28, 2007 6:07 am
by itsmani1
My problem is that I am fetching data from three pages.
One is using UTF-8 and the others are using ISO-8859-9.

When I use utf-8, one displays fine and the others don't, and vice versa.
I want all of them to work on one page.


I tried to use utf8_encode() and its output is not what is required.

Posted: Wed Feb 28, 2007 6:21 am
by volka
If you set the charset of your page to utf-8 and you fetch content from an iso-8859-1 encoded page, you need utf8_encode, and only then.
If you set the charset of your page to iso-8859-1 and you fetch content from a utf-8 encoded page, you need utf8_decode, and only then.
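
A rough sketch of that rule for a page served as utf-8, assuming you know the charset of each fetched page and that the mbstring extension is available. The function name to_utf8 and the sample strings are made up for illustration. Note that utf8_encode() only converts from ISO-8859-1, so for the ISO-8859-9 pages mentioned earlier a general converter such as mb_convert_encoding() is used here instead.

```php
<?php
// Sketch only: needs the mbstring extension. to_utf8() is a hypothetical helper.
function to_utf8(string $content, string $sourceCharset): string
{
    // Content that is already UTF-8 must not be converted a second time.
    if (strcasecmp($sourceCharset, 'UTF-8') === 0) {
        return $content;
    }
    // Convert from the page's own charset to UTF-8.
    return mb_convert_encoding($content, 'UTF-8', $sourceCharset);
}

// Sample bytes: "\xE9" is "é" in ISO-8859-1, "\xFE" is "ş" in ISO-8859-9.
$fromLatin1  = to_utf8("caf\xE9", 'ISO-8859-1');
$fromLatin5  = to_utf8("Pe\xFEin", 'ISO-8859-9');
$alreadyUtf8 = to_utf8("caf\xC3\xA9", 'utf-8');
```

Run every fetched page through one such function before output, and declare a single charset (utf-8) on the page that shows the data.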