UTF-8 setup and usage

Posted: Mon Mar 10, 2003 9:48 am
by Junk Guy
I have rebuilt PHP and Apache to support UTF-8. I also modified a few test sections of my code to use utf8_encode and utf8_decode (although I'm currently only using utf8_encode) for data entry to the database and for data extraction from the database.

When a user submits data from a form in my test section, I run it through utf8_encode then submit it to the database (Oracle). When I pull the data out, I run it through utf8_encode and display it. It works fine - for data originating from the form.
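
A minimal sketch of what this round trip does at the byte level, assuming the browser already submits the form data as UTF-8 (an assumption, not something established above). utf8_encode treats its input as ISO-8859-1, so input that is already UTF-8 gets encoded a second time. The sample value and variable names are made up for illustration, and mb_convert_encoding stands in for utf8_encode because the latter is deprecated in recent PHP versions:

```php
<?php
// Hypothetical form value: "é" already encoded as UTF-8 (bytes C3 A9).
$submitted = "\xC3\xA9";

// Same conversion utf8_encode() performs: the input is assumed to be
// ISO-8859-1, so each byte is treated as one Latin-1 character.
$stored = mb_convert_encoding($submitted, 'UTF-8', 'ISO-8859-1');

echo bin2hex($submitted), "\n"; // c3a9
echo bin2hex($stored), "\n";    // c383c2a9 (encoded twice)
```

If the bytes in the database look like the second line, other clients reading the column as UTF-8 would see two unrelated characters instead of one.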

Another client app (Oracle Forms) is used for data entry/extraction on this same database. If I look at data, in Forms, that was entered into the database via my PHP scripts, it is in the form of "& # E345" (no spaces, and that is just an example). It won't display as text characters.
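
If the stored values really are HTML numeric character references (a guess based on the "&#..." shape, not confirmed), they can be turned back into UTF-8 text with html_entity_decode. A rough sketch with made-up sample values:

```php
<?php
// Hypothetical stored values in numeric character reference form.
$decimal = '&#233;';    // decimal reference for U+00E9
$hex     = '&#x20AC;';  // hexadecimal reference for U+20AC

// html_entity_decode() converts the references into characters
// encoded in the charset given as the third argument.
$a = html_entity_decode($decimal, ENT_QUOTES, 'UTF-8');
$b = html_entity_decode($hex, ENT_QUOTES, 'UTF-8');

echo bin2hex($a), "\n"; // c3a9
echo bin2hex($b), "\n"; // e282ac
```

This only helps with inspecting existing rows; it does not explain why the references were stored in the first place.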

Conversely, if I enter data into the database using Oracle Forms and then view it in my PHP pages, it shows up as "???". Through further testing with other client apps, I have determined that every app in use displays data entered through Oracle Forms in its correct textual representation, while data entered from my PHP scripts shows up as gibberish.

So the question is: what is wrong with my PHP configuration or usage such that the UTF-8 encoding isn't working correctly?

I also added the "<meta..." stuff into the HTML of the page, and also as a "headers('<meta...');" function call in my php at the top of each page.
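
For comparison, the usual way to declare the charset from PHP is header() with a Content-Type value, while the <meta> tag belongs in the HTML itself. A minimal sketch, assuming it runs before any output is sent; the variable name is illustrative only:

```php
<?php
// Hypothetical variable holding the content type to announce.
$contentType = 'text/html; charset=UTF-8';

// header() must be called before any output reaches the browser.
header('Content-Type: ' . $contentType);

// The in-page equivalent is a meta tag inside <head>.
$meta = '<meta http-equiv="Content-Type" content="' . $contentType . '">';
echo $meta, "\n";
```

With default_charset set in php.ini, PHP may already send the same header on its own, so the explicit call is mostly a safeguard.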

The following is taken from my PHP.INI:
output_buffering = On
.
.
output_handler = mb_output_handler
.
.
default_mimetype = "text/html"
default_charset = "UTF-8"
.
.
[mbstring]
; language for internal character representation.
;mbstring.language =

; internal/script encoding.
; Some encoding cannot work as internal encoding.
; (e.g. SJIS, BIG5, ISO-2022-*)
mbstring.internal_encoding = UTF-8

; http input encoding.
;mbstring.http_input = auto

; http output encoding. mb_output_handler must be
; registered as output buffer to function
mbstring.http_output = UTF-8

; enable automatic encoding translation according to
; mbstring.internal_encoding setting. Input chars are
; converted to internal encoding by setting this to On.
; Note: Do _not_ use automatic encoding translation for
; portable libs/applications.
mbstring.encoding_translation = On

; automatic encoding detection order.
; auto means
;mbstring.detect_order = auto

; substitute_character used when character cannot be converted
; one from another
mbstring.substitute_character = none;
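
A quick way to confirm that the settings above are the ones actually in effect for a request is to print them from a script. This is only a sketch; the list of directives is a selection from the excerpt, and the values printed depend entirely on the running installation:

```php
<?php
// Print the effective values as seen by the running PHP process.
// ini_get() returns false for a directive this PHP build does not know.
$directives = array(
    'default_charset',
    'output_handler',
    'mbstring.encoding_translation',
);
foreach ($directives as $name) {
    echo $name, ' = ', var_export(ini_get($name), true), "\n";
}

// With no argument, mb_internal_encoding() returns the current internal encoding.
echo 'internal encoding = ', mb_internal_encoding(), "\n";
```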


I have also modified httpd.conf as follows:
AddCharset UTF-8 .php

Thanks for your time.