Pretty UTF-8 urls

JavaScript and client side scripting.

Moderator: General Moderators

CoderGoblin
DevNet Resident
Posts: 1425
Joined: Tue Mar 16, 2004 10:03 am
Location: Aachen, Germany

Post by CoderGoblin »

The problem: /site/About in English corresponds to /site/ÜberUns in German as a website page (it may not be a good translation, but it will serve for now). When a person clicks on a link, the status bar shows the URL correctly (http://www.website.de/ÜberUns). Once the page loads, the browser shows "http://www.website.de/%C3%9CberUns" as the URL. Can I make the URL display nicely somehow? Is this a browser (client) thing, or can I do something (e.g. with headers) so the browser shows the URL as expected? At present I am testing with Firefox.

The background: using Zend Framework 0.7, we set the routing, which defaults to English. The site can support additional languages. When the language changes, a different routing is used (to reflect the current language). The translated route names may contain characters outside the standard A-Z range. At present everything works fine: the URL is processed correctly, but it does not look nice to the user. We also have a future requirement for Polish.
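
To make the setup above concrete, here is a minimal sketch of per-language route lookup, written in JavaScript rather than against the Zend Framework API. The names `ROUTES` and `resolvePage` and the page id `about` are hypothetical and exist only for illustration; the point is that the server can decode the percent-encoded path before matching it against the translated route names.

```javascript
// Hypothetical per-language route tables: translated path segment -> page id.
// '\u00DCberUns' is "ÜberUns" written with an escape for the U-umlaut.
const ROUTES = new Map([
  ['en', new Map([['About', 'about']])],
  ['de', new Map([['\u00DCberUns', 'about']])],
]);

// Resolve a raw request path such as '/%C3%9CberUns' to a page id.
// Falls back to the English table when the language is unknown.
function resolvePage(language, rawPath) {
  const table = ROUTES.get(language) || ROUTES.get('en');
  // Strip leading slashes, undo the percent-encoding, and normalise to NFC
  // so composed and decomposed forms of the same letter compare equal.
  const segment = decodeURIComponent(rawPath.replace(/^\/+/, '')).normalize('NFC');
  return table.has(segment) ? table.get(segment) : null;
}

console.log(resolvePage('de', '/%C3%9CberUns'));
console.log(resolvePage('en', '/About'));
```

Both calls resolve to the same page id, which matches the observation that the URL is processed correctly even though it is displayed in its escaped form.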

Hopefully you can understand what I am asking...

EDIT: My current understanding is that this is not possible, but if it is, I would like to know how. For now it would be up to the translators to use UeberUns instead of ÜberUns.
Kieran Huggins
DevNet Master
Posts: 3635
Joined: Wed Dec 06, 2006 4:14 pm
Location: Toronto, Canada

Post by Kieran Huggins »

Unfortunately, you are correct - it's not possible to have Unicode characters in a URL.

You can use mb_convert_encoding() to translate your Unicode URL to a URL-safe string for the link.
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

No, you can have Unicode characters in your URL. They just won't be pretty (they'll be the percent-encoded things you've seen).
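
As a rough illustration of where the "%C3%9C" comes from, the sketch below percent-encodes the example path segment from the first post using standard JavaScript functions. The variable names are made up for the example; the escape `\u00DC` stands for "Ü".

```javascript
// "ÜberUns", with the U-umlaut written as an escape so the source stays ASCII.
const segment = '\u00DCberUns';

// Percent-encoding turns each UTF-8 byte of a non-ASCII character into %XX.
// U+00DC is the two bytes C3 9C in UTF-8.
const encoded = encodeURIComponent(segment);
console.log(encoded); // %C3%9CberUns

// Decoding restores the original string, so no information is lost.
const decoded = decodeURIComponent(encoded);
console.log(decoded === segment); // true

// The URL parser applies the same encoding to the path on its own.
const url = new URL('http://www.website.de/' + segment);
console.log(url.href); // http://www.website.de/%C3%9CberUns
```

The escaped and unescaped forms identify the same resource; whether the address bar shows one or the other is up to the browser.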
Kieran Huggins
DevNet Master
Posts: 3635
Joined: Wed Dec 06, 2006 4:14 pm
Location: Toronto, Canada

Post by Kieran Huggins »

Ambush Commander wrote: No, you can have Unicode characters in your URL. They just won't be pretty (they'll be the percent-encoded things you've seen).
Sorry, yes - AC is correct. You "can" have Unicode characters in a URL, but they will be escaped.

You can use mb_convert_encoding() to translate your Unicode URL to an easy-to-read string for the link.
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

Hmm... I'd say punycode is kind of ugly too, and it is designed for domain names. For readability, I think I would go with transliterated URIs. However, Wikipedia doesn't seem to mind oodles of percent-encoded URIs all over the place.
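
One possible way to produce transliterated URIs is sketched below. It assumes the German convention of writing umlauts as digraphs (Ü becomes Ue) and strips diacritics from everything else, which would also cover most Polish letters. The names `SPECIAL_CASES` and `transliterate` are hypothetical, and the mapping is far from complete.

```javascript
// Hypothetical mapping for letters that need special handling:
// German umlauts and sharp s become digraphs, and the Polish L with stroke
// has no Unicode decomposition, so it is mapped by hand.
const SPECIAL_CASES = new Map([
  ['\u00C4', 'Ae'], ['\u00D6', 'Oe'], ['\u00DC', 'Ue'], // Ä Ö Ü
  ['\u00E4', 'ae'], ['\u00F6', 'oe'], ['\u00FC', 'ue'], // ä ö ü
  ['\u00DF', 'ss'],                                     // ß
  ['\u0141', 'L'], ['\u0142', 'l'],                     // Ł ł
]);

function transliterate(text) {
  // Work on the composed form so each special letter is a single code point.
  const mapped = Array.from(text.normalize('NFC'), (ch) =>
    SPECIAL_CASES.has(ch) ? SPECIAL_CASES.get(ch) : ch
  ).join('');
  // Decompose what is left and drop the combining marks (accents, ogoneks, dots).
  return mapped.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

console.log(transliterate('\u00DCberUns'));         // UeberUns
console.log(transliterate('\u017B\u00F3\u0142w'));  // Zolw
```

The result is plain ASCII, so it shows up unchanged in the address bar, at the cost of the spelling no longer matching the translated page title exactly.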
CoderGoblin
DevNet Resident
Posts: 1425
Joined: Tue Mar 16, 2004 10:03 am
Location: Aachen, Germany

Post by CoderGoblin »

Kieran Huggins wrote: You can use mb_convert_encoding() to translate your Unicode URL to an easy-to-read string for the link.
Thanks for everybody's replies. As stated originally, I thought this was the case. The actual link already shows correctly (in the status bar) without using mb_convert_encoding. I guess it is up to the translators. After all, the URL display is not a major thing; most sites I know still keep the default-language URL name, regardless of which language you select.