Minimum Every Software Developer Should Know About Unicode

Luke
The Ninja Space Mod
Posts: 6424
Joined: Fri Aug 05, 2005 1:53 pm
Location: Paradise, CA

Post by Luke »

I looked around to make sure this article hadn't already been posted on this site and found nothing, so I apologize if it's already here. I found it very helpful, as I was pretty clueless about Unicode and character sets, and it really enlightened me.

http://www.joelonsoftware.com/articles/Unicode.html

I hope it helps you guys. :D
RobertGonzalez
Site Administrator
Posts: 14293
Joined: Tue Sep 09, 2003 6:04 pm
Location: Fremont, CA, USA

Post by RobertGonzalez »

Great tutorial/advice/teaching, Ninja.
m3mn0n
PHP Evangelist
Posts: 3548
Joined: Tue Aug 13, 2002 3:35 pm
Location: Calgary, Canada

Post by m3mn0n »

Great find! Lots and lots of detail and history about the whole world of character sets.

I am interested in doing some internationalization/localization for a site of mine and this saves me buying a book about Unicode. :P
Chris Corbyn
Breakbeat Nuttzer
Posts: 13098
Joined: Wed Mar 24, 2004 7:57 am
Location: Melbourne, Australia

Post by Chris Corbyn »

That's brilliant! :) I sense this could go in a useful resources sticky somewhere but I'm not sure where :?:
RobertGonzalez
Site Administrator
Posts: 14293
Joined: Tue Sep 09, 2003 6:04 pm
Location: Fremont, CA, USA

Post by RobertGonzalez »

Me neither. I was going to move it, but I couldn't find an acceptable location. I was thinking Usability, but that seems to shrink the scope of what the tutorial teaches. Anyway, it is still a great little piece of information. Thanks again, Ninja goat man.
Chris Corbyn
Breakbeat Nuttzer
Posts: 13098
Joined: Wed Mar 24, 2004 7:57 am
Location: Melbourne, Australia

Post by Chris Corbyn »

LOL :lol: So why are we so confused about where it goes? Isn't that what Miscellaneous is for? :lol:

(and why am I laughing? it's not even funny. Beer)
RobertGonzalez
Site Administrator
Posts: 14293
Joined: Tue Sep 09, 2003 6:04 pm
Location: Fremont, CA, USA

Post by RobertGonzalez »

I feel you. I was up at 5:00 AM yesterday for work, came home and was up till 11:00 with the wife. Fell asleep on the couch, woke up at 1:45AM and have been coding ever since. It is about 7:20AM and I am frickin loopy.
Maugrim_The_Reaper
DevNet Master
Posts: 2704
Joined: Tue Nov 02, 2004 5:43 am
Location: Ireland

Post by Maugrim_The_Reaper »

I probably see it as hugely relevant because I literally have a "funny" character in my name, Pádraic. It's incredible to see modern-day applications, from email systems to online open source websites (cough...Sourceforge...cough), serve HTML which clearly states it's UTF-8 yet mysteriously replace á with a squiggly A character and a horizontal bar. This can sometimes be passed off as a PHP thing, if someone is fiddling with the name string using a PHP string function without mbstring or an equivalent.
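
For anyone who wants to see this kind of garbling happen, here is a minimal sketch in Python (not PHP, and not what any of the sites mentioned actually run). It assumes the cause is UTF-8 bytes being read back as Latin-1, which is one common way it arises; the exact second character you see depends on which wrong encoding is applied. The variable names are just for illustration.

```python
# Sketch: how "á" can turn into two junk characters.
# Assumption: the text was written as UTF-8 but read back as Latin-1.
name = "Pádraic"

utf8_bytes = name.encode("utf-8")        # "á" is stored as two bytes: 0xC3 0xA1
garbled = utf8_bytes.decode("latin-1")   # each byte is wrongly treated as one character

print(garbled)                           # the "á" shows up as two characters, starting with "Ã"
print(len(name), len(utf8_bytes))        # character count vs. byte count differ

# A byte-oriented cut (what a non-multibyte string function effectively does)
# can slice through the middle of "á" and leave an invalid sequence behind.
cut = utf8_bytes[:2]                     # "P" plus only the first byte of "á"
repaired = cut.decode("utf-8", errors="replace")
print(repaired)                          # "P" followed by the replacement character U+FFFD
```

The point of the last three lines is that counting or cutting by bytes is only safe for pure ASCII; anything else needs functions that understand the encoding.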

I saw the funny side a few months ago, when a blog post I wrote was syndicated through Devzone, PHP-Planet and PHPDeveloper. I think Devzone still requires using an HTML entity for á (such as &aacute;) to cope...

But there's worse: people serving static HTML with a UTF-8 charset declaration when the file isn't actually encoded in UTF-8 at all... This is the worst of the lot, since that's when you're most likely to get ???? marks in place of non-ASCII/English characters in a browser using UTF-8. Even more confusing, the web designer/developer may not notice this immediately, because saving a file containing only ASCII and áíúóé characters as UTF-8 is futile unless you originally created the file as UTF-8 and added a funny character BEFORE saving it. Any other way forces you to open the file, save it as UTF-8 (again), and then and ONLY then type characters outside standard ASCII.
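
A rough way to check for the mismatch described above, sketched in Python under the assumption that the file was really saved as Latin-1 while the page claims UTF-8. The helper name `is_valid_utf8` is made up for this example and is not from any library.

```python
# Sketch: detect bytes that are labelled UTF-8 but are not valid UTF-8.
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# Assumption: the editor silently saved the accented characters as Latin-1.
latin1_bytes = "áíúóé".encode("latin-1")   # one byte per character
print(is_valid_utf8(latin1_bytes))         # False: these bytes are not valid UTF-8

# A lenient UTF-8 reader substitutes the replacement character U+FFFD,
# which is roughly what shows up as "?" marks in a browser.
shown = latin1_bytes.decode("utf-8", errors="replace")
print(shown)

# Pure ASCII is byte-for-byte identical in both encodings, so a wrongly
# saved file looks fine until the first non-ASCII character is typed.
ascii_only = "Hello"
print(ascii_only.encode("latin-1") == ascii_only.encode("utf-8"))   # True
```

Running a check like this over the bytes of a static file before uploading it would catch the case where the charset declaration and the actual encoding disagree.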

Personally I think half the editors available online are unreliable without a bit of coaxing.