What Web developers must know about Unicode and character sets


Original address: http://www.joelonsoftware.com/articles/Unicode.html
Author: Joel Spolsky
Translation: http://local.joelonsoftware.com/wiki/Talk:Chinese_(Simplified)

What every programmer absolutely must know about character sets and Unicode (no excuses!)

Unicode and Character Sets

Have you ever wondered about that mysterious "Content-Type" tag in HTML? You know it is supposed to appear in the page, but you may have no idea what it actually does.

Have you ever received an email from your Bulgarian friends with "???? ?????? ??? ????" all over it?

I have been dismayed to find how many software developers still lack a clear understanding of character sets, encodings, and Unicode. A few years ago, while testing FogBugz, I suddenly wondered whether it could handle incoming email written in Japanese. Do people really write email in Japanese? I had no idea. The test results were bad. When I looked closely at the ActiveX control we used to parse messages in MIME (Multipurpose Internet Mail Extensions) format, I discovered that it was handling character sets completely wrong. We had to write new code to undo the wrong conversion the control had made and then redo the conversion correctly. The same thing happened when I looked into another commercial library: its character encoding implementation was just as broken. I tracked down the developer and pointed out the problem, but he said there was nothing he could do about it. Like many programmers, he simply hoped the flaw would be forgotten.

It will not be. I then found that PHP, a hugely popular web development tool, completely ignores the existence of multiple character encodings in its implementation (Translator's note: this article was written in 2003; PHP may have fixed the problem since then). It blindly uses 8 bits to represent every character, which makes developing a good international web application little more than a dream. At that point I had had enough.

So I have an announcement to make: if you are a programmer working in 2003 and you know nothing about characters, character sets, encodings, and Unicode, you had better not let me catch you. If I do, I will make you peel onions for six months in a submarine. I swear I will.

And one more thing:

This is not difficult at all.

In this article I will cover what every working programmer should know. The belief that "plain text = ASCII = one character is 8 bits" is not merely wrong, it is hopelessly wrong. If you still write programs that way, you are no better than a doctor who does not believe in germs. So please do not write another line of code until you have finished reading this article.
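To see why that equation fails, here is a minimal Python sketch. It is not part of the original article: the sample characters, the sample word, and the helper name utf8_length are chosen purely for illustration, and UTF-8 is used only as one familiar encoding. It shows that a single character can occupy more than one byte, and it shows one common way question marks can end up in a message: text is forced into an encoding that cannot represent it.

```python
# Illustrative sketch only; the helper name and sample strings are assumptions.

def utf8_length(ch: str) -> int:
    """Return how many bytes one character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))

# One Latin letter, one Cyrillic letter, one CJK character, one emoji.
for ch in ["A", "Я", "日", "😀"]:
    print(f"U+{ord(ch):04X} takes {utf8_length(ch)} byte(s) in UTF-8")

# Forcing Cyrillic text into ASCII loses information: every character that
# ASCII cannot represent is replaced by "?".
bulgarian = "Здравей"
lossy = bulgarian.encode("ascii", errors="replace").decode("ascii")
print(lossy)
```

Once the text has been reduced to question marks, the original letters cannot be recovered from it, which is why the encoding has to be handled correctly before the data is stored or sent.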

Before I begin, a warning: if you already understand internationalization, you may find this article too simple. I really am just trying to build the shortest possible bridge, so that anyone can understand what is going on and can write code that works properly with text in languages other than English. Note also that character handling is only a small part of internationalizing software, but nobody gets everything done in one sitting, so today we will look only at character sets.


