MySQL latin1 supports Chinese characters

Source: Internet
Author: User

Beginners are often confused: does MySQL's default character set, latin1, support Chinese?

A preliminary analysis says yes, Chinese can indeed be stored. (This is a preliminary conclusion based only on a preliminary analysis.)

1. First, a look at latin1 (based on Baidu Encyclopedia)

latin1 is an alias for ISO-8859-1; some environments write it as Latin-1.

ISO-8859-1 is a single-byte encoding that is backward compatible with ASCII. Its range is 0x00-0xFF: 0x00-0x7F is identical to ASCII, 0x80-0x9F are control characters, and 0xA0-0xFF are printable characters.

In addition to the ASCII characters, ISO-8859-1 contains the extra letters and symbols needed for Western European languages. Greek, Thai, Arabic, and Hebrew are covered by other parts of the ISO 8859 family, not by ISO-8859-1 itself. The euro sign appeared later and is not included in ISO-8859-1.

Because ISO-8859-1 assigns a character to every one of the 256 values a single byte can hold, a system that stores and transmits ISO-8859-1 never has to discard a byte. In other words, a byte stream in any other encoding can be treated as ISO-8859-1 without loss. This is a very important property. latin1 was the default character set of MySQL through version 5.7 (MySQL 8.0 changed the default to utf8mb4). ASCII is a 7-bit code and ISO-8859-1 is an 8-bit code.
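The lossless property described above can be checked with a short Python sketch. It uses Python's built-in iso-8859-1 codec rather than MySQL itself, and the sample string is only an illustration.

```python
# Every byte value 0x00-0xFF maps to exactly one ISO-8859-1 character,
# so decoding and re-encoding arbitrary bytes loses nothing.
all_bytes = bytes(range(256))
as_text = all_bytes.decode("iso-8859-1")
round_tripped = as_text.encode("iso-8859-1")

# A GBK-encoded Chinese string is just another byte stream.
gbk_bytes = "中文".encode("gbk")
viewed_as_latin1 = gbk_bytes.decode("iso-8859-1")  # meaningless Latin-1 characters
restored = viewed_as_latin1.encode("iso-8859-1").decode("gbk")

print(round_tripped == all_bytes)  # the 256 byte values survive unchanged
print(restored)                    # the original Chinese text comes back
```

The intermediate value `viewed_as_latin1` is not readable text; it only carries the bytes through unchanged.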

2. Think about character sets.

So there is no need to worry too much. If a table's character set is latin1, Chinese can still be stored in it by default:

· latin1 covers every single-byte value, so a byte stream in any other encoding can be viewed as latin1.

· Writing a GBK-encoded string to a latin1 table causes no problems. The byte stream is saved intact.

· Reading the string back from the table causes no problems either. The byte stream returned is exactly the one that was written.

If the data read back is printed to a terminal, the terminal interprets the bytes according to its locale. If the locale is GBK, a GBK-encoded Chinese string written earlier is displayed correctly.
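A minimal sketch of this terminal behaviour, assuming Python's codecs as a stand-in for the terminal's locale. The helper `render` is hypothetical and only models "interpret these bytes in that encoding".

```python
# Bytes as they come back from the latin1 table (originally written as GBK).
stored = "中文".encode("gbk")

def render(data: bytes, locale_encoding: str) -> str:
    """Hypothetical model of a terminal: decode bytes in the locale's encoding.

    Bytes that are invalid in that encoding become U+FFFD, roughly what a
    terminal shows as a replacement glyph.
    """
    return data.decode(locale_encoding, errors="replace")

on_gbk_terminal = render(stored, "gbk")     # locale matches the written encoding
on_utf8_terminal = render(stored, "utf-8")  # locale does not match

print(on_gbk_terminal)
print(on_utf8_terminal)
```

The same bytes are readable in one case and garbled in the other; nothing about the stored data has changed.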

If the data read back is saved to a file, the file's encoding is whatever encoding the bytes had when they were written. For example, if GBK was written, the saved file is GBK. However, if encodings are mixed (UTF-8 + GBK), an editor cannot tell what it is looking at and may display garbled characters.

Note: Most plain text files have no header that declares their encoding. An editor has to guess the encoding and character set from the byte stream itself.
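The mixed-encoding problem can be illustrated with a short sketch. The sample strings are arbitrary, and `try_decode` is a hypothetical helper, not part of any library.

```python
from typing import Optional

gbk_part = "数据库".encode("gbk")
utf8_part = "字符集".encode("utf-8")
mixed = gbk_part + utf8_part  # what ends up in one file when encodings are mixed

def try_decode(data: bytes, encoding: str) -> Optional[str]:
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None

# No single encoding recovers the whole text.
as_utf8 = try_decode(mixed, "utf-8")
as_gbk_lossy = mixed.decode("gbk", errors="replace")

# Only decoding each part with the encoding it was written in works.
recovered = gbk_part.decode("gbk") + utf8_part.decode("utf-8")
print(recovered)
```

An editor has no record of where one encoding ends and the other begins, which is why it cannot do what the last line does.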

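3. In summary, if latin1 is used both when the database is created and when it is accessed, so that the server never converts the bytes, the database can store not only Chinese but text in any encoding. The server then sees only bytes, and it is up to the application to remember which encoding was written.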

Lessons learned from handling Chinese encodings in several databases:

1. From a maintenance perspective, although latin1 works, it is still better to switch to utf8 or one of the GB encodings where possible.

2. When garbled characters appear, first inspect the current settings:

SHOW VARIABLES LIKE 'character%';
SHOW VARIABLES LIKE 'collation_%';

A. Ensure that the data stored in the database matches the database encoding, that is, the encoding of the data matches character_set_database;
B. Ensure that the character sets used for communication match the database's, that is, character_set_client and character_set_connection match character_set_database;
C. Ensure that the results returned by SELECT are encoded the way the program expects, that is, character_set_results matches the program's encoding;
D. Ensure that the program's encoding matches that of the browser or terminal.
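Checks B and C above can be sketched as a small Python function. It assumes the output of SHOW VARIABLES LIKE 'character%' has already been loaded into a dict; the function name `find_charset_mismatches` and the sample values are hypothetical.

```python
from typing import Dict, List

def find_charset_mismatches(variables: Dict[str, str], program_encoding: str) -> List[str]:
    """Report settings that violate rules B and C (hypothetical helper).

    `variables` maps MySQL variable names to values, as shown by
    SHOW VARIABLES LIKE 'character%'.
    """
    problems = []
    database = variables["character_set_database"]
    # Rule B: client and connection character sets should match the database.
    for name in ("character_set_client", "character_set_connection"):
        if variables[name] != database:
            problems.append(f"{name}={variables[name]} but character_set_database={database}")
    # Rule C: results should arrive in the encoding the program expects.
    if variables["character_set_results"] != program_encoding:
        problems.append(
            f"character_set_results={variables['character_set_results']} "
            f"but the program expects {program_encoding}"
        )
    return problems

# Illustrative values only, not taken from a real server.
sample = {
    "character_set_database": "utf8",
    "character_set_client": "latin1",
    "character_set_connection": "latin1",
    "character_set_results": "utf8",
}
mismatches = find_charset_mismatches(sample, program_encoding="utf8")
for line in mismatches:
    print(line)
```

Rules A and D concern the data itself and the display side, so they cannot be verified from server variables alone.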

3. To keep things simple, set all the character sets to the same value, write that value into the MySQL configuration file, and have the client set the character set on every connection (SET NAMES 'xxx'). This makes sure the byte stream's encoding is handled correctly both when writing and when reading data.
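What goes wrong when the declared character set does not match the bytes actually sent can be simulated in Python. This is only a sketch: Python's iso-8859-1 codec stands in for the server's latin1 handling, which is an assumption, and a real server may treat some byte values differently.

```python
original = "中文"

# The program sends UTF-8 bytes but the connection is declared as latin1.
sent_bytes = original.encode("utf-8")

# The server trusts the declaration, reads each byte as one latin1 character,
# and converts those characters to UTF-8 for a utf8 column.
as_server_sees_it = sent_bytes.decode("iso-8859-1")
stored_bytes = as_server_sees_it.encode("utf-8")

# Reading the column back as UTF-8 now yields garbled text.
read_back = stored_bytes.decode("utf-8")

# Undoing the extra conversion recovers the original text.
repaired = read_back.encode("iso-8859-1").decode("utf-8")

print(len(sent_bytes), len(stored_bytes))
print(repaired)
```

The stored value is longer than what was sent because every non-ASCII byte was converted a second time. Declaring the connection character set that matches the bytes being sent avoids this.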
