Discussion on how garbled characters are generated in mysql

Source: Internet
Author: User

 

Lab 1

1. First, start from the following settings:

mysql> show variables like 'character_set_%';
+--------------------------+---------------------------------------+
| Variable_name            | Value                                 |
+--------------------------+---------------------------------------+
| character_set_client     | latin1                                |
| character_set_connection | latin1                                |
| character_set_database   | latin1                                |
| character_set_filesystem | binary                                |
| character_set_results    | latin1                                |
| character_set_server     | latin1                                |
| character_set_system     | utf8                                  |
| character_sets_dir       | D:\Programs\mysql5045\share\charsets\ |
+--------------------------+---------------------------------------+

Create a table and add three records: 大, 阿, 爱

 

2. set character_set_results = utf8;

The records are now displayed as follows (in a cmd window using code page 936):

大 -> 麓贸

阿 -> 掳垄

爱 -> 掳庐

 

Analysis of the codes (U: Unicode code point, GBK: bytes in GBK):

大 U: 5927, GBK: B4F3
麓 U: 9E93, GBK: C2B4
贸 U: 8D38, GBK: C3B3

阿 U: 963F, GBK: B0A2
掳 U: 63B3, GBK: C2B0
垄 U: 5784, GBK: C2A2

爱 U: 7231, GBK: B0AE
掳 U: 63B3, GBK: C2B0
庐 U: 5E90, GBK: C2AE
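The byte values listed above are consistent with each stored byte being converted from latin1 to utf8, after which the console decodes the resulting bytes as code page 936. The Python sketch below imitates that reading. It assumes that Python's `latin-1` and `gbk` codecs are acceptable stand-ins for mysql's latin1 and for the cmd code page; the helper name `show_as_seen_in_lab1` is made up for illustration and is not part of mysql.

```python
def show_as_seen_in_lab1(ch: str) -> str:
    """Imitate Lab 1: GBK bytes stored as latin1, results sent as utf8,
    then displayed by a console that uses code page 936."""
    typed = ch.encode("gbk")             # bytes the 936 console sends, e.g. B4 F3
    as_latin1 = typed.decode("latin-1")  # one latin1 character per byte
    sent = as_latin1.encode("utf-8")     # character_set_results = utf8
    return sent.decode("gbk")            # the console reads those bytes as GBK


for ch in "大阿爱":
    garbled = show_as_seen_in_lab1(ch)
    print(ch, ch.encode("gbk").hex().upper(),
          "->", garbled, garbled.encode("gbk").hex().upper())
```

Under these assumptions each original character expands to two garbled characters whose GBK bytes start with C2 or C3, the typical lead bytes of two-byte utf8 sequences.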

 

3. set character_set_results = gb2312;

The output is garbled as well.

 

4. Conclusion:

Garbled characters are generated because single bytes are extended to multiple bytes. B0A2 is stored as two single bytes (it represents one Chinese character, but because latin1 is a single-byte character set, mysql regards B0 and A2 as unrelated characters). If character_set_results is then changed to the multi-byte utf8, mysql tries to extend each single byte into a corresponding two-byte sequence (by an algorithm unknown to me), hence the garbling.

Conversely, I assumed that converting from multi-byte to single-byte changes nothing, except that the character 'B0A2' originally represented by two bytes becomes two characters. ---- This statement has since been verified to be incorrect.

The content stored in the database (on disk and in memory) is not affected by the character_set_% variables; character set conversion only takes effect when data is submitted and when it is queried.

 

Lab 2

1.

create table y (id int, name char(4)) default charset gb2312;

 

2. If a Chinese character is inserted while the character_set_% variables are left at their default of latin1, garbled characters are displayed.

 

3. After set names gb2312, the character is displayed without problems. (In a cmd window using code page 936.)

 

4. Based on conclusion 2 of Lab 1 ("converting from multiple bytes to a single byte changes nothing"), I expected that after set names gb2312, changing character_set_results to latin1 would cause no problems. The result:

A single Chinese character is displayed as a question mark; two Chinese characters are displayed as garbled characters (presumably one question mark stands for one character). In other words, with character_set_results = latin1, mysql shrinks the stored multi-byte data on output, converting each two-byte character into one byte.

 

5. How can I stop mysql from shrinking the data? I tried character_set_results = binary; the result is displayed normally.
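Steps 4 and 5 can be imitated outside mysql. The sketch below assumes that a character missing from the result character set is replaced by '?' and that binary means the stored bytes are sent unconverted. The function `send_results` is hypothetical, and 大 is used as a stand-in because the character actually inserted in this lab is not given.

```python
from typing import Optional


def send_results(stored: bytes, column_charset: str,
                 results_charset: Optional[str]) -> bytes:
    """Imitate result conversion. results_charset=None stands for
    character_set_results = binary (no conversion at all)."""
    if results_charset is None:
        return stored
    text = stored.decode(column_charset)
    # Characters that do not exist in the target charset become '?'.
    return text.encode(results_charset, errors="replace")


stored = "大".encode("gb2312")  # what table y holds after set names gb2312

print(send_results(stored, "gb2312", "latin-1"))           # step 4
print(send_results(stored, "gb2312", None).decode("gbk"))  # step 5, on a 936 console
```

With binary results the two stored bytes reach the console untouched, so code page 936 can pair them up again. That matches the normal display reported in step 5.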

 

PS

Each application developed against mysql has its own, independent character_set_client setting.

A cmd window logged in to mysql is also an independent application with its own character_set_client variable.

Likewise, each separately opened cmd window has its own character_set_client variable.

 

Lab 3 07/16/2010

 

1. Create a table whose default character set is utf8 (in navicat, a utf8 interface, code page 65001) and insert the utf8-encoded Chinese character ''

2. Switch to mysql console (code page 936)

3. set names gbk; can the table created above then be displayed correctly? ---- Yes! Of course, changing only character_set_results to gbk is enough for it to display normally.
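The result of this lab can be imitated with Python codecs: the utf8 bytes held by the table are converted to gbk before they are sent, so a code page 936 console can display them. The character inserted in step 1 is missing from the text, so 大 is used here as an assumed stand-in.

```python
stored = "大".encode("utf-8")                # bytes held by the utf8 table
sent = stored.decode("utf-8").encode("gbk")  # character_set_results = gbk

# Three utf8 bytes become two GBK bytes, which the 936 console shows correctly.
print(stored.hex().upper(), "->", sent.hex().upper(), "->", sent.decode("gbk"))
```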

 

Lab 4

 

1. On the mysql console (code page 936), create a table: create table x3 (name char(32)) default charset gbk;

2. Default environment variables:

| character_set_client     | latin1 |
| character_set_connection | latin1 |
| character_set_database   | latin1 |
| character_set_filesystem | binary |
| character_set_results    | latin1 |
| character_set_server     | latin1 |
| character_set_system     | utf8   |   // I do not know whether this affects the process and analysis below.

 

With character_set_client, character_set_connection and character_set_results all set to latin1, insert data: insert x3 values ('day');

Result: ERROR 1406 (22001): Data too long for column 'name' at row 1

3. set character_set_client = gbk; then insert x3 values ('day'); the insertion succeeds, but the data has clearly been converted (character_set_connection = latin1) and is already lossy.

4. No matter whether character_set_results is set to gbk, the result cannot be displayed correctly.

5. set names gbk; the insertion is normal. At this point the table with the utf8 character set (Lab 3) shows no problem either, and a join query across the two tables is OK.

6. Of course, after set names utf8, the output is OK on a utf8 software interface (verified in navicat).

7. After set names binary, x3 is displayed correctly on the code page 936 console; however, the table created in Lab 3 cannot be displayed normally.
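The difference between steps 3 and 5 can be sketched as a chain of conversions: from character_set_client to character_set_connection, and then to the column's character set. The sketch assumes that a character which cannot be converted becomes '?'. The helper `store` is hypothetical, and 大 stands in for the inserted character, which the text above does not preserve.

```python
def store(typed: bytes, client: str, connection: str, column: str) -> bytes:
    """Imitate client -> connection -> column conversion.
    Characters that cannot be converted are replaced by '?'."""
    text = typed.decode(client)
    in_connection = text.encode(connection, errors="replace")
    return in_connection.decode(connection).encode(column, errors="replace")


typed = "大".encode("gbk")  # bytes sent by the code page 936 console

# Step 3: only character_set_client is gbk, the connection is still latin1.
step3 = store(typed, client="gbk", connection="latin-1", column="gbk")
# Step 5: set names gbk, so client and connection are both gbk.
step5 = store(typed, client="gbk", connection="gbk", column="gbk")

print(step3, step5.hex().upper())
```

In this model the latin1 connection in step 3 is where the information is lost, which is why no later change to character_set_results can bring the character back (step 4).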

--------

Analysis of step 2: Data too long for column 'name' at row 1

My char column is long enough and the inserted data is short enough, so the data is not actually too long. In other words, this message is misleading.

I know that if the default character set of table x3 is latin1, the insertion works (that is how I have always done it). The reason is that, although the mysql console at the input end uses code page 936, the three main character_set_% variables are latin1, so mysql treats the Chinese character in insert x3 values ('day') as two characters (if it were entered from a utf8 interface, it would presumably be treated as three characters). It is stored as two characters and returned as two characters, but code page 936 naturally combines the two bytes and displays them as one Chinese character (a common phenomenon in early DOS environments).

What happens when the default character set is gbk? I don't know .....
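The latin1 pass-through described in this analysis can be sketched as follows. Python's `latin-1` codec is assumed to be a stand-in for mysql's latin1, and 大 is an assumed stand-in for the inserted character.

```python
def count_latin1_chars(typed: bytes) -> int:
    """Number of characters mysql would see if every byte is one latin1 character."""
    return len(typed.decode("latin-1"))


from_936 = "大".encode("gbk")     # typed on the code page 936 console
from_utf8 = "大".encode("utf-8")  # typed on a utf8 interface

print(count_latin1_chars(from_936), count_latin1_chars(from_utf8))

# Stored as latin1 and returned as latin1, the bytes are unchanged,
# so the 936 console pairs them up into one Chinese character again.
round_trip = from_936.decode("latin-1").encode("latin-1")
print(round_trip.decode("gbk"))
```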

 

Lab 5

A terrible problem occurs on the mysql console (code page 936).

The environment variables are as shown in step 1 of Lab 1.

 

mysql> set names latin1;
Query OK, 0 rows affected (0.00 sec)

mysql> create table x4 (
    -> name char(32) primary key);
Query OK, 0 rows affected (0.09 sec)

mysql> drop table x4;
Query OK, 0 rows affected (0.06 sec)

mysql> create table x4 (
    -> name char(32) primary key) default charset utf8;
Query OK, 0 rows affected (0.10 sec)

mysql> insert x4 values ('na');
Query OK, 1 row affected (0.04 sec)

mysql> create table x5 (
    -> name char(32) primary key) default charset gbk;
Query OK, 0 rows affected (0.09 sec)

mysql> insert x5 values ('na');
ERROR 1406 (22001): Data too long for column 'name' at row 1

mysql>

 

Conclusion: I now have an answer for the 3rd point analyzed in Lab 4: character_set_system is utf8 ~~
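One possible reading of the transcript above is offered here as an assumption; the experiments do not verify it. With set names latin1, each input byte is treated as a latin1 character that must then be converted into the table's character set. utf8 can represent every latin1 character, whereas gbk cannot, which would explain why the same insert succeeds for x4 and fails for x5. The sketch checks this with Python codecs; `can_store` is a hypothetical helper and 大 is a stand-in for the inserted character.

```python
def can_store(text: str, column_charset: str) -> bool:
    """True if every character of text exists in column_charset."""
    try:
        text.encode(column_charset)
        return True
    except UnicodeEncodeError:
        return False


# Two GBK bytes read as latin1 give two characters: U+00B4 and U+00F3.
as_latin1 = "大".encode("gbk").decode("latin-1")

print(can_store(as_latin1, "utf-8"), can_store(as_latin1, "gbk"))
```

If this reading holds, the "Data too long" message reports a failed conversion rather than an actual length problem, in line with the earlier remark that the message is misleading.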

 

From dubiousway's column
