A bug in the MySQL manual's description of changing a column's character encoding

For historical reasons, the project uses GBK encoding. Characters that GBK cannot encode end up garbled, which is a serious problem, so we plan to change the columns' character encoding to utf8mb4.

I looked this up in the MySQL manual, which covers converting a column from one character set to another (here, GBK to utf8).

As I read it, for CHAR, VARCHAR, TEXT, and similar column types whose content is correctly encoded (GBK), that is, when the actual encoding of the content matches the character set in the column definition, you can directly use a statement like the following:

ALTER TABLE t MODIFY COLUMN col_name VARCHAR(60) CHARACTER SET utf8mb4;
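To make the byte-level effect of such a direct conversion concrete, here is a minimal Python sketch. It is only an analogy that runs outside MySQL: Python's gbk and utf-8 codecs are assumed stand-ins for the column's gbk and utf8mb4 character sets, and the sample text is made up.

```python
# Stand-ins (assumptions): Python's "gbk" codec for the column's declared
# character set, "utf-8" for MySQL's utf8mb4.
text = "中文"

# What a gbk column physically holds when the content really is GBK.
stored = text.encode("gbk")

# A direct character set change transcodes: read the bytes using the
# declared charset, then write the same characters in the target charset.
converted = stored.decode("gbk").encode("utf-8")

# The bytes change (2 bytes per character in GBK, 3 in UTF-8 for these
# characters), but the characters themselves are preserved.
print(len(stored), len(converted))
print(converted.decode("utf-8") == text)
```

The conversion is safe here only because the stored bytes really are GBK, which is exactly the precondition stated above.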

If the content of a column is not in the encoding specified in the column definition, the manual says to first convert the column to binary and then to the desired character set, utf8mb4.

However, actual testing shows that this wording is wrong: follow it and the result is garbled 100% of the time! The phrase "with the desired character set" should read "with the right character set, then to the desired character set".

That is, if the content of a column is not in the encoding specified in the column definition, you must first convert the column to binary, then to the character set in which the content is actually encoded (the right character set), and only then to utf8mb4 (the desired character set). This does not produce garbled characters.
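The difference between the two routes can be sketched in Python. This is an analogy under stated assumptions, not MySQL's actual behavior: the content is assumed to be Big5 bytes sitting in a column declared as gbk, Python codecs stand in for MySQL character sets, and reinterpret and transcode are hypothetical helper names.

```python
original = "中文"
# Assumed scenario: the bytes on disk are really Big5, whatever the
# column definition claims.
stored = original.encode("big5")

def reinterpret(raw: bytes, charset: str) -> str:
    """Mimic binary -> text column: keep the bytes, only relabel the charset."""
    return raw.decode(charset, errors="replace")

def transcode(text: str, charset: str) -> bytes:
    """Mimic text -> text column: keep the characters, rewrite the bytes."""
    return text.encode(charset)

# Route as worded in the manual: binary -> desired character set.
# The Big5 bytes are simply relabeled as UTF-8, so the text is damaged.
wrong = reinterpret(stored, "utf-8")

# Corrected route: binary -> right character set -> desired character set.
right = transcode(reinterpret(stored, "big5"), "utf-8").decode("utf-8")

print(wrong == original)   # False
print(right == original)   # True
```

The binary step only strips the wrong label from the bytes. The relabeling has to name the encoding the bytes are really in before any real transcoding to utf8mb4 can happen.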

To sum up:

1) If you can be sure that the content of a gbk column is actually stored in GBK, converting it to utf8mb4 is easy:

ALTER TABLE t MODIFY COLUMN col VARCHAR(60) CHARACTER SET utf8mb4;

2) If you cannot be sure that the content of your gbk column is really stored in GBK, first convert the column to a binary type, then to the character set in which the content is actually encoded, and finally to utf8mb4. In the second statement below, actual_charset is a placeholder for that character set. The binary step uses BLOB, because a bare BINARY means BINARY(1) and cannot hold the data:

ALTER TABLE t MODIFY COLUMN col BLOB;

ALTER TABLE t MODIFY COLUMN col VARCHAR(60) CHARACTER SET actual_charset;

ALTER TABLE t MODIFY COLUMN col VARCHAR(60) CHARACTER SET utf8mb4;

3) All of this also presupposes converting from a subset to a superset, which is what keeps characters from being garbled. For example, every character that GBK can encode can also be encoded in utf8, so converting GBK to utf8 is safe.

If you convert utf8 to GBK instead, the characters that utf8 can encode but GBK cannot will be garbled and their content lost.
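The subset and superset point in 3) can be checked with a short Python sketch. Again this is only an analogy: Python's gbk and utf-8 codecs are assumed stand-ins for the MySQL character sets, and the emoji is just one example of a character with no GBK encoding.

```python
gbk_text = "中文"
utf8_only = "中文😀"   # the emoji has no GBK encoding

# Subset -> superset: every GBK character also exists in UTF-8,
# so nothing is lost on the way up.
up = gbk_text.encode("gbk").decode("gbk").encode("utf-8").decode("utf-8")

# Superset -> subset: a character GBK cannot represent is replaced
# (Python substitutes "?"), and the original cannot be recovered.
down = utf8_only.encode("gbk", errors="replace").decode("gbk")

print(up == gbk_text)      # True
print(down == utf8_only)   # False
```

Python substitutes "?" for the character it cannot encode. The point is only that, once converted down, the original character is gone for good.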

 
