Mysql DBA Advanced Operations Learning Note-MYSQL Database character Set knowledge

Source: Internet
Author: User
Tags dba i18n mysql client mysql manual create database

1.2 MySQL database character set knowledge

1.2.1 MySQL database Character Set introduction

Simply put, a set of text symbols and their encoding and comparison rules.

The MySQL database character set includes two concepts: Character set (CHARACTER) and proofing Rules (COLLATION). Where the character set is used to define how MySQL data strings are stored, proofing rules define how strings are compared. In the preceding statement, the CHARACTER set Latin1 is the database character set and collate latin1_wedish_ci for the proofreading character set, the details of the character set refer to the MySQL manual, and the 10th character set chapter.

1.2.2 MySQL Database common Character Set introduction

The commonly used character set when using MySQL is

1.2.3 MySQL How to choose the right character set

A. If you are working with a wide variety of text, publish to different countries and regions, you should choose the Unicode character set. For MySQL is UTF-8 (three bytes per kanji), if the application needs to deal with English, a small number of Chinese characters UTF-8 better.

B. If only support Chinese, and the volume of data is large, performance requirements are also high, optional GBK (fixed length, each Chinese character account for double-byte, English also account for double-byte), if a large number of operations, the comparison sequence and other fixed-length character set faster, high performance.

C. Dealing with mobile Internet services may require the use of the UTF8MB4 character set.

1.2.4 View the character sets supported by the current MySQL system

[[email protected] ~]# mysql -uroot -p123456 -e "SHOW CHARACTER SET"

The most commonly used are four kinds:

[[email protected] ~]# mysql -uroot -p123456 -e "SHOW CHARACTER SET;"|egrep "gbk|utf8|latin1"|awk ‘ {print $0}‘latin1    cp1252 West European    latin1_swedish_ci    1gbk    GBK Simplified Chinese    gbk_chinese_ci    2utf8    UTF-8 Unicode    utf8_general_ci    3utf8mb4    UTF-8 Unicode    utf8mb4_general_ci    4

View MySQL's current character set settings

mysql> show variables like ‘character_set%‘;+--------------------------+----------------------------------+| Variable_name| Value|+--------------------------+----------------------------------+| character_set_client | utf8 || character_set_connection | utf8 || character_set_database   | latin1   || character_set_filesystem | binary   || character_set_results| utf8 || character_set_server | latin1   || character_set_system | utf8 || character_sets_dir   | /usr/local/mysql/share/charsets/ |+--------------------------+----------------------------------+

Tip: By default, the character set of Character_set_client,character_set_connection,character_set_results is consistent with the character set of the system and is modified at the same time. That is:

[[email protected] ~]# cat /etc/sysconfig/i18n LANG="zh_CN.UTF-8"[[email protected] ~]# echo $LANGzh_CN.UTF-8

1.3 MySQL Database What is the default set of characters?

A. First look at MySQL set by default

mysql> show variables like ‘character_set%‘;+--------------------------+----------------------------------+| Variable_name| Value|+--------------------------+----------------------------------+| character_set_client | gb2312   || character_set_connection | gb2312   || character_set_database   | latin1   || character_set_filesystem | binary   || character_set_results| gb2312   || character_set_server | latin1   || character_set_system | utf8 || character_sets_dir   | /usr/local/mysql/share/charsets/ |+--------------------------+----------------------------------+

The meanings of the different character set parameters are as follows:

Variable_name  | Value  | character_set_client | latin1  客户端字符集| character_set_connection | latin1  连接字符集| character_set_database   | latin1数据库字符集,配置文件指定或建库建表指定| character_set_results| latin1  返回结果字符集| character_set_server | latin1服务器字符集,配置文件指定或建库建表指定

See changes to the character set in MySQL after changing the Linux system character Set variable

[[email protected] ~]# echo $LANGzh_CN.UTF-8[[email protected] ~]# mysql -uroot -p123456 -e "show variables like ‘character_set%‘;"+--------------------------+----------------------------------+| Variable_name| Value|+--------------------------+----------------------------------+| character_set_client | utf8 || character_set_connection | utf8 || character_set_database   | latin1   || character_set_filesystem | binary   || character_set_results| utf8 || character_set_server | latin1   || character_set_system | utf8 || character_sets_dir   | /usr/local/mysql/share/charsets/ |+--------------------------+----------------------------------+

We found that the character_set_connection,character_set_client,character_set_server of the character set and the system were also changed to UTF8.

1.4 Performing set names Latin1 what exactly did you do?

Whether the character set of the Linux system is gb2312 or UTF8, inserting data is garbled by default.

A. Viewing data at this time is garbled

mysql> use cuizhongDatabase changedmysql> select * from student-> ;+----+---------------------+| id | name|+----+---------------------+|  1 | zhangsan||  2 | lisi||  3 | wanger  ||  4 | xiaozhang   ||  5 | xiaowang||  6 | ??? ||  7 | ?°?o¢  ||  8 | ??è?¤èˉ?   ||  9 | ?????  |+----+---------------------+9 rows in set (0.10 sec)

B. After executing the set corresponding character set operation, it solves the garbled problem.

(1) Check the character set of the library and table first

mysql> show create database cuizhong\G*************************** 1. row ***************************   Database: cuizhongCreate Database: CREATE DATABASE `cuizhong` /*!40100 DEFAULT CHARACTER SET latin1 */1 row in set (0.00 sec)mysql> show create table student\G*************************** 1. row ***************************   Table: studentCreate Table: CREATE TABLE `student` (  `id` int(4) NOT NULL AUTO_INCREMENT,  `name` char(20) NOT NULL,  PRIMARY KEY (`id`)) ENGINE=InnoDB AUTO_INCREMENT=10 DEFAULT CHARSET=latin11 row in set (0.00 sec)

(2) We look at the library and the table's character set are Latin1, so execute set names latin1 ensure that the character set is not garbled.

mysql> set names latin1;Query OK, 0 rows affected (0.00 sec)mysql> select * from student;+----+-----------+| id | name  |+----+-----------+|  1 | zhangsan  ||  2 | lisi  ||  3 | wanger||  4 | xiaozhang ||  5 | xiaowang  ||  6 | ???   ||  7 | 小红  ||  8 | 不认识||  9 | 李四  |+----+-----------+

(3) The result of performing the set character set operation changes the parameters of the following character set character_set_client,character_set_connection,character_set_results three characters.

mysql> show variables like ‘character_set%‘;+--------------------------+----------------------------------+| Variable_name| Value|+--------------------------+----------------------------------+| character_set_client | latin1   || character_set_connection | latin1   || character_set_database   | latin1   || character_set_filesystem | binary   || character_set_results| latin1   || character_set_server | latin1   || character_set_system | utf8 || character_sets_dir   | /usr/local/mysql/share/charsets/ |+--------------------------+----------------------------------+

1.5 mysql command parameter what is-default-character-set=latin1 doing?

(1) Check the MySQL character set first

[[email protected] ~]# mysql -uroot -p123456 -e "show variables like ‘character_set%‘;"+--------------------------+----------------------------------+| Variable_name| Value|+--------------------------+----------------------------------+| character_set_client | utf8 || character_set_connection | utf8 || character_set_database   | latin1   || character_set_filesystem | binary   || character_set_results| utf8 || character_set_server | latin1   || character_set_system | utf8 || character_sets_dir   | /usr/local/mysql/share/charsets/ |+--------------------------+----------------------------------+

(2) login MySQL with-default-character-set=latin1 parameter

[[email protected]~]# mysql -uroot -p123456 --default-character-set=latin1Welcome to the MySQL monitor.  

(3) View the MySQL character set again now

mysql> show variables like ‘character_set%‘;+--------------------------+----------------------------------+| Variable_name| Value|+--------------------------+----------------------------------+| character_set_client | latin1   || character_set_connection | latin1   || character_set_database   | latin1   || character_set_filesystem | binary   || character_set_results| latin1   || character_set_server | latin1   || character_set_system | utf8 || character_sets_dir   | /usr/local/mysql/share/charsets/ |+--------------------------+----------------------------------+

(4) with parameter login is also temporarily modified without parameters login and change back.

[[email protected]~]# mysql-uroot-p123456--default-character-set=latin1-e "show variables like ' character_set% ‘;" +--------------------------+----------------------------------+| variable_name| value|+--------------------------+----------------------------------+| character_set_client | Latin1 | | character_set_connection | Latin1 | | Character_set_database | Latin1 | | Character_set_filesystem | binary | | character_set_results| Latin1 | | Character_set_server | Latin1 | | Character_set_system | UTF8 | | Character_sets_dir | /usr/local/mysql/share/charsets/|+--------------------------+----------------------------------+[[email  Protected] ~]# mysql-uroot-p123456-e "show variables like ' character_set% ';" +--------------------------+----------------------------------+| variable_name| value|+--------------------------+----------------------------------+| character_set_client | UTF8 | | character_set_connection | UTF8 | | Character_set_database | Latin1 | | CharacTer_set_filesystem | binary | | character_set_results| UTF8 | | Character_set_server | Latin1 | | Character_set_system | UTF8 | | Character_sets_dir | /usr/local/mysql/share/charsets/|+--------------------------+----------------------------------+

1.6 Ensure MySQL database insert data is not garbled solution

1.6.1 Unified MySQL Database client and service-side character set

(1) The following character sets (client and server) of the MySQL database are unified into one character set to ensure that the inserted Chinese database can output normally. Of course, the character set of a Linux system should be unified with the database character set as much as possible.

(2) Show variables like ' character_set% '; command output results are as follows

Variable_name| Value +--------------------------+--------------------------------+①character_set_client | latin1  客户端字符集②character_set_connection | latin1  连接字符集③character_set_database   | latin1   数据库字符集④character_set_results| latin1   返回结果字符集⑤character_set_server | latin1   服务器字符集,配置文件制定或建库建表指定

Among them, ①②④ three parameters by default with the Linux system character set, the manual login database to perform set names Latin1 and MySQL specified character set logon operation, is to change the client, connection, Results3 parameters of the character set are latin1, so as to solve the problem of inserting garbled, this operation can modify the MySQL client's character set in the MY.CNF configuration file, the configuration method is as follows:

[client]Default-character-set=latin1

Tip: No restart required

[[email protected] ~]# sed -n "18,22p" /etc/my.cnf [client]#password    = your_passwordport        = 3306socket        = /usr/local/mysql/tmp/mysql.sockdefault-character-set = latin1[[email protected] ~]# mysql -uroot -p123456 -e "show variables like ‘character_set%‘;"+--------------------------+----------------------------------+| Variable_name| Value|+--------------------------+----------------------------------+| character_set_client | latin1   || character_set_connection | latin1   || character_set_database   | latin1   || character_set_filesystem | binary   || character_set_results| latin1   || character_set_server | latin1   || character_set_system | utf8 || character_sets_dir   | /usr/local/mysql/share/charsets/ |+--------------------------+----------------------------------+

(3) After modifying the client character from the Set query table data will not garbled

[[email protected] ~]# mysql -uroot -p123456 -e "select * from cuizhong.student;"+----+-----------+| id | name  |+----+-----------+|  1 | zhangsan  ||  2 | lisi  ||  3 | wanger||  4 | xiaozhang ||  5 | xiaowang  ||  6 | ???   ||  7 | 小红||  8 | 不认识 ||  9 | 李四|+----+-----------+

1.6.2 changing the MySQL service-side character set

(1) Modify the MY.CNF parameters as required below

[mysqld]Default-character-set = latin1适合5.1及以前版本Default-character-server=latin1 适合5.5

(2) View the current character set before modifying

[[email protected] ~]# mysql -uroot -p123456 -e "show variables like ‘character_set%‘;"+--------------------------+----------------------------------+| Variable_name| Value|+--------------------------+----------------------------------+| character_set_client | utf8 || character_set_connection | utf8 || character_set_database   | latin1   || character_set_filesystem | binary   || character_set_results| utf8 || character_set_server | latin1   || character_set_system | utf8 || character_sets_dir   | /usr/local/mysql/share/charsets/ |+--------------------------+----------------------------------+

(3) View the modified parameters

[[email protected] ~]# sed -n "26,27p" /etc/my.cnf [mysqld]character-set-server = utf8(4)     重启mysql服务(生产环境是不允许重启的)[[email protected] ~]# /etc/init.d/mysqld restartShutting down MySQL.. SUCCESS! Starting MySQL.. SUCCESS!

(4) Viewing the changed character set

[[email protected] ~]# mysql -uroot -p123456 -e "show variables like ‘character_set%‘;"+--------------------------+----------------------------------+| Variable_name| Value|+--------------------------+----------------------------------+| character_set_client | utf8 || character_set_connection | utf8 || character_set_database   | utf8 || character_set_filesystem | binary   || character_set_results| utf8 || character_set_server | utf8 || character_set_system | utf8 || character_sets_dir   | /usr/local/mysql/share/charsets/ |+--------------------------+----------------------------------+

Tip: The above parameters set under [mysqld] Change the character set settings for the following 2 parameters

| Variable_name| Value|| character_set_database   | utf8 || character_set_server | utf8 |

This time we'll modify the system character set MySQL database character set is not changed

[[email protected] ~]# cat /etc/sysconfig/i18n LANG="zh_CN.GB2312"#LANG="zh_CN.UTF-8"[[email protected] ~]# source /etc/sysconfig/i18n [[email protected] ~]# mysql -uroot -p123456 -e "show variables like ‘character_set%‘;"+--------------------------+----------------------------------+| Variable_name| Value|+--------------------------+----------------------------------+| character_set_client | utf8 || character_set_connection | utf8 || character_set_database   | utf8 || character_set_filesystem | binary   || character_set_results| utf8 || character_set_server | utf8 || character_set_system | utf8 || character_sets_dir   | /usr/local/mysql/share/charsets/ |+--------------------------+----------------------------------+

1.6.3 Unified MySQL Database client and service-side character Set summary

Not garbled ideas: The proposed Chinese and English environment Choice UTF8, Linux system, client, server, library, table, program character set unity.

1.Linux system Character Set Unified UTF8

[[email protected] ~]# cat /etc/sysconfig/i18n LANG="zh_CN.UTF-8"

Tips for Linux customers to change the character set for example: Xshell

Example: SECURECRT

2.Mysql Database Client

Temporary:

Set names Latin1
Permanent:

Change the parameters of the MY.CNF client module to implement the set names Latin1 effect, and to take effect permanently.

3. Service-side

Changing the MY.CNF parameter

[mysqld]Default-character-set = latin1适合5.1及以前版本character-set-server = latin1适合5.5

4. Library table, program specifies the character set to build the library

Create database Cuizhong_utf8 DEFAULT CHARACTER SET UTF8 collate after proofing rules

We can show you. View supported proofing Rules

Mysql> Show character set;+----------+-----------------------------+---------------------+--------+| Charset | Description | Default Collation | MaxLen |+----------+-----------------------------+---------------------+--------+| Big5 | Big5 Traditional chinese|  Big5_chinese_ci | 2 | | Dec8 | DEC West European |  Dec8_swedish_ci | 1 | | cp850| DOS West European |  cp850_general_ci| 1 | | HP8 | HP West european|  Hp8_english_ci | 1 | | koi8r| Koi8-r relcom Russian |  koi8r_general_ci| 1 | | Latin1 | cp1252 West european|  Latin1_swedish_ci | 1 | | latin2 | ISO 8859-2 Central European |  Latin2_general_ci | 1 | | Swe7 | 7bit swedish|  Swe7_swedish_ci | 1 | | ascii| US ascii|  ascii_general_ci| 1 | | Ujis | EUC-JP Japanese |  ujis_japanese_ci| 3 | | Sjis | Shift-jis Japanese |  sjis_japanese_ci| 2 | | Hebrew | ISO 8859-8 Hebrew |  Hebrew_general_ci | 1 | | tis620 | TIS620 Thai |  Tis620_thai_ci | 1 | | euckr| EUC-KR Korean |  Euckr_korean_ci | 2 | | koi8u| Koi8-u Ukrainian|  koi8u_general_ci| 1 | | gb2312 | GB2312 Simplified Chinese |  Gb2312_chinese_ci | 2 | | greek| ISO 8859-7 greek|  greek_general_ci| 1 | | cp1250 | Windows Central european|  Cp1250_general_ci | 1 | | GBK | GBK Simplified Chinese |  Gbk_chinese_ci | 2 | | Latin5 | ISO 8859-9 Turkish |  Latin5_turkish_ci | 1 | | Armscii8 | ARMSCII-8 Armenian |  Armscii8_general_ci | 1 | | UTF8 | UTF-8 Unicode |  Utf8_general_ci | 3 | | UCS2 | UCS-2 Unicode |  Ucs2_general_ci | 2 | | cp866| DOS Russian |  cp866_general_ci| 1 | | KEYBCS2 | DOS Kamenicky Czech-slovak |  Keybcs2_general_ci | 1 | | macce| Mac Central european|  macce_general_ci| 1 | | Macroman | Mac West European |  Macroman_general_ci | 1 | | cp852| DOS Central european|  cp852_general_ci| 1 | | latin7 | ISO 8859-13 Baltic |  Latin7_general_ci | 1 | | UTF8MB4 | UTF-8 Unicode |  Utf8mb4_general_ci | 4 | | cp1251 | Windows cyrillic|  Cp1251_general_ci | 1 | | utf16| UTF-16 Unicode |  utf16_general_ci| 4 | | cp1256   | Windows Arabic |  Cp1256_general_ci | 1 | | cp1257 | Windows Baltic |  Cp1257_general_ci | 1 | | utf32| UTF-32 Unicode |  utf32_general_ci| 4 | | binary | Binary Pseudo CharSet |  binary | 1 | | Geostd8 | GEOSTD8 georgian|  Geostd8_general_ci | 1 | | cp932| SJIS for Windows Japanese |  Cp932_japanese_ci | 2 | | Eucjpms | Ujis for Windows Japanese |  Eucjpms_japanese_ci | 3 |+----------+-----------------------------+---------------------+--------+39 rows in Set (0.00 sec)

5. Character set of the development program

Simplified UTF8

Http://download.comsenz.com/Discuzx/3.2/Discuz_X3.2_SC_UTF8.zip

1.7 How to change the character set of the production MySQL database library table

Steps to modify the data character set

For an existing database to modify the character set cannot be directly through "ALTER DATABASE character set " or "ALTER TABLE TableName Character set " Neither of these commands updates the character set of the existing data. It only takes effect on the newly created table or data.
An adjustment to a character set that already has a record must export the data and re-import it after the character set has been modified before it can be completed.

Modify the database default encoding

Alter database [your db name] charset [your character setting]

The following simulates the process of modifying a database of the latin1 character set to the GBK character set.

(1) Export table structure

Mysqldump –uroot –p123456 –-default-character-set=latin1 –d dbname>alltable.sql –-default-character-set=gbk 表示以GBK字符集进行连接 –d只导表结构

(2) then edit Alltable.sql to change latin1 to GBK

Set names GBK

(3) Ensure that data is not updated to export all data

Mysqldmup –uroot –p123456 –-quick –-no-create-info –-extended-insert –-default-character-set=latin1 dbname>alltable.sql

Parameter description:

--quick: Used to dump large tables, forcing mysqldump to retrieve data from the server one row at a time instead of retrieving all rows and outputting the cache to memory.

--no-create-info: The Create TABLE statement is not created.

--extended-insert: Using a multiline insert syntax that includes several values lists, so that the file is smaller, and the IO also small import data is very fast.

--default-character-set=latin1 Export Data according to the original character set, so that all Chinese in the exported file is visible and will not be stored as garbled.

(4) Open Alltable.sql to modify the set names latin1 to set names GBK (or modify the service side and client of the system).

(5) Building a library

Create database dbname default charset gbk;

(6) Create table execution, Alltable.sql

Mysql –uroot –p123456 dbname<alltable.sql

(7) Import data

Mysql –uroot –p123456 dbname<alltable.sql

Summary: Latin1 changed into UTF8

    1. Build the database and build the statement export, SED batch modified to UTF8.

    2. Export all data.

    3. Modify the MySQL service side and the client code to UTF8.

    4. Delete the original library table and data.

    5. The statement that imports the new build table for the database.

    6. Import all data from MySQL.

Mysql DBA Advanced Operations Learning Note-MYSQL Database character Set knowledge

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.