Mysql Character Set and mysql Character Set

Source: Internet
Author: User

Mysql Character Set and mysql Character Set
The character set defines the character encoding method. ASCII, GBK, and UTF8 character sets are common. All character sets are compatible with the ASCII character set.

Character Set Is negative long Encoding Method
ASCII Yes Single-byte 7-bit encoding
GBK Yes Dual-byte encoding
UTF8 No (variable length) 1 ~ 4-byte encoding
UNICODE encoding is generally two bytes. It can encode 99% characters in use, improving efficiency and memory space. For MYSQL, the UNICODE Character Set is UTF8.

When MYSQL executes the explain command, the length of key_len has a great relationship with the character set.

(1) additional information of the index field: it can be divided into Variable Length and fixed length data types. When the index field is of a fixed-length data type, such as char, int, datetime, it must have a null flag, which occupies 1 byte. for variable-length data types, for example: in addition to null tags, varchar also requires length information, which occupies 2 bytes (so the extended data type requires three additional bytes). (Note: when the field is defined as non-empty, the null mark does not occupy bytes)

(2). At the same time, you also need to consider the character sets used in the table. Different character sets, gbk encoding is one character 2 bytes, utf8 encoding is one character 3 bytes;


MYSQL supports multiple character sets. You can view the character set show charater set supported by mysql (or use select * from information_schema.character_sets)

Mysql character sets include CHARACTER sets (CHARACTER) and COLLATION rules. CHARACTER sets are used to define how MYSQL stores strings, and COLLATION is used to define how strings are compared. Character sets have one-to-multiple relationship with verification rules. Each character set has at least one verification rule (default verification rule ).

You can use show collation like 'utf8' to view the character set verification rules.


Naming rules: Character Set Name + language name + suffix. Suffix _ ci indicates case insensitive, _ cs indicates case sensitive, and _ bin indicates that the comparison is based on character encoding values and is not related to language.
MYSQL supports character sets of multiple granularities, ranging from large to small: servers> databases> tables> fields. MYSQL Character Set settings:
When no replica set is set, latin1 is used as the server replica set by default. After character_set_server = utf8 is added under mysqld in the/etc/my. cnf file, restart the mysql service and run the command again.
The character set and collation rules of the database are specified when the database is created. You can also use alter database to modify the database after the database is created. However, note: If data already exists in the Database, because modifying the character set cannot store the existing data according to the new character set, you cannot modify the character set of the database to directly modify the data. Database character Set and verification rules: show variables like 'character _ set_database'
If the database does not specify a character set or collation, use the MYSQL service character set and the default Collation for this character set. You can also use create database character_set2 charset gbk collate gbk_chinese_ci to specify character sets and verification rules.
Character Set and proofreading rules of a table: the character set and proofreading rules of a table can be executed when a table is created or modified using alter table. Similarly, if data already exists in the table, modifying the character set will not affect the original records and will not be stored in accordance with the new character set. If no character set is set for a table, the table inherits the character set of the upper level.
You can use show create table.
MYSQL provides three different parameters for interaction between the client and the server, including character_set_client, character_set_connection, and character_set_results, which represent the client, character Set of the connection and returned results. Generally, these three character sets are the same to each other to ensure that the data is read and written correctly.
  1. Set names xxxx (need to be executed every time)
  2. Add default_character_set = xxxx to mysql segment in my. cnf (permanent change)

Copyright Disclaimer: This article is an original article by the blogger and cannot be reproduced without the permission of the blogger.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.