String type
MySQL strings fall into two broad categories:
1 Binary strings: a sequence of bytes. The bytes are not interpreted through any character set, so a binary string has neither a character set nor a collation (sorting rule).
2 Non-binary strings: a sequence of characters. The character set determines how the contents are interpreted, and the collation determines how characters compare and sort.
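The difference is easiest to see in a comparison. The following is a minimal sketch, assuming the connection's default collation is case-insensitive (for example latin1_swedish_ci or gbk_chinese_ci); the column aliases are only illustrative:

```sql
-- Non-binary strings are compared using the collation of their character set;
-- binary strings are compared byte by byte.
SELECT 'a' = 'A'        AS nonbinary_equal,  -- 1 under a case-insensitive collation
       BINARY 'a' = 'A' AS binary_equal;     -- 0: the bytes 0x61 and 0x41 differ
```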
Character sets and collations
A character set can have one or more collations (sorting methods), exactly one of which is its default. The following example illustrates this:
mysql> SHOW CHARACTER SET LIKE '%gbk%';
+---------+------------------------+-------------------+--------+
| Charset | Description            | Default collation | Maxlen |
+---------+------------------------+-------------------+--------+
| gbk     | GBK Simplified Chinese | gbk_chinese_ci    |      2 |
+---------+------------------------+-------------------+--------+
1 row in set (0.00 sec)
mysql> SHOW COLLATION LIKE '%gbk%';
+----------------+---------+----+---------+----------+---------+
| Collation      | Charset | Id | Default | Compiled | Sortlen |
+----------------+---------+----+---------+----------+---------+
| gbk_chinese_ci | gbk     | 28 | Yes     | Yes      |       1 |
| gbk_bin        | gbk     |    |         | Yes      |       1 |
+----------------+---------+----+---------+----------+---------+
2 rows in set (0.00 sec)
The example above shows that the character set gbk has two collations (gbk_chinese_ci and gbk_bin), and that its default collation is gbk_chinese_ci.
Collation names follow the pattern charset_language_suffix, where the typical suffixes mean:
1) _ci: case-insensitive comparison
2) _cs: case-sensitive comparison
3) _bin: binary comparison. Characters are compared by their encoded values and no human-language rules are applied, so the name of a _bin collation has no language part
Therefore, the name gbk_chinese_ci means that the character set is gbk, that characters are compared according to Chinese-language rules, and that the comparison is case-insensitive.
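The effect of the suffix can be checked by comparing the same two literals under each collation. This is a sketch that assumes the gbk character set is available on the server; the column aliases are illustrative:

```sql
-- The same two literals compared under each gbk collation.
SELECT _gbk 'abc' COLLATE gbk_chinese_ci = _gbk 'ABC' AS ci_equal,   -- 1: case is ignored
       _gbk 'abc' COLLATE gbk_bin        = _gbk 'ABC' AS bin_equal;  -- 0: character codes differ
```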
Common functions
Character set introducers
A character set introducer tells MySQL which character set to use when interpreting a string literal. The syntax is:
_charset 'str'
For example:
_utf8 'ABCD' means that the string literal 'ABCD' is interpreted in the utf8 character set
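One way to see what an introducer does is to ask MySQL for the character set of the resulting literal. A small sketch, assuming both utf8 and gbk are available on the server (newer servers may report utf8 under the name utf8mb3); the aliases are illustrative:

```sql
-- CHARSET() reports the character set of its string argument.
SELECT CHARSET(_utf8 'ABCD') AS with_utf8_introducer,
       CHARSET(_gbk 'ABCD')  AS with_gbk_introducer,
       CHARSET('ABCD')       AS without_introducer;  -- falls back to character_set_connection
```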
Character set conversion
The CONVERT() function converts a string to the specified character set. Its syntax is:
CONVERT(str USING charset)
For example, CONVERT('ABCD' USING utf8) converts the string 'ABCD' to the utf8 character set
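Conversion changes the bytes that represent a character, not the character itself. The sketch below uses a hexadecimal literal so that the result does not depend on the terminal encoding; it assumes the server supports both utf8 and gbk:

```sql
-- X'E4B8AD' is the UTF-8 encoding of the character U+4E2D; the _utf8 introducer
-- tells MySQL to read those three bytes as utf8.
SELECT HEX(_utf8 X'E4B8AD')                    AS utf8_bytes,  -- E4B8AD
       HEX(CONVERT(_utf8 X'E4B8AD' USING gbk)) AS gbk_bytes;   -- D6D0
```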
LENGTH(): returns the length of a string in bytes
CHAR_LENGTH(): returns the length of a string in characters
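The two functions only disagree for multi-byte characters. As an illustration, assuming utf8 and gbk are both available, the single character U+4E2D (written here as the hexadecimal literal X'E4B8AD' so the example is independent of the terminal encoding) has a different byte length in each character set:

```sql
SELECT LENGTH(_utf8 X'E4B8AD')                         AS utf8_byte_len,  -- 3
       CHAR_LENGTH(_utf8 X'E4B8AD')                    AS utf8_char_len,  -- 1
       LENGTH(CONVERT(_utf8 X'E4B8AD' USING gbk))      AS gbk_byte_len,   -- 2
       CHAR_LENGTH(CONVERT(_utf8 X'E4B8AD' USING gbk)) AS gbk_char_len;   -- 1
```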
System variables associated with the character set
You can view the system variables related to the character set by using the following statement:
mysql> SHOW VARIABLES LIKE 'character\_set\_%';
+--------------------------+--------+
| Variable_name            | Value  |
+--------------------------+--------+
| character_set_client     | latin1 |
| character_set_connection | latin1 |
| character_set_database   | gbk    |
| character_set_filesystem | binary |
| character_set_results    | latin1 |
| character_set_server     | gbk    |
| character_set_system     | utf8   |
+--------------------------+--------+
7 rows in set (0.01 sec)
mysql> SHOW VARIABLES LIKE 'collation\_%';
+----------------------+-------------------+
| Variable_name        | Value             |
+----------------------+-------------------+
| collation_connection | latin1_swedish_ci |
| collation_database   | gbk_chinese_ci    |
| collation_server     | gbk_chinese_ci    |
+----------------------+-------------------+
3 rows in set (0.00 sec)
They have the following meanings:
character_set_system: the character set MySQL uses for database identifiers; it is always utf8
character_set_server and collation_server: the default character set and collation of the server
character_set_database and collation_database: the default character set and collation of the current database
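The database-level variables follow the current database, while the server-level ones stay fixed. A rough sketch, assuming the account may create databases and that gbk is available; demo_db is a hypothetical name used only for this illustration:

```sql
-- demo_db is a hypothetical name used only for this illustration.
CREATE DATABASE demo_db CHARACTER SET gbk COLLATE gbk_chinese_ci;
USE demo_db;
-- After USE, the database-level variables reflect the current database.
SELECT @@character_set_database, @@collation_database;
-- The server-level defaults are unaffected by the USE statement.
SELECT @@character_set_server, @@collation_server;
DROP DATABASE demo_db;
```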
The following three variables affect communication between the client and the server:
character_set_client: the character set in which the client sends SQL statements to the server
character_set_results: the character set the server uses when returning results to the client
character_set_connection: if it differs from character_set_client, SQL statements sent by the client are converted to this character set
By default, these three variables are set to the same value. A client that wants to communicate with the server in a different character set can change them, for example:
SET character_set_client = utf8;
SET character_set_results = utf8;
SET character_set_connection = utf8;
Alternatively, the following single statement achieves the same effect:
SET NAMES 'utf8';
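The result can be checked directly in the same session. A short sketch, assuming utf8 is available on the server (newer servers may report it as utf8mb3):

```sql
SET NAMES 'utf8';
-- All three connection-related variables should now report the same character set.
SELECT @@character_set_client, @@character_set_results, @@character_set_connection;
-- SET NAMES also sets collation_connection to the default collation of that character set.
SELECT @@collation_connection;
```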