Mysql Character Set and verification set, mysql Verification
I. Character Set
1. mysql's Character Set settings are flexible
You can set the default Character Set of the server;
Default Character Set of the database;
The default Character Set of the table;
Column Character Set;
If no character set is specified for a level, the upper level is inherited.
View All character set statements: show character set;
2. Taking table declaration as UTF-8 as an example, the data stored in the table is UTF-8
① We want to tell the server what encoding is the data I sent to you? Character_set_client
② Tell the character set converter what encoding is converted? Character_set_connection
③ What encoding does the query result use? Character_set_results
If all three are character set N, You Can abbreviated it as set names N;
3. Under what circumstances will garbled characters occur?
First, create a test table and specify the table's character set as utf8.
Create table temp (
Name varchar (10)
) Charset utf8;
Then execute these three commands:
Set character_set_client = gbk;
Set character_set_connection = gbk;
Set_character_set_results = gbk; these three sentences can be abbreviated as set names gbk;
Insert a data entry to the table:
Insert into temp values ('China ');
Select * from temp; the result displayed at this time is normal.
When we execute this command again: set character_results = utf8;
Garbled,
The returned result may contain garbled characters when the character set of the result does not match the character set of the client.
In another case, garbled characters may occur: when the client statement is inconsistent with the facts, your client is utf8, but you have to convert it to gbk, for example:
There may be many other cases, which are not listed here.
4. Under what circumstances will data be lost?
Execute these three statements:
Set character_set_client = gbk;
Set character_set_connection = latin1;
Set character_set_results = gbk;
Insert a data entry into the table: insert into temp values ('u.s ');
The final query result is as follows:
The data is lost.
It can be concluded that when the character set of the connection and server is less than the client hour, data will be lost. The term may not be very accurate, but it can be understood. In my personal understanding, it is the same as when different data types in java are converted to each other. For example, if the double type is forcibly converted to the int type, the precision will be lost.
......
Ii. Checking set
What is a verification set?
Collation: the collation of character sets.
View All collation statements: show collation;
A character set can have one or more sorting rules.
Take utf8 as an example. The default setting is utf8_general_ci, which can be sorted in binary format.
How to declare a checkpoint?
Create table tableName
(
...
) Charset utf8 collate utf8_general_ci;
Note: The declared checkpoint set must be a valid checkpoint set of the character set. For example, if your character set is utf8, you cannot declare the checking set of gbk.
Iii. "compression" of proofreading rules"
Among the vast majority of queries, it is obvious which proofreading rules mysql uses for comparison. For example, in the following cases, the checking rules are clearly "column x checking Rules ":
|
select x from T order by x; select x from T where x = x; select distinct x from T; |
However, it may not be clear when multiple operands are involved. For example:
|
select x from T where x = 'Y' ; |
Should this query use the checking rules of column x or the checking rules of string text 'y?
Standardized SQL solves this problem by using "compression" rules. Basically, this means that since both x and 'y' have school rules, which school rules take precedence? This may be difficult to solve, but the following rules are applicable in most cases:
An external COLLATE clause is 0 (compression is not allowed at all .) The two strings connected using different proofread rules are 1. The compression of column checking rules is 2. The "system constant" (for example, the string returned by the USER () or VERSION () function) is 3. The compression of the text rule is 4. If the expression is NULL or the expression derived from NULL is 5.
The above compression value is currently used by mysql.
The above rules can be fuzzy:
Proofread rules that use the lowest possible compression value. If the two sides have the same condensibility, an error occurs if the proofreading rules are different.
Use the COERCIBILITY () function to determine the compression of a string expression:
|
mysql> SELECT COERCIBILITY( 'A' COLLATE latin1_swedish_ci); -> 0 mysql> SELECT COERCIBILITY(VERSION()); -> 3 mysql> SELECT COERCIBILITY( 'A' ); -> 4 |