MySQL character sets and collations

Source: Internet
Author: User

One, Character sets

1. MySQL character set configuration is very flexible

You can set the server default character set;

the database default character set;

the table default character set;

the column character set.

If a level does not specify a character set, it inherits the character set of the level above it.

Statement to view all character sets: SHOW CHARACTER SET;
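The inheritance rule above can be modeled in a few lines of Python. This is only a simplified sketch: the function name effective_charset and the sample settings are hypothetical, and the real server resolves these defaults itself when objects are created.

```python
from typing import Optional


def effective_charset(column: Optional[str], table: Optional[str],
                      database: Optional[str], server: str) -> str:
    """Return the first charset that is set, searching column -> table -> database -> server."""
    for level in (column, table, database):
        if level is not None:
            return level
    # Nothing more specific was declared, so the server default applies.
    return server


# Hypothetical settings: only the table and the server declare a charset.
print(effective_charset(column=None, table="utf8", database=None, server="latin1"))
# Hypothetical settings: nothing but the server default is declared.
print(effective_charset(column=None, table=None, database=None, server="latin1"))
```

The most specific level that declares a character set wins, and the server default is the last fallback.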

2. Take a table declared as utf8 as an example: the data ultimately stored in the table is UTF-8

① Tell the server what encoding the data sent by the client is in: character_set_client

② Tell the character set converter what encoding to convert it to: character_set_connection

③ Tell the server what encoding to use for query results: character_set_results

If all three are the same character set N, the three settings can be abbreviated to SET NAMES N;
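To make the role of the three variables concrete, here is a rough Python model of the conversion path. It is a sketch under assumptions, not the server's implementation: send_to_server and fetch_from_server are hypothetical helper names, the sample string is an assumption, and Python codecs stand in for MySQL's character set converter.

```python
def send_to_server(raw: bytes, client: str, connection: str, column: str) -> bytes:
    """Model of an INSERT: read the client's bytes, convert for the connection, then for the column."""
    text = raw.decode(client)                # character_set_client: how to read the incoming bytes
    on_connection = text.encode(connection)  # character_set_connection: conversion target
    # The value is finally converted to the character set of the column that stores it.
    return on_connection.decode(connection).encode(column)


def fetch_from_server(stored: bytes, column: str, results: str) -> bytes:
    """Model of a SELECT: convert the stored bytes to character_set_results."""
    return stored.decode(column).encode(results)


typed = "中国".encode("gbk")  # a GBK client sends GBK bytes
stored = send_to_server(typed, client="gbk", connection="gbk", column="utf-8")
returned = fetch_from_server(stored, column="utf-8", results="gbk")

print(stored == "中国".encode("utf-8"))  # the table holds UTF-8 bytes
print(returned == typed)                 # the client gets GBK bytes back
```

When all three settings agree with what the client really uses, the text survives the round trip even though the table stores it in a different encoding.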

3. When do garbled characters appear?

First create a test table whose character set is utf8:

CREATE TABLE temp (

name VARCHAR(10)

) CHARSET utf8;

Then execute these three statements (this assumes the client itself uses GBK):

SET character_set_client=gbk;

SET character_set_connection=gbk;

SET character_set_results=gbk;

These three statements can be abbreviated as SET NAMES gbk;

Finally, insert a row into the table:

INSERT INTO temp VALUES ('中国');

SELECT * FROM temp;

The result is displayed correctly.

Now execute this statement: SET character_set_results=utf8;

Running the query again shows garbled output.

Garbled characters appear when the character set of the returned result does not match the character set the client actually uses.
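The mismatch can be reproduced outside MySQL. In this sketch, client_display is a hypothetical helper that decodes bytes the way a terminal with a given encoding would, and the sample string is an assumption. Python's errors="replace" is used only so that undecodable bytes do not raise an exception.

```python
stored = "中国".encode("utf-8")  # bytes held in the utf8 table


def client_display(result_bytes: bytes, client_encoding: str) -> str:
    """Decode result bytes as a terminal using client_encoding would (bad bytes become U+FFFD)."""
    return result_bytes.decode(client_encoding, errors="replace")


# character_set_results=gbk: the server converts to GBK, which matches the GBK client.
ok = client_display(stored.decode("utf-8").encode("gbk"), "gbk")
# character_set_results=utf8: the server sends UTF-8 bytes to the same GBK client.
garbled = client_display(stored, "gbk")

print(ok == "中国")
print(garbled == "中国")
```

The stored data is identical in both cases; only the interpretation of the returned bytes differs.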

Garbled characters also appear when the client's declaration does not match the facts: the client actually sends UTF-8, but it tells the server the data is GBK.

There may be other cases that are not listed here.
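The case of a client that misdeclares its encoding can be sketched the same way. The sample string is an assumption, and errors="replace" is a Python convenience rather than MySQL behavior. The point is only that the server interprets the bytes according to the declaration, not according to what they really are.

```python
intended = "中国"
sent = intended.encode("utf-8")  # what the client really sends

# SET NAMES gbk was issued, so the server reads those bytes as GBK.
understood = sent.decode("gbk", errors="replace")
# The misread text is then converted to the column's character set and stored.
stored = understood.encode("utf-8")

print(understood == intended)
print(stored == intended.encode("utf-8"))
```

Both comparisons come out false: the server has stored text that differs from what the user typed.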

4. Under what circumstances is data lost?

Execute these three statements:

SET character_set_client=gbk;

SET character_set_connection=latin1;

SET character_set_results=gbk;

Insert a row into the table: INSERT INTO temp VALUES ('我们');

Query the table again: the data is missing.

It can be concluded that data is lost when the character set of the connection or the server is "smaller" than that of the client, that is, when it cannot represent every character the client sends. This wording may not be strictly accurate, but it conveys the idea. My personal understanding is that it resembles converting between data types in Java: casting a double to an int, for example, loses precision.
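A short Python sketch shows why this conversion cannot be undone. The sample Chinese string is an assumption used for illustration. Python's errors="replace" substitutes "?" for characters the target encoding cannot represent, which mirrors the kind of lossy substitution described above.

```python
text = "我们"
raw = text.encode("gbk")  # character_set_client=gbk

# character_set_connection=latin1: latin1 has no Chinese characters, so each one becomes "?".
on_connection = raw.decode("gbk").encode("latin-1", errors="replace")
# Converting onward to the utf8 column cannot bring the original characters back.
stored = on_connection.decode("latin-1").encode("utf-8")

print(on_connection)
print(stored.decode("utf-8"))
```

Once the characters have been replaced, no later setting of character_set_results can recover them, because the original bytes no longer exist anywhere.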


Two, Collations

What is a collation?

Collation: the set of rules used to compare and sort the characters of a character set.

Statement to view all collations: SHOW COLLATION;

A character set can have one or more collations.

Take utf8 as an example: it uses the utf8_general_ci collation by default, and you can also sort by binary value with utf8_bin.
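The difference between a case-insensitive collation and a binary one can be illustrated with Python's sorting. This is only a rough analogue: str.casefold stands in for the _ci rules and byte comparison stands in for _bin, and the real utf8_general_ci rules cover more than letter case.

```python
words = ["b", "A", "a", "B"]

# Rough analogue of a case-insensitive (_ci) collation: compare case-folded text.
ci_sorted = sorted(words, key=str.casefold)
# Rough analogue of a binary (_bin) collation: compare the encoded bytes.
bin_sorted = sorted(words, key=lambda w: w.encode("utf-8"))

print(ci_sorted)   # upper and lower case of a letter sort together
print(bin_sorted)  # all uppercase letters sort before all lowercase letters
```

Under the case-insensitive rule "a" and "A" compare as equal, while under the binary rule they are different values.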

How do I declare a collation?

CREATE TABLE TableName

(

...

) CHARSET utf8 COLLATE utf8_general_ci;

Note: the declared collation must be a valid collation for the character set. For example, if the character set is utf8, you cannot declare a GBK collation.
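As a quick sanity check before writing such a declaration, the pairing can be tested by name. The helper below is hypothetical and relies only on the naming convention that a collation name begins with its character set name. The server remains the authority, and SHOW COLLATION lists the character set each collation belongs to.

```python
def collation_fits_charset(charset: str, collation: str) -> bool:
    """Hypothetical helper: a collation name starts with its character set name plus '_'."""
    return collation.lower().startswith(charset.lower() + "_")


print(collation_fits_charset("utf8", "utf8_general_ci"))
print(collation_fits_charset("utf8", "gbk_chinese_ci"))
```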



Three, The "coercibility" of collations

In most queries it is obvious which collation MySQL uses for a comparison. For example, in the following cases the collation is clearly "the collation of column x":

SELECT x FROM T ORDER BY x;

SELECT x FROM T WHERE x = x;

SELECT DISTINCT x FROM T;

However, when multiple operands are involved, it may be ambiguous. For example:

SELECT x FROM T WHERE x = 'Y';

Should this query use the collation of column x, or the collation of the string literal 'Y'?

Standard SQL resolves this question with "coercibility" rules. Essentially: since x and 'Y' both have collations, which collation takes precedence? This can be hard to decide, but the following rules cover most situations:

An explicit COLLATE clause has a coercibility of 0 (it is not coercible at all).

The concatenation of two strings with different collations has a coercibility of 1.

The collation of a column has a coercibility of 2.

A system constant (such as the string returned by the USER() or VERSION() function) has a coercibility of 3.

The collation of a literal has a coercibility of 4.

NULL, or an expression derived from NULL, has a coercibility of 5.

These are the coercibility values MySQL used at the time of writing; newer versions may define additional levels.

With these values, the ambiguity is resolved as follows:

Use the collation with the lowest coercibility value. If both sides have the same coercibility, it is an error if the collations differ.
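The resolution rule just stated can be written as a small function. This is a sketch of the simplified rule as described here, not of the server's full logic. The name resolve_collation, the Operand type, and the sample collations are all hypothetical.

```python
from typing import Tuple

Operand = Tuple[str, int]  # (collation name, coercibility value)


def resolve_collation(left: Operand, right: Operand) -> str:
    """Pick the collation with the lowest coercibility; a tie with different collations is an error."""
    (l_coll, l_coerc), (r_coll, r_coerc) = left, right
    if l_coerc < r_coerc:
        return l_coll
    if r_coerc < l_coerc:
        return r_coll
    if l_coll == r_coll:
        return l_coll
    raise ValueError(f"illegal mix of collations: {l_coll} and {r_coll}")


column_x = ("latin1_german1_ci", 2)   # hypothetical column collation
literal_y = ("latin1_swedish_ci", 4)  # hypothetical literal collation
print(resolve_collation(column_x, literal_y))
```

For the query that compares column x with the literal 'Y', the column's coercibility of 2 is lower than the literal's 4, so the column's collation is used.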

Use the COERCIBILITY() function to determine the coercibility of a string expression:

mysql> SELECT COERCIBILITY('A' COLLATE latin1_swedish_ci);
    -> 0
mysql> SELECT COERCIBILITY(VERSION());
    -> 3
mysql> SELECT COERCIBILITY('A');
    -> 4
