MySQL Character Set and Verification

Source: Internet
Author: User

Use case

First, let's look at an example:

There is a user table

User (username VARCHAR (5) character set utf8COLLATE utf8_general_ci );

Record ('username'); and ('username ');

Select * from User where username = 'username'; two data items are selected.

Then change COLLATE of username to utf8_bin;

User (username VARCHAR (5) character set utf8 COLLATE utf8_bin );


Then select * from User where username = 'username'; the first data entry is selected.

Why? Please refer to the following link:

Character set is a SET of symbols and encodings. COLLATE is a set of rules used to compare characters in character sets. Let's use an example of a hypothetical character set to make the difference clear.
Assume that we have an alphabet with four letters: 'A', 'B', 'A', and 'B '. We assign A value for each letter: 'A' = 0, 'B' = 1, 'A' = 2, 'B' = 3. The letter 'A' is A symbol, and the number 0 is the encoding of 'A'. These four letters and Their encoding are combined into A character set.
Suppose we want to compare the values of two strings: 'A' and 'B '. The simplest way to compare is to find the encoding: 'A' is 0, and 'B' is 1. Because 0 is less than 1, we can say that 'A' is less than 'B '. We only apply a verification rule to our character set. The checking rule is a set of rules (in this case, it is only a set of rules): "compare the encoding ." We call the simplest proofread rule among all these possible rules a binary (binary) proofread rule.
But what if we want lower-case letters to be equivalent to upper-case letters? Then, we have at least two rules: (1) the lowercase letters 'A' and 'B' are considered to be equivalent to 'A' and 'B'; (2) then compare the encoding. This is a case-insensitive checking rule. This is more complex than binary verification rules.

10.3.1. Server Character Set and Verification
The MySQL server has a server character set and a server verification rule. They cannot be set to null. MySQL determines the server Character Set and server verification rules as follows:
· Set valid options when the server is started
· Set the value according to the running time
At the server level, it is easy to determine the method. When mysqld is started, the server Character Set and proofreading rules are determined based on the initial options used. You can use -- default-character-set to set the character set, and you can add -- default-collation to the proofreading rule after the character set. If no character set is specified, it is the same as -- default-character-set = latin1. If you specify only one character set (for example, latin1) but not one verification rule, it is the same as -- default-charset = latin1 -- default-collation = latin1_swedish_ci, because latin1_swedish_ci is the default proofreading rule of latin1. Therefore, the following three commands have the same effect:
Shell> mysqld
Shell> mysqld -- default-character-set = latin1
Shell> mysqld -- default-character-set = latin1 \
-- Default-collation = latin1_swedish_ci
One way to change the set value is by recompiling. If you want to change the default server Character Set and verification rules when building from the source program, use -- with-charset and -- with-collation as configure parameters. For example:
Shell>./configure -- with-charset = latin1 or:
Shell>./configure -- with-charset = latin1 \
-- With-collation = latin1_german1_ci
Both mysqld and configure verify that the character set/collation combination is valid. If it is invalid, each program will display an error message and terminate it.
The current server Character Set and verification rules can be used as the value of character_set_server and collation_server system variables. The value of these variables can be changed at runtime.
10.3.2. database character set and Verification
Each database has a database character set and a database proofreading rule. It cannot be blank. The create database and alter database statements have an optional clause to specify the DATABASE character set and collation:
Create database db_name [[DEFAULT] character set charset_name] [[DEFAULT] COLLATE collation_name] alter database db_name [[DEFAULT] character set charset_name] [[DEFAULT] COLLATE collation_name] example:
Create database db_name default character set latin1 COLLATE latin1_swedish_ci; MySQL selects the database character set and DATABASE proofreading rules as follows:
· If character set x and collate y are specified, character set x and checking rule Y are used.
· If character set x is specified but collate y is not specified, character set x and character set x are used as the default proofreading rules.
· Otherwise, server character sets and server verification rules are used.
The create database... default character set... syntax of MySQL is similar to the create schema... character set... syntax of standard SQL. Therefore, you can create a database on the same MySQL server that uses different character sets and verification rules.
If the TABLE Character Set and collation are not specified in the create table statement, the database character set and collation are used as the default values. They have no other purpose.
The character set and collation rules of the default database can be used as character_set_database and collation_database system variables. The server sets the values of these two variables whenever the default database is changed. If no default database exists, these two variables have the same value as the corresponding server-level variables (character_set_server and collation_server.
10.3.3. Table Character Set and Proofreading
Each table has a table character set and a proofreading rule. It cannot be blank. To specify the TABLE Character Set and collation, the create table and alter table statements have an optional clause:
Create table tbl_name (column_list) [default character set charset_name [COLLATE collation_name] alter table tbl_name [default character set charset_name] [COLLATE collation_name] example:
Create table t1 (...) default character set latin1 COLLATE latin1_danish_ci;
MySQL selects the table Character Set and proofreading rules as follows:
· If character set x and collate y are specified, character set x and collate y are used.
· If character set x is specified but collate y is not specified, character set x and character set x are used as the default proofreading rules.
· Otherwise, server character sets and server verification rules are used.
If the column Character Set and collation are not specified in the column definition, the table Character Set and collation are used by default. Table character sets and proofreading rules are extensions of MySQL; they are not in standard SQL.
10.3.4. Column Character Set and Proofreading
Each "character" Column (that is, a CHAR, VARCHAR, or TEXT column) has a column character set and a column validation rule. It cannot be blank. The column definition syntax has an optional clause to specify the column Character Set and proofreading rules:
Col_name {CHAR | VARCHAR | TEXT} (col_length) [character set charset_name [COLLATE collation_name] example:
Create table Table1
(
Column1 VARCHAR (5) character set latin1 COLLATE latin?german=ci
);
MySQL selects the column Character Set and proofreading rules as follows:
· If character set x and collate y are specified, character set x and collate y are used.
· If character set x is specified but collate y is not specified, character set x and character set x are used as the default proofreading rules.
· Otherwise, use the table Character Set and server verification rules.
The character set and COLLATE clauses are standard SQL statements.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.