MySQL collation method

Source: Internet
Author: User
In the past, Oracle was rarely used for its collation method. However, in MySQL, problems may occur if you do not pay attention to this.

The problem is as follows:
A test table. The character set uses latin1.
Select to_id from test where to_id = 'cn elephant _ Wang ';
+ --------------- +
| To_id |
+ --------------- +
| CN Tao _ Tao |
| CN xiang_wang |
+ --------------- +
2 rows in SET (0.00 Sec)
The data of CN elephant King was retrieved.
This is obviously not allowed.
View their encoding:
(Root @ im_offlog1a) [test]> select hex ('cn Tao _ tao ');
+ ---------------- +
| Hex ('cn Tao _ tao') |
+ ---------------- +
| 636eccd55fccd5 |
+ ---------------- +
1 row in SET (0.00 Sec)
(Root @ im_offlog1a) [test]> select hex ('cn elephant _ Wang ');
+ ---------------- +
| Hex ('cn elephant _ Wang ') |
+ ---------------- +
| 636ecff35fcdf5 |
+ ---------------- +
1 row in SET (0.00 Sec)
The encoding is indeed different, but why? MySQL Will the two records be the same?
At the beginning, we located the problem caused by collation.
Show variables view (use the command -- show variables like "character % ")
| Collation_connection | latin1_swedish_ci
| Collation_database | latin1_swedish_ci
| Collation_server | latin1_swedish_ci
(1)
Manually change these parameters to latin1_bin, and the result is the same. This is really strange.
Here, I will explain it first. MySQL Collation naming rules:
They start with their relevant Character Set names, usually include a language name, and end with _ Ci (Case Insensitive), _ CS (case sensitive), or _ Bin (Binary)
For example, the Latin1 character set has the following correction rules:
Proofread Rule Meaning
Latin1_german1_ci Germany din-1
Latin1_swedish_ci Sweden/Finland
Latin1_danish_ci Denmark/Norway
Latin1_german2_ci Germany din-2
Latin1_bin complies with Latin1 encoding binary
Latin1_general_ci multi-language (Western Europe)
Latin1_general_cs multi-language (Western Europe ISO), case sensitive
Latin?spanish_ci modern Spain
(2)
Finally, we recreate the table. Manually specify the table-level collation as latin1_bin.
This problem is solved.

The problem arises again. Why does it not take effect when I manually test latinrelease bin?
OriginalMySQLSelect the table Character Set and proofreading rules as follows:
If character set X and collate y are specified, character set X and collate y are used.
If the character set X is specified but the Collate y is not specified, the character set X and character set X are used as the default proofreading rules. Otherwise, the server Character Set and server verification rules are used.
The character set is specified during table creation, so it always adopts the corresponding default proofreading rules.
(3)Of course, we do not need to recreate the table. We only need to alter table db_allot convert to Character Set Latin1 collate latin1_bin to convert the table.In addition, we recommend that collation use the bin-type checking rules of the character set as much as possible to avoid errors..

let's talk about my own experience. I think Character Set Latin1 collate latin1_bin is an improvement of the old version of varchar binary, but in the new version, the character set is used to set the character set first, use this character set name to add _ bin to set the checking rules to binary, so that make sure that the Chinese Query is correct . After testing, we changed the attribute of this field to
alter table 'comment _ content_000001 'change 'thread' varchar (50) without binary) default null
then we can check whether the table structure is actually 'thread' varchar (50) default null, that is, it does not contain Character Set Latin1 collate latin1_bin, it can be seen that character set Latin1 collate latin1_bin is an improved version of varchar binary. In addition, I also read more convenient practices. you do not need to modify the field attributes one by one, as long as the table-level Collation is latin1_bin, .
test:
alter table comment_content_000001 convert to Character Set Latin1 collate latin1_bin
,
export the table structure

Create Table comment_content_000001 (
content_id int (11) not null auto_increment,
thread varchar (50) Collate latin1_bin default null,
uname varchar (100) collate latinrelease bin default null,
Nick varchar (100) Collate latinrelease bin default null,
uid int (11) unsigned default null,
content text collate latinrelease bin,
post_time datetime default null,
post_ip int (10) unsigned defau Lt null,
'status' Enum ('unaudit', 'normal', 'deleted') Collate latin1_bin not null default 'unaudit',
Primary Key (content_id)
) engine = MyISAM default charset = Latin1 collate = latin1_bin;
even if collate is not set for each field, it is all collate latin1_bin.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.