The problem is as follows:
A test table. The character set uses latin1.
Select to_id from test where to_id = 'cn elephant _ Wang ';
+ --------------- +
| To_id |
+ --------------- +
| Cn Tao _ Tao |
| Cn xiang_wang |
+ --------------- +
2 rows in set (0.00 sec)
The data of cn elephant King was retrieved.
This is obviously not allowed.
View their encoding:
(Root @ im_offlog1a :) [test]> select hex ('cn Tao _ tao ');
+ ---------------- +
| Hex ('cn Tao _ tao') |
+ ---------------- +
| 636ECCD55FCCD5 |
+ ---------------- +
1 row in set (0.00 sec)
(Root @ im_offlog1a :) [test]> select hex ('cn elephant _ Wang ');
+ ---------------- +
| Hex ('cn elephant _ Wang ') |
+ ---------------- +
| 636ECFF35FCDF5 |
+ ---------------- +
1 row in set (0.00 sec)
The encoding is indeed different, but why does mysql think the two records are the same?
At the beginning, we located the problem caused by collation.
View show variables
| Collation_connection | latin1_swedish_ci
| Collation_database | latin1_swedish_ci
| Collation_server | latin1_swedish_ci
Manually change these parameters to latin1_bin, and the result is the same. This is really strange.
Here we will first explain the naming rules for mysql collation:
They start with their relevant Character Set names, usually include a language name, and end with _ ci (Case Insensitive), _ cs (case sensitive), or _ bin (Binary)
For example, the latin1 character set has the following correction rules:
Proofread Rule Meaning
Latin1_german1_ci Germany DIN-1
Latin1_swedish_ci Sweden/Finland
Latin1_danish_ci Denmark/Norway
Latin1_german2_ci Germany DIN-2
Latin1_bin complies with latin1 encoding binary
Latin1_general_ci multi-language (Western Europe)
Latin1_general_cs multi-language (Western Europe ISO), case sensitive
Latin?spanish_ci modern Spain
Finally, we recreate the table and manually specify the table-level collation as latin1_bin.
This problem is solved.
The problem arises again. Why does it not take effect when I manually test latinrelease bin?
MySQL used to select the table Character Set and proofreading rules as follows:
If character set x and collate y are specified, character set x and collate y are used.
If the character set x is specified but the collate y is not specified, the character set x and character set x are used as the default proofreading rules.
Otherwise, the server Character Set and server verification rules are used.
The character set is specified during table creation, so it always adopts the corresponding default proofreading rules.
Of course, we do not need TO recreate the table. We only need TO alter table db_allot convert to character set latin1 COLLATE latin1_bin to convert the table.
In addition, we recommend that collation use the bin-type checking rules of the character set as much as possible to avoid errors.