MySQL Character set---case sensitive to proofing rules that's something.

Source: Internet
Author: User

In the relational database, ORACLE/SQL_SERVER/DB2 has a case-sensitive data, but MySQL is a bit strange, the data is not case-sensitive, see below for details.

MySQL the default rules for database name, table name, column name, and alias capitalization under Linux are:

1. The database name and table name are strictly case-sensitive.

2. The alias of the table is strictly case-sensitive.

3. The alias of the column name and column is case-insensitive in all cases.

4. The field content (that is, data) is case insensitive by default.

5. Variable names (functions and stored procedures) are also strictly case-sensitive.

So we would like to change, and what is the control of it? It is the collation of the character set and the character set that is limited.


What are character sets and proofing rules?

The character set is the encoding table of our character parsing, and in the bottom of the computer, any character is simply a code that cannot be parsed directly, and it needs these coded tables to parse what word it is. The most basic is the ASCII code table, but it is too small to represent English and some of the daily punctuation is OK, But to represent the world so many words, it is obviously not enough to use, so out of various character sets to parse a variety of text, such as Chinese Simplified gb2312, Chinese traditional big5, the most famous utf8, as well as support emoji expression utf8mb4.

We often say that the access garbled, it is due to the character encoding asymmetry, it may be between you and the server, it may be between the server and the database, there may be no conversion between the internal code, and so on.

The character set is universal, exists in the computer World environment, the database is only one of them, and the proofing rules are for MySQL, the rules are fixed. Some people may feel confused, have a character set, why also to proofread rules, easy to understand point, there are differences between the characters, but what to reflect the difference (such as sorting and grouping operations), that is the existence of this proofing rules. For example, a and a are of the same significance, When the proofing rules are strict, the case is differentiated, and a and a are different.


Character

We can use the following command to see which character sets and proofing rules are supported.

#查看支持哪些字符集, excerpt mysql> show character set;+----------+---------------------------------+------- --------------+--------+| charset  | description                      | Default  collation   | maxlen |+----------+---------------------------------+----------------- ----+--------+| big5     | big5 traditional chinese         | big5_chinese_ci     |       2 | |  latin1   | cp1252 West European             | latin1_swedish_ci   |      1  | |  ascii    | US ASCII                        |  ascii_general_ci    |      1 | |  gb2312   | GB2312 Simplified Chinese        | gb2312_chinese_ci   |      2 | |  gbk      | GBK Simplified Chinese           | gbk_chinese_ci      |       2 | |  utf8     | UTF-8 Unicode                    | utf8_general_ci      |      3 | |  utf8mb4  | utf-8 unicode                    | utf8mb4_general_ci   |      4 | |  utf16    | UTF-16 Unicode                   | utf16_general_ci    |       4 | |  utf32    | UTF-32 Unicode                   | utf32_general_ci    |       4 | |  binary   | Binary pseudo charset            | binary               |      1 | |  gb18030  |  china national standard gb18030 | gb18030_chinese_ci  |       4 |+----------+---------------------------------+---------------------+--------+41  rows in set  (0.01 SEC)

In general, MySQL still supports more character sets by default, but in most cases we use UTF8 more. The reason is to see the last column MaxLen, which represents the maximum number of bytes that are used by a character after using this character set. Big5 (Chinese Traditional) and gb2312 (Chinese Simplified) Although accounted for less, only 2 bytes, but the generality of poor, utf16 although very powerful, but the number of bytes accounted for a little more, took up 4 bytes, and it is not necessarily used so much, the tradeoff is still utf8 easy to use some, a character occupies 3 bytes.

Changing the default character set for MySQL can be added in the config file my.cnf

#要在 [MYSQLD] subkey add [mysqld] #全局默认字符集类型, according to the requirements set character-set-server = utf8# after the restart, go to MySQL to see, (Do not restart also line, slowly change it) mysql> show variables like  ' character% '; +--------------------------+------------------- --------------------------------------------+| variable_name             | Value                                                            |+--------------------------+--------------------- ------------------------------------------+| character_set_client     |  utf8                                                             | |  character_set_connection | utf8                                                             | |  character_set_database   | utf8                                                             | |  character_set_filesystem | binary                                                           | |  character_set_results    | utf8                                                             | |  character_set_server     | utf8                                                             | |  character_set_system     | utf8                                                             | |  character_sets_dir       | /usr/local/ mysql-5.7.18-linux-glibc2.5-x86_64/share/charsets/ |+--------------------------+------------------------- --------------------------------------+8 rows in set  (0.00 SEC)


Character Proofing Rules

To finish the character set, let's see what the collation rules are.

#显示utf8字符集下有哪些校对规则, excerpt mysql> show collation like  ' utf8\_% '; +--------------------------+- --------+-----+---------+----------+---------+| collation                 | charset | id  | default |  compiled | sortlen |+--------------------------+---------+-----+---------+----------+---- -----+| utf8_general_ci          | utf8     |  33 | Yes     | Yes       |       1 | |  utf8_bin                  | utf8    |  83 |          | yes      |       1 | |  utf8_unicode_ci          | utf8     | 192 |         | Yes       |       8 | |  utf8_icelandic_ci        | utf8    |  193 |         | Yes       |       8 | |  utf8_unicode_520_ci      | utf8    | 214  |         | yes      |        8 | |  utf8_vietnamese_ci       | utf8    |  215 |         | yes      |        8 | |  utf8_general_mysql500_ci | utf8    | 223 |          | Yes      |        1 |+--------------------------+---------+-----+---------+----------+---------+27  rows in set  (0.00 SEC)

There are many proofing rules, but most of us do not use, we also see the other rules a lot of Sortlen is 8, more than the first two more commonly used, will compare the cost of resources is certain.

Each character set has a default proofing rule, for example, the UTF8 default proofing rule is utf8_general_ci. Coexistence in proofing rules naming conventions: They start with their associated character set names, usually include a language name, and are _ci (case insensitive), _cs (case sensitive), or _ Bin (two yuan/case sensitive) end.
As a result, MySQL's UTF8 character set is not sensitive to data capitalization by default.

Change the default character proofing rules for MySQL, or you can add them in the config file my.cnf

#要在 [MYSQLD] subkey add [mysqld] #全局默认字符校对规则, according to the requirements set collation_server = utf8_bin# After the restart, go to MySQL to see, (Do not restart also line, slowly change it) mysql> Show variables like ' collation_% '; +----------------------+----------+| variable_name | Value |+----------------------+----------+| collation_connection | Utf8_bin | | Collation_database | Utf8_bin | | Collation_server | Utf8_bin |+----------------------+----------+3 rows in Set (0.00 sec) #当然你也可以视实际情况只改库级别和链接级别, is allowed to be different, But you know the risk is collation_database = Utf8_bincollation_connection = Utf8_bin


Attention

Note that these changes take effect only on subsequent database tables and users, that existing databases are not affected, and that you want to change the definition of a table, you must change the structure of the tables, or rebuild the database.

The method of change is alter, the method created if the default does not need to do, if not by default will be specified.

#更改表的字符集和字符校对规则ALTER Table name MODIFY COLUMN field name varchar (CHARACTER) SET UTF8 COLLATE utf8_bin;

It is important to note that changing the character proofing rules for tables that already have databases does not affect the use of tables, as long as there is no conflict on the table data, only the partial query and sorting/grouping operations will cause gaps, but the character set will directly result in garbled data, so be very cautious, Sometimes it's better to rebuild the database than to actually.

If you are only modifying the database's table name case sensitivity, then MY.CNF has a dedicated configuration option, and after the direct configuration is complete, the token name's casing rules will be changed by the reboot Lower_case_table_names

#要在 [MYSQLD] Subkey adds [mysqld] #表名的大小写敏感选项, 0 is case sensitive, 1 is case insensitive lower_case_table_names = # restart takes effect


MySQL Character set---case sensitive to proofing rules that's something.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.