1: CHARACTER_SETS

First, look at the first ten rows returned by the query:

[email protected] [information_schema]>select * from CHARACTER_SETS order by MAXLEN DESC limit 10;
+--------------------+----------------------+---------------------------------+--------+
| CHARACTER_SET_NAME | DEFAULT_COLLATE_NAME | DESCRIPTION                     | MAXLEN |
+--------------------+----------------------+---------------------------------+--------+
| utf32              | utf32_general_ci     | UTF-32 Unicode                  |      4 |
| utf16le            | utf16le_general_ci   | UTF-16LE Unicode                |      4 |
| gb18030            | gb18030_chinese_ci   | China National Standard GB18030 |      4 |
| utf8mb4            | utf8mb4_general_ci   | UTF-8 Unicode                   |      4 |
| utf16              | utf16_general_ci     | UTF-16 Unicode                  |      4 |
| eucjpms            | eucjpms_japanese_ci  | UJIS for Windows Japanese       |      3 |
| ujis               | ujis_japanese_ci     | EUC-JP Japanese                 |      3 |
| utf8               | utf8_general_ci      | UTF-8 Unicode                   |      3 |
| gbk                | gbk_chinese_ci       | GBK Simplified Chinese          |      2 |
| ucs2               | ucs2_general_ci      | UCS-2 Unicode                   |      2 |
+--------------------+----------------------+---------------------------------+--------+

Here is the official explanation of the columns:
INFORMATION_SCHEMA Name | SHOW Name | Remarks
CHARACTER_SET_NAME | Charset (character set name) |
DEFAULT_COLLATE_NAME | Default collation |
DESCRIPTION | Description | MySQL extension
MAXLEN | Maxlen (maximum length of one character, in bytes) | MySQL extension
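To make the MAXLEN column more concrete, here is a minimal Python sketch that counts how many bytes a single character needs in a few encodings. It uses Python's own codec names (`utf-8`, `gbk`, `utf-16-le`, `utf-32-le`) as stand-ins for the similarly named MySQL character sets; that correspondence is an assumption of this example, and the helper name `byte_len` is hypothetical.

```python
# Bytes needed to encode one character in several encodings.
# Python codecs stand in for the MySQL character sets of similar name;
# this does not query MySQL itself.
samples = {
    "ASCII letter": "A",
    "Chinese character": "\u6f22",   # 漢
    "emoji": "\U0001F600",           # 😀
}
codecs = ["utf-8", "gbk", "utf-16-le", "utf-32-le"]


def byte_len(ch, codec):
    """Return the encoded length in bytes, or None if the codec cannot encode ch."""
    try:
        return len(ch.encode(codec))
    except UnicodeEncodeError:
        return None


for label, ch in samples.items():
    print(label, {codec: byte_len(ch, codec) for codec in codecs})
```

The Chinese character takes three bytes in UTF-8 and two in GBK, while the emoji takes four bytes in UTF-8. That is why MySQL's `utf8` (MAXLEN 3) cannot hold such a character and `utf8mb4` (MAXLEN 4) can.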
This table lists every character set that MySQL supports, 41 in total on this server. Taking utf8 as an example: its default collation is utf8_general_ci, and one character occupies at most three bytes. Common Chinese characters occupy three bytes under UTF-8.

Running show create table on it gives:

| CHARACTER_SETS | CREATE TEMPORARY TABLE `CHARACTER_SETS` (
  `CHARACTER_SET_NAME` varchar(32) NOT NULL DEFAULT '',
  `DEFAULT_COLLATE_NAME` varchar(32) NOT NULL DEFAULT '',
  `DESCRIPTION` varchar(60) NOT NULL DEFAULT '',
  `MAXLEN` bigint(3) NOT NULL DEFAULT '0'
) ENGINE=MEMORY DEFAULT CHARSET=utf8 |

As we can see from ENGINE=MEMORY, the table uses the MEMORY engine, so its contents are not kept on disk and an identical table is rebuilt after every restart.

2: COLLATIONS

First, look at the first ten rows returned by the query:

[email protected] [information_schema]>select * from COLLATIONS order by id limit 10;
+-------------------+--------------------+----+------------+-------------+---------+
| COLLATION_NAME    | CHARACTER_SET_NAME | ID | IS_DEFAULT | IS_COMPILED | SORTLEN |
+-------------------+--------------------+----+------------+-------------+---------+
| big5_chinese_ci   | big5               |  1 | Yes        | Yes         |       1 |
| latin2_czech_cs   | latin2             |  2 |            | Yes         |       4 |
| dec8_swedish_ci   | dec8               |  3 | Yes        | Yes         |       1 |
| cp850_general_ci  | cp850              |  4 | Yes        | Yes         |       1 |
| latin1_german1_ci | latin1             |  5 |            | Yes         |       1 |
| hp8_english_ci    | hp8                |  6 | Yes        | Yes         |       1 |
| koi8r_general_ci  | koi8r              |  7 | Yes        | Yes         |       1 |
| latin1_swedish_ci | latin1             |  8 | Yes        | Yes         |       1 |
| latin2_general_ci | latin2             |  9 | Yes        | Yes         |       1 |
| swe7_swedish_ci   | swe7               | 10 | Yes        | Yes         |       1 |
+-------------------+--------------------+----+------------+-------------+---------+

As usual, here is the official explanation of the columns:
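INFORMATION_SCHEMA Name | SHOW Name | Remarks
COLLATION_NAME | Collation (collation name) |
CHARACTER_SET_NAME | Charset (the character set the collation belongs to) | MySQL extension
ID | Id (collation ID; presumably assigned by MySQL itself, not explored further here) | MySQL extension
IS_DEFAULT | Default (whether the collation is the default for its character set) | MySQL extension
IS_COMPILED | Compiled (whether the character set is compiled into the server) | MySQL extension
SORTLEN | Sortlen (related to the amount of memory required to sort strings expressed in the character set) | MySQL extension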
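As a rough illustration of how IS_DEFAULT relates collations to character sets, the following Python sketch groups the ten rows from the COLLATIONS query output above by character set and records which collation is flagged as the default. The rows are copied from that output; the variable names are hypothetical, and this is only a way to read the data, not how MySQL stores it.

```python
from collections import defaultdict

# (COLLATION_NAME, CHARACTER_SET_NAME, IS_DEFAULT) copied from the query output.
rows = [
    ("big5_chinese_ci", "big5", "Yes"),
    ("latin2_czech_cs", "latin2", ""),
    ("dec8_swedish_ci", "dec8", "Yes"),
    ("cp850_general_ci", "cp850", "Yes"),
    ("latin1_german1_ci", "latin1", ""),
    ("hp8_english_ci", "hp8", "Yes"),
    ("koi8r_general_ci", "koi8r", "Yes"),
    ("latin1_swedish_ci", "latin1", "Yes"),
    ("latin2_general_ci", "latin2", "Yes"),
    ("swe7_swedish_ci", "swe7", "Yes"),
]

collations_by_charset = defaultdict(list)  # charset -> all collations seen
default_collation = {}                     # charset -> collation with IS_DEFAULT = Yes

for collation, charset, is_default in rows:
    collations_by_charset[charset].append(collation)
    if is_default == "Yes":
        default_collation[charset] = collation

for charset in sorted(collations_by_charset):
    print(charset, collations_by_charset[charset], "default:", default_collation[charset])
```

In this sample, latin1 and latin2 each have two collations, and exactly one of them carries IS_DEFAULT = Yes.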
In everyday use, the SHOW COLLATION statement shows the same information.

Running show create table on it gives:

| COLLATIONS | CREATE TEMPORARY TABLE `COLLATIONS` (
  `COLLATION_NAME` varchar(32) NOT NULL DEFAULT '',
  `CHARACTER_SET_NAME` varchar(32) NOT NULL DEFAULT '',
  `ID` bigint(11) NOT NULL DEFAULT '0',
  `IS_DEFAULT` varchar(3) NOT NULL DEFAULT '',
  `IS_COMPILED` varchar(3) NOT NULL DEFAULT '',
  `SORTLEN` bigint(3) NOT NULL DEFAULT '0'
) ENGINE=MEMORY DEFAULT CHARSET=utf8 |

This is a memory table, generated automatically by the system, and its contents do not change.

3: COLLATION_CHARACTER_SET_APPLICABILITY

Look at ten rows of data, this time filtering with a condition:

[email protected] [information_schema]>select * from COLLATION_CHARACTER_SET_APPLICABILITY where CHARACTER_SET_NAME like '%utf%' limit 10;
+-------------------+--------------------+
| COLLATION_NAME    | CHARACTER_SET_NAME |
+-------------------+--------------------+
| utf8_general_ci   | utf8               |
| utf8_bin          | utf8               |
| utf8_unicode_ci   | utf8               |
| utf8_icelandic_ci | utf8               |
| utf8_latvian_ci   | utf8               |
| utf8_romanian_ci  | utf8               |
| utf8_slovenian_ci | utf8               |
| utf8_polish_ci    | utf8               |
| utf8_estonian_ci  | utf8               |
| utf8_spanish_ci   | utf8               |
+-------------------+--------------------+
10 rows in set (0.00 sec)

As usual, here is the official explanation of the columns:
INFORMATION_SCHEMA Name | SHOW Name | Remarks
COLLATION_NAME | Collation |
CHARACTER_SET_NAME | Charset |
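The pairing in this table can also be read off the collation names themselves, because MySQL collation names start with the name of their character set and usually end in a suffix such as _ci, _cs, or _bin. The Python sketch below splits a name along those lines. The function `split_collation` and the `SUFFIX_MEANING` table are hypothetical helpers written for this example; they rely only on the naming convention and will report "unknown" for names that do not follow it.

```python
# Suffix meanings per the MySQL collation naming convention.
SUFFIX_MEANING = {
    "ci": "case-insensitive",
    "cs": "case-sensitive",
    "bin": "binary (compare by code value)",
}


def split_collation(name):
    """Split a collation name into (character set, meaning of its suffix)."""
    charset = name.partition("_")[0]      # text before the first underscore
    suffix = name.rsplit("_", 1)[-1]      # text after the last underscore
    return charset, SUFFIX_MEANING.get(suffix, "unknown")


for name in ["utf8_general_ci", "utf8_bin", "latin2_czech_cs"]:
    print(name, "->", split_collation(name))
```

For the rows shown above, the first element of the result matches the CHARACTER_SET_NAME column.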
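Clearly, this table is nothing more than a mapping between collations and character sets. It is also a memory table, generated automatically at initialization according to the database version.

Next, the difference between character sets and collations:

A character set determines how strings are stored. It is a set of characters, the smallest meaningful symbols of human language such as 'A' and 'B', together with their encodings.
A collation determines how strings are compared. It is the set of rules for comparing characters within one character set.
Each collation belongs to exactly one character set, but one character set can have several collations, one of which is its default collation.

Collation names in MySQL follow a naming convention: they start with the name of the character set they belong to and end with _ci (case-insensitive), _cs (case-sensitive), or _bin (compare by encoded value). For example, under the collation "utf8_general_ci" the characters "a" and "A" are equivalent.

The MySQL variables related to character sets and collations are:

– character_set_server: the default character set for internal operations
– character_set_client: the character set of the data sent by the client
– character_set_connection: the connection-layer character set
– character_set_results: the character set of query results
– character_set_database: the default character set of the currently selected database
– character_set_system: the character set of system metadata (column names and so on)

The character set conversion process in MySQL works as follows:

1. When MySQL Server receives a request, it converts the request data from character_set_client to character_set_connection.
2. Before the internal operation, it converts the request data from character_set_connection to the internal operation character set, which is determined as follows:
• use the CHARACTER SET value of each data column;
• if that does not exist, use the DEFAULT CHARACTER SET value of the corresponding table (a MySQL extension, not standard SQL);
• if that does not exist, use the DEFAULT CHARACTER SET value of the corresponding database;
• if that does not exist, use the character_set_server value.
3. It converts the result of the operation from the internal operation character set to character_set_results.

Part of this draws on another blogger's post. The address is given here to help with understanding, with thanks to the author for sharing: http://www.laruence.com/2008/01/05/12.html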
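To make the three conversion steps and the fallback order described above more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not MySQL's implementation: `PY_CODEC` is a hypothetical mapping from MySQL character set names to Python codecs, the function names are invented for this example, and characters that cannot be converted raise an error here instead of being handled the way a real server would.

```python
# Hypothetical mapping from MySQL character set names to Python codec names.
PY_CODEC = {"utf8": "utf-8", "gbk": "gbk", "latin1": "latin-1"}


def resolve_internal_charset(column=None, table=None, database=None, server="latin1"):
    """Pick the internal character set: column, then table, then database, then server."""
    for value in (column, table, database):
        if value is not None:
            return value
    return server


def convert(data, src, dst):
    """Re-encode bytes from character set src to character set dst."""
    return data.decode(PY_CODEC[src]).encode(PY_CODEC[dst])


def round_trip(request, client, connection, results, **levels):
    """Follow the three steps: client -> connection -> internal -> results."""
    internal = resolve_internal_charset(**levels)
    on_connection = convert(request, client, connection)   # step 1
    stored = convert(on_connection, connection, internal)   # step 2
    reply = convert(stored, internal, results)              # step 3
    return internal, stored, reply


# A client that speaks GBK talks to a table whose default character set is utf8.
request = "\u6f22\u5b57".encode("gbk")   # the two characters 漢字, 4 bytes in GBK
internal, stored, reply = round_trip(
    request, client="gbk", connection="utf8", results="gbk",
    table="utf8", server="latin1",
)
print(internal, len(stored), reply == request)
```

Because no column-level setting is given, the table-level value wins over the server default. The data is held as six UTF-8 bytes internally and comes back to the client as the original four GBK bytes.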
information_schema series: character sets and collations (CHARACTER_SETS, COLLATIONS, COLLATION_CHARACTER_SET_APPLICABILITY)