- mysql資料庫編碼層次:系統層,server層,database層,table層,column層,還有client,connection和result三種和用戶端通訊相關的情境;
- A character set is a set of symbols and encodings. A collation is a set of rules for comparing characters in a character set;
- To connect to MySQL from Java you have to use the JDBC driver from MySQL. The MySQL JDBC driver is called MySQL Connector/J.
- MySQL 4.1以下對unicode支援不好
- jdbc3.0.16及以上才支援使用資料庫本身編碼,否則使用ISO8859-1
- 在mysql控制台下輸入show variables like 'character_set_%'; 查看當前編碼相關係統變數,後面會解析其中幾項
mysql> SHOW VARIABLES LIKE 'character%';+--------------------------+---------------------------------+| Variable_name | Value |+--------------------------+---------------------------------+| character_set_client | latin1 || character_set_connection | latin1 || character_set_database | latin1 || character_set_filesystem | binary || character_set_results | latin1 || character_set_server | latin1 || character_set_system | utf8 || character_sets_dir | D:"mysql-5.0.37"share"charsets" |
mysql的多層字元編碼支援1.server層作用
整個資料庫伺服器預設編碼。
配置點
通過系統變數character_set_server參數指定server層編碼
可以在mysqld執行時加入該參數
或者在編輯mysql時候,設定該參數
2.database層作用
資料庫層級預設編碼。
配置點
通過系統變數character_set_database參數指定database層編碼
建表時候指定編碼
3.table層同理,table層的編碼設定僅影響當前表的所有未指定編碼的列的編碼。但這個指定是mysql專屬的,而且只能通過sql在建表或修改表時指定, 在標準sql中,沒有可指定表編碼的sql文法
4.column層作用
設定列的編碼。
配置點
建表或修改列時設定。這是標準sql文法。
mysql server與client互動時編碼如何轉換1.用戶端發送語句character set and collation system variables are involved in handling traffic for the connection between a client and the server. Every client has connection-related character set and collation system variables.
The server takes the character_set_client system variable to be the character set in which statements are sent by the client.
在用戶端和服務端通訊時,會涉及到另外幾個編碼設定相關的系統變數的,每個用戶端都有屬於自己的編碼連結相關編碼。
服務端使用系統變數character_set_client來處理用戶端發來的語句。
2.服務端處理語句The server uses the character_set_connection and collation_connection system variables. It converts statements sent by the client from character_set_client to character_set_connection (except for string literals that have an introducer such as _latin1 or _utf8)
服務端會把用戶端發來的語句(以character_set_client 編碼)轉換為character_set_connection編碼。
A character string literal may have an optional character set introducer and COLLATE clause [_charset_name]'string' [COLLATE collation_name]
如:SELECT _latin1'string' COLLATE latin1_danish_ci;
在缺少編碼指定是,預設會使用character_set_connection指定的編碼。
The character set used for literals that do not have a character set introducer and for number-to-string conversion.
沒有前置編碼修飾(introducer)的文本和數字到字元的轉換會應用character_set_connection編碼。
For comparisons of strings with column values, collation_connection does not matter because columns have their own collation, which has a higher collation precedence.
表的欄欄位與用戶端傳來的語句進行比較時,會把用戶端語句轉成列對應編碼再進行比較,這是因為欄欄位擁有更高優先順序。
3.服務端返回內容The character_set_results system variable indicates the character set in which the server returns query results to the client
系統變數character_set_results用來把資料以該編碼方式返回給用戶端。
下面用一張圖來大致描述下上面的內容(個人理解所畫)
服務端如何自動判斷並設定編碼?The character encoding between client and server is automatically detected upon connection. You specify the encoding on the server using the character_set_server for server versions 4.1.0 and newer, and character_set system variable for server versions older than 4.1.0. The driver automatically uses the encoding specified by the server.
如果用戶端串連時沒有提供編碼(串連串無characterEncoding),則服務端會使用character_set_server變數來作為用戶端編碼(4.1.0後)。
For example, to use 4-byte UTF-8 character sets with Connector/J, configure the MySQL server with character_set_server=utf8mb4, and leave characterEncoding out of the Connector/J connection string. Connector/J will then autodetect the UTF-8 setting.
To override the automatically detected encoding on the client side, use the characterEncoding property in the URL used to connect to the server.
When a client connects to the server, it sends the name of the character set that it wants to use. The server uses the name to set the character_set_client, character_set_results, and character_set_connection system variables. In effect, the server performs a SET NAMES operation using the character set name.
用戶端與服務端建立連結時,會發送用戶端所希望使用的編碼集。服務端會用這個編碼集去初始化三個系統變數character_set_client, character_set_results, and character_set_connection。如執行了語句 SET NAMES XXX一般:
SET NAMES xx可以指定connection編碼為xx:character_set_connection,character_set_results,character_set_client 系統變數可修改;
SET NAMES 'charset_name' 這句SQL等同與執行下面3個語句:
SET character_set_client = charset_name;SET character_set_results = charset_name;SET character_set_connection = charset_name;
A SET CHARACTER SET charset_name 等同於執行下面3個語句:
SET character_set_client = charset_name;SET character_set_results = charset_name;SET collation_connection = @@collation_database;
參考
http://dev.mysql.com/doc/refman/5.5/en/charset.html