I recently noticed that Oracle and MySQL calculate field lengths differently, even when both use UTF8 encoding. For example:
Defined under Oracle: Name VARCHAR2(10). The Name field can hold 10 ASCII characters or 3 Chinese characters.
Defined under MySQL: Name varchar(10). The Name field can hold 10 ASCII characters or 10 Chinese characters.
From the above you can tell that under Oracle, 1 Chinese character = 3 bytes.
So why does MySQL appear to treat 1 Chinese character as 1 byte?
After some investigation, the answer is that from MySQL 5 onward the unit of a varchar length is the character, whereas the length of Oracle's VARCHAR2 is measured in bytes.
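The difference between the two length semantics can be illustrated with a small Python sketch. This is not database code: the helper names fits_byte_limit and fits_char_limit are made up for illustration, and the data is assumed to be UTF-8 encoded as in the example above.

```python
def fits_byte_limit(value: str, limit: int, encoding: str = "utf-8") -> bool:
    """Byte semantics (like the Oracle VARCHAR2 column above): count encoded bytes."""
    return len(value.encode(encoding)) <= limit


def fits_char_limit(value: str, limit: int) -> bool:
    """Character semantics (like a MySQL 5 varchar column): count characters."""
    return len(value) <= limit


ascii_name = "abcdefghij"            # 10 ASCII characters, 10 bytes in UTF-8
three_hanzi = "汉字长"                # 3 Chinese characters, 9 bytes in UTF-8
ten_hanzi = "汉字长度计算方式不同"     # 10 Chinese characters, 30 bytes in UTF-8

# With a limit of 10, byte semantics rejects the 10 Chinese characters,
# while character semantics accepts them.
print(fits_byte_limit(ascii_name, 10), fits_char_limit(ascii_name, 10))
print(fits_byte_limit(three_hanzi, 10), fits_char_limit(three_hanzi, 10))
print(fits_byte_limit(ten_hanzi, 10), fits_char_limit(ten_hanzi, 10))
```

Under byte semantics the same declared length holds fewer Chinese characters, which matches the 10-versus-3 observation above.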
The number of bytes a character occupies depends on the encoding:
UTF-8: 1 Chinese character = 3 bytes
GBK: 1 Chinese character = 2 bytes
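These per-encoding byte counts can be checked directly. The following Python snippet is only a sketch using the standard utf-8 and gbk codecs; the sample characters are arbitrary choices.

```python
# Compare how many bytes one Chinese character and one ASCII letter
# occupy under UTF-8 and GBK.
samples = {"hanzi": "汉", "letter": "a"}

byte_counts = {
    (name, enc): len(ch.encode(enc))
    for name, ch in samples.items()
    for enc in ("utf-8", "gbk")
}

for (name, enc), n in sorted(byte_counts.items()):
    print(f"{name:6s} {enc:5s} -> {n} byte(s)")
```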
In MySQL, varchar(50) holds 50 characters, whether they are Chinese or English.
The MySQL 5 documentation describes the varchar type as follows: varchar(M) is a variable-length string, where M is the maximum column length. The range of M is 0 to 65,535. (The actual maximum length of a varchar is determined by the maximum row size and the character set used; the maximum effective length is 65,532 bytes.)
Why 65,532 rather than 65,535? I find the MySQL manual rather unfriendly here, because you have to read it carefully to find this description: MySQL 5.1 follows the standard SQL specification and does not remove trailing spaces from varchar values. A varchar is stored as a 1-byte or 2-byte length prefix plus the data. If the declared length of the varchar column is greater than 255, the length prefix is 2 bytes.
That makes things a little clearer. But if a 2-byte length prefix is used when the length exceeds 255, simple subtraction gives 65535 - 2 = 65533. I don't know how the experts arrived at 65,532, so I'll leave that question open for now.
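One possible reading of the 65,532 figure, offered as an assumption rather than something stated in the quoted passage: besides the 2-byte length prefix, one more byte is taken by the row's NULL flags when the column is nullable. Under that assumption the arithmetic would look like this (the constant names are illustrative):

```python
MAX_ROW_BYTES = 65535      # upper bound quoted from the manual
LENGTH_PREFIX_BYTES = 2    # prefix for varchar columns declared longer than 255
NULL_FLAG_BYTES = 1        # assumption: one byte of NULL flags for a nullable column

max_not_null = MAX_ROW_BYTES - LENGTH_PREFIX_BYTES   # column declared NOT NULL
max_nullable = max_not_null - NULL_FLAG_BYTES        # column that allows NULL

print(max_not_null, max_nullable)  # 65533 65532
```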
Note: when I tested with UTF8 encoding, the maximum length I could declare for a varchar was 21854.
In a test on MySQL 5.0.45 with the database encoding set to UTF8, the longest varchar that could be defined was 21785. That is, whether the content is letters, digits, or Chinese characters, the column can hold at most 21,785 of them.
Presumption: a varchar can hold at most 65,535 bytes, and UTF8 encodes a character in up to 3 bytes, so the limit should be about 65535 / 3 = 21845 characters, which is in the same range as the values observed above. However, the length function counts bytes: a Chinese character counts as 3 bytes, while a letter or other ASCII character counts as 1. So for char(10), does the actual stored length vary?