MySQL's varchar length problem

Source: Internet
Author: User

If a column is defined as varchar(50), it can of course hold 50 English characters. What about Chinese? A Chinese character takes 3 bytes in UTF-8, so can this varchar(50) hold only 16 Chinese characters? No. That was the case in older versions of MySQL, but not from 5.0 on. In MySQL 5.0 and later, varchar(50) stores 50 characters whether they are Chinese or English. The maximum total length of all varchar columns in a table does depend on the encoding, though: roughly 65535/3 characters for utf-8 and roughly 65535/2 for GBK.
The MySQL 5 documentation describes the varchar type as follows: varchar(M) is a variable-length string, where M is the maximum column length in characters. The range of M is 0 to 65,535. (The actual maximum length of a varchar is determined by the maximum row size and the character set used; the maximum effective length is 65,532 bytes.) Why did this change? The MySQL manual is not very friendly here, because you have to read it carefully to find this description: MySQL 5.1 follows the standard SQL specification and does not remove trailing spaces from varchar values. A varchar value is stored as a one-byte or two-byte length prefix plus the data. The length prefix is one byte if values need no more than 255 bytes, and two bytes if they may need more than 255 bytes.
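The difference between characters and bytes can be sketched in a few lines of Python. This is only an illustration, not MySQL code: the helper name varchar_storage_bytes and its parameters are made up for this example, and it assumes the prefix is 1 byte when the column's values can never need more than 255 bytes and 2 bytes otherwise.

```python
def varchar_storage_bytes(value: str, declared_chars: int,
                          encoding: str, max_bytes_per_char: int) -> int:
    """Approximate bytes needed to store `value` in VARCHAR(declared_chars).

    Hypothetical helper for illustration; it models only the
    length prefix plus the encoded data.
    """
    if len(value) > declared_chars:
        raise ValueError("value is longer than the declared column length")
    # Assumption: 1-byte prefix if the column can never exceed 255 bytes.
    prefix = 1 if declared_chars * max_bytes_per_char <= 255 else 2
    return prefix + len(value.encode(encoding))


chinese = "中" * 50  # 50 Chinese characters
english = "a" * 50   # 50 ASCII characters

# Both strings are 50 characters long, so both fit in varchar(50).
print(len(chinese), len(chinese.encode("utf-8")))       # 50 150
print(varchar_storage_bytes(chinese, 50, "utf-8", 3))   # 151
print(varchar_storage_bytes(english, 50, "utf-8", 3))   # 51
print(varchar_storage_bytes(chinese, 50, "gbk", 2))     # 101
```

The column length counts characters, while the storage cost counts bytes, which is why the encoding matters for the row-size limits but not for how many characters varchar(50) accepts.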

What is the maximum length of varchar in MySQL? This is not a fixed number. This article briefly describes the restriction rules.

1. Restriction rules

The following rules limit the length of a column when it is defined:

a) storage limits

A varchar column stores its actual content separately from the clustered index. The actual row storage starts at the second byte, and the content is preceded by 1 to 2 bytes that record its actual length (2 bytes when the length exceeds 255), so the maximum length cannot exceed 65535.

b) encoding length limit

The character type is GBK, with a maximum of 2 bytes per character

The character type is UTF8, with a maximum of 3 bytes per character

If a column is defined beyond these limits and strict SQL mode is not enabled, the varchar column is forcibly converted to a text type and a warning is generated.

c) row length limit

In practice, the defined length of the row is what limits the length of a varchar. MySQL requires that the defined length of a row not exceed 65535. If the defined row length exceeds this value, MySQL reports:

ERROR 1118 (42000): Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. You have to change some columns to TEXT or BLOBs.


2. Calculation Example

Two examples illustrate how the actual length is calculated.

a) If a table has only one varchar column, defined as

CREATE TABLE t4 (c VARCHAR(N)) CHARSET=gbk;

the maximum value of N is (65535-1-2)/2 = 32766.

1 is subtracted because the actual row storage starts at the second byte;

2 is subtracted because the varchar header uses 2 bytes to record the length;

the result is divided by 2 because the character encoding is GBK.
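The arithmetic in example a) can be checked with a short Python sketch. The function name max_single_varchar_chars is made up for illustration; the constants are the ones given in the text (65535 bytes per row, 1 byte before the row data, 2 bytes for the length prefix).

```python
MAX_ROW_BYTES = 65535  # limit on the defined length of a row, from the text


def max_single_varchar_chars(bytes_per_char: int) -> int:
    """Largest N for a table whose only column is VARCHAR(N)."""
    # 1 byte: row storage starts at the second byte.
    # 2 bytes: varchar length prefix.
    usable = MAX_ROW_BYTES - 1 - 2
    # Integer division, because a partial character cannot be stored.
    return usable // bytes_per_char


print(max_single_varchar_chars(2))  # GBK, 2 bytes per character: 32766
print(max_single_varchar_chars(3))  # UTF8, 3 bytes per character: 21844
```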

b) If a table is defined as

CREATE TABLE t4 (c INT, c2 CHAR(30), c3 VARCHAR(N)) CHARSET=utf8;

the maximum value of N is (65535-1-2-4-30*3)/3 = 21812.

1 and 2 are subtracted for the same reasons as in the previous example;

4 is subtracted because the int column c takes 4 bytes;

30*3 is subtracted because char(30) takes 90 bytes when the encoding is UTF8.
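The same calculation, generalized to a table with other fixed-size columns, might look like the following Python sketch. The function max_last_varchar_chars is a made-up name, and the sketch assumes only what the text states: an int takes 4 bytes and a char(30) in UTF8 takes 30*3 bytes.

```python
MAX_ROW_BYTES = 65535  # limit on the defined length of a row, from the text


def max_last_varchar_chars(other_column_bytes, bytes_per_char: int) -> int:
    """Largest N for one VARCHAR(N) column next to other fixed-size columns."""
    # 1 byte before the row data, 2 bytes for the varchar length prefix,
    # then whatever the other columns are defined to occupy.
    usable = MAX_ROW_BYTES - 1 - 2 - sum(other_column_bytes)
    return usable // bytes_per_char


int_bytes = 4            # the int column c
char30_bytes = 30 * 3    # char(30) with 3 bytes per UTF8 character

print(max_last_varchar_chars([int_bytes, char30_bytes], 3))  # 21812
```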

If a varchar exceeds rule b) above and is forcibly converted to a text type, each such column counts for 11 bytes of the defined row length. Of course, it is then no longer a varchar.

===========================================================
From: http://www.cnblogs.com/doit8791/archive/2012/05/28/2522556.html

While designing the database for a new project today, I found that I was not familiar with MySQL's varchar type, so I collected material from the web and organized it as follows.

1. Changes to the varchar type

In versions of MySQL before 5.0.3, the maximum length of the varchar type is 255; the allowed range is 0~255 or 1~255, depending on the version. From MySQL 5.0.3 on, varchar supports lengths up to 65535, which means it can store up to 65,532 bytes of data, because the row's leading byte and the length prefix take up 3 bytes. Data that had to be stored as text or blob in earlier versions can now be stored in a variable-length varchar, which effectively reduces the size of the database file.

In versions below 4.1, nvarchar (which stores Unicode data) stores every character as 2 bytes, whether it is an English letter or a Chinese character; it is generally used for input in Chinese or other languages and is less prone to garbled text. varchar stores a Chinese character as 2 bytes and any other character as 1 byte, and is suited to English letters and digits.

In MySQL 4.0, varchar(20) means 20 bytes, so it can hold only 6 UTF8 Chinese characters (3 bytes each). In 5.0 and later, varchar(20) means 20 characters: whether the content is digits, letters, or UTF8 Chinese characters (3 bytes each), 20 of them can be stored, up to a maximum size of 65532 bytes. In other words, varchar(20) holds at most 20 bytes in MySQL 4, while in MySQL 5 the storage size depends on the encoding, according to the following rules:
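The byte-based and character-based interpretations of varchar(20) can be compared with a small Python model. It is a sketch of the counting rule only, not of MySQL itself; the function chars_that_fit and its unit parameter are made up for this example.

```python
def chars_that_fit(text: str, limit: int, unit: str,
                   encoding: str = "utf-8") -> int:
    """Count how many leading characters of `text` fit in a column of `limit`.

    unit="bytes" models the MySQL 4.0 rule (the limit counts bytes);
    unit="chars" models the 5.0+ rule (the limit counts characters).
    """
    if unit == "chars":
        return min(len(text), limit)
    used = 0
    count = 0
    for ch in text:
        size = len(ch.encode(encoding))
        if used + size > limit:
            break  # the next character would not fit entirely
        used += size
        count += 1
    return count


hanzi = "汉" * 30  # each character is 3 bytes in UTF-8

print(chars_that_fit(hanzi, 20, "bytes"))  # 6
print(chars_that_fit(hanzi, 20, "chars"))  # 20
```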

a) Storage limits

A varchar column stores its actual content separately from the clustered index, and the content begins with 1 to 2 bytes that record its actual length (2 bytes when the length exceeds 255), so the maximum length cannot exceed 65535.

b) Encoding length limit

If the character type is GBK, each character can be up to 2 bytes, and the maximum length cannot exceed 32766.

If the character type is UTF8, each character can be up to 3 bytes and the maximum length cannot exceed 21845.

If a column is defined beyond these limits and strict SQL mode is not enabled, the varchar column is forcibly converted to a text type and a warning is generated.

c) Row length limit

In practice, the defined length of the row is what limits the length of a varchar. MySQL requires that the defined length of a row not exceed 65535. If the defined row length exceeds this value, MySQL reports:

ERROR 1118 (42000): Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. You have to change some columns to TEXT or BLOBs.

2. Differences between CHAR(M) and VARCHAR(M)

CHAR(M) defines a fixed-length column, where M can be between 0 and 255. When a char value is stored, it is padded on the right with spaces to the specified length. When a char value is retrieved, the trailing spaces are removed. No case conversion takes place during storage or retrieval. char is convenient for fixed-length data, and an index on a char column is more efficient. For example, with char(10), the value always takes up the space of 10 characters whether or not the stored data reaches that length; anything shorter is automatically padded with spaces.

VARCHAR(M) defines a variable-length string column, where M can be between 0 and 65535. (The maximum effective length of a varchar is determined by the maximum row size and the character set used. The overall maximum length is 65,532 bytes.) A varchar value is stored using only as many characters as needed, plus one byte to record the length (two bytes if the column is declared longer than 255). A varchar value is not padded when it is stored. Trailing spaces are preserved when the value is stored and retrieved, in conformance with standard SQL. varchar stores variable-length data, but its storage efficiency is no higher than that of char. If the possible values of a column are not of fixed length and we only know that they cannot exceed 10 characters, defining the column as VARCHAR(10) is the most economical choice. The storage used by a varchar value is the actual length of the value plus 1. Why "+1"? That byte holds the length actually used. From the point of view of space, varchar is the better fit; from the point of view of efficiency, char is the better fit. The key is to find the tradeoff that suits the actual situation.
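The padding and trailing-space behavior described above can be modeled in Python. This is a simplified sketch for a single-byte character set (so characters and bytes coincide); the function names store_char, read_char, and varchar_size are made up for illustration.

```python
def store_char(value: str, m: int) -> str:
    """CHAR(m): pad on the right with spaces to the fixed length."""
    return value.ljust(m)


def read_char(stored: str) -> str:
    """CHAR(m): trailing spaces are removed on retrieval."""
    return stored.rstrip(" ")


def varchar_size(value: str, m: int) -> int:
    """VARCHAR(m): length prefix plus the value, with no padding.

    Assumes a single-byte character set, so m characters are m bytes.
    """
    prefix = 1 if m <= 255 else 2
    return prefix + len(value)


stored = store_char("ab", 10)
print(len(stored))               # 10: char(10) always uses the full width
print(repr(read_char(stored)))   # 'ab': padding is stripped when read back
print(varchar_size("ab", 10))    # 3: two characters plus a 1-byte prefix
print(varchar_size("ab  ", 10))  # 5: trailing spaces are kept and counted
```

A side effect of this model is that a value stored in char with its own trailing spaces comes back without them, while varchar returns it unchanged.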

The biggest difference between char and varchar is that one is fixed length and the other is variable length. Because varchar is variable length, what is actually stored is the string itself plus one byte that records the length of the string (two bytes if the length exceeds 255). If the value assigned to a char or varchar column exceeds the maximum length of the column, the value is truncated to fit. If the truncated characters are not spaces, a warning is generated. With strict SQL mode enabled, truncating non-space characters causes an error (rather than a warning) and the value is not inserted.

3. Differences between varchar and the text and blob types

varchar, blob, and text are variable-length types. Their storage requirements depend on the actual length of the column value (denoted by L), not on the maximum possible size of the type. For example, a varchar(10) column can hold a string of at most 10 characters; the actual storage needed is the length of the string plus 1 byte to record that length. For the string 'abcd', L is 4 and the storage requirement is 5 bytes.

The blob and text types need 1, 2, 3, or 4 bytes to record the length of the column value, depending on the maximum possible length of the type. varchar requires a declared length, up to a maximum of 65535 bytes, whereas text does not. If you assign a value that exceeds the maximum length of the column type to a blob or text column, the value is truncated to fit.

A blob is a large binary object that can hold a variable amount of data. The 4 blob types, TINYBLOB, BLOB, MEDIUMBLOB, and LONGBLOB, differ only in the maximum length of the values they can hold.
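The link between the length prefix and the maximum value length of each blob type can be worked out in Python. The sketch assumes the prefix sizes listed in the MySQL manual (1, 2, 3, and 4 bytes for TINYBLOB, BLOB, MEDIUMBLOB, and LONGBLOB); the names BLOB_PREFIX_BYTES, max_length, and storage_bytes are made up for illustration.

```python
# Length-prefix size in bytes for each blob type (assumed from the manual).
BLOB_PREFIX_BYTES = {
    "TINYBLOB": 1,
    "BLOB": 2,
    "MEDIUMBLOB": 3,
    "LONGBLOB": 4,
}


def max_length(prefix_bytes: int) -> int:
    """Largest length that fits in an unsigned prefix of this many bytes."""
    return 2 ** (8 * prefix_bytes) - 1


def storage_bytes(blob_type: str, value: bytes) -> int:
    """Storage requirement: L bytes of data plus the length prefix."""
    prefix = BLOB_PREFIX_BYTES[blob_type]
    if len(value) > max_length(prefix):
        raise ValueError("value too long for " + blob_type)
    return len(value) + prefix


for name, prefix in BLOB_PREFIX_BYTES.items():
    print(name, prefix, max_length(prefix))

print(storage_bytes("BLOB", b"abcd"))  # 6: L is 4, plus a 2-byte prefix
```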

A blob can store pictures; text cannot, since it can hold only plain text. The 4 text types, TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT, correspond to the 4 blob types and have the same maximum lengths and storage requirements. The only difference between the blob and text types is that sorting and comparison of blob values is case-sensitive, while for text values it is case-insensitive. In other words, a text is a case-insensitive blob.

4. Summary of the differences between char, varchar, and text

Length: char ranges over 0~255. varchar can be up to 64k, but note that 64k is the length of the entire row, so the other columns must be taken into account, and columns that allow NULL also take up one bit. The effective length differs by character set; for UTF8 it is at most 21845, minus whatever the other columns use. In general, though, varchar is large enough. For large text, consider the text types, which can hold up to 4G.

Efficiency: roughly char > varchar > text, but with the InnoDB engine, varchar is recommended over char.

Default values: char and varchar can have default values; text cannot.

Choosing an appropriate data type for storage matters, because it has a real effect on performance. Two miscellaneous notes: for int columns, if you do not need to store negative values, prefer unsigned; for columns that often appear in WHERE clauses, consider adding an index.
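The benefit of unsigned for an int column can be seen from the value ranges. The short Python sketch below assumes a 4-byte (32-bit) int, as stated earlier in the text; the function name int_range is made up for illustration.

```python
def int_range(bits: int, unsigned: bool):
    """Return the (min, max) values of an integer type with this many bits."""
    if unsigned:
        return 0, 2 ** bits - 1
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


print(int_range(32, unsigned=False))  # (-2147483648, 2147483647)
print(int_range(32, unsigned=True))   # (0, 4294967295)
```

The storage cost is the same 4 bytes either way; unsigned simply moves the whole range to non-negative values, roughly doubling the largest value that fits.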
