Java Learning Article---Summary of varchar types in MySQL

Source: Internet
Author: User

Summary of varchar types in MySQL

Today I started a new project and had to design the database myself. I found I was not very familiar with how to use varchar, so I looked through the material and summarized it as follows:

1. Changes in the varchar type

In MySQL versions before 5.0.3, the maximum length of the varchar type is 255, and the allowed range is 0~255 or 1~255 depending on the version. From MySQL 5.0.3 onward, varchar can be declared with a length of up to 65535. In practice at most 65,532 bytes of data can be stored, because up to 3 bytes of the row are taken by overhead: 2 bytes for the length prefix and 1 byte for the NULL flag. This means that data which had to be stored in a text or blob column in older versions can now be stored in a variable-length varchar, which can effectively reduce the size of the database files.

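In MySQL versions below 4.1, the length of a varchar is counted in bytes: with an encoding such as GBK, a Chinese character takes 2 bytes and any other character takes 1 byte, so varchar is best suited to English letters and digits. By contrast, nvarchar (which stores Unicode data) stores every character as 2 bytes, whether it is an English letter or a Chinese character. It is generally used for input in Chinese or other languages, where it is less likely to produce garbled text. (This 2-bytes-per-character description matches nvarchar in systems such as SQL Server; in MySQL 4.1 and later, NVARCHAR is simply a varchar that uses the national character set.)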
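The per-character byte counts mentioned above can be checked outside the database. The following is a minimal Python sketch. It uses Python's built-in gbk and utf-8 codecs as stand-ins for MySQL's GBK and UTF8 character sets, which is an assumption made for illustration, and the helper name byte_length is made up for this example.

```python
def byte_length(text, encoding):
    """Return the number of bytes `text` occupies in the given encoding."""
    return len(text.encode(encoding))

# One Chinese character versus one ASCII letter
print(byte_length("汉", "gbk"))     # 2
print(byte_length("汉", "utf-8"))   # 3
print(byte_length("a", "gbk"))      # 1
print(byte_length("a", "utf-8"))    # 1

# How many 3-byte characters fit into a column limited to 20 bytes
print(20 // byte_length("汉", "utf-8"))  # 6
```

This is why a byte-counted varchar(20) holds only 6 UTF8 Chinese characters, while a character-counted varchar(20) holds 20.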

In version 4.0, varchar(20) means 20 bytes, so if you store UTF8 Chinese characters you can only fit 6 of them (3 bytes each). In version 5.0 and above, varchar(20) means 20 characters (this character-based counting dates from 4.1): whether you store digits, letters, or UTF8 Chinese characters (3 bytes each), you can store 20 of them, and the overall maximum size is 65532 bytes. So varchar(20) holds at most 20 bytes in MySQL 4, whereas in MySQL 5 the storage size depends on the encoding, according to the following rules:

a) Storage limit

A varchar field stores its actual content separately from the clustered index, and the content begins with 1 to 2 bytes that record the actual length (2 bytes when the length exceeds 255), so the maximum length cannot exceed 65535.

b) Encoding length limit

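If the character set is GBK, each character takes up to 2 bytes, and the maximum length cannot exceed 32766.

If the character set is UTF8, each character takes up to 3 bytes, and the maximum declared length cannot exceed 21845 (65535 / 3). Once the length prefix is counted against the row size limit, the largest length that actually fits in a table is 21844.

If the defined length exceeds the limits above, the varchar field is forcibly converted to a text type and a warning is generated. In strict SQL mode the statement fails with an error instead.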

c) Row length limit

In practice, what limits the length of a varchar is the defined length of the row. MySQL requires that the defined length of a row not exceed 65535 bytes. If the table definition exceeds this value, the following error is reported:

ERROR 1118 (42000): Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. You have to change some columns to TEXT or BLOBs
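The three limits above can be combined into a small calculation. The following Python sketch is only an illustration under stated assumptions: a 2-byte length prefix, 1 byte for the NULL flag of a nullable column, and the 65535-byte row limit quoted in the error message. The function name max_varchar_chars and its parameters are hypothetical.

```python
MAX_ROW_BYTES = 65535  # row size limit quoted in the error message above

def max_varchar_chars(bytes_per_char, nullable=True, other_columns_bytes=0):
    """Largest M for one VARCHAR(M) column that still fits in a row.

    Assumes a 2-byte length prefix (the column is longer than 255 bytes)
    and 1 byte for the NULL flag when the column is nullable.
    """
    overhead = 2 + (1 if nullable else 0)
    available = MAX_ROW_BYTES - overhead - other_columns_bytes
    return max(available // bytes_per_char, 0)

print(max_varchar_chars(1))                  # latin1, nullable: 65532
print(max_varchar_chars(1, nullable=False))  # latin1, NOT NULL: 65533
print(max_varchar_chars(2))                  # GBK: 32766
print(max_varchar_chars(3))                  # UTF8 (3 bytes per character): 21844
```

Note that for UTF8 this gives 21844 rather than 21845: 65535 // 3 = 21845 is the ceiling on the declared length, while the sketch also subtracts the per-column overhead from the row. Any other columns in the table reduce the result further through other_columns_bytes.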

2. Differences between CHAR(M) and VARCHAR(M)

CHAR(M) defines a fixed-length column, where M can be between 0 and 255. When a char value is saved, it is padded on the right with spaces to the specified length. When a char value is retrieved, the trailing spaces are removed. No case conversion takes place during storage or retrieval. Char is convenient for storing fixed-length data, and indexes on char fields are more efficient. For example, with char(10), whether or not the stored data reaches 10 bytes, it takes up 10 bytes of space, and shorter values are automatically padded with spaces.
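The pad-on-store, strip-on-retrieve behavior of char can be modeled in a few lines. This is a simplified Python sketch, not MySQL's implementation; the function names char_store and char_retrieve are made up, and truncation of over-long values is deliberately left out.

```python
def char_store(value, m):
    """Model storing into CHAR(m): right-pad with spaces to m characters."""
    if len(value) > m:
        raise ValueError("value longer than column; truncation is not modeled here")
    return value.ljust(m)

def char_retrieve(stored):
    """Model reading from CHAR(m): trailing spaces are removed."""
    return stored.rstrip(" ")

stored = char_store("abc", 10)
print(len(stored))                   # 10
print(repr(char_retrieve(stored)))   # 'abc'

# Trailing spaces that were part of the original value are lost as well
print(repr(char_retrieve(char_store("abc  ", 10))))  # 'abc'
```

The last line shows a side effect of this scheme: a char column cannot distinguish 'abc' from 'abc  ' after a round trip.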

VARCHAR(M) defines a variable-length string column, where M can be between 0 and 65535. (The maximum effective length of a varchar is determined by the maximum row size and the character set used; the overall maximum length is 65,532 bytes.) A varchar value is saved with only as many characters as it needs, plus one byte to record the length (two bytes if the column is declared longer than 255). A varchar value is not padded when it is saved, and trailing spaces are preserved when the value is saved and retrieved, in conformance with standard SQL.

Varchar stores variable-length data, but its storage efficiency is not as high as char's. If the possible values of a field are not of fixed length and we only know that they cannot exceed 10 characters, defining the field as VARCHAR(10) is the most economical. The space a varchar actually occupies is the actual length of its value plus 1. Why "+1"? The extra byte holds the length actually used. From the point of view of space, varchar is the better fit; from the point of view of efficiency, char is the better fit. The key is to find the balance point for your actual situation.
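To make the space tradeoff concrete, here is a rough Python sketch that compares the bytes taken by a CHAR(M) and a VARCHAR(M) column. It assumes a single-byte character set by default, and the function names and parameters are illustrative rather than anything MySQL exposes.

```python
def char_bytes(m, bytes_per_char=1):
    """CHAR(m) always occupies m characters' worth of space."""
    return m * bytes_per_char

def varchar_bytes(value, m, encoding="latin-1", max_bytes_per_char=1):
    """VARCHAR(m) occupies the value's bytes plus a 1- or 2-byte length prefix."""
    if len(value) > m:
        raise ValueError("value longer than declared length")
    # The prefix is 2 bytes when the column could need more than 255 bytes.
    prefix = 1 if m * max_bytes_per_char <= 255 else 2
    return len(value.encode(encoding)) + prefix

print(char_bytes(10))                    # 10
print(varchar_bytes("ABCD", 10))         # 5
print(varchar_bytes("ABCDEFGHIJ", 10))   # 11
print(varchar_bytes("ABCD", 300))        # 6
```

For short values varchar saves space, but once every value fills the declared length, varchar costs one byte more than char.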

The biggest difference between char and varchar is that one is fixed length and the other is variable length. Because a varchar is variable length, what is actually stored is the string itself plus one byte that records the length of the string (two bytes if the length exceeds 255). If the value assigned to a char or varchar column exceeds the maximum length of the column, the value is truncated to fit. If the truncated characters are not spaces, a warning is generated. In strict SQL mode, truncating non-space characters causes an error (rather than a warning) and the value is not inserted.
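The truncation rules just described can be sketched as a small model. This Python example only mirrors the behavior stated in the paragraph above; the function name assign, the strict flag, and the message texts are illustrative assumptions and not MySQL's own API.

```python
import warnings

def assign(value, m, strict=False):
    """Model assigning `value` to a CHAR(m) or VARCHAR(m) column."""
    if len(value) <= m:
        return value
    excess = value[m:]
    if excess.strip(" ") == "":
        # Only spaces are cut off: the value is simply clipped to fit.
        return value[:m]
    if strict:
        # Strict SQL mode: reject the value instead of truncating it.
        raise ValueError("Data too long for column")
    warnings.warn("Data truncated for column")
    return value[:m]

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    print(assign("hello world", 5))   # hello
    print(len(caught))                # 1

try:
    assign("hello world", 5, strict=True)
except ValueError as exc:
    print("rejected:", exc)
```

In the non-strict case the caller gets a shortened value and a warning; in the strict case nothing is stored.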

3. Differences between varchar and the text and blob types

Varchar, blob, and text are all variable-length types. Their storage requirements depend on the actual length of the column value (denoted by L), not on the maximum possible size of the type. For example, a varchar(10) column can hold a string of up to 10 characters, and the actual storage required is the length of the string plus 1 byte to record that length. For the string 'ABCD', L is 4 and the storage requirement is 5 bytes.

The blob and text types need 1, 2, 3, or 4 bytes to record the length of the column value, depending on the maximum possible length of the type. Varchar requires a declared size, with a maximum of 65535 bytes, while text does not. If you assign a value that exceeds the maximum length of the column type to a blob or text column, the value is truncated to fit.

A blob is a binary large object that can hold a variable amount of data. The 4 blob types, TINYBLOB, BLOB, MEDIUMBLOB, and LONGBLOB, differ only in the maximum length of the values they can hold.

Blobs can store binary data such as pictures; text cannot, and can only hold plain text. The 4 text types, TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT, correspond to the 4 blob types and have the same maximum lengths and storage requirements. The main difference between blob and text is that blob values are sorted and compared byte by byte, so comparisons are case-sensitive, while text values are sorted and compared according to the collation of their character set, which by default is case-insensitive. In other words, with a case-insensitive collation, a text is a case-insensitive blob.
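The maximum length of each blob and text type follows directly from the width of its length prefix. The short Python sketch below derives the limits from the prefix sizes; the dictionary and function names are made up for this example.

```python
# Bytes used for the length prefix of each blob/text type
LENGTH_PREFIX_BYTES = {
    "TINYBLOB / TINYTEXT": 1,
    "BLOB / TEXT": 2,
    "MEDIUMBLOB / MEDIUMTEXT": 3,
    "LONGBLOB / LONGTEXT": 4,
}

def max_length(prefix_bytes):
    """Largest length an unsigned prefix of this width can record."""
    return 2 ** (8 * prefix_bytes) - 1

for name, width in LENGTH_PREFIX_BYTES.items():
    print(f"{name}: {width}-byte prefix, up to {max_length(width)} bytes")
```

The four results are 255, 65535, 16777215, and 4294967295 bytes, the last being the roughly 4 GB limit of LONGBLOB and LONGTEXT.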

4. Differences among char, varchar, and text

Length: char ranges from 0 to 255. Varchar can be at most 64K, but note that 64K is the length of the entire row, so the other columns must be taken into account, and a column that allows NULL also takes up one bit. The effective length also differs by character set: with UTF8 it is at most 21845, from which the other columns still have to be subtracted. In general, though, varchar is enough for storage. If you have to store large text, consider text, which can hold up to 4G (LONGTEXT).

Efficiency: basically char > varchar > text, but if you are using the InnoDB engine, varchar is recommended over char.

Char and varchar can have default values, while text cannot specify one (MySQL 8.0.13 and later accept a default for text only when it is written as an expression).

Choosing the appropriate data type for storage matters for a database and has a real effect on performance. Two scattered notes: for int columns that never need negative values, prefer unsigned; for columns that often appear in WHERE clauses, consider adding an index, and integer columns are especially well suited to indexing.

5. The difference between varchar(10) and varchar(100)

At first, most people learn that the two take up the same space: if I store 5 characters, both actually take up 5 characters. (Erratum: a varchar is actually stored with an extra byte that holds the length.)
So are the two really the same when designing a database?
The answer is no. At the very least, a varchar needs one or two bytes in front of the data to store its length, and the way the two are handled in memory also differs, as the following example shows.
Suppose a user needs to store address information, and the assessment is that 100 characters are enough. Some database administrators, however, reason that a varchar allocates space according to actual need, so it is better to make it bigger, and they allocate 200 characters for the field right away. Is varchar(100) really the same as varchar(200)? The answer is no. When both are used to store 90 characters of data, their storage space on disk is the same, but their memory consumption differs. For a varchar, the space on disk is allocated according to the actual length of the data, but this is not the case in memory, where a fixed-size block is used to hold the value. Simply put, the block is sized by the declared length, which here is 200 characters. Obviously, this can have a significant adverse effect on sorting and on temporary tables, both of which are carried out in memory. So if a field is involved in file sorts or disk-based temporary tables, do not be too generous when sizing the varchar. Instead, assess the length actually needed and set the length from the longest expected value. If you want some slack, leave about 10% extra. Do not assume that because space is allocated by actual length you can size the column arbitrarily, or simply use the maximum length.
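The disk-versus-memory argument can be put into rough numbers. The following Python sketch is a deliberately simplified model of the claim above: it assumes a single-byte character set and that an in-memory sort or temporary table reserves the full declared length for every row. Actual behavior varies by MySQL version and storage engine, and the function names are hypothetical.

```python
def on_disk_bytes(actual_chars, bytes_per_char=1, prefix=1):
    """Approximate disk space for one varchar value: data plus length prefix."""
    return actual_chars * bytes_per_char + prefix

def fixed_buffer_bytes(declared_chars, rows, bytes_per_char=1):
    """Approximate memory if every row reserves the full declared length."""
    return declared_chars * bytes_per_char * rows

rows = 1_000_000
actual = 90  # every address is 90 characters long in this example

# On disk, varchar(100) and varchar(200) cost the same per value
print(on_disk_bytes(actual), on_disk_bytes(actual))   # 91 91

# In memory, the declared length decides the size of the buffer
print(fixed_buffer_bytes(100, rows))   # 100000000
print(fixed_buffer_bytes(200, rows))   # 200000000
```

Under these assumptions, doubling the declared length doubles the memory needed to sort the same data, even though not a single extra byte is stored on disk.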

6. The pros and cons of char

1) From the point of view of fragmentation: with the char type, the storage space is allocated once, so the contents of a field are stored together, and there is no fragmentation problem. A variable-length type has a variable storage length. When the length of the data differs before and after a change, fragmentation inevitably occurs. Therefore, when variable-length types are used, the database administrator needs to defragment periodically, for example by exporting and re-importing the database.
2) Consider whether the lengths are similar. If the values of a field are fairly long but always of similar length, for example between 90 and 100 characters, or even exactly the same length, the char type is more appropriate. A typical application is the MD5 hash value: when user passwords are stored as MD5 hashes, char is a very good fit because the length is always the same. Char is also generally recommended for other fixed-length data, such as a user's ID card number.
Consider one more question: what is the difference between CHAR(1) and VARCHAR(1)? Although both can hold only a single character, varchar takes up one more storage position than char, because a varchar uses 1 extra byte to store the length information. The char type does not have this overhead.

3) The space used on disk is the same, but there are two differences. First, one is variable length and the other is fixed length. Second, for in-memory operations a varchar is handled at its maximum declared length. For example, when sorting, a varchar(100) is processed as if it were 100 characters long.

4) Although the maximum length of a MySQL varchar field is 65535, it cannot actually hold that much data. It can hold at most 65533 bytes when the column is declared NOT NULL, and only 65532 when NULL is allowed. "When NULL is allowed, varchar(65532) will be 2 bytes (length) + up to 65532 chars (latin1) + 1 null byte."

5) Note that in the MySQL versions discussed here, all collations are of the PAD SPACE type (MySQL 8.0 later added NO PAD collations such as utf8mb4_0900_ai_ci). This means that trailing spaces are not taken into account when char and varchar values are compared. This behavior is not affected by the SQL mode. If a unique index is created on a char or varchar column, then inserting 'a' and 'a ' causes a duplicate-key error.
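The effect of PAD SPACE comparison on a unique index can be illustrated with a toy model. This Python sketch only models the trailing-space rule described in item 5; case-insensitivity and other collation details are ignored, and the names pad_space_key and UniqueIndex are invented for this example.

```python
def pad_space_key(value):
    """Comparison key under a PAD SPACE collation: trailing spaces are ignored."""
    return value.rstrip(" ")

class UniqueIndex:
    """Toy unique index that compares keys the PAD SPACE way."""

    def __init__(self):
        self._keys = set()

    def insert(self, value):
        key = pad_space_key(value)
        if key in self._keys:
            raise KeyError(f"Duplicate entry {value!r}")
        self._keys.add(key)

index = UniqueIndex()
index.insert("a")
try:
    index.insert("a ")   # differs from "a" only by a trailing space
except KeyError as exc:
    print("duplicate-key error:", exc)
```

Leading spaces are still significant in this model, so " a" and "a" would be treated as different keys.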

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
