1, the logical meaning of varchar (N)
Starting with MySQL4.1, n in varchar (n) refers to the maximum number of characters (characters) that can be stored in the field, not the number of bytes.
Whether it's a Chinese-English character or a number, or a kanji, it's a character. Before 4.1, n represents the maximum number of bytes of storage (bytes).
2, varchar (N) in the end can save how long data
On the MySQL reference manual, varchar can store up to 65,535 bytes of data. The maximum length of the varchar is limited to the maximum line length (max Row size,65535bytes). 65535 is not a very precise cap and can continue to shrink that limit. The 65,535 bytes include the length of all fields, the length of the variable long field (each variable length field uses 1 or 2 bytes to record the actual data length), and the cumulative of the null identity bit.
The null identity bit, if the varchar field definition has default NULL to allow the column to be empty, it needs to be identified by 1bit, and the identity of each 8 bits makes up a field. There are N varchar fields in a table, then (N+7)/8 (rounding) is required to store all the null identifiers. bytes
If the datasheet has only one varchar field and the field is default NULL, the maximum length of the varchar field is 65,532 bytes, that is, the 65535-2-1=65532 bytes.
Copy Code code as follows:
CREATE TABLE ' vchar1 ' (' Name ' VARCHAR (65533) not NULL) Engine=innodb DEFAULT charset=latin1;
CREATE TABLE ' vchar2 ' (' Name ' VARCHAR (65533) not NULL) Engine=myisam DEFAULT charset=latin1;
If the datasheet has only one varchar field and the field is not NULL, the maximum length of the varchar field is 65,533 bytes, that is, 65535-2=65533bytes.
Copy Code code as follows:
CREATE TABLE ' vchar3 ' (' Name ' VARCHAR (65532) default NULL) Engine=innodb default charset=latin1;
CREATE TABLE ' vchar4 ' (' Name ' VARCHAR (65532) default NULL) Engine=myisam default charset=latin1;
To a slightly more complex table structure,->
Copy Code code as follows:
CREATE TABLE ' TV ' (
' A ' VARCHAR (MB) DEFAULT NULL,
' B ' VARCHAR (MB) DEFAULT NULL,
' C ' VARCHAR (MB) DEFAULT NULL,
' d ' VARCHAR (MB) DEFAULT NULL,
' E ' VARCHAR (MB) DEFAULT NULL,
' F ' VARCHAR (MB) DEFAULT NULL,
' G ' VARCHAR (MB) DEFAULT NULL,
' H ' VARCHAR (MB) DEFAULT NULL,
' I ' VARCHAR (N) DEFAULT NULL
) Engine=innodb DEFAULT charset=latin1
What can be the maximum number of n in ' i ' varchar (n) default null?
This calculation: The known determined field length is 100*8 bytes,8 varchar (100) fields require a variable length field in total to indicate that the byte is 1*8=8 bytes. Each null field is identified with a 1bit, 9 fields are default NULL, and you need to store a null identity bit with a (9+7)/8bit = 2 bytes. 65535-100*8-1*8-2 = 64725 > 256, then the maximum length of the field i is 64725-2 =64723 bytes, or n=64723.
How many characters can varchar save? This is related to the character set used, where the latin1, GBK, and UTF8 encodings take up 1, 2, and 3 bytes for each character.
3. VarChar Physical storage
On physical storage, varchar uses 1 to 2 additional bytes to represent the actual stored string length (bytes). If the maximum length of a column is less than 256 bytes, it is represented by a byte (identity). If the maximum length is greater than or equal to 256, use two bytes.
When the selected character set is Latin1, a character occupies a byte
varchar (255) stores one character, using a total of 2 bytes physical spaces to store the actual data length and data values of the data.
VARCHAR (256) stores one character, using 2 bytes to represent the actual data length, which requires 3 bytes physical storage space.
varchar for different RDBMS engines, there is a physical storage mode, although there is a unified logical meaning. For MySQL different storage engine, its implementation method and data physical storage way is also different.
4, the varchar in the InnoDB
The physical storage mode of varchar in INNODB is related to the Innodb_file_format used by InnoDB. Early Innodb_file_forma used the Antelope file format to support redundant and compact two kinds of row_format. Starting with 5.5 or InnoDB1.1, you can use a new file Format,barracuda. Barracuda compatible with redundant, and also supports dynamic and compressed two kinds of row_format.
When innodb_file_format=antelope,row_format=redundant or compact.
The InnoDB clustered index (cluster index) stores only the first 768 bytes of the varchar, text, and Blob fields, and the extra bytes are stored in a separate overflow page, which is also called Off-page. The 768-byte prefix is followed by a 20-byte pointer pointing to the location of the overflow pages.
In addition, in the case of Innodb_file_format=antelope, the InnoDB can store up to 10 large segments (the use of off-page storage is required). The default page size for INNODBD is 16kb,innodb for the length of a single line that cannot exceed 16k/2=8k bytes, (768+20) *10 < 8k.
When Innodb_file_format=barracuda, row_format=dynamic or compressed
All the varchar, text, and BLOB field data in the InnoDB is completely off-page stored, depending on the length of the field and the total length of the entire row. For off-page-stored columns, only 20-byte pointers are stored in cluster index, pointing to the actual overflow page storage location. If the length of a single line is too large to fit fully cluster index PAGE,INNODB will select the longest column as the Off-page store until the length of the row can fit cluster index page.
5, the varchar in the MyISAM
For the MyISAM engine, all data in the varchar field is stored in the data row (In-line). The Row_format of the MyISAM table also affects the physical storage behavior of varchar.
MyISAM Row_format can be set to fixed and dynamic by a create or alter SQL statement. In addition, the Row_format=compresse storage format can be generated by myisampack.
When the MyISAM table does not have a text or blob Type field, the Row_format can be set to fixed (or dynamic), or dynamic only.
When the varchar field exists in the table, the Row_format can be set to fixed or dynamic. Use row_format=fixed to store varchar field data, wasting storage space, varchar storage at this time. Row_format is also different for both fixed and Dynamic,varchar physical implementations (you can view source code files Field.h and field.cc), so MyISAM Row_ When the format transitions between fixed and dynamic, the physical storage of the varchar field will also change.