1. Logical meaning of varchar(N)
Since MySQL 4.1, N in varchar(N) is the maximum number of characters the column can store, not the number of bytes.
A letter, a digit, and a Chinese character each count as one character. Before 4.1, N was the maximum number of bytes stored.
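The difference between counting characters and counting bytes can be illustrated outside MySQL. The following is only a sketch in Python, with a sample string chosen for illustration; it shows that the same three characters occupy a different number of bytes depending on the encoding.

```python
# Character count vs. byte count for the same string.
# The sample string is an illustrative assumption: one letter,
# one digit, and one Chinese character.
text = "a1中"

# What varchar(N) counts since MySQL 4.1: characters.
char_count = len(text)

# What the pre-4.1 definition counted: bytes, which depend on the encoding.
byte_counts = {enc: len(text.encode(enc)) for enc in ("gbk", "utf-8")}

print(char_count)   # 3
print(byte_counts)  # {'gbk': 4, 'utf-8': 5}
```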
2. How much data can varchar(N) store?
According to the MySQL reference manual, a varchar can store up to 65535 bytes of data. In practice, the maximum length of a varchar is bounded by the maximum row size (65535 bytes), so 65535 is only an upper bound and can be narrowed further. The row size includes the lengths of all columns, the length prefix of each variable-length column (one or two extra bytes that record the actual data length), and the space taken by the NULL flags.
NULL flags. If a column is declared DEFAULT NULL, that is, it is allowed to be NULL, 1 bit is needed to flag it. Every 8 flag bits are packed into one byte, so a table with N nullable columns needs (N + 7)/8 bytes for its NULL flags (integer division, which is the same as N/8 rounded up).
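As a quick check of the NULL-flag formula, here is a minimal sketch in Python. The function name is made up for illustration; it only restates the arithmetic described above.

```python
def null_flag_bytes(nullable_columns: int) -> int:
    """Bytes needed to hold one NULL bit per nullable column.

    Hypothetical helper: integer division of (n + 7) by 8,
    which equals n / 8 rounded up.
    """
    return (nullable_columns + 7) // 8


for n in (0, 1, 8, 9, 16, 17):
    print(n, null_flag_bytes(n))
```

With no nullable columns no byte is needed, 1 to 8 nullable columns share one byte, and the ninth column starts a second byte.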
If the table has only one varchar column and that column is NOT NULL, its maximum length is 65533 bytes, that is, 65535 - 2 = 65533 bytes.
CREATE TABLE `vchar1` (`name` VARCHAR(65533) NOT NULL) ENGINE = InnoDB DEFAULT CHARSET = latin1;
CREATE TABLE `vchar2` (`name` VARCHAR(65533) NOT NULL) ENGINE = MyISAM DEFAULT CHARSET = latin1;
If the table has only one varchar column and that column is DEFAULT NULL, its maximum length is 65532 bytes, that is, 65535 - 2 - 1 = 65532 bytes.
CREATE TABLE `vchar3` (`name` VARCHAR(65532) DEFAULT NULL) ENGINE = InnoDB DEFAULT CHARSET = latin1;
CREATE TABLE `vchar4` (`name` VARCHAR(65532) DEFAULT NULL) ENGINE = MyISAM DEFAULT CHARSET = latin1;
Now consider a slightly more complex table structure:
CREATE TABLE `TV` (
`A` VARCHAR(100) DEFAULT NULL,
`B` VARCHAR(100) DEFAULT NULL,
`C` VARCHAR(100) DEFAULT NULL,
`D` VARCHAR(100) DEFAULT NULL,
`E` VARCHAR(100) DEFAULT NULL,
`F` VARCHAR(100) DEFAULT NULL,
`G` VARCHAR(100) DEFAULT NULL,
`H` VARCHAR(100) DEFAULT NULL,
`I` VARCHAR(N) DEFAULT NULL
) ENGINE = InnoDB DEFAULT CHARSET = latin1;
What is the maximum value of N in `I` VARCHAR(N) DEFAULT NULL?
It is calculated as follows. The eight varchar(100) columns take 100 * 8 = 800 bytes of data, and because each is shorter than 256 bytes, each needs a 1-byte length prefix, for 1 * 8 = 8 bytes. Each nullable column needs a 1-bit NULL flag, and all nine columns are DEFAULT NULL, so the NULL flags take (9 + 7)/8 = 2 bytes (integer division). That leaves 65535 - 100*8 - 1*8 - 2 = 64725 bytes. Since 64725 > 256, column I needs a 2-byte length prefix, so its maximum length is 64725 - 2 = 64723 bytes, that is, N = 64723.
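The same budget can be worked through in a few lines of Python. This is a sketch under the assumptions of this section only (a 65535-byte row limit, latin1 so that one character is one byte, and a table made up solely of varchar columns); the function names are hypothetical and it does not model any engine-specific overhead.

```python
ROW_LIMIT = 65535  # maximum row size in bytes, as stated in this section


def length_prefix_bytes(max_column_bytes: int) -> int:
    """One byte for columns shorter than 256 bytes, otherwise two."""
    return 1 if max_column_bytes < 256 else 2


def null_flag_bytes(nullable_columns: int) -> int:
    """One bit per nullable column, packed into whole bytes."""
    return (nullable_columns + 7) // 8


def max_last_varchar_bytes(other_varchar_bytes, nullable_columns):
    """Largest byte length left for one more varchar column.

    other_varchar_bytes: maximum byte lengths of the other varchar columns.
    nullable_columns: number of nullable columns, including the new one.
    """
    used = sum(size + length_prefix_bytes(size) for size in other_varchar_bytes)
    remaining = ROW_LIMIT - used - null_flag_bytes(nullable_columns)
    # A column of 256 bytes or more needs a two-byte length prefix.
    if remaining - 2 >= 256:
        return remaining - 2
    # Otherwise a one-byte prefix is enough, and the column stays under 256.
    return max(0, min(remaining - 1, 255))


print(max_last_varchar_bytes([100] * 8, nullable_columns=9))  # 64723
print(max_last_varchar_bytes([], nullable_columns=1))         # 65532
print(max_last_varchar_bytes([], nullable_columns=0))         # 65533
```

The last two calls reproduce the single-column limits for a DEFAULT NULL column and a NOT NULL column.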
How many characters can a varchar store? That depends on the character set. With latin1, gbk, and utf8, each character takes at most 1, 2, and 3 bytes respectively, and the limit on N is derived from that maximum.
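To make the dependence on the character set concrete, the sketch below divides a byte budget by the maximum bytes per character. It assumes a table with a single nullable varchar column, so the budget is 65532 bytes; the dictionary and function names are illustrative only.

```python
# Maximum bytes per character for the character sets named above.
MAX_BYTES_PER_CHAR = {"latin1": 1, "gbk": 2, "utf8": 3}


def max_varchar_chars(byte_budget: int, charset: str) -> int:
    """Largest N for varchar(N) that fits in byte_budget bytes."""
    return byte_budget // MAX_BYTES_PER_CHAR[charset]


# Byte budget for a single DEFAULT NULL varchar column: 65535 - 2 - 1.
for charset in ("latin1", "gbk", "utf8"):
    print(charset, max_varchar_chars(65532, charset))
# latin1 65532
# gbk 32766
# utf8 21844
```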
3. varchar physical storage
In physical storage, a varchar uses one or two extra bytes to record the length, in bytes, of the string actually stored. If the maximum length of the column is less than 256 bytes, one byte is used. If the maximum length is 256 bytes or more, two bytes are used.
When the character set is latin1, one character occupies one byte:
varchar(255) storing one character uses 1 byte for the actual data length and 1 byte for the value, 2 bytes of physical space in total.
varchar(256) storing one character uses 2 bytes for the actual data length and 1 byte for the value, 3 bytes of physical space in total.
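The two cases above follow the same rule: length prefix plus actual data bytes. A small Python sketch of that rule, assuming latin1 and ignoring any per-row overhead of the storage engine (the function name is hypothetical):

```python
def varchar_stored_bytes(declared_chars: int, value: str,
                         max_bytes_per_char: int = 1,
                         encoding: str = "latin1") -> int:
    """Length prefix plus data bytes for one varchar value."""
    max_column_bytes = declared_chars * max_bytes_per_char
    # The prefix size depends on the declared maximum, not on the value.
    prefix = 1 if max_column_bytes < 256 else 2
    return prefix + len(value.encode(encoding))


print(varchar_stored_bytes(255, "a"))  # 2
print(varchar_stored_bytes(256, "a"))  # 3
```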
Although varchar has the same logical meaning everywhere, its physical storage differs between RDBMSs. Within MySQL it also differs between storage engines, because each engine implements physical data storage in its own way.
4. varchar in InnoDB
How InnoDB physically stores a varchar depends on the innodb_file_format in use. Early versions used the Antelope file format, which supports two row formats: REDUNDANT and COMPACT. Starting with MySQL 5.5 (InnoDB 1.1), a new file format, Barracuda, is available. Barracuda remains compatible with the Antelope row formats and adds two more: DYNAMIC and COMPRESSED.
When innodb_file_format = Antelope, ROW_FORMAT = REDUNDANT or COMPACT.
When a varchar, text, or blob value is stored off-page, the InnoDB clustered index keeps only its first 768 bytes. The remaining bytes are stored in separate overflow pages. The 768-byte prefix is followed by a 20-byte pointer to the location of the overflow pages.
In addition, with innodb_file_format = Antelope, InnoDB can hold at most 10 large columns that need off-page storage. The default InnoDB page size is 16 KB, and a single InnoDB row cannot exceed 16 KB / 2 = 8 KB. Ten such columns fit because (768 + 20) * 10 = 7880 bytes < 8 KB, while an eleventh would not.
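The limit of 10 off-page columns follows from simple arithmetic. The sketch below reproduces it in Python, using only the figures given in this section and ignoring row headers and other per-row overhead; the constant names are made up for illustration.

```python
PAGE_SIZE = 16 * 1024       # default InnoDB page size in bytes
ROW_LIMIT = PAGE_SIZE // 2  # a row may use at most half a page
LOCAL_PREFIX = 768          # bytes of an off-page column kept in the row
POINTER = 20                # bytes of the pointer to the overflow pages

# In-row footprint of one column stored off-page under Antelope.
footprint = LOCAL_PREFIX + POINTER

# How many such columns fit in one row.
max_off_page_columns = ROW_LIMIT // footprint

print(footprint)             # 788
print(max_off_page_columns)  # 10
```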
When innodb_file_format = Barracuda, ROW_FORMAT = DYNAMIC or COMPRESSED
Whether varchar, text, and blob data is stored entirely off-page depends on the length of the column and the total length of the row. For a column stored off-page, the clustered index stores only a 20-byte pointer to the overflow page. If a row is too long to fit in a clustered index page, InnoDB moves the longest column off-page, and repeats this until the row fits.
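The selection rule just described can be pictured as a greedy loop. The following Python sketch is a simplified model, not InnoDB's actual implementation: it assumes an 8 KB row limit, a 20-byte in-row pointer per off-page column, and no other row overhead, and the function and column names are hypothetical.

```python
POINTER = 20  # in-row bytes left behind by a column moved off-page


def choose_off_page_columns(column_bytes: dict, row_limit: int = 8192) -> list:
    """Move the longest columns off-page until the row fits.

    column_bytes maps a column name to the byte length of its value.
    Returns the names of the columns moved off-page, in order.
    """
    in_row = dict(column_bytes)
    off_page = []
    while sum(in_row.values()) > row_limit:
        # Only columns longer than the pointer are worth moving.
        candidates = {name: size for name, size in in_row.items()
                      if name not in off_page and size > POINTER}
        if not candidates:
            break  # nothing left to move; the row cannot be made to fit
        longest = max(candidates, key=candidates.get)
        in_row[longest] = POINTER
        off_page.append(longest)
    return off_page


print(choose_off_page_columns({"a": 5000, "b": 4000, "c": 100}))  # ['a']
print(choose_off_page_columns({"a": 9000, "b": 8500, "c": 10}))   # ['a', 'b']
```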
5. varchar in MyISAM
For the MyISAM engine, all data in a varchar column is stored in the data row (in-line). The row_format of a MyISAM table also affects how varchar is physically stored.
The row_format of a MyISAM table can be set to fixed or dynamic with a CREATE or ALTER statement. In addition, myisampack can produce the row_format = compressed storage format.
If a MyISAM table contains no text or blob columns, row_format can be fixed (or dynamic). Otherwise it can only be dynamic.
When a table contains varchar columns, row_format can be fixed or dynamic. With row_format = fixed, varchar data is stored at fixed length, which wastes storage space. The physical implementation of varchar also differs between fixed and dynamic (see field.h and the related field source file in the MySQL source code). Therefore, when the row_format of a MyISAM table is converted between fixed and dynamic, the physical storage of its varchar columns changes as well.
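The space cost of fixed-length storage can be estimated roughly. The Python sketch below compares only the data bytes, assuming latin1, that a fixed row reserves the full declared width of each varchar, and that a dynamic row stores only the actual bytes; length prefixes and row headers are deliberately left out, and the function names are hypothetical.

```python
def fixed_data_bytes(declared_widths):
    """Data bytes per row when every varchar is stored at full width."""
    return sum(declared_widths)


def dynamic_data_bytes(values, encoding="latin1"):
    """Data bytes per row when only the actual value bytes are stored."""
    return sum(len(value.encode(encoding)) for value in values)


# Two varchar(100) columns holding short values.
widths = [100, 100]
row = ["abc", "hello"]

print(fixed_data_bytes(widths))  # 200
print(dynamic_data_bytes(row))   # 8
```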