Logical storage structure of InnoDB storage engine tables

Source: Internet
Author: User
Tags: compact, uncompress

1. Index-organized tables

In the InnoDB storage engine, tables are organized in primary key order. A table stored this way is called an index-organized table, and every InnoDB table has a primary key. If no primary key is explicitly defined when the table is created, InnoDB chooses or creates one as follows:

1). First check whether the table has a non-null unique index (UNIQUE NOT NULL). If it does, that column becomes the primary key.
2). If no such index exists, InnoDB automatically creates a 6-byte pointer to serve as the primary key.

If the table has several non-null unique indexes, InnoDB selects the first one defined when the table was created. The primary key value can be displayed with _rowid, but this only works for a single-column primary key, not for a primary key made up of multiple columns.
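The selection rule above can be sketched in a few lines of Python. This is only an illustration of the decision order described in the text, not InnoDB's actual code; the `Index` class, the `choose_clustered_key` function, and the `HIDDEN_ROWID` marker are hypothetical names made up for the example.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical marker for the 6-byte row ID that InnoDB creates on its own.
HIDDEN_ROWID = "<hidden 6-byte row id>"


@dataclass
class Index:
    name: str
    unique: bool = False
    not_null: bool = False   # True when every indexed column is NOT NULL
    primary: bool = False


def choose_clustered_key(indexes: List[Index]) -> str:
    """Return the name of the key the table is organized by.

    `indexes` must be given in the order the indexes were defined,
    because the first non-null unique index wins.
    """
    # An explicitly defined primary key always takes precedence.
    for idx in indexes:
        if idx.primary:
            return idx.name
    # Otherwise take the first UNIQUE NOT NULL index in definition order.
    for idx in indexes:
        if idx.unique and idx.not_null:
            return idx.name
    # Otherwise fall back to the automatically created 6-byte row ID.
    return HIDDEN_ROWID


if __name__ == "__main__":
    table = [
        Index("idx_a", unique=True, not_null=False),  # nullable: skipped
        Index("idx_b", unique=True, not_null=True),   # first candidate
        Index("idx_c", unique=True, not_null=True),
    ]
    print(choose_clustered_key(table))
    print(choose_clustered_key([Index("idx_plain")]))
```

In the first table the nullable unique index is skipped and `idx_b` is chosen because it was defined before `idx_c`; the second table has no usable index, so the hidden row ID is used.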

2. InnoDB logical storage structure

All data is logically stored in a single space called a tablespace. A tablespace consists of segments, extents, and pages. Pages are sometimes also called blocks.

Tablespace: all data is stored in the tablespace. By default there is one shared tablespace, ibdata1. If the parameter innodb_file_per_table is enabled, the data of each table can be placed in its own tablespace (96KB by default). However, the per-table tablespace holds only the data, the indexes, and the insert buffer bitmap pages; other data, such as undo information, is still stored in the shared tablespace, so the shared tablespace keeps growing even after this parameter is enabled. In addition, the InnoDB storage engine does not shrink the tablespace when a transaction is rolled back. It determines whether the undo information is still needed and, if not, marks the space as free for later use.

Segment: the tablespace is made up of segments. Common segments are the data segment (the leaf nodes of the B+ tree), the index segment (the non-leaf nodes of the B+ tree), the rollback segment, and so on. Segment management is handled by the storage engine itself.

Extent: an extent is a space made up of contiguous pages, and in every case an extent is 1MB in size. To keep the pages in an extent contiguous, InnoDB requests 4 to 5 extents from disk at a time. By default an InnoDB page is 16KB, so an extent contains 64 contiguous pages. InnoDB 1.0.x introduced compressed pages, whose size can be set to 2KB, 4KB, or 8KB with the key_block_size parameter.
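The relationship between page size and pages per extent is simple arithmetic. The sketch below assumes, as the text states, that an extent is always 1MB; `pages_per_extent` is a hypothetical helper name used only for this example.

```python
EXTENT_SIZE = 1024 * 1024  # 1MB, as stated in the text


def pages_per_extent(page_size_bytes: int) -> int:
    """Number of contiguous pages that fit in one 1MB extent."""
    if page_size_bytes <= 0 or EXTENT_SIZE % page_size_bytes != 0:
        raise ValueError("page size must evenly divide the extent size")
    return EXTENT_SIZE // page_size_bytes


if __name__ == "__main__":
    for kb in (2, 4, 8, 16):
        print(f"{kb}KB pages: {pages_per_extent(kb * 1024)} pages per extent")
```

For the default 16KB page this gives the 64 pages mentioned above; smaller pages simply mean more pages in the same 1MB.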

InnoDB 1.2.x adds the innodb_page_size parameter, which can set the default page size to 4KB or 8KB; in this case the data in the pages is not compressed.

This raises a question: after the user enables innodb_file_per_table, a newly created table is only 96KB by default, yet an extent is 64 contiguous pages, so the table should be at least 1MB in size. The reason is that at the beginning of each segment InnoDB first uses 32 fragment pages to hold the data, and only after these are used up does it request 64 contiguous pages at a time. This saves disk space for small tables.

Page: the page is the smallest unit of InnoDB disk management. The default size of each page is 16KB. Common page types are data pages, undo pages, system pages, transaction data pages, insert buffer bitmap pages, and so on.
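The fragment-page behavior can be modeled roughly as follows. This is a simplified sketch for a single segment under the numbers given in the text (16KB pages, 32 fragment pages, 64 pages per extent); it ignores the tablespace's own bookkeeping pages, and `segment_pages_allocated` is a hypothetical name.

```python
PAGE_SIZE = 16 * 1024     # default page size
FRAGMENT_PAGES = 32       # pages handed out one at a time at the start
PAGES_PER_EXTENT = 64     # 1MB extent / 16KB page


def segment_pages_allocated(pages_needed: int) -> int:
    """Pages reserved for one segment that needs `pages_needed` pages.

    The first 32 pages are allocated individually; beyond that,
    space is reserved in whole 64-page extents.
    """
    if pages_needed < 0:
        raise ValueError("pages_needed must be non-negative")
    if pages_needed <= FRAGMENT_PAGES:
        return pages_needed
    extra = pages_needed - FRAGMENT_PAGES
    extents = -(-extra // PAGES_PER_EXTENT)  # ceiling division
    return FRAGMENT_PAGES + extents * PAGES_PER_EXTENT


if __name__ == "__main__":
    print("96KB file =", 96 * 1024 // PAGE_SIZE, "pages")
    for needed in (3, 32, 33, 97):
        pages = segment_pages_allocated(needed)
        print(f"need {needed:3d} pages -> reserve {pages:3d} pages "
              f"({pages * PAGE_SIZE // 1024}KB)")
```

A small table therefore grows page by page, which is why a new file can be far smaller than one full extent.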
Row: data in InnoDB is stored by row, and each page can hold at most 16KB/2 - 200 row records, that is, 7992 rows.

3. InnoDB row record format

Records in the InnoDB storage engine are stored as rows, which means that the data in a page is saved row by row. Before InnoDB 1.0.x, two formats were provided for storing row record data: Compact and Redundant.

3.1 Compact row record format

This format was introduced in MySQL 5.0. It is designed to store data efficiently: the more rows a page holds, the higher the performance. A row is laid out as follows.

The first part is a list of the lengths of the non-NULL variable-length fields; each length takes at most 2 bytes. The second part is the NULL flag bits, which indicate whether the row contains NULL values; a NULL column is marked with 1. Next comes the record header, which is fixed at 5 bytes (40 bits). Finally comes the actual data of each column. Note that a NULL value takes no space in the data part; it only uses its NULL flag bit. Besides the user-defined columns, each row also contains hidden columns: the transaction ID column and the rollback pointer column, which are 6 bytes and 7 bytes in size respectively. If no primary key is defined, each row also gets a 6-byte rowid column.

3.2 Redundant row record format

This is how row records were stored before MySQL 5.0. The format is still supported for compatibility with the page format of earlier versions. A row is laid out as follows.

The first part is a list of field length offsets, placed in reverse column order. The second part is the record header, which takes 6 bytes. Finally comes the actual data of each column.
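To make the Compact layout from 3.1 more concrete, the sketch below estimates the size of one Compact row from the parts listed there. It is a simplified model rather than InnoDB's real accounting: it assumes 1 length byte for a variable-length column whose maximum size is at most 255 bytes and 2 bytes otherwise, and one NULL flag bit per nullable column rounded up to whole bytes. `Column` and `compact_row_bytes` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional

RECORD_HEADER = 5     # fixed 5-byte record header
TRX_ID = 6            # hidden transaction ID column
ROLL_POINTER = 7      # hidden rollback pointer column
ROW_ID = 6            # hidden rowid, only when no primary key is defined


@dataclass
class Column:
    variable_length: bool
    max_bytes: int             # declared maximum size in bytes
    nullable: bool
    value: Optional[bytes]     # None means NULL


def compact_row_bytes(columns: List[Column], has_primary_key: bool) -> int:
    """Rough size in bytes of one row in the Compact format."""
    # Length list: only non-NULL variable-length columns get an entry.
    length_list = sum(
        1 if c.max_bytes <= 255 else 2   # assumption, see lead-in
        for c in columns
        if c.variable_length and c.value is not None
    )
    # NULL flags: one bit per nullable column, rounded up to bytes.
    nullable_count = sum(1 for c in columns if c.nullable)
    null_flags = (nullable_count + 7) // 8
    hidden = TRX_ID + ROLL_POINTER + (0 if has_primary_key else ROW_ID)
    # NULL values take no space in the data part.
    data = sum(len(c.value) for c in columns if c.value is not None)
    return length_list + null_flags + RECORD_HEADER + hidden + data


if __name__ == "__main__":
    # Three VARCHAR(10) columns and one CHAR(10) column, single-byte charset.
    row = [
        Column(True, 10, True, b"a"),
        Column(True, 10, True, b"bb"),
        Column(False, 10, True, b"bb".ljust(10)),  # CHAR is space-padded
        Column(True, 10, True, b"ccc"),
    ]
    print(compact_row_bytes(row, has_primary_key=False))
```

Setting a column's `value` to `None` shows the point made in 3.1: the row shrinks by the full size of that column's data, because a NULL only costs its flag bit.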
3.3 Row overflow data

The InnoDB storage engine can store part of a record's data outside the actual data page. It is commonly assumed that large object column types such as BLOB and LOB always store their data outside the data page, but this understanding is not quite accurate. The VARCHAR data type can also be stored as row overflow data. A VARCHAR column in MySQL can nominally hold 65535 bytes, but in practice it cannot: because of other overhead, only 65532 bytes can actually be stored. Moreover, the official limit of 65535 refers to the sum of the lengths of all VARCHAR columns; if the total length of the columns exceeds this limit, the table still cannot be created.

InnoDB data is normally stored in pages of type B-tree node, but when a row overflows, the data is stored in pages of type Uncompressed BLOB page. In that case only a 768-byte prefix of the data is kept in the data page, followed by an offset that points to the row overflow page (the Uncompressed BLOB page).

So at what length is a VARCHAR value kept in the data page, and from what length is it moved to a BLOB page? Consider that InnoDB tables are index-organized and have a B+ tree structure, so each page must hold at least two rows (otherwise the B+ tree loses its meaning and degenerates into a linked list). Therefore, if a page can hold only a single record, the InnoDB storage engine automatically stores the row data in an overflow page.

3.4 Compressed and Dynamic formats

InnoDB 1.0.x introduced a new file format. The previously supported Compact and Redundant formats are together called the Antelope file format, and the new file format is called the Barracuda file format. Barracuda adds two new row record formats: Compressed and Dynamic.

The new row record formats store BLOB data fully off-page: only a 20-byte pointer is kept in the data page and the actual data is stored in the off page (unlike the Compact and Redundant formats, which keep a 768-byte prefix in the data page). The Compressed format has an additional feature: the data stored in it is compressed with the zlib algorithm, so large values such as BLOB and TEXT can be stored efficiently.
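The difference between the two file formats for an overflowing column can be summarized in a small sketch. The 768-byte prefix and the 20-byte pointer come from the text; treating the Antelope offset as the same 20-byte pointer is an assumption of this example, and `on_page_bytes` is a hypothetical helper. Whether a value overflows at all is passed in as a flag, since the exact threshold is not spelled out here.

```python
PREFIX_BYTES = 768    # prefix kept in the data page by Compact/Redundant
POINTER_BYTES = 20    # pointer to the off page

ANTELOPE = {"redundant", "compact"}
BARRACUDA = {"dynamic", "compressed"}


def on_page_bytes(value_len: int, row_format: str, overflows: bool) -> int:
    """Bytes of one column value that stay in the data page."""
    fmt = row_format.lower()
    if fmt not in ANTELOPE | BARRACUDA:
        raise ValueError(f"unknown row format: {row_format}")
    if not overflows:
        return value_len                     # the whole value stays on the page
    if fmt in ANTELOPE:
        # Assumption: the offset after the prefix is a 20-byte pointer.
        return PREFIX_BYTES + POINTER_BYTES
    return POINTER_BYTES                     # full overflow: pointer only


if __name__ == "__main__":
    for fmt in ("compact", "dynamic"):
        print(fmt, on_page_bytes(9000, fmt, overflows=True))
```

With a fully off-page format, each overflowing column costs far less space in the data page, so more rows fit per page.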
3.5 Row storage of CHAR

CHAR is normally used to store fixed-length character data. Starting with MySQL 4.1, the N in CHAR(N) is the length in characters, not the length in bytes as in earlier versions. As a result, under different character sets a CHAR column may not hold data of a fixed byte length, so for CHAR columns in multi-byte character sets InnoDB internally treats the column as a variable-length character type.
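The gap between character length and byte length is easy to see outside the database. The sketch below uses Python's own codecs to encode two-character strings; it only illustrates why CHAR(2) cannot have a single fixed byte length in a multi-byte character set, and it ignores the padding MySQL applies. `char_storage_bytes` is a hypothetical helper name.

```python
def char_storage_bytes(value: str, encoding: str) -> int:
    """Byte length of `value` once encoded in the given character set."""
    return len(value.encode(encoding))


if __name__ == "__main__":
    # Every sample is exactly two characters long.
    for text in ("ab", "我们", "我a"):
        for encoding in ("utf-8", "gbk"):
            size = char_storage_bytes(text, encoding)
            print(f"{text!r} ({len(text)} chars) in {encoding}: {size} bytes")
```

All three samples would fit in CHAR(2), yet their encoded sizes differ, which is why the column has to be handled like a variable-length type.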

4. InnoDB data page structure

A data page consists of the following 7 parts:

File Header: fixed length of 38 bytes. It records the page type; the value 0x45BF denotes a data page.

Page Header: 56 bytes, used to record the state information of the data page.

Infimum and Supremum Records: each data page contains two virtual row records that mark the boundaries of the records. The Infimum record is a value smaller than any primary key in the page, and the Supremum record is a value larger than any possible value. Both are created when the page is created and are never deleted under any circumstances. They occupy a different number of bytes in the two row record formats.

User Records and Free Space: User Records holds the actual row records. Free Space is a linked list structure; after a record is deleted, its space is added to the free list.

Page Directory: holds the relative positions of the records in the page. These entries are sometimes called slots or directory slots. Not every record has a slot: the InnoDB page directory is sparse, and one slot may cover multiple records. When records are inserted or deleted, the slots have to be split or rebalanced.

File Trailer: fixed length of 8 bytes, used to check whether the page has been completely written to disk.
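The idea of a sparse page directory can be illustrated with a small in-memory model: one slot per group of records, a binary search over the slots, then a short linear scan inside the group. This is only a sketch of the technique. The fixed `group_size`, the use of list indexes instead of byte offsets, and the function names are assumptions of the example and do not reflect InnoDB's on-disk layout.

```python
from bisect import bisect_left
from typing import List, Optional


def build_slots(keys: List[int], group_size: int = 4) -> List[int]:
    """One slot per group, pointing at the last record of that group.

    `keys` must be sorted in ascending order.
    """
    return [
        min(start + group_size, len(keys)) - 1
        for start in range(0, len(keys), group_size)
    ]


def find(keys: List[int], slots: List[int], target: int) -> Optional[int]:
    """Return the position of `target` in `keys`, or None if absent."""
    # Binary search over the slots only (the sparse directory).
    slot_keys = [keys[s] for s in slots]
    pos = bisect_left(slot_keys, target)
    if pos == len(slots):
        return None                       # larger than every record
    # Linear scan inside the single group that can contain the key.
    start = slots[pos - 1] + 1 if pos > 0 else 0
    for i in range(start, slots[pos] + 1):
        if keys[i] == target:
            return i
    return None


if __name__ == "__main__":
    keys = [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
    slots = build_slots(keys)
    print("slots:", slots)
    print("find 9:", find(keys, slots, 9))
    print("find 8:", find(keys, slots, 8))
```

Because only one entry per group is kept, the directory stays small, at the cost of a short scan within the group and of regrouping work when records are inserted or deleted.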
