MySQL Database InnoDB Storage Engine: Page Storage Structure

Last Update:2018-04-20 Source: Internet

Author: User

Tags compact


Previous blog review:

1: MySQL has many storage engines; the one mainly used today is InnoDB. InnoDB has four row formats: Compact, Redundant, Dynamic, and Compressed.

2: A Compact row consists of the variable-length field length list, the NULL value list, the record header, and the real column data.

3: The variable-length field length list stores the byte length of each variable-length column in reverse column order; a column whose value is NULL gets no entry. The NULL value list holds one bit for every column that may be NULL (primary key columns and columns declared NOT NULL are excluded), also in reverse order.

4: The record header holds the offset to the next record, the record's position in the page (heap_no), the number of records in its group (n_owned), whether the record is deleted, the record type, whether it is the minimum record of a non-leaf level of the B+ tree, and so on.

5: The real data includes three hidden columns: row_id (generated only when the table has no primary key), transaction_id (the ID used for transaction management), and roll_pointer.

6: Row overflow: Compact and Redundant keep the first part of the value (768 bytes) in the record, followed by the address of the overflow page. Dynamic and Compressed keep only the overflow page address in the record, and Compressed additionally applies a compression algorithm to the page.

7: For a CHAR column that uses a variable-length character set, the byte length is also stored in the variable-length field length list.

8: Each entry in the variable-length field length list takes 1 or 2 bytes, and the NULL value list uses one bit per column. The record header occupies 5 bytes in the Compact row format and 6 bytes in Redundant. One page is 16 KB.
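To make the 5-byte Compact record header more concrete, here is a minimal sketch that unpacks it into the fields listed above. The bit layout used here (2 unused bits, delete_mask, min_rec_mask, 4 bits of n_owned, 13 bits of heap_no, 3 bits of record_type, 16 bits of signed next_record) is the commonly documented one; the function name `decode_compact_header` and the sample bytes are made up for illustration.

```python
def decode_compact_header(header: bytes) -> dict:
    """Decode the 5-byte record header of a Compact-format row."""
    if len(header) != 5:
        raise ValueError("a Compact record header is exactly 5 bytes")
    first = header[0]                            # flag bits + n_owned
    middle = int.from_bytes(header[1:3], "big")  # heap_no + record_type
    return {
        "delete_mask": (first >> 5) & 1,
        "min_rec_mask": (first >> 4) & 1,
        "n_owned": first & 0x0F,
        "heap_no": middle >> 3,
        "record_type": middle & 0x07,
        # next_record is a signed offset relative to the current record
        "next_record": int.from_bytes(header[3:5], "big", signed=True),
    }


# Hypothetical header of the first user record on a page:
# not deleted, heap_no 2, ordinary record, next record 32 bytes further on.
sample = bytes([0x00, 0x00, 0x10, 0x00, 0x20])
fields = decode_compact_header(sample)
print(fields)
```

The offset is decoded as a signed value because the next record in key order may sit at a lower address than the current one.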

Page storage structure:

We all know that a database page occupies 16 KB of storage. How is this 16 KB allocated, what format does the data inside it take, and how does the database work with that data? That is what we are going to learn today.

Walking through the page from its first byte to its last would be awkward and hard to follow, so instead we go through it area by area, according to what each area does.

　　　　User Records:

This area holds the records we insert. Note that it does not exist at first: it is carved out of Free Space as records are inserted. Once all of Free Space has been turned into User Records, a new page has to be allocated.

Records are not placed in this area at random. With a large amount of data, random placement would make every lookup painful. So the next thing to understand is how the records in User Records are organized. Here we assume that four records of our own have been inserted:

Two virtual records: we inserted four records, but the page contains six. The extra two are virtual records that exist on every page: the maximum record (supremum) and the minimum record (infimum). They are stored in the Infimum + Supremum area rather than in User Records, because they are not records we inserted. The record chain always starts at the minimum record and ends at the maximum record.

We will now go through several points based on these six records:

1: The record_type field in each record header gives the type of the record: 0 is an ordinary record, 1 is a B+ tree non-leaf node record, 2 is the minimum record, and 3 is the maximum record. As we can see, the maximum and minimum records have record_type 3 and 2 respectively, and the records we inserted have record_type 0.

2: The record header also contains delete_mask, which indicates whether the record has been deleted: 0 means no, 1 means deleted. All the records above have delete_mask 0.

3: heap_no, as discussed earlier, marks the position of the record within the page. The inserted records have heap_no 2, 3, 4, and 5. Where did 0 and 1 go? Look at the minimum and maximum records: they have heap_no 0 and 1 respectively. The virtual records occupy positions 0 and 1 by default, and the records we insert are numbered from 2.

4: Are records chained in the order they were inserted? No. They are arranged by size, and the size being compared is the primary key.

5: next_record is the address offset from this record to the next record, that is, how many bytes to move from this record to reach the next one. In other words, the records are connected as a linked list.

6: If a record is deleted, that is, its delete_mask is set to 1, what changes? Just as with any linked list, the record is unlinked: the record before it now points past it to the record after it.

7: When searching for a record, do we walk the records one by one, starting from the minimum record and following next_record? That would be far too slow, which is why the concept of grouping exists.
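The linked-list behaviour in points 5 and 6 can be sketched with a toy model. This is only an illustration: the `Record` class, the `walk` and `unlink` helpers, and all page offsets are assumptions made for the example, and a real page stores these fields as packed bytes rather than Python objects.

```python
from dataclasses import dataclass


@dataclass
class Record:
    key: object
    next_record: int      # relative offset to the next record; 0 ends the chain
    delete_mask: int = 0


def walk(page: dict, start: int) -> list:
    """Follow next_record from `start` and return the keys in chain order."""
    keys, offset = [], start
    while True:
        rec = page[offset]
        keys.append(rec.key)
        if rec.next_record == 0:
            return keys
        offset += rec.next_record


def unlink(page: dict, start: int, victim_offset: int) -> None:
    """Mark a user record deleted and make its predecessor skip over it."""
    offset = start
    while True:
        rec = page[offset]
        if rec.next_record == 0:
            raise KeyError(victim_offset)
        if offset + rec.next_record == victim_offset:
            break
        offset += rec.next_record
    victim = page[victim_offset]
    rec.next_record = victim_offset + victim.next_record - offset
    victim.delete_mask = 1


# Toy page: the keys are page offsets (hypothetical values).
page = {
    99: Record("infimum", 29),
    128: Record(1, 32),
    160: Record(2, 32),
    192: Record(3, 32),
    224: Record(4, -112),   # the last user record points back to supremum
    112: Record("supremum", 0),
}
print(walk(page, 99))
unlink(page, 99, 160)       # delete the record with key 2
print(walk(page, 99))
```

After `unlink`, the deleted record is still physically present in the toy page; it is simply no longer reachable through the chain.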

Group:

As we can see, the six records are divided into two groups: the minimum virtual record forms a group by itself, and the remaining five records form the other group. MySQL groups the records in every page this way. The minimum record is always the first group; it contains only that one record, and no other record can be added to it. The maximum record belongs to the last group. In this example the records we insert go into the group that contains the maximum record. When that group becomes full it is split, producing an ordinary group, and later records again go into the maximum record's group. Repeating this cycle completes the grouping.
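The grouping cycle can be simulated with a few lines of Python. The sketch below only covers records inserted in ascending key order, so every insert lands in the maximum record's group. The limit of 8 records per group and the 4 + 5 split are the values commonly described for InnoDB and are an assumption here, since the text above does not state them; `group_sizes` is a hypothetical helper name.

```python
MAX_OWNED = 8  # assumed upper bound on records per group


def group_sizes(n_user_records: int) -> list:
    """Group sizes after inserting n records in ascending key order."""
    # Start with the infimum group (always 1) and the supremum group,
    # which initially holds only the supremum record itself.
    groups = [1, 1]
    for _ in range(n_user_records):
        groups[-1] += 1                 # new record joins supremum's group
        if groups[-1] > MAX_OWNED:      # group is over-full: split it
            full = groups.pop()
            groups.extend([full // 2, full - full // 2])
    return groups


print(group_sizes(4))    # the four-record example from this article
print(group_sizes(8))
print(group_sizes(12))
```

With four user records the result is one group of 1 and one group of 5, matching the two groups described above.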

　　　　　　　　Slot:

The group diagram also shows two slots, which we mentioned in the previous article. For each group, the page-relative address offset of the last record in the group is stored as one slot, and all the slots are stored in the Page Directory. Note the difference between the two offsets: next_record in the record header is relative to the current record and points to the next record, whereas the offset in a slot is the position of the group's last record relative to the start of the page.

Find:

With grouping in place, a lookup first uses binary search over the slots to determine which group the target record belongs to, and then follows next_record in the record headers within that group to find the record.
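A minimal sketch of this two-step lookup follows. It simplifies the page a good deal: the infimum and supremum records are left out, the record chain is modelled as a sorted Python list, and each slot is an index into that list rather than a byte offset. The name `find` and the sample data are illustrative only.

```python
import bisect


def find(keys: list, slot_ends: list, target):
    """Return the position of `target` in `keys`, or None if absent.

    keys      -- record keys in ascending order (the next_record chain)
    slot_ends -- for each group, the index in `keys` of its last record
    """
    # Step 1: binary search over the last key of every group.
    last_keys = [keys[i] for i in slot_ends]
    group = bisect.bisect_left(last_keys, target)
    if group == len(slot_ends):
        return None                     # larger than every record
    # Step 2: scan only inside that group, record by record.
    start = slot_ends[group - 1] + 1 if group > 0 else 0
    for i in range(start, slot_ends[group] + 1):
        if keys[i] == target:
            return i
    return None


keys = [1, 3, 5, 7, 9, 11, 13, 15, 17]
slot_ends = [3, 8]                      # two groups: 4 records and 5 records
print(find(keys, slot_ends, 11))
print(find(keys, slot_ends, 4))
```

Because a group holds only a handful of records, the linear scan in step 2 stays short no matter how many records the page contains.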

　　　　　　　　N_owned:

We have already seen this field in the record header, and now we can finish explaining it. n_owned is stored in the last record of each group and gives the number of records in that group. As we can see, n_owned is 0 for every record that comes before the last one in its group; only the last record of a group has a non-zero value.

Page Header:

Above, we used sample data to introduce four areas of the page: User Records, Infimum + Supremum, Page Directory, and Free Space. Next come the remaining three areas: Page Header, File Header, and File Trailer.

The first is the Page Header, which stores information about the records held in the page:

Name	Size	Description
`PAGE_N_DIR_SLOTS`	`2`Bytes	Number of slots in the page directory
`PAGE_HEAP_TOP`	`2`Bytes	Lowest address of the space not yet used; Free Space starts at this address
`PAGE_N_HEAP`	`2`Bytes	Number of records on this page (including the minimum and maximum records and records marked for deletion)
`PAGE_FREE`	`2`Bytes	Address of the first record marked for deletion; deleted records form a list of reusable space
`PAGE_GARBAGE`	`2`Bytes	Total number of bytes occupied by deleted records (records whose `delete_mask` is 1)
`PAGE_LAST_INSERT`	`2`Bytes	Position of the most recently inserted record
`PAGE_DIRECTION`	`2`Bytes	The direction of the last insertion
`PAGE_N_DIRECTION`	`2`Bytes	Number of records inserted consecutively in one direction
`PAGE_N_RECS`	`2`Bytes	Number of records in the page (excluding the minimum and maximum records and records marked for deletion)
`PAGE_MAX_TRX_ID`	`8`Bytes	Largest transaction ID that has modified the current page; defined only in secondary indexes
`PAGE_LEVEL`	`2`Bytes	Level of the current page in the B+ tree
`PAGE_INDEX_ID`	`8`Bytes	Index ID that indicates which index the current page belongs to
`PAGE_BTR_SEG_LEAF`	`10`Bytes	Segment header of the B+ tree leaf segment, defined only on the root page of the B+ tree
`PAGE_BTR_SEG_TOP`	`10`Bytes	Segment header of the B+ tree non-leaf segment, defined only on the root page of the B+ tree
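Reading the table top to bottom gives a fixed 56-byte layout, which can be unpacked with Python's `struct` module. The sketch below is an illustration under a few assumptions: every field is treated as a plain big-endian integer, the last two 10-byte entries are named after InnoDB's `PAGE_BTR_SEG_LEAF` and `PAGE_BTR_SEG_TOP`, and the sample values are invented to match the four-record example in this article.

```python
import struct
from collections import namedtuple

# Nine 2-byte fields, one 8-byte, one 2-byte, one 8-byte field,
# then two 10-byte segment headers: 56 bytes in total, big-endian.
PAGE_HEADER = struct.Struct(">9HQHQ10s10s")

PageHeader = namedtuple("PageHeader", [
    "n_dir_slots", "heap_top", "n_heap", "free", "garbage",
    "last_insert", "direction", "n_direction", "n_recs",
    "max_trx_id", "level", "index_id", "btr_seg_leaf", "btr_seg_top",
])


def parse_page_header(raw: bytes) -> PageHeader:
    """Unpack the first 56 bytes of `raw` as a page header."""
    return PageHeader._make(PAGE_HEADER.unpack(raw[:PAGE_HEADER.size]))


# Invented sample: 2 slots, 6 heap records (4 user records + 2 virtual ones).
sample = PAGE_HEADER.pack(2, 248, 6, 0, 0, 216, 2, 3, 4,
                          0, 0, 42, bytes(10), bytes(10))
header = parse_page_header(sample)
print(header.n_dir_slots, header.n_heap, header.n_recs)
```

In a real page these 56 bytes come directly after the 38-byte File Header, so a parser for a full 16 KB page would pass `page[38:94]` to this function.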

Two fields above need further explanation: PAGE_DIRECTION and PAGE_N_DIRECTION. PAGE_DIRECTION is the direction of the last insertion relative to the previously inserted record: if the new record is larger than the previous one, it goes to the right; otherwise it goes to the left. PAGE_N_DIRECTION counts how many records have been inserted consecutively in the same direction. If the direction changes, the count is cleared and starts again.
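The bookkeeping for these two fields can be illustrated with a toy model. This is a simplification of the description above rather than InnoDB's actual logic: the direction is derived by comparing each key with the previously inserted key, and restarting the count at 1 when the direction changes is a modelling choice made for this sketch. The name `track_direction` is hypothetical.

```python
def track_direction(keys):
    """Return (direction, n_direction) after inserting `keys` in order."""
    direction, n_direction, last = None, 0, None
    for key in keys:
        if last is not None:
            new_direction = "right" if key > last else "left"
            if new_direction == direction:
                n_direction += 1        # same direction: keep counting
            else:
                # direction changed: restart the count with this insert
                direction, n_direction = new_direction, 1
        last = key
    return direction, n_direction


print(track_direction([10, 20, 30, 40]))     # steadily increasing keys
print(track_direction([10, 20, 30, 5, 4]))   # direction flips to the left
```

A long run in one direction is typical of auto-increment primary keys, where every new record is larger than the one before it.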

File Header:

The Page Header describes the records stored in the page; the File Header describes the page itself:

Name	Size	Description
`FIL_PAGE_SPACE_OR_CHKSUM`	`4`Bytes	Checksum of the page
`FIL_PAGE_OFFSET`	`4`Bytes	Page number
`FIL_PAGE_PREV`	`4`Bytes	Page number of the previous page
`FIL_PAGE_NEXT`	`4`Bytes	Page number of the next page
`FIL_PAGE_LSN`	`8`Bytes	Log sequence number (LSN) of the last modification to the page
`FIL_PAGE_TYPE`	`2`Bytes	The type of the page
`FIL_PAGE_FILE_FLUSH_LSN`	`8`Bytes	Defined only in one page of the system tablespace; indicates that the file has been flushed at least up to this LSN. It is 0 in file-per-table tablespaces
`FIL_PAGE_ARCH_LOG_NO_OR_SPACE_ID`	`4`Bytes	Which tablespace the page belongs to
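The same approach works for the 38-byte File Header. The sketch below follows the field order and sizes of the table, assumes big-endian integers, and uses invented sample values; the lower-case field names are shortened versions of the `FIL_PAGE_*` names chosen for this example.

```python
import struct
from collections import namedtuple

# 4 + 4 + 4 + 4 + 8 + 2 + 8 + 4 = 38 bytes, big-endian.
FILE_HEADER = struct.Struct(">IIIIQHQI")

FileHeader = namedtuple("FileHeader", [
    "checksum", "page_no", "prev", "next",
    "lsn", "page_type", "file_flush_lsn", "space_id",
])


def parse_file_header(page: bytes) -> FileHeader:
    """Unpack the first 38 bytes of a page as the File Header."""
    return FileHeader._make(FILE_HEADER.unpack(page[:FILE_HEADER.size]))


# Invented sample: page 3 of tablespace 7, followed by page 4.
# 0xFFFFFFFF is used here to mean "no previous page".
sample = FILE_HEADER.pack(0xDEADBEEF, 3, 0xFFFFFFFF, 4,
                          123456, 0x45BF, 0, 7)
fh = parse_file_header(sample)
print(fh.page_no, fh.next, hex(fh.page_type))
```

Following `prev` and `next` from page to page is what lets the pages of an index be traversed in both directions.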

Three of these values record the page number of this page, of the previous page, and of the next page. The records inside a page are connected as a singly linked list, whereas the pages of the database are connected as a doubly linked list.

The table above also contains FIL_PAGE_TYPE, which describes the type of the page. Obviously a database cannot have only one kind of page. The page that stores real data, which we discussed above, is the data page, FIL_PAGE_INDEX, that is, the B+ tree leaf node we mentioned.

Name	Hexadecimal	Description
`FIL_PAGE_ALLOCATED`	0x0000	Newly allocated, not yet used
`FIL_PAGE_UNDO_LOG`	0x0002	Undo log page
`FIL_PAGE_INODE`	0x0003	Segment information node
`FIL_PAGE_IBUF_FREE_LIST`	0x0004	Insert buffer free list
`FIL_PAGE_IBUF_BITMAP`	0x0005	Insert buffer bitmap
`FIL_PAGE_TYPE_SYS`	0x0006	System page
`FIL_PAGE_TYPE_TRX_SYS`	0x0007	Transaction system data
`FIL_PAGE_TYPE_FSP_HDR`	0x0008	File space header
`FIL_PAGE_TYPE_XDES`	0x0009	Extent descriptor page
`FIL_PAGE_TYPE_BLOB`	0x000A	BLOB page
`FIL_PAGE_INDEX`	0x45BF	B+ tree node (index page, the data page discussed in this article)

File Trailer:

The File Trailer has to be discussed together with the FIL_PAGE_SPACE_OR_CHKSUM checksum field in the File Header above. A page is 16 KB, and a page is always flushed from memory to disk as one 16 KB operation. If something unavoidable such as a power failure happens in the middle of that write, this is where the checksum comes in. The File Header sits at the beginning of the page and stores a checksum. You can think of a checksum this way: when we need to compare very long strings, we first compute an integer from each string with some algorithm and compare the integers instead. The File Trailer sits at the end of the page and also stores a checksum. If the page was only partially written, the two checksums will not match, and the page can be identified as corrupt.
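The idea of detecting a half-written page by comparing the two checksums can be sketched as follows. This is a toy layout, not InnoDB's real one: the page here is just a 4-byte checksum, a body, and a 4-byte checksum, and `zlib.crc32` stands in for InnoDB's own checksum algorithm. The helper names `seal` and `is_intact` are made up.

```python
import zlib

PAGE_SIZE = 16 * 1024


def seal(body: bytes) -> bytes:
    """Build a toy page: checksum at the front, body, checksum at the end."""
    if len(body) != PAGE_SIZE - 8:
        raise ValueError("body must fill the page exactly")
    checksum = zlib.crc32(body).to_bytes(4, "big")
    return checksum + body + checksum


def is_intact(page: bytes) -> bool:
    """A page is intact when both stored checksums match the body."""
    expected = zlib.crc32(page[4:-4]).to_bytes(4, "big")
    return page[:4] == expected and page[-4:] == expected


old = seal(b"\x00" * (PAGE_SIZE - 8))
new = seal(b"\x01" * (PAGE_SIZE - 8))

# Simulate a power failure halfway through overwriting `old` with `new`:
# the first half of the page is new, the second half is still old.
torn = new[:PAGE_SIZE // 2] + old[PAGE_SIZE // 2:]

print(is_intact(new), is_intact(torn))
```

The torn page carries the new checksum at its start and the old one at its end, so the mismatch reveals the interrupted write.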


This article is an English version of an article originally written in Chinese and published on aliyun.com.
