InnoDB has two row formats: compact and redundant.
A compact record begins with a list of the lengths of its non-NULL variable-length fields, stored in reverse column order. A length takes 1 byte when the column is at most 255 bytes long and 2 bytes when it is longer. Two bytes are 16 bits, which is why the maximum varchar length is 65535. The second part is the NULL flag, which records whether the row has NULL values: a column's bit is 1 when that column is NULL, and the flag is 00 when the row has none. The next part is the record header, which is fixed at 5 bytes. The final part is the actual data of each column. A NULL value takes no space in the data part; it occupies nothing beyond its bit in the NULL flag. Note also that, besides the user-defined columns, every row has two hidden columns: the transaction ID (6 bytes) and the roll pointer (7 bytes). If the InnoDB table has no primary key, a 6-byte rowid is added to each row; if it has one, the row carries a 4-byte index field instead.
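To make the length list concrete, here is a minimal Python sketch of the rule as described above. The function name varlen_header and the declared maximum length of 10 are assumptions for illustration. The sketch models only the 1-byte/2-byte rule stated in the text and ignores any flag bits the real 2-byte encoding may carry.

```python
def varlen_header(lengths, max_lengths):
    """Build the variable-length field list of a compact record (simplified).

    lengths:     actual byte length of each non-NULL variable-length column,
                 in column order
    max_lengths: declared maximum byte length of the same columns
    """
    out = bytearray()
    # Entries are stored in reverse column order.
    for length, max_len in zip(reversed(lengths), reversed(max_lengths)):
        if max_len <= 255:
            out.append(length)                 # 1 byte is enough
        else:
            out += length.to_bytes(2, "big")   # simplified 2-byte form
    return bytes(out)


# Three varchar columns holding 'a', 'bb' and 'ccc'
# (a declared maximum of 10 bytes is assumed here).
print(varlen_header([1, 2, 3], [10, 10, 10]).hex(" "))
```

For the three varchar values 'a', 'bb' and 'ccc' this yields the bytes 03 02 01, the same pattern that appears in the page dump analysed later.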
A redundant record begins with the field length offset list, which holds one entry per field giving that field's offset within the record. It is also stored in reverse column order, and an entry takes 1 byte when the column is at most 255 bytes long and 2 bytes when it is longer. The second part is the record header; unlike the compact format, it is fixed at 6 bytes. The last part is the actual data of each column. A NULL varchar takes no space in the data part, but a NULL char still occupies its full declared length. As in the compact format, every row has two hidden columns besides the user-defined ones: the transaction ID (6 bytes) and the roll pointer (7 bytes). If the InnoDB table has no primary key, a 6-byte rowid is added to each row; if it has one, the row carries a 4-byte index field instead.
Q: How should I understand the length offset list? A: Each entry is the end offset of a field within the record, so the list gives both the length of each field and its relative position.
Now let's do an experiment to see the difference in detail.
CREATE TABLE test1 (
t1 varchar() DEFAULT NULL,
t2 varchar() DEFAULT NULL,
t3 char(10) DEFAULT NULL,
t4 varchar() DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=COMPACT;
INSERT INTO test1 VALUES ('a', 'bb', 'bb', 'ccc');
INSERT INTO test1 VALUES ('d', 'ee', 'ee', 'fff');
INSERT INTO test1 VALUES ('d', NULL, NULL, 'fff');
Running python py_innodb_page_info.py -v /vobiledata/mysqldata/test/test1.ibd shows that the data is on page 00000003. A page is 16 KB, so page 3 starts at hex offset 0000c000.
Page Offset 00000000, page type <file Space header>
Page offset 00000001, page type <insert Buffer bitmap>
Page offset 00000002, page type <file Segment inode>
Page offset 00000003, page type <b-tree node>, page level <0000>
Page Offset 00000000, page type <freshly allocated page>
Page Offset 00000000, page type <freshly allocated page>
Total number of Page:6:
Freshly allocated Page:2
Insert Buffer bitmap:1
File Space header:1
B-tree node:1
File Segment inode:1
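The file offset of the data page follows directly from the page number and the 16 KB page size. A small Python sketch (the helper name page_start is only illustrative):

```python
PAGE_SIZE = 16 * 1024  # 16 KB per page, as stated above


def page_start(page_no, page_size=PAGE_SIZE):
    """Byte offset in the .ibd file at which the given page begins."""
    return page_no * page_size


# Page 3 is the B-tree node that holds the rows.
print(f"{page_start(3):08x}")  # 0000c000
```

This is why the hexdump analysis below starts reading at 0000c000.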
Next, dump the file with hexdump -C -v /vobiledata/mysqldata/test/test1.ibd > tes1.txt and look at the page:
0000bff0 xx (XX) (4b) FF 5e C7 | ...., XX rkg.cf.|
0000c000 4e, AE, F, F, F, FF FF FF FF FF FF FF |0n..............|
0000c010 5f, BF 00 00 00 00 00 00 | ... pc_.ve.......|
0000c020 (XX) 9c (80 05 00 00 00 00 |................|
0000c030 D8 00 02 00 02 00 03 00 00 00 00 00 00 00 00 |................|
0000c040 (XX) at XX, 9c 00 00 | ....... 6.......|
0000c050 xx F2 (9c 00 00 00 02 00 32 01 00 | ........ 2..|
0000c060 1e 6e, 6d, 6d, XX, 0b, xx |...infimum......|
0000c070, 6d, 6d, and the |supremum........|.
0000c080 2c (0c)--------------|,.... x....b.....|
0000c090 2d 01 10 61 62 62 62 62 20 20 20 20 20 20 20 |. -.. abbbb |
0000C0A0------2b 0c | ccc........+...|
0000C0B0------------------F8 01 10 X....b ...-.. |
0000C0C0-A-a-and-a-all-in-a-|deeee fff|
0000C0D0 (98): 0c 84 58 05 00 00 | ..... ..... x...|
0000C0E0:2d 01 10 64 66 66 66 00 | B.C ....-.. dfff.|
0000c0f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
0000c100 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
0000c110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
From the dump above we can see how InnoDB stores the data.
The row data on the page starts at 0000c078:
03 02 01 is the variable-length field list: the lengths of t4, t2 and t1, in reverse column order.
00 is the NULL flag: no column in this record is NULL (when there are NULLs, the flag is read as a bitmap).
The next 5 bytes, ending in 2c, are the record header (fixed at 5 bytes). It links the records together and is also used for row-level locks. InnoDB uses the compact format by default.
0c 84 58 03 belongs to the ROWID. We did not define a primary key, so InnoDB adds this hidden 6-byte column; to save table space, define a primary key whenever possible when designing a table.
00 00 00 62 81 97 is the 6-byte transaction ID.
2d 01 10 is the end of the 7-byte roll pointer.
PS: NULL values occupy no space.
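The sizes above can be cross-checked against the dump. The sketch below assumes that the last two bytes of the 5-byte header hold the distance from one record's data to the next record's data; under that assumption the values 2c and 2b seen in the dump fall out of the field sizes. The helper names are illustrative.

```python
RECORD_HEADER = 5          # fixed header size in the compact format
HIDDEN = 6 + 6 + 7         # rowid + transaction ID + roll pointer


def header_size(n_varlen, n_nullable):
    """Bytes stored in front of a record's data.

    Assumes 1 byte per variable-length entry and one NULL bit
    per nullable column, rounded up to whole bytes.
    """
    null_bytes = (n_nullable + 7) // 8
    return n_varlen + null_bytes + RECORD_HEADER


def data_size(column_bytes):
    """Hidden columns plus the stored bytes of each user column."""
    return HIDDEN + sum(column_bytes)


# Row 1 ('a','bb','bb','ccc') is followed by row 2 ('d','ee','ee','fff').
row1_to_row2 = data_size([1, 2, 10, 3]) + header_size(3, 4)
print(hex(row1_to_row2))   # 0x2c

# Row 2 is followed by row 3 ('d', NULL, NULL, 'fff'), which has only
# two variable-length entries because t2 is NULL.
row2_to_row3 = data_size([1, 2, 10, 3]) + header_size(2, 4)
print(hex(row2_to_row3))   # 0x2b
```

Both results match the header bytes visible in the dump, which suggests the header's link field is a relative offset rather than an absolute address.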
Now look at how the third row is stored: 03 01 are the lengths of the fourth and first columns, the two variable-length columns that are not NULL. 06 is the NULL flag: in binary it is 00000110, so the second and third bits are set, meaning the second and third columns are NULL.
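Reading the NULL flag as a bitmap can be sketched in a few lines of Python. The function name null_columns is illustrative, and the sketch assumes the lowest bit belongs to the first nullable column, which agrees with the 06 example above.

```python
def null_columns(flags, nullable_columns):
    """Return the names of the columns whose NULL bit is set.

    Bit 0 (least significant) is taken to belong to the first
    nullable column, bit 1 to the second, and so on.
    """
    return [name for i, name in enumerate(nullable_columns)
            if (flags >> i) & 1]


print(null_columns(0x06, ["t1", "t2", "t3", "t4"]))  # ['t2', 't3']
print(null_columns(0x00, ["t1", "t2", "t3", "t4"]))  # []
```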
Next we look at the changes in the space after the data is deleted:
12:19:09> select * from test1;
t1  t2    t3    t4
a   bb    bb    ccc
d   ee    ee    fff
d   NULL  NULL  fff
3 rows in set (0.00 sec)
12:27:40> delete from test1 where t1 = 'a';
Query OK, 1 row affected (0.00 sec)
0000c070-Ten 6d 6d, the |supremum of the above-xx-... |
0000c080 (0c) (0c 00 00 00 |), XX/xx X....c ' .... |
0000c090 00 33 26 06 61 62 62 62 62 20 20 20 20 20 20 20 |. 3&.abbbb |
0000C0A0------2b 0c | ccc........+...|
0000c0b0-up to 7c 94, 2d, 1f |. x....b| ...-.. |
0000C0C0-A-a-and-a-all-in-a-|deeee fff|
0000C0D0 (98): 0c 84 58 02 00 00 | ..... ..... x...|
0000C0E0 7c 94 2d 00 2e 64 66 66 66 | B| ...-.. dfff.|
0000c0f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
The stored data has not changed: the bytes of the deleted row are still on the page.
12:35:01> insert into test1 values ('a', 'bb', 'bb', 'ccc');
Query OK, 1 row affected (0.00 sec)
12:35:08> select * from test1;
t1  t2    t3    t4
d   ee    ee    fff
d   NULL  NULL  fff
a   bb    bb    ccc
0000c070 (6d) 6d, Geneva, XX, |supremum........|
0000c080 EF (0c) 0c xx (00) (8f 80 00): x....cc....|
0000c090 2d 01 10 61 62 62 62 62 20 20 20 20 20 20 20 |. -.. abbbb |
0000C0A0------2b 0c | ccc........+...|
0000c0b0-up to 7c 94, 2d, 1f |. x....b| ...-.. |
0000C0C0-A-a-and-a-all-in-a-|deeee fff|
0000c0d0 (A9): 0c 84 58 02 00 00 | ..... ..... x...|
0000C0E0 7c 94 2d 00 2e 64 66 66 66 | B| ...-.. dfff.|
This shows that the space is not released when a row is deleted; it waits for a later insert. If a row of the same size is inserted, the space can be reused; if not, the page becomes fragmented.
12:35:13> insert into test1 values ('c', 'aaa', 'aaa', 'cc');
Query OK, 1 row affected (0.00 sec)
12:37:24> select * from test1;
0000c070, 6d, 6d, and the |supremum........|.
0000c080 (0c), 0c-------------|w x....cc....|
0000c090 2d 01 10 61 62 62 62 62 20 20 20 20 20 20 20 |. -.. abbbb |
0000C0A0------2b 0c | ccc........+...|
0000c0b0-up to 7c 94, 2d, 1f |. x....b| ...-.. |
0000C0C0-A-a-and-a-all-in-a-|deeee fff|
0000c0d0 (A9): 0c 84 58 02 00 00 | ..... ..... x...|
0000C0E0 7c 94 2d 02 2e 64 66 66 66 | B| ...-.. dfff.|
0000C0F0, the FF 0c 0d 00 00 | ....... (. x .....) x...|
0000c100 B5 2d 01 10 63 61 61 61 61 | CD ...-.. caaaa|
0000C110-----------------|AA cc.....|
So when a new row is inserted and there is no freed slot to reuse, it is stored after the existing records.
Next, let's look at how InnoDB stores a table that has a primary key:
CREATE TABLE te1 (
id int(11) NOT NULL DEFAULT '0',
t1 varchar() DEFAULT NULL,
t2 varchar() DEFAULT NULL,
t3 char(6) DEFAULT NULL,
PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=COMPACT;
INSERT INTO te1 VALUES (1, 'aa', 'bb', 'bb');
INSERT INTO te1 VALUES (2, 'cc', NULL, NULL);
Analyzing the data the same way:
0000c070, 6d 6d, Geneva, XX, |supremum.......&|
0000c080, 98 e7, XX, XX, 2d |.......b......-.|
0000c090 10 61 61 62 62 62 62 20 20 20 20 20 20 20 20 02 |. AABBBB. |
0000C0A0-------FF CA----2e |.............b..|
0000c0b0 (2d), XX-------XX cc.......|
0000c0c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
When the table has a primary key, the record header (00 00 10 00 26) is followed by the key value of the record (80 00 00 01 in the first record, where 01 is primary key 1). This 4-byte key takes the place of the 6-byte hidden rowid, so each row is smaller than it would be without a primary key.
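The leading 80 in the key bytes comes from how a signed int is laid out: big-endian with the sign bit flipped, so that byte-wise comparison matches numeric order. A minimal Python sketch of the decoding (decode_int is an illustrative name, not an InnoDB function):

```python
def decode_int(raw):
    """Decode a 4-byte signed INT as it appears in the page dump.

    The value is read big-endian; subtracting 2**31 undoes the
    flipped sign bit.
    """
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    return int.from_bytes(raw, "big") - 0x80000000


print(decode_int(bytes.fromhex("80000001")))  # 1
print(decode_int(bytes.fromhex("80000002")))  # 2
```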
12:44:05> update te1 set t1 = 'cc' where id = 1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0
0000c070 (6d) 6d-----------|supremum
0000c080 to the C3 of the XX (xx), the 0f |.......ch. 3.|
0000c090 A7 63 63 62 62 62 62 20 20 20 20 02 06 00 00 18 |. CCBBBB ... |
0000C0A0 ff CE------XX
0000c0b0 2d 1d--------XX cc...........|
This shows that the update modifies the record in place (here the new value has the same length as the old one) and does not cause fragmentation.
The redundant format
11:00:55> create table test3 engine=InnoDB row_format=redundant as select * from test1;
07:40:21> select * from test3;
t1  t2    t3    t4
a   bb    bb    ccc
d   ee    ee    fff
d   NULL  NULL  fff
Analyzing the data the same way:
0000c070 (XX), the 6d, 6d |....supremum.#. |
0000c080 0c (0f), 06, 0c, 58, 84, and so on.---XX x.|
0000c090 (XX) 0a 2d 01 10 61 62 62 | [Email protected]|
0000C0A0-A-and-a-|bb ccc#. |
0000c0b0 0c (0f) EA (84 58 07 |), and so on.----XX x.|
0000C0C0 (XX) 0a, 2d, 1f 64 65 65 | [Email protected]|
0000c0d0-A-9e 94 |ee FFF!..---
0000c0e0 0c (0f) (0c) 84 58 08 | ...... T.... x.|
0000c0f0 (XX) 0a, 2d, 2e 64 00 00 | [Email protected]|
0000c100 xx (xx) xx (xx) (XX)
0000c110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
The first row of the test3 table is (a, bb, bb, ccc); the field lengths are 1, 2, 10 and 3.
There are also three hidden fields: rowid (6 bytes), transaction ID (6 bytes) and roll pointer (7 bytes).
So the length offset list has 7 entries: 06 (6), 0c (12), 13 (19), 14 (20), 16 (22), 20 (32), 23 (35).
In the tablespace they are stored in reverse order: 23 20 16 14 13 0c 06.
0c 06: length offset list
0f XX BA: record header
XX 0c: rowid
XX 0a: transaction ID
2d 01 10: roll pointer
With a primary key it is the same, except that the rowid is replaced by the 4 bytes of primary key information.
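The offset list for the first test3 row can be reproduced with a running sum over the field lengths. This is a minimal Python sketch that assumes every entry fits in 1 byte, as it does in this example; offset_list is an illustrative name.

```python
from itertools import accumulate


def offset_list(field_lengths):
    """End offset of every field, stored in reverse field order.

    Each entry is assumed to fit in a single byte.
    """
    ends = list(accumulate(field_lengths))
    return bytes(reversed(ends))


# rowid, transaction ID, roll pointer, then t1, t2, t3, t4
lengths = [6, 6, 7, 1, 2, 10, 3]
print(offset_list(lengths).hex(" "))  # 23 20 16 14 13 0c 06
```

The length of any field is the difference between its entry and the entry of the field before it, for example 0x20 - 0x16 = 10 bytes for the char(10) column.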
In what physical order does the storage layer place inserted rows?
Experiment:
CREATE TABLE t3 (
id int NOT NULL,
t1 varchar(10) CHARACTER SET latin1 DEFAULT NULL,
t2 varchar() CHARACTER SET latin1 DEFAULT NULL,
t3 char(6) CHARACTER SET latin1 DEFAULT NULL,
PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT;
05:52:24> insert into t3 values (1, 'aa', 'bb', 'cc');
Query OK, 1 row affected (0.00 sec)
05:53:39> insert into t3 values (3, 'aaa', 'bbb', 'ccc');
Query OK, 1 row affected (0.00 sec)
05:54:50> insert into t3 values (2, 'aa', 'bbbb', 'ccc');
Query OK, 1 row affected (0.00 sec)
0000c070-Ten 6d 6d------------|supremum ... h|
0000C080 80 00 00 01 00 00 00 82 46 58 80 00 00 00 32 01 | ... Fx.... 2.|
0000c090 10 61 61 62 62 63 63 20 20 20 20 03 03 00 00 00 |. AABBCC ... |
0000C0A0 FF CD 80 00 00 03 00 00 00 82 46 65 80 00 00 | ... fe...|
0000c0b0 00 32 01 10 61 61 61 62 62 62 63 63 63 20 20 20 |. 2..AAABBBCCC |
0000C0C0, XX (FF) DB 80 00 00 02 00 00 00 82 | ....... |
0000C0D0 8a 80 00 00 00 32 01 10 61 61 62 62 62 62 63 | F..... 2..aabbbbc|
0000C0E0----------------------------xx xx xx |cc ...
0000c0f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
CREATE TABLE t1 (
id int(11) NOT NULL DEFAULT '0',
t1 varchar(10) CHARACTER SET latin1 DEFAULT NULL,
t2 varchar(10) CHARACTER SET latin1 DEFAULT NULL,
t3 char(6) CHARACTER SET latin1 DEFAULT NULL,
PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=REDUNDANT;
05:57:17> insert into t1 values (1, 'aa', 'bb', 'cc');
Query OK, 1 row affected (0.00 sec)
06:03:43> insert into t1 values (3, 'a', 'b', 'c');
Query OK, 1 row affected (0.00 sec)
06:04:01> insert into t1 values (2, 'aa', 'ab', 'cc');
Query OK, 1 row affected (0.00 sec)
0000c070---------6d 6d, 1b, |....supremum....|
0000c080 0a 0d d5 80 00 00 01 00 00 00 |................|
0000c090 D3 80 00 00 00 32 01 10 61 61 62 62 63 63 |. F..... 2..aabbcc|
0000C0A0 0a 0d 00 74 | (in a) ...... t|
0000c0b0-D5 80 00 00 00 32 01 | .....--XX F..... 2.|
0000C0C0 1b, 0a 04 00 |.-Ten. ABC ... |
0000c0d0 0d B0 (80)----D6 ........... f..|
0000c0e0 00 00 00 32 01 10 61 61 61 62 63 63 20 20 20 20 | ... 2..AAABCC [BR]]
So in both the compact and the redundant format, even with an id primary key, rows are placed on the page in the order you inserted them, not in id order. The id order is kept only logically, through the links in the record headers.
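The difference between physical and logical order can be illustrated with a toy model in Python. This is not InnoDB code: the page is a plain list, records are dictionaries, and the "next" field stands in for the link kept in each record header. All names are made up for the illustration.

```python
def insert(page, key):
    """Append a record physically; keep key order through 'next' links."""
    page.append({"key": key, "next": None})
    new = len(page) - 1
    prev = 0  # slot 0 plays the role of the infimum pseudo-record
    # Walk the chain to the last record whose key is smaller than the new one.
    while (page[prev]["next"] is not None
           and page[page[prev]["next"]]["key"] < key):
        prev = page[prev]["next"]
    page[new]["next"] = page[prev]["next"]
    page[prev]["next"] = new


def logical_order(page):
    """Follow the links from the infimum and collect the keys."""
    keys, i = [], page[0]["next"]
    while i is not None:
        keys.append(page[i]["key"])
        i = page[i]["next"]
    return keys


page = [{"key": None, "next": None}]   # infimum
for k in (1, 3, 2):                    # same insert order as the experiment
    insert(page, k)

print([r["key"] for r in page[1:]])    # physical order: [1, 3, 2]
print(logical_order(page))             # logical order:  [1, 2, 3]
```

Appending keeps inserts cheap, while a scan that follows the links still returns rows in key order.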
Now run optimize table on t3:
10:16:09> optimize table t3;
Table    Op        Msg_type  Msg_text
test.t3  optimize  note      Table does not support optimize, doing recreate + analyze instead
test.t3  optimize  status    OK
After the rebuild, the rows are stored in id order:
0000c070 (6d) 6d, Geneva, XX |supremum.......#|
0000c080 (XX) 5f FF 80 00 00 00 32 01 |........_ ..... 2.|
0000c090 10 61 61 62 62 63 63 20 20 20 20 04 02 00 00 00 |. AABBCC ... |
0000C0A0-------------5f FF 80 00 00 | %........_....|
0000c0b0 1d 61 61 62 62 62 62 63 63 63 20 20 20 |. 2..AABBBBCCC |
0000C0C0 (A8) 80 00 00 03 00 00 00 82 | .......... |
0000C0D0 5f FF, 2a 61 61 61 62 62 62 63 |_ ... 2.*aaabbbc|
0000C0E0----------------------------xx xx xx |cc ...
Now delete data and run optimize table t3 again:
0000c000 6e 94 DF, F, F, FF FF FF FF FF FF FF |n...............|
0000c010 5a (EB) 00 00 00 00 00 00 | ... Z).. he.......|
0000c020 xx C9 C0 80 04 00 00 00 00 |................|
0000c030 A3 00 02 00 01 00 02 00 00 00 00 00 00 00 00 |................|
0000c040, XX, xx, 2f C9 00 00 |........8/......|
0000c050 xx F2 C9 00 00 00 02 00 32 01 00 | ......... 2..|
0000c060 1d 6e, 6d, 6d, XX, 0b, xx |...infimum......|
0000c070 (6d) 6d, Geneva, XX |supremum.......#|
0000c080 (XX)----------------- 2.|
0000c090 10 61 61 62 62 63 63 20 20 20 20 04 02 00 00 00 |. AABBCC ... |
0000C0A0 FF CD-XX----------------... ' p...|
0000c0b0 1d 61 61 62 62 62 62 63 63 63 20 20 20 |. 2..AABBBBCCC |
0000c0c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
So after deleting a large amount of data, run optimize table to free the space.
That covers the differences between the two InnoDB row formats (compact and redundant), and how insert, delete and update behave at the storage level.