Comparison of InnoDB row formats (COMPACT, REDUNDANT)

Source: Internet
Author: User

The InnoDB row format comes in two variants, COMPACT and REDUNDANT, with COMPACT as the default. A COMPACT record starts with a list of the lengths of the non-NULL variable-length fields, stored in reverse column order. A column length of up to 255 bytes is stored in 1 byte; a longer one is stored in 2 bytes. This is why the maximum length of a VARCHAR is 65535: two bytes are 16 bits, and 2^16 - 1 = 65535. The second part is the NULL flag, a bitmap that marks which columns of the row are NULL: each NULL column sets one bit, and the flag is 00 if no column is NULL. The third part is the record header, which occupies 5 bytes. The last part is the data actually stored for each column. A NULL value occupies no space in this part; it is recorded only in the NULL flag. In addition to the user-defined columns, each row has two hidden columns: the transaction ID (6 bytes) and the roll pointer (7 bytes). If the InnoDB table has no primary key, a 6-byte rowid is also added to each row. If it has one, the primary key value is stored in its place, for example 4 bytes for an INT key.


A REDUNDANT record starts with a field length offset list, which records the offset at which every field ends (and thereby its length and position). It is also stored in reverse column order; an entry takes 1 byte if the column length is less than 255 bytes and 2 bytes if it is greater. The second part is the record header which, unlike in the COMPACT format, occupies 6 bytes. The last part is the data actually stored for each column. A NULL VARCHAR occupies no space in this part, but a NULL CHAR still occupies its full width. Each row again has two hidden columns besides the user-defined ones: the transaction ID (6 bytes) and the roll pointer (7 bytes). If the InnoDB table has no primary key, a 6-byte rowid is added to each row; if it has one, the primary key value is stored instead, for example 4 bytes for an INT key.


Q: How should the length offset list be understood? A: Each entry gives the offset at which a field ends, relative to the start of the record data; the length of a field is the difference between its entry and the previous one.


Now let's run an experiment to see how the two formats differ.


create table test1 (
t1 varchar(10) default null,
t2 varchar(10) default null,
t3 char(10) default null,
t4 varchar(10) default null
) ENGINE = InnoDB default charset = latin1 ROW_FORMAT = COMPACT;
insert into test1 values ('a', 'bb', 'bb', 'ccc');
insert into test1 values ('d', 'ee', 'ee', 'fff');
insert into test1 values ('d', NULL, NULL, 'fff');
Analyzing the file with python py_innodb_page_info.py -v /vobiledata/mysqldata/test/test1.ibd shows that each page is 16 KB. The data page (the B-tree node) is at page offset 3, so it starts at 3 x 16 KB, which is 0000C000 in hexadecimal.
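The offset arithmetic can be checked with a few lines of Python. This is only a sketch; the helper name page_start is made up for illustration, and the 16 KB page size is the one reported above.

```python
PAGE_SIZE = 16 * 1024  # 16 KB, the page size reported by the tool


def page_start(page_no, page_size=PAGE_SIZE):
    """Return the byte offset in the .ibd file at which a page begins."""
    return page_no * page_size


# The B-tree node is at page offset 3.
print(f"{page_start(3):08x}")  # 0000c000
```

This is the address to look for in the hexdump output when locating the data page.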
Page offset 00000000, page type <File Space Header>
Page offset 00000001, page type <Insert Buffer Bitmap>
Page offset 00000002, page type <File Segment inode>
Page offset 00000003, page type <B-tree Node>, page level <0000>
Page offset 00000000, page type <Freshly Allocated Page>
Page offset 00000000, page type <Freshly Allocated Page>
Total number of page: 6:
Freshly Allocated Page: 2
Insert Buffer Bitmap: 1
File Space Header: 1
B-tree Node: 1
File Segment inode: 1
Next, dump the file with hexdump -C -v /vobiledata/mysqldata/test/test1.ibd > tes1.txt and look at the data page:
0000bff0 00 00 00 00 00 00 00 52 4b 47 ff 63 5e 66 c7 | ...... RKG. cf. |
0000c000 30 4e 95 ae 00 00 00 03 ff | 0N .............. |
0000c010 00 00 00 50 63 5f 15 76 45 bf 00 00 00 00 00 00 |... pc _. vE ....... |
0000c020 00 00 00 00 15 9c 00 02 00 ef 80 05 00 00 00 00 | ...... |
0000c030 00 d8 00 02 00 00 03 00 00 00 00 00 00 00 | ...... |
0000c040 00 00 00 00 00 00 00 36 01 00 00 15 9c 00 00 | ...... 6 ...... |
0000c050 00 02 00 f2 00 00 15 9c 00 00 00 02 00 32 01 00 | ...... |
0000c060 02 00 1e 69 6e 66 69 6d 75 6d 00 04 00 0b 00 00 |... infimum ...... |
0000c070 73 75 70 72 65 6d 75 6d 03 02 01 00 00 10 00 | supremum ...... |
0000c080 2c 00 00 0c 84 58 03 00 00 00 62 81 97 80 00 00 |,... X... B... |
0000c090 00 2d 01 10 61 62 62 62 20 20 20 20 20 20 |... abbbb |
0000c0a0 20 63 63 63 03 02 01 00 00 00 18 00 2b 00 00 0c | ccc ...... + ...... |
0000c0b0 84 58 04 00 00 00 62 81 f8 80 00 00 00 2d 01 10 |. X... B ...... |
0000c0c0 64 65 65 65 65 65 20 20 20 20 20 20 20 66 66 66 | deeee fff |
0000c0d0 03 01 06 00 00 20 ff 98 00 00 00 0c 84 58 05 00 00 | ...... |
0000c0e0 00 62 82 43 80 00 00 00 2d 01 10 64 66 66 66 00 |. B. C ...... dfff. |
0000c0f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ...... |
0000c100 00 00 00 00 00 00 00 00 00 00 00 00 00 | ...... |
0000c110 00 00 00 00 00 00 00 00 00 00 00 00 00 | ...... |


We can analyze the above data to see how INNODB stores data.
The records on the data page start at 0000c078:
03 02 01 is the variable-length field length list, in reverse column order (t4 is 3 bytes, t2 is 2 bytes, t1 is 1 byte),
00 is the NULL flag; this record has no NULL column (if it had, each NULL column would set one bit of this flag),
00 00 10 00 2c is the record header (5 bytes). It links the record to the next one and is also used for row-level locks.
00 00 0c 84 58 03 is the rowid. We did not define a primary key, so InnoDB adds this hidden 6-byte column; to save tablespace, define a primary key when designing a table.
00 00 00 62 81 97 is the 6-byte transaction ID
80 00 00 00 2d 01 10 is the 7-byte roll pointer
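The byte-by-byte reading above can be reproduced with a small parser. This is a sketch under stated assumptions, not InnoDB's own code: the function name and dictionary keys are made up, every length is assumed to fit in one byte, and the table is assumed to have at most eight nullable columns. The split of the 5-byte header into 4 info bits, 4 bits of owned-record count, a 13-bit heap number, a 3-bit record type and a signed 16-bit next-record offset is how this sketch interprets the header; the article itself only says that the header is 5 bytes long.

```python
import struct

# Prefix of the first test1 record, copied from the dump at 0000c078.
REC1_PREFIX = bytes.fromhex(
    "03 02 01"              # variable-length field length list
    "00"                    # NULL flag
    "00 00 10 00 2c"        # record header
    "00 00 0c 84 58 03"     # rowid
    "00 00 00 62 81 97"     # transaction ID
    "80 00 00 00 2d 01 10"  # roll pointer
)


def parse_compact_prefix(buf, n_var, has_rowid=True):
    """Split the bytes in front of the column data into named parts."""
    pos = 0
    # The length list is stored in reverse column order, so flip it back.
    var_lens = list(buf[pos:pos + n_var])[::-1]
    pos += n_var
    null_flag = buf[pos]
    pos += 1
    # 1 byte info bits + n_owned, 2 bytes heap number + type, 2 bytes next offset.
    info_owned, heap_type, next_rec = struct.unpack(">BHh", buf[pos:pos + 5])
    pos += 5
    out = {
        "var_lens": var_lens,
        "null_flag": null_flag,
        "info_bits": info_owned >> 4,
        "heap_no": heap_type >> 3,
        "next_rec": next_rec,
    }
    if has_rowid:
        out["rowid"] = buf[pos:pos + 6].hex()
        pos += 6
    out["trx_id"] = buf[pos:pos + 6].hex()
    pos += 6
    out["roll_ptr"] = buf[pos:pos + 7].hex()
    return out


for key, value in parse_compact_prefix(REC1_PREFIX, n_var=3).items():
    print(key, value)
```

With this reading, the next-record offset of the first row is 44 (0x2c), which is the distance from the rowid of the first row (0000c081) to the rowid of the second row (0000c0ad) in the dump.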

PS: a NULL value occupies no space in the data part.

Now look at the third row. Its NULL flag is 06, which is 00000110 in binary: the bits for the second and third columns are set, so those two columns are NULL, while the first and fourth columns are not.
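The bitmap reading can be written out as a short Python sketch. The helper name null_columns is hypothetical, and the sketch assumes that bit 0 (the least significant bit) belongs to the first nullable column, which is the interpretation used above.

```python
def null_columns(null_flag, nullable_columns):
    """Return the names of the columns whose bit is set in the NULL flag."""
    return [
        name
        for bit, name in enumerate(nullable_columns)
        if (null_flag >> bit) & 1
    ]


# All four columns of test1 are nullable.
print(null_columns(0x06, ["t1", "t2", "t3", "t4"]))  # ['t2', 't3']
print(null_columns(0x00, ["t1", "t2", "t3", "t4"]))  # []
```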


Next, let's take a look at the changes in the space after the data is deleted:
Lin_ren @ test 12:19:09> select * from test1;
t1    t2    t3    t4
a     bb    bb    ccc
d     ee    ee    fff
d     NULL  NULL  fff
3 rows in set (0.00 sec)


Xue_binbin @ test 12:27:40> delete from test1 where t1 = 'a';
Query OK, 1 row affected (0.00 sec)
0000c070 73 75 70 72 65 6d 75 6d 03 02 01 00 20 00 10 00 | supremum ...... |
0000c080 00 00 00 0c 84 58 00 00 00 00 63 60 0c 00 00 00 |... X... c '... |
0000c090 00 33 26 06 61 62 62 62 20 20 20 20 20 20 20 |. 3 &. abbbb |
0000c0a0 20 63 63 63 03 02 01 00 00 00 18 00 2b 00 00 0c | ccc ...... + ...... |
0000c0b0 84 58 01 00 00 00 62 7c 94 80 00 00 00 2d 01 1f |. X... B | ......-... |
0000c0c0 64 65 65 65 65 65 20 20 20 20 20 20 20 66 66 66 | deeee fff |
0000c0d0 03 01 06 00 00 20 ff 98 00 00 00 0c 84 58 02 00 00 | ...... |
0000c0e0 00 62 7c 94 80 00 00 00 2d 01 2e 64 66 66 66 00 |. B | ......-... dfff. |
0000c0f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ...... |
The row data is still on the page. Only the record header has changed: its first byte went from 00 to 20, which marks the record as deleted.
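A minimal Python sketch of that check follows. It assumes that bit 0x20 of the first header byte is the delete mark, which is what the two dumps suggest; the constant and function names are chosen for this sketch only.

```python
DELETE_MARK = 0x20  # assumed position of the delete-mark bit in header byte 0


def is_delete_marked(header):
    """Return True if the 5-byte COMPACT record header carries the delete mark."""
    return bool(header[0] & DELETE_MARK)


before_delete = bytes.fromhex("00 00 10 00 2c")  # header of row 'a' before the delete
after_delete = bytes.fromhex("20 00 10 00 00")   # header of row 'a' after the delete

print(is_delete_marked(before_delete))  # False
print(is_delete_marked(after_delete))   # True
```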
Xue_binbin @ test 12:35:01> insert into test1 values ('a', 'bb', 'bb', 'ccc');
Query OK, 1 row affected (0.00 sec)
Xue_binbin @ test 12:35:08> select * from test1;
t1    t2    t3    t4
d     ee    ee    fff
d     NULL  NULL  fff
a     bb    bb    ccc
0000c070 73 75 70 72 65 6d 75 6d 03 02 01 00 00 00 10 ff | supremum ...... |
0000c080 ef 00 00 0c 84 58 0c 00 00 00 00 63 63 8f 80 00 00 | ...... X ...... cc ...... |
0000c090 00 2d 01 10 61 62 62 62 20 20 20 20 20 20 |... abbbb |
0000c0a0 20 63 63 63 03 02 01 00 00 00 18 00 2b 00 00 0c | ccc ...... + ...... |
0000c0b0 84 58 01 00 00 00 62 7c 94 80 00 00 00 2d 01 1f |. X... B | ......-... |
0000c0c0 64 65 65 65 65 65 20 20 20 20 20 20 20 66 66 66 | deeee fff |
0000c0d0 03 01 06 00 00 20 ff a9 00 00 0c 84 58 02 00 00 | ...... |
0000c0e0 00 62 7c 94 80 00 00 00 2d 01 2e 64 66 66 66 00 |. B | ......-... dfff. |
This shows that the space is not released when a row is deleted. It is kept for reuse: a later insert that fits, here the same data, takes over the freed slot. If no such row is ever inserted, the space remains as fragmentation.
Xue_binbin @ test 12:35:13> insert into test1 values ('c', 'aaa', 'aaa', 'cc');
Query OK, 1 row affected (0.00 sec)
Xue_binbin @ test 12:37:24> select * from test1;
0000c070 73 75 70 72 65 6d 75 6d 03 02 01 00 00 10 00 | supremum ...... |
0000c080 77 00 00 0c 84 58 0c 00 00 00 00 63 8f 80 00 00 | w ...... X ...... cc ...... |
0000c090 00 2d 01 10 61 62 62 62 20 20 20 20 20 20 |... abbbb |
0000c0a0 20 63 63 63 03 02 01 00 00 00 18 00 2b 00 00 0c | ccc ...... + ...... |
0000c0b0 84 58 01 00 00 00 62 7c 94 80 00 00 00 2d 01 1f |. X... B | ......-... |
0000c0c0 64 65 65 65 65 65 20 20 20 20 20 20 20 66 66 66 | deeee fff |
0000c0d0 03 01 06 00 00 20 ff a9 00 00 0c 84 58 02 00 00 | ...... |
0000c0e0 00 62 7c 94 80 00 00 00 2d 01 2e 64 66 66 66 02 |. B | ......-... dfff. |
0000c0f0 03 01 00 00 00 00 28 ff 78 00 00 00 0c 84 58 0d 00 00 | ...... (. x... X... |
0000c100 00 63 64 b5 80 00 00 00 2d 01 10 63 61 61 61 61 |. cd ...... caaaa |
0000c110 61 61 20 20 20 20 20 20 63 63 00 00 00 00 | aa cc ...... |
As the dump shows, a newly inserted row is appended after the existing records, in the free space further down the page.
Next, let's look at InnoDB storage with a primary key:
create table te1 (
id int(11) not null default '0',
t1 varchar(10) default null,
t2 varchar(10) default null,
t3 char(6) default null,
primary key (id)
) ENGINE = InnoDB default charset = latin1 ROW_FORMAT = COMPACT;
insert into te1 values (1, 'aa', 'bb', 'bb');
insert into te1 values (2, 'cc', NULL, NULL);
The same analysis of the data page:
0000c070 73 75 70 72 65 6d 75 6d 02 00 00 00 00 10 00 26 | supremum ...... & |
0000c080 80 00 00 01 00 00 00 62 98 e7 80 00 00 00 2d 01 | ...... B ...... |
0000c090 10 61 61 62 62 62 20 20 20 20 20 20 20 02 |. aabbbb. |
0000c0a0 06 00 00 18 ff ca 80 00 00 00 00 00 00 62 99 2e | ............ B |
0000c0b0 80 00 00 00 00 2d 01 10 63 00 00 00 00 00 00 | ......-...... cc ...... |
0000c0c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ...... |
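If a primary key exists, the record header (00 00 10 00 26) is followed directly by the primary key value instead of a rowid. Here that value is 80 00 00 01, which is id = 1 stored as a 4-byte integer with the sign bit flipped. This takes two bytes less than the 6-byte hidden rowid of a table without a primary key.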
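The key bytes can be decoded with a short Python sketch. It assumes that a signed INT is stored big-endian with its sign bit flipped, which is consistent with 80 00 00 01 standing for id = 1; the function name is made up for this example.

```python
def decode_int_key(raw):
    """Decode a 4-byte signed INT stored big-endian with the sign bit flipped."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    # Flipping the sign bit is the same as adding 2**31, so subtract it again.
    return int.from_bytes(raw, "big") - 0x80000000


print(decode_int_key(bytes.fromhex("80 00 00 01")))  # 1
print(decode_int_key(bytes.fromhex("80 00 00 03")))  # 3
```

Storing the value this way lets a plain byte-wise comparison order negative keys before positive ones.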
Xue_binbin @ test 12:44:05> update te1 set t1 = 'cc' where id = 1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
0000c070 73 75 70 72 65 6d 75 6d 02 00 00 00 10 00 22 | supremum ...... "|
0000c080 80 00 00 01 00 00 00 63 68 c3 00 00 00 00 33 0f | ...... ch ...... 3. |
0000c090 a7 63 63 62 62 62 20 20 20 02 06 00 00 18 |. ccbbbb ...... |
0000c0a0 ff ce 80 00 00 00 00 00 00 00 00 63 31 07 80 00 00 00 | ...... |
0000c0b0 2d 01 1d 63 63 00 00 00 00 00 00 00 00 00 |-... cc ...... |
The update overwrote the row in place (the new value has the same length as the old one), so the row did not move and no fragmentation was created.
REDUNDANT row format
Xue_binbin @ test 11:00:55> create table test3 engine = innodb row_format = redundant as select * from test1;
Xue_binbin @ test 07:40:21> select * from test3;
t1    t2    t3    t4
a     bb    bb    ccc
d     ee    ee    fff
d     NULL  NULL  fff
The same analysis of the data page:
0000c070 08 03 00 00 73 75 70 72 65 6d 75 6d 00 23 20 16 |... supremum .. #. |
0000c080 14 13 0c 06 00 00 10 0f 00 ba 00 00 0c 84 58 06 | ............. |
0000c090 00 00 00 63 40 0a 80 00 00 00 2d 01 10 61 62 62 | ...... c @ ...... abb |
0000c0a0 62 62 20 20 20 20 20 20 20 63 63 63 23 20 16 | bb ccc #. |
0000c0b0 14 13 0c 06 00 00 18 0f 00 ea 00 00 0c 84 58 07 | ............. |
0000c0c0 00 00 00 63 40 0a 80 00 00 00 2d 01 1f 64 65 65 | ...... c @ ...... dee |
0000c0d0 65 65 20 20 20 20 20 20 66 66 66 21 9e 94 | ee fff !.. |
0000c0e0 14 13 0c 06 00 00 20 0f 00 74 00 00 0c 84 58 08 | ...... t ...... X. |
0000c0f0 00 00 00 63 40 0a 80 00 00 00 00 2d 01 2e 64 00 00 |... c @ ......-... |
0000c100 00 00 00 00 00 00 00 00 66 66 66 00 00 00 00 | ...... fff ...... |
0000c110 00 00 00 00 00 00 00 00 00 00 00 00 00 | ...... |
The first row in the test3 table is (a, bb, bb, ccc); the field lengths are 1, 2, 10, and 3, respectively.
In addition, there are three hidden fields: the rowid (6 bytes), the transaction ID (6 bytes), and the roll pointer (7 bytes).
The length offset list therefore contains 7 cumulative values: (06) 6, (0c) 12, (13) 19, (14) 20, (16) 22, (20) 32, (23) 35
In the tablespace the list is stored in reverse order: 23 20 16 14 13 0c 06
23 20 16 14 13 0c 06 length offset list
00 00 10 0f 00 ba record header (6 bytes)
00 00 0c 84 58 06 rowid
00 00 00 63 40 0a transaction ID
80 00 00 00 2d 01 10 roll pointer
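The step from cumulative offsets back to field lengths can be sketched in Python. The function name is hypothetical. The sketch also assumes that in a 1-byte entry the high bit (0x80) marks a NULL field and the low 7 bits hold the offset; the article does not state this, but it is consistent with the 9e and 94 entries in the offset list of the third row (21 9e 94 14 13 0c 06).

```python
def decode_offset_list(raw):
    """Turn a REDUNDANT 1-byte offset list into (length, is_null) pairs."""
    entries = list(raw)[::-1]            # stored in reverse field order
    fields = []
    previous_end = 0
    for entry in entries:
        end = entry & 0x7F               # low 7 bits: offset where the field ends
        is_null = bool(entry & 0x80)     # high bit: assumed NULL marker
        fields.append((end - previous_end, is_null))
        previous_end = end
    return fields


# Field order: rowid, transaction ID, roll pointer, t1, t2, t3, t4
row1 = decode_offset_list(bytes.fromhex("23 20 16 14 13 0c 06"))
row3 = decode_offset_list(bytes.fromhex("21 9e 94 14 13 0c 06"))

print([length for length, _ in row1])  # [6, 6, 7, 1, 2, 10, 3]
print(row3[4], row3[5])                # (0, True) (10, True)
```

Under this reading, the third row shows the difference described earlier: the NULL VARCHAR t2 takes 0 bytes, while the NULL CHAR(10) t3 still takes 10 bytes.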
With a primary key, the rowid is likewise replaced by the four bytes of the primary key value.
In what order are rows physically stored when they are inserted out of primary key order?


Experiment: create table t3 (id int not null, t1 varchar(10) character set latin1 default null, t2 varchar(10) character set latin1 default null, t3 char(6) character set latin1 default null, primary key (id)) ENGINE = InnoDB default charset = utf8 ROW_FORMAT = compact;
Lin_ren @ test 05:52:24> insert into t3 values (1, 'aa', 'bb', 'cc');
Query OK, 1 row affected (0.00 sec)
Lin_ren @ test 05:53:39> insert into t3 values (3, 'aaa', 'bbb', 'ccc');
Query OK, 1 row affected (0.00 sec)
Lin_ren @ test 05:54:50> insert into t3 values (2, 'aa', 'bbbb', 'ccc');
Query OK, 1 row affected (0.00 sec)
0000c070 73 75 70 72 65 6d 75 6d 02 00 00 00 10 00 48 | supremum ...... H |
0000c080 80 00 00 01 00 00 00 82 46 58 80 00 00 00 32 01 | ...... FX ...... |
0000c090 10 61 61 62 62 63 20 20 20 20 03 03 00 00 00 |. aabbcc ...... |
0000c0a0 18 ff cd 80 00 00 00 00 00 00 82 46 65 80 00 00 | ...... |
0000c0b0 00 32 01 10 61 61 62 62 62 63 63 20 20 20 20 |... aaabbbccc |
0000c0c0 04 02 00 00 00 00 20 ff db 80 00 00 00 00 00 82 | ...... |
0000c0d0 46 8a 80 00 00 00 32 01 10 61 61 62 62 62 63 | F... 2... aabbbbc |
0000c0e0 63 63 20 20 00 00 00 00 00 00 00 00 00 00 | cc ...... |
0000c0f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ...... |
create table t1 (
id int(11) not null default '0',
t1 varchar(10) character set latin1 default null,
t2 varchar(10) character set latin1 default null,
t3 char(6) character set latin1 default null,
primary key (id)
) ENGINE = InnoDB default charset = utf8 ROW_FORMAT = REDUNDANT;
Lin_ren @ test 05:57:17> insert into t1 values (1, 'aa', 'bb', 'cc');
Query OK, 1 row affected (0.00 sec)
Lin_ren @ test 06:03:43> insert into t1 values (3, 'a', 'b', 'c');
Query OK, 1 row affected (0.00 sec)
Lin_ren @ test 06:04:01> insert into t1 values (2, 'aa', 'ab', 'cc');
Query OK, 1 row affected (0.00 sec)
0000c070 08 03 00 00 73 75 70 72 65 6d 75 6d 00 1b 15 13 | ...... supremum ...... |
0000c080 11 0a 04 00 00 10 0d 00 d5 80 00 00 01 00 00 | ...... |
0000c090 82 46 d3 80 00 00 00 32 01 10 61 61 62 62 63 63 |. F... 2... aabbcc |
0000c0a0 20 20 20 19 13 12 11 0a 04 00 00 18 0d 00 74 | ...... t |
0000c0b0 80 00 00 03 00 00 00 82 46 d5 80 00 00 00 32 01 | ...... F ...... 2. |
0000c0c0 10 61 62 63 20 20 20 20 1b 15 13 11 0a 04 00 |. abc ...... |
0000c0d0 00 20 0d 00 b0 80 00 00 00 00 00 82 46 d6 80 | ............ F |
0000c0e0 00 00 00 32 01 10 61 61 62 63 63 20 20 20 20 |... 2... aaabcc |


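This shows that in both the COMPACT and the REDUNDANT format, even with id as the primary key, rows are physically placed on the page in insertion order rather than in id order. The id order is kept logically, through the next-record pointers in the record headers.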
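The difference between physical and logical order can be traced through the next-record pointers of the COMPACT table t3. The Python sketch below is an illustration only. The next-pointer values (00 48, ff cd, ff db) are taken from the dump, the record positions within the page (0x80, 0xa3, 0xc8) are derived by counting bytes in the dump, the ids are assigned from the insertion order because the key bytes in the dump are damaged, and the pointer is assumed to be a signed 16-bit offset relative to the current record.

```python
SUPREMUM = 0x70  # page-relative position of the supremum record in the dump

# page-relative record position -> (id, raw next-record pointer)
RECORDS = {
    0x80: (1, 0x0048),
    0xA3: (3, 0xFFCD),
    0xC8: (2, 0xFFDB),
}


def to_signed16(value):
    """Interpret a 16-bit value as a signed (two's complement) number."""
    return value - 0x10000 if value & 0x8000 else value


def logical_order(first):
    """Follow the next-record pointers until the supremum is reached."""
    order = []
    position = first
    while position != SUPREMUM:
        record_id, next_pointer = RECORDS[position]
        order.append(record_id)
        position += to_signed16(next_pointer)
    return order


physical = [RECORDS[position][0] for position in sorted(RECORDS)]
print(physical)             # [1, 3, 2]
print(logical_order(0x80))  # [1, 2, 3]
```

Reading the page from top to bottom gives the insertion order, while following the pointers gives the id order.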
Now rebuild the table with optimize table:
Lin_ren @ test 10:16:09> optimize table t3;


Table    Op        Msg_type  Msg_text
test.t3  optimize  note      Table does not support optimize, doing recreate + analyze instead
test.t3  optimize  status    OK
After the rebuild, the data is stored in sorted order:
0000c070 73 75 70 72 65 6d 75 6d 02 00 00 00 10 00 23 | supremum ...... # |
0000c080 80 00 00 01 00 00 00 82 5f ff 80 00 00 00 32 01 | ...... _ ...... 2. |
0000c090 10 61 61 62 63 63 20 20 20 04 02 00 00 00 |. aabbcc ...... |
0000c0a0 18 00 25 80 00 00 00 00 00 00 82 5f ff 80 00 00 |... % ...... |
0000c0b0 00 32 01 1d 61 61 62 62 62 63 63 63 20 20 20 20 |. 2 .. aabbbbccc |
0000c0c0 03 00 00 00 00 20 ff a8 80 00 00 00 00 00 82 | ...... |
0000c0d0 5f ff 80 00 00 00 32 01 2a 61 61 61 62 62 62 63 | _ ...... 2. * aaabbbc |
0000c0e0 63 63 20 20 00 00 00 00 00 00 00 00 00 00 | cc ...... |
Delete data, then rebuild the table again:
optimize table t3;
0000c000 6e 94 df 09 00 00 03 ff | n ...... |
0000c010 00 00 00 5a 29 eb 95 68 45 bf 00 00 00 00 00 |... Z) ...... hE ...... |
0000c020 00 00 00 00 16 c9 00 02 00 c0 80 04 00 00 00 | ...... |
0000c030 00 a3 00 02 00 01 00 00 00 00 00 00 00 00 | ...... |
0000c040 00 00 00 00 00 00 00 38 2f 00 00 16 c9 00 00 | ...... 8/...... |
0000c050 00 02 00 f2 00 00 16 c9 00 00 00 02 00 32 01 00 | .......... |
0000c060 02 00 1d 69 6e 66 69 6d 75 6d 00 03 00 0b 00 00 |... infimum ...... |
0000c070 73 75 70 72 65 6d 75 6d 02 00 00 00 10 00 23 | supremum ...... # |
0000c080 80 00 00 01 00 00 00 82 60 50 80 00 00 00 32 01 | ...... 'P... 2. |
0000c090 10 61 61 62 63 63 20 20 20 04 02 00 00 00 |. aabbcc ...... |
0000c0a0 18 ff cd 80 00 00 00 00 00 00 82 60 50 80 00 00 | ........... 'P... |
0000c0b0 00 32 01 1d 61 61 62 62 62 63 63 63 20 20 20 20 |. 2 .. aabbbbccc |
0000c0c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ...... |
This shows that after deleting a large amount of data, we need to run optimize table t3 to release the space.
These are the two InnoDB row formats (COMPACT and REDUNDANT) and the essential differences in how they handle insert, delete, and update.


Differences between the MySQL row formats COMPACT, COMPRESSED, DEFAULT, FIXED, and REDUNDANT

ROW_FORMAT in MySQL

In MySQL, a table that contains no VARCHAR, TEXT (or its variants), or BLOB (or its variants) columns is called a static table: its row_format is FIXED, and every record occupies the same number of bytes. The advantage is fast reads; the disadvantage is wasted space.

A table that does contain VARCHAR, TEXT (or its variants), or BLOB (or its variants) columns is called a dynamic table: its row_format is DYNAMIC, and the number of bytes per record varies. The advantage is that it saves space; the disadvantage is a higher read cost.
Tables that serve a large number of queries are therefore usually designed as static tables, trading space for time.

ROW_FORMAT accepts the following values:
DEFAULT
FIXED
DYNAMIC
COMPRESSED
REDUNDANT
COMPACT

Modifying the row format
alter table table_name ROW_FORMAT = DEFAULT;

Side effects of the change:
FIXED ---> DYNAMIC: CHAR columns become VARCHAR
DYNAMIC ---> FIXED: VARCHAR columns become CHAR

MySQL row formats

The COMPACT row format (the InnoDB default row format) has the following structure:
variable-length field length list, NULL flag, record header, column 1 data, column 2 data, ...
This format was designed to store data efficiently. Every row has two hidden columns, the transaction ID column and the roll pointer column. If no primary key is defined, each row also gets a 6-byte rowid column as the hidden primary key.
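Adding up the parts of this structure gives the size of a COMPACT record. The Python sketch below is an estimate under simplifying assumptions: one length byte per non-NULL variable-length column, one NULL-flag bit per nullable column, a 5-byte header, and no overflow pages. The function and parameter names are made up for illustration.

```python
def compact_record_size(var_lens, fixed_lens, nullable_columns, has_primary_key):
    """Estimate the bytes used by one COMPACT record.

    var_lens:   byte lengths of the non-NULL variable-length values
    fixed_lens: byte lengths of the non-NULL fixed-length values
                (including the primary key columns, if any)
    """
    length_list = len(var_lens)               # one byte per entry (assumed)
    null_flag = (nullable_columns + 7) // 8   # one bit per nullable column
    header = 5
    hidden = 6 + 7                            # transaction ID + roll pointer
    if not has_primary_key:
        hidden += 6                           # hidden rowid
    return length_list + null_flag + header + hidden + sum(var_lens) + sum(fixed_lens)


# Row ('a', 'bb', 'bb', 'ccc') of a table with three VARCHAR(10) columns,
# one CHAR(10) column, four nullable columns and no primary key.
print(compact_record_size([1, 2, 3], [10], 4, has_primary_key=False))  # 44
```

Because a primary key removes the 6-byte rowid but adds its own bytes to fixed_lens, a 4-byte INT key makes each record 2 bytes smaller in this estimate.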

In the COMPRESSED format, row data is compressed with the zlib algorithm, which makes it suitable for storing large values such as BLOB and TEXT.

The reference manual explains the DYNAMIC format as follows:

The DYNAMIC format is based on the idea that if part of a long value has to be stored on an overflow page, it is usually most efficient to store the whole value there. Shorter columns stay in the B-tree node, which minimizes the number of overflow pages needed for any given row. This applies to variable-length fields such as VARCHAR.

The FIXED row format is suitable for static fixed-length types such as CHAR.
