InnoDB and validation of storage of null values in physical rows

Source: Internet
Author: User
Tags compact

Objective

Want to write this article, because before want to write a tool to parse InnoDB ibd file, in the process of writing this tool, found that the logical record to physical record conversion, the most difficult to have two parts, one is each row of Null value per field of bytes and storage, and the second is the variable length of the field occupies bytes and stored format. This article focuses on the first case. In the second case, we will write a special article
Before looking at Jiang Chengyao's "InnoDB Storage Engine" 103 page describes the compact line record format:

The second part after the variable-length field is the null flag bit, which indicates whether there is a null value in the row data, and the other is represented by 1. This section occupies 1 bytes of bytes

Then think about whether or not the number of columns is null, and that part only accounts for 1 bytes?
We have the following tests

This article conventions

Logical record: Record (tuple)
Physical record: Row (ROW)
Only compact line format is discussed

The tools used

Own Python-written tools innodb_extract

Test data table Structure
 localhost.test>desc null_test;+------------------+--------------+------+-----+---------+-------------- --+| Field | Type | Null | Key | Default | Extra |+------------------+--------------+------+-----+---------+----------------+| ID | bigint (20) | NO | PRI | NULL | auto_increment | 
| name | varchar (20) | YES | | NULL | |
| legalname | varchar (25) | YES | | NULL | |
| industry | varchar (10) | YES | | NULL | |
| province | varchar (10) | YES | | NULL | |
| city | varchar (15) | YES | | NULL | |
| size | varchar (15) | YES | | NULL | |
| admin_department | varchar (128) | YES | | NULL | | +------------------+--------------+------+-----+---------+----------------+8 rows in Set (0.00 sec)
In-table data
+----+------+-----------+----------+----------+------+------+------------------+| id | name | legalname | industry | province | city | size | admin_department |+----+------+-----------+----------+----------+------+------+------------------+|  1 | NULL | NULL      | NULL     | NULL     | NULL | NULL | NULL             |
| 2 | TOM | NULL | NULL | NULL | NULL | NULL | NULL |
| 3 | ALEX | NULL | NULL | NULL | NULL | NULL | HR | +----+------+-----------+----------+----------+------+------+------------------+3 rows in set (0.00 sec)
Analyze data

See three of rows of data with tools

#  python innodb_extract.py null_test.ibdinfimum7f 000010001c 8000000000000001 0000f1e27b17 b50000016800841          7e 0000180020 8000000000000002 0000f1e27b17 b5000001680094 544f4d2   TOM       3e 000020ffb6 8000000000000003 0000f1e27b17 b50000016800a4 414c4558 48523   ALEX      

First line:
Null flag bit: 0x7f (01111111)
Description: Write in right-to-left direction, total 7 null values
Record header:000010001c
Transaction Id:0000f1e27b17
Roll pointer:b5000001680084
Data:

Second line:
Null flag bit: 0x7e (01111110)
Description: Except for the second column, the rest are null values
Record header:0000180020
Transaction Id:0000f1e27b17
Roll pointer:b5000001680084
Data:
Second column: 544f4d TOM

Third line:
Null flag bit: 0x7e (00111110)
Description: Except for columns 2nd and 8th, the rest are null values
Record Header:000020ffb6
Transaction Id:0000f1e27b17
Roll pointer:b5000001680084
Data:
Second column: 414c4558 = ALEX
Eighth column: 4852 + = HR

Assume

Continue above, what if the field containing the null value is 8, or 9?

Depth analysis

Code snippet, which converts a physical record into a logical record, version 5.5.31, source file rem0rec.c,

Rec_convert_dtuple_to_rec_comp (/*===========================*/rec_t* Rec,/*!< in:origin of record */ Const dict_index_t* Index,/*!< in:record descriptor */const dfield_t* fields,/*!< In:array of data F Ields */ulint n_fields,/*!< In:number of data fields */ulint status,/*!< in:status bi TS of the record */Ibool temp)/*!< in:whether to use the format for temporary file    s in index creation */{const dfield_t* field;    const dtype_t* type;    Byte* end;    byte* nulls;    Byte* Lens;    Ulint Len;    Ulint i;    Ulint N_node_ptr_field;    Ulint Fixed_len;    Ulint null_mask = 1;    Ut_ad (Temp | | dict_table_is_comp (index->table));    Ut_ad (N_fields > 0);        if (temp) {Ut_ad (status = = Rec_status_ordinary);        Ut_ad (n_fields <= dict_index_get_n_fields (index)); N_node_ptr_field = ulint_undefined;        nulls = rec-1; if (Dict_table_is_comp (index->table)) {/* No need to do adjust fixed_len=0. We only have need to adjust it for row_format=redundant.        */temp = FALSE;        }} else {nulls = rec-(rec_n_new_extra_bytes + 1); Switch (univ_expect (status, Rec_status_ordinary)) {case Rec_status_ordinary:ut_ad (n_fields <= dict_            Index_get_n_fields (index));            N_node_ptr_field = ulint_undefined;        Break            Case Rec_status_node_ptr:ut_ad (N_fields = = Dict_index_get_n_unique_in_tree (index) + 1);            N_node_ptr_field = n_fields-1;        Break            Case Rec_status_infimum:case Rec_status_supremum:ut_ad (n_fields = = 1);            N_node_ptr_field = ulint_undefined;        Break            Default:ut_error;        Return    }} end = rec; LENS = Nulls-ut_bits_in_bytes (index->n_nullable);         /* Clear the Sql-null flags */memset (lens + 1, 0, nulls-lens);

In combination with the compact row format:

row记录格式如下:|--------extra_size--------------------------------|---------fields_data------------||-columns_lens-|-null lens-|---fixed_extrasize(5)--|--col1---|---col2---|---col2----||end<-----begin|end<--beign|-----------------------|orgin---------------------------|
    • First look at nulls = rec-(rec_n_new_extra_bytes + 1) rec for the record start offset, that is, extrasize is the length of the fixed-length record header. Note the NULL flag bit and the variable-length field length list are written from the left-to-right direction (for reasons see the next section of code). So nulls points to the null lens position where the next byte begins.
    • Look Again lens = Nulls-ut_bits_in_bytes (index->n_nullable) index->n_nullable refers to the number of fields in the table structure that define can be null, a field is marked with a bit, Ut_bits_in_bytes the number of bytes that will occupy the bit. So lens points to the position of a byte behind the Column_lens, that is, skipping the space occupied by the null flag, as well as writing the value from the back to the front.
    • memset (lens + 1, 0, Nulls-lens) will nulls space zeroed.

This is followed by iterating through each of the fields, first defining the can be null field for processing

/* Store the data and the offsets */    for (i = 0, field = fields; i < n_fields; i++, field++) {        const dict_field_t* ifield;        type = dfield_get_type(field);        len = dfield_get_len(field);        if (UNIV_UNLIKELY(i == n_node_ptr_field)) {            ut_ad(dtype_get_prtype(type) & DATA_NOT_NULL);            ut_ad(len == REC_NODE_PTR_SIZE);            memcpy(end, dfield_get_data(field), len);            end += REC_NODE_PTR_SIZE;            break;        }        if (!(dtype_get_prtype(type) & DATA_NOT_NULL)) {            /* nullable field */            ut_ad(index->n_nullable > 0);            if (UNIV_UNLIKELY(!(byte) null_mask)) {                nulls--;                null_mask = 1;            }                        

Because the direction is written from right to left, that is, from backwards, if the field is NULL, the NULL flag bit is set to 1 and the 1 bits are moved forward, if 8 is full, that is, 8 fields are null, offset shifts 1 bits to the left and Null_mask to 1

From this code to see the previous conjecture, that is, it is not a null flag bit only fixed occupies 1 bytes, but in 8 units, 8 null field is more than 1 bytes, less than 8 also occupy 1 bytes, high with 0 fill

            ut_ad(*nulls < null_mask);            /* set the null flag if necessary */            if (dfield_is_null(field)) {                *nulls |= null_mask;                null_mask <<= 1;                continue;            }            null_mask <<= 1;        }        

This code is to set the mapping of the null field to the null flag bit, and if the fields are null, set the flag bit to 1. This article is no longer detailed, and the analysis of the article of the variable-length field

Chestnut verification

Turn around and look at the previous example, we gradually add the field and set the default NULL to see the change of the null flag bit

    • Step 1, add two and set default null
localhost.test>alter table null_test add column `kind` varchar(15) DEFAULT NULL after `size`;Query OK, 3 rows affected (0.09 sec)Records: 3  Duplicates: 0  Warnings: 0localhost.test>alter table null_test add column licenseno varchar(15) DEFAULT NULL after `kind`;Query OK, 3 rows affected (0.11 sec)Records: 3  Duplicates: 0  Warnings: 0.11

So theoretically, the first row of data has 9 null columns. After 8 null columns, continue to move left, write 1 bit and start occupying two bytes. After we parse through the tools, we look at

#  python innodb_extract.py null_test.ibd01ff 000010001d 8000000000000001 0000f1e27c81 980000028c00841            01fe 0000180021 8000000000000002 0000f1e27c81 980000028c0094 544f4d2   TOM         00fe 000020ffb3 8000000000000003 0000f1e27c81 980000028c00a4 414c4558483   ALEX        

The first line of the null flag bit becomes 0x01ff, that is, 00000001 11111111 there are altogether 9 null fields, after 8 bits are full, continue to forward 1 bytes from right to left to continue writing
In the same vein, the second line is 0X01FE00000001 11111110
The third line 0x00fe,00000000 11111110

Continue adding 8 fields and setting default null

Localhost.test>desc null_test;+------------------+--------------+------+-----+---------+----------------+| Field | Type | Null | Key | Default | Extra |+------------------+--------------+------+-----+---------+----------------+| ID | bigint (20) | NO | PRI | NULL | auto_increment |
| name | varchar (20) | YES | | NULL | |
| Legalname | varchar (25) | YES | | NULL | |
| Industry | varchar (10) | YES | | NULL | |
| Province | varchar (10) | YES | | NULL | |
| City | varchar (15) | YES | | NULL | |
| Size | varchar (15) | YES | | NULL | |
| Kind | varchar (15) | YES | | NULL | |
| Licenseno | varchar (15) | YES | | NULL | |
| admin_department | varchar (128) | YES | | NULL | |
| Null_col1 | varchar (15) | YES | | NULL | |
| Null_col2 | varchar (15) | YES | | NULL | |
| Null_col3 | varchar (15) | YES | | NULL | |
| Null_col4 | varchar (15) | YES | | NULL | |
| Null_col5 | varchar (15) | YES | | NULL | |
| null_col6 | varchar (15) | YES | | NULL | |
| Null_col7 | varchar (15) | YES | | NULL | |
| Null_col8 | varchar (15) | YES | | NULL | | +------------------+--------------+------+-----+---------+----------------+18 rows in Set (0.00 sec)

The first row of the most null field currently has a 17 null field, corresponding to 17 null bit

#  python innodb_extract.py null_test.ibd01ffff 000010001e 8000000000000001 0000f1e27cce c60000017600840301fffe00001                    01fffe 0000180022 8000000000000002 0000f1e27cce c6000001760094 544f4d2   TOM                 01fefe 000020ffb0 8000000000000003 0000f1e27cce c60000017600a4 414c45 58483   ALEX        HR         

The first line of the null flag bit becomes 0x01ff, that is, 00000001 11111111 11111111 there are altogether 17 null fields, after two 8-bit full, continue to forward for 1 bytes from right to left to continue writing
In the same vein, the second line is 0X01FE00000001 11111111 11111110
The third line 0x00fe,00000001 11111110 11111110

Conclusion

Fields that allow nulls require additional space to hold the corresponding relationship of the field NULL to the NULL flag bit mapping, so the NULL flag bit length that holds the mapping relationship is not fixed. That is, the more null fields are not more space-saving. field where can be null should be minimized in the actual production environment

The variable-length field in the physical row is then specifically described in how to store the

InnoDB and validation of storage of null values in physical rows

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.