Data storage structure of MysqlMYISAM storage engine

Source: Internet
Author: User
MYSQL internal data storage has not been well documented. I am free to study it. [@ more @] 1. analyze row data with single fields: 1.1 create test data droptableifexistsheyf_5; createtableheyf_5 (namevarchar (50) typemyisamdefadefacharsetlatin1; insertintoheyf_5value

MYSQL internal data storage has not been well documented. I am free to study it. [@ more @] 1. use a single field to analyze row data: 1.1 create test data drop table if exists heyf_5; create table heyf_5 (name varchar (50) type myisam default charset = latin1; insert into heyf_5 value

There has been no good documentation for the internal data storage of MYSQL. I am free to study it. [@ more @]

1. Use a single field to analyze row data:

1.1 create Test Data
Drop table if exists heyf_5;
Create table heyf_5 (name varchar (50) type myisam default charset = latin1;
Insert into heyf_5 values ('A'), ('B'), ('C ');

[Root @ alisoft-test1 test] # system hexdump/opt/mysql/data/test/heyf_5.MYD
--------------------------------------------------
0000000 0003 0d03 01fe 0061 0000 0000 0000
0000010 0000 0000 0003 0d03 01fe 0062 0000 0000
0000020 0000 0000 0000 0000 0d03 01fe 0003
0000030 0000 0000 0000 0000 0000 0000
000003c


1.2 start data analysis

ROW1: 0003 0d03 01fe 0061 0000 0000 0000 0000 0000
ROW2: 0003 0d03 01fe 0062 0000 0000 0000 0000 0000
ROW3: 0003 0d03 01fe 0063 0000 0000 0000 0000 0000 0000
------------------------------

We use the first line of data for analysis:
Because the data is stored in the hard disk, the dual-byte is stored first at the low level and then after the high level. So when we read the data, we need to reverse:

ROW1: 03 00 03 0d fe 01 61 00...

Where:
---------------------------------------------
03: start of header-Block type, see mi_dynrec.c, _ mi_get_block_info ()
00 03: actual length (varchar storage space is; actual character length + 2)
0d: usused length
Fe: flags (0 indicates not empty, 1 indicates empty)
01: the actual data length of the field in this row (only available for variable-length fields)
61: Actually stored data value: 'A'
00: unused space (including all 00 before the next row)
---------------------------------------------

1.3 Let's verify the above assumptions:

Update heyf_5 set name = 'aaaaa' where name = 'a ';
Update heyf_5 set name = null where name = 'B ';
[Root @ alisoft-test1 test] # system hexdump/opt/mysql/data/test/heyf_5.MYD
-------------------------------------------------
0000000 0003 0808 06fe 6161 6161 6161 0000
0000010 0000 0000 0003 0e02 00ff 0000 0000 0000
0000020 0000 0000 0000 0000 0d03 01fe 0003
0000030 0000 0000 0000 0000 0000 0000


Fetch 1st rows of data:
0003 0808 06fe 6161 6161 6161 0000 0000 0000
Convert:
03 00 08 08 fe 06 61 61 61 61 61 00 00 00 00 00 00

Where:
------------------------------------
03: start of header-Block type
00 08: Actual character length = 6 + 2 B
08: unused space: 8 B
Fe: flags (0 indicates not null, 1 indicates null)
06: the actual data length of this field in this row is 6
61 61: actual storage value: 'A'
00: unused space (8 in size)


Fetch 2nd rows of data:
0003 0e02 00ff 0000 0000 0000 0000 0000 0000
Convert:
03 00 02 0e ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Where:
------------------------------------
03: start of header-Block type
00 02: Actual character length = 0 + 2 B
0e: unused space: 14 B
Ff: --> 1111 1111 because there is only one field, the flag is represented by the last 1 bit, and 1 is represented as NULL.
00: the actual data length of this field in this row is 0.
00...: unused space (exactly 14)


Analysis of 1.4 rows of links

As shown above, during row initialization, the system reserves 14 bytes for the VARCHAR field.
If our field value is within the range of 14 characters, it will not overflow. Let's take a look at the overflow situation (that is, the line link is required:

Update heyf_5 set name = 'bbbbbaaaaabbbbbaaaa' where name is null;
Update heyf_5 set name = 'cccccdddddcccdddddzzzzzee 'where name = 'C ';
System hexdump/opt/mysql/data/test/heyf_5.MYD
---------------------------------------------
0000000 0003 0808 06fe 6161 6161 6161 0000
0000010 0000 0000 0005 0016 0007 0000 0000
0000020 fe3c 6214 6262 6262 0005 001b 0007 0000
0000030 0000 0000 fe50 6319 6363 6363 0009 010f
0000040 6161 6161 6261 6262 6262 6161 6161
0000050 0009 0216 6464 6464 6364 6363 6363
0000060 6464 7a64 7a7a 7a7a 6565 0000

We updated the data of the second and third rows above:
ROW2: 0005 0016 0007 0000 0000 0000 fe3c 6214 6262 6262
ROW3: 0005 001d 0007 0000 0000 fe50 631b 0000 6363

At the same time, we can see that there is other data behind the address 3C:
0000030 0009 010f
0000040 6161 6161 6261 6262 6262 6161 6161
0000050 0009 010f 6464 6464 6364 6363 6363
0000060 6464 0064
Here we will go back to the analysis:


Let's first convert the row data:
ROW2: 05 00 16 00 07 00 00 00 00 00 3c fe 14 62 62 62 62
ROW3: 05 00 1d 00 07 00 00 00 00 00 00 50 fe 1b 63 63 63 63

ROW2 data, where:
--------------------------------------------------------
05: the row link type will change to 05
00 16: Actual character length = 20 + 2 = 22
00: unused space: 0 B
07 00 00 00 00 00 00 3c:
This 5-byte address is actually the link address of the row of data.
ROW2 is 3C, and ROW3 is 50

Fe: flags (0 indicates not null, 1 indicates null)
14: The actual data length of this field in this row: 20
5: 62/63: the actual storage value: 'bbbbbb'/'ccccccc'
---------------------------------------------------------

Here we can see that ROW2 only has 'bbbbbb', so where are other data?
MYSQL needs to be linked Based on the type, and the link address is: 07 00 00 00 00 00 3c

We read the linked block data from 3C:
0009 010f 6161 6161 6261 6262 6262 6161 6161
Convert:
09 00 0f 01 61 61 61 61 61 62 62 62 62 61 61 61 61 61 00

Where:
--------------------------------------------------------
09: or type: indicates that this is linked data.
00 0f: the actual length of the segment is 15
01: number of characters in idle Space
61 61 61 61 61 62 62 62 61 61 61 61 61: specific value
00: idle Space
---------------------------------------------------------


Similarly, let's take a look at the link data of ROW3: (starting from 50)
0009 0216 6464 6464 6364 6363 6363 6464 6464 7a64 7a7a 7a7a 6565 0000
Convert:
09 00 16 02 'dddddcccdddddzzzzzee '00 00

Where:
--------------------------------------------------------
09: or type: indicates that this is linked data.
00 16: the actual length of the segment is 22.
02: unused space: 2 B
...: Is the specific value and 00
---------------------------------------------------------

2. Use multiple fields to analyze row data:

Drop table if exists heyf_5;
Create table heyf_5 (id int, name varchar (50), id1 int) type myisam;
Insert into heyf_5 values (100, 'aaa', 3), (2, 'bb', 12), (3, 'C', 4 );
System hexdump/opt/mysql/data/test/heyf_5.MYD
========================================================== ============
0000000 0003 020e f800 0064 0000 6103 6161
0000010 0000 0000 0003 030d f800 0002 0000 6202
0000020 0c62 0000 0000 0000 0003 040c f800 0003
0000030 0000 6301 0004 0000 0000 0000
========================================================== ============

ROW1: 0003 020e f800 0001 0000 6103 6161 0003 0000
Convert:
03 00 0e 02 00 f8 01 00 00 00 03 61 61 61 03 00 00 00 00 00

Where:
-----------------------------------------
03: type
00 0e: 13 = 4 (int) + [3 + 2 (varchar)] + 4 (int)
02: two free Spaces
00 f8: 1111 1000 --> indicates three fields, not NULL
The specific data is as follows:
01 00 00 00: the INT 4B value is 1
03 61 61 61: because the length is longer, 03 indicates the length of the value, followed by the specific value
03 00 00 00: the INT 4B value is 3
-----------------------------------------

3. Summary:

3.1 if it is a new line:

Each line should be allocated with at least 14 bytes + the first 6 = 20 B. If more than 14 B are allocated, it should be allocated as needed.
The row data format is:
---------------------------------------------
03 (LEN: 1B): start of header-Block type, see mi_dynrec.c, _ mi_get_block_info ()
00 03 (LEN: 2B): actual length (varchar storage space is; actual character length + 2)
0d (LEN: 1B): number of unused bytes in the space allocated to the row
00 fe (LEN: XB): flags (0 indicates not empty, 1 indicates empty) (here is a byte or two bytes, as defined by. FRM)

-- Data:
-- The first field:
01 (LEN: XB): If it is of the variable length type, 1-N fields will be included before each field data starts to indicate the actual length of the field data,
If it is a fixed-length data type, this identifier is not found.
DATA (LEN: XB): Actually stored value: 'A'. The length depends on the second identifier.
-- The second field:
...
-- The third field:
...
00 (LEN: XB): unused space (length depends on the third identifier)
---------------------------------------------


3.2 If the row is updated and the link is generated:

3.2.1 The format of row data is as follows:
--------------------------------------------------------
05 (LEN: 1B): the row link type will change to 05
00 16 (LEN: 1B): Actual character length = 20 + 2 = 22
00 (LEN: 1B): unused space: 0 B
07 00 00 00 00 00 00 3c (LEN: 9B): link address of the row data
Fe (LEN: XB): flags (0 indicates not null, 1 indicates null)

-- Data:
-- The first field:
14 (LEN: XB): If it is of the variable length type, 1-N fields will be included before each field data starts to indicate the actual length of the field data,
If it is a fixed-length data type, this identifier is not found.
DATA (LEN: XB): Actually stored value: 'A'. The length depends on the second identifier.
-- The second field:
...
---------------------------------------------------------

3.2.2 and the data format of the link part is:
--------------------------------------------------------
09: Type: indicates that this is linked data.
00 16: the actual length of the paragraph
02: unused space allocated to the space of the link row: 2 B
-- Data:
The first field:
14 (LEN: XB): If it is of the variable length type, 1-N fields will be included before each field data starts to indicate the actual length of the field data,
If it is a fixed-length data type, this identifier is not found.
DATA (LEN: XB): Actually stored value: 'A'. The length depends on the second identifier.
Second Field:
...
00: unused space
XXX: Next line
---------------------------------------------------------

3.3 rows deleted:
If the row is deleted, the TYPE of the row is 00, and the data part is set to FF;

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.