This analysis is called a simple version, because we plan to analyze a more complex database file later to gain a deep understanding of the structure implemented by the B-tree of the SQLite database. It is a good learning method from a simple one, the simple version file here refers to the size of 2 K bytes, that is, 1 K bytes for each B-tree page. There are two B-tree pages in total, here the B-tree page refers to the B-Tree node mentioned in the classical data structure book. Here it is called a page because SQLite uses the page concept to organize B-tree.
The creation method is as follows:
Create Table tbl1 (one varchar (10), two varchar (10 ));
Insert into "tbl1" values ('first', 'xxx ');
Insert into "tbl1" values ('second', 'yyy'); then exit and use ultraedit to open the database file:
00000000 H: 53 51 4C 69 74 65 20 66 6f 72 6D 61 74 20 33 00; SQLite Format 3.
00000010 H: 04 00 01 01 00 40 20 20 00 00 00 07 00 00 00 ;.....@........
00000020 H: 00 00 00 00 00 00 00 00 00 00 00 01 ;................
00000030 H: 00 00 00 00 00 00 00 00 00 00 00 00 ;................
00000040 H: 00 00 00 00 00 00 00 00 00 00 00 00 ;................
00000050 H: 00 00 00 00 00 00 00 00 00 00 00 00 ;................
00000060 H: 00 00 00 00 0d 00 00 00 01 03 B8 00 03 B8 00 00 00 ;..........?.?.
00000070 H: 00 00 00 00 00 00 00 00 00 00 00 00 ;................
All the middle parts are zero. Save!
000003a0h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................
000003b0h: 00 00 00 00 00 00 00 46 01 06 17 15 15 01 71; ...... f ...... Q
000003c0h: 74 61 62 6C 65 74 62 6C 31 74 62 6C 31 02 43 52; tabletbl1tbl1. cr
000003d0h: 45 41 54 20 54 41 42 4C 45 20 74 62 6C 31 28; Eate table tbl1 (
000003e0h: 6f 6e 65 20 76 61 72 63 68 61 72 28 31 30 29 2c; one varchar (10 ),
000003f0h: 74 77 6f 20 61 72 63 68 61 72 28 31 30 29; two varchar (10) This is the first B-tree page, this B-tree page stores the sqlite_master information, which is the system table of the SQLite database.
The following analyzes the specific meanings of these binary values. SQLite uses the big-end method to represent data, which is different from the small-end method of general Intel machines:
Meaning of offset address size
0 16 "SQLite Format 3 \ 000"
16 2 400 h = 1024 bytes, the number of bytes per page
18 2 0101h indicates the version.
20 1 unused space at the end of each page. Here, zero indicates that data is stored from the last byte of each page.
21: the maximum number of slice parts. Similar to IP slice, one page cannot be saved and must be sharded.
22 1 minimum slice count
23 1 minimum number of leaf load partitions
24 4 File Modification count for parallel access
28 4 reserved unused
32 4 first freelist page
36 4. Number of freelist pages in the file
40 60 not used here
The above one hundred bytes are called the file header of the database file. This file header is available only on the first B-tree page. The structure of each B-tree page is not found, and the structure of each page is the same:
In sequence: B tree header structure, B tree pointer structure, unused space, B tree actual data load.
Here is a little different from the B-tree structure in the classical data structure book. The purpose here is to facilitate practical application, and the purpose of the book's structure is to explain the principles of B-tree. So there are some differences:
Generally, the structure of a B-tree page in the book is pointer, Data,..., pointer
SQLite is organized into pointer,..., pointer, Data,... data.
On the first page, the structure of the B-tree header starts from the fifth byte of row h:
Meaning of offset address size
0 1 0dh = 1101b meaning 1: intkey, 2: zerodata, 4: leafdata, 8: leaf
1 2 the byte offset of the first idle block, which is 0
3 2 01: This B-tree page contains one record, that is, only one record is stored in the system table, because only one table tbl1 is created.
5 2. The first address of the load area, 03b8. You can see that the 46 in line 000003b0h is the beginning of the load area.
7: the number of parts. The data here is small, so it is 0.
At the offset of 0000006bh, the number of BITs ends, followed by the number of B pointer structure. Here there is only one item, and there is only one pointer 03b8h.
The actual data of the sqlite_master table is changed from the offset of 000003b8h to the end. Of course, the data is structured. 46h indicates that the record contains 70 bytes. Except for its 46 bytes, and 01 is an index, the entire record is exactly 70 bytes, and 01 is followed by payload data.
The second Tree B page is listed below:
00000400 H: 0d 00 00 00 02 03 E5 00 03 F3 03 E5 00 00 00 00 ;......?.??...
00000410 H: 00 00 00 00 00 00 00 00 00 00 00 00 ;................
All the intermediate parts are zero. Save!
000007d0h: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ;................
000007e0h: 00 00 00 00 0C 02 03 19 13 73 65 63 6f 6e 64; ...... second
000007f0h: 79 79 0b 01 03 17 13 66 69 72 73 74 78 78 78; YYY ...... firstxxx
Because it is not the first page, there is no 100 bytes of the file header. At the beginning, it is the B-tree header structure. Here there are two pointers 03f3 and 03e5, and the others are the same as above. The entire database management system precisely manages this file.
Further work: only when there is more data can we see the benefits of B-tree organization: search, delete, and increase quickly! Increase this file and analyze it again!