Introduction and Analysis of SQLite (III)-kernel Overview (1)

Source: Internet
Author: User

Preface: Starting with this chapter, we move into the SQLite kernel. To understand SQLite well, I will first describe the kernel's overall structure; seeing SQLite as a whole matters a great deal. The kernel implementation is not especially difficult, but it is not trivial either. Broadly, it has three parts. This chapter covers the virtual machine (VM). It is only an overview of the principles and involves little actual code; the source code will be examined in detail once the kernel has been outlined. Let's begin with the virtual machine.

 

1. Virtual Machine
VDBE (the Virtual Database Engine) is the core of SQLite; the modules above and below it essentially exist to serve it. Its implementation lives in vdbe.c, vdbe.h, vdbeapi.c, vdbeInt.h, and vdbemem.c. It uses the underlying B-tree infrastructure to execute the bytecode generated by the compiler. This bytecode programming language is designed specifically for querying, reading, and modifying databases.
Bytecode is encapsulated in memory as a sqlite3_stmt object (internally called Vdbe; see vdbeInt.h). A Vdbe (or statement) contains everything needed to execute the program:
A) a bytecode program
B) names and data types for all result columns
C) values bound to input parameters
D) a program counter
E) an execution stack of operands
F) an arbitrary amount of "numbered" memory cells
G) other run-time state information (such as open BTree objects, sorters, lists, sets)


Bytecode closely resembles assembly language. Each instruction consists of an operation code and three operands: <opcode, P1, P2, P3>. The opcode identifies the operation to perform; for ease of understanding, think of it as a function. P1 is a 32-bit signed integer. P2 is a 31-bit unsigned integer; it is usually the destination address of a jump instruction, although it has other uses as well. P3 is a pointer to a null-terminated string or to some other structure. Unlike the C API, the VDBE opcodes change often, so you should not write programs directly in bytecode.
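To make the instruction format concrete, here is a minimal sketch in C. All names (ToyOp, ToyOpcode, toy_format_op) and the opcode numbering are hypothetical; they only mirror the <opcode, P1, P2, P3> layout described above and are not SQLite's actual declarations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical opcode numbers; real VDBE opcode values change between releases. */
typedef enum { TOY_Goto, TOY_Integer, TOY_OpenRead, TOY_Halt } ToyOpcode;

/* One instruction: an opcode plus three operands. */
typedef struct {
    ToyOpcode opcode;
    int p1;          /* general-purpose signed integer */
    int p2;          /* non-negative; often a jump destination */
    const char *p3;  /* null-terminated string or other structure; may be NULL */
} ToyOp;

static const char *toy_opcode_name(ToyOpcode op) {
    switch (op) {
        case TOY_Goto:     return "Goto";
        case TOY_Integer:  return "Integer";
        case TOY_OpenRead: return "OpenRead";
        case TOY_Halt:     return "Halt";
    }
    return "?";
}

/* Format one instruction roughly the way an EXPLAIN listing shows it.
** Returns the number of characters snprintf would have written. */
int toy_format_op(char *buf, size_t n, int addr, const ToyOp *op) {
    assert(buf != NULL && op != NULL);
    return snprintf(buf, n, "%d %s %d %d %s", addr,
                    toy_opcode_name(op->opcode), op->p1, op->p2,
                    op->p3 ? op->p3 : "");
}
```

A whole program is then just an array of such structures, and an address is an index into that array.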
The following C APIs directly interact with VDBE:
• sqlite3_bind_xxx() functions
• sqlite3_step()
• sqlite3_reset()
• sqlite3_column_xxx() functions
• sqlite3_finalize()

To get a feel for it, let's look at a concrete bytecode program:
sqlite> .m col
sqlite> .h on
sqlite> .w 4 15 3 15
sqlite> explain select * from episodes;
addr  opcode           p1   p2               p3
----  ---------------  ---  ---------------  ----------
0     Goto             0    12
1     Integer          0    0
2     OpenRead         0    2                # episodes
3     SetNumColumns    0    3
4     Rewind           0    10
5     Recno            0    0
6     Column           0    1
7     Column           0    2
8     Callback         3    0
9     Next             0    5
10    Close            0    0
11    Halt             0    0
12    Transaction      0    0
13    VerifyCookie     0    10
14    Goto             0    1
15    Noop             0    0

1.1 Stack
A VDBE program usually consists of several sections, each completing a specific task. Each section contains some instructions that manipulate the stack. This is because instructions take different numbers of arguments: some take none, some take one, and some take several, in which case the three operands are not enough.
In such cases, the instruction uses the stack to pass arguments. (Note: at the assembly level there are several ways to pass arguments, such as registers and global variables; stacks are the common choice in modern languages because they are very flexible.) These instructions do not put their arguments on the stack themselves, so they rely on earlier instructions to do so. VDBE saves intermediate results in memory cells. In fact, both the stack and the memory cells are built on the Mem data structure (see vdbeInt.h). (Note: the stack and memory cells here are all virtual. One computer scientist said that more than 90% of the problems in computer science are virtualization problems, and that is no exaggeration. An OS is essentially a virtual machine, and virtualization shows up everywhere in SQLite as well. We will discuss this in detail in the later OS Interface module.)
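The idea that the stack and the numbered memory cells share one value type can be sketched as follows. This is a simplified illustration, not the real Mem structure from vdbeInt.h; ToyMem, ToyVm, and the fixed sizes are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the Mem structure. */
typedef enum { TOY_NULL, TOY_INT, TOY_STR } ToyType;

typedef struct {
    ToyType type;
    long long i;     /* valid when type == TOY_INT */
    const char *z;   /* valid when type == TOY_STR */
} ToyMem;

#define TOY_STACK_MAX 16
#define TOY_NMEM 4

typedef struct {
    ToyMem stack[TOY_STACK_MAX];  /* operand stack */
    int top;                      /* number of entries in use */
    ToyMem mem[TOY_NMEM];         /* numbered memory cells, same element type */
} ToyVm;

/* What an Integer-style instruction does: push one value. */
void toy_push_int(ToyVm *vm, long long v) {
    assert(vm->top < TOY_STACK_MAX);
    vm->stack[vm->top].type = TOY_INT;
    vm->stack[vm->top].i = v;
    vm->stack[vm->top].z = NULL;
    vm->top++;
}

/* What a consuming instruction does: take its extra argument off the stack. */
ToyMem toy_pop(ToyVm *vm) {
    assert(vm->top > 0);
    return vm->stack[--vm->top];
}

/* Save an intermediate result: pop the top of the stack into a memory cell. */
void toy_mem_store(ToyVm *vm, int cell) {
    assert(cell >= 0 && cell < TOY_NMEM);
    vm->mem[cell] = toy_pop(vm);
}
```

Because both containers hold ToyMem values, moving a result between the stack and a memory cell is a plain structure copy.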

1.2 Program Body
The program body begins by opening the episodes table.
The first instruction, Integer, prepares for the second: it pushes onto the stack the argument that the second instruction needs. OpenRead then takes the value off the stack and executes. SQLite can use the ATTACH command to open multiple database files in one connection. Whenever SQLite opens a database file, it assigns it an index number: the main database is 0, the database used for temporary tables is 1, and attached databases follow. The Integer instruction pushes the database index onto the stack, and OpenRead pops it to determine which database to open. Here is the explanation from the SQLite documentation:
Open a read-only cursor for the database table whose root page is P2 in a database file. The database file is determined by an integer from the top of the stack. 0 means the main database and 1 means the database used for temporary tables. Give the new cursor an identifier of P1. The P1 values need not be contiguous but all P1 values should be small integers. It is an error for P1 to be negative.
If P2 = 0 then take the root page number from off of the stack.
There will be a read lock on the database whenever there is an open cursor. If the database was unlocked prior to this instruction then a read lock is acquired as part of this instruction. A read lock allows other processes to read the database but prohibits any other process from modifying the database. The read lock is released when all cursors are closed. If this instruction attempts to get a read lock but fails, the script terminates with an SQLITE_BUSY error code.
The P3 value is a pointer to a KeyInfo structure that defines the content and collating sequence of indices. P3 is NULL for cursors that are not pointing to indices.

Next, the SetNumColumns instruction sets the number of columns for the cursor. P1 is the cursor index (0 here, the cursor just opened) and P2 is the number of columns; the episodes table has three.
Then comes the Rewind instruction. It resets the cursor to the beginning of the table and checks whether the table is empty (has no records). If it is empty, the instruction pointer jumps to the instruction given by P2. Here P2 is 10, the Close instruction. Once Rewind has positioned the cursor, instructions 5 through 9 execute; their job is to traverse the result set. Recno pushes the key of the record that cursor P1 points to onto the stack. Column pushes the value of the column given by P2 from the cursor given by P1. Instructions 5, 6, and 7 thus push the values of the id (primary key), season, and name fields onto the stack. Next, the Callback instruction takes three values (P1) off the stack and forms a record array, which is stored in a memory cell. Callback then suspends the VDBE and returns control to sqlite3_step(), which returns SQLITE_ROW.

Once VDBE has created the record structure, we can use the sqlite3_column_xxx() functions to retrieve field values from it. When sqlite3_step() is called the next time, the instruction pointer points to the Next instruction, which moves the cursor to the next row. If there is another record, it jumps to the instruction given by P2, here instruction 5, which builds a new record structure; this repeats until the end of the result set. The Close instruction then closes the cursor, and the Halt instruction ends the VDBE program.
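The scan loop and the way each step suspends at Callback and resumes at Next can be sketched with a toy interpreter. Everything here is an assumption made for illustration: the ScanOp and ScanStmt types, the in-memory integer table standing in for a B-tree cursor, and the renumbered program (addresses 0 to 7 instead of 4 to 11). Unlike the real loop, this sketch advances pc when it fetches an instruction, so a jump simply sets pc to P2.

```c
#include <assert.h>
#include <stddef.h>

enum { S_Rewind, S_Recno, S_Column, S_Callback, S_Next, S_Close, S_Halt };
enum { SCAN_ROW = 100, SCAN_DONE = 101 };  /* same values as SQLITE_ROW, SQLITE_DONE */

typedef struct { int opcode, p1, p2; } ScanOp;

typedef struct {
    const ScanOp *aOp;     /* the program */
    int pc;                /* program counter, preserved between steps */
    const int (*rows)[3];  /* toy table: key, column 1, column 2 */
    int nRow;
    int cur;               /* cursor position */
    int stack[8];
    int top;
    int result[3];         /* current result row, read by the caller */
} ScanStmt;

/* The traversal part of the listing, renumbered from 0. */
static const ScanOp scan_program[] = {
    { S_Rewind,   0, 6 },  /* 0: empty table -> jump to Close */
    { S_Recno,    0, 0 },  /* 1: push the row key */
    { S_Column,   0, 1 },  /* 2: push column 1 */
    { S_Column,   0, 2 },  /* 3: push column 2 */
    { S_Callback, 3, 0 },  /* 4: pop 3 values, hand the row to the caller */
    { S_Next,     0, 1 },  /* 5: more rows -> jump back to 1 */
    { S_Close,    0, 0 },  /* 6 */
    { S_Halt,     0, 0 },  /* 7 */
};

static void scan_push(ScanStmt *s, int v) {
    assert(s->top < 8);
    s->stack[s->top++] = v;
}

/* Run until the next result row (SCAN_ROW) or the end (SCAN_DONE). */
int scan_step(ScanStmt *s) {
    for (;;) {
        const ScanOp *op = &s->aOp[s->pc++];
        switch (op->opcode) {
        case S_Rewind:
            s->cur = 0;
            if (s->nRow == 0) s->pc = op->p2;
            break;
        case S_Recno:  scan_push(s, s->rows[s->cur][0]);      break;
        case S_Column: scan_push(s, s->rows[s->cur][op->p2]); break;
        case S_Callback:
            for (int i = op->p1 - 1; i >= 0; i--) s->result[i] = s->stack[--s->top];
            return SCAN_ROW;             /* pc already points at Next */
        case S_Next:
            if (++s->cur < s->nRow) s->pc = op->p2;
            break;
        case S_Close:
            break;
        case S_Halt:
            s->pc--;                     /* stay on Halt for any later step */
            return SCAN_DONE;
        default:
            return -1;                   /* unknown opcode */
        }
    }
}
```

A caller would invoke scan_step() repeatedly and read result[] after each SCAN_ROW, which loosely mirrors how sqlite3_step() and the sqlite3_column_xxx() functions are used together.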

1.3 Program Startup and Shutdown
Now let's look at the remaining instructions. Goto is a jump instruction; it jumps to P2, that is, instruction 12. Instruction 12 is Transaction, which starts a new transaction. Then VerifyCookie executes: it checks whether the database schema has changed since the VDBE program was compiled. This is a very important concept in SQLite. Between the time sqlite3_prepare() compiles the SQL into VDBE code and the time the program calls sqlite3_step() to execute the bytecode, another SQL command may change the database schema (for example ALTER TABLE, DROP TABLE, or CREATE TABLE). Once that happens, the previously compiled statement becomes invalid. The schema information is recorded on the root page of the database file, and each statement keeps a copy of it taken at compile time. VerifyCookie checks whether the two match; if they do not, appropriate action is taken.
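The cookie comparison can be sketched as below. The types and function names (CookieDb, CookieStmt, cookie_verify, and so on) are hypothetical, and the schema information is reduced to a single counter; only the numeric values of the result codes are taken from SQLite (SQLITE_OK is 0, SQLITE_SCHEMA is 17).

```c
#include <assert.h>
#include <stddef.h>

enum { COOKIE_OK = 0, COOKIE_SCHEMA = 17 };  /* same values as SQLITE_OK, SQLITE_SCHEMA */

/* Current schema cookie, as recorded in the database file. */
typedef struct { unsigned schemaCookie; } CookieDb;

/* A compiled statement remembers the cookie it was compiled against. */
typedef struct {
    const CookieDb *db;
    unsigned cookieAtPrepare;
} CookieStmt;

CookieStmt cookie_prepare(const CookieDb *db) {
    assert(db != NULL);
    CookieStmt s = { db, db->schemaCookie };
    return s;
}

/* Stand-in for any schema change such as CREATE TABLE or DROP TABLE. */
void cookie_alter_schema(CookieDb *db) {
    db->schemaCookie++;
}

/* What a VerifyCookie-style check boils down to in this sketch. */
int cookie_verify(const CookieStmt *s) {
    return s->cookieAtPrepare == s->db->schemaCookie ? COOKIE_OK : COOKIE_SCHEMA;
}
```

In this model a statement compiled before cookie_alter_schema() fails the check, while one compiled afterwards passes.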

If the two match, the next instruction, Goto, executes; it jumps to the main part of the program, instruction 1, which opens the table and reads records. Two points are worth noting:
(1) The Transaction instruction does not acquire any locks itself. It is equivalent to BEGIN; the shared lock is actually acquired by the OpenRead instruction. The lock is released when the transaction closes, which depends on the Halt instruction, which does the cleanup.
(2) The storage a statement object (VDBE program) needs is determined before the program executes. This rests on two facts: first, the stack depth never exceeds the number of instructions (and is usually much smaller); second, SQLite can compute the memory it needs to allocate before running the VDBE program.
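Point (2) suggests that the operand stack can be allocated once, up front, using the instruction count as an upper bound. The sketch below illustrates that idea only; PreallocVm and its functions are hypothetical, and it assumes each instruction pushes at most one value.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int nOp;            /* number of instructions in the program */
    long long *aStack;  /* operand stack, allocated once before execution */
    int nStack;         /* capacity of aStack */
} PreallocVm;

/* Size the stack from the program length before running anything.
** Assumption: one push per instruction at most, so nOp entries suffice.
** Returns 0 on success, -1 on bad input or allocation failure. */
int prealloc_vm_init(PreallocVm *vm, int nOp) {
    assert(vm != NULL);
    if (nOp <= 0) return -1;
    vm->aStack = calloc((size_t)nOp, sizeof *vm->aStack);
    if (vm->aStack == NULL) return -1;
    vm->nOp = nOp;
    vm->nStack = nOp;
    return 0;
}

void prealloc_vm_free(PreallocVm *vm) {
    free(vm->aStack);
    vm->aStack = NULL;
    vm->nStack = 0;
}
```

With the capacity fixed in advance, the execution loop never needs to grow the stack while the program runs.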

1.4 Instruction Types
Each instruction completes a specific task and is usually related to other instructions. Broadly, instructions fall into three types:
(1) Value manipulation: these instructions perform arithmetic operations such as add, subtract, and divide; logical operations such as AND and OR; and string operations.
(2) Data management: these instructions operate on data in memory and on disk. Memory instructions perform stack operations or move data between memory cells. Disk instructions drive the B-tree and pager to open or manipulate cursors, begin or end transactions, and so on.
(3) Control flow: these instructions mainly move the instruction pointer.

1.5 Program Execution
Finally, let's look at how the VM interpreter is implemented and, roughly, how the bytecode is executed. The vdbe.c file contains a key function:
/* Run the VDBE program */
int sqlite3VdbeExec(
  Vdbe *p                    /* The VDBE */
)
This function is the entry point for running a VDBE program. Let's look at its internal implementation:

/* Execute instructions starting here.
** pc is the program counter (an int).
*/
for(pc=p->pc; rc==SQLITE_OK; pc++){
  /* Fetch the current instruction */
  pOp = &p->aOp[pc];
  switch( pOp->opcode ){
    case OP_Goto: {             /* jump */
      CHECK_FOR_INTERRUPT;
      pc = pOp->p2 - 1;
      break;
    }
    ... ...
  }
}
From this code we can roughly infer how the VM executes: the interpreter is a for loop wrapped around one large switch statement, and each case of the switch implements one opcode. Note that Goto sets pc to P2 - 1 because the loop's pc++ then lands on P2.
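A self-contained version of the same loop shape, reduced to four made-up opcodes, shows the dispatch and the "P2 - 1" jump in action. LoopOp, loop_exec, and the opcode names are assumptions for this sketch and do not come from vdbe.c.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { L_Integer, L_Add, L_Goto, L_Halt } LoopOpcode;
typedef struct { LoopOpcode opcode; int p1, p2; } LoopOp;

#define LOOP_STACK_MAX 16

/* Run a program. On Halt, store the top of the stack (or 0 if the stack
** is empty) in *pResult and return 0. Return 1 on a stack error or if
** execution runs off the end of the program. */
int loop_exec(const LoopOp *aOp, int nOp, int *pResult) {
    int stack[LOOP_STACK_MAX];
    int top = 0;
    int rc = 0;
    int pc;
    assert(aOp != NULL && pResult != NULL);
    for (pc = 0; rc == 0 && pc >= 0 && pc < nOp; pc++) {
        const LoopOp *pOp = &aOp[pc];
        switch (pOp->opcode) {
        case L_Integer:                      /* push P1 */
            if (top >= LOOP_STACK_MAX) { rc = 1; break; }
            stack[top++] = pOp->p1;
            break;
        case L_Add:                          /* pop two values, push their sum */
            if (top < 2) { rc = 1; break; }
            top--;
            stack[top - 1] += stack[top];
            break;
        case L_Goto:                         /* the loop's pc++ lands on P2 */
            pc = pOp->p2 - 1;
            break;
        case L_Halt:
            *pResult = top > 0 ? stack[top - 1] : 0;
            return 0;
        }
    }
    return 1;
}
```

For example, a program of Integer 2, Goto 3, Integer 100, Integer 40, Add, Halt skips the third instruction and halts with 42 on top of the stack.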

 
