Explain the process and sequence of Oracle SQL statement execution.
Step 1: the client sends the statement to the server for execution
When we execute an SQL statement on the client, the client sends the SQL statement to the server and allows the server process to process the statement. In other words, the Oracle client does not perform any operations. Its main task is to send some SQL statements generated by the client to the server. After a server process receives information from a user process, it needs to allocate the required memory to the PGA to store relevant information, such as the logon information stored in the session memory.
Although there is also a database process on the client, the role of this process is different from that on the server, the database process on the server can process SQL statements. However, it should be noted that the client processes correspond to the server processes one by one. That is to say, after the client connects to the server, a process is formed between the client and the server. The client is called the client process, and the server is called the server process.
Step 2: Statement Parsing
After the client sends an SQL statement to the server, the server process parses the statement. This resolution is performed on the server, and the resolution actions can be divided into many small actions.
1) query high-speed cache (library cache)
When the server process receives the SQL statement sent from the client, it does not directly query the database. The server process converts the character of this SQL statement into an ASCII equivalent digital code. Then, the ASCII code is passed to a HASH function and a hash value is returned, then, the server process will go to the library cache (high-speed cache) in the shared pool to check whether the same hash value exists. If this statement exists, the server process uses the analyzed version of the statement that has been cached in the shared pool library cache for execution, saving the need for subsequent parsing. This is soft parsing. If the adjustment cache does not exist, you need to perform the subsequent steps, which is hard parsing. Hard Parsing is usually an expensive operation, which takes about 70% of the total SQL Execution time. Hard parsing will generate execution trees, execution plans, and so on.
Therefore, using high-speed data cache can improve the query efficiency of SQL statements. There are two reasons: on the one hand, reading data from the memory is more efficient than reading data from data files on the hard disk, and on the other hand, it saves time to avoid parsing statements.
However, you must note that this data cache is different from the data cache of some client software. Some client software sets data cache on the client of the application software to improve query efficiency. Because of the existence of the data cache, the query efficiency of the client application software can be improved. However, if others make relevant modifications on the server, the modified data cannot be promptly reflected on the client due to the existence of the application software data cache. From this we can see that the data cache of the application software is not the same as the high-speed data cache of the database server.
2) Statement validity check (data dict cache)
When the corresponding SQL statement cannot be found in the cache, the server process starts to check the validity of this statement. Here we mainly check the syntax of SQL statements to see if they comply with the syntax rules. If the server process determines that the SQL statement does not comply with the syntax rules, the error message is returned to the client. During this syntax check, the table name and column name contained in the SQL statement are not checked, but the syntax is checked.
3) Check language meaning (data dict cache)
If the SQL statement meets the syntax definition, the server process will parse the tables, indexes, views, and other objects involved in the statement, check the names and structures of these objects against the data dictionary to see if these fields, tables, and views are in the database. If the table name and column name are inaccurate, the database reports the error information to the client.
Therefore, when we write a select statement, if the syntax is wrong with the table name or column name, the system first prompts a syntax error, when the syntax is correct, the system prompts that the column name or table name is incorrect.
4) Get the object resolution lock (control structer)
When the syntax and semantics are correct, the system locks the objects to be queried. This is mainly to ensure data consistency and prevent other users from changing the structure of this object during the Query Process.
5) Check data Access Permissions (data dict cache)
After checking the syntax and semantics, the client may not be able to obtain the data. The server process also checks whether the connected user has the permission to access the data. If the user does not have the data access permission, the client cannot obtain the data. Note that the database server process checks the syntax and semantics before checking the access permission.
6) determine the best execution plan
When there is no problem with syntax and semantics, the server process does not directly query database files. The server process will optimize this statement according to certain rules. Before the execution plan is developed, there will be one-step query conversion, such as view merging, subquery unnesting, predicate and materialized view rewriting query. In order to determine which execution plan to adopt, Oracle also needs to collect statistics to determine the access join method of the table, and finally determine the possible lowest cost execution plan.
However, note that this optimization is limited. Generally, the SQL statement of the database needs to be optimized in the process of application software development. The function of this optimization is greatly greater than the self-optimization of the server process.
After the server process optimizer determines the optimal execution plan of the query statement, the SQL statement and execution plan are saved to the data cache ). In this way, when there is still this query, the above syntax, semantics, and permission check steps will be omitted, and the SQL statement will be executed directly to improve the efficiency of SQL statement processing.
Step 3: Bind Variable assignment
If the Bind Variable is used in the SQL statement, scan the declaration of the Bind Variable, assign a value to the BIND variable, and bring the variable value into the execution plan. If SQL exists in the high-speed buffer in the first step of resolution, it will jump directly to this step.
Step 4: statement execution
Statement parsing only parses the syntax of the SQL statement to ensure that the server can know what the statement actually represents. After the statement Parsing is complete, the database server process will actually execute this SQL statement.
For SELECT statements:
1) First, the server process must determine whether the required data exists in the db buffer. If yes, the server directly obtains the data instead of querying the data from the database file, increase the access count based on the LRU algorithm;
2) if the data is not in the buffer, the server process will query the relevant data from the database file and put the data into the data buffer (buffer cache ).
If the data exists in the db buffer, the availability check method is to check whether there is a transaction in the header of the db buffer block. If there is a transaction, the data is read from the rollback segment; if no transaction exists, the select scn and the db buffer block header scn are compared. If the former is smaller than the latter, the data still needs to be read from the rollback segment; if the former is greater than the latter, this indicates that it is a non-dirty cache and can directly read the content in this db buffer block.
For DML statements (insert, delete, update ):
1) check whether the required database has been read to the buffer cache. If a buffer cache already exists, perform step 3;
2) If the required database is not in the buffer cache, the server reads the data block from the data file to the buffer cache;
3) Lock Row Exclusive Lock (Row Exclusive Lock) for the table to be modified, and then obtain an Exclusive Lock for the data Row to be modified;
4) copy the Redo record of data to the redo log buffer;
5) generate undo data for data modification;
6) modify the db buffer;
7) dbwr writes the modification to the data file;
Step 4: The server reads data from the data file to the db buffer using the following steps:
1) First, the server process requests the tmlock In the table header (ensure that other users cannot modify the table structure during the transaction execution). If the tmlock is successfully applied, request some row-level locks (TX locks). If the TM and TX locks are successfully locked, the data will be read from the data file.
2) Before reading data, prepare the buffer space for the Read File. The server process needs to scan the LRU list to find the free db buffer. During the scan, the server process registers all the modified db buffer to the dirty list. If the free db buffer and non-dirty data block buffer are insufficient, dbwr is triggered to write the buffer block pointed to by dirty buffer into the data file, and these buffers are cleaned to free up space for buffering the newly read data.
3) when sufficient idle buffer is found, the server process reads each data block (db BLOCK) of the rows from the data file. The DB block is the minimum operating unit of ORACLE, even if the data you want is only one or several rows in many rows in the db block, ORACLE will read all rows in this db block into the Oracle db buffer) put it into the idle area of the db buffer or overwrite the non-dirty data block buffer that has been squeezed out of the LRU list, and arrange it in the header of the LRU list, that is, before data blocks are placed in the db buffer, you must apply for the latches in the db buffer before you can read the data to the db buffer.
If a data block already exists in the db buffer cache (sometimes called the db buffer or db cache), even if no transaction is found in the db buffer, in addition, if the SCN is smaller than its own non-dirty cache data block, the server process still needs to apply for a lock on this record in the table's header. After the lock is successful, subsequent actions can be performed. If the lock fails, wait until the previous process is unlocked to perform the action (the tx lock is blocked at this time ).
Follow these steps to redo logs:
1) after the data is read into the db buffer, the server process writes the rowid of the row data affected by the statement and read it into the db buffer, the original value to be updated, the new value, and the scn information from the PGA to the redo log buffer one by one. Before writing the redo log buffer, you must first request the redo log buffer latches. After the locks are applied, the write starts.
2) when the number of writes reaches 1/3 of the redo log buffer size, or the number of writes reaches 1 MB or exceeds 3 seconds, or when the check point or dbwr occurs, the lgwr process is triggered to write the redo log buffer data to the redo file on the disk (this will generate a log file sync wait event ).
3) the latches held by the redo log buffer that has been written into the redo file will be released and can be overwritten by subsequent write information. redo log buffer is used cyclically. Redo file is also used cyclically. When a redo file is full, the lgwr process automatically switches to the next redo file (in this case, the log file switch (check point complete) Wait event may occur ). In the archive mode, the archiving process also writes the content of the previous full redo file to the archive log file (this may cause the log file switch (archiving needed ).
When creating undo information for a transaction, the specific steps are as follows:
1) after completing the redo log buffer related to the firm, the server process begins to rewrite the transaction list of the block header of the db buffer and writes it to the scn (at the beginning, the scn is written in the redo log buffer, not written in db buffer ).
2) then copy the data copy that contains the transaction list and scn information of this block into the rollback segment. The information in the rollback segment is called the "front image" of the data block ", this "pre-image" is used for subsequent rollback, recovery, and consistent read. (Rollback segments can be stored in dedicated rollback tablespace, which consists of one or more physical files and is dedicated to rollback tablespaces, rollback segments can also be opened in data files in other tablespaces ).
When you write data files by modifying information, follow these steps:
1) rewrite the data content of the db buffer block and write the rollback segment address in the block header.
2) Place the db buffer pointer to the dirty list. If a row of data is updated multiple times without commit, there will be multiple "front images" in the rollback segment, except that the first "front image" contains scn information, each Other "front image" header has the scn information and the "front image" rollback segment address. One update corresponds to only one scn, then the server process will create a pointer to the db buffer block in the dirty list (so that the dbwr process can find the db buffer data block in the dirty list and write it into the data file ). Then, the server process will read the second data block from the data file and repeat the action of the previous data block, read and log data blocks, create rollback segments, modify data blocks, and add them to the dirty list.
3) when the length of the dirty queue reaches the threshold (usually 25%), the server process will notify dbwr to write dirty data, that is, release the latches on the db buffer, free up more free db buffer. It has always been explained that oracle reads a data block at a time. In fact, oracle can read multiple data blocks at a time (db_file_multiblock_read_count to set the number of read blocks at a time)
To execute commit, perform the following steps:
1) commit triggers the lgwr process, but does not force dbwr to immediately release the locks of all corresponding db buffer blocks. That is to say, although the SQL statement has been committed, dbwr is still writing the data block involved in this SQL statement in the following period. The row lock in the table header is not released immediately after commit, but is released only after the completion of the dbwr process, this may result in the failure of one user to request resources that have been committed by another user.
2) the time between the end of the Commit and dbwr process is very short. If a power failure occurs after the commit, before the dbwr ends, because the data after the commit already belongs to the content of the data file, however, these files are not completely written into the data file. Therefore, roll back. Because the commit has triggered lgwr, all the changes to the data files that have not yet been written will be rolled forward by the smon process based on the redo log file after the instance is restarted, the previous work not completed by commit (that is, the change is written to the data file ).
3) If no commit is available, the power is down because the data has been changed in the db buffer and no commit exists. This indicates that this part of data is not a data file. Since lgwr is triggered before dbwr, that is, as long as the data changes, (there must be a log first) All dbwr modifications to the data file will be recorded in the redo log file first. After the instance is restarted, the SMON process then rolls back and forth based on the redo log file.
In fact, smon rollback is completed based on the checkpoint. When all the checkpoints occur, first let the LGWR process buffer all the redologbuffer (including uncommitted redo information) write the redo log file, and then let the dbwr process write the committed buffer of dbbuffer into the data file (do not force the uncommitted write ). Then update the SCN in the control file and the data file header, indicating that the current database is consistent. There are many transactions between the adjacent two checkpoints, including committed and uncommitted transactions.
When performing rollback, perform the following steps:
The server process will find the corresponding copy before modification in the rollback segment based on the transaction list and SCN in the header of the data file block and the db buffer block and the rollback segment address, the original values are used to restore the changes that have been modified but not submitted in the current data file. If there are multiple "front images", the server process will find the rollback segment address of the "front image" in the header of a "front image, until the earliest "front image" of the same transaction is found. Once a commit is issued, the user cannot roll back, which ensures that the subsequent actions of the dbwr process after the commit are not completed completely.
Step 5: extract data
After the statement is executed, the queried data is still in the server process and has not been transferred to the client user process. Therefore, there is a piece of code specifically responsible for data extraction in the server-side process. The function is to return the queried data results to the client process to complete the entire query.
Oracle SQL statement execution sequence
(8) SELECT (9) DISTINCT (11)
(1) FROM
(3) JOIN
(2) ON
(4) WHERE
(5) GROUP
(6) WITH {CUBE | ROLLUP}
(7) HAVING
(10) ORDER
1) FROM: Perform Cartesian Product (cross join) on the table in the FROM clause to generate virtual table VT1.
2) ON: Apply the ON filter to VT1. Only those that are true are inserted into TV2.
3) OUTER (JOIN): If outer join is specified (compared with cross join or inner join), the row that does not match is retained in the table and added to VT2 as an external row to generate tv3. If the FROM clause contains more than two tables, perform steps 1 to 3 on the result table generated by the previous join and the next table until all the table locations are processed.
4) WHERE: Apply the WHERE filter to TV3 and insert TV4 only when the row is set to true.
5) group by: groups rows in TV4 BY column list in the group by clause to generate tv5.
6) CUTE | ROLLUP: inserts the supergroup into VT5 to generate VT6.
7) HAVING: Apply the HAVING filter to VT6. Only groups that make true are inserted to VT7.
8) SELECT: process the SELECT list and generate VT8.
9) DISTINCT: deletes duplicate rows from VT8, product VT9.
10) order by: ORDER the rows in VT9 in the column list in the order by clause, generate a cursor (VC10), generate table TV11, and return it to the caller.
Each of the preceding steps generates a virtual table, which is used as the input for the next step. These virtual tables are unavailable to callers (client applications or external queries. Only the table generated in the last step will be sent to the caller. If a clause is not specified in the query, the corresponding steps are skipped.