Then read with me. PostgreSQL source Code (ix)--executor (the--scan node (top) of the query execution module), this chapter ends the remaining seven scan nodes.
T_SubqueryScanState, T_FunctionScanState, T_ValuesScanState, T_CteScanState, T_WorkTableScanState, T_ForeignScanState, T_CustomScanState,
8.SubqueryScan node
The role of the Subqueryscan node is to scan the scanned object with another survey plan tree (sub-plan), whose scanning process is eventually converted to the execution of the child plan.
Postgres Sub-query mainly contains the following keywords: EXISTS, in, not in, Any/some, all, detailed introduction can be seen: http://www.postgres.cn/docs/9.5/functions-subquery.html
Example:
postgres=# explain select id from test_new where exists (select id from test_dm); QUERY PLAN------------------------------------------------------------------------- Result (cost=0.02..35.52 rows=2550 width=4) One-Time Filter: $0 InitPlan 1 (returns $0) -> Seq Scan on test_dm (cost=0.00..22346.00 rows=1000000 width=0) -> Seq Scan on test_new (cost=0.00..35.50 rows=2550 width=4)(5 行)
The following query is also a subquery, but it is optimized during the query compilation phase (to promote child connections, mainly by converting any and exist clauses to semi-joins)
postgres=# explain select id from test_new where exists (select id from test_dm where id = test_new.id); QUERY PLAN----------------------------------------------------------------------------- Hash Semi Join (cost=38753.00..42736.38 rows=1275 width=4) Hash Cond: (test_new.id = test_dm.id) -> Seq Scan on test_new (cost=0.00..35.50 rows=2550 width=4) -> Hash (cost=22346.00..22346.00 rows=1000000 width=4) -> Seq Scan on test_dm (cost=0.00..22346.00 rows=1000000 width=4)(5 行)
About the content, here is a very good story: PostgreSQL query optimization of sub-query optimization
The Subqueryscan node expands on the scan node to define the root node pointer (the Subplan field) of the child plan, while the Subrtable field is the structure used by the polling compiler, and the executor runs with a null value.
typedef struct SubqueryScan{ Scan scan; Plan *subplan;} SubqueryScan;
Obviously, the initialization process of the Subqueryscan node (Execinitsubqueryscan function) uses Execinitnode to process the sub-plan tree pointed to by the Subplan field of Subqueryscan. The Planstale root node pointer of the sub-plan is assigned to the Subplan field of Subqueryscanstate.
typedef struct SubqueryScanState{ ScanState ss; /* its first field is NodeTag */ PlanState *subplan;} SubqueryScanState;
I think the Subqueryscan node is actually a shell, why do you say so? Because the execution of the Subqueryscan node (the Execsubqueryscan function) is accomplished by passing subquerynext to the Execscan function processing. Subquerynext actually calls Execprocnode processing subplan to get tuples. In other words, here Subqueryscan is running a separate query plan and then getting its results instead of scanning the table yourself. So recheck work is done in an independent query plan, and Subqueryscan nodes don't have to do it anymore.
So we can see:
static boolSubqueryRecheck(SubqueryScanState *node, TupleTableSlot *slot){ /* nothing to check */ return true;}
It says that Execprocnode processing Subplan is called at execution time, so it is clear that additional calls to Execendnode are needed to clean up the sub-plan during the cleanup process.
9.FuncitonScan node
Apart the first example:
postgres=# CREATE FUNCTION dup(int) RETURNS TABLE(f1 int, f2 text)postgres-# AS $$ SELECT $1, CAST($1 AS text) || ' is text' $$postgres-# LANGUAGE SQL;CREATE FUNCTIONpostgres=# explain SELECT * FROM dup(42); QUERY PLAN------------------------------------------------------------- Function Scan on dup (cost=0.25..10.25 rows=1000 width=36)(1 行)
In PostgreSQL, some functions can return a collection of tuples, in order to get tuples from the return values of these functions, PostgreSQL defines the Functionscan node, whose scanned object is a function that returns a set of tuples. The Functionscan node extends the definition on the basis of scan:
Functions list field, which contains the functions involved in Funcitonscan;
and the Funcordinality field (whether to add an ordinal column to the returned result).
When a function in the FROM clause is suffixed by WITH ORDINALITY, a bigint column is appended to the output which starts from 1 and increments by 1 for each row of the function's output. This is most useful in the case of set returning functions such as unnest()
See more here: http://www.postgres.cn/docs/9.5/functions-srf.html
typedef struct FunctionScan{ Scan scan; List *functions; /* list of RangeTblFunction nodes */ bool funcordinality; /* WITH ORDINALITY */} FunctionScan;
The initialization process of the Functionscan node (the Execinitfunctionscan function) initializes the FUNCTIONSCANSTATE structure and then functions according to the Functionscan field. The state node of the runtime is constructed for each function functionscanperfuncstate, as follows:
typedef struct FunctionScanPerFuncState{ ExprState *funcexpr; /* state of the expression being evaluated */ TupleDesc tupdesc; /* desc of the function result type */ int colcount; /* expected number of result columns */ Tuplestorestate *tstore; /* holds the function result set */ int64 rowcount; /* # of rows in result set, -1 if not known */ TupleTableSlot *func_slot; /* function result slot (or NULL) */} FunctionScanPerFuncState;
Here the structure for expression evaluation is constructed for each function based on the Functions field in Functionscan (stored in funcexpr), and the descriptor of the constructor return tuple is stored in Tupdesc. The Tuplestoreslate field that is used to store the function result set is NULL.
When these are done, you can construct the Tupledesc of the return value based on the functionscanperfuncstate structure of all the functions involved (that is, the final return value must be a combination of the return values of these functions):
For example:
postgres=# SELECT * FROM dup(42) WITH ORDINALITY AS t(ls,n,xxx),increment(42); ls | n | xxx | increment----+------------+-----+----------- 42 | 42 is text | 1 | 43(1 行)
typedef struct FunctionScanState{ ScanState ss; /* its first field is NodeTag */ int eflags; //node's capability flags bool ordinality; //is this scan WITH ORDINALITY? bool simple; //true if we have 1 function and no ordinality int64 ordinal; //current ordinal column value int nfuncs; //number of functions being executed /* per-function execution states (private in nodeFunctionscan.c) */ struct FunctionScanPerFuncState *funcstates; /* array of length nfuncs */ MemoryContext argcontext; //memory context to evaluate function arguments in} FunctionScanState;
In the execution of the Functionscan node (Execfunctionscan function), the Functionnext is passed to the Execscan function, The Functionnext function first determines whether the tuplestorestate is empty (empty at first execution), and if it is empty, executes the function Execmaketablefunctionresult generates all the result sets and stores them in Tuplestorestate , each execution node thereafter calls Tuplestore_gettupleslot to get a tuple in the result set.
Finally, the Functionscan node cleanup process requires a CHEONGJU tuplestorestate structure.
10.ValuesScan node
Values evaluates a row value, or a set of row values, that is specified by a value expression. More commonly, it is used to generate a "constant table" within a large command, but it can also be used on its own.
When more than one row is specified, all rows must have the same number of elements. The column data type of the resulting table is determined by the combination of the explicit or inferred type of the expression that appears in the column, and the same rules as the Union.
In large commands, the syntax allows values to appear anywhere the select appears. Because syntax makes it a SELECT, you can use ORDER BY, LIMIT (or equivalent fetch first), and offset clauses for a values command.
Let's give an example, a purely values command:
VALUES (1, 'one'), (2, 'two'), (3, 'three');
A table with two columns and three rows is returned.
postgres=# VALUES (1, 'one'), (2, 'two'), (3, 'three'); column1 | column2---------+--------- 1 | one 2 | two 3 | three(3 行)postgres=# EXPLAIN VALUES (1, 'one'), (2, 'two'), (3, 'three'); QUERY PLAN-------------------------------------------------------------- Values Scan on "*VALUES*" (cost=0.00..0.04 rows=3 width=36)(1 行)
More commonly, values can be used in a large SQL command. Most commonly used in insert:
postgres=# insert into test values (1,'xxxx');INSERT 0 1postgres=# explain insert into test_new values (1); QUERY PLAN------------------------------------------------------ Insert on test_new (cost=0.00..0.01 rows=1 width=0) -> Result (cost=0.00..0.01 rows=1 width=0)(2 行)
Specifically, you can see this: http://www.postgres.cn/docs/9.5/sql-values.html
So we are not unfamiliar with the values clause, continue to say.
The Valuesscan node is used to scan the set of tuples given by the values clause (the VALUES clause in the INSERT statement goes through the result node ). As shown below, the values_lists in the Valuesscan node stores the list of expressions in the values clause.
typedef struct ValuesScan{ Scan scan; List *values_lists; /* list of expression lists */} ValuesScan;
The initialization process of the Valuesscan node (execlnitvaluesscan function) processes the expression in values_lists to generate the values expression and is stored in the exprlists array of the valuesscanstate, Array_ Len Records array lengths, Cuxr_idx and
The MARKEDJDX is used to store offsets in the array. The memory context is also allocated rowconext for expression grate (Ss.ps.ps_ExprContext is intended to be used for expression evaluation, but to prevent memory leaks that occur for an too-long values clause, Use Rowconext to unify each row of values, and use Rowconext to release the memory after each row has been processed. )。
typedef struct ValuesScanState{ ScanState ss; /* its first field is NodeTag */ ExprContext *rowcontext; //per-expression-list context List **exprlists; //array of expression lists being evaluated int array_len; //size of array int curr_idx; //current array index (0-based)} ValuesScanState;
The Valuesscan node execution process (Execvaluesscan function) calls the Execscan implementation, Execscan gets the scan tuple through Valuesnext, valuesnext through Curr_ IDX gets the expression that needs to be processed from the exprlists and calculates the result tuple return.
Because the Rowconext context is applied extra, the memory context Rowcontext needs to be freed in the Valuesscan node cleanup process (Execendvaluesscan function).
11.CteScan node
With provides a way to write auxiliary statements that are used in a large query. These statements are often referred to as common table expressions or CTE, which can be thought of as defining temporary tables that exist only in one query. Each auxiliary statement in the WITH clause can be a SELECT, insert, update, or delete, and the WITH clause itself can also be appended to a main statement, and the main statement can also be a SELECT, insert, UPDATE, or delete
Specifically, refer to this: http://www.postgres.cn/docs/9.5/queries-with.html
If you understand the CTE, you will know that the CTE does not normally exist alone, but instead is attached to a main query, in other words, the CTE appears as a secondary query. Therefore, the secondary query is treated as a sub-plan subplan in the main query. The execution state tree of the CTE is stored in the Es_subplanstates list of the actuator global State estate.
typedef struct EState{ NodeTag type; ... /* Parameter info: */ ParamListInfo es_param_list_info; /* values of external params */ ParamExecData *es_param_exec_vals; /* values of internal params */ ... List *es_subplanstates; /* List of PlanState for SubPlans */ ...} EState;
And the CTEPLANLD in Ctescan stores the offset of its sub-plan in that list, which corresponds to the same ctescan cteplanld of the same child plan. When PostgreSQL is implemented, it also allocates a space for each CTE in a list of global parameters, with offsets stored in cteparam and the same offsets corresponding to the ctescan of the same CTE. The Ctescan node-related data structures are shown below.
typedef struct CteScan{ Scan scan; int ctePlanId; /* ID of init SubPlan for CTE */ int cteParam; /* ID of Param representing CTE output */} CteScan;
The initialization process of the Ctescan node (Execinitctescan function) initializes the CTESCANSTATE structure first, and the corresponding sub-plan execution state tree is found by cteplanld in Es_subplanstates. and stored in the Cteplanstate field of the ctescanstate.
The parameter structure Paramexecdata is then obtained through Cteparam in the Es_param_exec_vals field of the actuator global State estate. If value in Paramexecdata is null, no other ctescan initializes the storage structure for this CTE, and the cte_table field of Ctescanstate is initialized. The value of leader and Paramexecdata is a pointer to the current ctescanstate. If value in Paramexecdata is not NULL, its value is discovered computers to leader, which points to the ctescanstate created by the first Ctescan, not ctescan for the current cte_table. This corresponds to a CTE globally with only one tuple cache structure, and all ctescan that use the CTE share the cache.
typedef struct CteScanState{ ScanState ss; /* its first field is NodeTag */ int eflags; /* capability flags to pass to tuplestore */ int readptr; /* index of my tuplestore read pointer */ PlanState *cteplanstate; /* PlanState for the CTE query itself */ /* Link to the "leader" CteScanState (possibly this same node) */ struct CteScanState *leader; /* The remaining fields are only valid in the "leader" CteScanState */ Tuplestorestate *cte_table; /* rows already read from the CTE query */ bool eof_cte; /* reached end of CTE query? */} CteScanState;
At last. Doing some initialization work, such as initializing the expression context, sub-expressions, tuple tables, result tuple tables, and so on, that handle tuples.
When you execute the Ctescan node, you will first look at whether the cache tuple (cache structure tuplestorestate) is cached in the cte_table point, and if it is available directly, you need to execute the sub-plan that cteplanld points to to get the tuple.
The Ctescan node cleanup process needs to clean up the tuple cache structure, but simply cleans up the ctescanstate that leader points to itself.
12.WorkTableScan node
This node is closely associated with the Recursiveunion node. Let's look at the example below, a recursiveunion query:
postgres=# WITH RECURSIVE t(n) AS(postgres(# VALUES(1)postgres(# UNION ALLpostgres(# SELECT n+1 FROM t WHERE n<100)postgres-# SELECT sum(n) FROM t; sum------ 5050(1 行)查询计划 QUERY PLAN------------------------------------------------------------------------- Aggregate (cost=3.65..3.66 rows=1 width=4) CTE t -> Recursive Union (cost=0.00..2.95 rows=31 width=4) -> Result (cost=0.00..0.01 rows=1 width=0) -> WorkTable Scan on t t_1 (cost=0.00..0.23 rows=3 width=4) Filter: (n < 100) -> CTE Scan on t (cost=0.00..0.62 rows=31 width=4)(7 行)
For recursive query evaluation, the process is as follows:
1. Calculate non-recursive items. For union (but not union ALL), discard duplicate rows. Include all the remaining rows in the results of the recursive query, and also put them on a temporary worksheet .
2. Repeat the following steps as long as the worksheet is not empty:
Computes a recursive item, replacing the recursive self-reference with the contents of the current worksheet. For Union (not union ALL), discard duplicate rows and those that repeat with the previous result row. All the remaining rows are included in the results of the recursive query, and they are also placed in a temporary intermediate table.
Replace the contents of the worksheet with the contents of the intermediate table, and then empty the intermediate table.
Details can be seen here: http://www.postgres.cn/docs/9.5/queries-with.html
The work table here is worktable.
Worktablescan will work with Recursiveunion to complete the recursive merger sub-poll. Recursiveunion caches all tuples in the recursion to the recursiveunionstate structure, Worktablescan provides a scan of this cache.
As shown below, the Worktablescan node extension defines wtparam for communication between recursiveunion nodes, while the Rustate field of the Worktablescanstate node records A pointer to the RECURSIVEUNIONSTATE structure so that worktablescan can get tuples from the cache structure during execution.
typedef struct WorkTableScan{ Scan scan; int wtParam; /* ID of Param representing work table */} WorkTableScan;
Node State:
typedef struct WorkTableScanState{ ScanState ss; /* its first field is NodeTag */ RecursiveUnionState *rustate;} WorkTableScanState;
13.ForeignScan node
If you've used POSTGRES_FDW or dblink to provide an external data wrapper, you probably know what the scan node does: Scan the external PostgreSQL data sheet.
If you are interested in POSTGRES_FDW, here is the URL, take it to you: http://www.postgres.cn/docs/9.5/postgres-fdw.html
The information for the Foreignscan node is as follows, and some information related to external data is extended mainly outside of the scan. Both Fdw_exprs and Fdw_private are under the control of the external data wrapper, but Fdw_exprs is assumed to contain an expression tree and will be processed accordingly by the Planner; Fdw_private not.
Fdw_scan_tlist is the target list that describes the contents of the scanned tuple returned by FDW, or nil if the scan tuple matches the declaration row type of the external table, which is normal for a simple external table scan. (If the plan node represents an outer join, Fdw_scan_tlist is required because there is no rowtype available in the system directory)
Fdw_scan_tlist is never executed; it simply holds an expression tree that describes the contents of the scan tuple column.
The fdw_recheck_quals should contain conditions that the core system passes to FDW but not added to scan.plan.qual, meaning that these conditions need to be judged in FDW (these conditions are judged in recheck).
typedef struct ForeignScan{ Scan scan; Oid fs_server; /* OID of foreign server */ List *fdw_exprs; /* expressions that FDW may evaluate */ List *fdw_private; /* private data for FDW */ List *fdw_scan_tlist; /* optional tlist describing scan tuple */ List *fdw_recheck_quals; /* original quals not in scan.plan.qual */ Bitmapset *fs_relids; /* RTIs generated by this scan */ bool fsSystemCol; /* true if any "system column" is needed */} ForeignScan;
There is also a data structure that is particularly concerned with keeping the external Data wrapper handler return function, which provides pointers to the callback functions used by planner and executor.
src/include/foreign/fdwapi.htypedef struct FdwRoutine
The following is the state node foreignscanstate of Foreignscan, which expands the list field fdw_recheck_quals, which requires recheck outside of Scanstate, The external data wrapper handler returns the function collection structure body fdwroutine and the external data wrapper state fdw_state.
In addition to the general initialization, the Execinitforeignscan function initializes the Foreignscanstate fdwroutine field at initialization, obtaining the function pointer and scanning the relational table.
typedef struct ForeignScanState{ ScanState ss; /* its first field is NodeTag */ List *fdw_recheck_quals; /* original quals not in ss.ps.qual */ /* use struct pointer to avoid including fdwapi.h here */ struct FdwRoutine *fdwroutine; void *fdw_state; /* foreign-data wrapper can keep state here */} ForeignScanState;
The execution of the Foreignscan node (Execforeignscan function) is accomplished by passing foreignnext to the Execscan function processing. Foreignnext actually calls Fdwroutine->iterateforeignscan to get a tuple each time it is scanned on an external data source.
About function Foreignrecheck, do you remember the Fdw_recheck_quals field mentioned above? The Execqual function is called here to use the conditions in the Fdw_recheck_quals field to do the recheck.
Finally, after the scan is over, call Fdwroutine->endforeignscan to close the scan and close the external table execclosescanrelation (node->ss.ss_currentrelation).
14.CustomScan node
From the word custom we can tell that this is the interface of the Postgres open Custom Scan method. This node only provides an empty shell, and we look at:
typedef struct CustomScan{ Scan scan; uint32 flags; /* mask of CUSTOMPATH_* flags, see relation.h */ List *custom_plans; /* list of Plan nodes, if any */ List *custom_exprs; /* expressions that custom code may evaluate */ List *custom_private; /* private data for custom code */ List *custom_scan_tlist; /* optional tlist describing scan * tuple */ Bitmapset *custom_relids; /* RTIs generated by this scan */ const CustomScanMethods *methods;} CustomScan;
Left to the user to expand themselves, at the same time, the Customscanstate state node is the same, there are only a few function pointers and some properties of the preset, you can use, you can also put customscanstate as the scan method you want to expand an attribute Can be said to be very flexible.
Therefore, not much to say this, I hope to see on the Internet to do the expansion of the example ~
This is the end of the scan node.
Read with me. PostgreSQL Source (10)--executor (--scan node of the query execution module (bottom))