Here the parts of the canonical predicate expression are sorted out, and the order of reading is as follows:
One, PostgreSQL code Analysis, query optimization section, canonicalize_qual
Two, PostgreSQL code Analysis, query Optimization section, Pull_ands () and Pull_ors ()
Third, PostgreSQL code Analysis. Query optimization section, Process_duplicate_ors
/ *
* process_duplicate_ors
* Given a list of exprs which are ORed together, try to apply
* the inverse OR distributive law.
*
* Returns the resulting expression (could be an AND clause, an OR
* clause, or maybe even a single subexpression).
* /
/ *
* Suppose we have four tables, each of which is TEST_A, TEST_B, TEST_C, TEST_D, each table has a column,
* That is, TEST_A has a column A, TEST_B has a column B, and so on.
*
* This function handles such situations. For a selection, SELECT * FROM TEST_A, TEST_B, TEST_C, TEST_D
* WHERE (A = 1 AND B = 1) OR (A = 1 AND C = 1) OR (A = 1 AND D = 1);
*
* WHERE condition in the statement:
* (A = 1 AND B = 1) OR (A = 1 AND C = 1) OR (A = 1 AND D = 1)
* Can be rewritten as:
* (A = 1) AND (B = 1 OR C = 1 OR D = 1)
* This is the main function of this function.
*
* The parameter of this function is a list. For the above WHERE condition, the structure of orlist is as follows:
* There is one element in orlist, which is BoolExpr of type OR_EXPR. The structure in BoolExpr is as follows:
typedef struct BoolExpr
{
Expr xpr; = slightly
BoolExprType boolop; = OR_EXPR
List * args; = 3 conditions in OR, namely (A = 1 AND B = 1) OR (A = 1 AND C = 1) OR (A = 1 AND D = 1)
bool plusFlag; = slightly
} BoolExpr;
*
* Detailed implementation of the following analysis functions, the approximate steps are:
* 1) Analyze the common terms in each OR. 2) extract common items, 3) merge the remaining items into AND.
* /
static Expr *
process_duplicate_ors (List * orlist)
{
List * reference = NIL;
int num_subclauses = 0;
List * winners;
List * neworlist;
ListCell * temp;
if (orlist == NIL)
return NULL; / * probably ca n‘t happen * /
/ * Assume there is only one. .
.
.
. Then forget it */
if (list_length (orlist) == 1) / * single-expression OR (can this
* happen?
) * /
return linitial (orlist);
/ *
* Choose the shortest AND clause as the reference list --- obviously, any
* subclause not in this clause is n‘t in all the clauses. If we find a
* clause that ’s not an AND, we can treat it as a one-element AND clause,
* which necessarily wins as shortest.
* /
/ *
* "Find the shortest".
* In the WHERE statement:
* (A = 1 AND B = 1) OR (A = 1 AND C = 1) OR (A = 1 AND D = 1)
* OR operation concatenates three sub-statements. Find the shortest one. Since there is a common term, the shortest one must be the same.
* Includes public items. Then by finding the shortest one. The number of comparisons can be reduced in subsequent operations.
* In the WHERE statement above. The three sub-statements are the same length. According to the following operation process, for example, (A = 1 AND B = 1) should be found.
* Is the first.
* /
foreach (temp, orlist)
{
Expr * clause = (Expr *) lfirst (temp);
if (and_clause ((Node *) clause))
{
List * subclauses = ((BoolExpr *) clause)-> args;
int nclauses = list_length (subclauses);
Ranch
/ *
* The length in the sub-statement is inferred here, for example for the (A = 1 AND B = 1) sub-statement,
* He is actually two sub-sentences connected by an AND, then his length is 2.
*
* Record the shortest sub-statements through nclauses, assuming shorter (nclauses <num_subclauses),
* Then replace it with the shortest.
* /
if (reference == NIL || nclauses <num_subclauses)
{
reference = subclauses;
num_subclauses = nclauses;
}
}
else
{
/ *
* another situation. It is possible that the clause is not an AND statement, so it does not seem to conform to the rules.
* Think of him as a whole, then this is the shortest element.
*
******************************
* Suppose the code runs here. Then there are only two cases:
* One is ... WHERE (A = 1 AND B = 1) OR (A = 1 AND C = 1) OR (A = 1).
* One is ... WHERE ((A = 1 OR C = 1) AND B = 1) OR (A = 1 OR C = 1).
* Assuming that in both cases, the following simplifications can be made, for example:
* The first case is simplified as A = 1
* Simplified in another case (A = 1 OR C = 1)
*
* The third case is to be added ...
* /
reference = list_make1 (clause);
break;
}
}
/ *
* Just in case, eliminate any duplicates in the reference list.
* /
/ * Find the shortest. Save to List * /
reference = list_union (NIL, reference);
/ *
* Check each element of the reference list to see if it ’s in all the OR
* clauses. Build a new list of winning clauses.
* /
/ *
* "Find Public Items".
*
* NOTE: At this time, the advantages brought by "find the shortest" can be reflected. Outer loops will be less frequent.
*
* Suppose the WHERE statement is:
* (A = 1 AND B = 1) OR (A = 1 AND C = 1) OR (A = 1 AND D = 1)
* It must be found in "find the shortest" (A = 1 AND B = 1).
* Then the outer layer will have two loops ... (foreach (temp, reference)), and the variables of the two loops are
* A = 1 and B = 1.
* The inner layer has three loops ... (foreach (temp2, orlist)). The variables of the three cycles are
* (A = 1 AND B = 1) and (A = 1 AND C = 1) and (A = 1 AND D = 1)
*
* The ratio is as follows:
* If the outer loop is now running for the first time, it will look for the common item of A = 1, and if the inner loop is also running for the first time,
* That is, find (A = 1 AND B = 1) whether there is a common item A = 1. Found to exist (list_member),
* Then infer the second clause of the inner loop in turn ...
*
* As the example above, in detail. The operations performed by these loops are:
* Outer layer for the first time:
* Infer whether A = 1 is in (A = 1 AND B = 1), in, infer the next
* Infer whether A = 1 is in (A = 1 AND C = 1), in, infer the next
* Infer whether A = 1 is present (A = 1 AND D = 1). Now, A = 1 is a public item, record (winners = lappend ...)
* Outer layer for the second time:
* Infer whether B = 1 is present (A = 1 AND B = 1). Infer next
* Infer whether B = 1 is in (A = 1 AND C = 1), not in, jump out of the loop, no need to infer the next one.
* Infer whether B = 1 is present (A = 1 AND D = 1). Not running, since the previous one does not contain a public item, extraction is not possible.
* /
winners = NIL;
foreach (temp, reference)
{
Expr * refclause = (Expr *) lfirst (temp);
bool win = true;
ListCell * temp2;
foreach (temp2, orlist)
{
Expr * clause = (Expr *) lfirst (temp2);
if (and_clause ((Node *) clause))
{
if (! list_member (((BoolExpr *) clause)-> args, refclause))
{
win = false;
break;
}
}
else
{
if (! equal (refclause, clause))
{
win = false;
break;
}
}
}
if (win)
winners = lappend (winners, refclause);
}
/ *
* If no winners, we ca n’t transform the OR
* /
if (winners == NIL)
return make_orclause (orlist);
/ *
* Generate new OR list consisting of the remaining sub-clauses.
*
* If any clause degenerates to empty, then we have a situation like (A
* AND B) OR (A), which can be reduced to just A --- that is, the
* additional conditions in other arms of the OR are irrelevant.
*
* Note that because we use list_difference, any multiple occurrences of a
* winning clause in an AND sub-clause will be removed automatically.
* /
/ *
* "Extract public items".
* Use list_difference to delete public items, implement Details are not repeated here.
* /
neworlist = NIL;
foreach (temp, orlist)
{
Expr * clause = (Expr *) lfirst (temp);
if (and_clause ((Node *) clause))
{
List * subclauses = ((BoolExpr *) clause)-> args;
Ranch
/ * Look here ... look here ..., eliminate public items * /
subclauses = list_difference (subclauses, winners);
if (subclauses! = NIL)
{
/ * After elimination, the remaining stitching is up. Spliced into: (B = 1 OR C = 1 OR D = 1) * /
if (list_length (subclauses) == 1)
neworlist = lappend (neworlist, linitial (subclauses));
else
neworlist = lappend (neworlist, make_andclause (subclauses));
}
else
{
/ *
* This is explained in the substatement. One of them is a public item, that is, for example the following form:
* ... WHERE (A = 1 AND B = 1) OR (A = 1)
*
* At this time, the common term is A = 1, the first clause is (A = 1 AND B = 1), and the second clause is (A = 1),
* The second clause goes through list_difference. The result returned is NULL.
* For such a situation, it can actually be simplified as: A = 1, because (A = 1 AND B = 1) must satisfy the case of A = 1.
* /
neworlist = NIL; / * degenerate case, see above * /
break;
}
}
else
{
if (! list_member (winners, clause))
neworlist = lappend (neworlist, clause);
else
{
neworlist = NIL; / * degenerate case, see above * /
break;
}
}
}
/ *
* Append reduced OR to the winners list, if it ’s not degenerate, handling
* the special case of one element correctly (can that really happen?).
* Also be careful to maintain AND / OR flatness in case we pulled up a
* sub-sub-OR-clause.
* /
if (neworlist! = NIL)
{
if (list_length (neworlist) == 1)
winners = lappend (winners, linitial (neworlist));
else
/ * neworlist should be (B = 1 OR C = 1 OR D = 1), so use make_orclause * /
winners = lappend (winners, make_orclause (pull_ors (neworlist)));
}
/ *
* And return the constructed AND clause, again being wary of a single
* element and AND / OR flatness.
* /
if (list_length (winners) == 1)
return (Expr *) linitial (winners);
else
/ * The returned form is: (A = 1) AND (B = 1 OR C = 1 OR D = 1), so make_andclause * /
return make_andclause (pull_ands (winners));
}
PostgreSQL code Analysis, query optimization section, process_duplicate_ors