PostgreSQL code Analysis, query optimization section.
/ *
* process_duplicate_ors
* Given a list of exprs which are ORed together, try to apply
* the inverse OR distributive law.
*
* Returns the resulting expression (could be an AND clause, an OR
* clause, or maybe even a single subexpression).
* /
/ *
* Suppose we have four tables, TEST_A, TEST_B, TEST_C, TEST_D, each table has a column A, B, C, D,
* That is, TEST_A has a column A, TEST_B has a column B, and so on.
*
* This function handles this situation. For a selection, SELECT * FROM TEST_A, TEST_B, TEST_C, TEST_D
* WHERE (A = 1 AND B = 1) OR (A = 1 AND C = 1) OR (A = 1 AND D = 1);
*
* WHERE condition in the statement:
* (A = 1 AND B = 1) OR (A = 1 AND C = 1) OR (A = 1 AND D = 1)
* Can be rewritten as:
* (A = 1) AND (B = 1 OR C = 1 OR D = 1)
* This is the main function of this function.
*
* The parameter of this function is a list. For the above WHERE condition, the structure of orlist is as follows:
* There is one element in orlist, which is BoolExpr of type OR_EXPR. The structure in BoolExpr is as follows:
typedef struct BoolExpr
{
Expr xpr; = slightly
BoolExprType boolop; = OR_EXPR
List * args; = 3 conditions in OR, namely (A = 1 AND B = 1) OR (A = 1 AND C = 1) OR (A = 1 AND D = 1)
bool plusFlag; = slightly
} BoolExpr;
*
* The following analyzes the specific implementation of the function. The approximate steps are:
* 1) Analyze the common terms in each OR, 2) extract the common terms, and 3) merge the remaining terms into AND.
* /
static Expr *
process_duplicate_ors (List * orlist)
{
List * reference = NIL;
int num_subclauses = 0;
List * winners;
List * neworlist;
ListCell * temp;
if (orlist == NIL)
return NULL; / * probably can't happen * /
/ * If there is only one. . . . ,Then forget it */
if (list_length (orlist) == 1) / * single-expression OR (can this
* happen?) * /
return linitial (orlist);
/ *
* Choose the shortest AND clause as the reference list --- obviously, any
* subclause not in this clause isn't in all the clauses. If we find a
* clause that's not an AND, we can treat it as a one-element AND clause,
* which necessarily wins as shortest.
* /
/ *
* "Find the shortest".
* In the WHERE statement:
* (A = 1 AND B = 1) OR (A = 1 AND C = 1) OR (A = 1 AND D = 1)
* OR operation concatenates three sub-statements to find the shortest one, because if there is a common item, then the shortest one must be
* Contains common items, then by finding the shortest one, the number of comparisons can be reduced in subsequent operations.
* In the above WHERE statement, the length of the three sub-statements is the same. According to the following execution process, you should find (A = 1 AND B = 1),
* Is the first.
* /
foreach (temp, orlist)
{
Expr * clause = (Expr *) lfirst (temp);
if (and_clause ((Node *) clause))
{
List * subclauses = ((BoolExpr *) clause)-> args;
int nclauses = list_length (subclauses);
Ranch
/ *
* The length in the sub-statement is judged here, for example, for the (A = 1 AND B = 1) sub-statement,
* He is actually two sub-sentences connected by an AND, then his length is 2.
*
* Record the shortest sub-statements through nclauses, if there are shorter (nclauses <num_subclauses),
* Then replace it with the shortest.
* /
if (reference == NIL || nclauses <num_subclauses)
{
reference = subclauses;
num_subclauses = nclauses;
}
}
else
{
/ *
* There is also a case, it may be that the clause is not an AND statement, which does not seem to meet the rules
* Then consider him as a whole, then this is the shortest element.
*
******************************
* If the code is executed here, there are only two cases:
* One is ... WHERE (A = 1 AND B = 1) OR (A = 1 AND C = 1) OR (A = 1).
* One is ... WHERE ((A = 1 OR C = 1) AND B = 1) OR (A = 1 OR C = 1).
* In both cases, the following simplifications can be made:
* The first case is simplified as A = 1
* The second case is simplified as (A = 1 OR C = 1)
*
* The third case is to be added ...
* /
reference = list_make1 (clause);
break;
}
}
/ *
* Just in case, eliminate any duplicates in the reference list.
* /
/ * Find the shortest, save to List * /
reference = list_union (NIL, reference);
/ *
* Check each element of the reference list to see if it's in all the OR
* clauses. Build a new list of winning clauses.
* /
/ *
* "Find Public Items".
*
* NOTE: At this time, the advantages brought by "find the shortest" can be reflected, and the number of outer loops will be less.
*
* If the WHERE statement is:
* (A = 1 AND B = 1) OR (A = 1 AND C = 1) OR (A = 1 AND D = 1)
* It must be found in "find the shortest" (A = 1 AND B = 1).
* Then the outer layer will have two loops ... (foreach (temp, reference)), and the variables of the two loops are
* A = 1 and B = 1.
* The inner layer has three loops ... (foreach (temp2, orlist)), and the variables of the three loops are
* (A = 1 AND B = 1) and (A = 1 AND C = 1) and (A = 1 AND D = 1)
*
* Examples are as follows:
* If the outer loop is now executed for the first time, it will look for the common item of A = 1, and if the inner loop is also executed for the first time,
* That is, look for (A = 1 AND B = 1) whether the public item A = 1 exists and find that it exists (list_member),
* Then judge the second clause of the inner loop in turn ...
*
* As the above example, specifically, the operations of these loops are:
* Outer layer for the first time:
* Determine if A = 1 is in (A = 1 AND B = 1), in, judge next
* Judge whether A = 1 is in (A = 1 AND C = 1), yes, judge next
* Determine if A = 1 is in (A = 1 AND D = 1), in which A = 1 is a public item, record (winners = lappend ...)
* Outer layer for the second time:
* Determine if B = 1 is present (A = 1 AND B = 1), yes, determine the next
* Determine whether B = 1 is present (A = 1 AND C = 1), not present, jump out of the loop, the next one need not be judged.
* Determine if B = 1 is in (A = 1 AND D = 1), not executed, because the previous one does not contain public items, it is impossible to extract.
* /
winners = NIL;
foreach (temp, reference)
{
Expr * refclause = (Expr *) lfirst (temp);
bool win = true;
ListCell * temp2;
foreach (temp2, orlist)
{
Expr * clause = (Expr *) lfirst (temp2);
if (and_clause ((Node *) clause))
{
if (! list_member (((BoolExpr *) clause)-> args, refclause))
{
win = false;
break;
}
}
else
{
if (! equal (refclause, clause))
{
win = false;
break;
}
}
}
if (win)
winners = lappend (winners, refclause);
}
/ *
* If no winners, we can't transform the OR
* /
if (winners == NIL)
return make_orclause (orlist);
/ *
* Generate new OR list consisting of the remaining sub-clauses.
*
* If any clause degenerates to empty, then we have a situation like (A
* AND B) OR (A), which can be reduced to just A --- that is, the
* additional conditions in other arms of the OR are irrelevant.
*
* Note that because we use list_difference, any multiple occurrences of a
* winning clause in an AND sub-clause will be removed automatically.
* /
/ *
* "Extract public items".
* Use list_difference to delete public items. Implementation details are not described in detail.
* /
neworlist = NIL;
foreach (temp, orlist)
{
Expr * clause = (Expr *) lfirst (temp);
if (and_clause ((Node *) clause))
{
List * subclauses = ((BoolExpr *) clause)-> args;
Ranch
/ * Look here ... look here ..., eliminate public items * /
subclauses = list_difference (subclauses, winners);
if (subclauses! = NIL)
{
/ * After elimination, the remaining stitching is spliced into: (B = 1 OR C = 1 OR D = 1) * /
if (list_length (subclauses) == 1)
neworlist = lappend (neworlist, linitial (subclauses));
else
neworlist = lappend (neworlist, make_andclause (subclauses));
}
else
{
/ *
* This means that one of the sub-statements is a common item, which is the following form:
* ... WHERE (A = 1 AND B = 1) OR (A = 1)
*
* At this time, the common term is A = 1, the first clause is (A = 1 AND B = 1), and the second clause is (A = 1),
* The second clause goes through list_difference and the result returned is NULL.
* In this case, it can be simplified as: A = 1, because (A = 1 AND B = 1) must satisfy the situation of A = 1.
* /
neworlist = NIL; / * degenerate case, see above * /
break;
}
}
else
{
if (! list_member (winners, clause))
neworlist = lappend (neworlist, clause);
else
{
neworlist = NIL; / * degenerate case, see above * /
break;
}
}
}
/ *
* Append reduced OR to the winners list, if it's not degenerate, handling
* the special case of one element correctly (can that really happen?).
* Also be careful to maintain AND / OR flatness in case we pulled up a
* sub-sub-OR-clause.
* /
if (neworlist! = NIL)
{
if (list_length (neworlist) == 1)
winners = lappend (winners, linitial (neworlist));
else
/ * neworlist should be (B = 1 OR C = 1 OR D = 1), so use make_orclause * /
winners = lappend (winners, make_orclause (pull_ors (neworlist)));
}
/ *
* And return the constructed AND clause, again being wary of a single
* element and AND / OR flatness.
* /
if (list_length (winners) == 1)
return (Expr *) linitial (winners);
else
/ * The returned form is: (A = 1) AND (B = 1 OR C = 1 OR D = 1), so make_andclause * /
return make_andclause (pull_ands (winners));
}
PostgreSQL code Analysis, query optimization section, process_duplicate_ors