Five optimization techniques

1. Query reuse

Query reuse means using the results of previous executions wherever possible, which shortens the overall time spent computing a query and reduces resource consumption. At present, query reuse technology focuses on two aspects:

1) Query result reuse: allocate a buffer in the cache that stores the SQL statement text together with its final result set. When the same SQL statement arrives again, the result is returned immediately.
2) Query plan reuse: cache the execution plan of a query statement and its corresponding syntax tree structure.

The pros and cons of query reuse:

1) Pros: saves CPU and I/O consumption.
2) Cons: consumes a lot of memory, and the same SQL statement may produce different result sets for different users.

2. Query rewrite

Ideas behind query rewriting:

1. Convert a procedural query to a descriptive query, such as view rewriting.
2. Convert complex queries to multi-table join queries as much as possible (for example nested subqueries, outer join elimination, nested join elimination).
3. Convert inefficient predicates to equivalent efficient predicates (such as equivalent predicate rewriting).
4. Use the properties of equalities and inequalities to simplify WHERE and HAVING conditions.

There are many ways to rewrite a query, and there are no fixed, uniform rules. The core of rewriting, however, must be "equivalent conversion": a query may be converted only if the result is equivalent. This needs special emphasis.

3. Query optimization algorithm

What is a query plan? A query plan, also known as a query tree, consists of a series of internal operators that are combined according to certain operational relationships to form one execution of a query. For example, to join tables A, B, C and so on, we can first join table A with table B to get an intermediate result, then join that result with table C to get a new intermediate result, and continue until all the tables are joined.
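The A, B, C example above can be sketched as a small binary tree. This is only an illustration, not how MySQL represents plans internally: the names Scan, Join, left_deep_plan and render are hypothetical and chosen just for this sketch.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Scan:
    """Leaf node: read one base table (hypothetical operator)."""
    table: str


@dataclass(frozen=True)
class Join:
    """Inner node: join the results of two child plans (hypothetical operator)."""
    left: "Plan"
    right: "Plan"


Plan = Union[Scan, Join]


def left_deep_plan(tables):
    """Join the tables in the given order: ((A join B) join C) ..."""
    if not tables:
        raise ValueError("at least one table is required")
    plan = Scan(tables[0])
    for name in tables[1:]:
        # The intermediate result so far becomes the left child of the next join.
        plan = Join(plan, Scan(name))
    return plan


def render(plan):
    """Return a parenthesized text form of the plan tree."""
    if isinstance(plan, Scan):
        return plan.table
    return f"({render(plan.left)} JOIN {render(plan.right)})"


plan_text = render(left_deep_plan(["A", "B", "C"]))
print(plan_text)  # ((A JOIN B) JOIN C)
```

Each Join node stands for one intermediate result, so the tree makes the join order explicit. A different order of the same tables gives a different tree, which is what the optimizer has to choose between.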
A query plan is a binary tree with different kinds of nodes:

Single-table node: consider how the data of a single table is accessed:

1. Read the data directly via I/O.
2. Read the data from an index.
3. Locate the data through an index, then read the data block via I/O to get the data.

This is the process of reading data from physical storage into memory and parsing it into logical fields, which matches the requirements of the von Neumann architecture (data in external storage must be read into memory before it can be processed).

Two-table node: the join node of two tables.

Multi-table node: consider which join order of the tables gives the least expensive execution plan, for example whether A and B are joined first or B and C are joined first. Many databases use left-deep trees, right-deep trees, and bushy trees, or a subset of these three, to enumerate the possible join paths.

Strategies for generating an optimal query plan:

1. Rule-based optimization
2. Cost-based optimization

Total cost = CPU cost + I/O cost

While generating a query plan, the optimizer calculates the cost of each access path (the access paths mainly cover the three kinds of relation nodes above) and selects the least expensive one as a subpath. It continues joining tables this way until a complete path is built. MySQL and PostgreSQL use a query optimization strategy based on both rules and cost estimation.

4. Parallel query

Conditions that determine whether a query can be optimized for parallel execution:

1. The resources available in the system
2. The number of CPUs
3. The specific algebraic operators in the operation

Within a single SQL statement, query parallelism is divided into intra-operation parallelism and inter-operation parallelism.

5. Distributed query

In a distributed database system, query optimization focuses on two things: query strategy optimization and local processing optimization. Query strategy optimization is mainly about the data transmission strategy: when data on two nodes A and B has to be joined, should A's data be sent to B, B's data be sent to A, or should the data be filtered first and then transferred? Local processing optimization is the traditional single-node database query optimization technology.

Data communication overhead is the main factor in the optimization algorithm. The cost estimation model is:

Total cost = I/O cost + CPU cost + communication cost
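As a rough sketch of how this cost model can drive the choice of a data transmission strategy, the example below compares three candidate strategies for joining data on nodes A and B. All of the numbers are invented for illustration, and the names Strategy, total_cost and cheapest are hypothetical. Communication cost is assumed to be proportional to the number of rows shipped, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    """One candidate way to bring the data of nodes A and B together."""
    name: str
    io_cost: float
    cpu_cost: float
    rows_shipped: int


def total_cost(strategy, cost_per_row_shipped):
    """Total cost = I/O cost + CPU cost + communication cost."""
    communication_cost = strategy.rows_shipped * cost_per_row_shipped
    return strategy.io_cost + strategy.cpu_cost + communication_cost


def cheapest(strategies, cost_per_row_shipped):
    """Pick the strategy with the lowest estimated total cost."""
    return min(strategies, key=lambda s: total_cost(s, cost_per_row_shipped))


# Made-up estimates: filtering first costs more I/O and CPU locally,
# but sends far fewer rows over the network.
strategies = [
    Strategy("ship A to B", io_cost=100.0, cpu_cost=50.0, rows_shipped=10_000),
    Strategy("ship B to A", io_cost=100.0, cpu_cost=50.0, rows_shipped=2_000),
    Strategy("filter B, then ship to A", io_cost=120.0, cpu_cost=70.0, rows_shipped=200),
]

best = cheapest(strategies, cost_per_row_shipped=0.5)
print(best.name, total_cost(best, 0.5))
```

With these assumed numbers the communication term dominates, so the filter-first strategy wins even though its local I/O and CPU costs are higher. If shipping rows were free, the ranking would be decided by the local costs alone.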
MySQL Query Optimization from Getting Started to Running (II): Database Query Optimization Technology Overview