One instance explains MySQL Query Optimization (III) _ MySQL

Source: Internet
Author: User
Tags mysql query optimization
One instance explains MySQL Query Optimization (3) instance analysis
The following is an example of a manufacturing company to illustrate how to optimize queries. The database of the manufacturing company contains three tables. the mode is as follows:

1. part table

Parts part description other columns

(Part_num) (part_desc) (other column)

102,032 Seageat 30G disk ......

500,049 Novel 10 M network card ......

......

Instance analysis

The following is an example of a manufacturing company to illustrate how to optimize queries. The database of the manufacturing company contains three tables. the mode is as follows:

1. part table

Parts part description other columns

(Part_num) (part_desc) (other column)

102,032 Seageat 30G disk ......

500,049 Novel 10 M network card ......

......

2. vendor table

Vendor ID, vendor name, other columns

(Vendor _ num) (vendor_name) (other column)

910,257 Seageat Corp ......

523,045 IBM Corp ......

......

3. parven table

Part No. manufacturer No. part quantity

(Part_num) (vendor_num) (part_amount)

102,032 910,257 3,450,000

234,423 321,001 4,000,000

......

The following query runs regularly on these tables and generates a report on the quantity of all parts:

SELECT part_desc, vendor_name, part_amount

FROM part, vendor, parven

WHERE part. part_num = parven. part_num

AND parven. vendor_num = vendor. vendor_num

Order by part. part_num

If no index is created, the overhead of the preceding query code will be huge. Therefore, we create an index on the part number and the manufacturer number. Index creation avoids repeated scanning in nesting. Statistics on tables and indexes are as follows:

Table row size number of rows per page number of data pages

(Table) (row size) (Row count) (Rows/Pages) (Data Pages)

Part 150 10,000 25 400

Vendor 150 1,000 25 40

Parven 13 15,000 300 50

Index key size per page key quantity page quantity

(Indexes) (Key Size) (Keys/Page) (Leaf Pages)

Part 4, 500, 20

Vendor 4 500 2

Parven 8 250 60

It seems to be a relatively simple 3-table join, but its query overhead is very high. You can see from the system table that there is a cluster index on part_num and vendor_num, so the index is stored in the physical order. The parven table does not have a specific storage order. The size of these tables indicates that the success rate of non-sequential access from the buffer page is very small. The optimal query plan for this statement is: first read 400 pages from the part sequence, and then access the parven table unordered for 10 thousand times, 2 pages each time (one index page and one data page), a total of 20 thousand disk pages, and 15 thousand non-sequential access to the vendor table, combined with 30 thousand disk pages. It can be seen that the cost of disk access on this index is 50.4 thousand times.

In fact, we can improve the query efficiency by using a temporary table in three steps:

1. read data from the parven table in the order of vendor_num:

SELECT part_num, vendor_num, price

FROM parven

Order by vendor_num

INTO temp pv_by_vn

This statement reads parven (50 pages) sequentially, writes a temporary table (50 pages), and sorts it. Assume that the sorting overhead is 200 pages, which is 300 pages in total.

2. connect the temporary table to the vendor table, output the result to a temporary table, and sort by part_num:

SELECT pv_by_vn, * vendor. vendor_num

FROM pv_by_vn, vendor

WHERE pv_by_vn.vendor_num = vendor. vendor_num

Order by pv_by_vn.part_num

Into tmp pvvn_by_pn

Drop table pv_by_vn

This query reads pv_by_vn (50 pages) and uses indexes to access the vendor table for 15 thousand times. However, due to the vendor_num order, in fact, the vendor table is read in an indexed order (40 + 2 = 42 pages). the output table contains about 95 rows on each page, with a total of 160 pages. Writing and accessing these pages triggers 5*160 = 800 reads and writes, and the index reads and writes 892 pages.

3. connect the output and the part to get the final result:

SELECT pvvn_by_pn. *, part. part_desc

FROM pvvn_by_pn, part

WHERE pvvn_by_pn.part_num = part. part_num

Drop table pvvn_by_pn

In this way, the query reads pvvn_by_pn sequentially (160 pages) and reads the part Table 15 thousand times through the index. as the index is built, 1772 disk reads and writes are actually performed. the optimized ratio is. In Informix Dynamic

When doing the same experiment on Sever, we found that the time consumption optimization ratio is (if the data volume is increased, the proportion may be larger ).

Summary

20% of the code takes 80% of the time, which is a naming law in programming, as well as in database applications. Our optimization focuses on SQL execution efficiency for database applications. The key aspect of query optimization is to make the database server read less data from the disk and read pages in sequence rather than in an unordered manner.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.