Optimization of MySQL database index Query

Source: Internet
Author: User

Problem description:
The table we want to access is a very large table with 40 million records, the id is the primary key, and the index is created on program_id.
Execute an SQL statement:

Select * from program_access_log where program_id between 1 and 4000
This SQL statement is very slow.
We originally thought that the reason for processing too many records, so we added the id limit, read-only 0.5 million records at a time

Select * from program_access_log where id between 1 and 500000 and program_id between 1 and 4000
However, this SQL statement is still very slow, and the speed is almost no higher than the previous one.
Mysql processes a table with 0.5 million records, and the condition field is also indexed. This statement should be completed instantly.

Problem Analysis:
This table has a capacity of about 30 GB, and the database server memory is 16 GB. It cannot be loaded at a time. This causes the problem.
This SQL statement has two conditions: ID 1 to 0.5 million and Program_id 1 to four thousand. Because the range of program_id is much smaller, mysql chooses it as the primary index.
First, the index file is used to find all the IDs of all program_id in the range of 1 to 4000. This process is very fast.
Next, we need to use these IDs to find records in the Table. Because these IDs are discrete, mysql does not read the table in sequence.
This table is very large and cannot be loaded into memory at a time. Therefore, every access record of mysql must be located on the disk again and all nearby records should be loaded into the memory, A large number of IO operations lead to a reduction in speed.

Solution:
1. Partition the table with the program_id as the condition
2. Table sharding. The size of each table cannot exceed the memory size.
However, the server uses mysql5.0 and does not support partitions. This table is a public table and cannot be modified without affecting other projects.
So we adopted the third method:

Select * from program_access_log where id between 1 and 500000 and program_id between 1 and 15000000
The range of program_id is far greater than that of id. id is used as the primary index for search. Because id is the primary key, it searches for 0.5 million consecutive records, the speed is basically the same as accessing a 0.5 million-record table.

Summary:
This is a problem that data query slows down due to the use of indexes in tens of millions of records tables. It is typical to share with you!

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.