Problem Description:
The table in question is very large: 40 million records, with ID as the primary key and a secondary index on program_id.
We executed the following SQL:
SELECT * FROM Program_access_log WHERE program_id BETWEEN 1 AND 4000
It was very slow.
We assumed the problem was the sheer number of records involved, so we added a range condition on ID to read only 500,000 records at a time:
SELECT * FROM Program_access_log WHERE ID BETWEEN 1 AND 500000 AND program_id BETWEEN 1 AND 4000
But the SQL was still slow; the speed barely improved.
With only 500,000 rows in range and both condition columns indexed, this statement should complete almost instantly.
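The original report does not show how the index choice was diagnosed, but the standard way to check is EXPLAIN, whose `key` column reveals which index the optimizer actually picked:

```sql
-- Sketch of a diagnostic step (not from the original report):
-- the 'key' column of EXPLAIN's output shows the index MySQL chose.
EXPLAIN
SELECT * FROM Program_access_log
WHERE ID BETWEEN 1 AND 500000
  AND program_id BETWEEN 1 AND 4000;
```

On a table like this, the output would show the program_id index in `key` rather than `PRIMARY`, confirming the analysis below.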
Problem Analysis:
The table is about 30 GB, while the database server has only 16 GB of memory, so the table cannot fit in memory all at once. That is the root of the problem.
The SQL has two conditions: ID between 1 and 500,000, and program_id between 1 and 4,000. Because the program_id range is much narrower, MySQL's optimizer chooses the program_id index.
Finding all the row IDs whose program_id falls between 1 and 4,000 via the program_id index file is very fast.
The next step is to fetch the rows from the table by those IDs, and because the IDs are scattered, MySQL's access to the table is not a sequential read.
Since the table cannot fit in memory, each row lookup forces MySQL to seek on disk and load the surrounding records into memory; this large number of random I/O operations is what caused the drop in speed.
Solutions:
1. Partition the table by program_id
2. Split it into sub-tables, each no larger than available memory
However, the server runs MySQL 5.0, which does not support partitioning, and this is a shared table whose structure cannot be modified without affecting other projects.
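For reference, on MySQL 5.1 or later, option 1 might look like the hypothetical sketch below (not applicable here, since the server runs 5.0; note also that MySQL requires every unique key, including the primary key, to contain the partitioning column, so the primary key would first have to be changed to (ID, program_id)):

```sql
-- Hypothetical sketch of range partitioning by program_id (MySQL 5.1+).
-- The partition boundaries shown are illustrative, not from the original.
ALTER TABLE Program_access_log
PARTITION BY RANGE (program_id) (
    PARTITION p0 VALUES LESS THAN (4000),
    PARTITION p1 VALUES LESS THAN (8000),
    PARTITION p2 VALUES LESS THAN MAXVALUE
);
```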
So we took a third approach:
SELECT * FROM Program_access_log WHERE ID BETWEEN 1 AND 500000 AND program_id BETWEEN 1 AND 15000000
Now the program_id range is much wider than the ID range, so MySQL looks up rows via the primary key instead. Because ID is the primary key, the lookup reads 500,000 consecutive records, and the speed is essentially the same as scanning a 500,000-row table.
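A more explicit way to achieve the same effect, rather than inflating the program_id range to trick the optimizer, is an index hint (an alternative the original authors did not use, sketched here as an assumption):

```sql
-- Force the optimizer to use the primary key on ID, so the scan
-- reads 500,000 consecutive rows instead of doing random lookups.
SELECT * FROM Program_access_log FORCE INDEX (PRIMARY)
WHERE ID BETWEEN 1 AND 500000
  AND program_id BETWEEN 1 AND 4000;
```

The hint keeps the query's logic unchanged while making the intended access path explicit in the SQL itself.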
Summary:
This was a slow-lookup problem on a table with tens of millions of rows, caused by the optimizer's choice of index. It is a fairly typical case, and we share it here for discussion.