Ordinary developers rarely get the chance to optimize tables with tens of millions of records, so I am grateful the company provided such an environment and let me do this optimization work. With fewer than 100,000 records in a database, it is hard to tell how skilled a developer is; once the record count exceeds 10 million, it genuinely tests a developer's overall technical ability.
Of course, not every company can afford a professional DBA, and a professional DBA might not stay with a company like ours for long anyway; that is not just a matter of salary and benefits, but also of people's long-term career plans. I am not a professional DBA, but in the spirit of "any cat that catches mice is a good cat," I set out to solve the problem myself.
First, look at the size of the data: the main table contains 10,581,490 records, and more than 70,000 related records must be read from another table.
The paging result looks like this: viewing the data for one unit, with 16 records per page, 1,087,292 records in total, and 67,956 pages.
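The page count follows directly from the totals above; as a quick check (the numbers are taken from the article):

```sql
-- 1,087,292 records at 16 records per page:
SELECT CEILING(1087292 / 16.0) AS Pages;   -- 67956
```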
The problems encountered are as follows:
1: After the customer had used the system for several years, paging became very slow, almost intolerable.
2: Paging to the last page was usually slow and could even crash, especially when there were many records.
Here are the steps I took to solve the problem:
1: First, optimize the database. The program itself is quite complicated and cannot be changed for the moment, so starting with the database field types already brings a good improvement.
First, change the datetime columns in the database to smalldatetime. Shrinking the database by several hundred MB gives a great sense of accomplishment; at the very least, disk read pressure drops considerably. Because the tables hold tens of millions of records, you cannot modify the structure through the management tool's table designer; you can only execute SQL commands in a new query window.
Using the designer, the following timeout occurs.
So the table structure can only be adjusted by executing queries directly. Each SQL command takes about 10 minutes to run; the data volume is simply too large.
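As a minimal sketch of the type change described above (the table and column names here are hypothetical, since the actual schema is not shown in the article):

```sql
-- Hypothetical table/column names. smalldatetime halves storage from
-- 8 bytes to 4 bytes per value, at the cost of a narrower range
-- (1900-2079) and minute precision.
ALTER TABLE dbo.SalesRecord
    ALTER COLUMN RecordTime smalldatetime NOT NULL;
```

On a ten-million-row table, even this single statement can run for many minutes, which matches the roughly 10-minute execution time mentioned above.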
2: Next, optimize the database indexes. The original indexes were a mess, so I dropped them all and rebuilt them from scratch.
Remove the unnecessary indexes first, then re-create the ones that are needed. With so many records, excessive indexes bloat the database and drag down write performance.
The primary key is set as a descending, nonclustered index, so that the latest data sorts to the front.
Index the main query conditions, and put the GROUP BY column first in a clustered index. This makes queries much faster; without it, efficiency is poor because the data set is so huge. Past 10 million records, the difference is quite obvious.
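The index rework described above can be sketched roughly as follows (all object names are hypothetical; the article does not show the real schema):

```sql
-- Drop one of the old, messy indexes (repeat for each).
DROP INDEX IX_SalesRecord_Old1 ON dbo.SalesRecord;

-- Descending nonclustered index on the primary key column,
-- so the newest rows sort to the front of the result.
CREATE UNIQUE NONCLUSTERED INDEX IX_SalesRecord_PK
    ON dbo.SalesRecord (RecordID DESC);

-- Clustered index leading with the main filter / GROUP BY column,
-- so rows for one unit are physically stored together.
CREATE CLUSTERED INDEX IX_SalesRecord_Unit
    ON dbo.SalesRecord (UnitID, RecordTime);
```

Leading the clustered index with the column that appears in the GROUP BY and in the paging filter is what lets SQL Server satisfy the per-unit queries with a range scan instead of touching the whole table.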
After completing these two steps, paging is much faster, and at least the page no longer crashes. Unfortunately, with a large data volume, the last page is still somewhat slow, slow enough to be uncomfortable, but at least it no longer fails.
3: Key optimization: the database paging stored procedure, to solve the unbearable last page first. Paging uses the reverse SELECT TOP N method: I calculate the exact number of records on the last page and handle it specially, which fixed the worst of the last-page slowness. It is not completely solved, but the speed is noticeably better. The paging procedure I wrote is a bit complicated, and I don't dare to tamper with it much; I just want to solve the problem without causing more trouble.
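A minimal sketch of the reverse TOP N idea for the last page, under assumed names (the procedure in the article is more complicated than this). Instead of skipping almost 68,000 pages forward, read the remainder from the tail in reverse order and flip it back:

```sql
DECLARE @Total int = 1087292, @PageSize int = 16;
DECLARE @UnitID int = 1;   -- hypothetical filter value

-- The last page holds @Total % @PageSize records: 1087292 % 16 = 12,
-- not a full 16, which is why it needs an exact count.
DECLARE @LastPageCount int = @Total % @PageSize;
IF @LastPageCount = 0 SET @LastPageCount = @PageSize;

-- Take TOP N in the reverse of the display order (display is newest
-- first), then re-sort so the page comes out in display order.
SELECT *
FROM (
    SELECT TOP (@LastPageCount) *
    FROM dbo.SalesRecord
    WHERE UnitID = @UnitID
    ORDER BY RecordID ASC        -- reversed order: oldest first
) AS t
ORDER BY t.RecordID DESC;        -- back to display order
```

The same trick applies to any page past the midpoint: it is cheaper to count from the end than from the beginning.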
4: Compare the results after the database structure optimization, as shown below.
Index space before optimization: 2,706.109 MB.
Index space after optimization: 520.805 MB.
Think about it: on a table of over ten million records, index optimization alone reclaimed more than 2,000 MB of space, and that by itself brings a considerable performance gain.
5: Key optimization: the program code. In fact, code optimization came before index optimization, because you have to read the code and understand the business logic before you can optimize the indexes. The order in this article is somewhat reversed from the actual work, so please bear with that; I will keep writing in the order of my ideas.
The program performed LIKE matching on enterprise numbers and enterprise names. When the table exceeds 10 million records, running LIKE over character columns is truly terrible, since that much data has to be scanned on every query. A computer is fast, but string-matching tens of millions of rows on every request gets you nowhere.
Improvement Method:
A: Use the fuzzy search on enterprise ID and enterprise name only as a first step, to precisely locate a particular pharmacy by name.
B: Once the enterprise ID is obtained, the enterprise name is no longer matched; the query filters by enterprise ID with =, and the enterprise ID column is indexed.
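The before/after of this change can be sketched like so (column and parameter names are hypothetical):

```sql
-- Before: leading-wildcard LIKE cannot use an index, so every
-- request scans tens of millions of rows.
--   SELECT ... WHERE EnterpriseName LIKE '%' + @Name + '%'

-- After: index the enterprise ID column once...
CREATE NONCLUSTERED INDEX IX_SalesRecord_EnterpriseID
    ON dbo.SalesRecord (EnterpriseID);

-- ...then filter with an equality seek instead of a scan.
DECLARE @EnterpriseID int = 12345;   -- hypothetical value
SELECT TOP (16) *
FROM dbo.SalesRecord
WHERE EnterpriseID = @EnterpriseID
ORDER BY RecordID DESC;
```

An equality predicate on an indexed integer column turns each page request into an index seek, which is what makes the difference at this scale.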
Summary of paging optimization for a massive database:
After nearly a week of hard work, I finally optimized the table with over ten million records and solved the problem. The approach was neither scientific nor professional; there is no theoretical basis, experimental data, chart comparison, or performance-profiling tool behind it. But it solved the problem, and any cat that catches mice is a good cat. Haha.
After the database and the program code were thoroughly optimized, paging is fast: 16 records per page, 1,087,292 records, 67,956 pages, with each page loading within three seconds, and the last page no longer crashes. That is fast enough to live with.
When I have some free time, I will look into the remaining slowness on the last page and solve it properly. For now, I'll stop here before I break something.
Each optimization action takes about 10 minutes to execute. If a wrong operation is performed, it usually takes about half an hour to undo it and re-execute the corrected SQL statement, so the results of one day's optimization are not very visible; finishing takes several days.