Last night, -- No, it should be the early morning of this day. -- it is doomed to be an insimple moment. After I joined the company, I was involved in the first project release, and I was doing data correction. operations involving user data were always very sensitive and important. Therefore, the psychology is quite tense. It turns out that the real environment is more practical. Although I checked it several times in advance, there was still a problem during actual execution.
This is the case: I have written several. py scripts to operate the database. I need to update resource records of more than 10000 users. each user's resource record involves several tables of several databases. During the specific execution, it is found that the resource update for each user is relatively slow-but in fact, it is time-consuming to see which operation can not be found. As a result, everyone immediately gathers to view the log records (so it shows how important the logs are, especially database operations ), it is found that a resource update statement takes 5 S for each user, while the total cost of other resource update statements is about 1 s for each user. After simply estimating the time, we find that the time is not enough.
I was very depressed when I stood there. After all, I got stuck in my own shell, but I didn't expect an error in a seemingly inconspicuous place. -- this is another lesson: important Errors often appear in inconspicuous places.
The cause of the problem is simple: An Update statement for a subquery is used in a table of 0.1 million orders of magnitude. It takes 5 s to execute each update statement, and nearly 10000 update statements exist, what year and month do you have to update! As a result, the "Emergency Response Team" was immediately set up, and after the budget time, they even planned to give up the release plan. However, the solution was finally found:
1. Separate the time-consuming update statement and generate an SQL statement for execution. For the remaining parts, each user needs 1 s, which takes about 2 hours. This is not enough. Since each user's Resource Record Update is almost independent from each other, the boss suggested using a concurrent solution to divide more than 10000 users into 10 segments, multiple shell processes using the same script are corrected separately. This will take about a few minutes.
2. This time-consuming update statement must also be executed. What should I do? Therefore, contact the DBA immediately. They optimized the statement and replaced the subquery with two inner joins. Therefore, the execution time was reduced to several minutes. Too powerful! However, this also raises the question: why is the execution efficiency of subqueries so low?
Lessons learned, experiences and experiences:
1. Make sure that appropriate logs are logged. You can find the cause or other valuable clues when an error occurs;
2. major errors or mistakes often happen not in places that seem to be very important, but in areas where they are not very serious;
3. write programs (especially Database Operations) must consider the constraints of the real environment to check the effectiveness of the Program in the real environment. Otherwise, the written programs may be correct and useless;
4. Major Release plans must be carefully and carefully arranged in advance so that they can be implemented in a step-by-step manner. Of course, accidents cannot be avoided and emergency response preparation should also be prepared;
5. Learn to write efficient SQL statements and concurrent programs.
6. Study the execution efficiency of subqueries and internal connections;
7. interpersonal communication is very important. It cannot be ignored, not because you are a programmer.