The invisible system killer: blocking and waiting (SQL)
Preface
An application system supports many lines of business with complicated business logic, so the database runs a large number of SQL statements of many different types.
The execution speed of a SQL statement is inseparable from blocking and waiting.
A system can be slow for many reasons: insufficient hardware resources, unoptimized statements, poor schema design, or a lack of necessary operations and maintenance (O&M) practices. All of these problems surface as blocking and waiting, so examining waits is a way to discover and solve them.
This article describes how to discover and resolve blocking and waiting in a system.
Scenario Description
Does your system have any of these problems?
- The system runs slowly. Many functions take tens of seconds to return results, the user experience is very poor, and management keeps applying pressure. As the system owner, you know only that the system is slow, not where it is slow. The problem never gets fixed in time, and management has either complained already or simply grown used to the slowness.
- A function runs slowly. A statement takes a long time in the production environment, yet the same statement is very fast in the test environment or when run on its own. That seems to make no sense.
- I know a fair amount about databases. I can find out what the system is waiting on, but I don't know what these waits mean, and the assorted answers on Baidu do not solve my problem.
- I can find and resolve some of these waits, but I only have a handful of scripts, which is not enough to get a complete picture of the current state.
- I am an expert and can solve the problems, but I cannot present them to management in an intuitive way.
System Wait Overview
A good SQL statement is like a car with a top speed of 180. Good system hardware (CPU, memory, and disk) is like a smooth, wide road. With a good car on a good road, it seems you should be able to drive fast. But one thing has been overlooked. When you drive a Ferrari on Beijing's wide Third Ring Road, even if you are a veteran racer of "three rings 12 shao" fame, how fast can you go in the morning rush hour? This is the Beijing morning rush hour!
This example introduces the concepts of blocking and waiting. Stopping at a red light (a hardware wait, such as an IO wait) is normal waiting. If the car in front of you does not move or moves slowly, you can only wait; in other words, you are blocked by it.
The following figure shows the main wait types in a system and their solutions:
Problem diagnosis
Any problem should be diagnosed from a global perspective. The worst habit is to see one high indicator, immediately declare the problem located, and then analyze it in isolation.
A problem may involve many parts, so we must first locate it at the level of the whole system. Blocking is no different: we need to know which types of blocking exist in the system, which are the main causes, which are knock-on effects of other problems, and which are secondary.
Globally locate blocking and waiting
First, we need to look at which wait types occur in the database.
Note: This chart shows the waits occurring in the system. As a diagnostic script would, it excludes wait types that do not need attention and sorts the remaining ones.
Horizontal axis: wait type
Vertical axis: number of occurrences in the collection period
Once we know the wait types, we need to know which of them take the most time:
Note: The wait time per wait type is the main lead for troubleshooting. Combining wait type with wait time shows which waits exist in the system, which are serious, and which is the most serious.
Horizontal axis: wait type
Vertical axis: average wait time
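The two charts described above can be approximated with a short script. The following is only a minimal sketch: the field names mirror the columns of SQL Server's sys.dm_os_wait_stats view, but the sample numbers, the ignore list, and the function names are assumptions made for illustration, not output from any real system.

```python
from typing import NamedTuple


class WaitStat(NamedTuple):
    wait_type: str
    waiting_tasks_count: int  # number of waits in the collection period
    wait_time_ms: int         # total wait time in the collection period


# Illustrative ignore list (assumption): idle waits that rarely need attention.
IGNORED = {"SLEEP_TASK", "LAZYWRITER_SLEEP"}


def summarize(stats):
    """Drop ignored wait types and attach the average wait per occurrence."""
    rows = []
    for s in stats:
        if s.wait_type in IGNORED or s.waiting_tasks_count == 0:
            continue
        avg_ms = s.wait_time_ms / s.waiting_tasks_count
        rows.append((s.wait_type, s.waiting_tasks_count, avg_ms))
    return rows


def top_by_count(stats):
    """First chart: wait types ordered by number of occurrences."""
    return sorted(summarize(stats), key=lambda r: r[1], reverse=True)


def top_by_avg_wait(stats):
    """Second chart: wait types ordered by average wait time."""
    return sorted(summarize(stats), key=lambda r: r[2], reverse=True)


# Made-up sample numbers, for illustration only.
SAMPLE = [
    WaitStat("CXPACKET", 5000, 250000),
    WaitStat("LCK_M_U", 200, 600000),
    WaitStat("PAGEIOLATCH_SH", 1000, 20000),
    WaitStat("SLEEP_TASK", 90000, 900000),
]

if __name__ == "__main__":
    for name, count, avg in top_by_avg_wait(SAMPLE):
        print(f"{name:16} count={count:6} avg_wait_ms={avg:8.1f}")
```

With this sample data, the most frequent wait and the longest average wait are different wait types, which is why both views are worth checking.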
After identifying the main wait types and their wait times, we need to analyze further: Which database do the waits come from? Which programs? Which users' requests? At what time is blocking most severe?
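These questions amount to grouping the same wait records along different dimensions. A rough sketch of that idea follows; the record layout, the database and program names, and all values are hypothetical and stand in for whatever a monitoring tool or script actually collects.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical collected wait events; every name and value here is made up.
EVENTS = [
    {"database": "Sales", "program": "OrderService", "login": "app_user",
     "time": datetime(2024, 1, 1, 11, 5), "wait_ms": 4000},
    {"database": "Sales", "program": "OrderService", "login": "app_user",
     "time": datetime(2024, 1, 1, 11, 20), "wait_ms": 3000},
    {"database": "Reports", "program": "ReportJob", "login": "report_user",
     "time": datetime(2024, 1, 1, 11, 40), "wait_ms": 2500},
    {"database": "Sales", "program": "OrderService", "login": "app_user",
     "time": datetime(2024, 1, 1, 15, 10), "wait_ms": 500},
]


def total_wait_by(events, key_func):
    """Sum wait time per group and return the groups, largest total first."""
    totals = defaultdict(int)
    for event in events:
        totals[key_func(event)] += event["wait_ms"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    print("by database:", total_wait_by(EVENTS, lambda e: e["database"]))
    print("by program: ", total_wait_by(EVENTS, lambda e: e["program"]))
    print("by login:   ", total_wait_by(EVENTS, lambda e: e["login"]))
    print("by hour:    ", total_wait_by(EVENTS, lambda e: e["time"].hour))
```

Passing the grouping key as a function keeps one aggregation routine for all four questions; only the key changes.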
Waits on specific statements
With the overall wait picture clear, we next look at the specific statements that cause the waits. This is an important analysis step in solving the problem.
Which types of statements wait most often?
Note: Here we can analyze the blocked statement types along several dimensions: number of waits, wait time, and resource consumption.
What exactly is a given statement waiting on? The detail view shows the actual blocking status of a specific statement.
Note: The detailed blocking view clearly shows the blocking tree for the statement, including the blocked statements, the blocking time, and the type of resource being waited on.
Blocking tree: In this example, session 68 is blocked by session 66, and session 66 is blocked by session 104. These three sessions form a blocking chain, also called a blocking tree.
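The blocking chain in this example can be reconstructed from pairs of (session, blocking session), such as those exposed by the session_id and blocking_session_id columns of SQL Server's sys.dm_exec_requests view. The sketch below is illustrative only: the dictionary layout and function names are assumptions, and a blocker of 0 or None is treated as "not blocked".

```python
def blocking_chain(session_id, blocked_by):
    """Follow blockers from session_id up to the head of the chain."""
    chain = [session_id]
    seen = {session_id}
    current = session_id
    while True:
        blocker = blocked_by.get(current)
        if not blocker:        # None or 0: this session is not blocked
            return chain
        if blocker in seen:    # guard against a cycle in the input data
            return chain
        chain.append(blocker)
        seen.add(blocker)
        current = blocker


def head_blockers(blocked_by):
    """Sessions that block others but are not blocked themselves."""
    blockers = {b for b in blocked_by.values() if b}
    return sorted(b for b in blockers if not blocked_by.get(b))


# The example from the text: 68 waits on 66, which waits on 104.
BLOCKED_BY = {68: 66, 66: 104, 104: 0}

if __name__ == "__main__":
    print(" -> ".join(str(s) for s in blocking_chain(68, BLOCKED_BY)))
    print("head blockers:", head_blockers(BLOCKED_BY))
```

The head blocker (session 104 here) is usually the place to start, since resolving it releases every session further down the chain.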
Diagnostic conclusion
By moving from global positioning, through statement type analysis, down to the blocking status of specific statements, and by analyzing dimensions such as blocking type, count, duration, connecting program, and resource consumption, we get a clear view of the blocking problems in the database.
In this example, the main blocking types in the system are CXPACKET and LCK_M_U. Blocking lasts a long time and is concentrated around 11 a.m. The main blocked statements are an UPDATE statement and a complex SELECT query.
Problem Solving
First, the figure below briefly outlines the solutions for the corresponding wait types.
Note: Depending on the situation, the main ways to reduce blocking are adjusting server, instance, and database configuration parameters (for example, the degree of parallelism), changing the isolation level (for example, snapshot reads or NOLOCK), and optimizing statements (for example, adding indexes or rewriting queries).
In this example, the CXPACKET waits are mainly caused by a poorly configured instance-level parallelism parameter. The LCK_M_U waits occur mainly because one update in a batch is blocked by the update lock held by another update. The fix is to optimize the update statements so that they run, and hold their locks, for as short a time as possible. In addition, the most common remedy for updates under high concurrency is to add an index so that key locks are taken instead of coarser locks, which improves concurrency. Even when the updated table holds only a few dozen rows, concurrency with and without an index differs greatly.
The design of the program also matters a great deal, and many of the subtleties can only be learned in a real environment. The main purpose of using the SQL expert cloud tool is to locate problems comprehensively, present them clearly through charts and statistics, and resolve them quickly by following the solutions the tool provides.
That is all for this article. I hope it helps you in your study or work.