The source of this article: http://www.cnblogs.com/wy123/p/6984885.html
A SQL Server instance recently became extremely slow to respond and client requests started to fail. At the same time, the database errorlog recorded messages saying that disk I/O requests had taken longer than a certain amount of time to complete.
Is this a problem with the storage system or the disks, a problem with SQL Server itself, or something caused by the application? And how can it be solved?
This article gives a simple analysis of the factors that caused the problem in this case. It cannot cover every possible cause, so similar problems still need to be analyzed case by case.
Disk request time-out in SQL Server
The English version of the error message is as follows:
SQL Server has encountered %d occurrence(s) of I/O requests taking longer than %d seconds to complete on file [%ls] in database ID %d. The OS file handle is 0x%p. The offset of the latest long I/O is: %#016I64x
The Chinese version of the error message, translated back into English, reads as follows:
SQL Server has encountered %1! occurrence(s) of I/O requests to file [%3!] in database ID %4! taking longer than %2! seconds to complete. The operating system file handle is 0x%5!. The offset of the most recent long I/O is: %6!
See error message No. 833 in the system message list.
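When many of these messages pile up in the errorlog, it can help to extract the fields programmatically. The following is a minimal sketch in Python, assuming the log line follows the English template quoted above exactly; the name `parse_long_io` and the sample line are made up for illustration, and some SQL Server builds may print the database part differently.

```python
import re
from typing import Optional

# Pattern modeled on the English message template quoted above.
# Assumption: the errorlog line follows that template exactly; adjust
# the pattern if your build prints the database name differently.
LONG_IO_PATTERN = re.compile(
    r"SQL Server has encountered (?P<occurrences>\d+) occurrence\(s\) of "
    r"I/O requests taking longer than (?P<seconds>\d+) seconds to complete "
    r"on file \[(?P<file>[^\]]+)\] in database id (?P<database_id>\d+)\.",
    re.IGNORECASE,
)


def parse_long_io(line: str) -> Optional[dict]:
    """Return the fields of an 833 message, or None if the line does not match."""
    match = LONG_IO_PATTERN.search(line)
    if match is None:
        return None
    return {
        "occurrences": int(match.group("occurrences")),
        "seconds": int(match.group("seconds")),
        "file": match.group("file"),
        "database_id": int(match.group("database_id")),
    }


# Made-up sample line, for illustration only.
sample = (
    "SQL Server has encountered 3 occurrence(s) of I/O requests taking longer "
    "than 15 seconds to complete on file [D:\\Data\\Sales.mdf] in database id 7. "
    "The OS file handle is 0x0000000000000B8C."
)
print(parse_long_io(sample))
```

Grouping the parsed results by file and by time of day is one way to see whether the long I/Os cluster on a single data file or around a particular workload.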
Symptoms of the 833 disk request time-out error
The specific error is as follows:
SQL Server has encountered m occurrence(s) of I/O requests taking longer than 15 seconds to complete on file *** in database n. The operating system file handle is ***. The offset of the most recent long I/O is: ***
In other words, the database ran into this error during automatic file growth.
Interestingly, the DBA reported this error to the engineer responsible for storage (SAN storage, not locally attached disks), suggesting that the storage system might be faulty or unstable.
The storage engineer believed that storage was not the problem and, after checking the server, concluded that the server itself was abnormal because memory was "almost full".
For a database server, memory being "almost full" is completely normal. Since the storage engineer is not a professional DBA, it is understandable that he was not familiar with how a SQL Server database server uses memory.
The database server uses high-performance SAN storage, which is provided as a service and shared by many servers.
The other servers saw no disk request time-outs, so a problem that appears on a single server cannot simply be written off as a "storage failure".
So what is the cause?
Meaning of database engine error 833
First, look at what error 833 actually means. I will not show off by explaining it in my own words; the classic reference books describe it very clearly.
In short, it means that when SQL Server issued a read or write request to disk, the disk was busy or some other factor interfered, and the request had still not completed after more than 15 seconds.
For example, when a data read or write request reaches the disk and the disk is busy or has some other problem, the response comes late or not promptly enough, which seriously affects the response time SQL Server can offer.
From the simple analysis above: because the problem is not widespread, the storage system is unlikely to be at fault, and the cause most likely lies with this server itself.
Cause analysis
Because this is a dedicated SQL Server machine with no requests from other applications, the problem is most likely related to the requests sent to the SQL Server database.
In fact, there were warning signs before this problem appeared. The server is normally stable (CPU rarely above 60%, PLE steady at more than 20 minutes, low disk I/O latency, and so on), but occasionally it has a brief fit.
During these episodes CPU rises to about 80%, PLE drops sharply, and I/O latency increases severely.
The only place to start is the SQL Server sessions. Observing the active sessions in SQL Server showed that one particular kind of SQL statement was taking a very long time to run.
This kind of statement is executed fairly frequently during certain time periods.
Under normal circumstances it executes quite efficiently, so why did it suddenly become so inefficient?
Checking the active sessions and their execution plans showed that these sessions were waiting on I/O (PAGEIOLATCH_SH) and that the statement was running with a completely unexpected execution plan.
Because similar queries run frequently and are issued from many different clients, once the statement executes inefficiently a large backlog of active sessions builds up on the server.
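To make such a backlog visible at a glance, a snapshot of the active requests can be summarized by wait type. The sketch below is illustrative only: the `snapshot` rows are invented, `backlog_by_wait_type` is a hypothetical helper, and in practice the rows would have to come from a query against a view such as `sys.dm_exec_requests`.

```python
from collections import Counter

# Hypothetical snapshot of active requests: (session_id, wait_type, elapsed_seconds).
# The values are invented for illustration.
snapshot = [
    (61, "PAGEIOLATCH_SH", 42),
    (64, "PAGEIOLATCH_SH", 39),
    (70, "CXPACKET", 3),
    (73, "PAGEIOLATCH_SH", 51),
    (75, None, 0),  # running, not waiting
]


def backlog_by_wait_type(rows, min_elapsed_seconds=10):
    """Count waiting sessions per wait type, skipping requests shorter than the cutoff."""
    counts = Counter()
    for _session_id, wait_type, elapsed in rows:
        if wait_type is not None and elapsed >= min_elapsed_seconds:
            counts[wait_type] += 1
    return counts


print(backlog_by_wait_type(snapshot).most_common())
```

A wait type that dominates the long-running sessions, as PAGEIOLATCH_SH does in this made-up snapshot, points toward sessions stalled on page reads from disk.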
Why did a SQL statement that normally runs well suddenly become so slow?
The reason is that at some point SQL Server automatically triggered a statistics update. The table is large, and the default sampling rate of a statistics update is too low for it; with an insufficient sampling percentage, the resulting statistics are completely unusable.
Refer to the previous article: http://www.cnblogs.com/wy123/p/5875237.html
Once the automatic statistics update completed, SQL Server, working from the newly collected statistics, executed the same statement in a way it considered efficient but that was actually unreasonable (a table scan instead of an index seek).
As a result the statement ran with an unreasonable execution plan, which in turn caused session congestion: a large number of sessions sent by the clients were all executing slowly and inefficiently at the same time.
So CPU soared, I/O latency increased, and PLE dropped sharply.
It is then not hard to understand what happened: with dozens of query sessions making unreasonable requests to disk, the disk was busy serving data requests from the active sessions,
and the autogrow request for a data or index file could not be answered in time, which caused the problem described at the beginning.
The problem was finally resolved by rebuilding the index (which also updates the statistics; a plain statistics update would have worked too). For long-term prevention, schedule a job that updates statistics with a manually defined update threshold and sampling percentage.
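The decision logic of such a scheduled job can be sketched roughly as follows. This is a minimal illustration, not the job used in this case: the helper names, the 5% threshold, the 30% sampling percentage, and the table `dbo.Orders` are all assumptions, and the row and modification counts are assumed to be read beforehand (for example from `sys.dm_db_stats_properties`).

```python
def needs_update(rows: int, modifications: int, threshold_ratio: float = 0.05) -> bool:
    """True when the share of modified rows reaches the chosen threshold."""
    if rows <= 0:
        return False
    return modifications / rows >= threshold_ratio


def update_statistics_sql(table: str, sample_percent: int) -> str:
    """Build an UPDATE STATISTICS statement with an explicit sampling percentage.

    The table name is inserted as-is, so it must come from a trusted source.
    """
    if not 1 <= sample_percent <= 100:
        raise ValueError("sample_percent must be between 1 and 100")
    if sample_percent == 100:
        return f"UPDATE STATISTICS {table} WITH FULLSCAN;"
    return f"UPDATE STATISTICS {table} WITH SAMPLE {sample_percent} PERCENT;"


# Hypothetical figures: 50 million rows, 4 million modified since the last update.
if needs_update(rows=50_000_000, modifications=4_000_000):
    print(update_statistics_sql("dbo.Orders", 30))
```

Suitable values for the threshold and the sampling percentage depend on the table size and data distribution, so they should be tuned per table rather than copied from this sketch.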
Summary:
Many problems on a database server are chain reactions, and the symptom that is observed is often not what it appears to be on the surface (a disk request timed out, so the problem must be in storage?).
A professional role requires professional knowledge. At the start, the DBA mistakenly thought it was a storage problem and the storage engineer thought the server's full memory was abnormal; in fact, neither was the root cause.
When facing a problem, tracing it back to its most fundamental cause is the key to solving it.
Analysis and resolution of error 833 (disk request time-out) in SQL Server