[SQL SERVER] I/O requests taking longer than 15 seconds to complete on file

來源:互聯網
上載者:User

What does the “I/O request” error below represent?

 

2008-04-21 13:26:42.480 spid364     
Microsoft SQL Server 2005 - 9.00.3177.00 (Intel X86)

2008-04-22 16:30:02.140 spid6s      
SQL
Server has encountered 2 occurrence(s) of I/O requests taking longer
than 15 seconds to complete on file [F:/sql data files/xxx.MDF] in
database [xxx] (5). 

2008-04-22 16:32:08.780 spid6s      
SQL
Server has encountered 2 occurrence(s) of I/O requests taking longer
than 15 seconds to complete on file [H:/sql data files/xxx_data_4.NDF]
in database [xxx] (5). 

 

Research and Findings

---------------------------------

These
errors may occur as result of “CPU Drift” and can be ignored and
disabled, howerver, first verify both SQL DMV
sys.dm_io_pending_io_requests and Windows Performance counters don’t
indicate any IO delays.

 

On computers with multiple CPUs, the CPUs are designed to “sleep” during periods of low workload. 
When
CPUs sleep, SQL may not accurately determine CPUs overall workload and
incorrectly report this as IO WARNING shown above, however, this does
not represent an actual CPU performance problem.

 

To
confirm if the CPUs entered low-power state, SQL Server Escalation
Services created RDTSC (Read Time Stamp Counter) utility to report total
CPU sleep time. 
The report confirmed the CPUs were sleeping up to 24 seconds. 
This would be enough for SQL Server to incorrectly report this a slow IO. 
Both AMD’s and INTEL’s web sites describe the ability for CPUs to sleep.

 

RDTSCTest.exe [-md|-mt]

      -md   Detailed output (default)

      -mt   CPU speeds in table format

 

-- Current CPU Speeds --

Runtime              CPU  ExpectedMHz ActualMHz RDTSCTicks           DriftMS         

-------------------- ---- ----------- --------- -------------------- ----------------

2008-04-22 17:53:36     0        3502      3503 0x0001564772F87FA72    16123.4

2008-04-22 17:53:36     8        3502      3506 0x00015647D8B7AE21D    23922.5

2008-04-22 17:53:36    16        3502      3507 0x00015647B5FEB4A39    21260.9

 

For more information on  RDTSC can be found at

http://blogs.msdn.com/psssql/archive/2007/08/19/sql-server-2005-rdtsc-truths-and-myths-discussed.aspx

 

Starting
with SQL 2005 SP2 we’ve included two trace flags to disable the
reporting of CPU Drift errors in the SQL Server errorlog. 

 

Disable this error using Trace Flag 8033

The time stamp counter of CPU on scheduler id 1 is not synchronized with other CPUs.

 

Disable this error using Trace Flag 830

SQL Server has encountered 2 occurrence(s) of I/O requests taking longer than 15 seconds to complete

 

-------------------------------------------------------------------------------------------------------

 

 

In the majority of cases this warning indicates that an I/O operation
has taken too long. This article looks at the possible reasons for this
and describes what can be done to reduce the likelihood of it
recurring.

There is another possible reason that is not I/O related and this is briefly referred to at the end of the article.

Let's start by examining the reason behind it. I/O requests from SQL
Server are handled asynchronously by the operating system. This means
that
when a read or write request is made, the thread making it waits for
the I/O to complete.

This frees CPU for use by another thread.
Optimally an I/O should take a few ms, but can take considerably more than this.
This warning is reported when the time between requesting an I/O and its completion is greater than 15 seconds.

The possible reasons for this are: I/O subsystem problems or
misconfiguration, excessive I/O being requested by SQL Server, data
files not
optimally placed on the disk, and fragmentation.

I/O Subsystem

The first thing to do is to examine the I/O subsystem. Hardware errors
are a common cause and you should run diagnostics if you are at all
suspicious that this might be the case.

Two common reasons for poor I/O throughput on a SAN are out of date
firmware, and insufficient queue length on the HBA. Be aware that if you
upgrade SAN firmware you often need to upgrade the HBA drivers at the
same time, or the server may fail to access the SAN altogether.
Contact the SAN vendor for confirmation and details.

Another common cause of poor I/O performance is if a file system filter
driver has been installed. A filter driver intercepts requests before
they reach the file system, and performs additional processing such as
anti-virus checking and encryption. It goes without saying that this can
only have a negative impact on I/O performance.

If you have to have anti-virus software installed on your SQL Server,
ensure that mdf, ndf and ldf files are added to the exclusion list. Even
better is for realtime virus checking to be disabled completely;
schedule a regular scan during quiet times instead.

I/O Load

It is often the case that there are I/O issues at night, when batch
jobs are running, but it performs well during the day with no warnings
in the
logs.

You may find that there are several jobs running at the same time,
either through poor scheduling or because one or more jobs have overrun.
Provided there is sufficient free time you could reschedule one or more
jobs and monitor for a couple of
days to see if it has made a difference.

This is more complicated in environments where a SAN is used by several
servers, as it may be jobs running on non-SQL servers overloading the
SAN
and causing your problems. In this case you need to discuss the problem
with the system administrators
responsible for these servers and agree a schedule that is mutually
acceptable.

Fragmentation

Fragmentation may be internal (within tables/indexes) or external (file fragmentation on the disk).

Internal fragmentation is only a problem where in-order scans are being
performed, and only on large tables. The warning message that is
the subject of this article is unlikely to occur in this situation.
Best practise is to monitor logical fragmentation
and regularly rebuild or reorganize affected indexes.

Disk fragmentation can also be reduced through best practise, such as presizing data files (capacity planning), and not
using autoshrink
, or manually shrinking database files.

If you suspect disk fragmentation may be causing performance problems, I can recommend Diskeeper
.
This runs continuously in the background
using idle CPU time only, and is able to defragment SQL Server files
while they are in use. Your mileage may vary, but I have seen
staggering improvements from running this tool.

I/O Configuration

Other best practise includes splitting tempdb into several files,
putting data and log files on separate disks, placing clustered and
non-clustered indexes in separate filegroups (on different disks),
keeping statistics up-to-date, etc.
All these will improve performance and some are described in
SQL Server Configuration

.

One Last Possibility

In a minority of cases there may be not be an issue with I/O at all, but a problem caused by CPU drift. This is described in
this article
.

I stress that this is rare, and you should always examine the possible causes of slow I/O described above.

Summary

This article has described some of the most common reasons for "SQL Server has encountered n occurrence(s) of I/O requests taking
longer than 15 seconds to complete on file <filename> in database <dbname>"

.

In most cases this message is an accurate description of the problem, and reasons for slow I/O should be investigated.

相關文章

聯繫我們

該頁面正文內容均來源於網絡整理,並不代表阿里雲官方的觀點,該頁面所提到的產品和服務也與阿里云無關,如果該頁面內容對您造成了困擾,歡迎寫郵件給我們,收到郵件我們將在5個工作日內處理。

如果您發現本社區中有涉嫌抄襲的內容,歡迎發送郵件至: info-contact@alibabacloud.com 進行舉報並提供相關證據,工作人員會在 5 個工作天內聯絡您,一經查實,本站將立刻刪除涉嫌侵權內容。

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.