Disk I/O performance

Source: Internet
Author: User
If your SQL Server installation holds only a few gigabytes (GB) of data and does not sustain heavy read or write activity, you do not need to pay much attention to disk I/O or to balancing SQL Server I/O activity across hard disks for optimal performance. However, to build a large SQL Server database that holds hundreds of gigabytes of data or sustains heavy read or write activity, you must balance the load across multiple hard disks to get the best SQL Server disk I/O performance.

  

Relationship between nominal disk transfer rate and SQL Server

One of the most important aspects of database performance tuning is I/O performance tuning, and SQL Server is no exception. Unless SQL Server runs on a machine with enough memory to hold the entire database, I/O performance is determined by how fast the disk I/O subsystem can process SQL Server reads and writes.

  

Keep in mind the following rule of thumb: a typical Wide Ultra SCSI-3 hard drive can provide Windows and SQL Server with about 75 nonsequential (random) and 150 sequential I/O operations per second. The nominal transfer rate of such a drive is about 40 MB per second. Remember that a database server is far more likely to be limited by the 75/150 I/O operations per second than by the 40 MB per second. The calculation is as follows:

  

(75 random I/O operations per second) x (8 KB transfer) = 600 KB per second

  

The calculation above indicates that if SQL Server performs strictly random reads or writes (single-page reads and writes) on a given hard disk, the disk can be expected to process at most 600 KB (0.6 MB) per second. This is far below the drive's nominal 40 MB per second. SQL Server worker threads, checkpoint, and lazywriter perform I/O in 8 KB transfers.

  

(150 sequential I/O operations per second) x (8 KB transfer) = 1,200 KB per second

  

The calculation above indicates that if SQL Server performs strictly sequential reads or writes (single-page reads or writes) on a given hard disk, the disk can be expected to process at most 1,200 KB (1.2 MB) per second.

  

(75 random I/O operations per second) x (64 KB transfer) = 4,800 KB per second (4.8 MB)

  

The calculation above illustrates a worst-case read-ahead scenario (assuming that all operations are random I/O). Note that even when the I/O is completely random, a 64 KB transfer size still gives a much better disk transfer rate (4.8 MB per second) than the single-page transfer rates (0.6 and 1.2 MB per second).

  

(150 sequential I/O operations per second) x (64 KB transfer) = 9,600 KB per second (9.6 MB)

  

The calculation above indicates that if SQL Server performs strictly sequential reads or writes on a given hard disk, the disk can be expected to process at most 9.6 MB per second. This is much better than random I/O. The SQL Server read-ahead manager performs disk I/O in 64 KB transfers and tries to arrange its reads so that read-ahead scans proceed sequentially (often described as reading "in disk order"). Although the read-ahead manager aims to perform I/O sequentially, page splitting tends to cause extents to be read nonsequentially, which is one reason to eliminate and prevent page splitting.
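The four calculations above all follow the same pattern: I/O operations per second multiplied by transfer size. A minimal Python sketch that reproduces them from the rule-of-thumb figures quoted in this section is shown here. The function and constant names are illustrative assumptions, not anything defined by SQL Server, and 1 MB is treated as 1,000 KB to match the arithmetic in the text.

```python
# Rule-of-thumb figures from the text for one Wide Ultra SCSI-3 drive.
RANDOM_IOPS = 75        # nonsequential (random) I/O operations per second
SEQUENTIAL_IOPS = 150   # sequential I/O operations per second


def throughput_kb_per_s(iops: int, transfer_kb: int) -> int:
    """Estimated throughput in KB per second: IOPS x transfer size."""
    return iops * transfer_kb


if __name__ == "__main__":
    for label, iops in (("random", RANDOM_IOPS), ("sequential", SEQUENTIAL_IOPS)):
        for size_kb in (8, 64):
            kb = throughput_kb_per_s(iops, size_kb)
            # The article treats 1 MB as 1,000 KB, so divide by 1000 here.
            print(f"{label:>10} {size_kb:>2} KB: {kb:>5} KB/s ({kb / 1000:.1f} MB/s)")
```

Running the script prints one line per combination of access pattern and transfer size, which makes it easy to see that the transfer size matters more than the access pattern in these estimates.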

  

The log manager writes sequentially to the log files in sizes of up to 32 KB.

  

Sequential and nonsequential disk I/O operations

The terms sequential (continuous) and nonsequential (discontinuous, or random) describe hard disk operations, so it is worth taking a moment to explain what they mean for a hard disk. A hard disk consists of a set of platters. Each platter is served by a set of read/write heads, which move across the platter to read information from it or write data to it. For SQL Server, remember two important points about hard disks:

  

The read/write heads and the associated disk arms must move to locate and operate on the platter locations that SQL Server and Windows request. If the data is scattered around the disk nonsequentially, the drive spends much more time moving the arm and read/write head to all of the required locations. Conversely, if all of the required data sits in sequential physical sectors, the arm and head need to move very little to complete the I/O. The time difference between the two cases is large: roughly 50 milliseconds per nonsequential seek versus roughly 2 to 3 milliseconds per sequential seek. These values are rough estimates; the actual values depend on how the nonsequential data is distributed on the disk, the rotation speed (RPM), and other physical properties of the drive. The main thing to remember is that sequential I/O is good for SQL Server performance.

A typical hard drive supports about 75 nonsequential and 150 sequential I/O operations per second. The important thing to remember is that reading or writing 8 KB takes almost the same time as reading or writing 64 KB. Within the 8 KB to 64 KB range, moving the disk arm and read/write head accounts for most of the time spent on a single disk I/O transfer. So, mathematically speaking, when more than 64 KB of SQL Server data must be transferred, it pays to use 64 KB disk transfers as much as possible: a 64 KB transfer is essentially as fast as an 8 KB transfer, yet it moves eight times as much SQL Server data. Remember that the read-ahead manager performs its disk operations in 64 KB units (known as a SQL Server extent). The log manager also performs sequential writes in larger I/O sizes. The main thing to remember is that making good use of the read-ahead manager and separating SQL Server log files from other, nonsequentially accessed files improves SQL Server performance.
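To see why an 8 KB and a 64 KB transfer cost almost the same, consider a deliberately simple model in which one I/O costs a fixed positioning time plus the time needed to move the data at the drive's nominal rate. The sketch below uses the rough 50 millisecond nonsequential figure and the 40 MB per second nominal rate from this article. The model itself, the function name, and the choice of 1 MB = 1,000 KB are assumptions made for illustration; a real drive also has rotational latency, caching, and other effects that this ignores.

```python
def service_time_ms(positioning_ms: float, transfer_kb: float,
                    rate_mb_per_s: float = 40.0) -> float:
    """Positioning time plus data transfer time for one I/O, in milliseconds.

    Assumes 1 MB = 1000 KB, matching the arithmetic used in this article.
    """
    transfer_seconds = transfer_kb / (rate_mb_per_s * 1000.0)
    return positioning_ms + transfer_seconds * 1000.0


if __name__ == "__main__":
    small = service_time_ms(50.0, 8)    # one nonsequential 8 KB I/O
    large = service_time_ms(50.0, 64)   # one nonsequential 64 KB I/O
    print(f"8 KB:  {small:.1f} ms")
    print(f"64 KB: {large:.1f} ms")
    print(f"64 KB takes {large / small:.2f}x the time for 8x the data")
```

Under these assumptions the 64 KB transfer takes only a few percent longer than the 8 KB transfer. With a 2 to 3 millisecond sequential positioning time the transfer itself is a larger share of the total, but the 64 KB transfer still moves eight times the data in less than twice the time.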

For more information about physical hard disks, see the Compaq white paper "Disk Subsystem Performance and Scalability"; its location is given under "Find other materials" at the end of this article.

  

Disk I/O transfer rate/RAID Controller transfer rate/PCI bus bandwidth

The maximum transfer rate of a typical hard disk is 40 MB per second, or 75 nonsequential/150 sequential disk I/O operations per second. The nominal transfer rate of a typical RAID controller is about 40 MB per second, or (very approximately) 2,000 disk I/O operations per second. The nominal transfer rate of the Peripheral Component Interconnect (PCI) bus is about 133 MB per second or higher. Actual device transfer rates differ from these nominal rates, but that does not matter for this discussion. What matters is understanding how to use these rates to roughly estimate how many hard disks to attach to each RAID controller, and how many drives and RAID controllers to attach to one PCI bus, without creating an I/O bottleneck.

  

The previous section, "Relationship between nominal disk transfer rate and SQL Server", calculated that a hard disk can read or write at most 9.6 MB of SQL Server data per second. Assuming that a RAID controller can handle 40 MB per second, the number of hard disks to attach to one RAID controller is roughly 40 divided by 9.6, or about 4. This means that when SQL Server performs only sequential 64 KB I/O, at most four drives should be attached to one controller. Similarly, when all I/O is nonsequential 64 KB transfers, the maximum data rate from a hard disk to the controller is 4.8 MB per second. 40 MB per second divided by 4.8 MB per second is about 8, so in the nonsequential 64 KB scenario at most 8 hard disks should be attached to a single controller. The random 8 KB scenario calls for the most drives: 40 divided by 0.6 is about 66, which means that it takes 66 drives to saturate a RAID controller performing 100% random 8 KB reads and writes. This is not a realistic scenario, because read-ahead and logging use transfer sizes larger than 8 KB, and SQL Server is very unlikely to perform 100% random I/O.
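The bandwidth-based estimate above is a single division per scenario, rounded down to a whole number of drives. A small Python sketch of that arithmetic follows; the names are hypothetical and the figures are the nominal ones from this article, so treat the results as rough planning numbers rather than measurements.

```python
import math

CONTROLLER_MB_PER_S = 40.0  # nominal RAID controller rate from the text

# Per-drive throughput (MB per second) from the earlier calculations.
PER_DRIVE_MB_PER_S = {
    "sequential 64 KB": 9.6,
    "random 64 KB": 4.8,
    "random 8 KB": 0.6,
}


def drives_per_controller(controller_mb_per_s: float, drive_mb_per_s: float) -> int:
    """Whole number of drives whose combined rate fits within the controller rate."""
    return math.floor(controller_mb_per_s / drive_mb_per_s)


if __name__ == "__main__":
    for scenario, rate in PER_DRIVE_MB_PER_S.items():
        count = drives_per_controller(CONTROLLER_MB_PER_S, rate)
        print(f"{scenario}: about {count} drives per controller")
```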

  

Another way to estimate the number of drives to attach to a RAID controller is to work from I/O operations per second rather than bytes per second. If a hard disk can perform 75 random I/O operations per second, then in theory 26 attached drives can generate about 2,000 nonsequential I/O operations per second, enough to reach the maximum I/O handling capacity of a single RAID controller. On the other hand, because a single hard disk can sustain 150 sequential I/O operations per second, only 13 hard disks are needed to generate about 2,000 sequential I/O operations per second and keep the RAID controller running at maximum throughput.
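The same estimate expressed in I/O operations per second can be checked with integer division. This sketch assumes the nominal 2,000 operations per second for the controller and the 75/150 figures for a drive; the function name is illustrative only.

```python
CONTROLLER_IOPS = 2000  # nominal RAID controller capacity from the text


def drives_by_iops(controller_iops: int, drive_iops: int) -> int:
    """Whole number of drives that stay within the controller's IOPS capacity."""
    return controller_iops // drive_iops


if __name__ == "__main__":
    for label, drive_iops in (("random", 75), ("sequential", 150)):
        count = drives_by_iops(CONTROLLER_IOPS, drive_iops)
        generated = count * drive_iops
        print(f"{label}: {count} drives generate {generated} I/O operations per second")
```

Both cases come out at 1,950 operations per second, just under the controller's nominal 2,000, which is why the text speaks of roughly 2,000.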

  

Now consider the PCI bus. Note that RAID controller and PCI bus bottlenecks are less common than I/O bottlenecks caused by the hard disks themselves. For the sake of illustration, however, assume that the set of hard disks attached to a RAID controller is busy enough to push 40 MB per second through the controller. The next question is: how many RAID controllers can safely be attached to one PCI bus without creating a PCI bus I/O bottleneck? For a rough estimate, divide the I/O capacity of the PCI bus by the I/O capacity of the RAID controller: 133 MB per second divided by 40 MB per second is about 3, which indicates that about three RAID controllers can be attached to one PCI bus. Note that most large servers have more than one PCI bus, so more RAID controllers can be installed in a single server.
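The PCI bus estimate is the same kind of division one level up. The sketch below computes how many fully busy controllers fit on one bus and how much nominal bandwidth is left over. The names are hypothetical, and the 133 and 40 MB per second figures are the nominal rates quoted in this article.

```python
import math

PCI_BUS_MB_PER_S = 133.0     # nominal PCI bus rate from the text
CONTROLLER_MB_PER_S = 40.0   # nominal RAID controller rate from the text


def controllers_per_bus(bus_mb_per_s: float, controller_mb_per_s: float) -> int:
    """Whole number of fully busy controllers that fit within the bus rate."""
    return math.floor(bus_mb_per_s / controller_mb_per_s)


if __name__ == "__main__":
    count = controllers_per_bus(PCI_BUS_MB_PER_S, CONTROLLER_MB_PER_S)
    headroom = PCI_BUS_MB_PER_S - count * CONTROLLER_MB_PER_S
    print(f"{count} controllers per PCI bus, {headroom:.0f} MB/s of nominal headroom")
```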

  

These calculations help illustrate the relationship between the transfer rates of the components that make up the disk I/O subsystem (hard disks, RAID controllers, and the PCI bus), but they should not be taken literally. They assume that all data access is either sequential or nonsequential, which almost never happens in a production database server environment. In practice there is usually a mix of sequential and nonsequential I/O, and of 8 KB and 64 KB I/O. Other factors also make it hard to estimate exactly how many I/O operations a set of hard disks can handle at one time. An on-board read/write cache on the RAID controller increases the number of I/O operations that a set of drives can effectively deliver. How much it helps is hard to estimate, for the same reason that it is hard to estimate exactly how many 8 KB and 64 KB I/O operations a given SQL Server environment needs.
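If you do have a guess at the workload mix, one rough way to combine the per-drive figures is to weight each kind of I/O by its share of operations and divide the average data per operation by the average time per operation. The sketch below does that. The mixing formula, the function name, and the 50/50 example mix are all assumptions made for illustration, and the model ignores controller cache and every other factor mentioned above.

```python
from typing import Iterable, Tuple

# (share of operations, I/O operations per second, transfer size in KB)
Workload = Tuple[float, float, float]


def mixed_throughput_kb_per_s(mix: Iterable[Workload]) -> float:
    """Per-drive throughput for a mix of I/O types, in KB per second.

    Each operation of a given type is assumed to take 1 / IOPS seconds.
    """
    entries = list(mix)
    total_share = sum(share for share, _, _ in entries)
    if total_share <= 0:
        raise ValueError("the shares must add up to a positive number")
    kb_per_op = sum(share * size for share, _, size in entries) / total_share
    seconds_per_op = sum(share / iops for share, iops, _ in entries) / total_share
    return kb_per_op / seconds_per_op


if __name__ == "__main__":
    # Hypothetical mix: half random 8 KB I/O, half sequential 64 KB I/O.
    example = [(0.5, 75, 8), (0.5, 150, 64)]
    print(f"{mixed_throughput_kb_per_s(example):.0f} KB/s per drive")
```

For the hypothetical 50/50 mix this gives about 3,600 KB per second per drive, between the 600 KB and 9,600 KB per second extremes calculated earlier.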

  

Even so, we hope that this section helps you understand what the nominal transfer rates really mean for SQL Server.

  

RAID

When scaling databases to multiple gigabytes of data, it is important to have a basic understanding of RAID (redundant array of inexpensive disks) technology and how it relates to database performance.

  

RAID has the following advantages:

  

Performance: A hardware RAID controller divides all data reads and writes from Windows and applications (such as SQL Server) into slices (typically 16-128 KB) and distributes those slices across all disks in the RAID array. Splitting data across physical drives in this way spreads the read/write I/O workload evenly over all physical hard disks in the array. This improves disk I/O performance because the disks in the array stay equally busy, rather than some disks becoming bottlenecks because of an uneven distribution of I/O requests.

Fault tolerance: RAID protects against hard disk failure, and the data loss it would cause, in two ways: mirroring and parity information.

Mirroring is implemented by writing information to two sets of drives, one set on each side of the mirror. If one drive fails while mirroring is in use, you can replace the faulty drive and

 
