How SQL Server uses hard disk principles to reduce I/O

Source: Internet
Author: User

Introduction to hard disk principles

A typical hard disk is shown in Figure 1.

Figure 1. A typical Hard Disk

As Figure 1 shows, the platters of a hard disk spin at high speed while the head arm moves back and forth across them to read and write data. This is why hard disks are described as mechanical components. Figure 2 shows more abstractly how the disk reads data. Each platter is divided into multiple concentric tracks around its center, and the arm swings back and forth between these tracks.

Figure 2. More abstract hard disk principles

Each track is in turn divided into multiple sectors, as shown in Figure 3.

Figure 3. Track, sector, and cluster

We can see that the sector is the smallest unit the hard disk can address, but the smallest unit in which the operating system allocates space is the cluster. This is why a file's actual size and the space it occupies on the hard disk are different.
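The difference between a file's size and its occupied space can be sketched as follows. This is only an illustration: the 4096-byte cluster size is an assumed value, the function name is hypothetical, and real file systems may handle very small files differently.

```python
def size_on_disk(file_size_bytes: int, cluster_size_bytes: int = 4096) -> int:
    """Space a file occupies when storage is handed out in whole clusters.

    cluster_size_bytes is an assumed value; actual cluster sizes depend on
    how the volume was formatted.
    """
    # Round the file size up to the next whole number of clusters.
    clusters = (file_size_bytes + cluster_size_bytes - 1) // cluster_size_bytes
    return clusters * cluster_size_bytes


print(size_on_disk(1))     # 4096: a 1-byte file still takes a full cluster
print(size_on_disk(5000))  # 8192: 5000 bytes need two clusters
```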

Time required for disk reads and writes

With the basic principles of the hard disk in mind, it is not difficult to see that the time spent reading or writing data on the disk can be divided into three parts.

1. Seek time

Seek time is the time required for the head arm to move to the specified track. It consists of two parts:

Seek time = arm startup time + constant * number of tracks to be moved

Both the constant and the arm startup time are determined by the drive hardware.

2. Rotation delay

Rotation delay is the time it takes for the target sector to rotate under the head. It depends on the rotational speed of the drive, which is what the figure in "7200 RPM hard disk" refers to.

Average rotation delay = 1/(2 * revolutions per second)

For example, a 7200 RPM hard drive makes 120 revolutions per second, so its average rotation delay is 1/(2 * 120) s ≈ 4.17 ms.

The rotation delay is only related to hardware.
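The rotation delay formula can be checked with a few lines of Python. This is a small sketch; the function name and the list of sample speeds are chosen here for illustration.

```python
def average_rotation_delay_ms(rpm: float) -> float:
    """Average rotation delay = 1 / (2 * revolutions per second), in ms."""
    revolutions_per_second = rpm / 60.0
    # On average the target sector is half a revolution away from the head.
    return 1000.0 / (2.0 * revolutions_per_second)


for rpm in (5400, 7200, 15000):
    print(rpm, "RPM:", round(average_rotation_delay_ms(rpm), 2), "ms")
# 5400 RPM: 5.56 ms
# 7200 RPM: 4.17 ms
# 15000 RPM: 2.0 ms
```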

3. Transfer time

Transfer time is the time taken to actually read data from or write data to the disk.

Transfer time = number of bytes to be read or written / (revolutions per second * number of bytes per track)
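Putting the three parts together, a rough model of one disk access might look like the sketch below. All default numbers (arm startup time, per-track constant, bytes per track) are hypothetical values picked for illustration and do not describe any real drive. The transfer term assumes that one full track passes under the head per revolution.

```python
def access_time_ms(tracks_to_move, bytes_to_transfer, rpm=7200.0,
                   bytes_per_track=256_000, arm_startup_ms=1.0,
                   ms_per_track=0.01):
    """Return (seek, rotation, transfer) times in milliseconds.

    Every default value here is a made-up illustration, not drive data.
    """
    revolutions_per_second = rpm / 60.0

    # Seek time = arm startup time + constant * number of tracks to move.
    if tracks_to_move > 0:
        seek = arm_startup_ms + ms_per_track * tracks_to_move
    else:
        seek = 0.0  # head is already on the right track

    # Average rotation delay = 1 / (2 * revolutions per second).
    rotation = 1000.0 / (2.0 * revolutions_per_second)

    # Transfer time = bytes / (revolutions per second * bytes per track).
    transfer = 1000.0 * bytes_to_transfer / (revolutions_per_second * bytes_per_track)

    return seek, rotation, transfer


seek, rotation, transfer = access_time_ms(tracks_to_move=100, bytes_to_transfer=8192)
print(f"seek={seek:.2f} ms rotation={rotation:.2f} ms transfer={transfer:.2f} ms")
```

With these assumed numbers, the transfer of an 8 KB block takes far less time than the seek and the rotation delay, which is the pattern the rest of the article builds on.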

Disk Scheduling Algorithm

It is not difficult to see that most of these parameters are determined by the hardware and cannot be optimized by the operating system. Only the number of tracks the head has to move can be controlled by the operating system. Reducing that number is therefore the only way for the operating system to reduce the overall read/write time of the hard disk.

Many processes in the operating system may need to read from and write to the disk at the same time. The purpose of a disk scheduling algorithm is to order the head movements sensibly so that seek time is reduced. Several common disk scheduling algorithms are described below.

1. First-come, first-served algorithm (FCFS)

This algorithm queues disk I/O requests and moves the head to serve them in arrival order. It is simple and fair, but it does nothing to reduce seek time.
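A minimal sketch of FCFS scheduling, counting how many tracks the head crosses. The request queue and the starting track are sample values chosen for illustration, and the function name is hypothetical.

```python
def fcfs_head_movement(start_track, requests):
    """Total number of tracks crossed when requests are served in arrival order."""
    total = 0
    position = start_track
    for track in requests:
        total += abs(track - position)  # distance of this single seek
        position = track
    return total


queue = [98, 183, 37, 122, 14, 124, 65, 67]  # sample track numbers
print(fcfs_head_movement(53, queue))  # 640
```

The head swings back and forth across the disk because the arrival order has nothing to do with track positions.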

2. Shortest seek time first algorithm (SSTF)

This algorithm always serves next the request whose track is closest to the current head position. This keeps the average seek time short, but the disadvantage is obvious: requests that are far from the current head position may be postponed indefinitely, which is known as "starvation".
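The greedy choice described above can be sketched like this. The queue and starting track are sample values for illustration, and the function names are hypothetical.

```python
def sstf_order(start_track, requests):
    """Order in which SSTF serves the requests: nearest pending track first."""
    pending = list(requests)
    position = start_track
    order = []
    while pending:
        # Pick the pending request closest to the current head position.
        nearest = min(pending, key=lambda track: abs(track - position))
        pending.remove(nearest)
        order.append(nearest)
        position = nearest
    return order


def head_movement(start_track, order):
    """Total number of tracks crossed when serving `order` from start_track."""
    total, position = 0, start_track
    for track in order:
        total += abs(track - position)
        position = track
    return total


queue = [98, 183, 37, 122, 14, 124, 65, 67]  # sample track numbers
order = sstf_order(53, queue)
print(order)                     # [65, 67, 37, 14, 98, 122, 124, 183]
print(head_movement(53, order))  # 236
```

In this sample the request for track 183 is served last. If new requests near the head kept arriving, it could be delayed again and again.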

3. SCAN algorithm

This algorithm serves next the request closest to the head's current track in the direction in which the head is already moving, and changes direction only when no requests remain ahead. This improvement avoids starvation while still reducing seek time. A disadvantage remains, however: requests for tracks near the two ends of the disk can wait noticeably longer than requests for tracks in the middle.
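A minimal sketch of this sweep is shown below. It assumes the head reverses as soon as no pending request remains in its direction of travel (a variant often called LOOK) rather than running on to the physical edge of the disk. The queue values and function name are illustrative.

```python
def scan_order(start_track, requests, moving_up=True):
    """Serve requests in the current direction first, then in the other one."""
    if moving_up:
        # Tracks at or above the head, nearest first.
        first = sorted(t for t in requests if t >= start_track)
        # Then the remaining tracks on the way back down.
        second = sorted((t for t in requests if t < start_track), reverse=True)
    else:
        first = sorted((t for t in requests if t <= start_track), reverse=True)
        second = sorted(t for t in requests if t > start_track)
    return first + second


queue = [98, 183, 37, 122, 14, 124, 65, 67]  # sample track numbers
print(scan_order(53, queue, moving_up=True))
# [65, 67, 98, 122, 124, 183, 37, 14]
print(scan_order(53, queue, moving_up=False))
# [37, 14, 65, 67, 98, 122, 124, 183]
```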

4. Circular SCAN algorithm (C-SCAN)

This algorithm is a refinement of SCAN. SCAN is also known as the elevator algorithm, because the head behaves like an elevator that travels from the first floor to the 15th floor and then from the 15th floor back down to the first. In C-SCAN the head serves requests in one direction only, for example from the innermost track to the outermost track. When it reaches the end, it returns directly to the innermost track without serving requests on the way back, and then begins the next sweep. Within a sweep, the pending request closest to the current head position in the direction of travel is served next, in the same way as in the shortest seek time first algorithm. This algorithm improves on SCAN by removing the unfair treatment of requests at the two ends of the disk.
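A one-directional sweep of this kind can be sketched as follows. The sketch assumes that requests are served only while the head moves toward higher track numbers, and that the head then jumps back to the lowest pending track without serving anything on the way. Queue values and the function name are illustrative.

```python
def cscan_order(start_track, requests):
    """Serve requests in ascending track order, wrapping around once."""
    # Requests at or ahead of the head are served on the current sweep.
    ahead = sorted(t for t in requests if t >= start_track)
    # Requests behind the head wait for the next sweep, again in ascending order.
    wrapped = sorted(t for t in requests if t < start_track)
    return ahead + wrapped


queue = [98, 183, 37, 122, 14, 124, 65, 67]  # sample track numbers
print(cscan_order(53, queue))
# [65, 67, 98, 122, 124, 183, 14, 37]
```

Unlike a two-directional sweep, the tracks behind the head are served from the far end first (14 before 37), so every track waits at most about one full sweep.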

Other optimization methods and how SQL Server uses these methods

Besides the disk scheduling algorithms above, other methods can also be used to reduce seek time. Before describing them, I would like to introduce the principle of locality.

Principle of locality

The principle of locality has a spatial form and a temporal form. Because programs largely execute sequentially, data near the data currently being accessed is likely to be accessed soon. This is spatial locality. Programs also contain loops, so data that is being accessed now is likely to be accessed again within a short time. This is temporal locality.

With the principle of locality in mind, we can reduce disk I/O by using the following methods.

Read-ahead

Read-ahead is also called prefetching. From the disk principles above, it is not hard to see that when data is read from disk, the actual data transfer takes only a small part of the time, and most of the time is spent on rotation delay and seek time. Therefore, based on spatial locality, SQL Server reads not only the data that was requested but also the data near it. SQL Server calls this read-ahead, and it effectively reduces the number of I/O requests.
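The effect of read-ahead on the number of physical reads can be shown with a toy cache. This is only a sketch of the general idea, not SQL Server's actual read-ahead mechanism. The class name is hypothetical, and the window of 8 pages is an assumption chosen to match the size of a SQL Server extent (eight 8 KB pages).

```python
class ReadAheadCache:
    """Toy page cache that fetches a run of neighbouring pages per disk read."""

    def __init__(self, read_ahead_pages=8):
        self.read_ahead_pages = read_ahead_pages  # assumed window size
        self.cached = set()
        self.physical_reads = 0

    def read_page(self, page_no):
        if page_no in self.cached:
            return  # served from memory, no disk I/O
        # One physical read brings in the requested page and its neighbours.
        self.physical_reads += 1
        self.cached.update(range(page_no, page_no + self.read_ahead_pages))


def scan(cache, page_count):
    """Read pages 0 .. page_count-1 in order and return the physical read count."""
    for page_no in range(page_count):
        cache.read_page(page_no)
    return cache.physical_reads


print(scan(ReadAheadCache(read_ahead_pages=1), 64))  # 64 physical reads
print(scan(ReadAheadCache(read_ahead_pages=8), 64))  # 8 physical reads
```

Each physical read still pays for seek and rotation delay, so issuing 8 larger reads instead of 64 small ones removes most of that overhead in a sequential scan.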

Delayed write

Similarly, according to temporal locality, recently accessed data is likely to be accessed again. Therefore, after data is changed, it is not written back to disk immediately but kept in memory, where the next read or modification request can use it. This is another effective way to reduce disk I/O. In SQL Server, delayed writing is implemented through the buffer pool. When a modification is committed, the modified data page is not written back to disk immediately. Instead, the page is marked as "dirty" and is written to disk later by a checkpoint or by the lazy writer. For more on the principles of checkpoint and lazy writer, see my previous article: About transaction logs in SQL Server (2): the role of transaction logs in data modification.
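The dirty-page idea can be illustrated with a toy buffer pool. This is a simplified sketch and not SQL Server's implementation: the class and method names are hypothetical, and logging, page eviction, and the lazy writer are left out.

```python
class ToyBufferPool:
    """Keeps modified pages in memory and writes them out at checkpoint time."""

    def __init__(self):
        self.pages = {}       # page number -> current in-memory contents
        self.dirty = set()    # pages changed since the last checkpoint
        self.disk_writes = 0  # number of page writes that reached the disk

    def modify(self, page_no, contents):
        # The change stays in memory; the page is only marked dirty.
        self.pages[page_no] = contents
        self.dirty.add(page_no)

    def checkpoint(self):
        # Each dirty page is written once, however often it was modified.
        flushed = len(self.dirty)
        self.disk_writes += flushed
        self.dirty.clear()
        return flushed


pool = ToyBufferPool()
for i in range(100):
    pool.modify(page_no=i % 3, contents=f"version {i}")  # 100 changes, 3 pages

print(pool.checkpoint())  # 3 page writes instead of 100
```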

Optimize physical distribution

From the disk principles above, it is not difficult to see that if the requested data lies on physically adjacent tracks, the head has less distance to travel, which reduces seek time. Placing related data in contiguous physical space therefore reduces seek time. SQL Server does this with clustered indexes, which store a table's rows in the order of the clustered index key (often the primary key), so that rows with adjacent keys tend to sit close together on disk.
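The benefit of contiguous placement can be seen by comparing head movement for two layouts of the same eight reads. The track numbers are invented for illustration, and the function name is hypothetical.

```python
def head_movement(start_track, tracks):
    """Total number of tracks crossed when visiting `tracks` in the given order."""
    total, position = 0, start_track
    for track in tracks:
        total += abs(track - position)
        position = track
    return total


# Eight related pieces of data stored on neighbouring tracks ...
contiguous = list(range(100, 108))
# ... versus the same eight pieces scattered across the disk.
scattered = [100, 900, 150, 700, 20, 500, 300, 850]

print(head_movement(100, contiguous))  # 7
print(head_movement(100, scattered))   # 4010
```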

Summary

This article describes the principles of the hard disk, the components of the time it takes to read and write data, and ways to reduce that time, and briefly describes how SQL Server uses these principles to reduce I/O. Understanding how a disk works is one of the foundations of performance tuning.

Link: http://www.cnblogs.com/CareySon/archive/2012/08/20/2647017.html
