Relationship between disk read-write and database

Source: Internet
Author: User

I. Physical structure of a disk
(1) Platters: The body of a hard drive consists of several platters stacked together.

When a hard drive leaves the factory, the manufacturer has already performed the low-level (physical) format. This divides each blank platter into concentric tracks of different radii, and divides each track into sectors. A sector can store 128 × 2^n bytes (n = 0, 1, 2, 3); the default sector size is 512 bytes. Users normally do not need to perform low-level formatting themselves.

(2) Heads: Each side of every platter has its own head.

(3) Spindle: All platters are rotated by the spindle motor.

(4) Control circuit board: This is the complex part. It holds the ROM (which contains the drive's built-in software), the cache, and so on.

II. How a disk completes a single IO operation
(1) Seek
When the controller sends an IO command to the disk, the actuator arm moves the head away from the landing zone (an area on the inner ring that holds no data) to a position directly above the track that contains the first data block to be accessed. This process is called seeking, and the time it takes is the seek time.

(2) Rotational delay
Reaching the right track does not mean the data can be read immediately. The head must wait until the platter has rotated far enough that the sector holding the first data block is directly below it. The time spent waiting for the platter to rotate to that sector is called the rotational delay (rotational latency).

(3) Data transfer
As the platter continues to rotate, the head reads or writes the data blocks until all the data required by the IO has been processed. This process is called data transfer, and the time it takes is the transfer time. Once these three steps are finished, a single IO operation is complete.

From the steps of a single IO operation, it follows that:
Single IO time = seek time + rotational delay + transfer time

IOPS (IO operations per second) is then calculated as:
IOPS = 1000 ms / single IO time
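The two formulas above can be written as a pair of small helpers. This is a minimal sketch; the function names and the sample inputs (4 ms, 3 ms, 0.5 ms) are illustrative assumptions, not values from the article.

```python
def single_io_time_ms(seek_ms: float, rotation_ms: float, transfer_ms: float) -> float:
    """Time for one IO in milliseconds: seek + rotational delay + transfer."""
    return seek_ms + rotation_ms + transfer_ms


def iops(single_io_ms: float) -> float:
    """IO operations per second for a given per-IO time in milliseconds."""
    if single_io_ms <= 0:
        raise ValueError("single_io_ms must be positive")
    return 1000.0 / single_io_ms


if __name__ == "__main__":
    # Illustrative inputs only (assumed, not measured).
    t = single_io_time_ms(4.0, 3.0, 0.5)
    print(f"single IO time = {t} ms, IOPS = {iops(t):.1f}")
```

Keeping the three time components as separate parameters makes it easy to see which one dominates for a given disk.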

III. Disk IOPS calculation
How long do seeking, rotational delay, and data transfer take on different disks?

1. Seek time
The data being read or written may lie on any track of the disk, from the innermost ring (shortest seek time) to the outermost ring (longest seek time), so only the average seek time is used in the calculation.

This parameter is listed in the specifications when you buy a disk. For current SATA/SAS disks the average seek time varies with rotational speed, but it is usually under 10 ms:

Speed        Average seek time
15000 rpm    2~3 ms
10000 rpm    3~5 ms
7200 rpm     8~9 ms

2. Rotational delay
As with seeking, the outcome varies once the head is positioned on the track. In the best case the target sector is already under the head and the data can be read or written immediately; in the worst case the head must wait almost a full revolution. The calculation therefore uses the average rotational delay, which is half a revolution. For a 15000 rpm disk this is (60 s / 15000) × (1/2) = 2 ms.
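The same half-revolution reasoning applies to other rotational speeds. A minimal sketch, assuming the average delay is exactly half of one revolution (the function name is hypothetical):

```python
def avg_rotational_delay_ms(rpm: int) -> float:
    """Average rotational delay in ms, taken as half of one full revolution."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    ms_per_revolution = 60_000.0 / rpm  # 60 s = 60,000 ms per minute
    return ms_per_revolution / 2.0


if __name__ == "__main__":
    for rpm in (15000, 10000, 7200):
        print(f"{rpm} rpm: {avg_rotational_delay_ms(rpm):.2f} ms")
```

For 15000 rpm this reproduces the 2 ms figure above; slower spindles give proportionally longer delays.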

3. Transfer time
(1) Disk transfer rate
There are two kinds of disk transfer rate: the internal transfer rate and the external transfer rate.

The internal transfer rate is the rate of data transfer between the head and the hard disk cache. Put simply, it is the speed at which the head reads data from the platter and stores it in the cache.

The ideal internal transfer rate assumes no seeking and no rotational delay: the head stays on one track, reading data and passing it to the cache. This is obviously not achievable in practice, because a single track holds only a limited amount of data.

The actual internal transfer rate includes seek time and rotational delay. For current consumer disks, the sustained internal transfer rate is generally between 30 MB/s and 45 MB/s (server disks should be higher).

The external transfer rate is the rate of data transfer between the hard disk cache and the system bus, that is, the speed at which the computer reads data from the drive's cache through the hard disk interface into the corresponding disk controller.

Hard disk manufacturers usually also quote a maximum transfer rate in the drive specifications, such as 6 Gbit/s for SATA 3.0. A naive conversion gives 6 × 1024 / 8 = 768 MB/s; because SATA uses 8b/10b encoding, the usable figure is closer to 600 MB/s. Either way, this number is the maximum external transfer rate of the interface, and real workloads do not reach it.

The IOPS calculation here conservatively uses the actual internal transfer rate, taking 40 MB/s as an example.

(2) The size of a single IO operation
Besides the transfer rate, the size of a single IO operation (the IO chunk size) is needed to calculate the transfer time of one IO. So how large is a single IO on disk? The answer is: it depends.

To improve IO performance, the operating system uses a file system cache, which collects multiple IO requests in the cache and then submits them to the disk together. This means that several 8 KB read requests issued by the database may well be served by a single disk read IO.

Some storage systems also provide their own cache. After receiving IO requests from the operating system, they may merge several of those requests into one operation.

Whether the cache sits at the operating system level or at the disk controller level, it has one purpose: to make reads and writes more efficient. Individual IO operations therefore differ in size, depending mainly on how the system judges read/write efficiency. The example here uses the data page size of a SQL Server database: 8 KB.

(3) Transfer time
Transfer time = IO chunk size / internal transfer rate = 8 KB / (40 MB/s) ≈ 0.2 ms

From this it follows that:
(3.1) The larger the IO chunk size, the longer the transfer time and the single IO time, and the lower the IOPS;
(3.2) Most of the read/write cost of a mechanical disk is addressing time, that is, seek time + rotational delay: the swing of the disk arm plus the wait for the platter to rotate;
(3.3) For a rough IOPS estimate the transfer time can be ignored: IOPS ≈ 1000 ms / (seek time + rotational delay).
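The three observations can be checked numerically. The sketch below assumes the article's 40 MB/s internal transfer rate, a 3 ms seek time, and a 2 ms rotational delay; the function names and the 64 KB and 1024 KB chunk sizes are illustrative choices, not values from the article.

```python
def transfer_time_ms(chunk_kb: float, rate_mb_per_s: float = 40.0) -> float:
    """Transfer time in ms for one IO of chunk_kb kilobytes (1 MB = 1024 KB)."""
    return chunk_kb / (rate_mb_per_s * 1024.0) * 1000.0


def iops_for_chunk(chunk_kb: float, seek_ms: float = 3.0, rotation_ms: float = 2.0) -> float:
    """IOPS when every IO pays seek + rotational delay + transfer."""
    return 1000.0 / (seek_ms + rotation_ms + transfer_time_ms(chunk_kb))


if __name__ == "__main__":
    rough = 1000.0 / (3.0 + 2.0)  # estimate that ignores transfer time
    print(f"rough estimate: {rough:.0f} IOPS")
    for chunk_kb in (8, 64, 1024):
        print(f"{chunk_kb:>5} KB: transfer {transfer_time_ms(chunk_kb):7.3f} ms, "
              f"{iops_for_chunk(chunk_kb):6.1f} IOPS")
```

With 8 KB chunks the rough estimate is close to the full calculation; as the chunk grows, transfer time stops being negligible and the rough estimate overstates IOPS.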

4. Sample IOPS calculation
Take a 15000 rpm disk as an example:

(1) Single IO time
Single IO time = seek time + rotational delay + transfer time = 3 ms + 2 ms + 0.2 ms = 5.2 ms

(2) IOPS
IOPS = 1000 ms / single IO time = 1000 ms / 5.2 ms ≈ 192 (operations per second)
This is the random-access IOPS of a single disk.
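The same calculation can be repeated for the other disk classes in the seek-time table. This sketch assumes the upper bound of each seek-time range, a half-revolution rotational delay, and the 0.2 ms transfer time used above; the resulting numbers are estimates under those assumptions, not measurements.

```python
# rpm -> assumed average seek time in ms (upper bound of each range in the table)
ASSUMED_SEEK_MS = {15000: 3.0, 10000: 5.0, 7200: 9.0}
TRANSFER_MS = 0.2  # 8 KB at 40 MB/s, rounded as in the text


def random_iops(rpm: int, seek_ms: float, transfer_ms: float = TRANSFER_MS) -> float:
    """Estimated random-access IOPS of a single disk."""
    rotation_ms = 60_000.0 / rpm / 2.0  # average delay = half a revolution
    return 1000.0 / (seek_ms + rotation_ms + transfer_ms)


if __name__ == "__main__":
    for rpm, seek_ms in ASSUMED_SEEK_MS.items():
        print(f"{rpm} rpm: about {random_iops(rpm, seek_ms):.0f} IOPS")
```

The 15000 rpm row reproduces the figure of about 192 from the worked example.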

Consider the extreme case in which all disk access is sequential, so that seek time and rotational delay can be ignored. The IOPS formula then becomes:
IOPS = 1000 ms / transfer time = 1000 ms / 0.2 ms = 5000 (operations per second)

This extreme case is clearly too idealized. Because each track holds only a limited amount of data, seek time and rotational delay can be reduced but never avoided completely.

IV. Disk reads and writes in databases
1. Random access and sequential access
(1) Random access
The sector address given by the current IO is far from the sector address given by the previous IO, so the head must move a large distance between the two operations before it can resume reading or writing.

(2) Sequential access
Conversely, if the sector address given by the next IO is the same as, or close to, the sector where the previous IO ended, the head can start the IO almost immediately. A series of such IO operations is called sequential access.

(3) SQL Server as an example
Data files: SQL Server allocates space to objects in uniform extents (8 × 8 KB pages). Within that space data placement is effectively random: a row is written to whichever data page has free space. Unless you use filegroups to give each table its own sufficiently large, pre-allocated file, data continuity is not guaranteed, and access is usually random.
In addition, even a table with a clustered index is only logically contiguous, not physically contiguous.

Log files: because of the VLF (virtual log file) structure, log reads and writes are sequential in theory. However, if the log file is set to autogrow and the growth increment is small, there will be many small VLFs, and access is then not strictly sequential.

2. Sequential IO and concurrent IO
(1) Sequential IO mode (queue mode)
The disk controller may issue a series of IO commands to a disk group at once. If the disk group can execute only one IO command at a time, this is called sequential IO.

(2) Concurrent IO mode (burst mode)
When a disk group can execute several IO commands at the same time, this is called concurrent IO. Concurrent IO is possible only on a disk group made up of multiple disks; a single disk can process only one IO command at a time.

(3) SQL Server as an example
Sometimes the database shows IO waits even though Disk Transfers/sec is not especially high. Why? Usually because too many IO requests have piled up in the disk request queue.

The request queue and how busy the disk is can be checked with the following performance counters:
LogicalDisk\Avg. Disk Queue Length
LogicalDisk\Current Disk Queue Length
LogicalDisk\% Disk Time

In this situation, you can do the following:
(1) Simplify the business logic to reduce the number of IO requests;
(2) Migrate multiple user databases on the same instance to different instances;
(3) Place the log and data files of the same database on different storage units;
(4) Use an HA strategy to separate read operations from write operations.

3. IOPS and throughput
(1) IOPS
IOPS is the number of read/write (IO) operations per second. As noted in the transfer time calculation, a larger IO chunk size means lower IOPS. If data is read and written in 100 MB chunks, the IOPS will be very low.

(2) Throughput
Throughput is the number of bytes that can be read or written per second. With the same 100 MB chunks, IOPS is low, but N × 100 MB of data is still read or written every second, so throughput is not low.
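The trade-off between the two metrics can be sketched with the figures used earlier in the article: 3 ms seek time, 2 ms rotational delay, and a 40 MB/s transfer rate. The function name is hypothetical, and the model assumes every IO pays the full seek and rotational delay.

```python
def io_profile(io_size_kb: float, seek_ms: float = 3.0, rotation_ms: float = 2.0,
               rate_mb_per_s: float = 40.0) -> tuple[float, float]:
    """Return (IOPS, throughput in MB/s) for IOs of the given size."""
    transfer_ms = io_size_kb / (rate_mb_per_s * 1024.0) * 1000.0
    iops = 1000.0 / (seek_ms + rotation_ms + transfer_ms)
    throughput_mb_per_s = iops * io_size_kb / 1024.0  # bytes moved per second
    return iops, throughput_mb_per_s


if __name__ == "__main__":
    for label, size_kb in (("8 KB", 8), ("100 MB", 100 * 1024)):
        iops, mbps = io_profile(size_kb)
        print(f"{label:>6} IOs: {iops:8.2f} IOPS, {mbps:6.2f} MB/s")
```

Under these assumptions, 8 KB IOs give high IOPS but only about 1.5 MB/s, while 100 MB IOs give less than one IO per second yet approach the full 40 MB/s transfer rate.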

(3) SQL Server as an example
OLTP systems mostly read and write small chunks of data with largely random access, so IOPS is used to measure read/write performance;
Data warehouses and log files mostly read and write large chunks of data with largely sequential access, so throughput is used to measure read/write performance.

The current IOPS of the disk can be checked with the following performance counters:
LogicalDisk\Disk Transfers/sec
LogicalDisk\Disk Reads/sec
LogicalDisk\Disk Writes/sec

The current throughput of the disk can be checked with the following performance counters:
LogicalDisk\Disk Bytes/sec
LogicalDisk\Disk Read Bytes/sec
LogicalDisk\Disk Write Bytes/sec
