Performance Analysis of 4 KB Large-Sector Hard Disks


Some time ago, I published an article titled "The Appearance and Analysis of Large-Sector Hard Disks" to explain the emergence of large-sector hard disks, and some readers have since written in with questions. Here I reprint an article by a foreign author that explains 4 KB large-sector hard disks once more. I hope it gives you a better understanding of these drives and of their impact on real systems.

 

Linux on 4 KB-sector disks: practical advice

Make sure that Linux fires on all cylinders

Original address: http://www.ibm.com/developerworks/cn/linux/l-4kb-sector-disks/index.html? CA = Drs-

 

Roderick W. Smith, consultant and writer

Roderick W. Smith is a consultant and writer who has written more than 10 books about UNIX and Linux, including The Definitive Guide to Samba 3, Linux in a Windows World, and Linux Professional Institute Certification Study Guide. He is also the creator of the GPT fdisk partitioning software. He currently lives in Woonsocket, Rhode Island.

Introduction:
Since December 2009, hard disk manufacturers have been introducing disks that use 4096-byte sectors rather than the more common 512-byte sectors. So that operating systems keep working, firmware splits each 4096-byte physical sector into 512-byte logical sectors, but the use of larger physical sectors still has consequences for disk layout and system performance. This article examines those consequences, including benchmark tests that show the real-world impact on common Linux file systems. As 4096-byte sectors become more common from 2010 onward, strategies for handling these new disks become increasingly important.

 

Why the change to 4096-byte sectors?

If you are familiar with disk structure, you know that a disk is divided into sectors, usually 512 bytes each. All read and write operations occur in multiples of the sector size. Look closely and you will find that a hard disk actually stores a good deal of extra data between the sectors. The disk firmware uses these extra bytes to detect and correct errors within each sector. As hard disks grow larger, more and more data must be stored on each unit of disk area, which leads to more low-level errors and strains the firmware's error correction.

One way to solve this problem is to increase the sector size from 512 bytes to a larger value so that more powerful error-correction algorithms can be used. These algorithms use less data per byte yet can correct more serious problems than is possible with 512-byte sectors. Changing to a larger sector therefore has two practical advantages: improved reliability and greater disk capacity, at least in theory.

Unlike a larger display or a faster central processing unit (CPU), these benefits may not be obvious to end users. Nevertheless, reducing the space devoted to error correction can speed the introduction of larger disks or improve disk reliability.

Unfortunately, the assumption of 512-byte sectors is buried throughout the software chain: the Basic Input/Output System (BIOS), boot loaders, operating system kernels, file system code, and disk tools. Although the transition to 4096-byte sectors has been in the works for years, some tools are still not ready. Microsoft Windows XP is often cited as having potential problems, but problems exist even in Linux.

To ease this transition, the first disks with 4096-byte sectors translate each 4096-byte physical sector into eight 512-byte logical sectors. To the BIOS, the operating system, and all disk tools, the disk appears to have 512-byte sectors, but the underlying physical sector size is actually 4096 bytes. Western Digital is the first manufacturer to produce such disks. It uses the term Advanced Format for disks that have 4096-byte physical sectors and translate them to 512-byte logical sectors. This article uses the same term for Western Digital disks and for disks from other manufacturers that use similar technology.
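The eight-to-one translation can be pictured with a few lines of Python. This is only a sketch: it assumes the simplest possible mapping, in which logical sector 0 sits at the start of physical sector 0, and the function name is made up for illustration.

```python
LOGICAL_SECTOR_BYTES = 512
PHYSICAL_SECTOR_BYTES = 4096
LOGICAL_PER_PHYSICAL = PHYSICAL_SECTOR_BYTES // LOGICAL_SECTOR_BYTES  # 8


def locate_logical_sector(lba):
    """Return (physical sector index, byte offset inside it) for a logical sector.

    Assumes logical sector 0 starts exactly at physical sector 0.
    """
    physical_index, slot = divmod(lba, LOGICAL_PER_PHYSICAL)
    return physical_index, slot * LOGICAL_SECTOR_BYTES


if __name__ == "__main__":
    for lba in (0, 7, 8, 34, 40):
        print(lba, locate_logical_sector(lba))
```

Under this assumption, logical sectors whose numbers are multiples of 8 begin a physical sector, and every other logical sector falls somewhere inside one.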

 

Why is performance affected?

Unfortunately, translating the apparent sector size in firmware can reduce performance. To understand why, you first need to know something about file system data structures and how partitions are laid out on a disk.

Most modern file systems use data structures that are 4096 bytes or larger, so most disk I/O operations are multiples of that size. Consider what happens when Linux wants to read or write one of these data structures on a new disk with 4096-byte sectors. If the file system data structures line up exactly with the underlying physical sectors, reading or writing a 4096-byte data structure results in a read or write of a single sector, and the disk firmware does not need to do anything special. When the file system data structures do not line up with the underlying physical sectors, however, a read or write operation must access two physical sectors. For a read, this costs little extra time, because the read/write head is likely to pass over both sectors in succession and the firmware can discard the data it does not need. A write of a misaligned data structure, on the other hand, requires the disk firmware to read two sectors first, modify part of each, and then write both sectors back. This takes longer than when the 4096 bytes occupy a single sector, so performance drops.
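The cost difference can be sketched with a simplified model. The code below assumes that any physical sector only partly covered by a write must be read before it is rewritten, and it ignores caching and firmware optimizations; the function names are hypothetical.

```python
LOGICAL_PER_PHYSICAL = 8  # eight 512-byte logical sectors per 4096-byte physical sector


def physical_sectors_touched(first_lba, sector_count=8):
    """List the physical sectors covered by a run of logical sectors."""
    first = first_lba // LOGICAL_PER_PHYSICAL
    last = (first_lba + sector_count - 1) // LOGICAL_PER_PHYSICAL
    return list(range(first, last + 1))


def write_cost(first_lba, sector_count=8):
    """Return (physical reads, physical writes) for one write.

    Simplified model: a partly covered physical sector is read first,
    a fully covered one is simply overwritten.
    """
    touched = physical_sectors_touched(first_lba, sector_count)
    end_lba = first_lba + sector_count
    reads = 0
    for p in touched:
        p_start = p * LOGICAL_PER_PHYSICAL
        p_end = p_start + LOGICAL_PER_PHYSICAL
        fully_covered = first_lba <= p_start and p_end <= end_lba
        if not fully_covered:
            reads += 1
    return reads, len(touched)


if __name__ == "__main__":
    print("4096-byte write at logical sector 40:", write_cost(40))
    print("4096-byte write at logical sector 34:", write_cost(34))
```

In this model, a 4096-byte write that starts on an eight-sector boundary needs one physical write and no reads, while the same write shifted off the boundary needs two reads and two writes.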

How can you tell whether your data structures are properly aligned? Most file systems align their data structures to the beginning of the partition that contains them. Therefore, if a partition starts on a 4096-byte (eight-sector) boundary, it is properly aligned. Unfortunately, until recently most Linux partitioning tools did not create partitions aligned in this way. The later section on aligning partitions describes how to align partitions with common Linux partitioning software.

 

Benchmark Test Results

You may wonder how much partition alignment really matters. To find out, tests were run on a 1 TB Western Digital WD-10EARS Advanced Format drive. The disk was partitioned with the Globally Unique Identifier (GUID) Partition Table (GPT) system. Aligned partitions started at logical sector 40, and unaligned partitions started at logical sector 34 (the first usable sector on a GPT disk with a default-size partition table). The file systems tested were ext3fs, ext4fs, ReiserFS (version 3), JFS, XFS, and Btrfs. The computer ran a 64-bit 2.6.32.3 Linux kernel.

A script performed a series of disk I/O operations: creating a new file system, extracting an uncompressed Linux kernel source tarball to the test drive, copying the tarball to the drive, reading the extracted files from the test drive, reading the tarball from the drive, and deleting the Linux kernel directory. The source tarball was stored on another disk. For the read tests, output was directed to /dev/null. After each write test, the test disk was unmounted to ensure that the operations were not simply held in Linux's disk cache. The reported figures include the time needed for the unmount operation. The kernel source tarball was 365 MB, far larger than the disk's 64 MB cache. Each test sequence was run six times for each file system: three times on properly aligned partitions and three times on improperly aligned partitions. Variation between runs was small. The impact of misalignment is expressed as the mean unaligned time divided by the mean aligned time. A value greater than 1.00 indicates a performance penalty from improper alignment.
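The penalty ratio described above is simple to compute. The following sketch uses invented timings purely for illustration; they are not measurements from these tests, and the function name is hypothetical.

```python
from statistics import mean


def misalignment_penalty(unaligned_seconds, aligned_seconds):
    """Mean unaligned time divided by mean aligned time."""
    return mean(unaligned_seconds) / mean(aligned_seconds)


# Hypothetical timings in seconds, three runs each (not measured data).
aligned_runs = [20.0, 21.0, 19.0]
unaligned_runs = [30.0, 31.0, 29.0]

penalty = misalignment_penalty(unaligned_runs, aligned_runs)
print(f"penalty = {penalty:.2f}")
```

A result above 1.00 means the unaligned runs were slower on average; a result of exactly 1.00 means alignment made no difference.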

Many tests showed moderate penalties. File system creation produced values between 0.96 (for XFS) and 7.94 (for ReiserFS), with an average of 2.79. This penalty matters little because file systems are not created often. The read tests produced values between 0.95 and 1.25, indicating a speed penalty of no more than 25%, as shown in Figure 1. A value of 1.00 means no penalty; higher values mean degraded performance.

 

Figure 1. Read performance loss with unaligned partitions


Write performance for large files also showed moderate penalties. These values ranged from 1.10 (for XFS and JFS) to 6.02 (for ReiserFS), with an average of 2.10. The average is this high mainly because of the sensitivity of ReiserFS; with that file system excluded, the average for the remaining five file systems is 1.31. File deletion was similar, ranging from 1.04 (for XFS) to 4.78 (for JFS), with an average of 1.97. With JFS excluded as an outlier, the average is 1.40.

The greatest write performance impact came from creating small files (extracting the kernel source tarball). The values for tarball extraction ranged from 1.04 (for ext4fs) to 25.53 (for ReiserFS), with an average of 10.9. The second-smallest penalty in this test was for XFS, at 1.82. Because these numbers are ratios of unaligned to aligned times, a value of 10.9 means that a tarball extraction that takes 10 seconds on a properly aligned partition takes 109 seconds on an improperly aligned one, a huge difference! For XFS, with a value of 1.82, the same 10-second operation takes 18.2 seconds on an improperly aligned partition.

Figure 2 summarizes the write performance loss for all file systems. If the value is 1.00, there is no performance loss. If the value is higher, the performance is decreased.

 

Figure 2. Write Performance Loss with unaligned partitions


Note that these tests do not reflect overall file system performance. For example, you should not conclude that ReiserFS performs poorly just because it shows the largest performance difference. ReiserFS does, however, appear to be more sensitive to misalignment than the other file systems.

In addition to the tests on file systems in partitions, spot checks were run on file systems in an LVM configuration, with the LVM partition both properly and improperly aligned. The results were similar to those for raw partitions.

What does this mean in practice? Determine the physical sector size of your disk. If you have an Advanced Format drive, you should align your partitions properly.

 

Determine the physical sector size

In theory, the Linux kernel reports the physical sector size in the /sys/block/sdX/queue/physical_block_size pseudo-file and the logical sector size in the /sys/block/sdX/queue/logical_block_size pseudo-file, where sdX is the name of your device node (typically sda, sdb, and so on). In practice, however, the physical block size information is wrong, at least for the first generation of Western Digital Advanced Format drives. Unfortunately, this means that disk tools cannot properly detect such disks.

In practice, you must look up your drive model on the manufacturer's site or find the information in some other way. The /sys/block/sdX/device/model pseudo-file contains the device model, so you can read it there and check it against the manufacturer's information.
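As a convenience, the three pseudo-files mentioned in this section can be read together. The helper below is a minimal sketch; its name and the sys_block parameter are invented for illustration, and the reported physical size should be treated with the same caution as when reading the files by hand.

```python
from pathlib import Path


def read_block_info(device, sys_block="/sys/block"):
    """Return the sysfs-reported sector sizes and model string for a device.

    Missing pseudo-files are reported as None rather than raising an error.
    """
    base = Path(sys_block) / device

    def read(relative):
        path = base / relative
        return path.read_text().strip() if path.exists() else None

    logical = read("queue/logical_block_size")
    physical = read("queue/physical_block_size")
    return {
        "logical_block_size": int(logical) if logical else None,
        "physical_block_size": int(physical) if physical else None,
        "model": read("device/model"),
    }


if __name__ == "__main__":
    # "sda" is only an example device name.
    print(read_block_info("sda"))
```

If the drive reports 512 for both sizes, that alone does not prove it uses 512-byte physical sectors; the model string still has to be checked against the manufacturer's specifications.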

For the current first generation of Advanced Format drives, Western Digital puts labels on the drives to indicate that they are Advanced Format drives. Unfortunately, these labels suggest that only Windows XP has problems with these drives. The benchmark results above show that Linux users must also be careful with them.

 

Aligning partitions

Current Western Digital drives include a jumper that can be set for Windows XP compatibility. This jumper shifts the sector numbers by 1, so that a partition placed at actual logical sector 64 is seen by the computer as beginning at sector 63 (as cylinder alignment requires). This is a stopgap for the common Windows scenario of a single, sector-aligned partition that spans the entire drive. Unfortunately, if you create multiple partitions, every partition except the first may be misaligned. You should therefore almost certainly not use this jumper. Instead, use your Linux partitioning software to create properly aligned partitions.
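The effect of the jumper on a multi-partition layout can be checked with a little arithmetic. This sketch assumes the classic 255-head, 63-sector geometry and an arbitrarily chosen second partition; both are illustrative assumptions rather than details of any particular drive.

```python
LOGICAL_PER_PHYSICAL = 8
JUMPER_SHIFT = 1  # with the jumper set, reported sector n lives at actual sector n + 1


def is_aligned_on_disk(reported_start, jumper_set):
    """True if the partition's actual first sector is on a 4096-byte boundary."""
    actual_start = reported_start + (JUMPER_SHIFT if jumper_set else 0)
    return actual_start % LOGICAL_PER_PHYSICAL == 0


# Hypothetical cylinder-aligned layout (255 heads x 63 sectors per track).
SECTORS_PER_CYLINDER = 255 * 63               # 16065
first_partition = 63                          # traditional start of the first partition
second_partition = 13 * SECTORS_PER_CYLINDER  # a later cylinder boundary, chosen arbitrarily

if __name__ == "__main__":
    for start in (first_partition, second_partition):
        print(start, is_aligned_on_disk(start, jumper_set=True))
```

With the jumper set, the first partition lands on an eight-sector boundary, but the second one in this example does not, which is the situation the paragraph above warns about.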

Three families of Master Boot Record (MBR) and GPT partitioning tools are available for Linux, and each has its own way of aligning partitions. If you have an Advanced Format drive, you are best off running the latest Linux partitioning software.

Tip:
If you want to dual-boot Linux and an older operating system that requires cylinder alignment, try adjusting the start of every partition so that it falls on a cylinder boundary that is also a multiple of eight sectors. This gives you eight-sector alignment for optimal disk performance and cylinder alignment for the older operating system.

 

Aligning RAID partitions

Redundant Array of Independent Disks (RAID) levels 5 and 6 have alignment problems similar to those of Advanced Format drives, but the cause lies in the stripe size used to create the array, usually 16 KB to 256 KB. When you use a RAID array, you should align partitions on multiples of the stripe size. The emerging standard of aligning by default on 2048-sector (1024 KB) boundaries works for all common RAID stripe sizes.

Published test results show a performance penalty of about 5-30% for improper alignment, which is much smaller than the penalty for misaligned Advanced Format drives. When you create a RAID array from Advanced Format disks, no additional steps are required. Because the RAID alignment values are multiples of the 4096-byte alignment that Advanced Format drives require, aligning partitions as you would for a RAID array of disks with 512-byte physical sectors satisfies both technologies.
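The claim that a 2048-sector boundary suits common stripe sizes is easy to verify. The sketch below assumes power-of-two stripe sizes within the 16 KB to 256 KB range; the list itself is an assumption, not something taken from the article.

```python
SECTOR_BYTES = 512
DEFAULT_ALIGNMENT_SECTORS = 2048
ALIGNMENT_BYTES = DEFAULT_ALIGNMENT_SECTORS * SECTOR_BYTES  # 1,048,576 bytes (1024 KB)

# Assumed power-of-two stripe sizes in the 16 KB to 256 KB range.
stripe_sizes_kb = [16, 32, 64, 128, 256]

if __name__ == "__main__":
    for stripe_kb in stripe_sizes_kb:
        stripe_bytes = stripe_kb * 1024
        ok = ALIGNMENT_BYTES % stripe_bytes == 0
        print(f"{stripe_kb:>3} KB stripe: 1024 KB boundary is a multiple -> {ok}")

    # The same boundary also satisfies the 4096-byte Advanced Format requirement.
    print("4096-byte aligned:", ALIGNMENT_BYTES % 4096 == 0)
```

Every stripe size in the list divides 1024 KB evenly, and so does 4096 bytes, so one alignment value covers both cases.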

 

The fdisk family

The fdisk family is part of the util-linux-ng package. These tools edit the MBR data structures directly but cannot create or modify file systems. Through util-linux-ng 2.17, fdisk provides no direct support for eight-sector partition alignment. Through version 2.17.2, alignment is still based on cylinders by default.

However, you can align partitions properly with any version of fdisk. To do so, enter u to change the default unit from cylinders to sectors. Then enter the starting sector values, each of which must be a multiple of 8. In theory, the first partition can start at sector 8 and still be properly aligned; however, it is best to start the first partition at sector 64 or higher to leave room for boot loader code in the unallocated space between the MBR and the first partition. The partitioning tools in Microsoft Windows Vista and Windows 7 start the first partition at sector 2048, so from a cross-platform perspective this is a safe place to start partitions. In fact, in util-linux-ng 2.17.1 this is the default when you enter c to disable DOS compatibility mode. Keeping this setting is recommended.

Note, however, that fdisk does not automatically align subsequent partitions. If you specify partition sizes in megabytes or larger units and then accept the default start for each subsequent partition, those partitions may well be aligned, but this is not guaranteed. To be safe, check that the start value of every partition is a multiple of 8.
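The multiple-of-8 check is easy to automate once you have the start sectors. In this sketch the partition names and start values are invented; in practice you would copy them from a sector-unit partition listing such as the output of fdisk -lu.

```python
def misaligned_partitions(start_sectors, alignment=8):
    """Return the partitions whose start sector is not a multiple of `alignment`."""
    return {name: start for name, start in start_sectors.items()
            if start % alignment != 0}


# Hypothetical layout, for illustration only.
layout = {
    "sda1": 2048,
    "sda2": 206848,
    "sda3": 4401810,
}

if __name__ == "__main__":
    print(misaligned_partitions(layout))
```

An empty result means every listed partition starts on an eight-sector boundary; here the third partition is flagged because its start sector leaves a remainder when divided by 8.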

Another way to use fdisk is to start it as fdisk -H 224 -S 56 /dev/sda. This changes the cylinder/head/sector (CHS) geometry so that the program's default cylinder alignment also produces proper 4096-byte alignment.
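Why this geometry works can be shown with a short calculation. The comparison against a 255-head, 63-sector geometry is an added illustration of a commonly used default, not something stated in the text above, and the function names are hypothetical.

```python
def sectors_per_cylinder(heads, sectors_per_track):
    """Number of logical sectors in one cylinder for a given CHS geometry."""
    return heads * sectors_per_track


def cylinders_are_4k_aligned(heads, sectors_per_track, logical_per_physical=8):
    """True if every cylinder boundary lands on a 4096-byte boundary."""
    return sectors_per_cylinder(heads, sectors_per_track) % logical_per_physical == 0


if __name__ == "__main__":
    print(sectors_per_cylinder(224, 56), cylinders_are_4k_aligned(224, 56))
    print(sectors_per_cylinder(255, 63), cylinders_are_4k_aligned(255, 63))
    # A track of 56 sectors is itself a multiple of 8.
    print(56 % 8 == 0)
```

With 224 heads and 56 sectors per track, a cylinder holds 12544 sectors, which is a multiple of 8, so partitions placed on cylinder boundaries are also 4096-byte aligned. A 255 by 63 geometry gives 16065 sectors per cylinder, which is not.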

 

The libparted library

The libparted library drives several Linux partitioning tools that also support file system operations. Through version 2.1, the text-mode GNU Parted program (command name parted) supports alignment only on cylinder boundaries. The best approach is to enter unit s to change the default unit to sectors. You can then enter partition start points manually in sectors and check them precisely.

Version 2.2 begins the shift to a more useful strategy for disks with 4096-byte physical sectors. You can specify a value such as 1M, and the partition is then properly aligned. This version also warns you when a partition is not properly aligned.

When using the GUI GParted program, you must deselect the "Round to cylinders" check box, as shown in Figure 3. You must set the start of a partition relative to the end of the previous partition, but this works out if you begin with a properly aligned partition. You can open the Information dialog box to see the absolute start and end sectors.

Figure 3. Deselect the "Round to cylinders" check box when using GParted (it is shown selected here, which is the default state)

GPT fdisk Tool

The GPT fdisk tool is useful only when you use GPT disks. Versions earlier than 0.5.2 perform no alignment, although you can align partitions manually by specifying appropriate start sector numbers. Versions 0.5.2 and 0.6.0 through 0.6.5 adjust the start sector of every partition to an eight-sector boundary, but only on large disks (larger than 800 GB), not on small disks. Version 0.6.6 introduces Windows-style 2048-sector (1 MB) alignment for all unpartitioned disks and tries to infer the alignment used previously from any existing partitions.

In version 0.5.2 and later, you can use the l option to adjust the alignment value manually. This option takes a number of sectors as its argument. To align an Advanced Format disk properly, set this value to 8 or a multiple of 8. The verify option (v on any menu) reports any partitions that are not properly aligned according to the current alignment value.
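Adjusting a start sector to an alignment boundary amounts to rounding up to the next multiple of the alignment value. The function below is a generic sketch of that idea under that assumption; it is not taken from GPT fdisk's source, and its name is hypothetical.

```python
def align_up(sector, alignment=8):
    """Round a start sector up to the next multiple of `alignment`."""
    return -(-sector // alignment) * alignment


if __name__ == "__main__":
    # 34 is the first usable sector on a GPT disk with a default-size table.
    print(align_up(34, 8))
    print(align_up(34, 2048))
    print(align_up(40, 8))
```

With an alignment value of 8, a requested start of 34 moves to 40, the aligned starting point used in the benchmark tests; with 2048, the same request moves to sector 2048.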

 

Looking ahead

Currently, only a few hard disk models use Advanced Format. According to news reports, the technology will spread to more drives from all major manufacturers beginning in 2010. Conceivably, new models may run into performance problems different from those of the first-generation Advanced Format drives.

Eventually, manufacturers may drop the 512-byte sector facade or provide jumpers that let users choose whether to use this compatibility feature. If you encounter a drive that has 4096-byte sectors but lets you select the true sector size, you may want to use that option; however, there are some caveats to keep in mind.

As mentioned earlier, software from the BIOS on up may include assumptions about the disk sector size. If your BIOS contains such an assumption, your computer may not boot from a 4096-byte-sector disk that lacks the firmware translation to 512-byte sectors. As of version 2.2, GNU Parted treats support for disks with sector sizes other than 512 bytes as experimental; start GNU Parted on such a disk and it automatically displays a warning about its support. Other problems may be hidden in software that is important to you. Using the latest software may help you avoid these problems. You can also use a traditional disk as the boot disk and use the new-technology disk only as a data disk (/dev/sdb or higher).

In short, be cautious when dealing with unusual new disks. The current situation with Advanced Format disks and other new drive types may soon settle down.

 

References

Learning

  • "Advanced Format Technology" (Western Digital) is a white paper, translated into multiple languages (PDF), that describes Advanced Format.
  • An article on WD's Advanced Format HD technology (Hot Hardware) includes Windows benchmarking.
  • A post from Linux kernel developer Tejun Heo describes the technical challenges that Advanced Format drives pose for Linux software.
  • "Linux Hardware RAID Howto" covers Linux RAID alignment.

Obtain products and technologies

  • The GNU Parted web site hosts the text-mode GNU Parted and its parent library, libparted. GNU Parted is a mature text-mode MBR and GPT partitioning tool.
  • Util-linux-ng includes the Linux fdisk, sfdisk, and cfdisk tools.
  • GNOME Partition Editor (also known as GParted) is a GUI partitioning tool built on libparted.
  • GPT fdisk is a GPT-only partitioning program modeled after fdisk.