File System Article 1: Key Factors Affecting File System Performance - Storage Block Allocation and Layout Policies

Source: Internet
Author: User

A file system's allocation and layout policies directly affect its access performance, so modern file systems adopt a variety of optimizations.
3.1 Block suballocation
To keep the design simple and to bound worst-case fragmentation, a traditional file system divides the disk into equal-sized blocks. Because the disk sector size is typically 512 bytes, the block size is generally an integer multiple of 512 bytes. In a traditional file system, a disk block can be allocated only to a single file. Because file sizes are rarely an exact multiple of the block size, the last block of a file (called the tail) is usually only partly used, which causes internal fragmentation. Block suballocation subdivides the tail block so that the remaining space can hold fragments of other files.
 
Some block suballocation schemes can allocate at byte granularity, but most only divide a disk block into smaller sub-blocks (the number of sub-blocks is usually a power of two).
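
The arithmetic behind internal fragmentation can be sketched as follows. This is only an illustration: the function names, the 4096-byte block size, and the choice of 8 sub-blocks per block are assumptions made for the example, not values taken from any particular file system.

```python
def tail_waste(file_size: int, block_size: int) -> int:
    """Bytes left unused in the last block when only whole blocks are allocated."""
    remainder = file_size % block_size
    return 0 if remainder == 0 else block_size - remainder


def tail_waste_suballocated(file_size: int, block_size: int,
                            fragments_per_block: int) -> int:
    """Bytes wasted when the tail occupies only as many sub-blocks as it needs.

    Assumes block_size is evenly divisible by fragments_per_block.
    """
    fragment_size = block_size // fragments_per_block
    remainder = file_size % block_size
    if remainder == 0:
        return 0
    # Round the tail up to a whole number of sub-blocks (ceiling division).
    fragments_needed = -(-remainder // fragment_size)
    return fragments_needed * fragment_size - remainder


if __name__ == "__main__":
    size = 5000  # hypothetical file size in bytes
    print("whole-block waste:", tail_waste(size, 4096))
    print("suballocated waste:", tail_waste_suballocated(size, 4096, 8))
```

With these assumed numbers, the 904-byte tail wastes 3192 bytes of a whole block, but only 120 bytes when it is rounded up to two 512-byte sub-blocks.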
 
3.2 Tail packing
Some file systems pack the tails of several different files into a single shared tail block to make use of the otherwise wasted space described above. This method is called tail packing. Although it may seem to increase file system fragmentation significantly, the effect can be mitigated by the read-ahead feature of modern operating systems: for small files, the tails of several files may lie close enough together to be read at the same time, so no additional seek time is introduced. Such file systems generally use heuristics to decide whether tail packing is worthwhile in a given situation. Defragmentation software may need more sophisticated heuristics.
 
When most files are smaller than half the block size, for example a folder of small source code files or small bitmap images, a file system with tail packing can roughly double storage efficiency compared with one that does not use it. In this case, tail packing not only saves disk space but can also improve performance: better locality means less data has to be read, and the page cache is used more efficiently. However, the added implementation complexity may cancel out these advantages.
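
The space saving can be illustrated with a small model. This is a sketch under stated assumptions, not how any real file system packs tails: it uses a simple first-fit-decreasing placement, ignores metadata overhead, and the function names and file sizes are hypothetical.

```python
def blocks_without_packing(file_sizes, block_size):
    """Every file occupies whole blocks: ceil(size / block_size) per file."""
    return sum(-(-size // block_size) for size in file_sizes if size > 0)


def blocks_with_packing(file_sizes, block_size):
    """Full blocks stay per-file; tails share blocks (first-fit, largest first)."""
    full_blocks = sum(size // block_size for size in file_sizes)
    tails = sorted((size % block_size for size in file_sizes
                    if size % block_size), reverse=True)
    shared_free = []  # free bytes remaining in each shared tail block
    for tail in tails:
        for i, free in enumerate(shared_free):
            if tail <= free:
                shared_free[i] = free - tail
                break
        else:
            # No existing shared block has room, so open a new one.
            shared_free.append(block_size - tail)
    return full_blocks + len(shared_free)


if __name__ == "__main__":
    sizes = [1000, 1500, 900, 1200, 600, 1800]  # hypothetical small files, bytes
    print("without packing:", blocks_without_packing(sizes, 4096))
    print("with packing:   ", blocks_with_packing(sizes, 4096))
```

For these six assumed files, all smaller than half of a 4096-byte block, the model needs six blocks without tail packing and two with it.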
 
3.3 Variable file block size
In a file system that supports variable file block sizes, each file can use its own block size, which may differ from the block size of other files.
 
3.4 Extents
In an extent-based file system, an extent is a contiguous area of storage reserved for a file. When a file is first written, a whole extent is allocated to it. When the file is written again, even if other write operations have happened in between, the new data can continue where the previous write ended. This reduces or eliminates file fragmentation.
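
One way to picture an extent is as a (start, length) record that maps a run of file blocks to a run of disk blocks. The sketch below is illustrative only; the `Extent` fields and the `logical_to_physical` helper are hypothetical names, and real file systems usually index extents with trees rather than a linear list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Extent:
    logical_start: int   # first file-relative block covered by this extent
    physical_start: int  # first disk block of the contiguous run
    length: int          # number of blocks in the run


def logical_to_physical(extents: List[Extent], logical_block: int) -> Optional[int]:
    """Translate a file-relative block number into a disk block number."""
    for ext in extents:
        offset = logical_block - ext.logical_start
        if 0 <= offset < ext.length:
            return ext.physical_start + offset
    return None  # the block is not mapped by any extent


if __name__ == "__main__":
    # A 12-block file described by two records instead of 12 block pointers.
    file_extents = [Extent(0, 1000, 8), Extent(8, 5000, 4)]
    for block in (3, 10, 12):
        print(block, "->", logical_to_physical(file_extents, block))
```

Because each record covers a whole contiguous run, a well-laid-out file needs very little mapping metadata.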
 
Many file systems support extents, including NTFS, HFS, XFS, Reiser4, Universal Disk Format (UDF), ext4, HPFS, OpenVMS Files-11, JFS, BFS, SINTRAN III, Oracle's Automatic Storage Management, and VERITAS File System.
 
3.5 Allocate-on-flush
Allocate-on-flush (also called delayed allocation) is a file system feature implemented in file systems such as HFS+, XFS, Reiser4, ZFS, Btrfs, and ext4. With delayed allocation, when disk blocks are needed for a write operation, the space required for the new data is subtracted from the free-space counter, but no blocks are actually allocated in the free-space bitmap. Instead, the new data is held in memory until it has to be flushed to disk, either because the kernel decides to flush dirty buffers under memory pressure or because an application issues a Unix-style "sync" system call. Only then is the space allocated.
 
This approach lets many small allocations be batched into larger ones. Deferring allocation reduces CPU usage and tends to reduce disk fragmentation, especially for files that grow slowly. It also helps keep each file's blocks contiguous when several files grow at the same time. Combined with copy-on-write, it can turn slow random writes into fast sequential writes.
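
The following toy model shows the idea: writes only reserve space in a counter, and block placement is decided at flush time. It is a sketch under simplifying assumptions: the `DelayedAllocator` class is hypothetical, it uses a bump pointer instead of a free-space bitmap, and it does not model memory pressure or the page cache.

```python
class DelayedAllocator:
    def __init__(self, total_blocks):
        self.free_count = total_blocks  # free-space counter, updated at write time
        self.next_block = 0             # simplistic bump pointer, not a real bitmap
        self.pending = {}               # file name -> blocks buffered in memory
        self.layout = {}                # file name -> list of (start, length) runs

    def write(self, name, nblocks):
        """Reserve space for buffered data without choosing disk blocks yet."""
        if nblocks > self.free_count:
            raise OSError("no space left on device")
        self.free_count -= nblocks
        self.pending[name] = self.pending.get(name, 0) + nblocks

    def flush(self):
        """Allocate one contiguous run per file for everything buffered so far."""
        for name, nblocks in self.pending.items():
            self.layout.setdefault(name, []).append((self.next_block, nblocks))
            self.next_block += nblocks
        self.pending.clear()


if __name__ == "__main__":
    fs = DelayedAllocator(total_blocks=100)
    # Two files grow at the same time, one block per write.
    for name in ("a", "b", "a", "b"):
        fs.write(name, 1)
    fs.flush()
    print(fs.layout)      # each file ends up in a single contiguous run
    print(fs.free_count)
```

In this model, allocating immediately at each write would have interleaved the blocks of the two files, whereas deferring the decision to `flush` gives each file one contiguous run.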
 
3.6 Sparse files
When most of the blocks allocated to a file would be empty, a sparse file uses file system space more efficiently. Instead of writing the empty blocks themselves, the file system writes brief information (metadata) to the disk to represent them, which reduces disk usage. A full block is written to disk only when it contains real (non-empty) data.
 
When a sparse file is read, the file system transparently converts the metadata that represents empty blocks into real blocks filled with zero bytes at run time. The application is unaware of this conversion.
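
A small experiment can make this visible. The sketch below creates a file by seeking past a gap before writing; the helper names are hypothetical, and whether the gap is actually stored as a hole depends on the operating system and file system, so the reported allocation may differ from machine to machine.

```python
import os
import tempfile


def make_sparse_file(path, hole_size, payload):
    """Create a file with a gap of hole_size bytes followed by payload."""
    with open(path, "wb") as f:
        f.seek(hole_size)  # skip ahead without writing anything into the gap
        f.write(payload)


def describe(path):
    """Return (logical size, allocated bytes or None if unavailable)."""
    st = os.stat(path)
    # st_blocks counts 512-byte units and is not available on every platform.
    blocks = getattr(st, "st_blocks", None)
    allocated = blocks * 512 if blocks is not None else None
    return st.st_size, allocated


def read_at(path, offset, length):
    """Read length bytes starting at offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sparse.bin")
        make_sparse_file(path, 16 * 1024 * 1024, b"data")
        size, allocated = describe(path)
        print("logical size:   ", size)
        print("allocated bytes:", allocated)
        print("bytes in gap:   ", read_at(path, 0, 8))
```

On a file system that supports sparse files, the allocated size is typically far smaller than the logical size, while reads from the gap still return zero bytes.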
 
Most modern file systems support sparse files. Sparse files are commonly used for disk images, database snapshots, log files, and scientific applications.
 
Reprints of this article are welcome; please keep the original blog link: http://blog.csdn.net/fsdev/article
