Last year I noticed that the Android system in our project used the ext4 file system. At the time I did not look into why ext4 was chosen, or what its advantages and disadvantages were.
Recently I sorted out what ext4 changes relative to ext2/ext3. There are not many articles about ext4 online; the only one I found was the ext4 howto at http://kernelnewbies.org/ext4/, which is a general overview.
1. Introduction
Ext4 is the evolution of ext3, the most widely used file system on Linux. Ext4 improves on ext3 in many ways, far more than ext3 improved on ext2. The difference between ext3 and ext2 is essentially limited to journaling, whereas ext4 modifies important data structures of the file system, such as the way file data is mapped to disk blocks. As a result, design, performance, reliability, and features all improve.
2. Ext4 features
2.1 large file systems, large file sizes
Ext2/ext3:
The maximum file system size is 2^32 * block_size. With a block_size of 4096, the maximum file system size is 16 TB.
The maximum file size is constrained by three things:
1. i_size in ext2_inode:
At first glance, i_size is only 32 bits, so a file could not exceed 2^32 bytes (4 GB). In practice, ext2/ext3 uses a trick: for regular files the i_dir_acl field is reused as the upper 32 bits of the size, so the file size is effectively a 64-bit value. This is therefore not the real limit.
2. Addressing range of the triple-indirect block:
With a block size of 4096, one block holds 1024 32-bit block pointers, so the triple-indirect block can address 1024*1024*1024*4096 bytes = 4 TB.
3. The 32-bit sector count (i_blocks): a file cannot occupy more 512-byte sectors than 32 bits can represent, so its size cannot exceed (2^32-1) * 512 bytes, which is about 2 TB.
Therefore, in the ext2/ext3 file system, the maximum file size is 2 TB.
| | Ext2/ext3 | Ext4 |
|---|---|---|
| File system size | 16 TB | 1 EB |
| Maximum file size | 2 TB | 16 TB |
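
The ext2/ext3 limits in this section can be checked with a few lines of arithmetic. This is only a sketch under the assumptions stated in the text (4096-byte blocks, 32-bit block numbers, 512-byte sectors); the function names are made up for illustration and do not correspond to kernel code.

```python
# Arithmetic behind the ext2/ext3 limits (illustrative names, not kernel APIs).
BLOCK_SIZE = 4096      # bytes per block, as assumed in the text
POINTER_SIZE = 4       # a 32-bit block number occupies 4 bytes
SECTOR_SIZE = 512      # bytes per sector
TIB = 2 ** 40          # bytes per (binary) terabyte


def max_fs_size(block_size=BLOCK_SIZE):
    """32-bit block numbers can address at most 2^32 blocks."""
    return 2 ** 32 * block_size


def triple_indirect_capacity(block_size=BLOCK_SIZE):
    """Bytes reachable through the triple-indirect block alone."""
    pointers_per_block = block_size // POINTER_SIZE
    return pointers_per_block ** 3 * block_size


def sector_count_limit():
    """Largest size expressible with a 32-bit count of 512-byte sectors."""
    return (2 ** 32 - 1) * SECTOR_SIZE


if __name__ == "__main__":
    print("file system size :", max_fs_size() / TIB, "TB")
    print("triple indirect  :", triple_indirect_capacity() / TIB, "TB")
    print("sector count     :", sector_count_limit() / TIB, "TB")
```

The smallest of the file-size limits is the sector count, which is why the maximum file size ends up at 2 TB even though the triple-indirect block could address 4 TB.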
2.2 subdirectories
| | Ext2/ext3 | Ext4 |
|---|---|---|
| Maximum number of subdirectories | 32000 | 65000 |
2.3 extents
The ext2/ext3 file system uses indirect mapping to manage the mapping between the logical blocks of a file and the physical blocks on disk. Indirect mapping comes in three levels: single indirect, double indirect, and triple indirect. A small portion of the file data is mapped directly. This scheme is simple and efficient for small files. For large files, however, and especially when deleting or truncating them, it takes a long time because the mapping of every freed block has to be processed. In addition, a large file needs triple-indirect mapping, which means that accessing one logical block of the file can require reading four physical blocks: three indirect blocks plus the data block itself.
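
To make the cost of indirect mapping concrete, here is a minimal sketch that computes how many indirect blocks sit between the inode and a given logical block. It assumes 4096-byte blocks, 4-byte block pointers, and 12 direct pointers in the inode (the number used by the ext2/ext3 inode layout); the function name is hypothetical.

```python
DIRECT_BLOCKS = 12  # direct block pointers stored in an ext2/ext3 inode


def indirection_level(logical_block, block_size=4096, pointer_size=4):
    """Return the number of indirect blocks to read before the data block.

    0 = direct, 1 = single indirect, 2 = double indirect, 3 = triple indirect.
    """
    if logical_block < 0:
        raise ValueError("logical block number must be non-negative")
    pointers = block_size // pointer_size  # pointers held by one indirect block

    if logical_block < DIRECT_BLOCKS:
        return 0
    logical_block -= DIRECT_BLOCKS
    # Each level covers pointers**level blocks; walk through them in order.
    for level in (1, 2, 3):
        if logical_block < pointers ** level:
            return level
        logical_block -= pointers ** level
    raise ValueError("block is beyond the triple-indirect range")


def blocks_read(logical_block):
    """Physical blocks touched: the indirect blocks plus the data block."""
    return indirection_level(logical_block) + 1
```

For a block deep inside a large file, `blocks_read` returns 4, which is the "four physical blocks" case described above.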
Modern file systems use the concept of "extents", introduced specifically to manage the data of large files. An extent describes a run of contiguous physical blocks, so a single extent records the mapping for a whole range of logical blocks instead of one mapping per block. Consider a 100 MB file. Ideally, a single extent is enough to record its mapping, whereas the indirect mapping of ext2/ext3 needs mapping entries for 25600 blocks (4 KB each).
Because extents encourage contiguous allocation on disk, they reduce fragmentation and improve file system performance.
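
The difference between per-block mapping and extents can be sketched as follows. The function collapses a per-block map (list index = logical block, value = physical block) into `(logical_start, physical_start, length)` runs. The tuple layout and the function name are assumptions made for illustration; this is not the ext4 on-disk extent format, and it ignores any cap on the length of a single extent.

```python
def blocks_to_extents(physical_blocks):
    """Collapse a per-block map into (logical_start, physical_start, length) runs."""
    extents = []
    for logical, physical in enumerate(physical_blocks):
        if extents:
            start_logical, start_physical, length = extents[-1]
            # Extend the current run if this block is physically contiguous.
            if physical == start_physical + length:
                extents[-1] = (start_logical, start_physical, length + 1)
                continue
        extents.append((logical, physical, 1))
    return extents


if __name__ == "__main__":
    # 25600 contiguous 4 KB blocks (100 MB), starting at physical block 5000.
    contiguous = list(range(5000, 5000 + 25600))
    print(len(contiguous), "per-block mappings ->",
          len(blocks_to_extents(contiguous)), "extent(s)")
    # A fragmented file needs one extent per contiguous run.
    print(blocks_to_extents([10, 11, 12, 50, 51]))
```

A fully contiguous file collapses to one entry, while a fragmented one needs an entry per run, which is why contiguous allocation matters so much for extents.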
2.4 multiblock allocation
When ext3 needs to write file data to disk, the block allocator decides which free blocks are used for the data. However, the ext3 block allocator can allocate only one block (4 KB) at a time. For a 100 MB file (25600 blocks of 4 KB), this means calling the block allocator 25600 times. This is inefficient not only because of the number of calls, but also because the allocator cannot optimize its allocation policy: it sees each request in isolation and has no way of knowing that the 25600 requests belong together.
Ext4 uses a multiblock allocator (mballoc) that allocates many blocks in a single call. This avoids the repeated calls, lets the allocator make better placement decisions, and improves performance. The multiblock allocator is particularly useful in combination with delayed allocation and extents. This feature does not affect the disk layout.
The ext4 block and inode allocators have other improvements as well; see http://ols.fedoraproject.org/OLS/Reprints-2008/kumar-reprint.pdf
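
As a rough illustration of why allocating one block per call is costly, the sketch below counts allocator calls for a given file size. The `blocks_per_call` parameter is a hypothetical knob for this example; it does not model how mballoc actually sizes its requests.

```python
def allocator_calls(file_size, block_size=4096, blocks_per_call=1):
    """Number of allocator calls needed to cover file_size bytes."""
    if blocks_per_call < 1:
        raise ValueError("blocks_per_call must be at least 1")
    blocks = -(-file_size // block_size)      # ceiling division
    return -(-blocks // blocks_per_call)      # ceiling division


if __name__ == "__main__":
    size = 100 * 2 ** 20  # 100 MB
    print("one block per call  :", allocator_calls(size))
    print("2048 blocks per call:", allocator_calls(size, blocks_per_call=2048))
```

Beyond the raw call count, a request that covers many blocks at once also tells the allocator how much contiguous space is wanted, which a stream of single-block requests cannot do.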
2.5 delayed allocation
Delayed allocation is a performance optimization used in several modern file systems, such as XFS, ZFS, btrfs, and Reiser4. Compared with the traditional block allocation of ext2/ext3, it postpones block allocation for as long as possible.
The drawback of traditional immediate allocation: on a write call, the file system code immediately decides where the data blocks will be stored, even though the data may sit in the cache for a while before it is written back to disk. When a process keeps writing to a file, each write allocates physical blocks without knowing that the file will continue to grow.
Delayed allocation: on a write call, the data is only written to the cache and no blocks are allocated; blocks are allocated only when the data is actually written back to disk. This gives the block allocator the opportunity to optimize and to merge these allocations. Delayed allocation works closely with extents and multiblock allocation: in many workloads, when the file is finally written to disk it is laid out in extents whose blocks are allocated by mballoc.
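
The idea can be sketched with a toy model: writes only mark blocks as dirty in the cache, and a single allocation happens at writeback time. The class and its fields are hypothetical, and the model assumes the free space starting at `next_free` is contiguous, which a real allocator cannot take for granted.

```python
class DelayedFile:
    """Toy model of a file whose block allocation is deferred to writeback."""

    def __init__(self):
        self.dirty_blocks = 0       # blocks cached in memory, no disk location yet
        self.extents = []           # (physical_start, length), chosen at writeback
        self.allocator_calls = 0

    def write(self, nblocks):
        # Only the cache is touched here; nothing is allocated on disk.
        self.dirty_blocks += nblocks

    def writeback(self, next_free):
        """Allocate all dirty blocks in one request; return the new free pointer."""
        if self.dirty_blocks == 0:
            return next_free
        self.allocator_calls += 1
        self.extents.append((next_free, self.dirty_blocks))
        next_free += self.dirty_blocks
        self.dirty_blocks = 0
        return next_free


if __name__ == "__main__":
    f = DelayedFile()
    for _ in range(100):        # 100 small writes of one block each
        f.write(1)
    f.writeback(next_free=1000)
    print(f.extents, f.allocator_calls)
```

In this model the 100 small writes end up as one extent obtained with one allocator call, because the allocator only sees the request once the total size is known.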
2.6 fast fsck
Fsck is time-consuming, especially its first step: checking every inode in the file system. Ext4 stores a list of unused inodes in each block group's inode table, so fsck does not need to check those inodes. As a result, the total fsck time can improve by a factor of 2 to 20, depending on the number of unused inodes. Note that the list of unused inodes is built by fsck, not by ext4 itself. You must run fsck once to create the list; only the next fsck will be faster.
2.7 journal checksumming
The journal is the most heavily used part of the disk, so the blocks that hold it are more prone to hardware failure. Recovering from a corrupted journal can cause even more damage. Ext4 checksums the journal data so that it can tell whether the journal blocks are valid. Journal checksumming also has a side benefit: it allows the two-phase commit used by the ext3 journal to be turned into a single-phase commit. In some cases this makes file system operations up to 20% faster, so reliability and performance improve at the same time.
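
The principle can be sketched as follows: a checksum over a transaction's blocks is stored at commit time, and on recovery the transaction is replayed only if the checksum still matches. CRC32 from Python's `zlib` is used here as a stand-in, and the function names are hypothetical; the sketch does not reproduce the actual journal format or the checksum algorithm used on disk.

```python
import zlib


def commit_checksum(blocks):
    """Running CRC32 over all journal blocks of one transaction."""
    crc = 0
    for block in blocks:
        crc = zlib.crc32(block, crc)
    return crc


def can_replay(blocks, stored_checksum):
    """Replay a transaction only if its blocks still match the stored checksum."""
    return commit_checksum(blocks) == stored_checksum


if __name__ == "__main__":
    blocks = [b"a" * 4096, b"b" * 4096]
    stored = commit_checksum(blocks)          # written with the commit record

    damaged = [blocks[0], b"x" + blocks[1][1:]]   # one byte flipped on disk
    print("intact :", can_replay(blocks, stored))
    print("damaged:", can_replay(damaged, stored))
```

Because the checksum covers the whole transaction, the commit record can be written together with the data blocks, and an incomplete or damaged transaction is detected at recovery time instead of being replayed.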
2.8 no-journal mode
Journaling guarantees the consistency of the file system, but it also adds overhead. In special cases where data integrity is not critical, ext4 can run without a journal. The journal can be disabled, which gives a small performance improvement.
2.9 online defragmentation