Ext2 File System

Source: Internet
Author: User



Tenet: The learning of technology is limited and the spirit of sharing is limitless.


I. Overall storage layout

A disk can be divided into multiple partitions. Before files can be stored on a partition, it must be formatted with a formatting tool (such as the mkfs command), and the formatting process writes information about the storage layout to the disk.


The smallest unit of storage in the file system is the block. The block size is decided at format time; for example, the -b option of mke2fs can set it to 1024, 2048, or 4096 bytes. The size of the boot block, by contrast, is fixed at 1KB. The boot block is defined by the PC standard and holds disk partition information and boot information; no file system may use it. The ext2 file system begins after the boot block and is divided into a number of block groups of equal size, each of which consists of the following parts.

1. Super Block (Superblock)

Describes the file system information for the entire partition, such as the block size, file system version number, time of the last mount, and so on. The super block has one copy at the beginning of each block group.

2. Block Group Descriptor Table (GDT, Group Descriptor Table)

The table consists of many block group descriptors: there is one descriptor for each block group the partition is divided into. Each block group descriptor (group descriptor) stores descriptive information about one block group, such as the block at which the inode table starts, the block at which the data blocks start, how many free inodes and free data blocks remain, and so on. Like the super block, the block group descriptor table has a copy at the beginning of each block group. This information is critical: if the super block is accidentally damaged, the data of the entire partition is lost, and if a block group descriptor is accidentally damaged, the data of the entire block group is lost, which is why multiple copies are kept. Normally the kernel only uses the copy in block group 0. When e2fsck runs a file system consistency check, it copies the super block and block group descriptor table from block group 0 to the other block groups, so that if the beginning of block group 0 is accidentally damaged, the other copies can be used for recovery, reducing the loss.

3. Block Bitmap

The blocks in a block group are used as follows. Data blocks store the data of all files. For example, if the block size of a partition is 1024 bytes and a file is 2049 bytes long, three data blocks are needed; even though the third block holds only one byte, it occupies a whole block. The super block, block group descriptor table, block bitmap, inode bitmap and inode table store the descriptive information of the block group. So how do we know which blocks are already used for file data or descriptive information, and which are still free? The block bitmap records exactly this for the whole block group. It occupies one block itself, and each of its bits represents one block in the block group: a bit of 1 means the block is in use, and a bit of 0 means the block is free.
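The data block count in the example above is a ceiling division. A minimal sketch in Python, where the function name blocks_needed is illustrative and not part of any ext2 tool:

```python
def blocks_needed(file_size: int, block_size: int = 1024) -> int:
    """Number of data blocks needed to hold file_size bytes (rounded up)."""
    return (file_size + block_size - 1) // block_size


# The example from the text: a 2049-byte file on 1024-byte blocks.
print(blocks_needed(2049))  # prints 3
```

The last block is only partly filled, but it is still allocated as a whole block.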

Why is the df command so fast at reporting the used space of an entire disk? Because it only needs the block bitmap information of each block group and does not have to search the whole partition. Conversely, using the du command to report the space used by a large directory is slow, because it has no choice but to walk through every file in that directory.

A related question: into how many block groups is a partition divided when it is formatted? The main constraint is that the block bitmap itself must occupy exactly one block. When formatting with mke2fs the default block size is 1024 bytes, and the -b option can specify another block size. Suppose the block size is b bytes. Then one block holds 8b bits, so a block bitmap of that size can describe the usage of 8b blocks, and a block group can contain at most 8b blocks. If the whole partition has s blocks, there are about s/(8b) block groups. The -g option of mke2fs can specify how many blocks a block group contains, but it usually does not need to be given by hand, since mke2fs calculates an optimal value.
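The relationship between block size and the number of block groups can be sketched as follows. The function names are illustrative, and rounding the quotient up (so that a partial last group still counts as a group) is an assumption rather than something stated in the text:

```python
def max_blocks_per_group(block_size: int) -> int:
    """One bitmap block has 8 * block_size bits, one bit per block."""
    return 8 * block_size


def block_group_count(total_blocks: int, block_size: int) -> int:
    """Number of block groups for s blocks, rounding s / (8b) up (assumption)."""
    per_group = max_blocks_per_group(block_size)
    return (total_blocks + per_group - 1) // per_group


print(max_blocks_per_group(1024))     # prints 8192
print(block_group_count(1023, 1024))  # prints 1
```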

4. Inode Bitmap

Like the block bitmap, the inode bitmap occupies one block, and each of its bits indicates whether one inode is free.

5. Inode Table

We know that, besides its data, a file has descriptive information that must also be stored, such as the file type (regular, directory, symbolic link, etc.), permissions, file size, and creation/modification/access times. This is the information shown by the ls -l command, and it is stored in the inode rather than in data blocks. Each file has one inode, and all the inodes of a block group make up the inode table.

The number of blocks occupied by the inode table is decided at format time and written into the block group descriptor. The default policy of mke2fs is to allocate one inode for every 8KB of a block group. Since data blocks make up the vast majority of a block group, this is roughly one inode per 8KB of data blocks. In other words, if the average file size is 8KB, the inode table is fully used by the time the partition is full, and no data blocks are wasted. If the partition holds very large files (such as movies), the data blocks run out while many inodes are still unused, and those inodes are wasted. If it holds very small files (such as source code), the inodes may run out before the data blocks do, and many data blocks are wasted. If the user can predict the size of the files to be stored on the partition, the -i option of mke2fs can be used to specify the number of bytes per inode manually.
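The bytes-per-inode policy amounts to a simple division. The sketch below is only an approximation under that assumption: it ignores any rounding or minimums that mke2fs itself may apply, and the function name inode_count is illustrative:

```python
MB = 1024 * 1024


def inode_count(partition_bytes: int, bytes_per_inode: int = 8192) -> int:
    """Approximate inode count: one inode per bytes_per_inode bytes."""
    return partition_bytes // bytes_per_inode


print(inode_count(1 * MB))        # one inode per 8KB: prints 128
print(inode_count(1 * MB, 4096))  # one inode per 4KB: prints 256
```

A smaller bytes-per-inode value suits partitions full of small files, and a larger one suits partitions full of large files.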

6. Data Block

Data blocks are used differently depending on the file type. For a regular file, the file's data is stored in data blocks. For a directory, the names of all files and subdirectories in it are stored in its data blocks. Note that a file name is stored in the data block of the directory that contains it, while all the other information shown by ls -l is stored in the file's own inode. Keep this concept in mind: a directory is also a file, a special kind of file. For a symbolic link, a short target path name is stored directly in the inode for faster lookup, while a longer target path name is stored in a data block allocated for it. Special files such as device files, FIFOs and sockets have no data blocks; the major and minor device numbers of a device file are stored in its inode.

II. Exploring the file system format

First create a 1MB file filled with zeros (a regular file stands in for a real partition):

dd if=/dev/zero of=fs count=256 bs=4k

The cp command copies one file to another, whereas the dd command can copy part of a file to another file. This command copies the first 1MB (256 × 4KB) of /dev/zero into a file named fs. /dev/zero is a special device file: it has no data blocks on disk, and reads from it are served by the driver for device number 1, 5. The file can be thought of as infinitely large, and no matter where you start reading, every byte read is 0x00. The command therefore copies 1MB of 0x00 bytes into the file fs. The if and of parameters name the input file and the output file, and the count and bs parameters give the number of copy operations and the number of bytes copied each time.
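For readers who want to see what the dd command does step by step, here is a rough Python equivalent. It is only a sketch: make_zero_file is a hypothetical helper, and it writes into a temporary directory instead of the current directory:

```python
import os
import tempfile


def make_zero_file(path: str, count: int = 256, bs: int = 4096) -> int:
    """Write `count` chunks of `bs` zero bytes to `path`; return the file size."""
    chunk = bytes(bs)  # bs bytes, all 0x00, like one read from /dev/zero
    with open(path, "wb") as f:
        for _ in range(count):
            f.write(chunk)
    return os.path.getsize(path)


with tempfile.TemporaryDirectory() as tmp:
    size = make_zero_file(os.path.join(tmp, "fs"))

print(size)  # 256 * 4096 bytes
```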


Once fs has been formatted with mke2fs, its size is still 1MB, but it is no longer all zeros: it now contains a block group and its descriptive information. Use the dumpe2fs tool to view the information in the super block and block group descriptor table of this partition:


A simple calculation based on what we have learned: the block size is 1024 bytes, so the 1MB partition has 1024 blocks in total. Block 0 is the boot block, and the ext2 file system starts after it, so group 0 occupies blocks 1 to 1023, 1023 blocks in all. The block bitmap occupies one block, which has 1024 × 8 = 8192 bits, more than enough to describe these 1023 blocks, so a single block group suffices. By default one inode is allocated per 8KB, so the 1MB partition has 128 inodes, which matches the dumpe2fs output.

A file system made from a regular file can also be mounted on a directory, just like a disk partition:

The -o loop option tells mount that this is a regular file rather than a block device file. mount then interprets the contents of the file's data blocks as a partition. After the file system is formatted, three subdirectories are automatically created in the root directory: ., .. and lost+found. In any other directory, . refers to the current directory and .. to the parent directory, but in the root directory both . and .. refer to the root directory itself. The lost+found directory is used by the e2fsck tool: if errors are found while checking the disk, the damaged blocks are attached to this directory, because nobody knows which file they belong to and their owner cannot be found, so they are kept here as "lost and found".

You can now add and delete files under the /mnt directory, and the changes are automatically saved to the file fs. Then unmount the partition to make sure that all changes have been written to the file:

sudo umount /mnt

Now we use a binary viewing tool (od) to look at every byte of this file system. Comparing it with the output of dumpe2fs gives a good understanding of the file system's storage layout.

(n lines omitted)

A line beginning with * indicates that omitted data is all zeros. The od output is analyzed in detail below:

The 1KB starting at 000000 is the boot block; since this is not a real disk partition, the boot block is all zeros. The 1KB from 000400 to 0007ff is the super block. Compared against the dumpe2fs output, it is analyzed as follows:

The 204 four-byte words (816 bytes) from 0004d0 to the end of the super block are padding and remain unused. Note that every field in the ext2 file system is stored little-endian: if the position of a byte in the file is treated as its address, then lower addresses are nearer the beginning of the file, and the low-order byte of a field is stored at the lower address.
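Little-endian decoding of a few super block fields can be sketched with Python's struct module. The field offsets follow the ext2 on-disk super block layout (inode count at offset 0, block count at 4, free counts at 12 and 16, first data block at 20, log block size at 24, inodes per group at 40, magic number 0xEF53 at 56). The image built here is synthetic and only carries the values quoted in this article; it is not real mke2fs or dumpe2fs output, and parse_superblock is a hypothetical helper:

```python
import struct

SUPERBLOCK_OFFSET = 1024  # the super block follows the 1KB boot block
EXT2_MAGIC = 0xEF53


def parse_superblock(image: bytes) -> dict:
    """Decode a few little-endian fields from an ext2 super block."""
    sb = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + 1024]
    # "<" selects little-endian with no padding; "I" is u32, "H" is u16.
    (inodes, blocks, _reserved, free_blocks, free_inodes,
     first_data_block, log_block_size) = struct.unpack_from("<7I", sb, 0)
    (inodes_per_group,) = struct.unpack_from("<I", sb, 40)
    (magic,) = struct.unpack_from("<H", sb, 56)
    if magic != EXT2_MAGIC:
        raise ValueError("not an ext2 super block")
    return {
        "inodes": inodes,
        "blocks": blocks,
        "free_blocks": free_blocks,
        "free_inodes": free_inodes,
        "first_data_block": first_data_block,
        "block_size": 1024 << log_block_size,
        "inodes_per_group": inodes_per_group,
    }


# Synthetic image (assumption): boot block of zeros plus a minimal super block.
image = bytearray(2048)
struct.pack_into("<7I", image, SUPERBLOCK_OFFSET, 128, 1024, 0, 986, 117, 1, 0)
struct.pack_into("<I", image, SUPERBLOCK_OFFSET + 40, 128)
struct.pack_into("<H", image, SUPERBLOCK_OFFSET + 56, EXT2_MAGIC)

info = parse_superblock(bytes(image))
print(info["block_size"], info["inodes"], info["free_blocks"])
```

The magic number is stored on disk as the bytes 53 ef, low byte first, which is a quick way to spot the super block in a hex dump.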

The block group descriptor table starts at 000800. The file system is small and has only one block group descriptor, which is analyzed against the dumpe2fs output as follows:

The whole file system is 1MB and each block is 1KB, so there are 1024 blocks. Excluding the boot block leaves 1023 blocks, numbered 1-1023, all of which belong to group 0. Block 1 is the super block. The block group descriptor that follows indicates that the block bitmap is block 6, so blocks 2-5 in between belong to the block group descriptor table, of which blocks 3-5 are reserved and unused. The block group descriptor also indicates that the inode bitmap is block 7 and that the inode table starts at block 8. At which block does the inode table end? The super block indicates that each block group has 128 inodes and that each inode is 128 bytes, so the table takes 16 blocks in total and spans blocks 8-23.
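The extent of the inode table follows directly from the numbers above. A small sketch, with the illustrative function name inode_table_range:

```python
def inode_table_range(first_block: int, inodes_per_group: int,
                      inode_size: int, block_size: int) -> tuple:
    """Return (first, last) block numbers occupied by the inode table."""
    table_bytes = inodes_per_group * inode_size
    table_blocks = (table_bytes + block_size - 1) // block_size  # round up
    return first_block, first_block + table_blocks - 1


first, last = inode_table_range(first_block=8, inodes_per_group=128,
                                inode_size=128, block_size=1024)
print(first, last)  # prints 8 23
print(last + 1)     # first data block: prints 24
```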

Data blocks start at block 24. The block group descriptor indicates that there are 986 free data blocks. Since the file system is newly created, the free blocks are the contiguous range 38-1023, and the blocks before them, 24-37, are already in use. The block bitmap shows that the first 37 bits (the first 4 bytes plus the low 5 bits of the fifth byte) are 1, meaning that blocks 1-37 are in use:


In the block bitmap the bits for blocks 38-1023 are all 0 (up to the low 7 bits of the last byte on line 001870). The bits after that lie beyond the file system's space, so whether they are 0 or 1 is meaningless. As can be seen, the bits within each byte of the block bitmap are assigned from low to high. Later, as the file system is used and files are added and deleted, the 1 bits in the block bitmap will no longer be contiguous.
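The low-to-high bit order can be made concrete with a short lookup function. The bitmap below is synthetic, built to match the description in this article (first 37 bits set), and is not copied from a real image; block_in_use is a hypothetical helper, and it assumes that bit 0 corresponds to block 1, the first block of group 0 in this file system:

```python
def block_in_use(bitmap: bytes, block: int, first_block: int = 1) -> bool:
    """Test the bitmap bit for `block`; bits run from low to high in each byte."""
    index = block - first_block        # bit 0 describes the group's first block
    byte = bitmap[index // 8]
    return bool((byte >> (index % 8)) & 1)


# Synthetic bitmap: four 0xff bytes plus the low 5 bits of the fifth byte.
bitmap = bytes([0xFF, 0xFF, 0xFF, 0xFF, 0x1F]) + bytes(123)

print(block_in_use(bitmap, 37))  # prints True
print(block_in_use(bitmap, 38))  # prints False
free = sum(not block_in_use(bitmap, b) for b in range(1, 1024))
print(free)                      # prints 986
```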

The block group descriptor indicates that there are 117 free inodes. Since the file system is newly created, the free inodes are contiguous: inodes are numbered from 1 to 128, and the free ones are numbers 12 to 128. The inode bitmap shows that the first 11 bits are 1, meaning that the first 11 inodes are in use:

The 128 bits on line 001c00 cover all the inodes, so the bits on the following lines are meaningless whether they are 0 or 1. Of the 11 inodes in use, the first 10 are reserved by the ext2 file system (inode 2 is the root directory), and inode 11 is the lost+found directory. The block group descriptor also indicates that the group contains two directories, namely the root directory and lost+found.

Another useful tool for exploring the file system is debugfs. It provides a command-line interface for performing various operations on a file system, such as viewing information, recovering data, and correcting file system errors. Open the fs file with debugfs and type help at the prompt to see what it can do:

Compare the above information with the output of the od command (figure: inode of the root directory):

st_mode is shown in octal and contains both the file type and the permissions: the leading 4 indicates that the file is a directory (the encoding of the file types is described in stat(2)), and the low-order 755 gives the permissions. A Size of 1024 shows that the root directory currently has only one data block. Links is 3, meaning the root directory has three hard links: . and .. in the root directory itself, and .. in the lost+found subdirectory. Note that although we usually write / for the root directory, there is no hard link named /; in fact / is the path separator and cannot appear in a file name. Blockcount here is counted in 512-byte blocks, not in the block size chosen when the file system was formatted. The smallest read/write unit of a disk is the sector, usually 512 bytes, so Blockcount is the number of physical disk blocks rather than the number of logical blocks of the partition. The location of the root directory's data block is given by Blocks[0], which is block 24. Its offset within the file system is 24 × 0x400 = 0x6000, and the data block of the root directory can be found at address 006000 in the od output.
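The fields discussed above can be checked with Python's standard stat module. This is a sketch using the values quoted in this article; the helper names are illustrative:

```python
import stat


def describe_mode(mode: int) -> tuple:
    """Split st_mode into (is_directory, permission bits, ls-style string)."""
    return stat.S_ISDIR(mode), stat.S_IMODE(mode), stat.filemode(mode)


def block_offset(block: int, block_size: int = 0x400) -> int:
    """Byte offset of a block within the file system image."""
    return block * block_size


is_dir, perms, text = describe_mode(0o40755)
print(is_dir, oct(perms), text)  # prints True 0o755 drwxr-xr-x
print(hex(block_offset(24)))     # prints 0x6000
print(1024 // 512)               # one 1KB block in 512-byte units: prints 2
```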

The data block of a directory consists of a number of variable-length records, each describing one file in the directory; they are shown as boxes in the figure. The first record describes the file with inode number 2, which is the root directory itself. Its total record length is 12 bytes, its file name is 1 byte long, its file type is 2 (see the table below; note that the file type encoding here differs from the one in st_mode), and its file name is the string . (a single dot). The second record also describes the file with inode number 2 (the root directory). Its total record length is 12 bytes, its file name is 2 bytes long, its file type is 2, and its file name is the string .. (two dots). The third record extends to the end of the data block and describes the file with inode number 11 (the lost+found directory). Its total record length is 1000 bytes (together with the two previous records this makes 1024 bytes), its file type is 2, and its file name string is lost+found, followed by all zero bytes. To create a new file in the root directory, the third record can be truncated and a new record created in the zero bytes that follow. If a directory holds so many file names that one data block is not enough, a new data block is allocated and its block number is written into the Blocks[1] field of the inode.
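Walking the variable-length records of a directory block can be sketched as follows. The record header assumed here is the ext2 layout with a file type byte: a 4-byte inode number, a 2-byte record length, a 1-byte name length and a 1-byte file type, all little-endian, followed by the name. The directory block is synthetic, rebuilt from the three records described above rather than read from a real image, and both helper functions are hypothetical:

```python
import struct

HEADER = "<IHBB"  # inode (u32), rec_len (u16), name_len (u8), file_type (u8)
HEADER_SIZE = struct.calcsize(HEADER)  # 8 bytes


def make_entry(inode: int, rec_len: int, file_type: int, name: bytes) -> bytes:
    """Build one directory record, zero-padded to rec_len bytes."""
    raw = struct.pack(HEADER, inode, rec_len, len(name), file_type) + name
    return raw.ljust(rec_len, b"\0")


def parse_dir_block(block: bytes) -> list:
    """Return (inode, rec_len, file_type, name) for each record in the block."""
    entries = []
    offset = 0
    while offset + HEADER_SIZE <= len(block):
        inode, rec_len, name_len, file_type = struct.unpack_from(HEADER, block, offset)
        if rec_len < HEADER_SIZE:
            break  # malformed record; stop rather than loop forever
        if inode != 0:  # inode 0 marks an unused record
            start = offset + HEADER_SIZE
            name = block[start:start + name_len].decode("utf-8", errors="replace")
            entries.append((inode, rec_len, file_type, name))
        offset += rec_len  # rec_len leads to the next record
    return entries


block = (make_entry(2, 12, 2, b".")
         + make_entry(2, 12, 2, b"..")
         + make_entry(11, 1000, 2, b"lost+found"))

for entry in parse_dir_block(block):
    print(entry)
```

Because each record stores its own length, the last record can absorb all remaining space in the block, which is why the lost+found record is 1000 bytes long.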
