Ext2 File System Understanding

Source: Internet
Author: User

A disk can be divided into several partitions. Before a partition can store files, it must be formatted with a particular file system using a formatting tool such as the mkfs command; formatting writes management information to the disk that describes the storage layout. This article uses the ext2 file system as an example of how files are stored on disk.

Overall storage layout for a partitioned ext2 file system

The boot block has a fixed size of 1 KB. It is defined by the PC standard and stores disk partition information and boot information; no file system may use it. The ext2 file system begins right after the boot block. Ext2 divides the whole partition into several block groups of equal size, and each block group consists of the following parts.

Super Block

Describes the file system information for the entire partition, such as the block size, file system version number, time of the last mount, and so on.

The super block has one copy at the beginning of each block group.
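
To make the super block fields more concrete, here is a minimal sketch that decodes a few of them from a raw partition image. The field offsets follow the commonly documented ext2 on-disk layout (for example, the magic number 0xEF53 at byte 56 of the super block); the function names and the dictionary keys are illustrative assumptions, not part of any ext2 tool.

```python
import struct

SUPERBLOCK_OFFSET = 1024  # the super block starts right after the 1 KB boot block
EXT2_MAGIC = 0xEF53       # value of the s_magic field in a valid ext2 super block


def parse_superblock(raw):
    """Decode a few fields from the raw bytes of an ext2 super block."""
    # Thirteen little-endian 32-bit fields (bytes 0-51), then four 16-bit fields (bytes 52-59).
    f = struct.unpack_from("<13I4H", raw, 0)
    if f[15] != EXT2_MAGIC:
        raise ValueError("not an ext2 super block")
    (revision,) = struct.unpack_from("<I", raw, 76)
    return {
        "inodes_count": f[0],
        "blocks_count": f[1],
        "free_blocks_count": f[3],
        "free_inodes_count": f[4],
        "block_size": 1024 << f[6],  # stored on disk as log2(block size) - 10
        "blocks_per_group": f[8],
        "inodes_per_group": f[10],
        "last_mount_time": f[11],    # seconds since the Unix epoch
        "revision": revision,
    }


def read_superblock(image_path):
    """Read the primary super block from a partition or image file (hypothetical helper)."""
    with open(image_path, "rb") as image:
        image.seek(SUPERBLOCK_OFFSET)
        return parse_superblock(image.read(1024))
```

Reading a real partition usually requires root privileges, so it is safer to experiment on an image file created for the purpose.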

Block Group Descriptor Table (GDT, Group Descriptor Table)

The table consists of many block group descriptors, one for each block group in the partition. Each block group descriptor (group descriptor) stores descriptive information about one block group, such as where its inode table starts, where its data blocks start, and how many free inodes and free data blocks it has. Like the super block, the block group descriptor table has a copy at the beginning of each block group. This information is critical: if the super block is damaged, the data of the whole partition is lost, and if a block group descriptor is damaged, the data of that whole block group is lost, which is why both are stored in multiple copies. Normally the kernel uses only the copy in block group 0. When e2fsck checks file system consistency, the super block and block group descriptors in block group 0 are copied to the other block groups, so that if the beginning of block group 0 is damaged, the other copies can be used for recovery, which reduces the loss.

Block Bitmap

The blocks in a block group are used as follows. Data blocks store the data of all files. For example, if the block size of a partition is 1024 bytes and a file is 2049 bytes long, the file needs three data blocks; even though the third block holds only one byte, it still occupies a whole block. The super block, block group descriptors, block bitmap, inode bitmap, and inode table store the descriptive information of the block group. So how do we know which blocks are already used to store file data or descriptive information, and which blocks are free? The block bitmap records which blocks in the block group are in use. It occupies exactly one block, and each of its bits represents one block of the block group: a bit set to 1 means the block is in use, and a bit set to 0 means the block is free.
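
The arithmetic and the bitmap lookup described above can be sketched in a few lines. The function names are illustrative, and the bit order (block i is bit i % 8 of byte i // 8, least significant bit first) is an assumption based on the usual ext2 convention.

```python
def blocks_needed(file_size, block_size):
    """Number of data blocks a file occupies, rounding up to whole blocks."""
    return (file_size + block_size - 1) // block_size


def block_in_use(bitmap, index):
    """Return True if the bit for block `index` is 1 (assumed LSB-first bit order)."""
    return (bitmap[index // 8] >> (index % 8)) & 1 == 1


def count_free_blocks(bitmap, blocks_in_group):
    """Count the zero bits among the first `blocks_in_group` bits of the bitmap."""
    return sum(1 for i in range(blocks_in_group) if not block_in_use(bitmap, i))


# The example from the text: a 2049-byte file on a partition with 1024-byte blocks.
print(blocks_needed(2049, 1024))  # 3
```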

Why is the df command so fast at reporting the used space of a whole disk? Because it only needs the block bitmap of each block group and does not have to search the whole partition. Conversely, using the du command to report the space used by a large directory is slow, because it inevitably has to visit every file in that directory.
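
The difference between the two approaches can be sketched as follows. This is only an approximation of what df and du do, under the assumption of a Unix-like system: the first function asks the file system for its own summary counters through os.statvfs, while the second has to visit every entry under a directory. The function names are illustrative, and unlike du the walking version counts a file with several hard links once per name.

```python
import os


def used_bytes_from_counters(path):
    """df-style: a single query for the file system's own block counters."""
    st = os.statvfs(path)
    return (st.f_blocks - st.f_bfree) * st.f_frsize


def used_bytes_by_walking(top):
    """du-style: visit every entry under `top` and add up its allocated space."""
    total = os.lstat(top).st_blocks * 512  # st_blocks is counted in 512-byte units
    for dirpath, dirnames, filenames in os.walk(top):
        for name in dirnames + filenames:
            total += os.lstat(os.path.join(dirpath, name)).st_blocks * 512
    return total
```

The cost of the first function does not depend on how many files exist, whereas the cost of the second grows with the number of entries in the directory tree.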

A related question is: how many block groups are created when a partition is formatted? The main constraint is that the block bitmap must occupy exactly one block. When formatting with mke2fs, the default block size is 1024 bytes, and a different block size can be specified with the -b parameter. Suppose the block size is b bytes. A block then holds 8b bits, so one block bitmap can describe the usage of 8b blocks, and a block group can therefore contain at most 8b blocks. If the whole partition has s blocks, it is divided into s/(8b) block groups. The -g parameter of mke2fs specifies how many blocks a block group contains, but it usually does not need to be set manually because mke2fs calculates the optimal value.
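
As a worked example of this calculation, the sketch below computes the largest possible block group and the resulting number of groups. The function names are illustrative, and rounding up for a final, partially filled group is an assumption; the text only gives the ratio s/(8b).

```python
def max_blocks_per_group(block_size):
    """A one-block bitmap has 8 * block_size bits, one bit per block in the group."""
    return 8 * block_size


def block_group_count(total_blocks, block_size):
    """Number of block groups for `total_blocks` blocks, rounding up for a partial last group."""
    per_group = max_blocks_per_group(block_size)
    return (total_blocks + per_group - 1) // per_group


# With 1024-byte blocks a group holds at most 8192 blocks, that is, 8 MB.
print(max_blocks_per_group(1024))       # 8192
print(block_group_count(100000, 1024))  # 13
```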

Inode Bitmap

Like the block bitmap, it occupies one block, and each of its bits indicates whether one inode is free.

Inode Table

Besides its data, a file has a lot of descriptive information that must be stored, such as the file type (regular file, directory, symbolic link, and so on), permissions, file size, and status-change/modification/access times. This is the information shown by the ls -l command, and it is stored in the inode rather than in data blocks. Each file has one inode, and all the inodes in a block group make up the inode table.

The number of blocks occupied by the inode table is decided at format time and written into the block group descriptor. The default policy of mke2fs is to allocate one inode for every 8 KB of a block group. Since data blocks take up most of a block group, this is roughly one inode for every 8 KB of data blocks. In other words, if the average file size is 8 KB, the inode table is fully used when the partition is full and no data blocks are wasted. If the partition holds mostly large files (such as movies), the data blocks run out while many inodes are still unused, so inodes are wasted. If the partition holds mostly small files (such as source code), the inodes may run out before the data blocks do, so many data blocks may be wasted. If the user can predict the size of the files to be stored, the -i parameter of mke2fs can be used at format time to specify how many bytes of space each inode is allocated for.
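
The trade-off described above can be illustrated with a rough estimate. This sketch deliberately ignores the space taken by metadata and the rounding of file sizes to whole blocks, and its function names are illustrative; the default of 8192 bytes per inode is the figure given in the text.

```python
def inode_count(partition_bytes, bytes_per_inode=8192):
    """Approximate number of inodes created for a partition of the given size."""
    return partition_bytes // bytes_per_inode


def limiting_resource(partition_bytes, average_file_size, bytes_per_inode=8192):
    """Estimate which resource runs out first as the partition fills up."""
    files_by_inodes = inode_count(partition_bytes, bytes_per_inode)
    files_by_space = partition_bytes // average_file_size
    if files_by_inodes < files_by_space:
        return "inodes"
    if files_by_space < files_by_inodes:
        return "data blocks"
    return "both"


one_gib = 1 << 30
print(inode_count(one_gib))                 # 131072
print(limiting_resource(one_gib, 2048))     # inodes (many small files)
print(limiting_resource(one_gib, 1 << 20))  # data blocks (few large files)
```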

Data block

The contents of a data block depend on the file type:

For regular files, the file's data is stored in data blocks.

For directories, the names of all files and subdirectories in the directory are stored in data blocks, together with their inode numbers. Note that a file name is stored in the data block of the directory that contains it, while all the other information shown by ls -l is stored in the file's own inode. Keep this concept in mind: a directory is also a file, a special kind of file.

For symbolic links, if the target path name is short, it is stored directly in the inode for faster lookup; if the target path name is long, a data block is allocated to store it.

Device files, FIFOs, sockets, and other special files have no data blocks. The major and minor device numbers of a device file are stored in the inode.

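The file types listed above can be told apart from the mode bits kept in the inode. The following minimal sketch uses Python's os and stat modules to classify a path and, for device files, to split the device number into its major and minor parts; the function name describe and the returned dictionary are illustrative assumptions.

```python
import os
import stat


def describe(path):
    """Classify a path by the type bits that the file system keeps in its inode."""
    st = os.lstat(path)  # lstat does not follow symbolic links
    mode = st.st_mode
    if stat.S_ISREG(mode):
        kind = "regular file"
    elif stat.S_ISDIR(mode):
        kind = "directory"
    elif stat.S_ISLNK(mode):
        kind = "symbolic link"
    elif stat.S_ISFIFO(mode):
        kind = "FIFO"
    elif stat.S_ISSOCK(mode):
        kind = "socket"
    elif stat.S_ISCHR(mode):
        kind = "character device"
    elif stat.S_ISBLK(mode):
        kind = "block device"
    else:
        kind = "unknown"
    info = {"type": kind, "size": st.st_size}
    if stat.S_ISCHR(mode) or stat.S_ISBLK(mode):
        # Device files carry a device number instead of data blocks.
        info["major"] = os.major(st.st_rdev)
        info["minor"] = os.minor(st.st_rdev)
    return info
```

Calling describe on a device file such as /dev/zero should report a character device together with its major and minor numbers.
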
Now let's do a few small experiments to understand these concepts. For example, run ls -l in the home directory:

$ ls -l

total 32

drwxr-xr-x akaedu akaedu 12288 2008-10-25 11:33 akaedu

drwxr-xr-x ftp ftp 4096 2008-10-25

drwx------ 2 root root 16384 2008-07-04 05:58 lost+found

Why is the size of every directory a multiple of 4096? Because the block size of this partition is 4096 bytes, and the size of a directory is always a whole number of data blocks. Why are some directories larger than others? Because the data blocks of a directory hold the names of all the files and subdirectories in it; if a directory contains so many files that one block cannot hold all their names, more data blocks are allocated to the directory. Another example:

$ ls -l /dev

......

prw-r----- 1 syslog adm 0 2008-10-25 11:39 xconsole

crw-rw-rw- 1 root root 1, 5 2008-10-24 16:44 zero

The type of the file xconsole is p (for pipe), which means it is a FIFO. A FIFO is really just an identifier for a kernel buffer and holds no data on disk, so it has no data blocks and its size is 0. The type of the file zero is c, which means it is a character device file. It represents a device driver in the kernel and has no data blocks either. Where the file size would normally appear, ls prints the two numbers 1, 5, which are the major and minor device numbers; when the file is accessed, the kernel uses the device numbers to find the corresponding driver. Another example:

$ touch hello

$ ln -s ./hello halo

$ ls -l

total 0

lrwxrwxrwx 1 akaedu akaedu 7 2008-10-25 15:04 halo -> ./hello

-rw-r--r-- 1 akaedu akaedu 0 2008-10-25 15:04 hello

The file hello has just been created, so its size is 0 bytes. The symbolic link halo points to hello and has a size of 7 bytes. Why?

The 7 bytes are the 7 characters of "./hello": a symbolic link stores exactly this path name. Now try a hard link:

$ ln ./hello hello2

$ ls -l

total 0

lrwxrwxrwx 1 akaedu akaedu 7 2008-10-25 15:08 halo -> ./hello

-rw-r--r-- 2 akaedu akaedu 0 2008-10-25 15:04 hello

-rw-r--r-- 2 akaedu akaedu 0 2008-10-25 15:04 hello2

Apart from the file name, hello2 and hello have identical attributes, and one attribute of hello has changed: the number in the second column was 1 and is now 2. In essence, hello and hello2 are two names for the same file in the file system. The second column of ls -l is the hard link count, which says how many names a file has in the file system (these names may be stored in the data blocks of different directories, that is, under different paths). The hard link count is also stored in the inode. Since it is the same file, there is of course only one inode, so ls -l shows exactly the same attributes for both names because they are read from that one inode. Next, look at the hard link count of a directory:

$ mkdir a

$ mkdir a/b

$ ls -ld a

drwxr-xr-x 3 akaedu akaedu 4096 2008-10-25 16:15 a

$ ls -la a

total 20

drwxr-xr-x 3 akaedu akaedu 4096 2008-10-25 16:15 .

drwxr-xr-x akaedu akaedu 12288 2008-10-25 16:14 ..

drwxr-xr-x 2 akaedu akaedu 4096 2008-10-25 16:15 b

$ ls -la a/b

total 8

drwxr-xr-x 2 akaedu akaedu 4096 2008-10-25 16:15 .

drwxr-xr-x 3 akaedu akaedu 4096 2008-10-25 16:15 ..

First we create a directory a, and then a subdirectory a/b under it. The hard link count of directory a is 3, and its three names are a in the current directory, . in directory a, and .. in directory b. The hard link count of directory b is 2, and its two names are b in directory a and . in directory b. Note that hard links to a directory can only be created in this way: the ln command can create a symbolic link to a directory, but it cannot create a hard link to a directory.
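
The link experiments above can also be reproduced programmatically. The following minimal sketch repeats them in a scratch directory and reads the results back from the inode through os.lstat and os.stat; the function names are illustrative. The directory link counts are reported but not assumed, because some file systems other than ext2 report directory link counts differently.

```python
import os
import tempfile


def link_experiment(workdir):
    """Repeat the symbolic link, hard link, and directory experiments inside `workdir`."""
    hello = os.path.join(workdir, "hello")
    halo = os.path.join(workdir, "halo")
    hello2 = os.path.join(workdir, "hello2")

    open(hello, "w").close()                # touch hello
    os.symlink("./hello", halo)             # ln -s ./hello halo
    symlink_size = os.lstat(halo).st_size   # length of the stored path name

    os.link(hello, hello2)                  # ln ./hello hello2
    st1, st2 = os.stat(hello), os.stat(hello2)

    os.makedirs(os.path.join(workdir, "a", "b"))  # mkdir a; mkdir a/b
    return {
        "symlink_size": symlink_size,
        "same_inode": st1.st_ino == st2.st_ino,
        "file_links": st1.st_nlink,
        "dir_a_links": os.stat(os.path.join(workdir, "a")).st_nlink,
        "dir_b_links": os.stat(os.path.join(workdir, "a", "b")).st_nlink,
    }


def expected_dir_links(subdirectories):
    """Names of a directory: its entry in the parent, its own '.', and one '..' per subdirectory."""
    return 2 + subdirectories
```

A typical way to run it is to create a scratch directory with tempfile.TemporaryDirectory() and pass its path to link_experiment; on an ext2 partition the directory counts should match expected_dir_links(1) for a and expected_dir_links(0) for b.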

Reposted from: akaedu textbook
