Ext2/ext3 Structure Analysis

Source: Internet
Author: User

Original article: ext2/ext3 Structure Analysis (Part 1)

http://bbs.chinaunix.net/thread-3669811-1-1.html

Lab machine: Virtual Machine + Red Hat 9

First, the basic structure of the ext2/ext3 file system.
A machine deals only in byte streams, so those bytes need a structured definition before they mean anything; a file system is exactly such a definition.
The following describes the structure of the ext2/ext3 file system.

[Figure: ext2/ext3 structure]

Part 1. Superblock: 1024 B of data, occupies 1 block

The superblock contains all the information about the configuration of the filesystem. The information in
the superblock contains fields such as the total number of inodes and blocks in the filesystem and how
many are free, how many inodes and blocks are in each block group, when the filesystem was mounted
(and if it was cleanly unmounted), when it was modified, what version of the filesystem it is and which
OS created it.
The primary copy of the superblock is stored at an offset of 1024 bytes from the start of the device, and it
is essential to mounting the filesystem. Since it is so important, backup copies of the superblock are
stored in block groups throughout the filesystem.
The first version of ext2 (revision 0) stores a copy at the start of every block group, along with backups
of the group descriptor block(s). Because this can consume a considerable amount of space for large
filesystems, later revisions can optionally reduce the number of backup copies by only putting backups in
specific groups (this is the sparse superblock feature). The groups chosen are 0, 1 and powers of 3, 5 and
7.
Revision 1 and higher of the filesystem also store extra fields, such as a volume name, a unique
identification number, the inode size, and space for optional filesystem features to store configuration
info.
All fields in the superblock (as in all other ext2 structures) are stored on the disc in little endian format,
so a filesystem is portable between machines without having to know what machine it was created on.

struct ext3_super_block {

/* 00 */
__u32 s_inodes_count; /* inodes count */
__u32 s_blocks_count; /* blocks count */
__u32 s_r_blocks_count; /* reserved blocks count */
__u32 s_free_blocks_count; /* free blocks count */

/* 10 */
__u32 s_free_inodes_count; /* free inodes count */
__u32 s_first_data_block; /* first data block */
__u32 s_log_block_size; /* block size */
__s32 s_log_frag_size; /* can be ignored */

/* 20 */
__u32 s_blocks_per_group; /* number of blocks per block group */
__u32 s_frags_per_group; /* can be ignored */
__u32 s_inodes_per_group; /* number of inodes per block group */
__u32 s_mtime; /* mount time */

/* 30 */
__u32 s_wtime; /* write time */
__u16 s_mnt_count; /* mount count */
__s16 s_max_mnt_count; /* maximal mount count */
__u16 s_magic; /* magic signature */
__u16 s_state; /* file system state */
__u16 s_errors; /* behaviour when detecting errors */
__u16 s_minor_rev_level; /* minor revision level */

/* 40 */
__u32 s_lastcheck; /* time of last check */
__u32 s_checkinterval; /* max. time between checks */
__u32 s_creator_os; /* can be ignored */
__u32 s_rev_level; /* revision level */

/* 50 */
__u16 s_def_resuid; /* default uid for reserved blocks */
__u16 s_def_resgid; /* default gid for reserved blocks */
__u32 s_first_ino; /* first non-reserved inode */
__u16 s_inode_size; /* size of inode structure */
__u16 s_block_group_nr; /* block group # of this superblock */
__u32 s_feature_compat; /* compatible feature set */

/* 60 */
__u32 s_feature_incompat; /* incompatible feature set */
__u32 s_feature_ro_compat; /* readonly-compatible feature set */

/* 68 */
__u8 s_uuid[16]; /* 128-bit UUID for volume */

/* 78 */
char s_volume_name[16]; /* volume name */

/* 88 */
char s_last_mounted[64]; /* directory where last mounted */

/* C8 */
__u32 s_algorithm_usage_bitmap; /* can be ignored */
__u8 s_prealloc_blocks; /* can be ignored */
__u8 s_prealloc_dir_blocks; /* can be ignored */
__u16 s_padding1; /* can be ignored */

/* D0 */
__u8 s_journal_uuid[16]; /* UUID of journal superblock */

/* E0 */
__u32 s_journal_inum; /* inode number of the journal file */
__u32 s_journal_dev; /* device number of the journal file */
__u32 s_last_orphan; /* start of list of inodes to delete */

/* EC */
__u32 s_reserved[197]; /* can be ignored */
};

0.
No matter how large the block size of the partition is, the superblock always starts at byte offset 1024 of the storage device.
The superblock is also stored in little-endian format.
These two points ensure portability.

1.
Block size = 1 << (s_log_block_size + 10), in bytes.
On this partition s_log_block_size = 0, so the block size is 1 << 10 = 2^10 = 1024.
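The shift can be checked with a few lines of Python. This is only a sketch of the formula above; the function name `block_size` is chosen here for illustration and is not part of any ext2 tool.

```python
def block_size(s_log_block_size):
    """Block size in bytes: 1 << (s_log_block_size + 10)."""
    return 1 << (s_log_block_size + 10)


# s_log_block_size values 0, 1 and 2 correspond to 1 KiB, 2 KiB and 4 KiB blocks.
for log_size in (0, 1, 2):
    print(log_size, block_size(log_size))
```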

2.
Note s_magic: this field has the same value in ext2 and ext3 (53ef on disk, that is, 0xEF53).
It plays a role similar to the magic numbers used for demultiplexing in TCP/IP protocols, and it shows how compatible ext2 and ext3 are.

3.
The dumpe2fs command displays the superblock.
Let's look at the block size of this partition:
# dumpe2fs /dev/sda1
.......
Inode count: 24096
Block count: 96358
.......
First block: 1
Block size: 1024
.......

This output tells us:
Inode count: 24096 is the total number of inodes;
Block count: 96358 is the total number of blocks;
Block size: 1024 is the block size of the partition;
First block: 1 means that the file system on /dev/sda1 starts storing its data at block 1.

4.
To view the raw disk content, use the dd command.
Block 0 of /dev/sda1 is not used. To verify:
# dd if=/dev/sda1 bs=1024 count=1 skip=0 | xxd | less
The output is all zeros.
skip=0 means that zero blocks are skipped, so block 0 is the one read.
Since the superblock is 1024 B, run the following command to look at those 1024 B:
# dd if=/dev/sda1 bs=1024 count=1 skip=1 | xxd | less
0000000: 205e 0000 6678 0100 d112 0000 aa47 0100   ^..fx.......G..
0000010: f65d 0000 0100 0000 0000 0000 0000 0000  .]..............
0000020: 0020 0000 0020 0000 d807 0000 6685 344f  . ... ......f.4O
0000030: 6685 344f 2800 ffff 53ef 0100 0100 0000  f.4O(...S.......
0000040: f92a 254f 0000 0000 0000 0000 0100 0000  .*%O............
0000050: 0000 0000 0b00 0000 8000 0000 0400 0000  ................
0000060: 0600 0000 0100 0000 981f fa3c b538 4a2b  ...........<.8J+
0000070: ba5e 9c83 ceeb 34fe 2f62 6f6f 7400 0000  .^....4./boot...
0000080: 0000 0000 0000 0000 0000 0000 0000 0000  ................
(all zeros)
00000d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000e0: 0800 0000 0000 0000 0000 0000 125f b5b3  ............._..
00000f0: 29d5 46c5 99a4 7da6 c8af 233a 0200 0000  ).F...}...#:....
0000100: 0000 0000 0000 0000 f92a 254f 0000 0000  .........*%O....
0000110: 0000 0000 0000 0000 0000 0000 0000 0000  ................
(all zeros)
00003f0: 0000 0000 0000 0000 0000 0000 0000 0000  ................

We can see that the superblock is stored in little-endian format. Reading it against the ext3_super_block structure and the English description:
The first field is a u32, 205e 0000, that is, 0x5e20, decimal 24096, the total number of inodes.
The second is also a u32, 6678 0100, that is, 0x17866, decimal 96358, the total number of blocks.
Both numbers match the output of dumpe2fs /dev/sda1.
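The same decoding can be done mechanically. The sketch below unpacks the first 64 bytes of the superblock with Python's `struct` module, using the field order of ext3_super_block; the byte string is transcribed from the xxd output above with every row written out as a full 16 bytes, and the names `SUPERBLOCK_HEAD` and `parse_superblock_head` are made up for this example.

```python
import struct

# First 0x40 bytes of the superblock, transcribed from the xxd dump.
SUPERBLOCK_HEAD = bytes.fromhex(
    "205e 0000 6678 0100 d112 0000 aa47 0100"
    "f65d 0000 0100 0000 0000 0000 0000 0000"
    "0020 0000 0020 0000 d807 0000 6685 344f"
    "6685 344f 2800 ffff 53ef 0100 0100 0000"
)

FIELD_NAMES = (
    "s_inodes_count", "s_blocks_count", "s_r_blocks_count",
    "s_free_blocks_count", "s_free_inodes_count", "s_first_data_block",
    "s_log_block_size", "s_log_frag_size", "s_blocks_per_group",
    "s_frags_per_group", "s_inodes_per_group", "s_mtime", "s_wtime",
    "s_mnt_count", "s_max_mnt_count", "s_magic", "s_state", "s_errors",
    "s_minor_rev_level",
)

# "<" selects little-endian with no padding: 7 x u32, 1 x s32, 5 x u32,
# then u16, s16 and 4 x u16, which is 64 bytes in total.
HEAD_FORMAT = "<7Ii5IHhHHHH"


def parse_superblock_head(raw):
    """Return the first 19 superblock fields as a dict."""
    values = struct.unpack_from(HEAD_FORMAT, raw, 0)
    return dict(zip(FIELD_NAMES, values))


fields = parse_superblock_head(SUPERBLOCK_HEAD)
print(fields["s_inodes_count"], fields["s_blocks_count"], hex(fields["s_magic"]))
```

To run it against a real device or image instead, read 1024 bytes starting at byte offset 1024 and pass them to the same function.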

5.
Backup copies
In ext2 revision 0, the superblock is backed up in every block group.
With the sparse superblock feature, copies are kept only in groups 0 and 1 and in groups whose number is a power of 3, 5 or 7, such as 9, 25, 49, ...
so the number of block groups holding a copy is limited.
This can be tested and confirmed:
# dumpe2fs /dev/sda1 (most of the output omitted)
Group 0: (blocks 1-8192)
Group 1: (blocks 8193-16384)
Group 2: (blocks 16385-24576)
Group 3: (blocks 24577-32768)
Group 4: (blocks 32769-40960)
Group 5: (blocks 40961-49152)
Group 6: (blocks 49153-57344)
Group 7: (blocks 57345-65536)
Group 8: (blocks 65537-73728)
Group 9: (blocks 73729-81920)
Group 10: (blocks 81921-90112)
Group 11: (blocks 90113-96357)

Then execute:
# dd if=/dev/sda1 bs=1024 count=1 skip=1 | xxd | less
# dd if=/dev/sda1 bs=1024 count=1 skip=8193 | xxd | less
# dd if=/dev/sda1 bs=1024 count=1 skip=16385 | xxd | less
# dd if=/dev/sda1 bs=1024 count=1 skip=24577 | xxd | less
# dd if=/dev/sda1 bs=1024 count=1 skip=32769 | xxd | less
# dd if=/dev/sda1 bs=1024 count=1 skip=40961 | xxd | less
# dd if=/dev/sda1 bs=1024 count=1 skip=49153 | xxd | less
# dd if=/dev/sda1 bs=1024 count=1 skip=57345 | xxd | less
# dd if=/dev/sda1 bs=1024 count=1 skip=65537 | xxd | less
# dd if=/dev/sda1 bs=1024 count=1 skip=73729 | xxd | less
# dd if=/dev/sda1 bs=1024 count=1 skip=81921 | xxd | less
# dd if=/dev/sda1 bs=1024 count=1 skip=90113 | xxd | less

We find that the first block of groups 0, 1, 3, 5, 7 and 9 holds the same superblock data.
Only the copy in block group 0 is the one actually in use;
the copies in groups 1, 3, 5, 7 and 9 are backups, and the first block of the remaining groups does not hold a superblock.
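The rule "groups 0, 1 and powers of 3, 5 and 7" can be written as a small predicate. This is a sketch of the rule as quoted in the documentation above, not code taken from the kernel or e2fsprogs; both function names are invented for the example.

```python
def is_power_of(n, base):
    """True if n equals base**k for some integer k >= 0."""
    while n > 1 and n % base == 0:
        n //= base
    return n == 1


def holds_superblock(group):
    """True if a block group keeps a superblock copy under the sparse rule."""
    if group in (0, 1):
        return True
    return any(is_power_of(group, base) for base in (3, 5, 7))


# The 12 groups of the example partition.
print([g for g in range(12) if holds_superblock(g)])
```

For the 12 groups on this partition the predicate selects 0, 1, 3, 5, 7 and 9, which are the groups whose dd output matched.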

6.
Calculating the number of block groups.
We found 12 groups, numbered 0 to 11. Where does this value come from?

Put yourself in the designer's position: how would you record the number of block groups?
Would you store the value 12 directly? That would be inflexible, because 12 is a derived value:
the file system stores basic metadata, and 12 can be computed from it.

Clearly, the number of block groups is the total number of blocks divided by the number of blocks in each group.
2nd field, the total number of blocks: s_blocks_count
9th field, the number of blocks in each group: s_blocks_per_group
6th field, the first data block: s_first_data_block

From the hexadecimal superblock dump above (remember the little-endian storage):
s_blocks_count = 6678 0100 = 0x17866 = 96358
s_blocks_per_group = 0020 0000 = 0x2000 = 8192
s_first_data_block = 0100 0000 = 1

Rounding (96358 - 1 - 1) / 8192 down gives 11, so there are 11 full block groups,
and the remaining blocks go to one more block group. The first 11 block groups therefore each have s_blocks_per_group blocks.
This gives the formula:
Number of block groups = (s_blocks_count - s_first_data_block - 1) / s_blocks_per_group, rounded down, + 1
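As a quick check of this arithmetic, the sketch below applies the formula with integer (floor) division to the values dumpe2fs reports for this partition: 96358 blocks, first data block 1, 8192 blocks per group. The function name is chosen for illustration only.

```python
def block_group_count(blocks_count, first_data_block, blocks_per_group):
    """Number of block groups, using floor division as in the formula above."""
    return (blocks_count - first_data_block - 1) // blocks_per_group + 1


groups = block_group_count(96358, 1, 8192)

# Blocks left over for the last, partially filled group.
last_group_blocks = 96358 - 1 - (groups - 1) * 8192

print(groups, last_group_blocks)
```

The last group comes out at 6245 blocks, which agrees with the range 90113-96357 that dumpe2fs shows for group 11.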

Part 2. Block group descriptor table: N * 32 B of data, occupies 1 block

The block group descriptor table is an array of block group descriptors, used to define parameters of all
the block groups. It provides the location of the inode bitmap and inode table, block bitmap, number of
free blocks and inodes, and some other useful information.
The block group descriptor table is located on the first block following the superblock. This would be
the third block on a 1KiB block file system, or the second block for 2KiB and larger block file systems.
Shadow copies of the block group descriptor table are also stored with every copy of the superblock.

struct ext3_group_desc {

/* 00 */
__u32 bg_block_bitmap; /* block number of the block bitmap */
__u32 bg_inode_bitmap; /* block number of the inode bitmap */
__u32 bg_inode_table; /* block number of the inode table */
__u16 bg_free_blocks_count; /* free blocks count */
__u16 bg_free_inodes_count; /* free inodes count */

/* 10 */
__u16 bg_used_dirs_count; /* directory count */
__u16 bg_pad; /* can be ignored */
__u32 bg_reserved[3]; /* can be ignored */

};

0.
How the table is composed
Descriptor 0 locates the rest of the metadata of block group 0;
descriptor 1 locates the rest of the metadata of block group 1;
.......
descriptor 11 locates the rest of the metadata of block group 11.
Together, the block group descriptors for block groups 0-11 form the descriptor table.

The table in block group 0 therefore gathers the descriptors of all block groups.

1.
Backup copies
The rule is the same as for the superblock: every block group that holds a superblock backup also holds a backup of this table.
Verification:
# dumpe2fs /dev/sda1 (most of the output omitted)
Group 0: (blocks 1-8192)
Group 1: (blocks 8193-16384)
Group 2: (blocks 16385-24576)
Group 3: (blocks 24577-32768)
Group 4: (blocks 32769-40960)
Group 5: (blocks 40961-49152)
Group 6: (blocks 49153-57344)
Group 7: (blocks 57345-65536)
Group 8: (blocks 65537-73728)
Group 9: (blocks 73729-81920)
Group 10: (blocks 81921-90112)
Group 11: (blocks 90113-96357)

# dd if=/dev/sda1 count=1 bs=1024 skip=2 | xxd | less
# dd if=/dev/sda1 count=1 bs=1024 skip=8194 | xxd | less
# dd if=/dev/sda1 count=1 bs=1024 skip=16386 | xxd | less
.......
That is, add 1 to each skip value used for the superblock: the first block of the group holds the superblock and the second holds the descriptor table.

Based on the backup rule for the superblock and the group descriptor table, and on the layout of my partition, the arrangement looks like this:
[Figure: superblock and group descriptor table backups across the block groups of this partition]

This way, even if the superblock or the group descriptor table in block group 0 is damaged, a backup is still available.

2.
The capacity of one descriptor block is limited.
In this example the entire descriptor table fits within one block.
The block size is 1024 B and each descriptor is 32 B, so one block holds at most 1024 / 32 = 32 descriptors.
With a 4 KiB block, one block holds at most 4096 / 32 = 128 descriptors.
A file system with more block groups than that needs more than one descriptor block.
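The capacity figures follow from a single division. A minimal sketch, assuming the 32-byte ext3_group_desc shown above; the constant and function names are illustrative.

```python
GROUP_DESC_SIZE = 32  # size in bytes of struct ext3_group_desc as listed above


def descriptors_per_block(block_size):
    """How many 32-byte group descriptors fit in one block."""
    return block_size // GROUP_DESC_SIZE


for size in (1024, 2048, 4096):
    print(size, descriptors_per_block(size))
```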

3.
Three important internal pointers
Let's dump the table stored in block group 0:

# dd if=/dev/sda1 bs=1024 count=1 skip=2 | xxd | less
0000000: 0300 0000 0400 0000 0500 0000 0000 bd07  ................
0000010: 0200 0000 0000 0000 0000 0000 0000 0000  ................
0000020: 0320 0000 0420 0000 0520 0000 e919 d807  . ... ... ......
0000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000040: 0140 0000 0240 0000 0340 0000 491e c907  .@...@...@..I...
0000050: 0100 0000 0000 0000 0000 0000 0000 0000  ................
0000060: 0360 0000 0460 0000 0560 0000 011f d807  .`...`...`......
0000070: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000080: 0180 0000 0280 0000 0380 0000 031f d807  ................
0000090: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000a0: 03a0 0000 04a0 0000 05a0 0000 011f d807  ................
00000b0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000c0: 01c0 0000 02c0 0000 03c0 0000 031f d807  ................
00000d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000e0: 03e0 0000 04e0 0000 05e0 0000 011f d807  ................
00000f0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000100: 0100 0100 0200 0100 0300 0100 031f d807  ................
0000110: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000120: 0320 0100 0420 0100 0520 0100 011f d807  . ... ... ......
0000130: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000140: 0140 0100 0240 0100 0340 0100 031f d807  .@...@...@......
0000150: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000160: 0160 0100 0260 0100 0360 0100 6817 d807  .`...`...`..h...
0000170: 0000 0000 0000 0000 0000 0000 0000 0000  ................
(all zeros)
00003f0: 0000 0000 0000 0000 0000 0000 0000 0000  ................

From item 2 we know that a 1024 B block holds at most 32 descriptors of 32 B each.
There are currently only 12 block groups, so only 37.5% (12/32) of the block is occupied.

Let's take the first descriptor and analyze it:
0000000: 0300 0000 0400 0000 0500 0000 0000 bd07  ................
0000010: 0200 0000 0000 0000 0000 0000 0000 0000  ................

Mapping it onto the members of struct ext3_group_desc:
__u32 bg_block_bitmap = 0300 0000 = 3 /* block number of the block bitmap */
__u32 bg_inode_bitmap = 0400 0000 = 4 /* block number of the inode bitmap */
__u32 bg_inode_table = 0500 0000 = 5 /* block number of the inode table */
__u16 bg_free_blocks_count = 0000 = 0 /* free blocks count */
__u16 bg_free_inodes_count = bd07 = 0x07bd = 1981 /* free inodes count */
__u16 bg_used_dirs_count = 0200 = 2 /* directory count */
__u16 bg_pad; /* can be ignored */
__u32 bg_reserved[3]; /* can be ignored */

The three internal pointers are the first three members. The values they store are not addresses, but block numbers.
Next we will go to these blocks and analyze them separately.
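The same field-by-field reading can be sketched in Python. The 32 bytes below are descriptor 0 from the dump, written out as two full 16-byte rows; `parse_group_desc` and the other names are hypothetical helpers for this example, not part of any ext2 library.

```python
import struct

# Descriptor 0 (32 bytes), transcribed from the dump of block 2.
GROUP0_DESC = bytes.fromhex(
    "0300 0000 0400 0000 0500 0000 0000 bd07"
    "0200 0000 0000 0000 0000 0000 0000 0000"
)

DESC_FIELDS = (
    "bg_block_bitmap", "bg_inode_bitmap", "bg_inode_table",
    "bg_free_blocks_count", "bg_free_inodes_count",
    "bg_used_dirs_count", "bg_pad",
)

# Little-endian: 3 x u32, 4 x u16, then 12 bytes of bg_reserved skipped.
DESC_FORMAT = "<3I4H12x"


def parse_group_desc(table, index):
    """Decode descriptor number `index` from a raw descriptor table."""
    size = struct.calcsize(DESC_FORMAT)
    values = struct.unpack_from(DESC_FORMAT, table, index * size)
    return dict(zip(DESC_FIELDS, values))


desc0 = parse_group_desc(GROUP0_DESC, 0)
print(desc0)
```

Passing a whole 1024-byte descriptor block and an index from 0 to 11 would decode the other groups the same way.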

Part 3. Block bitmap: occupies 1 block
The "Block bitmap" is normally located at the first block, or second block if a superblock backup is
present, of the block group. Its official location can be determined by reading the "bg_block_bitmap" in
its associated group descriptor.
Each bit represents the current state of a block within that block group, where 1 means "used" and 0
"free/available". The first block of this block group is represented by bit 0 of byte 0, the second by bit 1
of byte 0. The 8th block is represented by bit 7 (most significant bit) of byte 0 while the 9th block is
represented by bit 0 (least significant bit) of byte 1.

The documentation states this clearly. Within a block group:
1 means in use, 0 means available
The 1st block uses byte 0, bit 0
The 2nd block uses byte 0, bit 1
The 8th block uses byte 0, bit 7
The 9th block uses byte 1, bit 0

This area is one large bitmap of blocks, with 0 and 1 indicating whether each block is in use.
With it, you can tell which blocks are available for writing.
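The byte and bit positions listed above come down to a division and a remainder. A minimal sketch, assuming a zero-based block index within the group (so the "1st block" is index 0); the function name is illustrative.

```python
def block_in_use(bitmap, index):
    """True if the block at zero-based `index` within the group is marked used.

    Byte number is index // 8; bit number within that byte is index % 8,
    counted from the least significant bit.
    """
    return bool((bitmap[index // 8] >> (index % 8)) & 1)


# Byte 0 = 0b10000001 marks the 1st and 8th blocks; byte 1 = 0b00000001 marks the 9th.
sample = bytes([0b10000001, 0b00000001])
print([i + 1 for i in range(16) if block_in_use(sample, i)])
```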

0.
The number of blocks is limited
1 byte = 8 bits, and the block bitmap occupies one block of block_size bytes, so the whole block bitmap
can record the used/free state of 8 * block_size blocks.
Therefore,
the maximum number of blocks in each block group = block_size * 8, and
the maximum space in each block group = block_size * 8 * block_size bytes.
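Plugging in common block sizes makes these limits concrete. The sketch below only restates the two products above; `group_limits` is a name invented for the example.

```python
def group_limits(block_size):
    """Return (max blocks per group, max bytes per group) for a block size."""
    max_blocks = block_size * 8          # one bit per block in a one-block bitmap
    max_bytes = max_blocks * block_size  # each of those blocks is block_size bytes
    return max_blocks, max_bytes


for size in (1024, 2048, 4096):
    blocks, nbytes = group_limits(size)
    print(size, blocks, nbytes // (1024 * 1024), "MiB")
```

With 1024 B blocks a group tops out at 8192 blocks, or 8 MiB; with 4 KiB blocks it is 32768 blocks, or 128 MiB.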

1.
A look at the real data
From the descriptor table we know that the block bitmap of block group 0 is in block 3. Take a look:
# dd if=/dev/sda1 count=1 bs=1024 skip=3 | xxd | less
0000000: ffff ffff ffff ffff ffff ffff ffff ffff  ................
(all ff)
00003f0: ffff ffff ffff ffff ffff ffff ffff ffff  ................
This means every block in block group 0 of /dev/sda1 is in use (the inodes still free in this group can no longer get data blocks from it, which is a waste).

# dumpe2fs /dev/sda1
Group 0: (blocks 1-8192)
Primary superblock at 1, group descriptors at 2-2
Block bitmap at 3 (+2), inode bitmap at 4 (+3)
Inode table at 5-255 (+4)
0 free blocks, 1981 free inodes, 2 directories
Free blocks:
Free inodes: 28-2008

Observations:
Group 0: (blocks 1-8192) matches the calculation in item 0: 1024 * 8 = 8192.
Free blocks: none left.
Free inodes: 28-2008 are free, but this group has no blocks left to give them.

Part 4. Inode bitmap: occupies 1 block

The "inode bitmap" works in a similar way as the "Block bitmap", difference being in each bit
representing an inode in the "inode table" rather than a block.
There is one inode bitmap per group and its location may be determined by reading
"bg_inode_bitmap" in its associated group descriptor.
When the inode table is created, all the reserved inodes are marked as used. In revision 0 this is the first
11 inodes.

Like the block bitmap, it is one very large bitmap,
but each 0/1 bit in the inode bitmap marks an inode, the structure described in the next part.

Reserved inodes are marked as 1.
By default, not every block group uses all of its inodes.

0.
The inode count is limited
As with the block bitmap:
Maximum number of inodes per block group = block_size * 8

The inode table size is therefore limited too. Each inode is 128 B, so
the inode table is at most 128 B * block_size * 8.

1.
A look at the real data
From the descriptor table we know that the inode bitmap of block group 0 is in block 4:
# dd if=/dev/sda1 count=1 bs=1024 skip=4 | xxd | less
0000000: ffff ff07 0000 0000 0000 0000 0000 0000  ................
(all zeros)
00000f0: 0000 0000 0000 0000 0000 00ff ffff ffff  ................
(all ff)
00003f0: ffff ffff ffff ffff ffff ffff ffff ffff  ................

# dumpe2fs /dev/sda1
Group 0: (blocks 1-8192)
Primary superblock at 1, group descriptors at 2-2
Block bitmap at 3 (+2), inode bitmap at 4 (+3)
Inode table at 5-255 (+4)
0 free blocks, 1981 free inodes, 2 directories
Free blocks:
Free inodes: 28-2008

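Since there are no free blocks, the block group is full,
yet inodes 28-2008 are still free.

Reading the hexadecimal dump above, ffff ff07 is
11111111 11111111 11111111 00000111
Each byte is consumed starting from its least significant bit, so the three ff bytes mark inodes 1-24 as used and the low three bits of 07 mark inodes 25-27.
The first free inode is therefore 28, which matches the dumpe2fs output. The 1 bits sit at the low end of the last byte because of this bit ordering.

This bitmap tracks the inodes, which are the data covered in the next part.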
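To tie the bitmap back to the dumpe2fs line "Free inodes: 28-2008", the sketch below rebuilds the 1024-byte bitmap from the dump (ff ff ff 07, zeros up to byte 250, ff from byte 251 onward) and lists the inode numbers whose bit is 0. It assumes 2008 inodes per group, as reported in the superblock, and inode numbering that starts at 1; the names are chosen for illustration.

```python
INODES_PER_GROUP = 2008  # s_inodes_per_group (d807 0000) from the superblock dump

# 4 bytes from the first row, 247 zero bytes, then 0xff padding to 1024 bytes.
inode_bitmap = bytes.fromhex("ffffff07") + bytes(247) + b"\xff" * (1024 - 251)


def free_inodes(bitmap, inodes_per_group, first_inode=1):
    """Return the inode numbers in this group whose bitmap bit is 0."""
    return [
        first_inode + i
        for i in range(inodes_per_group)
        if not (bitmap[i // 8] >> (i % 8)) & 1
    ]


free = free_inodes(inode_bitmap, INODES_PER_GROUP)
print(free[0], free[-1], len(free))
```

The result runs from 28 to 2008 and contains 1981 entries, in line with the "1981 free inodes" that dumpe2fs reports for group 0.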
