Introduction to Btrfs, the Next-Generation Linux File System (Reprint)


Btrfs is known as the next-generation Linux file system. In recent years ext2/3 has run into more and more scalability problems, and while people were waiting for ext4 they discovered Btrfs, which is said to adopt many advanced file-system designs: it not only solves the scalability problems of ext2/3 but also offers many of the other features expected of a next-generation file system. All of this makes you wonder: what features does Btrfs provide, and how are they implemented? This article explores those questions. It first presents the new features Btrfs provides and briefly explains how they are implemented, and then demonstrates Btrfs's common commands.

September 20, 2010

About Btrfs

The file system appears to be a relatively stable part of the kernel. For years people have used the ext2/3 family, with its excellent stability, as the de facto standard Linux file system. In recent years ext2/3 has exposed some scalability problems, which gave rise to ext4. A development version of ext4 was merged into the 2.6.19 kernel; when the 2.6.28 kernel was released at the end of 2008, ext4 ended its development phase and was opened to general use. It seemed that ext would remain synonymous with Linux file systems. Yet if you read the many articles about ext4, you will find that they all mention Btrfs and regard ext4 as a transitional file system. Ext4's author, Theodore Ts'o, has also praised Btrfs and believes it will become the next-generation standard Linux file system. Oracle, IBM, Intel, and other vendors have shown great interest in Btrfs and invested money and manpower in it. Why is Btrfs so eye-catching? That is the first question this article explores.

Kevin Bowling [1] has an article surveying various file systems; in his view, ext2/3 and their peers belong to the "classical period." The new era of file systems was opened by Sun's ZFS in 2005. ZFS was presented as the "last word in file systems," implying that no further file systems would ever need to be developed. ZFS did bring many new ideas and is an epoch-making file system.

If you compare Btrfs's feature list, you will find that Btrfs and ZFS are very similar. Perhaps we can regard Btrfs as the Linux community's answer to ZFS: from now on Linux, too, has a file system that can rival ZFS.


Features of Btrfs

You can see a list of Btrfs's features on its home page [2]. I have taken the liberty of dividing that list into four groups.

First come the scalability-related features. Btrfs's most important design goal is to meet the scalability requirements that large machines place on the file system. Features such as extents, B-trees, and dynamic inode allocation ensure that Btrfs still performs well on very large storage, and that overall performance does not degrade as system capacity grows.

Next are the data-integrity-related features. Systems face unpredictable hardware failures, and Btrfs uses COW transactions to keep the file system consistent. Btrfs also supports checksums, which avoid silent data corruption; traditional file systems cannot do this.

Third are the features related to multi-device management. Btrfs supports creating snapshots and clones, and it can manage multiple physical devices directly, which makes traditional volume-management software unnecessary.

Finally there are features that are harder to categorize. These are advanced techniques that noticeably improve the time or space efficiency of the file system, including delayed allocation, optimized storage of small files, directory indexing, and more.

Extensibility-related features

B-tree

All metadata in the Btrfs file system is managed by BTrees. The main benefit of a BTree is that lookup, insertion, and deletion are all efficient. The BTree can be said to be the core of Btrfs.

Simply asserting that the BTree is excellent and efficient may not be convincing, but spending a little time on how ext2/3 manages its metadata makes the advantages of the BTree clear by contrast.

One issue that hinders ext2/3's scalability comes from the way directories are organized. In ext2/3 a directory is a special kind of file whose content is a linear table of entries, as shown in Figure 1 [6]:

Figure 1. Ext2 Directory [6]

Figure 1 shows the content of an ext2 directory file containing four files: "home1", "usr", "oldfile", and "sbin". To find "sbin" in this directory, ext2 has to traverse the first three entries until it reaches the string "sbin".

This structure is intuitive enough when the number of files is small, but as the number of entries grows, the time to find a file grows linearly with it. In 2003 the ext3 designers added directory indexing to solve this problem; the data structure used by the directory index is a BTree. When the number of files in one directory exceeds 2K, the i_data field in the inode points to a special block that stores the directory index BTree, whose lookups are far more efficient than scanning a linear table.

Designing two different data structures for the same kind of metadata is, however, not very elegant. A file system contains plenty of other metadata as well, so managing all of it with a single, unified BTree is a very simple and graceful design.

All metadata inside Btrfs is managed by BTrees, which gives it good scalability. Different kinds of metadata are managed by different trees, and the superblock holds pointers to the roots of these BTrees, as shown in Figure 2:

Figure 2. Btrfs Btree

The FS tree manages file-related metadata such as inodes and directory entries. The chunk tree manages devices; every disk device has an item in the chunk tree. The extent tree manages disk-space allocation: whenever Btrfs allocates a piece of disk space it inserts that extent's information into the extent tree, so querying the extent tree yields the free-space information. The tree of tree roots records the roots of the many BTrees; for example, Btrfs creates a new FS tree for every snapshot a user takes, and the tree of tree roots keeps track of all of their root nodes. The checksum tree stores the checksums of data blocks.
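Every item in these trees is addressed through a key of the same shape. The following is a minimal C sketch of such a key, modeled loosely on the kernel's struct btrfs_key (the field comments are illustrative simplifications, not authoritative on-disk documentation):

#include <stdint.h>

/* Simplified sketch of a Btrfs tree key. Every piece of metadata
 * (inode item, directory item, extent item, chunk item, ...) is stored
 * in one of the BTrees under a key of this shape. */
struct btrfs_key_sketch {
    uint64_t objectid;  /* e.g. the inode number for inode items          */
    uint8_t  type;      /* what kind of item this is                      */
    uint64_t offset;    /* meaning depends on type (byte offset, hash...) */
};

Because every tree speaks this same key language, the code for searching, inserting, and deleting items can be shared by all of the trees.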

Extent-based file storage

Many modern file systems use extents rather than single blocks to manage disk space. An extent is a run of contiguous blocks, defined by a starting block plus a length.

Extents effectively reduce metadata overhead. To see why, let us look at the counter-example of ext2.

Ext2/3 uses the block as its basic unit and divides the disk into blocks. To manage disk space, the file system needs to know which blocks are free, and ext uses a bitmap for this: each bit in the bitmap corresponds to one block on disk, and when a block is allocated its bit is set to 1. This is a classic and very clear design, but unfortunately, as disks grow, the bitmap grows with them, which creates a scalability problem: the metadata grows linearly with the storage capacity. Ideally, metadata should not grow linearly no matter how large the disk becomes.

Figure 3 compares block-based and extent-based management:

Figure 3. Extents in Btrfs versus the bitmap in ext2/3

In ext2/3, ten blocks require ten bits in the bitmap; in Btrfs the same run needs only a single piece of metadata, the extent. For large files, extents show far better management performance.

The extent is the smallest unit of disk space that Btrfs manages, and extents are managed by the extent tree. Whenever Btrfs allocates space for data or metadata, it queries the extent tree for information about free space.
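As a rough illustration (a hypothetical C structure, not Btrfs's actual on-disk format), an extent describes an arbitrarily long run of blocks with two numbers, while a bitmap needs one bit per block no matter what:

#include <stdint.h>

/* Hypothetical, simplified comparison of the two approaches. */
struct extent {             /* one 16-byte record, regardless of run length */
    uint64_t start_block;   /* first block of the contiguous run            */
    uint64_t num_blocks;    /* length of the run                            */
};

/* A bitmap needs one bit per block: a 1 TiB device with 4 KiB blocks
 * needs 1 TiB / 4 KiB / 8 = 32 MiB of bitmap even when the disk is
 * empty, whereas a single extent can describe the whole free device. */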

Dynamic Inode Assignment

To understand dynamic inode allocation, ext2/3 once again serves as the point of comparison. The following table lists a limitation of the ext2 file system:

Table 1. Ext2 limits

Maximum number of files = file-system size / 8192.
For example, a 100 GB file system can create at most about 13 million files.

Figure 4 shows the on-disk layout of ext2:

Figure 4. ext2 layout

In ext2 the inode table area is pre-allocated and fixed in size. For the 100 GB partition above, the inode table can hold only about 13 million inodes, which means no more than that many files can ever be created, because every file must have its own unique inode.

To remove this limit, inodes must be allocated dynamically. In Btrfs each inode is simply an item in a BTree; new inodes can be inserted without limit, and their physical storage locations are allocated on demand. Btrfs therefore places no limit on the number of files.

Optimized support for SSDs

SSD is short for solid-state disk. Over the past decades, CPUs, RAM, and other components have kept pace with Moore's law, but the read/write speed of mechanical hard disks has never made a comparable leap, so disk IO has remained the bottleneck of system performance.

SSDs use flash memory and contain no disk heads or other mechanical parts, so read/write speeds are far higher. Flash has some characteristics that differ from hard disks: it must erase a cell before writing new data into it, and the number of erase cycles is limited; at the current level of technology, a given cell can sustain roughly one million erases. To extend the life of the flash, writes should therefore be spread evenly across the whole device.

SSDs implement this spreading of writes, known as wear leveling, in the firmware inside the drive, so the system no longer needs a special MTD driver or FTL layer. Even though SSDs do a lot of work in hardware, they are still limited, and a file system optimized for SSDs can further improve their lifetime and read/write performance. Btrfs is one of the few file systems specifically optimized for SSDs; users can turn the SSD optimizations on with a mount option.

Btrfs's COW technique fundamentally avoids repeatedly rewriting the same physical cells. If the user enables the SSD optimization option, Btrfs additionally changes its low-level block-allocation policy: it aggregates multiple allocation requests into contiguous runs of about 2 MB. Large, contiguous IO lets the firmware inside the SSD do its own read/write optimization more effectively.

Data consistency-related features

COW transactions

To understand COW transactions, you must first understand the terms COW and transactions.

What is COW?

COW (copy on write) means that whenever data is written to disk, the updated data is written to a newly allocated block; only after the new data has been written successfully are the data structures that reference it updated to point to the new block.

What is a transaction?

COW by itself only guarantees the atomicity of a single update. Many file-system operations, however, must update several pieces of metadata together. Creating a file, for example, requires the following steps:

    1. Modify the extent tree to allocate a piece of disk space
    2. Create a new inode and insert it into the FS tree
    3. Insert a directory entry into the FS tree

If any one of these steps fails, the file cannot be created, so the whole sequence must succeed or fail as a unit; such a sequence is a transaction.

Figures 5 through 7 illustrate a COW transaction.

A is the root node of the FS tree, and new inode information is to be inserted into node C. Btrfs first writes the inode into a newly allocated block C' and modifies the parent node B to point to C'; but modifying B also triggers COW, producing B', and so on in a chain reaction all the way up to the root, producing A'. When this is done, the new node A' is the root of the updated FS tree, but because the transaction has not yet committed, the superblock still points to A.

Figure 5. COW Transaction 1

Next the directory entry (node E) is modified, and the same COW process again propagates up to the new root A'.

Figure 6. COW Transaction 2

At this point both the inode and the directory entry have been written to disk, and the transaction can be committed: Btrfs modifies the superblock so that it points to A', as shown in Figure 7:

Figure 7. COW Transaction 3

COW transactions guarantee file-system consistency, and the system does not need to run fsck after a reboot, because the superblock points either to the new root A' or to the old root A, and either one describes a consistent tree.
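The following is a minimal C sketch of the idea, using hypothetical in-memory structures rather than Btrfs's real code: an update never touches the old blocks; instead every node on the path from the modified leaf up to the root is copied, and only the final pointer swap in the "superblock" makes the change visible.

#include <stdio.h>
#include <stdlib.h>

/* Hypothetical tree node, used only to illustrate copy-on-write updates. */
struct node {
    struct node *child[16];
    int          nchild;
    char         payload[64];
};

/* Copy-on-write update along one root-to-leaf path: every node on the
 * path is duplicated, the duplicate is modified, and the original nodes
 * are left untouched. 'path' gives the child index chosen at each level. */
static struct node *cow_update(const struct node *old, const int *path,
                               int depth, const char *new_payload)
{
    struct node *copy = malloc(sizeof(*copy));
    *copy = *old;                               /* duplicate the old node    */
    if (depth == 0) {
        snprintf(copy->payload, sizeof(copy->payload), "%s", new_payload);
    } else {
        int i = path[0];                        /* descend, copying as we go */
        copy->child[i] = cow_update(old->child[i], path + 1,
                                    depth - 1, new_payload);
    }
    return copy;
}

/* Commit: atomically repoint the "superblock" at the new root A'.
 * Until this single store, readers still see the old, consistent tree,
 * so a crash at any moment leaves either the old or the new tree intact. */
static void commit(struct node **superblock_root, struct node *new_root)
{
    *superblock_root = new_root;
}

Note that the old nodes are not freed here; as the snapshot section below explains, Btrfs uses reference counting to decide when they can be released.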

Checksum

Checksums guarantee the reliability of data and avoid silent corruption. Because of hardware problems, data read from the disk can be wrong: for example, block A stores 0x55, but 0x54 is read back. Since the read operation itself reports no error, upper-layer software has no way to detect the mistake.

The solution is to store a checksum of the data and verify it after every read; if the checksum does not match, the data is known to be wrong.

Ext2/3 has no checksums and trusts the disk completely. Unfortunately, disk errors keep happening, not only on cheap IDE drives; silent corruption also occurs on expensive RAID arrays. Moreover, with the growth of storage networks, even data that is read correctly from the disk is not guaranteed to cross the network devices unscathed.

Btrfs reads the corresponding checksum together with the data. If the data read from disk does not match the checksum, Btrfs first tries to read a mirrored copy of the data; if no mirrored copy exists, it returns an error. Before writing data to disk, Btrfs computes its checksum, and the checksum and the data are then written to disk together.

Btrfs uses a separate checksum tree to manage the checksums of data blocks, keeping the checksums apart from the blocks they protect, which gives stronger protection. If instead a checksum field were added to the header of every data block, the block would become a self-certifying structure, and one class of error could not be detected: the file system asks the disk for block A, but the disk mistakenly returns block B; since the checksum travels inside the block, it still matches, and the error goes unnoticed. Keeping checksums in a separate checksum tree avoids this problem.

Btrfs currently uses the CRC32 algorithm to compute checksums and will support other algorithms in future releases. To improve efficiency, Btrfs writes data and checksums in parallel, using separate kernel threads.
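The check itself is simple; a minimal sketch using zlib's crc32() is shown below. This only illustrates the verify-on-read idea: in Btrfs the expected value would be looked up in the checksum tree, and the exact CRC32 variant Btrfs uses may differ from zlib's.

#include <stddef.h>
#include <zlib.h>    /* crc32() */

/* Verify one data block against a checksum stored elsewhere (here a plain
 * parameter stands in for a lookup in the checksum tree). */
static int block_is_valid(const unsigned char *block, size_t len,
                          unsigned long expected)
{
    unsigned long crc = crc32(0L, Z_NULL, 0);  /* initial CRC value         */
    crc = crc32(crc, block, (uInt)len);        /* checksum of the block     */
    return crc == expected;                    /* mismatch means corruption */
}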

Features related to multi-device management

Every Unix administrator has faced the task of allocating disk space to users and applications. In most cases it is impossible to predict exactly how much space a user or application will need, so disk space routinely runs out and people have to find ways to enlarge the file system. Traditional ext2/3 cannot cope well with this need.

Volume-management software such as LVM was designed to meet the need for multi-device management. Btrfs integrates that functionality, which both simplifies the commands users have to run and improves efficiency.

Multi-Device Management

Btrfs supports adding devices dynamically: after a new disk is added to the system, a single Btrfs command adds the device to the file system.

To make the best use of device space, Btrfs divides the disk space into chunks. Each chunk can use a different allocation policy: some chunks store only metadata and some store only data; some chunks can be configured as mirrors while others are configured as stripes. This gives the user very flexible configuration options.

Subvolume

A subvolume is a very elegant concept: part of the file system is configured as a complete sub-file-system of its own, called a subvolume.

With subvolumes, one large file system can be divided into several sub-file-systems that share the underlying device space; disk space is allocated from the underlying devices only when it is needed, much as an application calls malloc() to allocate memory, so the devices act as a storage pool. This model has several advantages, such as making full use of the disk bandwidth and simplifying space management.

Making full use of the disk bandwidth means the file systems can read and write the underlying disks in parallel, which is possible because every sub-file-system can reach all of the disks. Traditional file systems cannot share an underlying disk device, whether physical or logical, and therefore cannot read and write in parallel this way.

Simplified management is relative to LVM and similar volume managers. With the storage-pool model, the size of each sub-file-system adjusts automatically; with LVM, when a file system runs out of space it cannot automatically use free space on other devices, and the administrator must resize it by hand with LVM's management commands.

A subvolume can be mounted at any mount point, where it behaves like a root directory. Subvolumes are an interesting feature with many uses.

For example, if an administrator wants some users to see only part of the file system, say everything below /var/test/ but nothing else under /var/, then /var/test can be made a subvolume. /var/test is itself a complete file system and can be mounted with the mount command, for instance at /test; users who are given access to /test can then reach only the content that lives under /var/test.

Snapshots and clones

A snapshot is a complete image of a file system at a particular moment. After a snapshot is taken, later changes to the file system do not affect the content in the snapshot. This is a very useful capability.

Consider database backup. If at time T1 the administrator decides to back up the database, he must first stop it. Backing up the files takes time, and if an application modified the database while the backup was running, the copy would not be consistent; therefore the database service has to stay down for the whole backup, which is unacceptable for critical applications.

With snapshots, the administrator can stop the database at time T1 and take a snapshot of the system, which typically takes only a few seconds, and then bring the database back immediately. At any later time the backup can be made from the snapshot, and users' modifications to the live database do not affect the snapshot's content. When the backup finishes, the administrator deletes the snapshot and frees the disk space.

Snapshots are usually read-only; when a system supports writable snapshots, such a writable snapshot is called a clone. Clones have many uses as well: for example, install the base software once, then make a separate clone for each user; every user works in their own clone without affecting the other users' disk space. It is very similar to how virtual machines are used.

Btrfs supports both snapshots and clones. This greatly widens Btrfs's range of uses, and users no longer need to buy, install, and operate expensive and complex volume-management software. The following briefly explains how Btrfs implements snapshots.

Recall the COW transaction described above. As Figures 5 through 7 show, if the original nodes A, C, and E are not deleted after the transaction ends, then A, C, E together with D and F still completely describe the file system as it was before the transaction began. This is the basic principle behind snapshots.

Btrfs uses reference counts to decide whether an old node can be deleted after a transaction commits. Every node carries a reference count: it is incremented when another node starts referencing it and decremented when a reference goes away; when the count reaches zero the node is deleted. An ordinary tree root gets a count of one at creation, because the superblock references it, and every other node in the tree likewise starts with a count of one. When the COW transaction in the figures commits, the superblock is changed to point to the new root A'; the count of the old root A drops to zero and A is deleted. Deleting a node decrements the counts of its children, so the old nodes B and C also drop to zero and are deleted. Nodes D and E, however, had their counts incremented when the newly created nodes of the COW (the tree rooted at A') began referencing them, so their counts do not reach zero and they survive.

When a snapshot is created, Btrfs copies the root node A into a new node SA and sets SA's reference count to 2. When the next transaction commits, SA's reference count therefore does not drop to zero, and the user can keep accessing the files in the snapshot through root SA.

Figure 8. Snapshot
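A minimal sketch of this reference-counting rule, again with hypothetical structures rather than the kernel's own: a node is freed when its count reaches zero, and freeing it drops one reference on each of its children, which is exactly what lets a snapshot root keep an old subtree alive.

#include <stdlib.h>

/* Hypothetical reference-counted tree node. */
struct rc_node {
    int             refcount;
    struct rc_node *child[16];
    int             nchild;
};

/* A new referrer appears (e.g. a COWed parent or a snapshot root SA). */
static void node_get(struct rc_node *n)
{
    n->refcount++;
}

/* Drop one reference. At zero the node is released, and one reference is
 * dropped on every child, which may cascade; nodes still referenced by a
 * snapshot keep a non-zero count and therefore survive. */
static void node_put(struct rc_node *n)
{
    if (--n->refcount > 0)
        return;
    for (int i = 0; i < n->nchild; i++)
        node_put(n->child[i]);
    free(n);
}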

Software RAID

RAID has many attractive properties: several inexpensive IDE disks can be combined into one large RAID0 array, and RAID1 and higher RAID levels add data redundancy, making the data stored on disk much safer.

Btrfs has good support for software RAID; the supported levels include RAID0, RAID1, and RAID10.

Btrfs metadata is protected with RAID1 by default. As mentioned earlier, Btrfs divides device space into chunks, and some chunks are designated metadata chunks that store only metadata. For such a chunk Btrfs keeps two stripes, and every metadata write goes to both stripes at once, which protects the metadata.

Other features

The remaining features listed on the Btrfs home page are harder to categorize. They are advanced techniques found in modern file systems that improve the file system's time or space efficiency.

Delayed allocation

Delayed allocation reduces disk fragmentation. In the Linux kernel, many operations are deferred in order to improve efficiency, and block allocation is one of them.

In a file system, frequently allocating and freeing small amounts of space leads to fragmentation. With delayed allocation, when a user needs disk space the data is first kept in memory, and an allocation request is sent to the disk-space allocator; but the allocator does not assign real disk space right away, it merely records the request and returns.

Allocation requests can be very frequent, so during the deferral window the allocator accumulates many of them: some can be merged, and some are even cancelled before any space is assigned. This waiting often avoids unnecessary allocations and consolidates many small requests into one large one, improving IO efficiency.
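A rough C sketch of the merging idea follows. The structures are hypothetical; in reality delayed allocation lives in the page cache and write-back paths, but the payoff is the same: many small pending requests become a few large contiguous ones.

#include <stddef.h>

/* Hypothetical pending allocation request: a byte range that a file wants
 * backed by disk space, recorded but not yet actually allocated. */
struct pending_alloc {
    unsigned long long start;   /* file offset of the range */
    unsigned long long len;     /* length in bytes          */
};

/* Coalesce adjacent pending requests (assumed sorted by start offset) so
 * that, when write-back finally happens, one large contiguous allocation
 * replaces many small ones. Returns the new number of requests. */
static size_t merge_pending(struct pending_alloc *reqs, size_t n)
{
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        if (out > 0 &&
            reqs[out - 1].start + reqs[out - 1].len == reqs[i].start)
            reqs[out - 1].len += reqs[i].len;   /* contiguous: merge  */
        else
            reqs[out++] = reqs[i];              /* keep as a new run  */
    }
    return out;
}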

Inline file

Systems often contain large numbers of small files of a few hundred bytes or less. Giving each one its own data block causes internal fragmentation and wastes disk space. Btrfs stores the content of such small files inside the metadata itself and allocates no extra disk block for the file data, which removes the internal fragmentation and also speeds up access.

Figure 9. Inline file

Figure 9 shows a BTree leaf node. The leaf contains two extent data items, the metadata describing the disk space used by files file1 and file2 respectively.

Suppose file1 is only 15 bytes while file2 is 1 MB. file2 uses the ordinary extent representation: the extent2 metadata points to an extent of 1 MB whose content is the content of file2.

For file1, Btrfs embeds the file content directly into the metadata item extent1. Without the inline-file technique, shown by the dashed line in the figure, extent1 would point to a minimum-size extent of one block; file1 occupies only 15 bytes of it, and the rest of the block is wasted as internal fragmentation.

With inlining, reading file1 requires reading only the metadata block; there is no need to read the extent1 metadata first and then read a separate block holding the file content, so disk IO is reduced.

Thanks to inline files, Btrfs handles small files efficiently and avoids the fragmentation they would otherwise cause.
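A minimal sketch of the decision is shown below. The threshold and the structures are hypothetical; Btrfs's real inline limit depends on the leaf size and mount options, but the idea is simply: small enough, keep the bytes in the metadata item; otherwise allocate a real extent.

#include <stddef.h>
#include <string.h>

#define INLINE_LIMIT 2048            /* hypothetical "small file" threshold */

struct file_item {
    size_t             inline_len;   /* > 0: data lives inside the metadata */
    unsigned char      inline_data[INLINE_LIMIT];
    unsigned long long extent_start; /* otherwise: a real extent on disk    */
    unsigned long long extent_len;
};

/* Store small files inside the metadata item, larger ones in an extent. */
static void store_file(struct file_item *item,
                       const unsigned char *data, size_t size)
{
    if (size <= INLINE_LIMIT) {
        item->inline_len = size;
        memcpy(item->inline_data, data, size);   /* inline file            */
    } else {
        item->inline_len   = 0;
        item->extent_start = 0;                  /* would come from the    */
        item->extent_len   = size;               /* extent allocator       */
        /* ...the data would then be written into that extent...           */
    }
}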

Directory index

When a directory contains many files, a directory index dramatically reduces the time needed to find a file. Btrfs stores directory entries in a BTree, so looking up a particular file in a given directory is already very efficient.

However, the way Btrfs keys directory entries in the BTree does not by itself satisfy readdir. readdir is a POSIX API that returns all files in a given directory, ideally sorted by inode number. The key under which Btrfs inserts a directory entry is not the inode number but a hash computed from the file name; this is very efficient for finding one particular file, but unsuitable for readdir. Therefore, every time a new file is created, Btrfs inserts a second item, a directory index item, alongside the hash-keyed directory entry. The key of the directory index item is a sequence number that increases by one for each new file. Because inode numbers are also assigned in increasing order, the sequence-number order matches the inode-number order, and walking the BTree by sequence number conveniently yields the file list sorted by inode number.

In addition, files that are adjacent in sequence-number order are often adjacent on disk as well, so reading a large number of files in sequence order gives better IO efficiency.
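A small sketch of the double insertion follows. The key layout, the hash, and the helper functions here are placeholders (Btrfs's real hash and item formats differ); the point is simply that every file creation adds a name-hash-keyed item for lookup and a sequence-keyed item for readdir.

#include <stdint.h>

enum { DIR_ITEM = 1, DIR_INDEX = 2 };

/* Hypothetical key shape: (directory id, item type, third value). */
struct key { uint64_t dir; uint8_t type; uint64_t third; };

/* Placeholder BTree insert and name hash; stand-ins for the real thing. */
static void tree_insert(const struct key *k, const char *name)
{
    (void)k; (void)name;              /* would insert the item into a BTree */
}

static uint64_t name_hash(const char *name)
{
    uint64_t h = 5381;                /* simple djb2-style placeholder hash */
    while (*name)
        h = h * 33 + (unsigned char)*name++;
    return h;
}

/* Creating a file adds two items to the directory's tree:
 *  - a DIR_ITEM keyed by hash(name): fast lookup of one specific name
 *  - a DIR_INDEX keyed by a growing sequence number: lets readdir walk
 *    the directory in (roughly) inode-number order                      */
static void add_dir_entry(uint64_t dir_id, const char *name, uint64_t *next_seq)
{
    struct key lookup_key = { dir_id, DIR_ITEM,  name_hash(name) };
    struct key index_key  = { dir_id, DIR_INDEX, (*next_seq)++   };
    tree_insert(&lookup_key, name);
    tree_insert(&index_key,  name);
}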

Compression

Everyone has used compression tools such as zip or WinRAR: compressing a large file can save a lot of disk space. Btrfs has compression built in.

It is commonly assumed that compressing data before writing it to disk costs a lot of CPU time and therefore must lower the file system's read/write performance. However, as hardware evolves, the gap between CPU speed and disk IO speed keeps widening; spending some CPU time and memory can greatly reduce the number of disk IOs and thereby increase overall throughput.

For example, a file that would take 100 disk IOs uncompressed might, after a small amount of CPU time spent compressing it, need only 10 disk IOs to write, so the IO efficiency actually improves; this of course depends on the compression ratio. Btrfs currently uses the deflate/inflate algorithms provided by zlib for compression and decompression, and should support more algorithms in the future to meet different users' needs.

The compression feature still has some shortcomings: when compression is enabled, every file in the whole file system is compressed, whereas users may want finer-grained control, such as different compression algorithms for different directories, or suppressing compression for some of them. I am confident the Btrfs developers will address this in a future release.

Some file types, such as JPEG, are already compressed and cannot be compressed further; attempting to do so just wastes CPU. For this reason, if Btrfs finds that the first few blocks of a file compress poorly, it stops compressing the rest of the file, which improves IO efficiency to some extent.
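A sketch of that bail-out heuristic using zlib is shown below. The block size, the threshold, and giving up after a single test block are assumptions for illustration, not Btrfs's exact policy.

#include <stddef.h>
#include <zlib.h>

#define BLOCK_SIZE 4096

/* Try to compress the first block of a file; if it barely shrinks, report
 * that the rest of the file should be written uncompressed. */
static int worth_compressing(const unsigned char *first_block, size_t len)
{
    unsigned char out[BLOCK_SIZE + 64];   /* room for zlib's worst case   */
    uLongf out_len = sizeof(out);

    if (len > BLOCK_SIZE)
        len = BLOCK_SIZE;                 /* test only the first block    */
    if (compress2(out, &out_len, first_block, (uLong)len,
                  Z_DEFAULT_COMPRESSION) != Z_OK)
        return 0;                         /* compression failed: skip it  */
    return out_len < len;                 /* keep going only if it shrank */
}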

Pre-allocation

Many applications need to pre-allocate disk space. Through the posix_fallocate() interface they can ask the file system to reserve space on disk without writing any data yet. If the underlying file system does not support fallocate, the application can only reserve space by using write() to pre-write useless data.

Having the file system support space reservation is more efficient and also reduces fragmentation, because the space is allocated in one go and is therefore more likely to be contiguous. Btrfs supports posix_fallocate.
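A minimal example of reserving space through posix_fallocate() (a standard POSIX call; the path and size below are arbitrary choices for illustration):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Reserve 100 MiB for /btrfsdisk/prealloc.dat without writing data.
     * On a file system that supports fallocate, such as Btrfs, the space
     * is allocated (ideally contiguously) but no bytes are written. */
    int fd = open("/btrfsdisk/prealloc.dat", O_CREAT | O_WRONLY, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    int err = posix_fallocate(fd, 0, 100L * 1024 * 1024);
    if (err != 0)
        fprintf(stderr, "posix_fallocate failed: error %d\n", err);

    close(fd);
    return err ? 1 : 0;
}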

Summary

We have now discussed Btrfs's many features in detail, but Btrfs can offer even more than this: it is still in an experimental development phase and will gain further features.

Btrfs also has an important drawback: when a node in a BTree is damaged, the file system loses all the file information below that node, a problem sometimes called "error diffusion". Ext2/3 avoids this problem.

Even so, I hope you are beginning to agree with me that Btrfs will be the most promising Linux file system of the future.


Introduction to BTRFS Usage

Now that you know Btrfs's features, you will surely want to try it for yourself. This chapter briefly describes how to use Btrfs.

Creating a file system

The mkfs.btrfs command creates a Btrfs file system. The following commands create a Btrfs file system on device sda5 and mount it at /btrfsdisk:

#mkfs.btrfs /dev/sda5
#mkdir /btrfsdisk
#mount -t btrfs /dev/sda5 /btrfsdisk

This sets up a Btrfs file system on device sda5. It is worth noting that, even with only one device, Btrfs by default still keeps redundant copies of the metadata. With multiple devices, RAID can be configured when the file system is created; see the later sections for details.

Here are a few other mkfs.btrfs parameters.

nodesize and leafsize set the size of Btrfs's internal BTree nodes; the default is one page. Users can choose larger nodes to increase fan-out and reduce the height of the tree, which of course only makes sense for very large file systems.

The alloc-start parameter specifies the starting address of the file system on the disk device, which lets the user conveniently reserve some space at the beginning of the disk.

The byte-count parameter sets the size of the file system; the user can use only part of the device and grow the file system later when space runs low.

Modifying the file system size

After the file system is created, its size can be changed. Suppose /dev/sda5 is mounted at /btrfsdisk with a size of 800 MB; to use only 500 MB of it, shrink the current file system with the following commands:

#df
Filesystem   1K-blocks      Used  Available  Use%  Mounted on
/dev/sda1       101086     19000      76867   20%  /boot
/dev/sda5       811248        32     811216    1%  /btrfsdisk
#btrfsctl -r -300m /btrfsdisk
#df
Filesystem   1K-blocks      Used  Available  Use%  Mounted on
/dev/sda1       101086     19000      76867   20%  /boot
/dev/sda5       504148        42     504106    1%  /btrfsdisk

Similarly, the btrfsctl command can be used to increase the size of the file system.

Create Snapshot

In the following example the file system contains two files when snapshot snap1 is created. After the snapshot is taken, test1 is modified. Going into snap1 and opening test1 then shows that its content is still the original content.

#ls /btrfsdisk
test1 test2
#vi test1
This is a test
#btrfsctl -s snap1 /btrfsdisk
#vi test1
test1 is modified
#cd /btrfsdisk/snap1
#cat test1
This is a test

As the example shows, the content preserved in snapshot snap1 is not changed by later write operations.

Create Subvolume

With the btrfsctl command, users can easily create a subvolume. Assuming a Btrfs file system is already mounted at /btrfsdisk, the user can create a new subvolume inside it. The following creates a subvolume sub1 and mounts sub1 at /mnt/test:

#mkdir /mnt/test
#btrfsctl -s sub1 /btrfsdisk
#mount -t btrfs -o subvol=sub1 /dev/sda5 /mnt/test

Subvolumes make it easy for an administrator to create sub-file-systems for different purposes within one file system and give each of them its own configuration; for example, the files under some directories may mainly need to save disk space, so compression is enabled there, or different RAID policies may be applied. Btrfs is still under development: subvolumes and snapshots that have been created cannot yet be deleted, and per-subvolume disk quotas are not yet implemented. As Btrfs matures, these functions will certainly be completed.

Create RAID

Multiple devices can be specified at mkfs time and configured as RAID. The following commands show how to configure RAID1 with mkfs.btrfs: sda6 and sda7 are configured as RAID1, that is, as mirrors. The user can choose to configure the data as RAID1, the metadata as RAID1, or both.

To configure the data as RAID1, use the -d parameter of mkfs.btrfs, as shown below:

#mkfs.btrfs -d raid1 /dev/sda6 /dev/sda7
#mount -t btrfs /dev/sda6 /btrfsdisk

Add a new device

When the devices are close to running out of space, the user can add a new disk device to the file system with the btrfs-vol command, increasing the available storage. The following command adds device /dev/sda8 to the /btrfsdisk file system:

#btrfs-vol -a /dev/sda8 /btrfsdisk

SSD support

The mount option ssd turns on Btrfs's optimizations for SSDs. The command is as follows:

#mount -t btrfs -o ssd /dev/sda5 /btrfsdisk

Enable compression

The mount option compress enables compression. The command is as follows:

#mount -t btrfs -o compress /dev/sda5 /btrfsdisk

Synchronize the file system

For efficiency, Btrfs's IO operations are handled asynchronously by kernel threads, so a user's operations on a file are not reflected on disk immediately. You can try an experiment: create a file on Btrfs, cut the power 5 to 10 seconds later, and after the next reboot the new file will not be there.

For most applications this is not a problem, but sometimes the user wants the IO to be carried out immediately, and then the file system must be synced. The following btrfsctl command syncs the file system:

#btrfsctl -c /btrfsdisk

Debug features

Btrfs provides some debugging facilities. For readers who want to understand how Btrfs works internally, the debug tool will be a favorite instrument. Here is a brief introduction to its use.

The following command prints the metadata of the Btrfs file system on device sda5 to the screen:

#btrfs-debug-tree /dev/sda5

By analysing the printed information, you can see how each BTree inside Btrfs changes and thereby understand the internal implementation details of every file-system feature.

For example, you can print the BTree contents before creating a file and print them again afterwards; comparing the two outputs shows which metadata Btrfs has to modify to create a file, and thus how Btrfs works internally.
