The basic principle of qcow2
qcow2 is a disk image format supported by the QEMU emulator. It represents a fixed-size block device as a file. Compared with a plain raw image, it offers the following features:
- Smaller footprint, even when the file system does not support holes (sparse files);
- Copy-on-write (COW) support: the image file records only the changes made relative to an underlying disk image;
- Snapshot support: an image file can contain the history of multiple snapshots;
- Optional zlib-based compression;
- Optional AES encryption.
The qcow2 image file format header
Each qcow2 file starts with a header stored in big-endian byte order, with the following structure:
Listing 1. Qcow2 Header
typedef struct QCowHeader {
    uint32_t magic;
    uint32_t version;
    uint64_t backing_file_offset;
    uint32_t backing_file_size;
    uint32_t cluster_bits;
    uint64_t size; /* in bytes */
    uint32_t crypt_method;
    uint32_t l1_size;
    uint64_t l1_table_offset;
    uint64_t refcount_table_offset;
    uint32_t refcount_table_clusters;
    uint32_t nb_snapshots;
    uint64_t snapshots_offset;
} QCowHeader;
The following example uses a 10 GB qcow2 file to explain the meaning of each field.
Listing 2. Hexadecimal dump of a qcow2 file
# file 1.cow2
1.cow2: QEMU QCOW Image (v2), 10737418240 bytes
0000000: 5146 49fb 0000 0002 0000 0000 0000 0000  QFI.............
0000010: 0000 0000 0000 0010 0000 0002 8000 0000  ................
0000020: 0000 0000 0000 0014 0000 0000 0003 0000  ................
0000030: 0000 0000 0001 0000 0000 0001 0000 0000  ................
0000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000060: 0000 0004 0000 0068 0000 0000 0000 0000  .......h........
0000070: 0000 0000 0000 0000 0000 0000 0000 0000  ................
The first 4 bytes contain the characters 'Q', 'F', 'I' followed by 0xFB; 5146 49fb in the example is the magic field.
The next 4 bytes contain the version number of the image format; 0000 0002 in the example is the version field, which indicates version 2 (qcow2).
backing_file_offset occupies 8 bytes (0000 0000 0000 0000 in the example) and gives the offset, from the beginning of the file, of a string containing the path to a backing file.
backing_file_size gives the length of that string, which is not null-terminated (0000 0000 in the example). If the image is a copy-on-write image, the string is the path to the original file.
cluster_bits is a 32-bit field (0000 0010 in the example, that is 16, which means 64 KB clusters). It describes how an image offset is mapped to a location in the file: it determines how many low-order bits of the offset are used as the index within a cluster. Because an L2 table occupies a single cluster and contains 8-byte entries, the next cluster_bits - 3 bits are used as the index into the L2 table.
size occupies the next 8 bytes and gives the size, in bytes, of the block device represented by the image file: 0000 0002 8000 0000 in the example, which is 10 GB.
crypt_method is 1 if AES encryption is used (0000 0000 in the example, meaning no encryption).
l1_size (0000 0014) and l1_table_offset (0000 0000 0003 0000) give the number of entries in the L1 table and the offset of the table within the file, respectively.
refcount_table_offset gives the offset of the refcount table (0000 0000 0001 0000), and refcount_table_clusters gives the size of the refcount table in clusters (0000 0001).
nb_snapshots gives the number of snapshots contained in the image (0000 0000), and snapshots_offset gives the offset of the QCowSnapshotHeader entries, one per snapshot (0000 0000 0000 0000).
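As a rough illustration of the field layout described above, the following Python sketch unpacks the first 72 bytes of the dump in Listing 2 with the standard struct module. The names parse_qcow2_header and Qcow2Header are made up for this example; the byte string is copied from the dump, and nothing beyond the fields of Listing 1 is interpreted.

```python
import struct
from collections import namedtuple

# Field order follows Listing 1; ">" selects big-endian with no padding.
HEADER_FORMAT = ">IIQIIQIIQQIIQ"
HEADER_FIELDS = (
    "magic version backing_file_offset backing_file_size cluster_bits size "
    "crypt_method l1_size l1_table_offset refcount_table_offset "
    "refcount_table_clusters nb_snapshots snapshots_offset"
)
Qcow2Header = namedtuple("Qcow2Header", HEADER_FIELDS)  # hypothetical helper type


def parse_qcow2_header(data):
    """Unpack the fixed part of a qcow2 header from the start of `data`."""
    size = struct.calcsize(HEADER_FORMAT)  # 72 bytes
    header = Qcow2Header(*struct.unpack(HEADER_FORMAT, data[:size]))
    if header.magic != 0x514649FB:  # 'Q', 'F', 'I', 0xFB
        raise ValueError("not a qcow image")
    return header


# First 72 bytes of the dump in Listing 2.
EXAMPLE = bytes.fromhex(
    "514649fb00000002" "0000000000000000"
    "0000000000000010" "0000000280000000"
    "0000000000000014" "0000000000030000"
    "0000000000010000" "0000000100000000"
    "0000000000000000"
)

header = parse_qcow2_header(EXAMPLE)
print(header.version, header.cluster_bits, header.size, header.l1_size)
# prints: 2 16 10737418240 20
```

The printed values agree with the field-by-field reading above: version 2, cluster_bits 16, a size of 10737418240 bytes and an L1 table with 20 entries.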
A typical qcow2 image file consists of the following parts:
- The header described above
- The L1 table
- The refcount table
- One or more refcount blocks
- Snapshot headers
- L2 tables
- Data clusters
Two-level lookup
In qcow2, the contents of the disk are stored in clusters (each cluster contains a number of 512-byte sectors). To find the cluster that holds a given address, two tables are consulted in turn: L1, then L2. The L1 table stores a set of offsets to L2 tables, and each L2 table stores a set of offsets to clusters.
A 64-bit address is therefore split into three parts according to the cluster_bits setting. For example, with cluster_bits = 12:
The low 12 bits are the offset within a 4 KB cluster (2^12 = 4 KB);
The next 9 bits are the index into an L2 table with 512 entries;
The remaining 43 bits are the index into the L1 table.
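The split described above is plain bit arithmetic. A minimal Python sketch follows; split_offset and l1_entries_needed are names invented here, and the sketch assumes, as the header description states, that an L2 table fills exactly one cluster with 8-byte entries, so the L2 index is cluster_bits - 3 bits wide.

```python
def split_offset(offset, cluster_bits):
    """Split a 64-bit image offset into (l1_index, l2_index, offset_in_cluster)."""
    l2_bits = cluster_bits - 3  # one cluster full of 8-byte entries
    offset_in_cluster = offset & ((1 << cluster_bits) - 1)
    l2_index = (offset >> cluster_bits) & ((1 << l2_bits) - 1)
    l1_index = offset >> (cluster_bits + l2_bits)
    return l1_index, l2_index, offset_in_cluster


def l1_entries_needed(disk_size, cluster_bits):
    """Number of L1 entries needed to cover `disk_size` bytes."""
    bytes_per_l2_table = 1 << (cluster_bits + cluster_bits - 3)
    return -(-disk_size // bytes_per_l2_table)  # ceiling division


# cluster_bits = 12: 12 offset bits, 9 L2 bits, 43 L1 bits.
print(split_offset(0x203456, 12))          # prints: (1, 3, 1110)
# Values from the example header: 10 GB disk, cluster_bits = 16.
print(l1_entries_needed(10737418240, 16))  # prints: 20
```

The second result, 20, matches the l1_size value 0000 0014 in the example header, which is a useful sanity check on the arithmetic.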
To find the location in the file for a given 64-bit address:
- Get the address of the L1 table from the l1_table_offset field in the header
- Use the top (64 - l2_bits - cluster_bits) bits of the address to index the L1 table
- The offset stored in that L1 entry gives the address of the L2 table
- Use the next l2_bits bits of the address to index the L2 table and get a 64-bit table entry
- The offset stored in that L2 entry gives the address of the cluster
- Use the remaining cluster_bits bits of the address as the index within the cluster to reach the data
If the offset in the L1 entry or in the L2 entry is zero, that area has not yet been allocated in the image file.
Note that the top two bits of the offsets in the L1 and L2 tables are reserved for the 'copied' and 'compressed' flags.
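The lookup steps can be sketched in Python as follows. This is an illustration under stated assumptions, not QEMU's implementation: lookup_host_offset, read_entry and build_demo_image are hypothetical names, the demo bytes are a synthetic layout rather than a valid qcow2 file, the flag set in the demo entries is assumed to be bit 63, and compressed clusters (whose entries need different handling) are not covered.

```python
import io
import struct

OFFSET_MASK = (1 << 62) - 1  # clears the top two (flag) bits of a table entry


def read_entry(image, position):
    """Read one big-endian 64-bit table entry at `position`."""
    image.seek(position)
    return struct.unpack(">Q", image.read(8))[0]


def lookup_host_offset(image, l1_table_offset, l1_size, cluster_bits, offset):
    """Return the file offset that backs `offset`, or None if unallocated."""
    l2_bits = cluster_bits - 3
    l1_index = offset >> (cluster_bits + l2_bits)
    l2_index = (offset >> cluster_bits) & ((1 << l2_bits) - 1)
    in_cluster = offset & ((1 << cluster_bits) - 1)
    if l1_index >= l1_size:
        raise ValueError("offset beyond the end of the L1 table")
    l2_table = read_entry(image, l1_table_offset + 8 * l1_index) & OFFSET_MASK
    if l2_table == 0:
        return None  # no L2 table allocated for this range
    cluster = read_entry(image, l2_table + 8 * l2_index) & OFFSET_MASK
    if cluster == 0:
        return None  # cluster not allocated in this image
    return cluster + in_cluster


def build_demo_image():
    """Synthetic 4-cluster layout with cluster_bits = 12: L1 table at 0x1000,
    one L2 table at 0x2000, one data cluster at 0x3000."""
    buf = bytearray(4 * 4096)
    flag = 1 << 63  # assumed flag bit, set to show that it gets masked off
    struct.pack_into(">Q", buf, 0x1000, 0x2000 | flag)      # L1[0] -> L2 table
    struct.pack_into(">Q", buf, 0x2000 + 8, 0x3000 | flag)  # L2[1] -> data cluster
    return io.BytesIO(bytes(buf))


demo = build_demo_image()
print(lookup_host_offset(demo, 0x1000, 1, 12, 0x1234))  # prints: 12852 (0x3234)
print(lookup_host_offset(demo, 0x1000, 1, 12, 0x5000))  # prints: None
```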
Copy-on-write image files
A qcow2 image can be used to store the changes made relative to another image file. The original image file is never modified; the new file records only the differences from it. Such a file is called a copy-on-write image. Although it is a separate file, most of its data comes from the original image, and only the clusters that have changed compared with the original image are recorded in it.
This is easy to implement: the path of the original file is recorded in the header. To read a cluster from a copy-on-write image, first check whether that area is already allocated in the image file; if it is not, read it from the original file.
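The read path described above can be modelled with a toy overlay in Python. The CowImage class is purely illustrative: it keeps changed clusters in a dictionary instead of an on-disk L1/L2 structure, and the backing image is just a bytes object.

```python
class CowImage:
    """Toy copy-on-write overlay: only changed clusters are stored here."""

    def __init__(self, backing, cluster_size):
        self.backing = backing            # original image contents, never modified
        self.cluster_size = cluster_size
        self.clusters = {}                # cluster index -> data written to the overlay

    def read_cluster(self, index):
        if index in self.clusters:        # already allocated in the overlay
            return self.clusters[index]
        start = index * self.cluster_size
        return self.backing[start:start + self.cluster_size]  # fall back to the original

    def write_cluster(self, index, data):
        if len(data) != self.cluster_size:
            raise ValueError("this sketch writes exactly one full cluster")
        self.clusters[index] = bytes(data)


original = b"A" * 8 + b"B" * 8            # two 8-byte clusters
overlay = CowImage(original, cluster_size=8)
overlay.write_cluster(1, b"C" * 8)
print(overlay.read_cluster(0), overlay.read_cluster(1))
# prints: b'AAAAAAAA' b'CCCCCCCC'
```

Cluster 0 is still served from the original data, while cluster 1 comes from the overlay; the original bytes are left untouched.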
Snapshots
Snapshots are somewhat similar to copy-on-write files, with the difference that the image holding the snapshot remains writable. A snapshot is stored inside the original file itself (an internal snapshot), so the file contains both the state from before the snapshot and the part that is still being written.
Each snapshot contains the following header structure:
Listing 3. Qcow2 Snapshot Header
typedef struct QCowSnapshotHeader {
    /* header is 8 byte aligned */
    uint64_t l1_table_offset;
    uint32_t l1_size;
    uint16_t id_str_size;
    uint16_t name_size;
    uint32_t date_sec;
    uint32_t date_nsec;
    uint64_t vm_clock_nsec;
    uint32_t vm_state_size;
    uint32_t extra_data_size; /* for extension */
    /* extra data follows */
    /* id_str follows */
    /* name follows */
} QCowSnapshotHeader;
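A snapshot entry with this layout could be read roughly as in the Python sketch below. It assumes big-endian fields like the main header, follows the order given in the comments of Listing 3 (extra data, then id_str, then name), and assumes the buffer starts at an 8-byte-aligned file offset. parse_snapshot is a hypothetical name and the sample entry is synthetic.

```python
import struct

SNAPSHOT_FORMAT = ">QIHHIIQII"  # field order follows Listing 3, big-endian assumed
SNAPSHOT_FIXED_SIZE = struct.calcsize(SNAPSHOT_FORMAT)  # 40 bytes


def parse_snapshot(data, position):
    """Parse one snapshot entry; return (fields, position of the next entry)."""
    (l1_table_offset, l1_size, id_str_size, name_size, date_sec, date_nsec,
     vm_clock_nsec, vm_state_size, extra_data_size) = struct.unpack_from(
        SNAPSHOT_FORMAT, data, position)
    cursor = position + SNAPSHOT_FIXED_SIZE + extra_data_size  # skip extra data
    id_str = data[cursor:cursor + id_str_size].decode("utf-8")
    cursor += id_str_size
    name = data[cursor:cursor + name_size].decode("utf-8")
    cursor += name_size
    next_position = (cursor + 7) & ~7  # each header is 8-byte aligned
    fields = {
        "l1_table_offset": l1_table_offset,
        "l1_size": l1_size,
        "id_str": id_str,
        "name": name,
        "date_sec": date_sec,
        "vm_state_size": vm_state_size,
    }
    return fields, next_position


# Synthetic entry: id "1", name "clean", no extra data, padded to 8 bytes.
entry = struct.pack(SNAPSHOT_FORMAT, 0x50000, 20, 1, 5, 0, 0, 0, 0, 0)
entry += b"1" + b"clean" + b"\x00\x00"

fields, next_position = parse_snapshot(entry, 0)
print(fields["id_str"], fields["name"], next_position)
# prints: 1 clean 48
```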
Other features of qcow2
The qcow2 format supports compression: each cluster can be compressed independently with zlib. It also supports encryption with a 128-bit AES key.
Creating qcow2 and raw files and comparing the two images
Create the files with the qemu-img tool that ships with the QEMU package.
Listing 4. Creating Qcow2 and Raw files
$ qemu-img create -f qcow2 test.qcow2 10G
Formatting 'test.qcow2', fmt=qcow2 size=10737418240 encryption=off cluster_size=65536 lazy_refcounts=off
$ qemu-img create -f raw test.raw 10G
Formatting 'test.raw', fmt=raw size=10737418240
Compare the apparent size of the two files with the disk space they actually occupy:
Listing 5. Qcow2 and Raw file space usage comparison
$ ll -sh test.*
200K -rw-r--r-- 1 qiaoliyong qiaoliyong 193K May  6 10:29 test.qcow2
   0 -rw-r--r-- 1 qiaoliyong qiaoliyong  10G May  6 10:28 test.raw
$ stat test.raw
  File: "test.raw"
  Size: 10737418240  Blocks: 0  IO Block: 4096  regular file
$ stat test.qcow2
  File: "test.qcow2"
  Size: 197120  Blocks: 4096  IO Block:  regular file
The comparison shows that the qcow2 image file is 197120 bytes in size and occupies 200K of disk space. The raw file, although its apparent size is 10G, occupies no disk space at all: it is an empty (sparse) file.
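The difference between apparent size and allocated space can also be observed from Python with os.stat. The sketch below creates a small file consisting only of a hole and reports both numbers. It assumes a Linux-like system where st_blocks counts 512-byte units and a file system that supports sparse files; disk_usage is a name chosen for this example.

```python
import os
import tempfile


def disk_usage(path):
    """Return (apparent size, bytes actually allocated) for `path`."""
    info = os.stat(path)
    return info.st_size, info.st_blocks * 512  # st_blocks: 512-byte units on Linux


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "sparse.raw")
    with open(path, "wb") as raw:
        raw.truncate(10 * 1024 * 1024)  # 10 MiB hole, no data written
    apparent, allocated = disk_usage(path)

print(apparent, allocated)
```

On a file system with sparse-file support the second number is expected to be much smaller than the first, which mirrors the test.raw line in the listing above.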
Converting between raw and qcow2
The qemu-img tool provided in the QEMU package supports a number of common image operations.
The command to convert a raw image to qcow2 format is as follows:
$ qemu-img convert -f raw -O qcow2 test.raw test.raw.qcow2
$ ll -sh test.*
200K -rw-r--r-- 1 qiaoliyong qiaoliyong 193K May  6 10:29 test.qcow2
   0 -rw-r--r-- 1 qiaoliyong qiaoliyong  10G May  6 10:28 test.raw
200K -rw-r--r-- 1 qiaoliyong qiaoliyong 193K May  6 10:44 test.raw.qcow2
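When the conversion has to be scripted, the same command line can be assembled and run from Python. This is only a thin wrapper sketch: convert_command and convert are hypothetical helper names, the -f (source format) and -O (output format) options are the ones qemu-img convert accepts, and actually running convert requires qemu-img on the PATH.

```python
import shutil
import subprocess


def convert_command(source, target, source_format="raw", target_format="qcow2"):
    """Build the qemu-img argument list for a format conversion."""
    return ["qemu-img", "convert", "-f", source_format, "-O", target_format,
            source, target]


def convert(source, target, source_format="raw", target_format="qcow2"):
    """Run the conversion; raises if qemu-img is missing or the command fails."""
    if shutil.which("qemu-img") is None:
        raise RuntimeError("qemu-img not found on PATH")
    command = convert_command(source, target, source_format, target_format)
    subprocess.run(command, check=True)


print(" ".join(convert_command("test.raw", "test.raw.qcow2")))
# prints: qemu-img convert -f raw -O qcow2 test.raw test.raw.qcow2
```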
Performance comparison of the two formats
Table 1. Performance of the three image formats with IDE as the virtual disk driver
| cache =                | off       | writethrough | writeback |
|------------------------|-----------|--------------|-----------|
| Old qcow2 (0.10.5)     | 16:52 min | 28:58 min    | 6:02 min  |
| New qcow2 (0.11.0-rc1) | 5:44 min  | 9:18 min     | 6:11 min  |
| Raw                    | 5:41 min  | 7:24 min     | 6:03 min  |
Table 2. Performance of the three image formats with virtio as the virtual disk driver
| cache =                | off       | writeback |
|------------------------|-----------|-----------|
| Old qcow2 (0.10.5)     | 31:09 min | 8:00 min  |
| New qcow2 (0.11.0-rc1) | 18:35 min | 8:41 min  |
| Raw                    | 8:48 min  | 7:51 min  |
Summary
This article described the format and features of qcow2, the image file format used by QEMU virtual machines, and compared it with raw images. A qcow2 image is somewhat slower than a raw image, mainly while the file is growing, because qcow2 needs extra time to allocate clusters. In return, a qcow2 image is smaller than a raw one: the file grows only as the virtual machine actually uses disk space, which reduces the amount of data to transfer during migration and makes the format a better fit for cloud computing systems. It also offers encryption, compression and snapshots, features that the raw format lacks.
Transferred from: http://www.ibm.com/developerworks/cn/linux/1409_qiaoly_qemuimgages/
Image files used by QEMU: Qcow2 and Raw