Parsing the hive file format in Windows Registry
Author: Fahrenheit
Introduction

Most readers will already be familiar with the Windows Registry. The Registry Editor (Regedit) that ships with the system lets you view and modify registry data. As figure 1 shows, the registry is presented as root keys, subkeys, values, and data. Data comes in several types, including REG_SZ (string), REG_BINARY (binary), REG_DWORD (double word), REG_MULTI_SZ (multi-string), and REG_EXPAND_SZ (expandable string).

The registry serves as the data store for 32-bit hardware and drivers and for 32-bit applications on Windows; it is a database of system information. Because it is a database, it must be backed by files on disk. On Windows 2000/XP, the files that hold system settings and the default user configuration are stored in the system32\config directory under the system folder: default, SAM, SECURITY, software, userdiff, and system. Per-user configuration is stored under the \Documents and Settings\ directory on the system disk, in ntuser.dat, ntuser.ini, and ntuser.dat.log. The path of each file is recorded in the values under the registry key HKLM\SYSTEM\CurrentControlSet\Control\hivelist.

The registry structure shown by the Registry Editor is the result of the editor reading and assembling this data. On disk the registry is not one large file but a group of independent files called hives. Each hive file can be thought of as a separate registry tree. Like the PE format on Windows, a hive has its own internal organization. The goal of this article is to analyze how hive files are organized and to write a program that parses the hive format.

How registry APIs work

Windows provides a large set of APIs for reading and modifying registry data, and Regedit is built on them. The registry APIs fall into two groups: user-mode and kernel-mode. Applications normally call the user-mode API; the call is passed down layer by layer to the kernel registry API, which goes through the file system driver to access the hive file on disk and return the requested data. The path is long, but losing some performance is a worthwhile price for keeping registry data safe.

Hive Structure Analysis

Before looking at a real hive file, here are the main features of the format. Listing them first makes the file organization and data structures easier to follow.
• The registry consists of multiple hive files.
• A hive file consists of multiple bins. The hive file starts with a file header (the base block), which holds global information about the hive.
• A bin consists of multiple cells. Cells come in five types (described later), each storing a different kind of registry data.
The original Chinese text of this article replaces hive, bin, and cell with Chinese terms that translate literally as "nest", "nest box", and "nest room". This version uses the English terms hive, bin, and cell.

A hive is divided into allocation units called blocks, much as a file system divides a disk into clusters. By definition, a registry block is 4096 bytes (4 KB). When new data is added to a hive, the hive always grows in whole blocks. The first block of a hive is the base block. It holds global information about the hive: the signature "regf", the update sequence numbers, the timestamp of the last write to the hive, the hive format version number, a checksum, and the internal file name of the hive file. The _HBASE_BLOCK structure below is a reconstruction of the base block.
typedef struct _HBASE_BLOCK
{
    ULONG Signature;          /* ASCII "regf" = 0x66676572 (little-endian) */
    ULONG Sequence1;
    ULONG Sequence2;
    LARGE_INTEGER TimeStamp;  /* timestamp of the last write operation */
    ULONG Major;              /* major version number */
    ULONG Minor;              /* minor version number */
    ULONG Type;
    ULONG Format;
    ULONG RootCell;           /* offset of the first key record */
    ULONG Length;             /* length of the data blocks */
    ULONG Cluster;
    UCHAR Name[64];           /* hive file name */
    ULONG Reserved1[99];
    ULONG CheckSum;           /* checksum */
    ULONG Reserved2[894];     /* pads the structure to 4096 bytes */
    ULONG BootType;
    ULONG BootRecover;
} HBASE_BLOCK, *PHBASE_BLOCK;
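
As a rough illustration of how a parser might sanity-check the base block, the sketch below verifies the "regf" signature and the checksum field. The field offset 0x1FC follows from the structure above. The checksum rule used here (XOR of the 127 ULONGs before the checksum field, with 0 mapped to 1 and 0xFFFFFFFF mapped to 0xFFFFFFFE) is the one commonly described in third-party format notes, not something stated in this article, so treat it as an assumption. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HBLOCK_SIZE        4096u   /* size of one registry block */
#define HBASE_CHECKSUM_OFF 0x1FCu  /* byte offset of CheckSum in _HBASE_BLOCK */

/* Read a little-endian 32-bit value without relying on host byte order. */
static uint32_t rd_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Assumed algorithm: XOR of the 127 ULONGs that precede CheckSum. */
static uint32_t base_block_checksum(const uint8_t *block)
{
    uint32_t sum = 0;
    size_t off;

    assert(block != NULL);
    for (off = 0; off < HBASE_CHECKSUM_OFF; off += 4)
        sum ^= rd_u32(block + off);
    if (sum == 0xFFFFFFFFu)
        sum = 0xFFFFFFFEu;
    if (sum == 0)
        sum = 1;
    return sum;
}

/* Returns 1 if buf starts with a plausible base block, otherwise 0. */
static int base_block_is_valid(const uint8_t *buf, size_t len)
{
    if (buf == NULL || len < HBLOCK_SIZE)
        return 0;
    if (memcmp(buf, "regf", 4) != 0)
        return 0;
    return rd_u32(buf + HBASE_CHECKSUM_OFF) == base_block_checksum(buf);
}
```

Reading fields byte by byte, rather than casting the buffer to the structure, sidesteps compiler padding and host byte order.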

Windows organizes the registry data stored in a hive into containers called cells. When a cell is added to a hive and the hive must grow to hold it, the system creates an allocation unit called a bin. A bin is the size of the new cell rounded up to the next block boundary. The system treats any space between the end of the cell and the end of the bin as free space that can be allocated to other cells.
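
The rounding rule can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes the bin must also hold a 32-byte bin header in front of the cell, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define HBLOCK_SIZE      4096u  /* size of one registry block */
#define HBIN_HEADER_SIZE 32u    /* assumed size of the on-disk bin header */

/* Size of a new bin that holds one cell of cell_size bytes,
   rounded up to the next block boundary. */
static uint32_t bin_size_for_cell(uint32_t cell_size)
{
    uint32_t need;

    assert(cell_size <= 0x7FFFFFFFu - HBIN_HEADER_SIZE);
    need = cell_size + HBIN_HEADER_SIZE;
    return (need + HBLOCK_SIZE - 1u) / HBLOCK_SIZE * HBLOCK_SIZE;
}

/* Free space left between the end of the cell and the end of the bin. */
static uint32_t bin_free_space(uint32_t cell_size)
{
    return bin_size_for_cell(cell_size) - HBIN_HEADER_SIZE - cell_size;
}
```

Under these assumptions a 100-byte cell produces a 4096-byte bin with 3964 bytes left over for other cells.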

A bin also has a header. It contains the signature "hbin", a field that records the offset of the bin within the hive file, and the size of the bin. The bin header structure is as follows.
typedef struct _HBIN
{
    ULONG Signature;          /* ASCII "hbin" = 0x6E696268 (little-endian) */
    ULONG FileOffset;         /* offset of the start of this bin relative to the first bin */
    ULONG Size;               /* size of this bin */
    ULONG Reserved1[2];
    LARGE_INTEGER TimeStamp;
    ULONG Spare;
} HBIN, *PHBIN;
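
A parser can use the bin header to step from one bin to the next. The following sketch walks all bins in a hive image held in memory. It assumes that the first bin starts right after the 4096-byte base block and that the Length field of the base block (byte offset 0x28 in the layout above) gives the total size of all bins. The function name count_bins is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HBLOCK_SIZE      4096u  /* size of one registry block */
#define HBIN_HEADER_SIZE 32u    /* bytes in the on-disk bin header */
#define HBASE_LENGTH_OFF 0x28u  /* byte offset of Length in the base block */

/* Read a little-endian 32-bit value without relying on host byte order. */
static uint32_t rd_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the number of bins in the image, or -1 if a header is malformed. */
static long count_bins(const uint8_t *hive, size_t len)
{
    uint64_t off = HBLOCK_SIZE, end;
    long bins = 0;

    assert(hive != NULL);
    if (len < HBLOCK_SIZE)
        return -1;
    end = (uint64_t)HBLOCK_SIZE + rd_u32(hive + HBASE_LENGTH_OFF);
    if (end > len)
        return -1;

    while (off < end) {
        const uint8_t *bin = hive + off;
        uint32_t size;

        if (end - off < HBIN_HEADER_SIZE || memcmp(bin, "hbin", 4) != 0)
            return -1;
        if (rd_u32(bin + 4) != off - HBLOCK_SIZE)  /* FileOffset must match */
            return -1;
        size = rd_u32(bin + 8);                    /* Size, a multiple of 4096 */
        if (size == 0 || size % HBLOCK_SIZE != 0 || size > end - off)
            return -1;
        off += size;
        bins++;
    }
    return bins;
}
```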

A cell can hold a key, a value, a security descriptor, a list of subkeys, or a list of values, and there is a cell type for each. The data in a cell begins with a field that identifies what kind of data the cell holds. The data structures are as follows:

• A key cell contains a registry key (also called a key node). It holds a signature ("nk" on disk for a key, "lk" for a symbolic link), the timestamp of the last update to the key, the cell index of the parent key's cell, the cell index of the subkey list cell that identifies the key's subkeys, the cell index of the key's security descriptor cell, the cell index of a string cell that holds the key's class name, and the name of the key.
typedef struct _CM_KEY_NODE
{
    USHORT Signature;             /* ASCII "nk" on disk = 0x6B6E (little-endian) */
    USHORT Flags;                 /* 0x2C for the root key, 0x20 for other keys */
    LARGE_INTEGER LastWriteTime;
    ULONG Spare;
    ULONG Parent;                 /* offset of the parent key */
    ULONG SubKeyCounts[2];        /* SubKeyCounts[0] is the number of subkeys */
    union                         /* union at offset 0x001C */
    {
        struct
        {
            ULONG SubKeyLists[2]; /* SubKeyLists[0] is the offset of the subkey list relative to the first bin */
            CHILD_LIST ValueList; /* value list structure */
        };
        ULONG ChildHiveReference[4];
    };

    ULONG Security;               /* offset of the security descriptor record */
    ULONG Class;                  /* offset of the class name */
    ULONG MaxNameLen : 16;
    ULONG UserFlags : 4;
    ULONG VirtControlFlags : 4;
    ULONG Debug : 8;
    ULONG MaxClassLen;
    ULONG MaxValueNameLen;
    ULONG MaxValueDataLen;
    ULONG WorkVar;
    USHORT NameLength;            /* key name length */
    USHORT ClassLength;           /* class name length */
    PBYTE Name;                   /* key name */
} CM_KEY_NODE, *PCM_KEY_NODE;
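
To show how these fields might be pulled out of raw bytes, here is a minimal sketch that reads a key node from a buffer starting at its signature. The byte offsets are derived by adding up the field sizes in the structure above, on the assumption that the fields are laid out without padding and that the name bytes begin at offset 0x4C. The key_info type and parse_key_node function are hypothetical helpers, not part of the author's program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte offsets inside a key node, derived from the structure layout. */
enum {
    NK_FLAGS = 0x02, NK_PARENT = 0x10, NK_SUBKEY_COUNT = 0x14,
    NK_SUBKEY_LIST = 0x1C, NK_VALUE_COUNT = 0x24, NK_VALUE_LIST = 0x28,
    NK_NAME_LENGTH = 0x48, NK_NAME = 0x4C
};

typedef struct {
    uint16_t flags;
    uint32_t parent, subkey_count, subkey_list, value_count, value_list;
    uint16_t name_length;
    const uint8_t *name;   /* raw name bytes; the encoding is not interpreted */
} key_info;

static uint16_t rd_u16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fills *out from a key node that starts at rec; returns 1 on success. */
static int parse_key_node(const uint8_t *rec, size_t avail, key_info *out)
{
    assert(rec != NULL && out != NULL);
    if (avail < NK_NAME || memcmp(rec, "nk", 2) != 0)
        return 0;
    out->flags        = rd_u16(rec + NK_FLAGS);
    out->parent       = rd_u32(rec + NK_PARENT);
    out->subkey_count = rd_u32(rec + NK_SUBKEY_COUNT);
    out->subkey_list  = rd_u32(rec + NK_SUBKEY_LIST);
    out->value_count  = rd_u32(rec + NK_VALUE_COUNT);
    out->value_list   = rd_u32(rec + NK_VALUE_LIST);
    out->name_length  = rd_u16(rec + NK_NAME_LENGTH);
    if (out->name_length > avail - NK_NAME)
        return 0;          /* name would run past the buffer */
    out->name = rec + NK_NAME;
    return 1;
}
```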

• A value cell contains information about one value of a key. It holds the signature "vk", the type of the value (for example REG_DWORD or REG_BINARY), and the name of the value. A value cell also holds the cell index of the cell that contains the value's data.
typedef struct _CM_KEY_VALUE
{
    WORD Signature;    /* ASCII "vk" on disk = 0x6B76 (little-endian) */
    WORD NameLength;   /* name length */
    ULONG DataLength;  /* data length */
    ULONG Data;        /* data offset or the data itself: if the highest bit of DataLength
                          is 1, this field holds the data and DataLength & 0x7FFFFFFF is
                          the data length; otherwise it holds the offset of the data */
    ULONG Type;        /* value type */
    WORD Flags;
    WORD Spare;
    PWCHAR Name;       /* value name */
} CM_KEY_VALUE, *PCM_KEY_VALUE;
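
The high-bit rule on the data length field is easy to get wrong, so a short sketch may help. It reads a value record from a buffer that starts at the signature and reports whether the data is stored inline in the data field or in a separate cell. The offsets come from the structure layout above. The value_info type and parse_key_value function are hypothetical, and rejecting inline lengths above 4 bytes is an extra sanity check added here because the data field is only 4 bytes wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte offsets inside a value record, derived from the structure layout. */
enum { VK_DATA_LENGTH = 0x04, VK_DATA = 0x08, VK_TYPE = 0x0C, VK_NAME = 0x14 };

typedef struct {
    uint32_t type;       /* REG_* type code */
    uint32_t length;     /* data length in bytes, high bit cleared */
    int      is_inline;  /* 1: the data field holds the data itself */
    uint32_t data;       /* inline data, or the offset of the data cell */
} value_info;

static uint32_t rd_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fills *out from a value record that starts at rec; returns 1 on success. */
static int parse_key_value(const uint8_t *rec, size_t avail, value_info *out)
{
    uint32_t raw_len;

    assert(rec != NULL && out != NULL);
    if (avail < VK_NAME || memcmp(rec, "vk", 2) != 0)
        return 0;
    raw_len        = rd_u32(rec + VK_DATA_LENGTH);
    out->is_inline = (raw_len & 0x80000000u) != 0;
    out->length    = raw_len & 0x7FFFFFFFu;
    out->data      = rd_u32(rec + VK_DATA);
    out->type      = rd_u32(rec + VK_TYPE);
    if (out->is_inline && out->length > 4)
        return 0;        /* inline data cannot exceed the 4-byte field */
    return 1;
}
```

A REG_DWORD value is the typical inline case: its 4 bytes fit in the data field, so no separate data cell is needed.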

• A subkey list cell consists of a series of cell indexes of key cells, which are all subkeys of the same parent key.
typedef struct _CM_KEY_INDEX
{
    WORD Signature;
    WORD Count;
    ULONG List[1];
} CM_KEY_INDEX, *PCM_KEY_INDEX;
If Signature is CM_KEY_FAST_LEAF (signature "lf") or CM_KEY_HASH_LEAF (signature "lh"), each element of List is a structure:
struct
{
    ULONG Offset;
    ULONG HashKey;
};
Otherwise each element is: ULONG Offset;
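
The two element layouts differ only in stride, which the sketch below makes explicit: 8 bytes per element for CM_KEY_FAST_LEAF and CM_KEY_HASH_LEAF lists, 4 bytes for any other signature, as the text describes. The numeric values 0x666C and 0x686C are the little-endian readings of the on-disk bytes "lf" and "lh". The function name subkey_offset and the SUBKEY_INVALID marker are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CM_KEY_FAST_LEAF 0x666Cu      /* bytes "lf" on disk */
#define CM_KEY_HASH_LEAF 0x686Cu      /* bytes "lh" on disk */
#define SUBKEY_INVALID   0xFFFFFFFFu  /* hypothetical error marker */

static uint16_t rd_u16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the cell offset stored in element i of a subkey list that
   starts at list, or SUBKEY_INVALID if i is out of range. */
static uint32_t subkey_offset(const uint8_t *list, size_t avail, uint16_t i)
{
    uint16_t sig, count;
    size_t stride, pos;

    assert(list != NULL);
    if (avail < 4)
        return SUBKEY_INVALID;
    sig   = rd_u16(list);
    count = rd_u16(list + 2);
    /* lf/lh elements are {Offset, HashKey}; other lists hold bare offsets. */
    stride = (sig == CM_KEY_FAST_LEAF || sig == CM_KEY_HASH_LEAF) ? 8 : 4;
    if (i >= count)
        return SUBKEY_INVALID;
    pos = 4 + (size_t)i * stride;
    if (pos + 4 > avail)
        return SUBKEY_INVALID;
    return rd_u32(list + pos);
}
```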

• A value list cell consists of a series of cell indexes of value cells, which are all values of the same parent key. It is referenced through the structure mentioned above, the ValueList field in the union of _CM_KEY_NODE.
typedef struct _CHILD_LIST
{
    ULONG Count;   /* ValueList.Count: number of values */
    ULONG List;    /* ValueList.List: offset of the value list relative to the first bin */
} CHILD_LIST, *PCHILD_LIST;

• A security descriptor cell contains a security descriptor. Its header holds the signature "sk" and a reference count that records how many key nodes share the security descriptor. Multiple key cells can share one security descriptor cell.
typedef struct _CM_KEY_SECURITY
{
    WORD Signature;                          /* ASCII "sk" on disk = 0x6B73 (little-endian) */
    WORD Reserved;
    ULONG Flink;                             /* offset of the previous "sk" record */
    ULONG Blink;                             /* offset of the next "sk" record */
    ULONG ReferenceCount;                    /* reference count */
    ULONG DescriptorLength;                  /* size of the descriptor data */
    SECURITY_DESCRIPTOR_RELATIVE Descriptor; /* descriptor data */
} CM_KEY_SECURITY, *PCM_KEY_SECURITY;

The structure of a hive is built from links called cell indexes. A cell index is the offset of a cell within the hive file, so it acts like a pointer from one cell to another; the Configuration Manager interprets a cell index as an offset relative to the start of the hive. For example, to find the key cell of subkey A whose parent key is B, first use the subkey list cell index stored in B's key cell to locate the cell that lists all of B's subkeys, then follow the subkey cell indexes in that list to each of B's subkey cells until A is found.
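
Following a cell index in a file image comes down to one addition plus bounds checks. The sketch below rests on two facts that come from common third-party descriptions of the on-disk format rather than from this article: cell indexes in the file are counted from the first bin, which starts after the 4096-byte base block, and every cell begins with a 4-byte signed size that is negative while the cell is in use. The value 0xFFFFFFFF is treated as "no cell". The function name cell_data is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HBLOCK_SIZE 4096u        /* the first bin starts after the base block */
#define HCELL_NIL   0xFFFFFFFFu  /* cell index meaning "no cell" */

static uint32_t rd_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Resolves a cell index to a pointer to the cell's data (just past the
   assumed 4-byte size field) and stores the data length in *data_len.
   Returns NULL if the index or the size is out of range. */
static const uint8_t *cell_data(const uint8_t *hive, size_t len,
                                uint32_t index, uint32_t *data_len)
{
    uint64_t off;
    uint32_t raw, size;

    assert(hive != NULL && data_len != NULL);
    if (index == HCELL_NIL)
        return NULL;
    off = (uint64_t)HBLOCK_SIZE + index;
    if (off + 4 > len)
        return NULL;
    raw  = rd_u32(hive + off);
    size = (raw & 0x80000000u) ? (0u - raw) : raw;  /* magnitude of the size */
    if (size < 4 || size > len - off)
        return NULL;
    *data_len = size - 4;
    return hive + off + 4;
}
```

With a helper like this, the lookup described above becomes a chain of calls: resolve B's key cell, read its subkey list index, resolve the list cell, then resolve each subkey offset in turn.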

The differences between cells, bins, and blocks can be confusing, so consider the simple hive layout shown in figure 5. The example contains a base block and two bins. The first bin is empty, and the second bin contains several cells. The hive has two keys, Root and subkey. Root has two values, val1 and val2. A subkey list cell locates the subkey of the root key, and a value list cell locates the root key's values. The free space in the second bin is held in empty cells.

Obtain hive files

Now that we know where the hives are stored, the natural next step is to grab them and dissect them one by one. However, if you try to open or copy C:\Windows\system32\config\system directly, you get the error message shown in figure 6, because the system holds the file for exclusive access.

Capturing the files looks tricky but is easy to solve. Hives are critical resources, and Windows keeps them open exclusively from the moment it starts. One way around this is to boot a different operating system on the same machine, such as Linux; the hive files of the Windows installation are then freely accessible. Accessing the hive files from within the running system instead requires going through the file system driver, which is beyond the scope of this article.
For learning purposes we want a hive file that is simple enough for its structure to be seen clearly. The system's own hive files are generally unsuitable because they run to megabytes in size, so we start by building a small hive.
We created a subkey test_root under HKLM\SAM, created two subkeys 1test and 2 test under test_root, created five values of different types under 1test, and filled in data for each. We then wrote a small program that calls the RegSaveKey function to save test_root as a hive file, which makes test_root the root key of that hive file. The program is included in the attachment to this article. With this file we can analyze the structure and function of each part of a hive.

Analysis of hive format instances

Open the test_root file in a hexadecimal editor. The first thing visible is the base block signature, the string "regf". It is an abbreviation of "registry file" and marks the file as a registry file...

This section uses figures and examples to explain in detail a registry hive file named test_root. For details, see the magazine.

Hive file Reader

Based on the structures and analysis above, we can write a program that reads hive files; it is included in the attachment to this article. As figure 13 shows, test_root has the subkeys 1test_subkey and 2 test. The former has five values: REG_SZ, 2_reg_binary, 3_reg_dword, 4_reg_mulit_sz, and 5_reg_expand_sz. In the output, [] encloses the type of the value and () encloses the length of its data in bytes; for REG_SZ, the data is printed as its Unicode encoding. Figure 14 shows the result of parsing the software hive file from the local machine. The first entry is the registry information of 360 Security Guard.

Postscript

Most of the hive data formats in this article were gathered from the Internet and organized by the author. Because there is no official Microsoft documentation, we cannot guarantee that the parsing program is correct in all circumstances. Petter Nordahl-Hagen wrote an NT registry hive access library, but it is somewhat dated; Windows may change these structures between versions, and the library no longer works on some current XP systems. The registry has many security applications. To learn more, you can use WinDbg to inspect the kernel data structures that Windows exports and deepen your understanding of how registry files are organized. Readers are welcome to get in touch to discuss, learn, and improve together.
