PE file infection and memory residency

Source: Internet
Author: User

This time we discuss virus infection techniques. Starting with this article, we will also gradually meet some more advanced virus coding techniques, such as memory residency, EPO (entry-point obscuring), encryption, polymorphism, and metamorphism. These techniques give a deeper view of how viruses work and of the ideas and programming skills behind them.

Once the starting directory has been determined, FindFirstFile and FindNextFile can be used to traverse all files and directories beneath it. The traversal can use the common depth-first or breadth-first search algorithms, and can be implemented recursively or non-recursively. Recursive search consumes stack space and could in principle exhaust the stack and raise an exception, although this rarely happens in real applications. Non-recursive search does not have this problem, but its code is slightly more complicated. In practice, recursive traversal is used most often. To search all files, pass "*.*" as the first parameter of FindFirstFile, then check the dwFileAttributes member of the WIN32_FIND_DATA structure in each result to see whether the entry is a directory. If it is a directory, continue traversing into it; otherwise, use the file name in the cFileName member of WIN32_FIND_DATA to check whether the file has one of the target extensions and decide whether to infect it. The following code recursively searches a directory and all its subdirectories:

#include <windows.h>
#include <stdio.h>

void enum_path(const char *cpath) {
    WIN32_FIND_DATAA wfd;
    HANDLE hfd;
    char cdir[MAX_PATH];
    char subdir[MAX_PATH];

    /* Remember the current directory so it can be restored on return */
    GetCurrentDirectoryA(MAX_PATH, cdir);
    SetCurrentDirectoryA(cpath);
    hfd = FindFirstFileA("*.*", &wfd);
    if (hfd != INVALID_HANDLE_VALUE) {
        do {
            if (wfd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) {
                /* Skips "." and ".." (and any name that starts with a dot) */
                if (wfd.cFileName[0] != '.') {
                    /* Build the complete path name */
                    snprintf(subdir, sizeof subdir, "%s\\%s", cpath, wfd.cFileName);
                    /* Recursively enumerate the subdirectory */
                    enum_path(subdir);
                }
            } else {
                printf("%s\\%s\n", cpath, wfd.cFileName);
                /* Here the virus can check the extension to decide
                   whether the file is one to be infected */
            }
        } while (FindNextFileA(hfd, &wfd) != 0);
        FindClose(hfd);
    }
    SetCurrentDirectoryA(cdir);
}

File traversal is implemented in just over 20 lines of C code. The powerful Win32 API is convenient for developers, but it also opens the door to viruses. An assembly implementation is slightly more complicated; interested readers can refer to the enum_path section in Elkern. The principle is the same, so for reasons of space no assembly code is given here.

Non-recursive search does not rely on the call stack to store its state. Instead it uses an explicitly allocated linked list or stack structure, and an iterative loop achieves the same traversal as the recursive version. The following simple implementation uses a linked list, handled in stack (last-in, first-out) order, to hold the list of subdirectories:

#include <windows.h>
#include <cstdio>
#include <iostream>
#include <list>
#include <string>
using namespace std;

// cpath is assumed to be an absolute path
void nr_enum_path(const char *cpath) {
    list<string> dir_list;
    string cdir, subdir;
    WIN32_FIND_DATAA wfd;
    HANDLE hfd;

    dir_list.push_back(string(cpath));
    while (!dir_list.empty()) {
        // Take the most recently added directory (stack order)
        cdir = dir_list.back();
        dir_list.pop_back();
        SetCurrentDirectoryA(cdir.c_str());
        hfd = FindFirstFileA("*.*", &wfd);
        if (hfd != INVALID_HANDLE_VALUE) {
            do {
                if (wfd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) {
                    if (wfd.cFileName[0] != '.') {
                        // Build the complete path name
                        subdir = cdir + "\\" + wfd.cFileName;
                        cout << "Push subdir: " << subdir << endl;
                        // Save the subdirectory for a later iteration
                        dir_list.push_back(subdir);
                    }
                } else {
                    printf("%s\\%s\n", cdir.c_str(), wfd.cFileName);
                    // Here the virus can check the extension to decide
                    // whether to infect the file
                }
            } while (FindNextFileA(hfd, &wfd) != 0);
            FindClose(hfd);
        }
    } // end while
}

In assembly language you have to manage the linked list yourself, allocating and releasing each node, so the approach is cumbersome and needs somewhat more code. For this reason most viruses search recursively. Note that searching deep directory trees is time-consuming. To keep CPU usage low and avoid being noticed by attentive users, most viruses call Sleep to pause after a certain number of files have been searched. The file search and infection module usually runs in a separate thread: after the virus gains control, it creates the search-and-infection thread and returns control of the main thread to the original program.

Modify and infect PE files

Once all files on the local disks and network shares can be searched, the next step in becoming parasitic is to infect the PE files that were found. An important consideration is where in the PE file the virus code is written. Files are generally read and written through Win32 APIs such as CreateFile, CreateFileMapping, and MapViewOfFile, which map the file into memory. This avoids the trouble of managing buffers by hand and is therefore used by many viruses. To read and write a file that has the read-only attribute, the virus first calls GetFileAttributes to obtain and save the file's attributes, then calls SetFileAttributes to make the file writable, and restores the saved attributes after infection.

Generally, there are several solutions to infect PE files:

A) Add a new section. The virus code is written to a new section, and the section table and values in the file header such as the file size are modified accordingly. Because the section is added at the end of the PE file, it is easily noticed by users. In some cases the original PE header does not have enough space for the new section's table entry, so other data has to be moved. Because of these problems, not many PE viruses use this method.

B) Append to the last section. The size and attributes in the last section's table entry and the file size in the file header are modified. Because more and more anti-virus software scans the tail of the file, many viruses also append random data after the virus code to escape this scan. This method is widely used by modern PE viruses.

C) Write into the unused gaps in the PE file header and in each section. The PE header is generally 1024 bytes in size. In a common PE file with 5 or 6 sections, about 600 bytes are actually occupied, which leaves more than 400 bytes available. Each section of a PE file is usually aligned to 512 bytes, but the actual data in a section rarely fills its last 512-byte block completely. PE file alignment was originally designed for efficiency, but the gaps it leaves behind give viruses a place to hide. The total length of the PE file may not increase after infection, so this method has been favored by virus authors since the CIH virus first used it.

D) Overwrite data that is rarely used. An example is the relocation table of an EXE file: because an EXE generally does not need to be relocated, its relocation data can be overwritten without causing problems. To be safe, the entry in the DataDirectory array of the file header that points to the relocation table can be cleared. This method generally does not increase the length of the infected file, so it is also used by many viruses.

E) Compress some data or code to free space for the virus code, then write the virus code into that space. At run time, before the program code executes, the virus first decompresses the data or code and then hands control to the original program. This method generally does not increase the size of the infected file, but there are many factors to consider and it is difficult to implement, so it is not used much.

Whichever method is used, it involves operations on the PE header and the section table. Let us first look at how a PE file is modified, that is, how virus code can be added so that the file is still a valid PE file that the system loader can load and execute. The attributes of each section in a PE file are described by an entry in the section table. The section table immediately follows IMAGE_NT_HEADERS, so the DWORD at file offset 0x3C gives the start offset of IMAGE_NT_HEADERS, and adding the size of IMAGE_NT_HEADERS (248 bytes for a 32-bit PE file) gives the start of the section table. Each table entry is an IMAGE_SECTION_HEADER structure:

 

typedef struct _IMAGE_SECTION_HEADER {
    BYTE  Name[IMAGE_SIZEOF_SHORT_NAME];  // Section name
    union {
        DWORD PhysicalAddress;
        DWORD VirtualSize;                // Actual size in bytes
    } Misc;
    DWORD VirtualAddress;                 // Starting virtual address of the section
    DWORD SizeOfRawData;                  // Size after alignment to FileAlignment in the file header
    DWORD PointerToRawData;               // Offset of the start of the section in the file
    DWORD PointerToRelocations;
    DWORD PointerToLinenumbers;
    WORD  NumberOfRelocations;
    WORD  NumberOfLinenumbers;
    DWORD Characteristics;                // Section attributes
} IMAGE_SECTION_HEADER, *PIMAGE_SECTION_HEADER;

The number of table entries is given by the NumberOfSections member in the file header of IMAGE_NT_HEADERS. The mapping between a virtual address in memory and an offset in the file can be computed from each section's starting virtual address and its location in the file, both recorded in the section table. To add a section, you add an entry to the section table array and then update NumberOfSections. Note that in some PE files the section table is followed by other data, such as bound import data. In that case you cannot simply add a section table entry; the data must be moved and the corresponding structures updated first, otherwise the PE file will not execute normally. Because many viruses modify their own code, the section's Characteristics value is usually set to 0xE000xxxx, which marks the section as readable, writable, and executable. Otherwise the virus has to call an API such as VirtualProtect at its start to change the protection of its memory pages dynamically.
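As an illustration of how the section table is located and how a virtual address maps to a file offset, here is a minimal read-only sketch in portable C. It is not taken from Elkern or any other source: the function names pe_section_table and pe_rva_to_offset and the small reader helpers are hypothetical, and the code works on a byte buffer that the caller has already read from the file. It uses the SizeOfOptionalHeader field of the file header instead of a fixed size, so it should also cope with 64-bit PE files, whose IMAGE_NT_HEADERS is larger than 248 bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTION_HEADER_SIZE 40u /* sizeof(IMAGE_SECTION_HEADER) */

/* Little-endian readers (hypothetical helpers for this sketch) */
static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t rd32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Finds the section table of a PE image held in img[0..len).
   Returns 0 and fills *table_off and *count on success, -1 otherwise. */
int pe_section_table(const uint8_t *img, size_t len,
                     size_t *table_off, uint16_t *count)
{
    if (len < 0x40) return -1;
    size_t nt = rd32(img + 0x3C);            /* start of IMAGE_NT_HEADERS */
    if (nt > len || len - nt < 24) return -1;
    if (memcmp(img + nt, "PE\0\0", 4) != 0) return -1;
    uint16_t n = rd16(img + nt + 6);         /* NumberOfSections */
    uint16_t opt = rd16(img + nt + 20);      /* SizeOfOptionalHeader */
    size_t off = nt + 24 + opt;              /* signature + file header + optional header */
    if (off > len || (len - off) / SECTION_HEADER_SIZE < n) return -1;
    *table_off = off;
    *count = n;
    return 0;
}

/* Translates a relative virtual address into a file offset using the
   VirtualAddress, SizeOfRawData and PointerToRawData of each section.
   Returns -1 if no section's raw data contains the address. */
int64_t pe_rva_to_offset(const uint8_t *img, size_t table_off,
                         uint16_t count, uint32_t rva)
{
    for (uint16_t i = 0; i < count; i++) {
        const uint8_t *s = img + table_off + (size_t)i * SECTION_HEADER_SIZE;
        uint32_t va = rd32(s + 12);          /* VirtualAddress */
        uint32_t raw_size = rd32(s + 16);    /* SizeOfRawData */
        uint32_t raw_ptr = rd32(s + 20);     /* PointerToRawData */
        if (rva >= va && rva - va < raw_size)
            return (int64_t)raw_ptr + (rva - va);
    }
    return -1;
}
```

For a 32-bit PE file whose IMAGE_NT_HEADERS starts at offset 0x40, pe_section_table reports the table at offset 0x40 + 248 = 312, which matches the calculation described earlier.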

The section table definition above also shows that the raw data of each section is aligned to the FileAlignment value in the file header, which is generally 512. Each section may therefore have fewer than 512 bytes of unused space (SizeOfRawData - VirtualSize), which gives a virus its chance. The famous CIH virus was the first to adopt this technique. The difficulty is that the gap size differs from section to section, so the virus code has to be split into several parts for storage and reassembled by a piece of code at run time. The advantage is that a small virus does not need to increase the size of the PE file, which makes it more concealed. If the unused space of all sections is still insufficient for the virus code, a section can be added or the code can be appended to the last section.

Appending to the last section is relatively simple. Only the last section's table entry needs to change: its VirtualSize, and its SizeOfRawData rounded up to FileAlignment. Of course, whenever any of the modifications above changes the file size, the SizeOfImage value in the file header must be corrected as well. This value is the total size of the headers and all sections, each aligned to SectionAlignment.

Two problems deserve attention here. The first is the handling of files covered by WFP (Windows File Protection). WFP is a mechanism introduced in Windows 2000 to protect system files: if the system finds that an important system file has been changed, it displays a dialog box warning that the file has been replaced. There are several ways to bypass WFP, but for a virus the simpler approach is not to infect system files that are on the WFP list.

The SfcIsFileProtected function exported by sfc.dll determines whether a file is on the list. Its first parameter must be NULL, and its second parameter is the name of the file to check. It returns a nonzero value if the file is on the list and 0 otherwise. The second problem is PE file checksum validation. Most PE files do not use the CheckSum field in the file header. For some PE files, however, such as key system service programs and driver files, the value must be correct or the system loader refuses to load the file. The checksum in the PE header can be calculated with the CheckSumMappedFile function exported by imagehlp.dll. Alternatively, clear the field to 0 and compute the value with the following simple equivalent algorithm:

If the size of the PE file is an odd number of bytes, pad it with a zero byte to make it even. Clear the CheckSum field of the PE header to 0, then add up the file two bytes at a time using ADC (add with carry). Finally, add the actual size of the file to this sum, again with ADC, to obtain the checksum. The following cal_checksum procedure assumes that ESI already points to the PE file header, that the CheckSum field has been cleared to 0, and that the CF flag has been reset:

 

; Call example:
;   clc
;   push pe_filesize
;   call cal_checksum
cal_checksum:
    adc bp, word [esi]    ; ESI initially points to the file header
    inc esi
    inc esi
    loop cal_checksum
    mov ebx, [esp+4]      ; EBX = the file size passed on the stack
    adc ebp, ebx          ; EBP now holds the PE checksum
    ret 4
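
For readers who find the assembly hard to follow, the same summation can be sketched in portable C. This is an illustrative equivalent rather than the code of any particular tool: the function name pe_checksum is hypothetical, the caller supplies the whole file as a byte buffer together with the file offset of the CheckSum field (88 bytes past the start of IMAGE_NT_HEADERS in a PE file), and that offset is assumed to be even. Instead of clearing the field beforehand, the sketch treats its four bytes as zero, so it can also be used to check the value stored in an existing file.

```c
#include <stddef.h>
#include <stdint.h>

/* Computes the PE header checksum of data[0..len).
   checksum_off is the file offset of the 4-byte CheckSum field
   (assumed to be even); its current contents are ignored. */
uint32_t pe_checksum(const uint8_t *data, size_t len, size_t checksum_off)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i += 2) {
        uint32_t word = data[i];
        if (i + 1 < len)
            word |= (uint32_t)data[i + 1] << 8;  /* an odd tail is padded with 0 */
        if (i >= checksum_off && i < checksum_off + 4)
            word = 0;                            /* CheckSum field counts as 0 */
        sum += word;
        sum = (sum & 0xFFFFu) + (sum >> 16);     /* add the carry back in, like ADC */
    }
    return sum + (uint32_t)len;                  /* finally add the real file size */
}
```

Comparing the returned value with the CheckSum field read from the header shows whether the stored checksum is still valid for the file's current contents.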

Besides the PE header checksum, many programs have their own integrity checks, for example WinZip and WinRAR self-extracting archives. If such a file is infected, it can no longer be extracted normally, so PE files of this kind should not be infected.

In Elkern, the code related to file modification is in infect.asm. The virus first tries to store its code in the gaps in the PE header and the sections. If all the gaps together are still too small to hold the virus code, it appends the code to the last section. See that file for the details of the code.

Apart from the basic techniques covered so far, namely virus relocation, API address acquisition, file search, and modification of infected PE files, several important areas of virus technology have not yet been mentioned: memory-resident infection, kernel-mode viruses, and anti-analysis and hiding techniques (EPO, polymorphism, and metamorphism).

Memory-resident infection is a variant of the active full-disk search described above. The virus code stays resident in memory and waits passively, either for a user event or for program code to reach a particular path that wakes it up, before performing the infection. Kernel viruses, also known as ring0 viruses, run in the privileged ring0 kernel mode. This type of virus is quite special and has to call kernel driver interfaces to infect and spread. Because of the complexity of the NT kernel, such viruses are very difficult to write. In addition, because the kernels of different NT versions differ, the author has to make extra effort to get the virus to run stably; inadequate testing may lead to blue-screen crashes that expose the virus quickly and seriously limit the speed and scope of its spread. As a result there are few ring0 viruses. The most famous of them is none other than CIH, which drew so much attention because of its enormous destructive power. Because ring0 viruses are few and extremely complex, and the length of this article is limited, they are not covered in depth. Techniques for resisting anti-virus software, resisting analysis, and hiding the virus itself have been another important direction in virus technology in recent years, alongside the use of social engineering to spread rapidly over the Internet. Their purpose is to block or escape anti-virus scanning and to prolong the virus's survival as far as possible. They include EPO (entry-point obscuring), encryption, polymorphism, and metamorphism. The author will introduce them in subsequent articles.

Memory-resident infection technology

Readers who have used MS-DOS will be no strangers to programs that stay resident in memory and intercept interrupts to perform specific operations (TSRs). In the MS-DOS era, not only did normal applications use TSR techniques extensively, but viruses also used them to stay resident in memory, monitor file reads and writes, and wait for a chance to infect. In Windows NT, the address space of each process is isolated. Processes cannot freely access each other's memory, and user-mode code has access restrictions: ring3 code can only read and write the application's private part of its process space (generally the lower 2 GB of a 4 GB process space) and has no read/write permission for the space occupied by the system kernel. This makes memory-resident infection difficult, but similar ideas and techniques are still possible with a little adaptation. Since each process has its own address space, virus code cannot stay resident permanently, but it can at least stay resident for the lifetime of the process. And since Windows has APIs that play the same role as DOS interrupts, a virus can intercept those APIs to monitor file reads and writes and wait for a chance to infect.

API interception is often called hooking. It is simple in principle: the code at the API entry point is replaced with a jump instruction that points to the virus code. At its start, the virus code saves the parameters passed to the API. After the virus code has run, it restores the API entry code and the saved parameters and jumps back to the API entry so that the program continues. API hooking has several other variants, such as modifying the API address pointers in the import table, modifying the call instruction at the point where the API is called, or, to defeat API hook checks, modifying an instruction at the end or in the middle of the API function. The idea is similar in each case.

Since we have introduced the full search infection technology, why does the virus still use the memory resident infection technology?

Think back for a moment. As storage media have become cheaper, the capacity available to users has grown larger and larger, but disk read/write speed has not improved nearly as much, so an ordinary full-disk search is time-consuming. Imagine that a program's window appeared 10 minutes after the user double-clicked it: even a first-time computer user would become suspicious, which is extremely unfavorable for the hiding and spreading of a virus. For this reason, common viruses do not search and infect immediately after gaining control, but adopt one of the following workarounds:

A) Search only the current directory, infect the PE files in it, and then quickly hand control to the host program.

B) Create a separate thread and immediately hand control to the host program. The current directory or the whole disk is searched and infected while the user is working.

C) Use hooking to monitor file- or directory-related API calls and infect the files the user operates on, or the executable files in their directories.

Searching the current directory has already been discussed. The following describes the thread and memory-resident hook infection techniques involved in B and C.

1. Create independent threads

Anyone familiar with Win32 API programming will know the CreateThread API. Each thread is an independent unit of execution and the most basic unit of scheduling and time-slice allocation in the Windows kernel. All threads in the same process share the resources of the same address space. A virus can put its real functionality in such a thread, so that the virus code runs at the same time as the host program, which is hard for ordinary users to detect.

2. Memory-resident hook infection

The thread model described above still has a flaw. What if the host program exits immediately after it starts? The infection thread may not even have run yet. Note that in the Win32 thread model, when one thread calls ExitProcess, the other running threads are not notified automatically, so the virus code may be terminated before it finishes. Imagine the infection thread having written only half of its code into a PE file at that moment. Although such situations are rare in practice, they have to be considered in order to improve a virus's infection ability and concealment. In addition, using a separate thread for full-disk search and infection still consumes a lot of CPU, and users may notice programs running slowly and the hard disk indicator flashing. This can of course be addressed by infecting a few files and then sleeping for a while before continuing, or by infecting only when the computer is idle.

Consider it from another angle: is it really necessary to search the whole disk and infect every PE file? Is search-and-infect even an efficient model? In fact, a large share of the PE files on a user's hard disk are rarely executed. Apart from system programs, only a few programs run often, perhaps a few games or the user's business applications. Spending resources on PE files that are rarely executed achieves little beyond demonstrating the concept of infection. Conversely, if the programs a user runs often are infected first, the virus spreads much faster even if only a few files are infected each time. Modern virus authors therefore often use a simple heuristic: preferentially infect PE files that the user or the user's programs access frequently, or PE files in frequently accessed folders. In this way the virus gradually spreads from the infected file to PE files that are related to the program or logically close to it (in the directory hierarchy), until finally the files on the entire disk are infected.

Whether the goal is to be notified when the process ends or to implement the infection model above, hooking solves the problem. By hooking the APIs related to file or directory operations, the virus obtains the path of the PE file or directory being operated on, and can then infect that PE file or the PE files in that directory. APIs related to file or directory operations include:

CopyFile CopyFileEx CreateFile FindFirstFile FindFirstFileEx
FindNextFile GetCurrentDirectory SetCurrentDirectory GetFileAttributes
SetFileAttributes GetFileSize GetFileType GetFullPathName LockFile
LockFileEx MoveFile MoveFileEx SearchPath UnlockFile UnlockFileEx

To be notified when a process or thread ends, or to perform infection at that point, the following APIs can be hooked:

ExitProcess TerminateProcess ExitThread TerminateThread

If the 9x systems are left out of consideration, the native API can be hooked instead of the Win32 API. Windows kernel code runs at the ring0 privilege level of the CPU's protected mode, while ordinary applications run at ring3. Ordinary applications are restricted in performing I/O operations and accessing memory resources. On top of this, Windows NT has a strict user permission mechanism under which different users have different access rights to resources, so a virus has more and more to take into account when running. By contrast, code running at ring0 is subject to no such restrictions, which gives ring0 a unique appeal for virus writers. In Windows 9x it is very easy for an ordinary ring3 application to enter ring0. In Windows NT it is much more difficult, but not impossible. The most common method is to use the kernel-mode driver loading mechanism: the virus code is written into a driver file, which is then loaded. Other techniques include infecting Windows kernel driver files or ntldr, modifying the access restriction bits of IDT and GDT entries so that they are accessible from ring3 and the virus code can switch to ring0, and using the \Device\PhysicalMemory object to enter ring0.

The most successful ring0 virus is the well-known CIH, but it can only run on Win9x. On Windows NT systems, although ring0 can still be reached, there are very few widespread ring0 viruses. This is because kernel processing in NT is very complicated, and writing a ring0 virus that runs stably on all versions of NT is quite difficult: it requires advanced programming skills and a great deal of testing. Since ring0 viruses are few in number, this article does not go into further detail.

Anti-analysis techniques

Today, anti-virus is an industry. Once a virus is discovered, a large number of anti-virus vendors rapidly analyze it and update their virus databases or release removal tools, much like the proverbial rat crossing the street with everyone shouting to beat it. All of this, however, depends on analysis of the virus's code and behavior.

From the virus's point of view, resisting analysis and resisting dynamic and static detection and removal therefore makes sense. Even the most cunning fox cannot escape an experienced hunter, but anti-analysis and anti-detection techniques can at least delay the release of solutions by anti-virus vendors, prolonging the life cycle of the virus and widening its spread.
