Linux Rootkit Detection Method Based on Memory Analysis
0x00 Introduction
Suppose a Linux server behaves abnormally and we conclude that a Rootkit has been implanted, but the routine Rootkit detection methods used by O&M personnel fail to find it. What else can we do in this situation?
Figure 1 Linux Server implanted with Rootkit
Figure 2 General process of system command execution in Linux
0x01 Rootkit implementation and detection methods
Generally, a Rootkit can be detected in the following ways:
Trusted shell with statically compiled binaries: lsof, stat, strace, last, ...
Detection tools and scripts: rkhunter, chkrootkit, and OSSEC
LiveCD: DEFT, Second Look, Helix
Dynamic analysis and debugging: use gdb to analyze /proc/kcore based on System.map and the vmlinuz image
Direct debugging of the raw device: debugfs
Before analyzing the advantages and disadvantages of these methods, let's look at Figure 2 to understand the general implementation principle of a Linux Rootkit.
When system commands or applications running in Ring3 (user space) perform certain basic functions, they call the system's .so files (Note 1). The basic functions implemented by these .so files, such as reading and writing files, are carried out by looking up the corresponding Syscall (system call) in the Syscall Table (Note 2, the system call table) in Ring0 (kernel space), which finally drives the hardware to complete the file read or write.
If a Rootkit is present, what happens to this process? See Figure 3 below.
Figure 3 General Rootkit Execution Process
The Rootkit tampers with the memory address of a Syscall in the Syscall Table, so the program reads the modified Syscall address and executes the malicious function, which lets the Rootkit achieve its own purposes.
This is only one typical Rootkit workflow. By tampering with different stages of the path a program follows to reach a Syscall, different types of Rootkit can be built.
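The Syscall Table tampering described above suggests a simple consistency check: every entry in the table should point into the kernel's own code region. The following is a minimal sketch of that idea, not a complete detector. It assumes the table entries have already been read from a memory image and that the bounds of kernel text (for example the `_stext` and `_etext` symbols from System.map) are known. All addresses below are made-up illustrative values, and `find_hooked_syscalls` is a hypothetical helper name.

```python
from typing import List, Tuple


def find_hooked_syscalls(
    table: List[int], text_start: int, text_end: int
) -> List[Tuple[int, int]]:
    """Return (index, address) for every entry pointing outside kernel text."""
    return [
        (index, address)
        for index, address in enumerate(table)
        if not (text_start <= address < text_end)
    ]


# Hypothetical values for illustration only; real bounds come from System.map.
STEXT = 0xFFFFFFFF81000000
ETEXT = 0xFFFFFFFF81800000
syscall_table = [
    0xFFFFFFFF81200000,  # inside kernel text
    0xFFFFFFFFA0001000,  # outside kernel text, e.g. inside a loaded module
    0xFFFFFFFF81234560,  # inside kernel text
]

for index, address in find_hooked_syscalls(syscall_table, STEXT, ETEXT):
    print(f"syscall {index} -> {address:#x} is outside kernel text")
```

A check like this only covers direct modification of table entries. An inline hook leaves the table untouched, so it would need a different check.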
Some Rootkit implementation techniques:
Interrupt interception: redirect sys_call_table and modify the IDT
System call hijacking: modify sys_call_table
Inline hook: modify sys_call and insert a jmp instruction
This part is not the focus of this article. Now that we understand how a Rootkit is implemented, let's compare the advantages and disadvantages of the conventional Rootkit detection methods.
With statically compiled binaries, if the Rootkit has modified a Syscall, the output of these tools is also unreliable, and we cannot see anything the Rootkit hides.
For Rootkit detection tools, let's analyze how rkhunter performs its detection.
In the rkhunter script file, the code of the scanrootkit function is as follows:
Figure 4 scanrootkit function in rkhunter
Note: The following two variables are defined in the installation script:
RKHTMPVAR="${RKHINST_SIG_DIR}"
RKHINST_SIG_DIR="${RKHINST_DB_DIR}/signatures"
Figure 5 File list in the signatures directory (Rootkit signature list)
From the code above, we can see that when rkhunter scans for Rootkits, each Rootkit check uses three important variables: SCAN_FILES, SCAN_DIRS, and SCAN_KSYMS.
The following four figures show the specific detection code for the Adore and KBeast Rootkits respectively.
Figure 6 detection process of classic Rootkit Adore in rkhunter
Figure 7 List of Adore files and directories detected in rkhunter
Figure 8 Rootkit KBeast detection process in rkhunter
Figure 9 list of KBeast files and directories detected in rkhunter
Based on the analysis above, we can see that rkhunter only checks whether the corresponding files exist in the default installation paths of known Rootkit components and compares file signatures. This detection method is clearly too crude and can do nothing about a modified or new Rootkit.
Another popular Rootkit detection tool, chkrootkit, implements its LKM Rootkit detection module in the source file chkproc.c. The last update date is. Its detection principle is similar to that of rkhunter: it is also based on signature detection, and it compares the output of the ps command with the /proc directory. The answer to Q2 in its FAQ also confirms our conclusion.
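The weakness of this kind of check is easier to see in code. The sketch below is not rkhunter's implementation. It is a hypothetical Python re-creation of the idea, in which a list of known files and directories (playing the role of SCAN_FILES and SCAN_DIRS) is tested for existence. The function name `scan_known_paths`, the `root` parameter, and the paths are all assumptions made for illustration.

```python
import os
from typing import Iterable, List


def scan_known_paths(
    scan_files: Iterable[str], scan_dirs: Iterable[str], root: str = "/"
) -> List[str]:
    """Return the known Rootkit paths that exist under the given root."""

    def under_root(path: str) -> str:
        # Re-anchor an absolute path below `root` (useful for mounted images).
        return os.path.join(root, path.lstrip("/"))

    hits = [p for p in scan_files if os.path.isfile(under_root(p))]
    hits += [p for p in scan_dirs if os.path.isdir(under_root(p))]
    return hits


# Placeholder paths, not taken from any real signature list.
known_files = ["/opt/example_rootkit/config"]
known_dirs = ["/opt/example_rootkit"]
print(scan_known_paths(known_files, known_dirs))
```

A Rootkit installed under any other directory name produces no hit at all, which is exactly the limitation described above.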
Figure 10 FAQ of chkrootkit Q2
Having analyzed how the common Rootkit detection tools work, let's look at the limitations of LiveCD detection.
Using a LiveCD means booting a clean operating system from CD and mounting the original storage to perform static analysis or reverse engineering on suspicious files, so that we can understand the Rootkit's execution logic, the .so/.ko files it depends on, and the configuration files it loads. If no Rootkit-related files have been identified in advance, checking the entire file system one file at a time is a tedious process. In addition, this method requires the emergency response personnel to have physical access to the server, which is inconvenient for environments hosted in a data center. In practice, LiveCD is more commonly used for Rootkit cleanup or judicial forensics than for the detection that precedes them.
Based on the analysis above, we briefly summarize the effectiveness of the Rootkit detection methods in the table below.
Comparison of Rootkit Detection Methods
Detection method | Limitations/defects
Statically compiled binaries | Work in user space; ineffective against Rootkits at the Ring0 layer.
Tools (rkhunter, chkrootkit) | Scan for known Rootkit features, compare file fingerprints, and check /proc/modules; the effect is extremely limited.
LiveCD (DEFT) | The Rootkit's active processes and network connections cannot be seen; only static analysis is supported.
GDB dynamic analysis and debugging | Debugs and analyzes /proc/kcore; the threshold is slightly higher and the process is complex; not suitable for emergency response.
DebugFS raw device | Reads and writes the raw device directly without relying on kernel modules; complex, and suitable for laboratory analysis only.
Since the conventional Rootkit detection methods have these defects, is there a better way to detect a Rootkit?
The following sections describe the Rootkit detection method based on memory analysis in detail.
0x02 Rootkit Detection and Analysis Based on Memory
A Rootkit is difficult to detect mainly because it hides itself so well, typically by hiding processes, ports, kernel modules, and files. However, no matter how well it hides, it must leave some traces in memory. If we can dump the physical memory and use the debug symbols and the kernel's data structures to parse the memory image, we obtain a true "profile" of the system's current activity. We can then compare it with the "false" results produced by running commands on the system itself to find the discrepancies. The following sections describe some of the principles.
1. Process detection based on memory analysis
In a Linux system, processes are generally viewed by running the ps aux command, which essentially reads /proc/pid/ to obtain process information. The kernel's task_struct (Note 3, the process structure) also contains the process pid, creation time, image path, and other information. In other words, the information about each process can be obtained through the memory address of its task_struct. In addition, the task_struct entries are linked into a doubly linked list through next_task and prev_task, and the for_each_task macro can be used to traverse the processes. Based on this principle, we can first find the memory address of the init_task symbol (the ancestor process, with a PID of 0) and then traverse the list from there to reproduce the effect of ps. For more information, see the figure below.
Figure 11 task_struct in the kernel
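The traversal described above can be sketched on a toy memory image. This is a minimal illustration under stated assumptions, not a parser for real kernel memory. The record layout `TASK_FMT` (pid, offset of the next task, 16-byte command name) is invented for the example, whereas a real task_struct is far larger and its field offsets depend on the kernel build. `build_image` and `walk_tasks` are hypothetical helper names.

```python
import struct
from typing import Iterator, List, Tuple

# Hypothetical record layout: pid (u32), offset of next task (u32), comm (16 bytes).
TASK_FMT = "<II16s"
TASK_SIZE = struct.calcsize(TASK_FMT)


def build_image(tasks: List[Tuple[int, str]]) -> bytes:
    """Build a toy memory image holding a circular list of task records."""
    image = bytearray()
    for i, (pid, comm) in enumerate(tasks):
        next_off = ((i + 1) % len(tasks)) * TASK_SIZE
        image += struct.pack(TASK_FMT, pid, next_off, comm.encode("ascii"))
    return bytes(image)


def walk_tasks(image: bytes, init_task_off: int) -> Iterator[Tuple[int, str]]:
    """Follow the 'next' links from init_task until the list wraps around."""
    seen = set()
    off = init_task_off
    while off not in seen:
        seen.add(off)
        pid, next_off, comm = struct.unpack_from(TASK_FMT, image, off)
        yield pid, comm.split(b"\x00")[0].decode("ascii", "replace")
        off = next_off


image = build_image([(0, "swapper"), (1, "init"), (1337, "sshd")])
for pid, comm in walk_tasks(image, 0):
    print(pid, comm)
```

The `seen` set guards against a corrupted or deliberately looped list, which is worth keeping in mind when the memory being parsed may have been tampered with.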
In addition, the Linux kernel contains a structure called the PID Hash Chain, as shown in Figure 12. It is an array of pointers in which each element points to a chain of task_struct entries for a group of pids, which allows the kernel to quickly find the process that corresponds to a given pid. Therefore, analyzing pid_hash is another, more efficient way to detect hidden processes and obtain information about them.
Figure 12 PID Hash Chain in the kernel
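Because the process list and the PID hash are two independent views of the same set of processes, a process that appears in one but not the other is suspicious. The sketch below illustrates that cross-view comparison on toy data. The modulo hash and bucket count are stand-ins chosen for the example (the kernel uses its own hash function), and the helper names are hypothetical.

```python
from typing import Iterable, List, Set


def build_pid_hash(pids: Iterable[int], n_buckets: int = 8) -> List[List[int]]:
    """Toy PID hash: each bucket holds the chain of pids that hash to it."""
    buckets: List[List[int]] = [[] for _ in range(n_buckets)]
    for pid in pids:
        buckets[pid % n_buckets].append(pid)  # stand-in for the kernel's hash
    return buckets


def pids_in_hash(buckets: List[List[int]]) -> Set[int]:
    """Collect every pid reachable through the hash chains."""
    return {pid for chain in buckets for pid in chain}


def find_hidden_pids(task_list_pids: Iterable[int], buckets: List[List[int]]) -> List[int]:
    """Pids present in the PID hash but missing from the process list."""
    return sorted(pids_in_hash(buckets) - set(task_list_pids))


pid_hash = build_pid_hash([1, 2, 300, 1337])
task_list = [1, 2, 300]  # 1337 has been unlinked from the process list
print(find_hidden_pids(task_list, pid_hash))
```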
2. Process memory maps (process mappings) based on memory analysis
In task_struct, mm_struct (Note 4) describes the virtual address space of a process. The process mappings are mainly stored in its mm_rb and mmap members, which are built from vm_area_struct entries. The approximate structure is shown in the figures below.
Figure 13 structure of mm_struct (memory descriptor)
Figure 14 details of task_struct in the kernel
Each vm_area_struct node records the attributes of a VMA (virtual memory area), such as vm_start (start address), vm_end (end address), vm_flags (access permissions), and the corresponding vm_file (the mapped file). The information obtained from memory in this way is equivalent to the content of /proc/<pid>/maps.
3. Network connection and open file (lsof) detection based on memory analysis
In Linux, lsof (List Open Files) actually reads information from the /proc/pid/ directory. This information can also be obtained through task_struct.
The files member of the task_struct data structure points to files_struct (the file structure), which represents the table of files opened by the current process. This structure contains an fd_array (file descriptor array), and each element fd (file descriptor) in the array represents a file opened by the process. The structure behind each file descriptor contains the directory entry dentry, the file operations f_op, and so on. These are enough for us to find the files opened by each process.
In addition, when the f_op member of a file is socket_file_ops, or its dentry.d_op is sockfs_dentry_operations, the file can be converted to the corresponding inet_sock structure to obtain the related network information.
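To make the correspondence with /proc/<pid>/maps concrete, the sketch below renders a few VMA records as simplified maps-style lines. The `Vma` class is a hypothetical stand-in for the fields named above, and the output omits the offset, device, and inode columns of the real file. The flag constants follow the low bits of vm_flags as defined in the kernel headers, and the kernel's own choice between `s` and `p` in the last permission column may differ from this simplification.

```python
from dataclasses import dataclass

# Low bits of vm_flags as defined in the kernel headers.
VM_READ, VM_WRITE, VM_EXEC, VM_SHARED = 0x1, 0x2, 0x4, 0x8


@dataclass
class Vma:
    """Hypothetical stand-in for the vm_area_struct fields used here."""
    vm_start: int
    vm_end: int
    vm_flags: int
    path: str = ""  # name of the mapped file (vm_file), empty if anonymous


def perms(vm_flags: int) -> str:
    """Render the permission bits in the style of the maps file."""
    return "".join([
        "r" if vm_flags & VM_READ else "-",
        "w" if vm_flags & VM_WRITE else "-",
        "x" if vm_flags & VM_EXEC else "-",
        "s" if vm_flags & VM_SHARED else "p",  # simplified
    ])


def maps_line(vma: Vma) -> str:
    """Format one VMA as 'start-end perms path'."""
    return f"{vma.vm_start:08x}-{vma.vm_end:08x} {perms(vma.vm_flags)} {vma.path}".rstrip()


vmas = [
    Vma(0x00400000, 0x00401000, VM_READ | VM_EXEC, "/bin/ls"),
    Vma(0x00601000, 0x00602000, VM_READ | VM_WRITE),
]
for vma in vmas:
    print(maps_line(vma))
```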
4. bash_history detection based on memory analysis
In the post-exploitation phase, attackers often use the history -c command to clear the command history that has not yet been saved to the .bash_history file. In a Rootkit, configuring HISTSIZE=0 or HISTFILE=/dev/null is also a common way to hide command history. For the latter, the history of the bash process is still recorded in the corresponding mmap region (the corresponding macro is HISTORY_USE_MMAP, Note 6), and the history records can be restored from the mmap data associated with the history_list() function.
Figure 15 Kernel Module List
Figure 16 The history_do_write function in histfile.c of the bash 4.0 source code
5. Kernel module detection based on memory analysis
Checking for a Rootkit by traversing all the struct module entries on the module list is an alternative to the lsmod command. However, if the Rootkit removes its LKM from the module list while remaining loaded in memory, this method does not work.
Nevertheless, it is difficult for a Rootkit to hide itself from the /sys/module/ directory, so we can still find hidden kernel modules by traversing the Sysfs file system.
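The comparison between the module list and Sysfs can be sketched as follows. This is a minimal illustration that works on data already collected (the text of /proc/modules and the directory names under /sys/module/), whether from a live system or recovered from memory. The helper names and sample module names are hypothetical. Note that built-in modules also appear under /sys/module/, so a list of them is assumed to be supplied by the caller and filtered out.

```python
from typing import Iterable, List, Set


def parse_proc_modules(text: str) -> Set[str]:
    """Extract module names (first column) from /proc/modules-style text."""
    return {line.split()[0] for line in text.splitlines() if line.strip()}


def find_hidden_modules(
    proc_modules_text: str,
    sysfs_names: Iterable[str],
    builtin_names: Iterable[str] = (),
) -> List[str]:
    """Names present in Sysfs but absent from the module list."""
    listed = parse_proc_modules(proc_modules_text)
    return sorted(set(sysfs_names) - listed - set(builtin_names))


# Illustrative sample data; sizes and addresses are made up.
proc_modules = (
    "ext4 373760 1 - Live 0xffffffffa0000000\n"
    "mbcache 8193 1 ext4, Live 0xffffffffa0060000\n"
)
sysfs_entries = ["ext4", "mbcache", "printk", "example_hidden_lkm"]
print(find_hidden_modules(proc_modules, sysfs_entries, builtin_names=["printk"]))
```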
6. Process credential detection based on memory analysis
In kernels earlier than 2.6.29, a Rootkit could elevate the privileges of a user-space process by setting its effective user ID and effective group ID to 0 (root). Later kernels introduced the 'cred' structure. Rootkits have kept pace by giving a process the same 'cred' structure as a process with root permission. Therefore, checking the 'cred' structure of every process is a better way to find an active Rootkit.
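One way to act on this observation is to look for processes whose 'cred' pointers refer to the same structure. The sketch below assumes the pid and 'cred' address of each process have already been extracted from memory, with one entry per process rather than per thread. The addresses are made-up values and `find_shared_creds` is a hypothetical helper. A match is a lead for further investigation rather than proof of compromise.

```python
from collections import defaultdict
from typing import Dict, List


def find_shared_creds(task_creds: Dict[int, int]) -> Dict[int, List[int]]:
    """Map each cred address used by more than one process to those pids."""
    by_cred: Dict[int, List[int]] = defaultdict(list)
    for pid, cred_addr in task_creds.items():
        by_cred[cred_addr].append(pid)
    return {
        addr: sorted(pids) for addr, pids in by_cred.items() if len(pids) > 1
    }


# Illustrative data: pid -> address of its cred structure.
task_creds = {
    1: 0xFFFF880012340000,     # a root-owned process
    812: 0xFFFF880012350000,
    1337: 0xFFFF880012340000,  # points at the same cred as pid 1
}
for addr, pids in find_shared_creds(task_creds).items():
    print(f"cred {addr:#x} is shared by pids {pids}")
```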