How to Write a Module Dumper & Disassembler

|=-----------[ How to Write a Module Dumper & Disassembler ]------------=|
|=----------------------------------------------------------------------=|
|=--------------[ Coolq <qufuping@ercist.iscas.ac.cn> ]-----------------=|
|=----------------------------------------------------------------------=|

0 - Preface
1 - The Linux 2.6 kernel module loading process
  1.1 Introduction to the loading process
  1.2 Test module
  1.3 Key point analysis
  1.4 Feasibility conclusion
  1.5 Additional points
2 - Module extraction
  2.1 A simple module dumper
  2.2 What the program does
3 - Disassembling the module
  3.1 BFD overview
  3.2 Basic Intel machine instruction format
  3.3 Building a simple disassembler with the binutils package
  3.4 Notes
4 - Conclusion
5 - References

--------------------------------------------------------------------------
----[ 0 Preface

Linux systems have seen many LKM trojans, including knark, adore, adore-ng and others. During incident
response we can use madsys's method [1] to find a hidden LKM trojan. For forensics, however, that is
not enough: we also need a sample of the trojan module itself. A sufficiently cunning attacker will
delete every .o/.ko file, and the module name alone does not tell us what the suspicious module does.

We therefore need a way to extract such suspicious modules and analyze what they do. How feasible is
this idea?

The body of this article has three parts.
The first part analyzes the module loading process to see whether extraction and analysis are feasible.
The second part writes a simple module that extracts the contents of a specified module.
The third part briefly covers the basics of disassembly, including the machine instruction format and
BFD, and then builds a simple disassembler on top of the binutils source code.
The article closes with some ideas for improvement.

----[ 1 The Linux 2.6 kernel module loading process

----[ 1.1 Introduction to the loading process

When a program runs insmod abc.ko, it invokes the sys_init_module system call [2] and passes the entire
file contents into the kernel. Relocation, symbol resolution and the other module-related work are all
done by the kernel. sys_init_module first performs some checks and then calls
    mod = load_module(umod, len, uargs);
load_module does most of the work, and the module is inserted into the module list. If an
initialization function is defined (that is, the function named in module_init), it is called at the
end of loading, and then mod->module_init is freed. Note: the first module_init is the initialization
function specified by the module's author; the second is a pointer in struct module to memory that is
used only once and can then be freed.

Now let us turn to load_module. It first checks that the ELF file is valid, then reads the ELF header
and computes how much memory to allocate from the flags of each section. It then copies the relevant
sections and finally performs symbol resolution and relocation.
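
The validity check at the start of load_module can be approximated in user space. The following is a
minimal sketch, not the kernel's code: the function name looks_like_module32 is hypothetical, it
assumes a 32-bit (i386) module and the <elf.h> header shipped with glibc, and it checks only the magic
number, the ELF class, the object type and the section header table bounds.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if buf looks like a 32-bit relocatable
 * ELF object (.o/.ko), 0 otherwise. */
int looks_like_module32(const void *buf, size_t len)
{
    Elf32_Ehdr hdr;

    if (len < sizeof(hdr))
        return 0;
    memcpy(&hdr, buf, sizeof(hdr));     /* copy to avoid unaligned access */

    if (memcmp(hdr.e_ident, ELFMAG, SELFMAG) != 0)
        return 0;                       /* wrong magic number */
    if (hdr.e_ident[EI_CLASS] != ELFCLASS32)
        return 0;                       /* this sketch handles ELF32 only */
    if (hdr.e_type != ET_REL)
        return 0;                       /* modules are relocatable objects */
    if (hdr.e_shentsize != sizeof(Elf32_Shdr))
        return 0;                       /* unexpected section header size */

    /* The section header table must lie entirely inside the file. */
    if (hdr.e_shoff > len ||
        (size_t)hdr.e_shnum * sizeof(Elf32_Shdr) > len - hdr.e_shoff)
        return 0;
    return 1;
}
```

A caller would read the whole .ko file into a buffer and pass the buffer and its length; an ordinary
executable (ET_EXEC) or a truncated file is rejected.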

A test module is provided for ease of understanding.

----[ 1.2 Test module

First, here is a test program, test.c:

#include <linux/init.h>
#include <linux/module.h>
#include <linux/moduleparam.h>
#include <linux/kernel.h>
#include <linux/list.h>
#include <linux/string.h>

static char *mod_name = "module";
// module_param(mod_name, charp, 0);
static int remove_init(void)
{
    struct module *mod_head, *mod_counter;
    struct list_head *p;

    /* Walk the module list, starting from this module's own entry. */
    mod_head = &__this_module;
    list_for_each(p, &mod_head->list) {
        mod_counter = list_entry(p, struct module, list);
        if (strcmp(mod_counter->name, mod_name) == 0) {
            /* Unlink the named module from the list, which hides it. */
            list_del(p);
            printk("remove module %s successfully.\n", mod_name);
            return 0;
        }
    }
    printk("can't find module %s.\n", mod_name);
    return 0;
}
static void remove_exit(void)
{
}
module_init(remove_init);
module_exit(remove_exit);

MODULE_LICENSE("GPL");

----[ 1.3 Key point analysis

Once a .o/.ko file has been loaded into memory, the ELF header information is gone. We therefore have
to understand the memory layout of the module in detail to decide whether a module can be extracted
and where the code segment starts and ends (that is, whether it can be disassembled).

Let us look at the ELF sections of the generated test.ko:

# readelf -S test.ko

There are 19 section headers, starting at offset 0x620:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0   0
  [ 1] .text             PROGBITS        00000000 000034 000080 00  AX  0   0   4
  [ 2] .rel.text         REL             00000000 000918 000048 08     17   1   4
  [ 3] .altinstr_replace PROGBITS        00000000 0000b4 000006 00  AX  0   0   1
  [ 4] .rodata.str1.1    PROGBITS        00000000 0000ba 00003e 01 AMS  0   0   1
  [ 5] .altinstructions  PROGBITS        00000000 0000f8 000017 00   A  0   0   4
  [ 6] .rel.altinstructi REL             00000000 000960 000020 08     17   5   4
  [ 7] .modinfo          PROGBITS        00000000 000120 00005b 00   A  0   0  32
  [ 8] __versions        PROGBITS        00000000 000180 000100 00   A  0   0  32
  [ 9] .data             PROGBITS        00000000 000280 000004 00  WA  0   0   4
  [10] .rel.data         REL             00000000 000980 000008 08     17   9   4
  [11] .gnu.linkonce.thi PROGBITS        00000000 000300 000200 00  WA  0   0 128
  [12] .rel.gnu.linkonce REL             00000000 000988 000010 08     17   b   4
  [13] .bss              NOBITS          00000000 000500 000000 00  WA  0   0   4
  [14] .comment          PROGBITS        00000000 000500 000066 00      0   0   1
  [15] .note.GNU-stack   NOTE            00000000 000566 000000 00      0   0   1
  [16] .shstrtab         STRTAB          00000000 000566 0000b9 00      0   0   1
  [17] .symtab           SYMTAB          00000000 000998 000200 10     18  1c   4
  [18] .strtab           STRTAB          00000000 000b98 0000a6 00      0   0   1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)
Next, let us see how the space occupied by the module is allocated.

The size of the allocated space is computed mainly by the layout_sections function:
/* Lay out the SHF_ALLOC sections in a way not dissimilar to how ld
   might -- code, read-only data, read-write data, small data.  Tally
   sizes, and place the offsets into sh_entsize fields: high bit means it
   belongs in init. */
static void layout_sections(struct module *mod,
                            const Elf_Ehdr *hdr,
                            Elf_Shdr *sechdrs,
                            const char *secstrings)
{
    static unsigned long const masks[][2] = {
        /* NOTE: all executable code must be the first section
         * in this array; otherwise modify the text_size
         * finder in the two loops below */
        { SHF_EXECINSTR | SHF_ALLOC, ARCH_SHF_SMALL },
        { SHF_ALLOC, SHF_WRITE | ARCH_SHF_SMALL },
        { SHF_WRITE | SHF_ALLOC, ARCH_SHF_SMALL },
        { ARCH_SHF_SMALL | SHF_ALLOC, 0 }
    };
    unsigned int m, i;

    for (i = 0; i < hdr->e_shnum; i++)
        sechdrs[i].sh_entsize = ~0UL;

    DEBUGP("Core section allocation order:\n");
    for (m = 0; m < ARRAY_SIZE(masks); ++m) {
        for (i = 0; i < hdr->e_shnum; ++i) {
            Elf_Shdr *s = &sechdrs[i];

            if ((s->sh_flags & masks[m][0]) != masks[m][0]
                || (s->sh_flags & masks[m][1])
                || s->sh_entsize != ~0UL
                || strncmp(secstrings + s->sh_name,
                           ".init", 5) == 0)
                continue;
            s->sh_entsize = get_offset(&mod->core_size, s);
            DEBUGP("\t%s\n", secstrings + s->sh_name);
        }
        if (m == 0)
            mod->core_text_size = mod->core_size;
    }

    DEBUGP("Init section allocation order:\n");
    for (m = 0; m < ARRAY_SIZE(masks); ++m) {
        for (i = 0; i < hdr->e_shnum; ++i) {
            Elf_Shdr *s = &sechdrs[i];

            if ((s->sh_flags & masks[m][0]) != masks[m][0]
                || (s->sh_flags & masks[m][1])
                || s->sh_entsize != ~0UL
                || strncmp(secstrings + s->sh_name,
                           ".init", 5) != 0)
                continue;
            s->sh_entsize = (get_offset(&mod->init_size, s)
                             | INIT_OFFSET_MASK);
            DEBUGP("\t%s\n", secstrings + s->sh_name);
        }
        if (m == 0)
            mod->init_text_size = mod->init_size;
    }
}

So for sections flagged AX, AMS, A or WA whose names do not start with ".init", the sizes are
accumulated into mod->core_size (including any alignment padding), and the total size of the AX
sections is saved in mod->core_text_size. For sections whose names start with ".init", the sizes are
accumulated into mod->init_size and the AX total is saved in mod->init_text_size. Sections without the
A flag get no space at all.
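
To make the accounting concrete, here is a user-space sketch of the same two-pass layout. It is an
illustration under stated assumptions rather than the kernel's code: struct sec, struct layout, place
and layout_sketch are hypothetical names, ARCH_SHF_SMALL is assumed to be 0 (as on x86, so the fourth
mask is dropped), and alignments are assumed to be powers of two.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

struct sec {                    /* hypothetical stand-in for Elf_Shdr */
    const char   *name;
    unsigned long flags;        /* SHF_* bits */
    unsigned long size;
    unsigned long align;
    unsigned long offset;       /* result: offset in its region, ~0UL if none */
    int           in_init;      /* result: 1 if placed in the init region */
};

struct layout {
    unsigned long core_size, core_text_size;
    unsigned long init_size, init_text_size;
};

/* Align the running total, reserve room for s, return the section's offset. */
static unsigned long place(unsigned long *total, const struct sec *s)
{
    unsigned long a = s->align ? s->align : 1;
    unsigned long off = (*total + a - 1) & ~(a - 1);

    *total = off + s->size;
    return off;
}

void layout_sketch(struct sec *secs, size_t n, struct layout *out)
{
    /* { flags that must all be set, flags that must all be clear } */
    static const unsigned long masks[][2] = {
        { SHF_EXECINSTR | SHF_ALLOC, 0 },   /* code */
        { SHF_ALLOC, SHF_WRITE },           /* read-only data */
        { SHF_WRITE | SHF_ALLOC, 0 },       /* read-write data */
    };
    size_t m, i;
    int pass;

    memset(out, 0, sizeof(*out));
    for (i = 0; i < n; i++) {
        secs[i].offset = ~0UL;
        secs[i].in_init = 0;
    }

    for (pass = 0; pass < 2; pass++) {      /* pass 0: core, pass 1: init */
        unsigned long *total = pass ? &out->init_size : &out->core_size;

        for (m = 0; m < sizeof(masks) / sizeof(masks[0]); m++) {
            for (i = 0; i < n; i++) {
                struct sec *s = &secs[i];
                int is_init = strncmp(s->name, ".init", 5) == 0;

                if ((s->flags & masks[m][0]) != masks[m][0]
                    || (s->flags & masks[m][1])
                    || s->offset != ~0UL    /* already placed */
                    || is_init != pass)
                    continue;
                s->offset = place(total, s);
                s->in_init = pass;
            }
            if (m == 0) {                   /* executable sections come first */
                if (pass)
                    out->init_text_size = out->init_size;
                else
                    out->core_text_size = out->core_size;
            }
        }
    }
}
```

For a made-up input of .text (0x80 bytes, AX, align 4), .altinstr_replace (6, AX, 1), .init.text
(0x7e, AX, 1), .modinfo (0x5b, A, 32), .data (4, WA, 4) and a non-allocated .symtab, the sketch gives
core_text_size = 0x86, core_size = 0x100 and init_size = init_text_size = 0x7e, and .symtab is never
placed.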

The code that copies each section is as follows:
    for (i = 0; i < hdr->e_shnum; i++) {
        void *dest;

        if (!(sechdrs[i].sh_flags & SHF_ALLOC))
            continue;

        if (sechdrs[i].sh_entsize & INIT_OFFSET_MASK)
            dest = mod->module_init
                + (sechdrs[i].sh_entsize & ~INIT_OFFSET_MASK);
        else
            dest = mod->module_core + sechdrs[i].sh_entsize;

        if (sechdrs[i].sh_type != SHT_NOBITS)
            memcpy(dest, (void *)sechdrs[i].sh_addr,
                   sechdrs[i].sh_size);
        /* Update sh_addr to point to copy in image. */
        sechdrs[i].sh_addr = (unsigned long)dest;
        DEBUGP("\t0x%lx %s\n", sechdrs[i].sh_addr, secstrings +
               sechdrs[i].sh_name);
    }
As can be seen, the sections are walked in their original ELF order and each one is copied into its
allocated space (module_core or module_init). The exception is SHT_NOBITS, that is, the .bss section:
it occupies no space in the file, so there is nothing to copy.
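
The way sh_entsize encodes both the region and the offset can be shown in isolation. This is a small
sketch with hypothetical names (INIT_OFFSET_MASK_SKETCH, section_dest); it assumes, as the comment
above layout_sections says, that the high bit of the value marks a section belonging to the init
region.

```c
#include <limits.h>

/* Assumed to mirror the kernel's INIT_OFFSET_MASK: the top bit of a long. */
#define INIT_OFFSET_MASK_SKETCH \
    (1UL << (sizeof(unsigned long) * CHAR_BIT - 1))

/* Returns the destination address of a section, given the base addresses of
 * the core and init regions and the value stored in sh_entsize. */
unsigned long section_dest(unsigned long core_base, unsigned long init_base,
                           unsigned long entsize)
{
    if (entsize & INIT_OFFSET_MASK_SKETCH)
        return init_base + (entsize & ~INIT_OFFSET_MASK_SKETCH);
    return core_base + entsize;
}
```

With a core base of 0x1000 and an init base of 0x8000, an sh_entsize of 0xa0 maps to 0x10a0, while the
same mask ORed with 0x10 maps to 0x8010.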

----[ 1.4 Feasibility conclusion

We can now conclude that most of the information can be recovered from the fields of struct module.
Because the sections are copied in order and .text is the first section, mod->module_core actually
points to the .text section, and mod->core_text_size covers the size of the .text section. The range
of code to disassemble is therefore well defined.

----[ 1.5 Additional points

!!! Now look at the .symtab and .strtab sections. The kernel source contains these lines:
#ifdef CONFIG_KALLSYMS
    /* Keep symbol and string tables for decoding later. */
    sechdrs[symindex].sh_flags |= SHF_ALLOC;
    sechdrs[strindex].sh_flags |= SHF_ALLOC;
#endif
So space is allocated for them only when CONFIG_KALLSYMS is defined.

!!! Our test program has no .init section, so module_init, init_size and init_text_size are not used.
When does an .init section exist, then? Let us change
    static int remove_init(void)  ->  static int __init remove_init(void)
and look at the readelf output again:

  [ 2] .init.text        PROGBITS        00000000 000038 00007e 00  AX  0   0   1
So remove_init is now placed in the .init.text section.

----[ 2 Module extraction

Knowing how a module's memory is allocated, we can easily write a simple module dumper.

----[ 2.1 A simple module dumper

#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/moduleparam.h>
#include <linux/proc_fs.h>
#include <linux/fs.h>
#include <linux/file.h>
#include <linux/list.h>
#include <linux/string.h>
#include <asm/uaccess.h>

#define EOF      (-1)
#define SEEK_SET 0
#define SEEK_CUR 1
#define SEEK_END 2

struct file *klib_fopen(const char *filename, int flags, int mode);
void klib_fclose(struct file *filp);
int klib_fwrite(char *buf, int len, struct file *filp);

static struct module *mod;      /* the module to dump */
static char buffer[256];
static char *mod_name;          /* name of the module to dump (optional) */

module_param(mod_name, charp, 0);

/* Reading /proc/show_mod triggers the dump; the read itself returns no data. */
ssize_t show_mod_read(struct file *fp, char *buf, size_t len, loff_t *off)
{
    struct file *filep;

    /* Write the core region of the module image to dump.dat. */
    filep = klib_fopen("./dump.dat", O_WRONLY | O_CREAT | O_TRUNC,
                       S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
    if (filep == NULL) {
        printk("error open files.\n");
        return 0;
    }
    klib_fwrite(mod->module_core, mod->core_size, filep);
    klib_fclose(filep);

    /* Write the layout fields of struct module to dump.info. */
    filep = klib_fopen("./dump.info", O_WRONLY | O_CREAT | O_TRUNC,
                       S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
    if (filep == NULL) {
        printk("error open files.\n");
        return 0;
    }
    sprintf(buffer, "mod->module_init = 0x%p\n"
                    "mod->module_core = 0x%p\n"
                    "mod->init_size = %ld\n"
                    "mod->core_size = %ld\n"
                    "mod->init_text_size = %ld\n"
                    "mod->core_text_size = %ld\n",
            mod->module_init, mod->module_core,
            mod->init_size, mod->core_size,
            mod->init_text_size, mod->core_text_size);
    klib_fwrite(buffer, strlen(buffer), filep);
    klib_fclose(filep);

    return 0;
}
static struct file_operations show_mod_fops = {
    .read = show_mod_read,
};

static int dummy_init(void)
{
    struct proc_dir_entry *entry;
    struct list_head *p;
    struct module *head, *counter;

    mod = NULL;
    if (!mod_name)
        mod = THIS_MODULE;      /* no name given: dump ourselves */
    else {
        /* Search the module list for the requested name. */
        head = THIS_MODULE;
        list_for_each(p, head->list.prev) {
            counter = list_entry(p, struct module, list);
            if (strcmp(counter->name, mod_name) == 0) {
                mod = counter;
                break;
            }
        }
    }
    if (!mod) {
        printk("can't find module named %s\n", mod_name);
        return -1;
    }
    entry = create_proc_entry("show_mod", S_IRUSR, &proc_root);
    if (!entry)
        return -1;
    entry->proc_fops = &show_mod_fops;

    return 0;
}

static void dummy_exit(void)
{
    remove_proc_entry("show_mod", &proc_root);
    return;
}

struct file *klib_fopen(const char *filename, int flags, int mode)
{
    struct file *filp = filp_open(filename, flags, mode);
    return (IS_ERR(filp)) ? NULL : filp;
}

void klib_fclose(struct file *filp)
{
    if (filp)
        fput(filp);
}

int klib_fwrite(char *buf, int len, struct file *filp)
{
    int writelen;
    mm_segment_t oldfs;

    if (filp == NULL)
        return -ENOENT;
    if (filp->f_op->write == NULL)
        return -ENOSYS;
    if (((filp->f_flags & O_ACCMODE) & (O_WRONLY | O_RDWR)) == 0)
        return -EACCES;
    /* Let write() accept a kernel-space buffer. */
    oldfs = get_fs();
    set_fs(KERNEL_DS);
    writelen = filp->f_op->write(filp, buf, len, &filp->f_pos);
    set_fs(oldfs);

    return writelen;
}

module_init(dummy_init);
module_exit(dummy_exit);

MODULE_LICENSE("GPL");

----[ 2.2 What the program does

This is a very simple program. It saves a description of some struct module fields to ./dump.info and
the loaded image of the module to ./dump.dat. Since dump.info contains the base address, the length of
the code segment and similar information, it makes disassembling dump.dat easier. The procedure still
falls short of what legal forensics requires; those requirements are discussed at the end of the
article.
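
A user-space tool that consumes dump.info could read the fields back roughly as follows. This is a
sketch under assumptions: the names dump_info, parse_info_line and in_core_text are hypothetical, and
the line format is assumed to be exactly what the sprintf call in the dumper writes, with the kernel's
%p printing bare hexadecimal digits after the literal "0x".

```c
#include <stdio.h>
#include <string.h>

struct dump_info {
    unsigned long module_init, module_core;
    long init_size, core_size, init_text_size, core_text_size;
};

/* Parses one line of dump.info into di. Returns 1 if the line was
 * recognized, 0 otherwise. */
int parse_info_line(const char *line, struct dump_info *di)
{
    if (sscanf(line, "mod->module_init = 0x%lx", &di->module_init) == 1)
        return 1;
    if (sscanf(line, "mod->module_core = 0x%lx", &di->module_core) == 1)
        return 1;
    if (sscanf(line, "mod->init_size = %ld", &di->init_size) == 1)
        return 1;
    if (sscanf(line, "mod->core_size = %ld", &di->core_size) == 1)
        return 1;
    if (sscanf(line, "mod->init_text_size = %ld", &di->init_text_size) == 1)
        return 1;
    if (sscanf(line, "mod->core_text_size = %ld", &di->core_text_size) == 1)
        return 1;
    return 0;
}

/* Returns 1 if addr lies in [module_core, module_core + core_text_size),
 * i.e. inside the code range that is worth disassembling. */
int in_core_text(const struct dump_info *di, unsigned long addr)
{
    return addr >= di->module_core &&
           addr - di->module_core < (unsigned long)di->core_text_size;
}
```

Calling parse_info_line on each line returned by fgets fills the structure; in_core_text then tells
the disassembler where the code ends and the data begins.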

----[ 3 Disassembling the module

Now that we have the module's memory image, it is time to analyze what the module actually does: is it
a normal module or an LKM trojan?

----[ 3.1 BFD overview

BFD, the Binary File Descriptor library, is a binary-file layer that supports many architectures and
file types. On top of it you can write tools that handle various file formats (ELF, COFF, a.out, ...);
objdump is one example. When BFD opens a file, it uses the magic number to determine the file type,
reads the corresponding file header, and builds a set of canonical objects that present a uniform
interface to the calling layer.
BFD is thus a good starting point for writing multi-format tools, but it is not perfect. Because it
follows the specifications strictly, it handles malformed files poorly. In such cases we may need
other tools, such as Fenris [3].

----[ 3.2 Basic Intel machine instruction format

Format of an Intel x86 machine instruction [4]:
    [prefix] opcode [ModR/M] [SIB] [displacement] [immediate]

The opcode is required; everything else is optional. The prefix part consists of 0 to 4 bytes, which
may be:

The following are the allowable instruction prefix codes:
    F3h    REP prefix (used only with string instructions)
    F3h    REPE/REPZ prefix (used only with string instructions)
    F2h    REPNE/REPNZ prefix (used only with string instructions)
    F0h    LOCK prefix
The following are the segment override prefixes:
    2Eh    CS segment override prefix
    36h    SS segment override prefix
    3Eh    DS segment override prefix
    26h    ES segment override prefix
    64h    FS segment override prefix
    65h    GS segment override prefix
The following are the size override prefixes:
    66h    Operand-size override
    67h    Address-size override

So when disassembling a piece of machine code, we start at the first byte and skip all prefixes. Next
comes an opcode of 1 or 2 bytes (a two-byte opcode starts with a specific escape byte, for example
0x0F). Depending on the opcode, the instruction may also need the [ModR/M] [SIB] [displacement]
[immediate] fields. See the Intel manual for details.
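
The prefix-skipping step can be sketched in a few lines of C. The function names are hypothetical, and
the opcode length follows the simplified 1-or-2-byte rule described above; this is not how i386-dis.c
is structured internally.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if b is one of the prefix bytes listed above. */
static int is_prefix(uint8_t b)
{
    switch (b) {
    case 0xF0: case 0xF2: case 0xF3:            /* LOCK, REPNE, REP/REPE */
    case 0x2E: case 0x36: case 0x3E: case 0x26: /* CS, SS, DS, ES overrides */
    case 0x64: case 0x65:                       /* FS, GS overrides */
    case 0x66: case 0x67:                       /* operand/address size */
        return 1;
    default:
        return 0;
    }
}

/* Number of leading prefix bytes in code[0..len). */
size_t count_prefixes(const uint8_t *code, size_t len)
{
    size_t n = 0;

    while (n < len && is_prefix(code[n]))
        n++;
    return n;
}

/* Length of the opcode at code[0], which must point past the prefixes:
 * 2 if it starts with the 0x0F escape byte, otherwise 1; 0 if len is 0. */
size_t opcode_length(const uint8_t *code, size_t len)
{
    if (len == 0)
        return 0;
    return (code[0] == 0x0F && len >= 2) ? 2 : 1;
}
```

For the byte sequence F3 A5 (rep movsl) count_prefixes returns 1; for 66 2E 0F 1F ... it returns 2 and
opcode_length on the remainder returns 2.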

----[ 3.3 Building a simple disassembler with the binutils package

Writing a disassembler from scratch would be a lot of work, because Intel has a large number of
instructions. Instead we can take existing open source code and modify it slightly to suit our needs.
The point of open source is never to reinvent the wheel and never to solve the same problem twice.

We cannot use the standard BFD library interface, because the ELF header information is lost in the
dumped image. We therefore have to call BFD's low-level functions to disassemble instructions, and
handle symbols ourselves.

To disassemble instructions we can use the print_insn function in binutils/opcodes/i386-dis.c [5].
For symbol resolution we can use the contents of the system's /boot/System.map file.
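
Each line of System.map has the form "address type name". A minimal parsing sketch follows; struct
ksym and parse_sysmap_line are hypothetical names, and the 127-character name limit is an arbitrary
choice for this example.

```c
#include <stdio.h>

struct ksym {
    unsigned long addr;     /* symbol address */
    char type;              /* nm-style type letter, e.g. 'T' for text */
    char name[128];
};

/* Parses one System.map line such as "c01188c7 T printk".
 * Returns 1 on success, 0 if the line does not have three fields. */
int parse_sysmap_line(const char *line, struct ksym *out)
{
    return sscanf(line, "%lx %c %127s",
                  &out->addr, &out->type, out->name) == 3;
}
```

Reading the file line by line with fgets and collecting the results in an array gives the table that
the address-printing callback can search.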

Because print_insn is a low-level BFD function, it uses many BFD-related data structures, so we have
to construct some of them ourselves. From an analysis of print_insn, only the following structure
needs to be set up:

    struct disassemble_info myinfo;
    static void info_init(void)
    {
        myinfo.mach = bfd_mach_i386_i386;
        myinfo.disassembler_options = "i386,att,addr32,data32";
        myinfo.fprintf_func = (fprintf_ftype) fprintf;
        myinfo.stream = stdout;
        myinfo.read_memory_func = my_read_func;
        myinfo.memory_error_func = my_error_func;
        myinfo.print_address_func = my_address_func;
        myinfo.buffer_vma = base_addr;
        myinfo.buffer_length = dis_size;
        myinfo.buffer = malloc(dis_size);
    }
my_read_func is a user-defined function that reads instruction bytes, my_error_func is the error
handler, and my_address_func prints an address; there we can look up the symbol table to get the
symbol name. base_addr is mod->module_core.

A simple implementation of my_address_func:
    void my_address_func(bfd_vma memaddr,
                         struct disassemble_info *myinfo)
    {
        char *p;

        p = NULL;
        myinfo->fprintf_func(myinfo->stream, "0x%lx", (unsigned long) memaddr);
        /* Append the symbol name if the address is a known symbol. */
        p = find_symbol(root, memaddr);
        if (p)
            myinfo->fprintf_func(myinfo->stream, " <%s>", p);
        return;
    }
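
The article does not show find_symbol or the root structure it searches. As one possible stand-in,
the following sketch does an exact-match binary search over an array sorted by address; sym_entry and
lookup_symbol are hypothetical names, and the original may well use a different data structure (the
name root suggests a tree).

```c
#include <stddef.h>

struct sym_entry {
    unsigned long addr;
    const char *name;
};

/* Exact-match lookup in a table sorted by ascending address.
 * Returns the symbol name, or NULL if addr is not a known symbol. */
const char *lookup_symbol(const struct sym_entry *tab, size_t n,
                          unsigned long addr)
{
    size_t lo = 0, hi = n;              /* search window is [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (tab[mid].addr == addr)
            return tab[mid].name;
        if (tab[mid].addr < addr)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}
```

An exact match is enough for call targets such as printk; resolving addresses inside a function would
need a nearest-lower-symbol search instead.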

A simple implementation of my_read_func:
    int my_read_func(bfd_vma memaddr,
                     bfd_byte *myaddr,
                     unsigned int length,
                     struct disassemble_info *myinfo)
    {
        unsigned long bytes;

        /* Offset of the requested address inside the dumped image. */
        bytes = memaddr - myinfo->buffer_vma;

        memcpy(myaddr, myinfo->buffer + bytes, length);
        return 0;
    }
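
The simple version above trusts that every request falls inside the dumped image. A more defensive
variant could check the bounds first. The sketch below uses a plain stand-in structure instead of
struct disassemble_info so that it compiles without the binutils headers; image_window and
read_window are hypothetical names, and returning EIO for an out-of-range request is an assumption
modelled on the errno-style return value of read_memory_func.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct image_window {
    unsigned long base;             /* address of buf[0], i.e. mod->module_core */
    size_t length;                  /* number of bytes in buf */
    const unsigned char *buf;       /* contents of dump.dat */
};

/* Copies n bytes starting at address addr into dst.
 * Returns 0 on success, EIO if the range is not fully inside the image. */
int read_window(const struct image_window *w, unsigned long addr,
                unsigned char *dst, size_t n)
{
    unsigned long off;

    if (addr < w->base)
        return EIO;
    off = addr - w->base;
    if (off > w->length || n > w->length - off)
        return EIO;
    memcpy(dst, w->buf + off, n);
    return 0;
}
```

With such a check, a branch target or operand fetch that runs past the end of dump.dat is reported to
the error handler instead of reading unrelated memory.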

Disassembling test.ko gives the following result (fragment):

    ...
    <8881000+59>    mov    %edx,(%eax)
    <8881000+61>    movl   $0x200200,0x4(%ecx)
    <8881000+68>    movl   $0x100100,(%ecx)
    <8881000+74>    pushl  0x8881488
    <8881000+80>    push   $0x888108d
    <8881000+85>    jmp    0x8881072
    <8881000+87>    mov    %edx,%ecx
    <8881000+89>    mov    (%edx),%eax
    <8881000+91>    prefetchnta (%eax)
    <8881000+94>    nop
    <8881000+95>    cmp    $0x8881504,%edx
    <8881000+101>   jne    0x8881016
    <8881000+103>   pushl  0x8881488
    <8881000+109>   push   $0x88810ad
    <8881000+114>   call   0x21188c7 <printk>
    <8881000+119>   pop    %eax
    <8881000+120>   xor    %eax,%eax
    <8881000+122>   pop    %edx
    <8881000+123>   pop    %esi
    <8881000+124>   pop    %edi
    <8881000+125>   ret
    ...

----[ 3.4 Notes

1. The bfd.h header included by i386-dis.c does not exist in the source package. It is generated by
   running ./configure && make; running ./configure alone is not enough.
2. As mentioned above, the memory of functions marked __init and of sections whose names start with
   .init is freed after the module has been inserted and initialized, so this information cannot be
   obtained from the dump.

-----------------------------------------------------------------------------

----[ 4 Conclusion

At this point we have implemented basic module extraction and disassembly, but there are still many
shortcomings and points worth improving.

o  By default the dump is written to disk, which should be avoided in legal forensics (disk access
   should be kept to a minimum). One solution is to have the kernel send the raw data in packets to
   another host on the network.

o  Disassembly still depends on the low-level BFD code, which has no immunity to malformed
   instructions; such instructions lead to wrong disassembly results.

o  If symtab and strtab are kept in the module image, that information helps disassembly. We have
   not made use of it here.

Further disassembler features can be added as needed.

-----------------------------------------------------------------------------

----[ 5 References

[1] madsys, http://www.phrack.org/show.php?p=61&a=3
    Improved versions:
    http://www.linuxforum.net/forum/gshowflat.php?Cat=&board=Security&number=512152&page=0&view=collapsed&SB=5&O=All&fpart=
    http://www.linuxforum.net/forum/showthreaded.php?Cat=&board=Security&Number=437327&page=2&view=collapsed&SB=5&O=all
    http://www.linuxforum.net/forum/gshowflat.php?Cat=&board=Security&number=520818&page=0&view=collapsed&SB=5&O=All&fpart=
[2] Linux kernel source code, www.kernel.org
[3] http://lcamtuf.coredump.cx/fenris/devel.shtml
[4] Intel instruction set manual
[5] binutils source code, http://www.gnu.org/software/binutils/
 
