Unix ELF File Format and Virus Analysis

Source: Internet
Author: User

★Introduction

This article describes how Unix viruses work and how they are implemented, along with the ELF file format. It briefly covers Unix virus detection and anti-detection techniques and gives some examples for the Linux/i386 architecture. Readers need some basic Unix programming experience and should be able to read Linux/i386 assembly language. Familiarity with ELF itself helps.

This article does not present any practical virus programming technique; it simply applies virus principles to the UNIX environment. We do not intend to introduce the ELF specification from scratch. If you are interested, please read the ELF specification on your own.

★Infecting ELF files

A process image contains a "text segment" and a "data segment". The text segment's memory protection is r-x, so self-modifying code generally cannot be used in the text segment. The data segment's memory protection is rw-.

A segment does not need to be an integer multiple of the page size; padding fills the remainder.

Key:

[...]  a complete page
m      memory used
p      padding

Page number
#1  [ppppmmmmmmmmmmmm]  \
#2  [mmmmmmmmmmmmmmmm]   |-- a segment
#3  [mmmmmmmmmmmmpppp]  /

A segment is not required to span multiple pages, so single-page segments are allowed.

Page number
#1  [ppppmmmmmmmmpppp]  <-- a segment

Typically, the data segment does not need to start on a page boundary, while the text segment must start on a page boundary. The memory layout of a process image may look like this:

Key:

[...]  a complete page
t      text segment content
d      data segment content
p      padding

Page number
#1  [tttttttttttttttt]  <-- text segment content
#2  [tttttttttttttttt]  <-- text segment content
#3  [ttttttttttttpppp]  <-- text segment content (partial)
#4  [ppppdddddddddddd]  <-- data segment content (partial)
#5  [dddddddddddddddd]  <-- data segment content
#6  [ddddddddddddpppp]  <-- data segment content (partial)

Pages 1, 2, and 3 constitute the text segment.

Pages 4, 5, and 6 constitute the data segment.
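
The padding shown in these diagrams is simple page arithmetic. The following is a minimal sketch, not part of the original article, that computes the leading and trailing padding of a segment; the names seg_padding and segment_padding are hypothetical, and the page size is assumed to be a power of two such as 4096.

```c
#include <assert.h>
#include <stdint.h>

/* Padding around a segment that occupies [vaddr, vaddr + memsz) when
 * memory is handed out in whole pages. Hypothetical helper type. */
typedef struct {
    uint32_t lead;   /* padding bytes before the first used byte */
    uint32_t trail;  /* padding bytes after the last used byte   */
    uint32_t pages;  /* number of pages the segment touches      */
} seg_padding;

static seg_padding segment_padding(uint32_t vaddr, uint32_t memsz,
                                   uint32_t page_size)
{
    /* Assumption: page_size is a non-zero power of two. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    assert(memsz != 0);

    uint32_t mask  = page_size - 1;
    uint32_t start = vaddr & ~mask;                   /* round down */
    uint32_t end   = (vaddr + memsz + mask) & ~mask;  /* round up   */

    seg_padding p;
    p.lead  = vaddr - start;
    p.trail = end - (vaddr + memsz);
    p.pages = (end - start) / page_size;
    return p;
}
```

In terms of the diagrams, lead is the run of p before the first used byte on the first page, and trail is the run of p after the last used byte on the last page.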

From now on, for the sake of simplicity, the segment diagrams show a single page per segment, as below:

Page number
#1  [ttttttttttttpppp]  <-- text segment
#2  [ppppddddddddpppp]  <-- data segment

On i386, the stack segment is always placed after the data segment, leaving the data segment enough room to grow. The stack is generally located at the top of memory and grows toward lower addresses. In the ELF file, the loadable segments are physically present in the file:

ELF header
.
.
Segment 1  <-- text segment
Segment 2  <-- data segment
.
.

Each segment has a virtual address that marks where it starts, and code may reference absolute addresses within the segment.

To insert parasitic code, you must ensure that the original code is not damaged. Therefore, you need to expand the memory required for the corresponding segment.

In fact, the text segment contains not only code but also the ELF header and other data such as dynamic linking information. Extending the text segment directly to insert parasitic code raises many problems, such as code that references absolute addresses. One option is to leave the text segment unchanged and add an extra segment to hold the parasitic code. However, an extra segment looks suspicious and is easy to discover.

Extending the text segment toward higher addresses, or the data segment toward lower addresses, may cause the segments to overlap. Relocating a segment in memory causes problems for code that references absolute addresses. Extending the data segment toward higher addresses is not a good idea either: some UNIX systems fully implement memory protection, so code in the data segment cannot be executed.

The page padding at segment boundaries provides a place to insert parasitic code, as long as there is enough room. Inserting the parasitic code there does not destroy the contents of the original segment and does not require relocation. The padding at the end of the text segment is a good location, and the result looks like this:

Key:

[...]  a complete page
v      parasitic code
t      text segment content
d      data segment content
p      padding

Page number
#1  [ttttttttttttvvpp]  <-- text segment
#2  [ppppddddddddpppp]  <-- data segment

A more complete ELF executable layout is as follows:

ELF header
Program header table
Segment 1
Segment 2
Section header table
Section 1
.
.
Section N

Typically, additional sections (those without corresponding segments) are used to store debugging information, symbol tables, and so on.

Here is some material from the ELF specification:

The ELF header is located at the beginning of the file. It holds a "road map" that describes the file's organization. Sections hold the bulk of the linking information, such as symbol tables and relocation information.

If a "program header table" exists, it tells the operating system how to create a process image (that is, how to execute the program).

An executable file must have a program header table; relocatable files do not need one. The "section header table" describes the file's sections. Each section has an entry in the table, which contains information such as the section name and size.

Files used during linking must have a section header table. Other object files may or may not have one.

After the parasitic code is inserted, the ELF file layout is as follows:

ELF header
Program header table
Segment 1 - text segment (host code)
          - parasitic code
Segment 2
Section header table
Section 1
.
.
Section N

Parasitic code must be physically inserted into the ELF file, and text segments must be extended to include new code.

The following definitions are from /usr/include/elf.h.

/* The ELF file header.  This appears at the start of every ELF file.  */

#define EI_NIDENT (16)

typedef struct
{
  unsigned char e_ident[EI_NIDENT];  /* Magic number and other info */
  Elf32_Half    e_type;              /* Object file type */
  Elf32_Half    e_machine;           /* Architecture */
  Elf32_Word    e_version;           /* Object file version */
  Elf32_Addr    e_entry;             /* Entry point virtual address */
  Elf32_Off     e_phoff;             /* Program header table file offset */
  Elf32_Off     e_shoff;             /* Section header table file offset */
  Elf32_Word    e_flags;             /* Processor-specific flags */
  Elf32_Half    e_ehsize;            /* ELF header size in bytes */
  Elf32_Half    e_phentsize;         /* Program header table entry size */
  Elf32_Half    e_phnum;             /* Program header table entry count */
  Elf32_Half    e_shentsize;         /* Section header table entry size */
  Elf32_Half    e_shnum;             /* Section header table entry count */
  Elf32_Half    e_shstrndx;          /* Section header string table index */
} Elf32_Ehdr;

e_entry stores the virtual address of the program entry point.

e_phoff is the offset of the program header table in the file. To read the program header table, you therefore call lseek() to position the file at that offset. e_shoff is the offset of the section header table in the file. This table is located at the end of the file, so after parasitic code is inserted at the end of the text segment, e_shoff must be updated to point to the new offset.
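
As an illustration of how e_phoff, e_phentsize and e_phnum locate program headers, here is a minimal read-only sketch that works on a file image already loaded into memory, rather than on a file descriptor with lseek(). The function names read_ehdr32 and read_phdr32 are hypothetical, and the sketch assumes a Linux system that provides <elf.h> and a host with the same byte order as the file (little-endian for i386).

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies the ELF32 header out of buf. Returns 0 on success, or -1 when buf
 * is too short, lacks the ELF magic, or is not a 32-bit object. */
static int read_ehdr32(const unsigned char *buf, size_t len, Elf32_Ehdr *out)
{
    assert(buf != NULL && out != NULL);
    if (len < sizeof(Elf32_Ehdr))
        return -1;
    if (memcmp(buf, ELFMAG, SELFMAG) != 0)
        return -1;
    if (buf[EI_CLASS] != ELFCLASS32)
        return -1;
    memcpy(out, buf, sizeof(*out));
    return 0;
}

/* Copies program header number i out of buf. Returns 0 on success, or -1
 * when i is out of range or the entry would lie outside the buffer. */
static int read_phdr32(const unsigned char *buf, size_t len,
                       const Elf32_Ehdr *eh, unsigned i, Elf32_Phdr *out)
{
    assert(buf != NULL && eh != NULL && out != NULL);
    if (i >= eh->e_phnum || eh->e_phentsize < sizeof(Elf32_Phdr))
        return -1;
    /* 64-bit arithmetic so the offset computation cannot wrap around. */
    uint64_t off = (uint64_t)eh->e_phoff + (uint64_t)i * eh->e_phentsize;
    if (off + sizeof(Elf32_Phdr) > len)
        return -1;
    memcpy(out, buf + off, sizeof(*out));
    return 0;
}
```

Copying with memcpy avoids alignment problems that a direct pointer cast into the buffer could cause. The section header table can be walked the same way using e_shoff, e_shentsize and e_shnum.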

/* Program segment header.  */

typedef struct
{
  Elf32_Word    p_type;              /* Segment type */
  Elf32_Off     p_offset;            /* Segment file offset */
  Elf32_Addr    p_vaddr;             /* Segment virtual address */
  Elf32_Addr    p_paddr;             /* Segment physical address */
  Elf32_Word    p_filesz;            /* Segment size in file */
  Elf32_Word    p_memsz;             /* Segment size in memory */
  Elf32_Word    p_flags;             /* Segment flags */
  Elf32_Word    p_align;             /* Segment alignment */
} Elf32_Phdr;

Loadable segments (the text segment and the data segment) are identified by the p_type member of the program header, whose value is PT_LOAD (1). Like e_shoff in the ELF header, the p_offset of each segment located after the insertion point must be updated to the new offset once the parasitic code is inserted.

p_vaddr specifies the starting virtual address of the segment. Using p_vaddr as the base address, e_entry can be recalculated to specify where program execution starts. p_filesz and p_memsz are the sizes the segment occupies in the file and in memory, respectively. The two differ in cases such as the .bss section, which holds uninitialized data: we do not want uninitialized data to occupy file space, but the process image must still allocate enough memory for it.

The .bss section is located at the end of the data segment. Any location beyond the segment's file size is assumed to belong to this section.
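
The relationship between p_filesz and p_memsz, and the role of p_type, can be made concrete with a small read-only sketch. The helper names mem_only_bytes and find_text_phdr are hypothetical, and treating "the first PT_LOAD segment with the PF_X flag" as the text segment is a heuristic assumed here, not something the ELF specification guarantees.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Bytes of the segment that exist only in memory, i.e. the part beyond the
 * file image (where .bss lives for a data segment). */
static Elf32_Word mem_only_bytes(const Elf32_Phdr *ph)
{
    assert(ph != NULL);
    return ph->p_memsz > ph->p_filesz ? ph->p_memsz - ph->p_filesz : 0;
}

/* Index of the first loadable, executable segment among n program headers,
 * or -1 if there is none. Heuristic stand-in for "the text segment". */
static int find_text_phdr(const Elf32_Phdr *ph, int n)
{
    assert(ph != NULL || n == 0);
    for (int i = 0; i < n; i++) {
        if (ph[i].p_type == PT_LOAD && (ph[i].p_flags & PF_X))
            return i;
    }
    return -1;
}
```

For a typical data segment, mem_only_bytes returns the size of .bss; for a text segment it is normally zero.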

/* Section header.  */

typedef struct
{
  Elf32_Word    sh_name;             /* Section name (string tbl index) */
  Elf32_Word    sh_type;             /* Section type */
  Elf32_Word    sh_flags;            /* Section flags */
  Elf32_Addr    sh_addr;             /* Section virtual addr at execution */
  Elf32_Off     sh_offset;           /* Section file offset */
  Elf32_Word    sh_size;             /* Section size in bytes */
  Elf32_Word    sh_link;             /* Link to another section */
  Elf32_Word    sh_info;             /* Additional section information */
  Elf32_Word    sh_addralign;        /* Section alignment */
  Elf32_Word    sh_entsize;          /* Entry size if section holds table */
} Elf32_Shdr;

sh_offset specifies the offset of the section in the file.

To insert parasitic code at the end of the text segment, we must do the following:

* Update e_shoff in the ELF header
* Locate the text segment's program header
    * Modify p_filesz
    * Modify p_memsz
* For each phdr after the text segment's phdr
    * Modify p_offset
* For each shdr whose section offset is shifted by the inserted parasitic code
    * Modify sh_offset
* Physically insert the parasitic code into the file at text segment p_offset + p_filesz (original)

There is a big problem here. As the ELF specification points out, p_vaddr and p_offset must be congruent modulo the page size:

p_vaddr mod PAGE_SIZE = p_offset mod PAGE_SIZE

To meet this requirement:

* Increase e_shoff in the ELF header by PAGE_SIZE
* Locate the text segment's program header
    * Modify p_filesz
    * Modify p_memsz
* For each phdr after the text segment's phdr
    * Increase p_offset by PAGE_SIZE
* For each shdr whose section offset is shifted by the inserted parasitic code
    * Increase sh_offset by PAGE_SIZE
* Physically insert the parasitic code, padded to a full page, into the file at text segment p_offset + p_filesz (original)
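
The congruence rule quoted from the ELF specification is easy to check mechanically, which is useful when validating a file's program headers. The sketch below is an illustration only; phdr_is_congruent is a hypothetical name, and the page size is passed in as a parameter rather than taken from the system.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Returns 1 when the segment satisfies
 *     p_vaddr mod page_size == p_offset mod page_size
 * and 0 otherwise. */
static int phdr_is_congruent(const Elf32_Phdr *ph, Elf32_Word page_size)
{
    assert(ph != NULL && page_size != 0);
    return ph->p_vaddr % page_size == ph->p_offset % page_size;
}
```

Because (p_offset + PAGE_SIZE) mod PAGE_SIZE equals p_offset mod PAGE_SIZE, shifting a file offset by exactly one page keeps the rule satisfied, while shifting it by any amount that is not a multiple of the page size breaks it. That is why the offsets in the list are changed in whole-page units.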

We also need to modify the virtual address of the program entry point so that the parasitic code executes before the host code. At the end of the parasitic code, execution can then jump back to the host code's original entry point and continue normally.

* Increase e_shoff in the ELF header by PAGE_SIZE
* Patch the tail of the parasitic code so that it jumps back to the host code's original entry point
* Locate the text segment's program header
    * Modify e_entry in the ELF header to point to p_vaddr + p_filesz
    * Modify p_filesz
    * Modify p_memsz
* For each phdr after the text segment's phdr
    * Increase p_offset by PAGE_SIZE
* For the shdr of the last section in the text segment
    * Increase sh_size by the size of the parasitic code
* For each shdr whose section offset is shifted by the inserted parasitic code
    * Increase sh_offset by PAGE_SIZE
* Physically insert the parasitic code, padded to a full page, into the file at text segment p_offset + p_filesz (original)

A virus can traverse a directory tree at random, looking for files whose e_type equals ET_EXEC or ET_DYN to infect. These are executable files and shared libraries, respectively.

★Analyzing a Linux virus

The virus must not depend on any library: it avoids libc and uses the system call mechanism directly.

To dynamically allocate heap memory for the phdr table and the shdr table, it uses the brk system call. It obtains the address of a constant string with the same technique used in buffer overflow exploits.

Use gcc -S to compile the C code, then inspect and adjust the resulting assembly code.

Note: save the registers when entering the parasitic code and restore them when leaving it.

Use objdump -D to inspect the code and adjust the offsets that still need to be determined.

★Virus Detection

The viruses described here are easy to detect. The most obvious sign is that the program entry point is not in the usual section, or is not in any section at all. Cleaning a virus is similar to the infection process.

You can use objdump --all-headers to locate the program entry point easily, and objdump --disassemble-all to follow the code from the entry point.

The default program entry point is _start, but it can be changed at link time.
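
The entry-point check described in this section can be sketched in a few lines of C. This is only a heuristic under stated assumptions: the names section_containing and entry_is_suspicious are hypothetical, the section headers are assumed to be already loaded into an array, and "usual section" is approximated as "an allocated section flagged SHF_EXECINSTR". A binary whose section header table has been stripped would be flagged as well, so a positive result is a reason to look closer, not proof of infection.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Index of the allocated section whose address range contains addr,
 * or -1 when no section among the n headers contains it. */
static int section_containing(const Elf32_Shdr *sh, int n, Elf32_Addr addr)
{
    assert(sh != NULL || n == 0);
    for (int i = 0; i < n; i++) {
        if (!(sh[i].sh_flags & SHF_ALLOC) || sh[i].sh_size == 0)
            continue;
        if (addr >= sh[i].sh_addr && addr - sh[i].sh_addr < sh[i].sh_size)
            return i;
    }
    return -1;
}

/* Returns 1 when e_entry lies in no section at all, or in a section that
 * is not marked as containing executable instructions; 0 otherwise. */
static int entry_is_suspicious(const Elf32_Ehdr *eh,
                               const Elf32_Shdr *sh, int n)
{
    assert(eh != NULL);
    int i = section_containing(sh, n, eh->e_entry);
    return i < 0 || !(sh[i].sh_flags & SHF_EXECINSTR);
}
```

An entry point inside .text passes this check, while one that falls in a data-like section at the end of the text segment, or outside every section, does not.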

★Conclusion

Although not widespread, Unix viruses are indeed feasible.
