A hidden threat to *nix web servers
From: https://www.virusbtn.com/virusbulletin/archive/2014/07/vb201407-Mayhem
0x01 Introduction
Infections of websites, and even of entire web servers, are becoming more and more common. Such infections are typically used to intercept communications, for black hat SEO, for leeching downloads, and so on. In most cases the malware consists of relatively simple PHP scripts, but in the last two years many more sophisticated malware families have been discovered. Mayhem is a multi-purpose modular bot for web servers. Our team studied the bot to gain an understanding not just of the malware client but also of some of its C&C servers, which allowed us to collect some statistics.
This article should be considered a supplement to the article published by the Malware Must Die team (reference 1). We came across the Mayhem bot in April 2014, and this article is the result of our own independent research. The only other publication about Mayhem that we have found is reference 2. During our research we also found that Mayhem is a continuation of the larger 'Fort Disco' brute force campaign (described in reference 3).
0x02 Malware representation
Initially, the malware takes the form of a PHP script. We analyzed a version of the PHP dropper with the SHA256 hash b3cc1aa3259cd934f41537e6371f270c23edf96d2c0801728b0109dd07a0a035. The VirusTotal results for this script are shown in Table 1.
VirusTotal results on three scan dates: 3/54, 3/51, 3/52
Table 1: VirusTotal results for the PHP dropper.
After execution, the script kills all '/usr/bin/host' processes, identifies the system architecture (x64 or x86) and the system type (Linux or FreeBSD), and then drops a malicious shared object named 'libworker.so'. The script also defines a variable 'au', which holds the full URL of the script being executed. The first part of the PHP script is shown in Figure 1.
Figure 1: The first part of the PHP dropper.
The PHP dropper then creates a shell script named '1.sh', the contents of which are shown in Figure 2. The shell script also sets the environment variable 'au', with the same value as the variable defined in the PHP script.
Figure 2: Contents of the '1.sh' script.
The PHP dropper then runs the shell script with the command 'at now -f 1.sh', which adds a scheduled job. The dropper waits for up to five seconds and then deletes the scheduled job. If the 'at' command fails, the dropper runs the '1.sh' script directly. This part of the dropper code is shown in Figure 3.
Figure 3: The last part of the PHP dropper.
0x03 Shared object initialization
The LD_PRELOAD technique causes a shared object to be loaded before any other library, which makes it easy to hook functions. If such a shared object overrides a standard library function, it intercepts all calls to that function. The malicious sample contains its own 'exit' function, so when '/usr/bin/host' calls 'exit', the malicious function is executed instead of the original one. During execution of the hooked 'exit' function, an additional initialization function is called; its workflow is shown in Figure 4. The initialization performs the following steps:
• An ELF file containing only the 'exit' function is dropped.
• The process forks; the child process runs the ELF file and finishes its execution.
• The parent process performs further initialization: it tries to connect to the Google DNS service (IP address 8.8.8.8), decrypts and parses the configuration, and obtains various system parameters.
Figure 4: Workflow of the initialization function.
After initialization, the shared object file is deleted from the hard disk. The malware then tries to open the file that contains the hidden file system and map it into memory, after which the hidden file system is initialized. The process then forks: the parent process exits and the child process continues to execute. Figure 5 shows a high-level workflow of the hooked 'exit' function. The path that is followed on successful execution is marked in red in the flowchart. As you can see, the successful path does not stay entirely within either the parent or the child process. We assume that this is an anti-debugging technique aimed at those who trace only the child process, or only the parent process, after a fork.
Figure 5: High-level workflow of the hooked 'exit' function.
After these steps, the child process (the only one still alive) runs the main loop of the malware. The malware waits for the time set in the configuration and then runs the function that does the actual work.
0x04 Main loop function
The function first opens a socket for communication with the C&C server. It then checks a flag that records whether information about the infected host has already been sent to the C&C during the current session, that is, since the malware was started. If the information has already been sent successfully, the malware sends a ping and then receives and executes commands from the C&C.
If the information has not yet been delivered, the malware prepares an HTTP packet containing the output of the 'uname -a' command, the architecture of the infected system, and information about the privileges of the user that the process is running as. After sending this packet, the malware reads the C&C response and exits the function if an error occurs. If everything works, the malware updates the flag and tries to read and execute any further commands in the C&C response. A high-level workflow of the main loop function is shown in Figure 6.
Figure 6: High-level workflow of the main loop function of the shared object.
While working, the malware maintains two queues and four lists. One queue holds input strings (strings received from the C&C server), and the other holds output strings (strings to be sent to the C&C server). The first list stores the addresses of the plug-ins' working functions. The second stores the addresses of functions that process data before it is written to the socket (that is, data being sent to the C&C). The third stores the addresses of functions that process data read from the socket (that is, data received from the C&C). The fourth stores the addresses of functions that process strings from the string queue. Figure 7 shows how these queues and lists are used in the malware's workflow.
Figure 7: Workflow for reading data from the C&C server.
Figure 8 shows how the malware processes tasks.
Figure 8: Workflow for processing strings with plug-ins.
0x05 C&C commands
Seven different commands are used for communication between the C&C server and the malware. They can be divided into two groups: input commands (C&C to bot) and output commands (bot to C&C). All commands travel in HTTP POST requests and their responses: output commands are sent in the HTTP POST request, and input commands arrive in the HTTP response to that request.
'R' command (output)
By sending this command, the malware notifies the C&C that it has been loaded successfully and is ready to work. If the web server is running with root privileges, the 'R' command sent to the C&C has the following format:
R,20130826,<system architecture: 64 or 32>,<EI_OSABI value from the ELF header of '/usr/bin/host'>,ROOT,<output of 'uname -a'>
If the web server is running with limited privileges, the command is the same except that 'ROOT' is replaced by the value of getenv('au'), that is, the URL of the PHP script that launched the malware. If everything is in order, the C&C server replies with 'R,100'.
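For analysts who capture this traffic, the 'R' report can be split into its fields with a few lines of Python. This is only a sketch based on the format given above: the class and field names (RReport, marker, origin and so on) are our own labels, the sample line is made up, and the code assumes that only the trailing 'uname -a' field may contain commas.

```python
from typing import NamedTuple


class RReport(NamedTuple):
    marker: str      # constant second field, "20130826" in the format above
    arch_bits: int   # 64 or 32
    ei_osabi: str    # EI_OSABI value from the ELF header of /usr/bin/host
    origin: str      # "ROOT", or the URL of the PHP script for a non-root server
    uname: str       # output of 'uname -a'


def parse_r_command(line: str) -> RReport:
    """Split an 'R' report into its six comma-separated fields."""
    # maxsplit=5 keeps any commas inside the trailing uname output intact
    # (assumption: the earlier fields never contain commas).
    parts = [p.strip() for p in line.strip().split(",", 5)]
    if len(parts) != 6 or parts[0] != "R":
        raise ValueError("not a well-formed 'R' command")
    _, marker, arch, osabi, origin, uname = parts
    if arch not in ("32", "64"):
        raise ValueError(f"unexpected architecture field: {arch!r}")
    return RReport(marker, int(arch), osabi, origin, uname)


# Hypothetical sample line, not taken from real traffic.
report = parse_r_command("R,20130826,64,0,ROOT,Linux host 3.2.0 x86_64 GNU/Linux")
print(report.arch_bits, report.origin)
```

Whitespace around each field is stripped so that the parser tolerates both compact and spaced renderings of the command.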
'G' command (input)
This command is sent by the C&C server to the malware. It has the following format:
G,<task ID>
If the current task ID differs from the received task ID, the malware finishes the task that is currently running and starts a number of new worker threads. The number of worker threads is set by the 'L' command.
'F' command (output)
This command is used to request files from the server. If the malware wants to request a new file, it sends the following command:
F,<file name>,0
If the malware wants to check whether a file it already holds has been updated, it sends:
F,<file name>,<CRC32 checksum of the file>
If the file is not found on the C&C server, the server responds with:
F,404,<file name>
If the file has not changed since it was received, the C&C responds with:
F,304,-
If a new or updated file is found, the server responds with:
F,200,<file name>,<BASE64-encoded file data>
After receiving the command containing data, the malware decodes the base64 data and writes it to the hidden file system on the hard disk. It then tries to determine whether the received file is a plug-in. If it is, the malware verifies the plug-in's CRC32 checksum, which is stored in an unused field of the ELF header, and then loads the plug-in into memory.
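The three possible replies to an 'F' request can be told apart as in the following sketch. The function name and the returned dictionary keys are our own, and the use of zlib.crc32 assumes that the bot's checksum is the common CRC-32 variant, which we have not confirmed; the sample reply is constructed for illustration.

```python
import base64
import zlib


def parse_f_response(line: str) -> dict:
    """Classify a C&C reply to an 'F' request and decode any file data."""
    # maxsplit=3 leaves the base64 payload (the fourth field) untouched.
    parts = [p.strip() for p in line.strip().split(",", 3)]
    if len(parts) < 3 or parts[0] != "F":
        raise ValueError("not a well-formed 'F' response")
    status = parts[1]
    if status == "404":
        # File not found on the C&C server.
        return {"status": 404, "name": parts[2]}
    if status == "304":
        # File unchanged since it was last received.
        return {"status": 304}
    if status == "200" and len(parts) == 4:
        data = base64.b64decode(parts[3])
        # Assumption: standard CRC-32, as implemented by zlib.
        return {"status": 200, "name": parts[2], "data": data,
                "crc32": zlib.crc32(data) & 0xFFFFFFFF}
    raise ValueError(f"unexpected 'F' response: {line!r}")


# Hypothetical reply carrying the five bytes b"hello".
reply = parse_f_response("F,200,example.txt,aGVsbG8=")
print(reply["name"], len(reply["data"]), hex(reply["crc32"]))
```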
The 'L' command (input) is used by the C&C server to configure the malware and to load plug-ins. If the C&C wants to configure the core module of the malware, it sends:
L,core,<number of worker threads>,<sleep timeout>,<socket timeout>
After receiving this command, the malware finishes all its worker threads and then updates the number of worker threads, the sleep timeout and the socket timeout. If the C&C wants the malware to load a plug-in, it sends:
L,<plug-in name>,<comma-separated plug-in parameters>
If the malware receives this command while another plug-in is running, the running plug-in is terminated and the new plug-in is looked up in the hidden file system. If it is not found there, the file containing the plug-in is requested from the C&C with the 'F' command. The plug-in is then loaded, initialized and run.
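The two forms of the 'L' command can be distinguished by their second field, as the sketch below shows. The dictionary keys are our own labels, and treating the three core parameters as decimal integers is an assumption; the text above does not state their encoding.

```python
def parse_l_command(line: str) -> dict:
    """Distinguish the 'core' form of the 'L' command from the plug-in form."""
    parts = [p.strip() for p in line.strip().split(",")]
    if len(parts) < 2 or parts[0] != "L":
        raise ValueError("not a well-formed 'L' command")
    if parts[1] == "core":
        if len(parts) != 5:
            raise ValueError("the core form takes exactly three parameters")
        # Assumption: all three values are decimal integers.
        threads, sleep_timeout, socket_timeout = (int(x) for x in parts[2:])
        return {"target": "core", "worker_threads": threads,
                "sleep_timeout": sleep_timeout, "socket_timeout": socket_timeout}
    # Any other second field is a plug-in name followed by its parameters.
    return {"target": "plugin", "name": parts[1], "params": parts[2:]}


# Hypothetical examples of both forms.
print(parse_l_command("L,core,8,60,30"))
print(parse_l_command("L,rfiscan,param1,param2"))
```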
The 'Q' command (input and output) is used to transfer working data from the C&C to the malware and vice versa. If the C&C wants to add strings to the malware's processing queue, it sends them in a 'Q' command; all these strings are added to the malware's input queue and are processed by the running plug-in. If the malware wants to upload the results of its work, it sends:
Q,<plug-in name>,<result string>
It then deletes these strings from its output queue.
The 'P' command (output) is used by the malware to report its current status to the C&C server. The command format is:
P,<task-running flag>,<number of worker threads>,<number of read/write requests from the server per second>,<total number of read/write operations on the server since the value was last set to 0>
'S' command (input)
If the malware receives this command, it finishes all the threads that are currently working, clears the input and output queues, and releases other system resources. After that, it is ready to process a new task.
To sum up, the commands are as follows:
Output commands:
• R: send report
• F: request file
• Q: send data
• P: report status
Input commands:
• G: run a new task
• L: load plug-in
• Q: send data
• S: terminate the current task
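The summary above maps naturally onto a small lookup table, which can be handy when annotating captured traffic. The table below simply restates that summary; the names COMMANDS and describe are our own, and the server's replies to an 'F' request are not modelled separately.

```python
# Command letter -> (directions, meaning), as listed in the summary above.
# "output" means bot to C&C; "input" means C&C to bot.
COMMANDS = {
    "R": (("output",), "send report"),
    "F": (("output",), "request file"),
    "Q": (("input", "output"), "send data"),
    "P": (("output",), "report status"),
    "G": (("input",), "run a new task"),
    "L": (("input",), "load plug-in"),
    "S": (("input",), "terminate the current task"),
}


def describe(line: str):
    """Return (letter, directions, meaning) for a raw command line."""
    letter = line.split(",", 1)[0].strip()
    if letter not in COMMANDS:
        raise ValueError(f"unknown command: {letter!r}")
    directions, meaning = COMMANDS[letter]
    return letter, directions, meaning


print(describe("G,42"))
```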
0x06 Configuration
The shared object stores its encrypted configuration in the data segment. The decryption key is also stored in the data segment. First, only the first eight bytes are decrypted, and the malware checks whether the last four of those bytes are equal to 0xDEADBEEF. If so, the first four bytes give the length of the encrypted data, and the rest of the ciphertext can then be decrypted. Figure 9 shows pseudocode for the decryption algorithm.
Figure 9: Decryption algorithm used by the malware.
We analyzed the code of this algorithm and found that it is an implementation of the XTEA encryption algorithm (reference 4) with 32 rounds (reference 5), operating in ECB mode (references 6 and 7). Figure 10 shows a sample of the decrypted configuration.
Figure 10: Sample of the decrypted configuration.
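The decryption procedure described above can be sketched as follows. The block functions are the standard 32-round XTEA routines; everything specific to Mayhem here is an assumption on our part, in particular the little-endian byte order, the 16-byte key layout, and the reading of the length field as a count of eight-byte blocks that includes the header block. The encryption routine is included only so that the sketch can be checked with a round trip.

```python
import struct

DELTA = 0x9E3779B9
MASK = 0xFFFFFFFF


def xtea_encrypt_block(v0, v1, key, rounds=32):
    """Standard XTEA encryption of one 64-bit block (two 32-bit words)."""
    s = 0
    for _ in range(rounds):
        v0 = (v0 + ((((v1 << 4) ^ (v1 >> 5)) + v1) ^ (s + key[s & 3]))) & MASK
        s = (s + DELTA) & MASK
        v1 = (v1 + ((((v0 << 4) ^ (v0 >> 5)) + v0) ^ (s + key[(s >> 11) & 3]))) & MASK
    return v0, v1


def xtea_decrypt_block(v0, v1, key, rounds=32):
    """Standard XTEA decryption of one 64-bit block (two 32-bit words)."""
    s = (DELTA * rounds) & MASK
    for _ in range(rounds):
        v1 = (v1 - ((((v0 << 4) ^ (v0 >> 5)) + v0) ^ (s + key[(s >> 11) & 3]))) & MASK
        s = (s - DELTA) & MASK
        v0 = (v0 - ((((v1 << 4) ^ (v1 >> 5)) + v1) ^ (s + key[s & 3]))) & MASK
    return v0, v1


def decrypt_config(blob: bytes, key: bytes) -> bytes:
    """Decrypt a configuration blob in ECB mode after checking the marker."""
    k = struct.unpack("<4I", key)  # assumption: little-endian 128-bit key

    def dec(block: bytes) -> bytes:
        v0, v1 = struct.unpack("<2I", block)
        return struct.pack("<2I", *xtea_decrypt_block(v0, v1, k))

    # Step 1: decrypt only the first eight bytes and check the marker.
    head = dec(blob[:8])
    n_blocks, marker = struct.unpack("<2I", head)
    if marker != 0xDEADBEEF:
        raise ValueError("marker 0xDEADBEEF not found; wrong key or data")
    # Step 2: decrypt the remaining blocks independently (ECB mode).
    # Assumption: n_blocks counts eight-byte blocks including the header.
    end = min(len(blob), n_blocks * 8)
    end -= end % 8
    return head + b"".join(dec(blob[i:i + 8]) for i in range(8, end, 8))
```

Because ECB mode processes each block independently, the header can be checked before any further work is done, which matches the two-step procedure in the text.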
All the samples analyzed have the same configuration format. The first part of the configuration contains the special marker and offsets to the rest of the configuration data. The format of the decrypted configuration is described in Table 2.
Offset  Size in bytes  Description
0       4              Number of eight-byte blocks in the configuration (in other words, the length of the configuration in eight-byte blocks)
4       4              Special marker 0xDEADBEEF
8       4              Offset to the C&C URL
12      4              Sleep time between executions of the main loop function of the malware
16      4              Size of the file mapping for the hidden file system
20      4              Offset to the name of the file that contains the hidden file system
Table 2: Description of the malware configuration.
As shown in Table 2, the C&C address is defined directly in the malware configuration, and no DGA (see note [3]) is used.
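Given a decrypted configuration, the header laid out in Table 2 can be unpacked as in the sketch below. The field names are our own, and three details are assumptions rather than findings: that the integers are little-endian, that the offsets are relative to the start of the configuration, and that the strings are NUL-terminated.

```python
import struct
from typing import NamedTuple


class ConfigHeader(NamedTuple):
    n_blocks: int        # offset 0: length of the configuration in 8-byte blocks
    marker: int          # offset 4: special marker 0xDEADBEEF
    cc_url_offset: int   # offset 8: offset to the C&C URL
    sleep_time: int      # offset 12: sleep time between main loop executions
    fs_map_size: int     # offset 16: size of the hidden file system mapping
    fs_name_offset: int  # offset 20: offset to the hidden file system file name


def parse_config_header(cfg: bytes) -> ConfigHeader:
    """Unpack the six 4-byte header fields (assumed little-endian)."""
    header = ConfigHeader(*struct.unpack_from("<6I", cfg, 0))
    if header.marker != 0xDEADBEEF:
        raise ValueError("marker 0xDEADBEEF not found")
    return header


def read_cstring(cfg: bytes, offset: int) -> str:
    """Read a NUL-terminated string (assumed encoding) at the given offset."""
    end = cfg.index(b"\x00", offset)
    return cfg[offset:end].decode("ascii", "replace")


# Hypothetical configuration built only to exercise the parser.
url = b"http://example.invalid/gate\x00"
cfg = struct.pack("<6I", 7, 0xDEADBEEF, 24, 60, 4096, 24 + len(url)) + url + b".sd0\x00"
hdr = parse_config_header(cfg)
print(read_cstring(cfg, hdr.cc_url_offset), read_cstring(cfg, hdr.fs_name_offset))
```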
0x07 Hidden file system
As mentioned earlier, the malware uses a hidden file system to store its files. The file system is contained in a single file that is created during initialization. The name of this file is defined in the configuration and is usually '.sd0'. To work with the file, an open-source library, 'FAT 16/32 File System Library', is used. However, the library is not used in its original form: some functions have been modified to support encryption. Each block is encrypted with 32 rounds of XTEA in ECB mode, and the encryption key differs from block to block. The hidden file system is used to store plug-ins and files containing strings to be processed: lists of URLs, user names, passwords, and so on. Figure 11 shows the contents of one file system instance:
Figure 11: Contents of a file system instance.
We developed a simple tool, based on the open-source library, that can decrypt and extract files from such a file system.
0x08 Plug-in analysis
As mentioned earlier, the malware is able to use plug-ins. During our research we found eight different plug-ins for the bot. The plug-ins and their configuration files are stored in the hidden file system. All the plug-ins described here were found in the wild. Each plug-in exports a structure that contains two special markers, pointers to the plug-in's functions, and a string containing the plug-in name. Each plug-in has at least two such pointers: one to the plug-in's initialization function and one to the function that performs de-initialization. The two markers in the structure are the constants 0xDEADBEEF and 20130826; we guess that the latter is the plug-in version. An example of such a structure is shown in Figure 12.
Figure 12: Example of a structure describing a plug-in.
Because all the plug-ins are stored in the hidden file system, none of them was detected by any anti-virus engine on VirusTotal.
rfiscan.so
SHA256 hash sum: 9efed12a67e5835c73df5882321c4cd2dd23e4a571e5f99ccd7ec13176ab12cb
This plug-in is used to find websites with a remote file inclusion (RFI) vulnerability. During initialization, the plug-in downloads a list of patterns and a list of websites to check. It then sends a specially crafted HTTP request to each site, attempting to include 'http://www.google.com/humans.txt', and analyzes the HTTP response. If the response contains the substring 'we can shake', the plug-in concludes that the website has an RFI vulnerability. Part of the pattern list is shown in Figure 13.
Figure 13: Part of the pattern list used by 'rfiscan.so' to find websites with RFI vulnerabilities.
The results are sent to the C&C server using the 'Q' command. The meanings of these commands are shown in Table 3.
Command        Description
Q,rfiscan,     An RFI vulnerability has successfully been found
Q,rfiscan,-    RFI vulnerabilities haven't been found
Table 3: Description of the 'Q' command for the 'rfiscan.so' plug-in.
wpenum.so
SHA256 hash sum: Sums
This plug-in is used to enumerate the user names of WordPress sites. Its working function receives a URL, transforms it, and then sends an HTTP request using the following query template:
<URL with the original query removed from the end>/?author=<user ID>
The user ID ranges from 0 to 5. If the HTTP response contains the substring 'Location:' and the target URL contains the substring '/author/', the user name is extracted from the target URL. The first user name found is sent to the C&C server using the 'Q' command. The meanings of these commands are shown in Table 4.
Command                 Description
Q,wpenum,,,             Username has successfully been found
Q,wpenum,no_matches     No username has been found
Q,wpenum,-              Connection failed
Table 4: Description of the 'Q' command for the 'wpenum.so' plug-in.
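The extraction step described for this plug-in, taking the user name from a redirect target that contains '/author/', can be reproduced when analyzing captured responses. The sketch below is our own reconstruction of that step, not the plug-in's code; the function name and the sample URLs are hypothetical.

```python
from typing import Optional
from urllib.parse import urlsplit


def username_from_location(location: str) -> Optional[str]:
    """Return the path segment that follows '/author/' in a redirect target."""
    path = urlsplit(location).path
    marker = "/author/"
    idx = path.find(marker)
    if idx == -1:
        return None  # the redirect does not point at an author page
    # Take everything after the marker up to the next slash.
    name = path[idx + len(marker):].split("/", 1)[0]
    return name or None


# Hypothetical redirect targets.
print(username_from_location("http://example.invalid/author/admin/"))
print(username_from_location("http://example.invalid/?p=1"))
```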
cmsurls.so
SHA256 hash sum: Signature
The working function of this plug-in receives a hostname, builds an HTTP GET request for '/wp-login.php', and then looks for the substring 'name="log"' in the response. In this way, the plug-in finds user login pages on sites based on the WordPress CMS. The result is sent to the C&C using the 'Q' command. The meanings of these commands are shown in Table 5.
Command        Description
Q,cmsurls,     URL for login page has successfully been found
Q,cmsurls,     URL for login page has not been found
Q,cmsurls,-    Connection failed
Table 5: Description of the 'Q' command for the 'cmsurls.so' plug-in.
bruteforce.so
SHA256 hash sum: Signature
This plug-in is used to brute force passwords for sites built on the WordPress and Joomla CMSs. It does not support HTTPS. During our research, we found a password dictionary used by this plug-in. The dictionary contains 17,911 passwords, ranging from 1 to 32 characters in length.
bruteforceng.so
SHA256 hash sum: 992c36b2fcc59117cf7285fa39a89415c62a56fe4f0a192a05a%e7a6dcdea6
This plug-in is also used to brute force website passwords but, unlike 'bruteforce.so', it supports HTTPS and regular expressions and can be configured to brute force any login page. An example of such a configuration is shown in Figure 14.
Figure 14: An example of a 'bruteforceng.so' plug-in configuration.
We analyzed other configurations of this plug-in and found that it is also used to brute force credentials for the DirectAdmin control panel.
ftpbrute.so
SHA256 hash sum: Digest
This plug-in is used to brute force FTP accounts.
crawlerng.so
SHA256 hash sum: Signature
This plug-in is used to crawl web pages and extract useful information. It obtains from the C&C server a list of websites to crawl, along with other parameters such as the crawl depth. The plug-in also supports HTTPS and uses the SLRE library (reference 10) to process regular expressions. It is very flexible; a configuration file for it is shown in Figure 15. As you can see, in this example the plug-in is used to find web pages related to pharmaceuticals.
Figure 15: A configuration file for the 'crawlerng.so' plug-in.
crawlerip.so
SHA256 hash sum: Signature
This plug-in is the same as 'crawlerng.so', except that it works from a list of IP addresses instead of a list of URLs.
0x09 Analysis of the C&C
During our research we found three C&C servers that were used to manage botnets. We managed to gain access to two of them and obtain some statistics. An overview of the C&C administration panel is shown in Figure 16. The page that allows the user to add tasks for the bots is shown in Figure 17.
Figure 16: Bot list displayed in the C&C administration panel.
Figure 17: Interface for adding tasks in the C&C.
Together, the two C&C servers controlled about 1,400 bots. The first botnet contained about 1,100 bots and the second about 300. At the time of our analysis, the bots were being used to brute force WordPress passwords. Figure 18 shows a brute force task, and Figure 19 shows the results of such tasks.
Figure 18: Brute force tasks in the control panel of the larger botnet.
Figure 19: Results of some of the botnet's brute force tasks.
The geographic distribution of infected servers in the larger botnet is shown in Figure 20. As you can see, the most heavily infected countries are the United States, Russia, Germany and Canada.
Figure 20: Geographic distribution of infected servers in the larger botnet. The deeper the blue, the more infected servers.
The third C&C server was located by the Malware Must Die team (reference 1). By the time of our analysis it had been taken down, so we analyzed the two C&C servers that were still running. In addition to the main page, the source code contains two additional PHP scripts: config.php and update.php. The first script contains configuration data: database credentials, the MD5 hash of the administration panel password, the maximum time allowed for a task, the bot wake-up time, and so on. Part of this script is shown in Figure 21.
Figure 21: Part of the C&C configuration data.
The update.php script is used to wake up bots. It visits an idle bot and runs the PHP script described earlier in the malware representation section. We also found that the C&C server supports a number of plug-ins that we did not find in the wild. For example, one plug-in exploits the recently disclosed 'Heartbleed' vulnerability and collects information from vulnerable servers. The code describing all the available plug-ins is shown in Figure 22.
Figure 22: Code listing a number of plug-ins that we did not find in the wild.
The C&C uses MySQL and memcached (if available) for data storage, while the plug-ins are stored on the hard disk. We also found that the C&C script code contains a number of security issues, but a description of those vulnerabilities is beyond the scope of this article.
0x10 Comparison with other malware families
During our analysis, we found some features that Mayhem has in common with other *nix malware. The malware is similar to 'Trololo_mod' and 'Effusion' (reference 11), two injectors for Apache and Nginx servers respectively. All three malware families share the following:
• The configuration uses the same format.
• XTEA encryption in ECB mode is used.
• The 0xDEADBEEF marker is widely used in the configuration and in other parts of the code.
• The ELF headers of the shared objects are corrupted in the same way.
Despite the lack of conclusive evidence, we suspect that all three malware families were developed by the same gang.
Having completed this study, we can say that botnets built from *nix web servers are becoming increasingly popular, in line with current trends in malware. Why? We believe the reasons are as follows:
• Web server botnets offer a unique monetization model based on traffic redirection, leeching downloads and black hat SEO.
• Web servers have good uptime, good network channels and better performance than ordinary PCs.
• In the *nix world, automatic update technologies are not widely used, especially in comparison with desktop computers and smartphones. Most webmasters and system administrators have to update their software manually and test that their infrastructure still works correctly. Professional maintenance is very expensive for ordinary websites, and their owners often have no opportunity to arrange it. This means that it is easy for attackers to find vulnerable web servers and add them to a botnet.
• In the *nix world, anti-virus technologies are not widely used. Many operators do not provide proactive defense mechanisms or process memory checking modules. What is more, an ordinary webmaster usually does not want to spend time reading the software documentation and resolving the performance problems that may arise.
Mayhem is a very interesting and sophisticated piece of malware with a flexible and complex architecture. We hope that our research will help the security community to combat such threats.
0x11 Acknowledgements
We would like to thank Fraser Howard and Charles McCathie Nevile for their comments and suggestions which helped us improve this article.
0x12 References
1. http://blog.malwaremustdie.org/2014/05/elf-shared-so-dynamic-library-malware.html
2. http://sysadminblog.net/2013/11/fake-wordpress-plug-ins
3. Fort Disco Bruteforce Campaign. http://www.arbornetworks.com/asert/2013/08/fort-disco-bruteforce-campaign
4. Wheeler, D.; Needham, R. Correction to XTEA. http://www.movable-type.co.uk/scripts/xxtea.pdf
5. http://en.wikipedia.org/w/index.php?title=XTEA&oldid=558387953
6. Wikipedia. Block cipher mode of operation. http://en.wikipedia.org/w/index.php?title=Block_cipher_mode_of_operation&oldid=582012907
7. Schneier, B. Applied Cryptography. John Wiley & Sons, 1996.
8. http://ultra-embedded.com/fat_filelib
9. https://github.com/freeoks/SD0_reader
10. http://slre.sourceforge.net/
11. Effusion - a new sophisticated injector for Nginx web servers. https://www.virusbtn.com/virusbulletin/archive/2014/01/vb201401-Effusion
12. http://www.linuxjournal.com/article/7795
0x13 Notes
[1] Bot definition: Each such compromised device, known as a "bot", is created when a computer is penetrated by software from a malware (malicious software) distribution (from http://en.wikipedia.org/wiki/Botnet)
[2] C&C definition: This server is known as the command-and-control (C&C) server (from http://en.wikipedia.org/wiki/Botnet)
[3] DGA definition: Domain generation algorithm (from http://en.wikipedia.org/wiki/Domain_generation_algorithm)