0x00 Introduction
After reading the many excellent articles that experts have posted to the Wooyun knowledge base, I felt rather humbled, so I worked hard to find a topic of my own, mainly in the hope of earning an invitation code so that I can keep learning.
For network maintenance staff, probably the biggest headache is a site that has been hacked and left with a backdoor, or even a server that has been taken over through privilege escalation. Attackers usually hide the backdoor somewhere inconspicuous, and checking files one by one is impractical, so this article explores possible approaches to automatic Webshell detection in PHP environments.
0x01 Keyword Detection
Files are scanned for common Webshell keywords, using regular expressions and similar techniques, to decide whether a file is a Webshell. This is a brute-force approach, and also the simplest and most traditional one. Such crude matching obviously produces a high false positive rate, and it fails to detect Webshells that have been encoded or otherwise transformed. The files it flags therefore still need manual verification by the maintenance staff. The scanning principle is: better to wrongly flag a hundred files than to let one slip through.
For example, the following code was obtained from a website and is reportedly the actual code of an online Webshell scanner:
#!php
class scan {
    private $directory = '.';
    private $extension = array('php');
    private $_files = array();
    private $filelimit = 5000;
    private $scan_hidden = true;
    private $_self = '';
    // The alternatives "Group", "special", "rights", "php\srebound" and
    // "shell\s?Enhanced Edition" appear to be translated renderings of
    // non-English signatures in the original list.
    private $_regex = '(preg_replace.*\/e|`.*?\$.*?`|\bcreate_function\b|\bpassthru\b|\bshell_exec\b|\bexec\b|\bbase64_decode\b|\bedoced_46esab\b|\beval\b|\bsystem\b|\bproc_open\b|\bpopen\b|\bcurl_exec\b|\bcurl_multi_exec\b|\bparse_ini_file\b|\bshow_source\b|cmd\.exe|kadot@ngs\.ru|Group|special|rights|php\srebound|shell\s?Enhanced Edition|wscript\.shell|Php\s?shell|eval\sphp\scode|udp1-fsockopen|xxddos|Send\sflow|fsockopen\(\'(UDP|TCP)|Syn\sflood)';
    private $_shellcode = '';
    private $_shellcode_line = array();
    private $_log_array = array();
    private $_log_count = 0;
    private $action = '';
    private $taskid = 0;
    private $_tmp = '';
    // ... (the excerpt ends here; the class methods are not shown)
}
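The keyword approach can be sketched in a few lines of C using POSIX regular expressions. This is only an illustration, not the scanner quoted above: the function name `looks_like_webshell` and the shortened keyword list are assumptions made for the example, and because POSIX extended regular expressions have no portable `\b`, word boundaries are approximated by requiring a non-identifier character before the name and an opening parenthesis after it.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `text` contains a call to one of the
 * listed functions, 0 if it does not, and -1 if the pattern fails to compile. */
int looks_like_webshell(const char *text)
{
    /* Illustrative subset of the signature list; not the full original regex. */
    static const char *pattern =
        "(^|[^A-Za-z0-9_])"
        "(eval|system|exec|shell_exec|passthru|popen|proc_open|base64_decode)"
        "[[:space:]]*\\(";
    regex_t re;
    int rc;

    assert(text != NULL);
    /* PHP function names are case-insensitive, hence REG_ICASE. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_ICASE | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

A legitimate file that calls `exec()` or `base64_decode()` is flagged just like a Webshell, which is exactly the false positive problem described above.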
0x02 Detecting Obfuscated Webshells
1. Information entropy
Any discussion of encrypted Webshells has to mention information theory. The basic idea of Shannon's information theory is to model an information source as a random variable or random vector, and to study information using probability theory and the theory of stochastic processes. An encoded Webshell file contains a lot of random content or special information, so it uses a wider spread of ASCII codes, and the entropy computed over those ASCII codes is larger. Entropy therefore measures how much more uncertain a Webshell is than an ordinary file.
Formula Description:
- n is the ASCII code. The character with ASCII code 127 (labeled a space here, although code 127 is DEL in standard ASCII) is considered meaningless and is excluded. Xn is the number of times the nth ASCII code appears in the current file, and S is the total number of characters in the current script file.
- The greater the entropy value Info(A), the more likely the file is a Webshell.
For more information about entropy, see: https://en.wikipedia.org/wiki/Entropy_%28information_theory%29
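The formula these definitions describe does not appear in the text. A reconstruction from the definitions, assuming the standard Shannon form, is given below. The base-2 logarithm and the range of n (extended ASCII codes 1 to 255, skipping 127) are assumptions; terms with Xn = 0 contribute nothing to the sum.

```latex
\[
\mathrm{Info}(A) = -\sum_{\substack{n=1 \\ n \neq 127}}^{255} p_n \log_2 p_n,
\qquad p_n = \frac{X_n}{S}
\]
```

Under these assumptions, a file that uses every allowed code equally often reaches the maximum of about 7.99 bits per character (the base-2 logarithm of 254), while a file made of a single repeated character has entropy 0.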
2. Index of coincidence (IC)
Here is another approach. Let x be a ciphertext string of length n, written as the set {x1, x2, ..., xn}. The index of coincidence of x is the probability that two elements drawn at random from x are identical.
Let n_i be the number of times character i appears in the ciphertext. The number of ways to draw two characters from the n characters is
$$\binom{n}{2} = \frac{n(n-1)}{2}$$
and the number of ways to draw a pair in which both characters are i is
$$\binom{n_i}{2} = \frac{n_i(n_i-1)}{2}$$
The ratio of the two is the probability that both characters drawn from x are i. Summing over all characters gives the index of coincidence:
$$IC(x) = \sum_i \frac{n_i(n_i-1)}{n(n-1)}$$
The ciphertext of an encrypted file is more random, so its index of coincidence is lower. An encoded Webshell resembles a random file, and a plaintext Webshell may also contain random-looking strings used for privilege escalation, or binary and hexadecimal sequences, so the extended ASCII set is used as the object of study. That is, the index of coincidence is computed over 254 characters (the extended ASCII codes minus ASCII 127). For a script file, the lower the index of coincidence, the more likely the file is a Webshell.
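The index of coincidence needs only per-character counts: it is the sum of n_i(n_i - 1) over all characters, divided by n(n - 1). The following is a minimal C sketch under the assumptions stated in the text (byte values 1 to 255 with 127 excluded). The function name `index_of_coincidence` is hypothetical, and counting only the retained bytes toward n is a choice made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Index of coincidence over byte values 1..255, skipping 127.
 * Returns a value in [0, 1]; fewer than two usable bytes yields 0. */
double index_of_coincidence(const unsigned char *buf, size_t len)
{
    size_t count[256] = {0};
    size_t total = 0;      /* n: number of bytes actually counted */
    double pairs = 0.0;    /* sum of n_i * (n_i - 1) */

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0 || buf[i] == 127)
            continue;      /* excluded codes */
        count[buf[i]]++;
        total++;
    }
    if (total < 2)
        return 0.0;
    for (int c = 1; c < 256; c++) {
        if (count[c] > 1)
            pairs += (double)count[c] * (double)(count[c] - 1);
    }
    return pairs / ((double)total * (double)(total - 1));
}
```

A buffer of one repeated character scores 1, and a buffer in which no character repeats scores 0, which matches the intuition that more random content has a lower index.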
In addition to the two algorithms above, Base64 encoding can also be used to examine a script file. In an encrypted Webshell, Base64 encoding eliminates non-ASCII characters, so the Base64-encoded content draws on a smaller character set that is unevenly distributed, which means the file's compression ratio becomes larger. A Webshell detected this way is characterized by a greater compression ratio than other files.
For obfuscated Webshells, each of these algorithms yields a numeric value, which is compared against a threshold chosen for the actual environment to decide whether the file is a Webshell. Because sites differ from one another, the thresholds have to be determined by testing on each specific site.
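The comparison step can be sketched as follows. The struct, the field names, and the rule of flagging a file when either statistic crosses its limit are all assumptions made for this example. The text does not say how the individual results are combined, and it gives no threshold values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-site limits; the values must come from testing
 * against the site's own files. */
struct thresholds {
    double entropy_limit;  /* flag files whose entropy is above this */
    double ic_limit;       /* flag files whose index of coincidence is below this */
};

/* Returns 1 if either statistic crosses its limit, 0 otherwise. */
int is_suspicious(double entropy, double ic, const struct thresholds *t)
{
    assert(t != NULL);
    return entropy > t->entropy_limit || ic < t->ic_limit;
}
```

The direction of each comparison follows the text: high entropy and a low index of coincidence both point toward a Webshell.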
0x03 Real-Time Dynamic Webshell Detection Based on a PHP Extension
This detection method is popular at present, mainly because it hooks PHP's calls to dangerous functions and can therefore detect Webshells dynamically, in near real time, and quickly. To some extent it makes up for the shortcomings of traditional static Webshell detection, and it is also fairly convenient.
PHP runs inside a web container in one of three main modes: as a loaded module, as CGI, or as FastCGI. All three go through five stages: module initialization, request initialization, code execution, request shutdown, and module shutdown. During code execution, lexical analysis splits the PHP code into language fragments (tokens), the tokens are parsed into meaningful expressions, and the expressions are compiled into intermediate bytecode (opcodes). The opcodes are executed on the Zend virtual machine, which then outputs the result.
We use a public interface provided by the PHP kernel, zend_set_user_opcode_handler, to replace the handler function for a given opcode, which has the effect of hooking the PHP kernel. Function prototype:
#!c
int zend_set_user_opcode_handler(zend_uchar opcode, opcode_handler_t handler);
The first argument is the opcode to hook, and the second is the handler function that runs once the hook is in place.
In an extension, opcodes such as ZEND_INCLUDE_OR_EVAL, ZEND_DO_FCALL, and ZEND_DO_FCALL_BY_NAME are typically hooked with zend_set_user_opcode_handler (see the example code below).
An attacker may upload a Webshell to some directory through an arbitrary file upload vulnerability. When the uploaded file is later accessed, its path is checked against a blacklist and a whitelist. If the path does not satisfy the list rules, the request is treated as an attack and blocked immediately.
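The path check itself can be sketched in plain C, independently of the PHP API. The directory names, the function name `should_block`, and the rule that the blacklist takes precedence over the whitelist are assumptions made for this example. A real extension would load its rules from configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical rule tables, for illustration only. */
static const char *const blacklist[] = { "/upload/", "/tmp/" };
static const char *const whitelist[] = { "/var/www/app/" };

static int starts_with(const char *s, const char *prefix)
{
    return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* Returns 1 when code running from `path` should be blocked, 0 otherwise. */
int should_block(const char *path)
{
    size_t i;

    assert(path != NULL);
    /* Blacklist first: any path containing a blacklisted directory is blocked. */
    for (i = 0; i < sizeof blacklist / sizeof blacklist[0]; i++)
        if (strstr(path, blacklist[i]) != NULL)
            return 1;
    /* Otherwise only paths under a whitelisted prefix are allowed. */
    for (i = 0; i < sizeof whitelist / sizeof whitelist[0]; i++)
        if (starts_with(path, whitelist[i]))
            return 0;
    return 1;
}
```

Checking the blacklist first means an upload directory stays blocked even when it sits inside a whitelisted application root, which is the usual layout for upload vulnerabilities.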
Potentially dangerous functions that can be hooked:
- Command execution: passthru, system, popen, exec, shell_exec, etc.
- File system: fopen, opendir, dirname, pathinfo, etc.
- Database operations: mysql_query, mysqli_query, etc.
- Callback functions: array_filter, array_reduce, usort, uksort, etc.
- Reflection: ReflectionFunction
PHP extensions are written in pure C. The main code is given below for reference:
#!c
/* Written against the PHP 5.x extension API (TSRMLS macros).
 * The lifecycle functions other than MINIT are not shown. */
#include "config.h"
#include "php.h"
#include "php_ini.h"
#include "ext/standard/info.h"
#include "php_waf.h"

static int le_waf;

/* Forward declaration so that MINIT can register the handler. */
static int manage(ZEND_OPCODE_HANDLER_ARGS);

const zend_function_entry waf_functions[] = {
    PHP_FE(confirm_waf_compiled, NULL)
    PHP_FE_END
};

zend_module_entry waf_module_entry = {
#if ZEND_MODULE_API_NO >= 20010901
    STANDARD_MODULE_HEADER,
#endif
    "waf",
    waf_functions,
    PHP_MINIT(waf),
    PHP_MSHUTDOWN(waf),
    PHP_RINIT(waf),
    PHP_RSHUTDOWN(waf),
    PHP_MINFO(waf),
#if ZEND_MODULE_API_NO >= 20010901
    PHP_WAF_VERSION,
#endif
    STANDARD_MODULE_PROPERTIES
};

#ifdef COMPILE_DL_WAF
ZEND_GET_MODULE(waf)
#endif

PHP_MINIT_FUNCTION(waf)
{
    zend_set_user_opcode_handler(ZEND_INCLUDE_OR_EVAL, manage);   /* hook eval, include, etc. */
    zend_set_user_opcode_handler(ZEND_DO_FCALL_BY_NAME, manage);  /* hook variable function calls */
    zend_set_user_opcode_handler(ZEND_DO_FCALL, manage);          /* hook function calls such as command execution */
    return SUCCESS;
}

/* Hook handler: blocks execution when the running file lives under an upload path. */
static int manage(ZEND_OPCODE_HANDLER_ARGS)
{
    const char *filepath = zend_get_executed_filename(TSRMLS_C);

    /* Is the string "upload" a substring of filepath? */
    if (strstr(filepath, "upload")) {
        php_printf("Do not execute malicious code\nExecute file path:%s", filepath);
        return ZEND_USER_OPCODE_RETURN;    /* stop executing this script */
    }
    return ZEND_USER_OPCODE_DISPATCH;      /* fall through to the default handler */
}
0x04 Summary
In addition to the Webshell detection methods described above, there are also network-based detection methods.
For example, current research has focused on configuring an IDS or a WAF to detect Webshells at the network entry point. FireEye [28] uses Snort with custom signature rules to detect one-line Webshells (so-called "one-sentence Trojans"). Another approach detects Webshell uploads by configuring the ModSecurity Core Rule Set.
Both of these methods analyze whether the HTTP request contains special keywords (for example,
<%>