Accurately finding PHP webshell Trojans (1)

Source: Internet
Author: User
Today I thought about how to find PHP webshell Trojans. First, we can look for backticks, which PHP uses to execute a system command, as in the snippets below. The code is as follows:


`ls -al`;
    `ls -al`;
echo "sss"; `ls -al`;
$sql = "SELECT `username` FROM `table` WHERE 1";
$sql = 'SELECT `username` FROM `table` WHERE 1'
/*
The backtick either has only whitespace in front of it, or follows the end of a statement on the same line. The last two lines are the exceptions: they are backticks inside an SQL command and must be excluded.
*/


How do we write the regular expression?
Analysis:
What do the executable ones have in common? How do they differ from the normal lines that merely contain backticks?
They can be preceded by spaces, tabs, and other whitespace. They can also be preceded by program code, provided that any quotation marks (single or double) in that code are closed. These are the dangerous, hidden ones. The regular expression given by CFC4N is as follows: 【(?:(?:^(?:\s+)?)|(?:(?P<quote>["'])[^(?P=quote)]+?(?P=quote)[^`]*?))`(?P<…>[^`]+)`】.
Explanations:
【(?:(?:^(?:\s+)?)|(?:(?P<quote>["'])[^(?P=quote)]+?(?P=quote)[^`]*?))】
This matches the start of a line, the start of a line followed by whitespace, or preceding code whose single or double quotation marks are closed. (Named capture and backreferences are used in this Python regular expression.)
【`(?P<…>[^`]+)`】: this part is simpler. It matches the string between the backticks.
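One caveat about the expression above: in Python's re module, 【[^(?P=quote)]】 is an ordinary character class that excludes the literal characters "(", "?", "P", "=" and so on. It is not a backreference. The sketch below therefore swaps in a different technique rather than reproducing the expression: it blanks out quoted strings first and only then looks for backtick pairs. The names QUOTED, BACKTICK, find_backtick_commands and the group name "shell" are chosen here for illustration and do not come from the original script.

```python
import re

# A single- or double-quoted string literal, allowing backslash escapes.
QUOTED = re.compile(r'''(["'])(?:\\.|(?!\1).)*\1''')
# A backtick pair with a non-empty body ("shell" is a name chosen here).
BACKTICK = re.compile(r'`(?P<shell>[^`]+)`')


def find_backtick_commands(line):
    """Return the bodies of backtick pairs that sit outside quoted strings."""
    without_strings = QUOTED.sub('""', line)
    return [m.group("shell") for m in BACKTICK.finditer(without_strings)]


samples = [
    '`ls -al`;',
    '    `ls -al`;',
    'echo "sss"; `ls -al`;',
    '$sql = "SELECT `username` FROM `table` WHERE 1";',
    "$sql = 'SELECT `username` FROM `table` WHERE 1'",
]
for line in samples:
    print(line, '->', find_backtick_commands(line))
```

The first three sample lines report `ls -al`; the two SQL lines report nothing. This works one line at a time, so strings that span several lines, heredocs, and comments are not handled.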

[Figure: a Python script for detecting PHP webshells that is not well thought out]
Look at the first element of the list in that script: 【(system|shell_exec|exec|popen)】. This regular expression flags any line containing one of the four strings "system", "shell_exec", "exec", or "popen" as dangerous. Obviously, this is not rigorous. Whenever a programmer's code merely contains one of these character sequences, it is treated as a dangerous function call. That is very inaccurate and gives an extremely high false positive rate.
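To see how easily that loose pattern fires, here is a small sketch. The three PHP lines are made-up examples of harmless code, not taken from any real project.

```python
import re

# The loose pattern criticised above: a bare substring match.
NAIVE = re.compile(r'(system|shell_exec|exec|popen)')

# Hypothetical harmless lines that still contain the substrings.
harmless = [
    '$filesystem_root = "/var/www";',
    '$result = curl_exec($ch);',
    '// execute the query and return rows',
]
for line in harmless:
    m = NAIVE.search(line)
    print(line, '->', m.group(1) if m else None)
```

All three lines are reported, although none of them calls a command-execution function.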

What kind of code is suspicious? What are the keywords?

Suspicious code must be built from functions that can perform dangerous operations. The most important of these in PHP is "eval". Obfuscated PHP code (string transformations only, not Zend-style encoding) has to call "eval" to run, so whatever obfuscation method is used, "eval" will appear. Next come the functions that execute system commands, such as the four mentioned in the code above: "system", "shell_exec", "exec", and "popen". There are others as well, such as "passthru" and "proc_open". PHP also supports the backtick "`" (the key below ESC) to execute system commands directly. We can write the regular expression as 【\b(?P<function>eval|proc_open|popen|shell_exec|exec|passthru|system)\b\s*\(】.

[Figure: a Python script for detecting PHP webshells with relatively strict matching]
Explanations:

As everyone knows, 【\b】 matches the boundary on either side of a word, so the function name must sit between the two 【\b】 as a whole word. The pattern still matches when a special character precedes the function name, for example "@" to suppress errors. The 【\s*】 that follows matches whitespace, including spaces and tabs, zero or more times. The leading 【(?P<function>)】 is a named capture group, which lets the Python code refer to the match result directly by key.
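A minimal sketch of the stricter pattern in use follows. The helper name dangerous_function and the sample PHP lines are assumptions made for this illustration.

```python
import re

DANGEROUS_CALL = re.compile(
    r'\b(?P<function>eval|proc_open|popen|shell_exec|exec|passthru|system)\b\s*\(',
    re.IGNORECASE,
)


def dangerous_function(line):
    """Return the matched function name, or None if the line has no hit."""
    m = DANGEROUS_CALL.search(line)
    return m.group("function") if m else None


# Hypothetical sample lines.
lines = [
    '@eval($_POST["cmd"]);',          # "@" in front does not prevent the match
    'System ("ls -al");',             # case-insensitive, whitespace before "("
    '$result = curl_exec($ch);',      # "_" is a word character, so no \b here
    '$filesystem_root = "/var/www";',
]
for line in lines:
    print(line, '->', dangerous_function(line))
```

Only the first two lines are reported. The word boundary and the required "(" still cannot tell code from comments or strings, so a phrase such as "operating system (Linux)" inside a comment would also match.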

Some readers asked: what if I put the code in a file with an image extension? You only check .php and .inc files, so you still cannot find my file. That is true. However, if malicious code sits in a file with an arbitrary extension such as gif, jpg, png, or aaa, web servers such as Apache and IIS will not parse it as PHP; it has to be pulled in through include/require(_once). So we only need to check whether the file name after include/require(_once) is a regular ".php" or ".inc" file. If it is not, the file is suspicious. The regular expression is as follows: 【(?P<function>\b(?:include|require)(?:_once)?\b)\s*\(?\s*["'](?P<filename>.*?(?…
[Figure: Python script for detecting PHP webshells]
Explanations:

First look at 【(?P<function>\b(?:include|require)(?:_once)?\b)】. 【(?P<name>)】 is the regular expression's named capture, which PHP supports with the same syntax. The data captured inside the parentheses is stored in the result under the key "name". Now look at 【\b(?:include|require)(?:_once)?\b】. 【\b】 is a word boundary. 【(?:include|require)】 matches the string "include" or "require". 【(?:)】 is a non-capturing group; you could drop the 【?:】 and write 【(include|require)】. The following 【(?:_once)】 is also non-capturing, which improves the efficiency of the regular expression. The "?" quantifier after it makes the group optional. Together this covers four cases: "include", "include_once", "require", and "require_once". Some people would write 【(include|include_once|require|require_once)】, which also works. To improve efficiency, however, we factor out the common parts of the alternatives and write 【\b(?:include|require)(?:_once)?\b】 instead.
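The claim that the factored form and the spelled-out form accept the same four names can be checked with a short sketch. The variable names and the two negative examples ("included", "require_twice") are chosen here for illustration.

```python
import re

FACTORED = re.compile(r'\b(?:include|require)(?:_once)?\b')
SPELLED_OUT = re.compile(r'\b(?:include|include_once|require|require_once)\b')

names = ['include', 'include_once', 'require', 'require_once',
         'included', 'require_twice']
for name in names:
    # fullmatch() requires the whole string to match the pattern.
    print(name, bool(FACTORED.fullmatch(name)), bool(SPELLED_OUT.fullmatch(name)))

# Named capture: the matched keyword is available under the key "function".
m = re.search(r'(?P<function>\b(?:include|require)(?:_once)?\b)',
              'require_once("common.php");')
print(m.group("function"))
```

Both patterns accept the first four names and reject the last two, and the named group returns "require_once" for the sample call.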

Now look at the next part, 【\s*\(?\s*["'](?P<filename>.*?(?…】. As described above, this is a named capture, and the result is stored in match.group("filename"). 【.*?】 is any character followed by a lazy, that is "non-greedy", quantifier. It matches zero or more characters (zero, so that files with no base name such as .aa and .htaccess are covered when only an extension is present). The following 【(?…
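Because the tail of the expression is cut off above, the sketch below does not try to reproduce it. It swaps in a simpler technique with the same goal: capture whatever quoted file name follows include/require(_once), then test the extension in Python. INCLUDE_CALL, ALLOWED_EXTENSIONS and suspicious_include are names invented for this example, and the character class 【[^"']+】 is an assumption that replaces the lazy 【.*?】.

```python
import os
import re

INCLUDE_CALL = re.compile(
    r'''(?P<function>\b(?:include|require)(?:_once)?\b)\s*\(?\s*["'](?P<filename>[^"']+)["']''',
    re.IGNORECASE,
)
ALLOWED_EXTENSIONS = {'.php', '.inc'}


def suspicious_include(line):
    """Return (function, filename) if the line includes a non-.php/.inc file."""
    m = INCLUDE_CALL.search(line)
    if not m:
        return None
    extension = os.path.splitext(m.group("filename"))[1].lower()
    if extension in ALLOWED_EXTENSIONS:
        return None
    return m.group("function"), m.group("filename")


# Hypothetical sample lines.
for line in ['include("images/logo.gif");',
             "require_once 'config.inc';",
             'include_once("lib/db.php");',
             'include("README");']:
    print(line, '->', suspicious_include(line))
```

Doing the extension test in Python means a file name with no extension at all is reported too. Includes built from variables or concatenation are not covered by this sketch, and only the first include on a line is examined.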
In conclusion, my Python code is as follows:


#!/usr/bin/python
# -*- encoding: UTF-8 -*-
###
## @package
##
## @author CFC4N
## @copyright (c) Www.cnxct.Com
## @version $Id: check_php_shell.py 37 2010-07-22 09:56:28Z cfc4n $
###
import os
import sys
import re
import time


def listdir(dirs, liston='0'):
    # Scan one directory; recurse into subdirectories when liston == '1'.
    if not os.path.isdir(dirs):
        print("directory %s does not exist" % dirs)
        return
    flog = open(os.getcwd() + "/check_php_shell.log", "a+")
    lists = os.listdir(dirs)
    for name in lists:
        filepath = os.path.join(dirs, name)
        if os.path.isdir(filepath):
            if liston == '1':
                listdir(filepath, '1')
        elif os.path.isfile(filepath):
            filename = os.path.basename(filepath)
            if re.search(r"\.(?:php|inc|html?)$", filename, re.IGNORECASE):
                i = 0      # current line number
                iname = 0  # hits already reported for this file
                f = open(filepath)
                while f:
                    file_contents = f.readline()
                    if not file_contents:
                        break
                    i += 1
                    # include/require of a file that is not .php or .inc.
                    # ASSUMPTION: the pattern is cut off after "(?" in the
                    # source; the negative lookbehind that completes it here
                    # is reconstructed from the description in the text.
                    match = re.search(r'''(?P<function>\b(?:include|require)(?:_once)?\b)\s*\(?\s*["'](?P<filename>.*?(?<!\.(?:php|inc)))["']''', file_contents, re.IGNORECASE | re.MULTILINE)
                    if match:
                        function = match.group("function")
                        filename = match.group("filename")
                        if iname == 0:
                            info = '\n[%s]:\n' % (filepath)
                        else:
                            info = ''
                        info += '\t|--[%s]-[%s] line [%d]\n' % (function, filename, i)
                        flog.write(info)
                        print(info)
                        iname += 1
                    # Calls to functions that execute code or system commands.
                    match = re.search(r'\b(?P<function>eval|proc_open|popen|shell_exec|exec|passthru|system)\b\s*\(', file_contents, re.IGNORECASE | re.MULTILINE)
                    if match:
                        function = match.group("function")
                        if iname == 0:
                            info = '\n[%s]:\n' % (filepath)
                        else:
                            info = ''
                        info += '\t|--[%s] line [%d]\n' % (function, i)
                        flog.write(info)
                        print(info)
                        iname += 1
                f.close()
    flog.close()


if '__main__' == __name__:
    argvnum = len(sys.argv)
    liston = '0'
    if argvnum == 1:
        action = os.path.basename(sys.argv[0])
        print("Command is like:\n%s D:\\wwwroot\n%s D:\\wwwroot 1 -- recurse subfolders" % (action, action))
        sys.exit()
    elif argvnum == 2:
        path = os.path.realpath(sys.argv[1])
        listdir(path, liston)
    else:
        liston = sys.argv[2]
        path = os.path.realpath(sys.argv[1])
        listdir(path, liston)
    # Append a timestamp to the log once the scan has finished.
    flog = open(os.getcwd() + "/check_php_shell.log", "a+")
    ISOTIMEFORMAT = '%Y-%m-%d %X'
    now_time = time.strftime(ISOTIMEFORMAT, time.localtime())
    flog.write("\n----------------------%s checked-------------------\n" % (now_time))
    flog.close()
# The latest code is provided at the end of the article. Updated on 2010/07/31.


For your reference only.

Below is the result of scanning Discuz 7.2 with this code. There are still false positives, of course, but fewer than with the Python scripts circulating on the Internet, and the results are more accurate.
[Figure: output of the Python script that checks for PHP webshells]
Q: Is this method perfect? Can it find every file that uses the currently known dangerous functions?
A: No. If the file pulled in by include and the like has no extension, it cannot be matched here.
Q: How can this be solved?
A: It can be solved. If you are smart, you can work it out.
PS: Command execution through the backtick "`" has not been written into the script, because there is no good method yet. It is easily confused with the backticks in SQL statements and is hard to match well. Reporting every line that contains a backtick would give far too many false positives. To be decided. (Every trade has its specialists; do not dismiss a person's abilities because of one piece of bad code. Again, this article is about the code, not the person. Also, my Python code may be copied and spread freely. Keep the copyright notice if you like, delete it if you don't; do whatever you like with it.)
I will take a rest first and continue tomorrow. (The first half is Cao Ren's line in Three Kingdoms Kill.)
