Let's first look at the code fragments in which backticks successfully execute a command. The code is as follows:
`ls -al`;
 `ls -al`;
echo "SSS"; `ls -al`;
$sql = "SELECT `username` FROM `table` WHERE 1";
$sql = 'SELECT `username` FROM `table` WHERE 1'
/*
The backtick is either preceded by whitespace or follows the end of a statement on the same line. The last two lines are the exceptions: backticks inside SQL statements, which are exactly what must be excluded.
*/
How should the regular expression be written?
Analysis:
What do the executable cases have in common? How do they differ from the other, normal parts that contain backticks?
They can be preceded by spaces, tabs, and other whitespace characters. They can also be preceded by program code, provided that any quotation marks (single or double) in it are closed. These are the dangerous, risky ones. CFC4N gives the following regular expression: 【(?:(?:^(?:\s+)?)|(?:(?P<quote>["'])[^(?P=quote)]+?(?P=quote)[^`]*?))`(?P<shell>[^`]+)`】.
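The idea can be tried out with a minimal sketch, given here under some assumptions. The names BACKTICK_RE and find_shell are made up for illustration. Inside a character class, Python reads (?P=quote) as literal characters rather than as a backreference, so this sketch swaps in a negative lookahead for that part. It is not CFC4N's exact expression.

```python
import re

# Illustrative variant of the backtick pattern (not the article's exact regex).
BACKTICK_RE = re.compile(
    r"""(?:^\s*                  # start of the line, optional whitespace
        |(?P<quote>["'])         # or an opening quote,
         (?:(?!(?P=quote)).)+?   # its content (no character may be that quote),
         (?P=quote)              # the matching closing quote,
         [^`]*?)                 # and any further code without backticks
        `(?P<shell>[^`]+)`       # the command between the backticks
    """,
    re.VERBOSE,
)


def find_shell(line):
    """Return the backtick command found on one PHP source line, or None."""
    match = BACKTICK_RE.search(line)
    return match.group("shell") if match else None


samples = [
    "`ls -al`;",
    "   `ls -al`;",
    'echo "SSS"; `ls -al`;',
    '$sql = "SELECT `username` FROM `table` WHERE 1";',
    "$sql = 'SELECT `username` FROM `table` WHERE 1'",
]
for line in samples:
    print(repr(line), "->", find_shell(line))
```

The first three samples yield the command and the two SQL lines yield None. A line that holds a closed string followed by a separate SQL string with backticks would still be reported, so this remains a rough filter.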
Explanation:
【(?:(?:^(?:\s+)?)|(?:(?P<quote>["'])[^(?P=quote)]+?(?P=quote)[^`]*?))】
Matches the start of the line, the start of the line followed by whitespace, or preceding code in which the single or double quotes are closed. (Python's named capture groups and backreferences are used here.)
【`(?P<shell>[^`]+)`】 is relatively simple: it matches the string between the backticks.
[Figure: the poorly considered Python script for detecting PHP webshells]
Take a look at the first element of the list: "(system|shell_exec|exec|popen)". It means that any string containing "system", "shell_exec", "exec", or "popen" is judged to be dangerous. Obviously, this method is too imprecise: if a programmer's code merely contains one of these four strings, it is flagged as a dangerous function. The false-positive rate is very high. See the figure below.
[Figure: false positives from the poorly considered Python script for detecting PHP webshells]
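To see how easily the bare pattern fires, here is a small sketch. The sample lines and the helper name naive_hits are invented for illustration; only the pattern itself comes from the script under discussion.

```python
import re

# The first list element of the script under discussion, used as-is.
NAIVE = re.compile(r"(system|shell_exec|exec|popen)")


def naive_hits(lines):
    """Return the lines that the bare substring pattern flags."""
    return [line for line in lines if NAIVE.search(line)]


# Harmless PHP lines (made up) that merely contain one of the four strings.
harmless = [
    "$filesystem = new Filesystem();",
    "$executed = true;",
    "// operating system requirements",
]
for line in naive_hits(harmless):
    print("flagged:", line)
```

All three harmless lines are flagged, because the pattern looks for substrings and not for function calls.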
What kind of code is suspicious? What are the keywords?
Suspicious code is necessarily built from functions that can perform dangerous operations. Among such PHP functions the most important is "eval": encrypted PHP code (string obfuscation only, not Zend-style encoding) has to call "eval", so code obfuscated by any of those methods is sure to use it. Next come the functions that execute system commands, such as the four mentioned in the script above: "system", "shell_exec", "exec", and "popen". There are others as well, such as passthru. PHP also supports the "`" character (the key below ESC) for executing system commands directly. We can write the regex like this: 【\b(?P<function>eval|proc_open|popen|shell_exec|exec|passthru|system)\b\s*\(】.
[Figure: a relatively rigorous match in the Python script for detecting PHP webshells]
Explanation:
As we all know, the pair of 【\b】 anchors matches the word boundaries on either side of a word. This ensures that what lies between them is a whole word, and the match still succeeds when the function name is preceded by a special character, such as "@" to suppress errors. The following 【\s*】 matches whitespace characters, including spaces and tabs, zero or more times. The leading 【(?P<function>...)】 is a named capturing group; the name is the key that the Python code uses to refer to the matched result directly.
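The effect of the word boundaries and the named group can be checked with a short sketch. The helper name dangerous_call and the sample lines are assumptions for illustration; the pattern is the one given above.

```python
import re

FUNC_RE = re.compile(
    r"\b(?P<function>eval|proc_open|popen|shell_exec|exec|passthru|system)\b\s*\(",
    re.IGNORECASE,
)


def dangerous_call(line):
    """Return the name of the dangerous function called on this line, or None."""
    match = FUNC_RE.search(line)
    # The group name "function" is the key used to fetch the captured text.
    return match.group("function") if match else None


for line in [
    "@eval($_POST['c']);",              # "@" in front does not block the match
    'system  ("ls");',                  # whitespace before "(" is allowed
    "$filesystem = new Filesystem();",  # "system" is only part of a word
    "$executed = true;",                # "exec" is only part of a word
    "my_exec($cmd);",                   # "_" is a word character, so no boundary
]:
    print(repr(line), "->", dangerous_call(line))
```

Only the first two lines are reported. Note that a method call such as $db->exec(...) would still match, so some false positives remain.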
Some readers also asked: what if I put the code into a file with an image extension? If you only check .php and .inc files, you still won't find me. Well, yes. But malicious code in files with extensions such as .gif, .jpg, .png, or something random like .aaa cannot be parsed by Apache, IIS, or other web servers; it has to be pulled in through include/require(_once). So we only need to check whether the filename after include/require(_once) is a regular ".php" or ".inc" file. If it is not, the file is suspicious. The regex is: 【(?P<function>\b(?:include|require)(?:_once)?\b)\s*\(?\s*["'](?P<filename>.*?(?<!\.(?:php|inc)))["']】.
[Figure: a more rigorous approach in the Python script for detecting PHP webshells]
Explanation:
First look at 【(?P<function>\b(?:include|require)(?:_once)?\b)】. 【(?P<name>...)】 is the regex syntax for a named capture, which PHP supports as well: the data captured inside the parentheses is stored in the result array under the key "name". Inside it is 【\b(?:include|require)(?:_once)?\b】. The two 【\b】 need no further explanation; they are word boundaries. 【(?:include|require)】 matches the two words "include" and "require". The leading 【?:】 means no capture group is allocated, which improves efficiency; you could drop the 【?:】 and write 【(include|require)】. The following 【(?:_once)】 is likewise non-capturing, again for efficiency, and the quantifier 【?】 after it makes the group optional. Together this covers the four cases "include", "include_once", "require", and "require_once". Some might write 【(include|include_once|require|require_once)】, which also achieves the goal. For efficiency, however, we optimize by factoring the shared parts of the strings out of the alternation, which gives 【\b(?:include|require)(?:_once)?\b】.
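A quick sketch can confirm that the factored form accepts exactly the same four keywords as the flat alternation. The names FACTORED, FLAT, and same_verdict and the word lists are invented for illustration, and no timing claim is made here.

```python
import re

# Factored form from the text and the flat alternative it replaces.
FACTORED = re.compile(r"\b(?:include|require)(?:_once)?\b")
FLAT = re.compile(r"\b(?:include|include_once|require|require_once)\b")

keywords = ["include", "include_once", "require", "require_once"]
others = ["included", "require_twice", "_include", "once"]


def same_verdict(word):
    """True if both patterns agree on whether the whole word is a keyword."""
    return bool(FACTORED.fullmatch(word)) == bool(FLAT.fullmatch(word))


for word in keywords + others:
    print(word, bool(FACTORED.fullmatch(word)), same_verdict(word))
```

Both patterns accept the four keywords and reject the other words, so the factoring changes the shape of the expression but not what it matches.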
Now look at the rest, 【\s*\(?\s*["'](?P<filename>.*?(?<!\.(?:php|inc)))["']】. 【\s*】 matches whitespace characters, including spaces and tabs. The 【\(?】 after it matches the character "(", and the quantifier 【?】 makes this opening parenthesis optional, to cover the case without parentheses such as 【include "123.php"】. Then 【["']】 matches a double or single quote; the one at the end does the same. Next is 【(?P<filename>.*?(?<!\.(?:php|inc)))】. 【(?P<filename>...)】 is, as described above, a named capture that puts the result in match.group("filename"). 【.*?】 is any character with a lazy quantifier, usually called "non-greedy". The minimum is 0 matches (so that files such as .aa or .htaccess, which have no name and only an extension, are not let through when included). 【(?<!\.(?:php|inc))】 is a zero-width negative lookbehind assertion (it matches only a position, not characters, like 【^】, 【$】, and 【\b】). It applies to the characters before its position: the closing 【["']】 must not be preceded by ".php" or ".inc", that is, by the filename's extension. (In a regex you can negate single characters with 【[^...]】, but not a whole string; that is what the zero-width assertion is for.)
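The whole include expression can be exercised with a small sketch. The helper name suspicious_include and the sample lines are assumptions for illustration; the pattern is the one explained above.

```python
import re

INCLUDE_RE = re.compile(
    r"""(?P<function>\b(?:include|require)(?:_once)?\b)\s*\(?\s*["']"""
    r"""(?P<filename>.*?(?<!\.(?:php|inc)))["']""",
    re.IGNORECASE,
)


def suspicious_include(line):
    """Return (function, filename) when a non-.php/.inc file is included, else None."""
    match = INCLUDE_RE.search(line)
    if match:
        return match.group("function"), match.group("filename")
    return None


for line in [
    'include("shell.gif");',               # image extension: reported
    'include_once("upload/avatar.jpg");',  # image extension: reported
    "require_once 'config.php';",          # regular .php file: ignored
    'include "lib/db.inc";',               # regular .inc file: ignored
]:
    print(repr(line), "->", suspicious_include(line))
```

One limitation worth knowing: when an ordinary include is followed by another quoted string on the same line, as in include("a.php"); echo "x";, the lazy 【.*?】 can run on to the later quote and the line is reported anyway.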
To sum up, here is my Python code:
#!/usr/bin/python
# -*- encoding: utf-8 -*-
###
## @package
##
## @author CFC4N <cfc4nphp@gmail.com>
## @copyright copyright (c) Www.cnxct.Com
## @Version $Id: check_php_shell.py 37 2010-07-22 09:56:28Z cfc4n $
###
import os
import sys
import re
import time

LOG_PATH = os.path.join(os.getcwd(), "check_php_shell.log")

# include/require(_once) of a file whose name does not end in .php or .inc
RE_INCLUDE = re.compile(
    r'''(?P<function>\b(?:include|require)(?:_once)?\b)\s*\(?\s*["']'''
    r'''(?P<filename>.*?(?<!\.(?:php|inc)))["']''', re.IGNORECASE)
# call to a function that evaluates code or runs system commands
RE_DANGER = re.compile(
    r'\b(?P<function>eval|proc_open|popen|shell_exec|exec|passthru|system)\b\s*\(',
    re.IGNORECASE)

def report(info):
    """Append one finding to the log file and echo it to the console."""
    with open(LOG_PATH, "a+") as flog:
        flog.write(info)
    print(info)

def scan_file(filepath):
    """Check one file line by line and report every suspicious match."""
    found = 0  # findings so far; the file header is written only once
    # errors="replace": PHP sources are not always valid in the default encoding
    with open(filepath, errors="replace") as f:
        for i, line in enumerate(f, 1):
            match = RE_INCLUDE.search(line)
            if match:
                info = '' if found else '\n[%s] :\n' % filepath
                info += '\t|-- [%s] - [%s] line [%d] \n' % (
                    match.group("function"), match.group("filename"), i)
                report(info)
                found += 1
            match = RE_DANGER.search(line)
            if match:
                info = '' if found else '\n[%s] :\n' % filepath
                info += '\t|-- [%s] line [%d] \n' % (match.group("function"), i)
                report(info)
                found += 1

def listdir(dirs, liston='0'):
    """Scan the files in dirs; recurse into subfolders when liston is '1'."""
    if not os.path.isdir(dirs):
        print("directory %s does not exist" % dirs)
        return
    for name in os.listdir(dirs):
        filepath = os.path.join(dirs, name)
        if os.path.isdir(filepath):
            if liston == '1':
                listdir(filepath, '1')
        elif os.path.isfile(filepath):
            if re.search(r"\.(?:php|inc|html?)$", name, re.IGNORECASE):
                scan_file(filepath)

if '__main__' == __name__:
    argvnum = len(sys.argv)
    if argvnum == 1:
        action = os.path.basename(sys.argv[0])
        print('command is like:\n %s D:\\wwwroot\\ \n %s D:\\wwwroot\\ 1 -- recurse subfolders' % (action, action))
        sys.exit()
    # the optional second argument '1' turns on recursion into subfolders
    liston = sys.argv[2] if argvnum > 2 else '0'
    listdir(os.path.realpath(sys.argv[1]), liston)
    ISOTIMEFORMAT = '%Y-%m-%d %X'
    now_time = time.strftime(ISOTIMEFORMAT, time.localtime())
    with open(LOG_PATH, "a+") as flog:
        flog.write("\n----------------------%s checked ---------------------\n" % now_time)
## The latest code is given in the link at the end of the article. Updated 2010/07/31.
For reference only; corrections are welcome.
The screenshot below shows a scan of Discuz 7.2. Of course, it also produces some false positives. Compared with the Python script circulating online, however, there are fewer false positives and the results are more accurate.
[Figure: scan results from the Python script for detecting PHP webshells]
Q: Is this method perfect? Can it find every file that contains the currently known dangerous functions?
A: No. If the file pulled in by include has no extension, it will not be matched.
Q: How can that be solved?
A: I'll leave that to you. Clever as you are, you can certainly handle it.
PS: Matching commands executed with "`" (backticks) has not been written yet; for now I have no good approach. They are easily confused with the backticks in SQL statements and are hard to match. If we match on backticks alone, there are far too many false positives. To be determined. (Everyone has their own specialty; don't deny a person's ability because of one piece of bad code. You know what I mean. Again, this article is about the code only, not the person. Also, feel free to copy and spread the Python code I give here. Keep the copyright notice if you like; if not, delete the relevant text. Do whatever you like.)
I'll take a break, and we'll talk again tomorrow. (The first half of that sentence is Cao Ren's line in Sanguosha, the Three Kingdoms card game, ha.)