MySQL wide byte injection

Source: Internet
Author: User
Tags: sql injection

Although everyone now calls for Unicode, and many websites use UTF-8 as a unified international standard, a lot of CMSs, both domestic and foreign (especially in non-English-speaking countries), still use a national encoding such as GBK as their default. Some CMSs also ship both a GBK and a UTF-8 version to accommodate older users.

We will use the GBK character encoding as the demonstration to open the discussion. GBK is a multi-byte encoding; look up its formal definition yourself. But there is one point in particular to note:

Typically, a Chinese character encoded in GBK takes 2 bytes, while a Chinese character encoded in UTF-8 takes 3 bytes. In PHP, we can test this by outputting

echo strlen("和");

The output is 2 when the page is saved as GBK, and 3 when it is saved as UTF-8.
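The same byte counts can be checked outside PHP. This is a minimal sketch in Python rather than the article's PHP, assuming the character under test is 和 and relying on Python's built-in gbk and utf-8 codecs:

```python
# Byte length of one Chinese character under the two encodings.
ch = "和"

gbk_bytes = ch.encode("gbk")      # GBK: two bytes per Chinese character
utf8_bytes = ch.encode("utf-8")   # UTF-8: three bytes for this character

print(len(gbk_bytes), gbk_bytes.hex(" "))
print(len(utf8_bytes), utf8_bytes.hex(" "))
```

PHP's strlen counts bytes, not characters, which is why the result depends on the encoding the file was saved in.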

GBK is not alone in this: the other East Asian "ANSI" encodings also use 2 bytes per character. ANSI is only a label for the system's default encoding, so it can stand for different encodings on different computers; on a Simplified Chinese system, for example, ANSI means GBK.

That is a brief background on multi-byte encodings. Only with enough knowledge of their composition and characteristics can we analyze the problems they cause.

With the preamble out of the way, let's look at the problems that character encoding causes in SQL injection.

0x01 wide-character injection in MySQL

This is an old topic that has been covered countless times. But as the prelude to this article, it is a foundation that must be mentioned.

Let's build a test environment first. Call it the Phithon Content Management System v1.0. First create a new database, then import the SQL file from the following archive:

Test code and database: http://pan.baidu.com/s/1eQmUArw Extraction password: 75tu

The Phithon Content Management System will be refined later, but it will always use this table.

The source code is very simple (note: turn off magic_quotes_gpc in your PHP environment first):

<?php
// Connect to the database. Note the gbk encoding; fill in your own database credentials.
$conn = mysql_connect('localhost', 'root', '[email protected]#$') or die('bad!');
mysql_query("SET NAMES 'gbk'");
mysql_select_db('test', $conn) or die("Failed to connect to the database: the database you specified was not found");
// Run the SQL statement
$id = isset($_GET['id']) ? addslashes($_GET['id']) : 1;
$sql = "SELECT * FROM news WHERE tid='{$id}'";
$result = mysql_query($sql, $conn) or die(mysql_error()); // print SQL errors so they are easy to observe
?>
<!DOCTYPE html>
<html>
<head>
<meta charset="gbk"/>
<title>News</title>
</head>
<body>
<?php
$row = mysql_fetch_array($result, MYSQL_ASSOC);
// The output markup was lost in the source; the column names title and content are assumed.
echo "<h2>{$row['title']}</h2><p>{$row['content']}</p>\n";
mysql_free_result($result);
?>
</body>
</html>

The SQL statement is SELECT * FROM news WHERE tid='{$id}', which fetches an article from the news table by its ID.

Before this SQL statement, we use the addslashes function to escape the value of $id. This is how CMSs usually defend against SQL injection: as long as the input parameter sits inside single quotes, it cannot break out of them and cannot be injected. For example, 1' becomes 1\' and stays inside the string.

So how do we escape the addslashes restriction? As is well known, addslashes turns ' into \', so the quote is no longer a string delimiter, just a literal character. The usual bypasses all work on the \ in front of the ':

1. Find a way to put another \ (or any odd number of them) in front of the \, giving \\', so that the \ itself is escaped and the ' breaks out.

2. Find a way to make the \ disappear.

The wide-byte injection here exploits a MySQL feature: when using GBK encoding, MySQL treats two bytes as one Chinese character (the first byte must be greater than 128 to fall in the Chinese character range). Let's see what happens if we enter %df'.

An error is raised. The message reports a syntax error in the SQL statement, and an error like this indicates that the parameter can be injected.

Why does simply adding %df in front of the ' (that is, %27) cause an error? The message shows that the cause is an extra single quote, and that the backslash in front of the single quote has disappeared.

This is the MySQL feature at work. Because GBK is a multi-byte encoding, MySQL takes two bytes as one Chinese character, so %df and the following \ (%5c) become the single Chinese character "運", and the ' escapes.
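The byte-level effect can be reproduced offline. The following is a sketch in Python rather than PHP: `php_addslashes` is a hypothetical stand-in that only mimics what PHP's addslashes does to raw bytes, and Python's gbk codec stands in for the way MySQL reads the bytes as GBK.

```python
def php_addslashes(data: bytes) -> bytes:
    """Hypothetical stand-in for PHP's addslashes on raw bytes:
    NUL, single quote, double quote and backslash get a backslash prefix."""
    out = bytearray()
    for b in data:
        if b == 0x00:
            out += b"\\0"                # NUL becomes backslash + "0"
        elif b in (0x27, 0x22, 0x5C):    # ' " \
            out += bytes([0x5C, b])
        else:
            out.append(b)
    return bytes(out)

raw = b"\xdf'"                   # bytes received for id=%df'
escaped = php_addslashes(raw)    # df 5c 27
as_gbk = escaped.decode("gbk")   # df 5c is read as ONE GBK character

print(escaped.hex(" "))
print(len(as_gbk), repr(as_gbk[-1]))
```

After decoding, only two characters remain: one wide character and a bare single quote. The backslash has been absorbed into the wide character.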

Because two bytes make up one Chinese character, we can also try "%df%df%27".

This time there is no error. %df%df forms one Chinese character, and %5c%27 is not a Chinese character; it is still \'.

So how does MySQL decide whether a character is a Chinese character? Under GBK, it is basically enough for the first byte to be greater than 128. For example, we do not have to use %df; %a1 works as well.

%a1%5c may not be an actual Chinese character, but MySQL still treats it as one wide character, which lets the following %27 escape.
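The payloads above can be compared with a small model. This Python sketch assumes a simplified rule for how MySQL splits GBK input (a lead byte 0x81-0xFE followed by a trail byte 0x40-0x7E or 0x80-0xFE forms one character); `split_gbk` and `has_unescaped_quote` are hypothetical helpers, not MySQL's actual code.

```python
def split_gbk(data: bytes) -> list:
    """Split bytes into characters under the assumed GBK rule."""
    chars, i = [], 0
    while i < len(data):
        lead = data[i]
        if 0x81 <= lead <= 0xFE and i + 1 < len(data):
            trail = data[i + 1]
            if 0x40 <= trail <= 0x7E or 0x80 <= trail <= 0xFE:
                chars.append(data[i:i + 2])   # one wide character
                i += 2
                continue
        chars.append(data[i:i + 1])           # single-byte character
        i += 1
    return chars

def has_unescaped_quote(escaped: bytes) -> bool:
    """True if a single quote survives without a backslash escaping it."""
    chars = split_gbk(escaped)
    i = 0
    while i < len(chars):
        if chars[i] == b"\\":
            i += 2                            # backslash escapes the next character
            continue
        if chars[i] == b"'":
            return True
        i += 1
    return False

# Each value is the input AFTER addslashes has inserted 0x5c before the quote.
for label, escaped in [("%df'", b"\xdf\\'"), ("%df%df'", b"\xdf\xdf\\'"),
                       ("%a1'", b"\xa1\\'"), ("1'", b"1\\'")]:
    print(label, has_unescaped_quote(escaped))
```

Under this model %df' and %a1' leave the quote unescaped, while %df%df' and the plain 1' do not, which matches the behaviour described in the text.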

From this I can construct an exploit that queries the administrator's account and password.

0x02 The difference between GB2312 and GBK

There was a problem that had been bothering me for a long time.

GB2312 and GBK should both belong to the wide-byte family. But let's do a small experiment: change SET NAMES in the Phithon Content Management System to gb2312.

The result is that it can no longer be injected.

If you don't believe it, you can also change the database encoding to gb2312; the injection still fails.

Why? It comes down to the value range of GB2312. Its high byte ranges over 0xA1~0xF7 and its low byte over 0xA1~0xFE, while \ is 0x5c, which is outside the low-byte range. So 0x5c can never be part of a GB2312 character, and naturally it is not swallowed.

Extending this idea to all the multi-byte encodings in the world, we can suppose that wide-character injection is possible whenever the low-byte range of the encoding contains 0x5c.
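The range argument can be written down as a quick check. This Python sketch uses the GB2312 low-byte range given above; the GBK low-byte range (0x40-0x7E and 0x80-0xFE) is an assumption added for comparison, and `backslash_can_be_trail` is a hypothetical helper. Python's own gb2312 and gbk codecs are used as a cross-check.

```python
# Low-byte (trail) ranges: GB2312 as stated in the text, GBK assumed.
TRAIL_RANGES = {
    "gb2312": [(0xA1, 0xFE)],
    "gbk": [(0x40, 0x7E), (0x80, 0xFE)],
}

def backslash_can_be_trail(charset: str) -> bool:
    """True if 0x5c (backslash) falls inside the charset's low-byte range."""
    return any(lo <= 0x5C <= hi for lo, hi in TRAIL_RANGES[charset])

# Cross-check with Python's codecs: can df 5c be decoded as one character?
try:
    b"\xdf\x5c".decode("gb2312")
    gb2312_accepts = True
except UnicodeDecodeError:
    gb2312_accepts = False
gbk_accepts = len(b"\xdf\x5c".decode("gbk")) == 1

print(backslash_can_be_trail("gb2312"), gb2312_accepts)
print(backslash_can_be_trail("gbk"), gbk_accepts)
```

The same table-driven check can be applied to any other multi-byte encoding once its low-byte range is known.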

0x03 Does mysql_real_escape_string solve the problem?

Some CMSs became aware of wide-byte injection and looked for a solution. The PHP documentation describes a function, mysql_real_escape_string, which it says takes the current character set of the connection into account.

As a result, some CMSs replaced addslashes with mysql_real_escape_string to guard against wide-character injection. Let's continue the experiment with Phithon Content Management System v1.2, which filters input with mysql_real_escape_string.

Let's try to inject.

It injects without any trouble. Why? I clearly used mysql_real_escape_string, yet it still cannot stop wide-character injection.

The reason is that you did not tell PHP which character set its MySQL connection uses. We need to call the mysql_set_charset function before executing SQL statements to set the connection's character set to gbk.

That avoids the problem.

0x04 Fixing wide-character injection

In 0x03 we covered one fix: call mysql_set_charset to set the connection's character set to gbk, and then call mysql_real_escape_string to filter user input.

This works, but some old CMSs call addslashes in many places, and we cannot go and change every addslashes to mysql_real_escape_string. The second solution is to set character_set_client to binary.

Just declare, before any SQL statement runs, that the client sends binary data:

mysql_query("SET character_set_connection=gbk, character_set_results=gbk, character_set_client=binary", $conn);

What do these variables mean?

When MySQL receives data from the client, it assumes the data is encoded in character_set_client and converts it to character_set_connection. When the data is then written into a specific table and column, it is converted again to that column's encoding.

Then, when the query result is produced, it is converted from the table and column encoding to character_set_results and returned to the client.

Therefore, once we set character_set_client to binary, there is no wide-byte or multi-byte problem: all data is treated as raw bytes, which effectively prevents wide-character injection.
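The difference between the two client character sets can be sketched as a byte scanner. This is a simplified Python model, not MySQL's parser: `is_gbk_pair` and `quote_breaks_out` are hypothetical helpers, and the GBK byte ranges are an assumption.

```python
def is_gbk_pair(lead: int, trail: int) -> bool:
    """Assumed GBK rule: lead 0x81-0xFE, trail 0x40-0x7E or 0x80-0xFE."""
    return 0x81 <= lead <= 0xFE and (0x40 <= trail <= 0x7E or 0x80 <= trail <= 0xFE)

def quote_breaks_out(escaped: bytes, client_charset: str) -> bool:
    """Scan already-escaped input; True if a single quote is left unescaped."""
    i = 0
    while i < len(escaped):
        b = escaped[i]
        if (client_charset == "gbk" and i + 1 < len(escaped)
                and is_gbk_pair(b, escaped[i + 1])):
            i += 2        # one wide character; a 0x5c inside it is not an escape
        elif b == 0x5C:
            i += 2        # backslash escapes the following byte
        elif b == 0x27:
            return True   # bare single quote
        else:
            i += 1
    return False

payload = b"\xdf\\'"      # %df' after addslashes: df 5c 27
print(quote_breaks_out(payload, "gbk"))
print(quote_breaks_out(payload, "binary"))
```

With the binary setting every byte stands alone, so 0x5c is always seen as a backslash and keeps escaping the quote.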

For example, v2.0 of our Phithon Content Management System is updated in this way.

It can no longer be injected.

In the code I have audited, most CMSs avoid wide-character injection this way. The method is effective, but if the developer gilds the lily, all the previous effort comes to nothing.

0x05 The fatal consequences of iconv

Many CMSs (more than one; I won't name them) have injection in their GBK versions caused by character encoding. Yet some readers say they tested these CMSs for wide-character injection with no effect. Is their technique wrong?

Of course not. In fact, this section is no longer about wide-character injection, because the problem lies not in MySQL but in PHP.

A lot of CMSs (really a lot; search online if you don't believe it) pass received data through a function like this to convert its encoding:

iconv('utf-8', 'gbk', $_GET['word']);

The purpose is usually to avoid garbled text, especially in search boxes.

For example, take our Phithon Content Management System v3.0.

It sets character_set_client to binary before executing the SQL statement, which avoids wide-character injection. But afterwards it calls iconv to convert the already-filtered parameter $id.

Let's try to inject it at this point:

It actually gave an error, which means it can be injected. And all I entered was "錦" followed by a single quote. What is the reason for this?

Let's analyze it. The UTF-8 encoding of the character "錦" is 0xE98CA6, and its GBK encoding is 0xE55C.

Some readers may have caught on already: the second GBK byte, 0x5c, is the backslash. When iconv converts "錦" from UTF-8 to GBK, it becomes %e5%5c, and the ' after it has already been turned into %5c%27 by addslashes, so together they form %e5%5c%5c%27. The two %5c are \\, so the backslash itself is escaped, and the ' breaks out of the single quotes, resulting in injection.
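The byte sequence described above can be reproduced step by step. In this Python sketch, `php_addslashes` is a simplified, hypothetical stand-in for PHP's addslashes (NUL handling omitted), and decode/encode models what iconv('utf-8', 'gbk', ...) does to valid input.

```python
def php_addslashes(data: bytes) -> bytes:
    """Simplified stand-in for PHP's addslashes (NUL handling omitted)."""
    out = bytearray()
    for b in data:
        if b in (0x27, 0x22, 0x5C):   # ' " \
            out.append(0x5C)
        out.append(b)
    return bytes(out)

user_input = "錦'".encode("utf-8")                    # e9 8c a6 27
filtered = php_addslashes(user_input)                 # e9 8c a6 5c 27
converted = filtered.decode("utf-8").encode("gbk")    # models the iconv call

print(user_input.hex(" "))
print(filtered.hex(" "))
print(converted.hex(" "))
```

The last line shows the four bytes e5 5c 5c 27: with character_set_client=binary the two 0x5c bytes read as an escaped backslash, and the 0x27 that follows closes the string.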

This uses the first of the two addslashes bypasses mentioned earlier: escaping the \ itself.

So what if I use iconv to convert GBK to UTF-8 instead?

Let's try it out.

Sure enough, it succeeds again. This time the injection uses a wide character directly, but the problem is actually in PHP rather than MySQL. We know that a GBK Chinese character is 2 bytes and a UTF-8 Chinese character is 3 bytes; when converting GBK to UTF-8, PHP converts two bytes at a time. So if the \ is preceded by an odd number of such bytes, the \ is bound to be swallowed, and the ' escapes.
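The odd/even behaviour can be checked on raw bytes. In this Python sketch, `php_addslashes` is again a simplified, hypothetical stand-in for PHP's addslashes, and decode/encode models iconv('gbk', 'utf-8', ...).

```python
def php_addslashes(data: bytes) -> bytes:
    """Simplified stand-in for PHP's addslashes (NUL handling omitted)."""
    out = bytearray()
    for b in data:
        if b in (0x27, 0x22, 0x5C):   # ' " \
            out.append(0x5C)
        out.append(b)
    return bytes(out)

def gbk_to_utf8(data: bytes) -> bytes:
    """Models iconv('gbk', 'utf-8', ...) for input that is valid GBK."""
    return data.decode("gbk").encode("utf-8")

odd = gbk_to_utf8(php_addslashes(b"\xdf'"))       # one high byte before the quote
even = gbk_to_utf8(php_addslashes(b"\xdf\xdf'"))  # two high bytes before the quote

print(odd.hex(" "))
print(even.hex(" "))
```

With one high byte the converter pairs it with the inserted 0x5c, so the output ends in a bare quote and contains no backslash. With two high bytes they pair with each other, and the \' survives the conversion intact.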

So why couldn't this technique be used earlier, when converting UTF-8 to GBK?

This has to do with the rules of UTF-8. The UTF-8 encoding rules are simple; there are only two:

1) For a single-byte symbol, the first bit of the byte is set to 0, and the remaining 7 bits are the Unicode code of the symbol. So for the English alphabet, the UTF-8 encoding and the ASCII code are the same.

2) For an n-byte symbol (n>1), the first n bits of the first byte are set to 1, bit n+1 is set to 0, and the first two bits of each following byte are set to 10. The remaining bits are all filled with the Unicode code of the symbol.

From rule 2 we can see that, for a multi-byte symbol, the first two bits of its 2nd, 3rd, and 4th bytes are 10, so \ (0x5c) can never appear inside a multi-byte UTF-8 sequence. If a \ follows a lead byte when converting UTF-8 to GBK, PHP raises an error.
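The two rules can be written out by hand and checked against a real encoder. In this Python sketch, `utf8_encode` is a hypothetical helper covering code points up to 4 bytes, and Python's strict utf-8 decoder stands in for the check that makes iconv fail.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one code point by hand, following the two rules."""
    if cp < 0x80:                                   # rule 1: 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:                                  # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:                                # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18), 0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F), 0x80 | (cp & 0x3F)])

samples = [0x5C, 0xE9, 0x9326, 0x1F600]
encoded = [utf8_encode(cp) for cp in samples]

# Every byte of a multi-byte sequence has its high bit set, so none can be 0x5c.
multibyte_ok = all(b >= 0x80 for e in encoded if len(e) > 1 for b in e)

# A lead byte followed by 0x5c is invalid UTF-8, so a strict decoder rejects it.
try:
    b"\xdf\\'".decode("utf-8")
    rejected = False
except UnicodeDecodeError:
    rejected = True

print([e.hex(" ") for e in encoded], multibyte_ok, rejected)
```

Only the single-byte encoding of the backslash itself produces 0x5c, which is why the %df trick cannot survive a UTF-8 to GBK conversion.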

But because GBK encodings can contain \, the conversion can still be exploited, just in a different way.

In short, after dealing with MySQL wide-character injection, do not assume you are safe. Be careful when calling iconv and avoid unnecessary trouble.

0x06 Summary

With today's gradual internationalization, the move to UTF-8 is a major trend. From a security standpoint, I also feel that using UTF-8 avoids many problems caused by multi-byte encodings.

This is not only about GBK; I just habitually used GBK as the typical example in this article. There are many multi-byte encodings in the world, and CMSs from Korea, Japan, and other non-English-speaking countries in particular may have security problems caused by character encoding. We should keep an open mind and extend the idea to them.

To summarize the security issues caused by character encoding mentioned in this article, and their solutions:

1. Wide-character injection caused by GBK encoding: the fix is to set character_set_client=binary.

2. Correcting a misconception about mysql_real_escape_string: calling SET NAMES gbk together with mysql_real_escape_string is not enough to avoid wide-character injection. You also have to call mysql_set_charset to set the character set.

3. Use iconv sparingly to convert string encodings, because it easily causes problems. As long as the front-end HTML/JS/CSS is all encoded in GBK and MySQL/PHP are set to GBK, there will be no garbled text. Do not call iconv needlessly to convert encodings; it only causes unnecessary trouble.

This article is a small summary of my own white-box auditing experience. I am lacking in many areas, so the techniques described here inevitably contain flaws and mistakes; I hope readers with the same interests will point them out so we can improve together.

Unlike my previous XSS article, this one cannot cite many 0day examples to demonstrate the harm caused by wide characters. There are two reasons:

1. Wide-character problems are not as common as rich-text XSS, and GBK-encoded CMSs make up a relatively small share. Blame my limited knowledge, but I could not find a real-world instance for every section.

2. Injection is far more dangerous than XSS, and publishing such bugs as 0days would have a very bad impact. Still, I did find many CMSs with encoding problems while writing this article and in earlier audits.

So I took an experimental approach and wrote small PHP files as examples for everyone. I hope the lack of real-world examples does not affect learning.

Example PHP files and SQL file, packaged for download:

Link: http://pan.baidu.com/s/1eQmUArw Extraction password: 75tu

PDF version of this document: Link: http://pan.baidu.com/s/1eprLs Password: yoyw

Reprinted from: https://www.leavesongs.com/PENETRATION/mutibyte-sql-inject.html. Thanks to Phithon for sharing.
