Network attack technology (iii) -- Denial Of Service

Last Update:2013-11-20 Source: Internet

Author: User

1.1.1 Summary

Network security has recently drawn a lot of attention. In addition to the incidents in which domestic sites were found storing passwords in plaintext, another issue with wide impact has surfaced: Hash Collision DoS (denial-of-service attacks through hash collisions). Attackers can exploit this vulnerability to make a server extremely slow. How do they do it, and how can we defend against such attacks? This article explains in detail.

1.1.2 Main text

Before introducing the Hash Collision DoS attack, let's first review the hash table.

A hash table (also called a hash map) is a data structure that provides direct access to records by key. It maps a key to a location in a table so that the record can be found quickly. The mapping function is called a hash function (its quality affects system performance), and the array that stores the records is called the hash table.

As we know, a hash function cannot avoid collisions (also called hash conflicts) entirely.

Suppose we define a hash function hash(), where m is the original key before hashing and h = hash(m) is its hash value.

For two original keys m1 and m2 we can then compute the hash values hash(m1) and hash(m2).

If m1 = m2, the two hash values are necessarily equal. But two different keys (m1 != m2) may also produce the same hash value; this is a hash collision. In most cases collisions can only be made less likely, not eliminated.
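
As a small illustration of a collision, here is a sketch in Python (used for the added examples in this article because it is compact; the article's own code is C#). The function toy_hash and the table size of 8 are made up for this example and do not correspond to any real library's hash function.

```python
# Toy example: the hash function and the table size are illustrative assumptions.
TABLE_SIZE = 8

def toy_hash(key: str) -> int:
    """Sum the character codes of the key (a deliberately weak hash)."""
    return sum(ord(ch) for ch in key)

def slot_for(key: str) -> int:
    """Map a key to a slot index in a table with TABLE_SIZE slots."""
    return toy_hash(key) % TABLE_SIZE

m1, m2 = "ab", "ba"
# Two different keys with the same hash value, hence the same slot.
print(m1 != m2, toy_hash(m1), toy_hash(m2), slot_for(m1), slot_for(m2))
```

Because the toy hash ignores character order, any two anagrams collide, which shows how much the choice of hash function matters.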

When a collision occurs, a collision resolution method is used to handle it. The main methods are:

Open addressing

Rehashing

Separate chaining (linked lists)

A common overflow area

These methods handle collisions once they occur, but can we also make collisions as rare as possible? The fewer collisions there are, the better the system performs, so we want a hash algorithm with a low probability of collision.

An algorithm is generally measured by its best-case, average-case, and worst-case time and space complexity.

Ideally, inserting, looking up, or deleting one element in a hash table takes O(1) time, so doing so for n elements takes O(n). The hash value of any key can be computed in time that does not depend on the length of the table, and the hash value then locates a slot in the table (also called a bucket, meaning one position in the hash table). In the ideal case we insert n elements into a table of length n and their hash values are spread evenly, one per slot, with no collisions. In practice this does not hold: we cannot predict how many elements will be inserted, and the length of the table is limited, so collisions cannot be avoided.

Figure 1 time complexity of a hash table

There are two broad strategies for resolving collisions. The first places the colliding item in another slot chosen by some rule. Linear probing in open addressing is an example: if a collision occurs on insertion, the slots after the original one are examined in order (linear probing: d_i = 1, 2, 3, ..., m-1) and the item is stored in the first unoccupied slot. The second strategy lets each slot hold not a single item but a data structure that can hold several items (such as a linked list or a red-black tree), and all colliding items are organized in that structure.
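
The two strategies can be sketched as follows in Python. This is a minimal illustration, not the code of any particular runtime: probe_insert and chain_insert are hypothetical names, and the hash value h is passed in by the caller so that collisions are easy to force.

```python
from typing import List, Optional, Tuple

Slot = Optional[Tuple[str, str]]  # (key, value), or None for an empty slot

def probe_insert(table: List[Slot], key: str, value: str, h: int) -> int:
    """Open addressing with linear probing: try h, h+1, h+2, ... (mod m)."""
    m = len(table)
    for d in range(m):                      # d = 0, 1, 2, ..., m-1
        i = (h + d) % m
        if table[i] is None or table[i][0] == key:
            table[i] = (key, value)
            return i                        # index of the slot actually used
    raise RuntimeError("hash table is full")

def chain_insert(buckets: List[list], key: str, value: str, h: int) -> int:
    """Separate chaining: every slot holds a list of (key, value) pairs."""
    i = h % len(buckets)
    for n, (existing_key, _) in enumerate(buckets[i]):
        if existing_key == key:
            buckets[i][n] = (key, value)    # overwrite an existing key
            return i
    buckets[i].append((key, value))
    return i

table: List[Slot] = [None] * 4
# Three keys that all hash to slot 2 end up in slots 2, 3 and 0.
print([probe_insert(table, k, "v", 2) for k in "abc"])   # [2, 3, 0]
```

With chaining, the same three keys would all stay in slot 2 and the list stored there would simply grow.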

Figure 2 open address Method

Whichever collision resolution strategy is used, insert, lookup, and delete are no longer guaranteed to take O(1) time. Take lookup as an example: locating a slot from the hash value is not the end of the search. The original key (the key before hashing) must also be compared; if it does not match, the search continues with the same rule used for insertion until a matching key is found or it is established that the key is not in the table.

.NET uses the first strategy to resolve collisions: it places colliding items in other slots according to a fixed rule.

PHP uses a singly linked list to store colliding items. The average lookup complexity of a PHP hash table is therefore O(L), where L is the average length of a bucket's list, and the worst case is O(N), when every item collides and the hash table degenerates into a single linked list. The figures below show a normal hash table and a degraded hash table in PHP.

Figure 3 normal hash table

Figure 4 degraded hash table

In a normal hash table the hash values are evenly distributed and collisions are rare. In the degraded table every item collides in the same slot, so inserting, looking up, or deleting n items takes O(n^2) time instead of O(n). Because the cost grows from linear to quadratic, a large amount of CPU time is consumed and the system cannot respond to requests in a timely manner. This is how the DoS attack achieves its goal.
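
The jump from linear to quadratic cost can be made concrete by counting key comparisons. The following Python sketch is an illustration only: insert_all is a made-up helper that models a chained hash table with 64 buckets, not PHP's or .NET's actual implementation.

```python
def insert_all(keys, hash_fn, n_buckets=64):
    """Insert keys into a chained hash table and count key comparisons."""
    buckets = [[] for _ in range(n_buckets)]
    comparisons = 0
    for key in keys:
        chain = buckets[hash_fn(key) % n_buckets]
        for existing in chain:        # walk the chain looking for a duplicate
            comparisons += 1
            if existing == key:
                break
        else:
            chain.append(key)         # not found: append to the chain
    return comparisons

keys = [f"key{i}" for i in range(1000)]
spread = insert_all(keys, lambda k: int(k[3:]))   # keys spread over all buckets
collided = insert_all(keys, lambda k: 0)          # every key lands in slot 0
print(spread, collided)                           # 7320 499500
```

When all n keys collide, the i-th insertion walks a chain of i items, so the total is n(n-1)/2 comparisons, which is 499500 for n = 1000.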

Implementation of hash tables in .NET

Data Structure

.NET defines a struct named bucket to represent a slot. It contains only three fields:

    /// <summary>
    /// Defines a hash bucket.
    /// </summary>
    private struct bucket
    {
        /// <summary>
        /// The hash key.
        /// </summary>
        public object key;

        /// <summary>
        /// The data value.
        /// </summary>
        public object val;

        /// <summary>
        /// The hash code; the sign bit marks a collision.
        /// </summary>
        public int hash_coll;
    }

When resolving collisions, linear probing tends to cause records to cluster. To reduce collisions, .NET rehashes and dynamically increases the length of the hash table.

    /// <summary>
    /// Rehashes the table into a bucket array of the specified size.
    /// </summary>
    /// <param name="newsize">The new size.</param>
    private void rehash(int newsize)
    {
        this.occupancy = 0;

        // Creates a new bucket array.
        Hashtable.bucket[] newBuckets = new Hashtable.bucket[newsize];
        for (int i = 0; i < this.buckets.Length; i++)
        {
            Hashtable.bucket bucket = this.buckets[i];
            if ((bucket.key != null) && (bucket.key != this.buckets))
            {
                this.putEntry(newBuckets, bucket.key, bucket.val,
                    bucket.hash_coll & 0x7fffffff);
            }
        }

        Thread.BeginCriticalRegion();
        this.isWriterInProgress = true;

        // Switches to the new bucket array.
        this.buckets = newBuckets;
        this.loadsize = (int)(this.loadFactor * newsize);
        this.UpdateVersion();
        this.isWriterInProgress = false;
        Thread.EndCriticalRegion();
    }

As the rehash() method shows, when collisions build up .NET rehashes the entries into a new bucket array, increasing the length of the hash table where needed, to avoid further collisions.

Hash Algorithm

Now let's look at the hash algorithm .NET uses, starting with the Object.GetHashCode() method:

    public virtual int GetHashCode()
    {
        return InternalGetHashCode(this);
    }

Object.GetHashCode() calls another method, InternalGetHashCode(). Looking further, InternalGetHashCode() maps to the CLR method ObjectNative::GetHashCode, which is implemented as follows:

    FCIMPL1(INT32, ObjectNative::GetHashCode, Object* obj) {
        CONTRACTL
        {
            THROWS;
            DISABLED(GC_NOTRIGGER);
            INJECT_FAULT(FCThrow(kOutOfMemoryException););
            MODE_COOPERATIVE;
            SO_TOLERANT;
        }
        CONTRACTL_END;

        VALIDATEOBJECTREF(obj);

        DWORD idx = 0;

        if (obj == 0)
            return 0;

        OBJECTREF objRef(obj);

        HELPER_METHOD_FRAME_BEGIN_RET_1(objRef);   // Set up a frame

        // Invokes another method to create the hash code.
        idx = GetHashCodeEx(OBJECTREFToObject(objRef));

        HELPER_METHOD_FRAME_END();

        return idx;
    }
    FCIMPLEND

The method itself is not complex, but it calls GetHashCodeEx(), which contains the actual hash algorithm. That implementation is very long, so it is not reproduced here; it can be found in the CLR's C++ source code.

Many mainstream programming languages currently use the DJB hash algorithm (DJBX33A). In .NET, the NameValueCollection.GetHashCode() method uses the DJB algorithm.

The core of the DJB algorithm is to multiply the running hash value by 33 (that is, shift it left by five bits and add the original value) and then add the next character. Let's look at an implementation of the DJB algorithm:

    /// <summary>
    /// Uses the DJBX33A hash function to hash the specified value.
    /// </summary>
    /// <param name="value">The value.</param>
    /// <returns>The hash value.</returns>
    public static uint DJBHash(string value)
    {
        if (string.IsNullOrEmpty(value))
        {
            throw new ArgumentNullException("value", "The value to hash can't be empty.");
        }

        uint hash = 5381;
        for (int i = 0; i < value.Length; i++)
        {
            // The value of ((hash << 5) + hash) is the same as
            // the value of hash * 33.
            hash = ((hash << 5) + hash) + value[i];
        }

        return hash;
    }
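
The multiply-by-33 step is easy to check outside C#. The sketch below restates DJBX33A in Python; djb_hash and MASK32 are names chosen for this illustration, and the mask emulates the 32-bit wrap-around that the C# uint type provides automatically.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit unsigned wrap-around

def djb_hash(value: str) -> int:
    """DJBX33A: hash = hash * 33 + character code, starting from 5381."""
    if not value:
        raise ValueError("value must be a non-empty string")
    h = 5381
    for ch in value:
        # (h << 5) + h is the same as h * 33
        h = (((h << 5) + h) + ord(ch)) & MASK32
    return h

print(djb_hash("a"))    # 177670, that is 5381 * 33 + 97
print(djb_hash("ab"))   # 5863208
```

Since each step is a multiplication by a constant followed by an addition, the contribution of each character is fixed by its position, which is exactly what the equivalent-substring technique described later exploits.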

The DJB algorithm is very simple to implement, yet it is a good hash algorithm with a low probability of collisions. Next, let's look at the implementation of String.GetHashCode() in .NET, which uses the DEK algorithm.

    /// <summary>
    /// Returns a hash code for this instance.
    /// </summary>
    /// <returns>
    /// A hash code for this instance, suitable for use in hashing algorithms
    /// and data structures like a hash table.
    /// </returns>
    public override unsafe int GetHashCode()
    {
        // Pins the string so the GC can't move it while a pointer is in use.
        fixed (char* str = (char*)this)
        {
            char* chPtr = str;
            int num = 0x15051505;
            int num2 = num;
            int* numPtr = (int*)chPtr;
            for (int i = this.Length; i > 0; i -= 4)
            {
                // Uses DEK to generate the hash code.
                num = (((num << 5) + num) + (num >> 0x1b)) ^ numPtr[0];
                if (i <= 2)
                {
                    break;
                }

                // Uses DEK to generate the hash code.
                num2 = (((num2 << 5) + num2) + (num2 >> 0x1b)) ^ numPtr[1];
                numPtr += 2;
            }

            return (num + (num2 * 0x5d588b65));
        }
    }

The String class overrides GetHashCode(). Because the method uses pointer operations, it is marked unsafe to indicate an unsafe context. Some readers may wonder whether C# can really work with pointers the way C/C++ does. It can, and this is where the fixed keyword comes in. The fixed keyword pins an object at its current address. The CLR garbage collector may move objects in memory, which changes their addresses. A raw pointer taken before the move would then point at the wrong location, so the object must be pinned for as long as the pointer is in use.

Hash collision attack

The previous section introduced the GetHashCode() method in .NET, so we now have a basic understanding of the algorithm. The principle of a hash collision attack is to construct data, based on the specific hash algorithm, such that all of the items collide with one another.

But how do we construct such data? Let's start with an example: suppose we insert data into an object of type NameValueCollection.

Figure 5 insert data

Inserting 1,000 items takes only 88 ms, while inserting 2,000 items takes 345 ms. As the amount of inserted data grows, insertion takes disproportionately longer. Isn't inserting n items into a hash table supposed to take O(n) time? The reason is that, because of hash collisions, the time no longer grows linearly.

The simplest way to construct colliding data is brute force, but it is inefficient.

Because brute force is so inefficient, we use more efficient methods to construct colliding data: equivalent substrings and the meet-in-the-middle attack.

Equivalent substrings:

Some hash functions have the following property: if two strings collide, for example hash("string1") = hash("string2"), then strings that contain them as substrings at the same position also collide, for example hash("prefixstring1postfix") = hash("prefixstring2postfix").

For example, if "Ez" and "FY" collide under the hash function, then the strings "EzEz", "EzFY", "FYEz", and "FYFY" all collide as well.
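
The "Ez"/"FY" pair can be checked directly against DJBX33A. The Python sketch below is an illustration only; djbx33a and colliding_keys are names chosen for this example, and the 32-bit mask emulates unsigned integer overflow.

```python
from itertools import product

MASK32 = 0xFFFFFFFF  # emulate 32-bit unsigned wrap-around

def djbx33a(s: str) -> int:
    """DJBX33A: multiply by 33 and add the next character code."""
    h = 5381
    for ch in s:
        h = (h * 33 + ord(ch)) & MASK32
    return h

def colliding_keys(blocks, count):
    """All concatenations of `count` blocks chosen from `blocks`."""
    return ["".join(parts) for parts in product(blocks, repeat=count)]

keys = colliding_keys(["Ez", "FY"], 3)       # 2**3 = 8 different keys
hashes = {djbx33a(k) for k in keys}
print(len(keys), len(hashes))                # 8 1
```

Both blocks contribute the same amount (69 * 33 + 122 = 70 * 33 + 89 = 2399), so every key built from k blocks has the same hash value, and k blocks yield 2^k colliding keys.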

Meet-in-the-middle attack:

If the hash function does not have equivalent substrings, brute force seems to be the only option. But as noted, brute force is inefficient: with a 32-bit hash value, the probability that a candidate hits the target is 1/(2^32).

The meet-in-the-middle approach reduces the problem to matching a 16-bit hash value, so the probability of hitting the target becomes 1/(2^16), which greatly improves the hit rate and shortens the time needed to construct the data.

We split the string into two parts, a prefix substring (length n) and a suffix substring (length m). We then enumerate candidate prefixes, hashing each one, until the hash value of the prefix equals the intermediate value required by the suffix.

Here we need to review a little mathematics: the exclusive-or (XOR) operation.

Figure 6 exclusive or operation

XOR is its own inverse, which lets us run the hash function backwards from a chosen target value. To undo one step of the DJB33 hash, we XOR the hash value with the character and then multiply the result by 1041204193. Some may ask: why 1041204193?

Because 1041204193 * 33 = 34359738369

Binary: 100000000000000000000000000000000001 (36 bits); truncated to an Int32 this is 00000000000000000000000000000001

In 32-bit arithmetic, therefore, 1041204193 * 33 = 1. Multiplying by 1041204193 cancels a multiplication by 33, so every step of the hash can be reversed and the hash value depends only on the characters we choose.
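
The arithmetic can be verified in a few lines of Python (a sketch; step and unstep are names made up here for one forward and one backward round of the XOR variant of the hash):

```python
MOD = 2 ** 32
MASK32 = MOD - 1
INV33 = 1041204193

print(INV33 * 33)          # 34359738369, which is 2**35 + 1
print(INV33 * 33 % MOD)    # 1
print(pow(33, -1, MOD))    # 1041204193 (needs Python 3.8 or later)

def step(h: int, c: int) -> int:
    """One forward round: multiply by 33, then XOR the character code."""
    return ((h * 33) & MASK32) ^ c

def unstep(h: int, c: int) -> int:
    """One backward round: XOR the character code, then multiply by 33's inverse."""
    return ((h ^ c) * INV33) & MASK32

print(unstep(step(5381, ord("S")), ord("S")))   # 5381
```

Any odd multiplier has such an inverse modulo 2^32, so the same trick works for other multiply-and-XOR hash functions.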

The sample code of the HashBack () method is as follows:

    /// <summary>
    /// The hash back function.
    /// </summary>
    /// <param name="tmp">The string to hash back.</param>
    /// <param name="end">The hash back value.</param>
    /// <returns>The hash value required before the string.</returns>
    [ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail)]
    private static unsafe int HashBack(string tmp, int end)
    {
        int hash = end;
        fixed (char* str = tmp)
        {
            char* suffix = str;
            int length = tmp.Length;
            for (; length > 0; length -= 1)
            {
                hash = (hash ^ suffix[length - 1]) * 1041204193;
            }

            return hash;
        }
    }

The HashBack() method takes two parameters: the string to run backwards through the hash function, and the hash value that the final colliding string must have.

Next, here is the code that implements the DJB33 hash function:

    /// <summary>
    /// The hash function with the DJB33 algorithm.
    /// </summary>
    /// <param name="tmp">The string to hash.</param>
    /// <returns>The hash value.</returns>
    [ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail)]
    private static unsafe int Hash(string tmp)
    {
        int hash = 5381;
        fixed (char* str = tmp)
        {
            char* p = str;
            int tmpLength = tmp.Length;
            for (; tmpLength > 0; tmpLength -= 1)
            {
                hash = ((hash << 5) + hash) ^ *p++;
            }

            return hash;
        }
    }

With HashBack() and Hash() in place, we first use HashBack() to compute, from the target hash value and a suffix, the intermediate hash value that the prefix must produce. We then use Hash() to search for a prefix whose hash value equals that intermediate value, and finally concatenate the prefix and the suffix to form a colliding string.

That may sound confusing, so let's walk through a concrete example.

Suppose the target hash value is 6888888 and we pick the suffix "NBJ", for which HashBack("NBJ", 6888888) = 147958270. By brute force we then find a prefix with the same hash value, Hash("SKF0FTG") = 147958270. Concatenating the two gives Hash("SKF0FTG" + "NBJ") = 6888888. The hash value of the concatenated string is the one we specified in advance, so we can keep constructing colliding data this way.
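
The numbers in this example can be reproduced with a short Python sketch of the two functions. It mirrors the logic of Hash() and HashBack() with the suffix "NBJ" placed after the prefix "SKF0FTG"; hash_x33x and hash_back are names chosen for this illustration, and the mask emulates 32-bit overflow (all values here are below 2^31, so they match the signed C# int results).

```python
MASK32 = 0xFFFFFFFF
INV33 = 1041204193            # inverse of 33 modulo 2**32

def hash_x33x(s: str, h: int = 5381) -> int:
    """Forward hash: multiply by 33, then XOR the next character code."""
    for ch in s:
        h = ((h * 33) & MASK32) ^ ord(ch)
    return h

def hash_back(suffix: str, end: int) -> int:
    """Run the hash backwards over `suffix`, starting from the value `end`."""
    h = end
    for ch in reversed(suffix):
        h = ((h ^ ord(ch)) * INV33) & MASK32
    return h

target = 6888888
middle = hash_back("NBJ", target)
print(middle)                          # 147958270
print(hash_x33x("SKF0FTG"))            # 147958270
print(hash_x33x("SKF0FTG" + "NBJ"))    # 6888888
```

Repeating the procedure with other suffixes and the same target yields as many colliding keys as needed; the expensive part is the brute-force search for each prefix, which is not shown here.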

We then ran the insertion test with colliding data constructed this way. As soon as insertion began, CPU usage started to climb; once it reached 100%, the machine became unresponsive.

Figure 7 hash conflict Test

Defense

Limit CPU time

This is the simplest way to reduce the impact of such attacks: limit the CPU time a single request is allowed to consume. For PHP you can set the max_input_time parameter; in IIS (ASP.NET) you can set the "shutdown time limit" value (90 s by default).

Limit the number of POST Request Parameters

The security update released by Microsoft limits ASP.NET to accepting at most 1000 parameters when processing an HTTP POST request.

If a web application needs to accept more than 1000 parameters, the maximum can be changed by setting MaxHttpCollectionKeys in Web.config, as follows:

    <!-- Setting Max Http post value -->
    <appSettings>
      <add key="aspnet:MaxHttpCollectionKeys" value="1001" />
    </appSettings>
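
The same idea, capping the number of form fields before they are placed in a hash table, can be sketched outside ASP.NET. The Python example below uses the standard library function urllib.parse.parse_qsl with its max_num_fields argument; parse_form and MAX_KEYS are names made up for this illustration, and the limit of 1000 simply mirrors the ASP.NET default mentioned above.

```python
from urllib.parse import parse_qsl

MAX_KEYS = 1000   # assumption: same limit as the ASP.NET default

def parse_form(body: str, max_keys: int = MAX_KEYS):
    """Parse a form-encoded body; return None if it has too many fields."""
    try:
        return parse_qsl(body, keep_blank_values=True, max_num_fields=max_keys)
    except ValueError:
        # Too many fields: the caller should reject the request.
        return None

ok = "&".join(f"k{i}=v" for i in range(1000))
too_many = ok + "&extra=v"
print(len(parse_form(ok)), parse_form(too_many))   # 1000 None
```

Rejecting the request before the fields are inserted bounds the worst case: even if all 1000 keys collide, the quadratic cost stays small.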

Limit the length of POST requests

Use a random hash algorithm

Because the hash algorithm is known in advance, colliding data can be constructed in a targeted way. With a random hash algorithm the attacker cannot predict the hash values, so constructing colliding data becomes difficult. Unfortunately, at the time of writing many mainstream programming languages use non-random hash algorithms: apart from Perl, languages and platforms such as .NET, Java, Ruby, PHP, and Python all use non-random algorithms.
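
The idea of a random (keyed) hash can be sketched in Python with the standard library. This is only an illustration of the principle, not how any of the languages named above implement it: keyed_hash and PROCESS_KEY are hypothetical names, and BLAKE2b is used here simply because hashlib offers it with a key parameter.

```python
import hashlib
import secrets

# Chosen once when the process starts and never revealed to clients.
PROCESS_KEY = secrets.token_bytes(16)

def keyed_hash(data: str, key: bytes = PROCESS_KEY) -> int:
    """64-bit keyed hash of a string, using BLAKE2b with a secret key."""
    digest = hashlib.blake2b(data.encode("utf-8"), key=key, digest_size=8).digest()
    return int.from_bytes(digest, "big")

# The same key always gives the same value, so table lookups still work.
print(keyed_hash("Ez", b"key-one") == keyed_hash("Ez", b"key-one"))   # True
# Without the key, an attacker cannot tell which inputs will collide.
print(keyed_hash("Ez", b"key-one") == keyed_hash("FY", b"key-one"))
```

Note that merely randomizing the starting value of DJBX33A would not be enough, because pairs such as "Ez" and "FY" collide for every starting value; the key has to influence how the characters are mixed.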

1.1.3 Summary

This article introduced the hash table, its implementation, and the ways collisions are handled. It then showed how to construct colliding data for a specific hash algorithm (using equivalent substrings and the meet-in-the-middle attack), and finally described how to defend against hash denial-of-service attacks.

Because the hash algorithm is known in advance, colliding data can be constructed in a targeted way. A random hash algorithm is therefore a more effective defense against hash denial-of-service attacks, and many languages will probably redesign their hash functions.
