Several common hash algorithms and principles

Source: Internet
Author: User
Tags: bitwise

In computational theory there is no notion of a hash function as such, only of a one-way function. The formal definition of a one-way function is involved; see a reference on computational theory or cryptography. In plain language, a one-way function is one whose result is easy to compute from a given input, but whose input is hard to compute from a given result. The various encryption functions can be regarded as approximations of one-way functions. A hash function can also be seen as an approximation of a one-way function; that is, it comes close to satisfying the definition.
The term hash function also has a second meaning. In practice, a hash function maps a large range onto a small range. The goal is usually to save space and make data easy to store. Hash functions are also widely used for lookups. Before using a hash function, you need to understand several of its limitations:
1. The main principle of a hash is to map a large range onto a small range, so the number of distinct values you actually feed in must be no larger than the small range. Otherwise there will be many collisions.
2. Because a hash approximates a one-way function, it can be used to protect data in cryptographic applications.
3. Different applications place different requirements on hash functions. A hash function used for cryptography is judged mainly by how close it comes to a one-way function, while a hash function used for lookup is judged mainly by its collision rate when mapping onto a small range.
Hash functions for cryptography have been discussed extensively elsewhere, and the author's blog has a more detailed introduction. This article therefore discusses only hash functions used for lookup.
The input of a hash function is usually an array (for example, a string), and its output is generally an int. All the examples below take this form.
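Generally speaking, hash functions can be divided into the following categories:
1. Additive hash;
2. Bitwise hash;
3. Multiplicative hash;
4. Division hash;
5. Table-lookup hash;
6. Mixed hash.
One. Additive hash
An additive hash adds the input elements together one by one to form the result: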
static int additiveHash(String key, int prime)
{
    int hash, i;
    // Start from the length, then add every character.
    for (hash = key.length(), i = 0; i < key.length(); i++)
        hash += key.charAt(i);
    return (hash % prime);
}

Here prime can be any prime number, and the result always falls in the range [0, prime-1].
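One consequence of pure addition is that the order of the characters does not matter, so anagrams always collide. The sketch below illustrates this; the class name and the choice of 31 as the prime are assumptions made only for the example.

```java
final class AdditiveHashDemo {
    // Additive hash as described above: start from the length, add every character.
    static int additiveHash(String key, int prime) {
        int hash = key.length();
        for (int i = 0; i < key.length(); i++) {
            hash += key.charAt(i);
        }
        return hash % prime;
    }

    public static void main(String[] args) {
        int prime = 31; // illustrative choice; any prime works
        for (String s : new String[] {"abc", "cba", "abd"}) {
            System.out.println(s + " -> " + additiveHash(s, prime));
        }
    }
}
```

"abc" and "cba" contain the same characters, so they receive the same value, while "abd" differs by one.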

Two. Bitwise hash

This type of hash function mixes the input elements thoroughly by using various bitwise operations (most commonly shift and XOR). For example, the standard rotating hash is constructed as follows:

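static int rotatingHash(String key, int prime)
{
    int hash, i;
    for (hash = key.length(), i = 0; i < key.length(); ++i)
        // Rotate left by 4 bits (>>> is the unsigned shift in Java), then mix in the character.
        hash = (hash << 4) ^ (hash >>> 28) ^ key.charAt(i);
    // hash may be negative here; floorMod keeps the result in [0, prime-1].
    return Math.floorMod(hash, prime);
}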

The main feature of this type of hash function is that it shifts first and then applies various bit operations. For example, the hash computation in the code above can also take the following variants:
1. hash = (hash << 5) ^ (hash >>> 27) ^ key.charAt(i);
2. hash += key.charAt(i);
   hash += (hash << 10);
   hash ^= (hash >> 6);
3. if ((i & 1) == 0)
   {
       hash ^= (hash << 7) ^ key.charAt(i) ^ (hash >> 3);
   }
   else
   {
       hash ^= ~((hash << 11) ^ key.charAt(i) ^ (hash >> 5));
   }
4. hash += (hash << 5) + key.charAt(i);
5. hash = key.charAt(i) + (hash << 6) + (hash >> 16) - hash;
6. hash ^= ((hash << 5) + key.charAt(i) + (hash >> 2));
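The shift pair used by the standard rotating hash, a left shift by 4 combined with a right shift by 28, amounts to a 32-bit rotation, provided the right shift is unsigned (>>> in Java). As a rough sketch, the same hash can be written with Integer.rotateLeft; the class name and the use of Math.floorMod to keep the result non-negative are choices of this example, not part of the original listing.

```java
final class RotateDemo {
    // Rotating hash written with Integer.rotateLeft, which performs the
    // (hash << 4) ^ (hash >>> 28) step as a single 32-bit rotation.
    static int rotatingHash(String key, int prime) {
        int hash = key.length();
        for (int i = 0; i < key.length(); i++) {
            hash = Integer.rotateLeft(hash, 4) ^ key.charAt(i);
        }
        return Math.floorMod(hash, prime); // non-negative even if hash is negative
    }

    public static void main(String[] args) {
        int h = 0xF0000001;
        // Both expressions print the same value.
        System.out.println(Integer.toHexString((h << 4) ^ (h >>> 28)));
        System.out.println(Integer.toHexString(Integer.rotateLeft(h, 4)));
        // Unlike a purely additive hash, character order now matters.
        System.out.println(rotatingHash("abc", 31) + " vs " + rotatingHash("cba", 31));
    }
}
```

Because each character is mixed in after a rotation, anagrams such as "abc" and "cba" no longer hash to the same value automatically.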
Three. Multiplicative hash
This type of hash function exploits the decorrelating effect of multiplication. The best-known use of this property is random number generation by squaring a number and keeping part of its digits, as in the middle-square method, although that algorithm is not very effective.
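A minimal sketch of the basic form: multiply the running value by a fixed constant and add the next character. The class and method names are hypothetical. The multiplier 33 is the value recommended for English words later in this article, and the comparison with String.hashCode() relies on Java's documented use of 31 as its multiplier.

```java
final class MultiplicativeHashDemo {
    // Basic multiplicative hash: multiply the running value by a fixed
    // constant, then add the next character. Overflow simply wraps.
    static int multiplicativeHash(String key, int multiplier) {
        int hash = 0;
        for (int i = 0; i < key.length(); i++) {
            hash = multiplier * hash + key.charAt(i);
        }
        return hash;
    }

    public static void main(String[] args) {
        System.out.println(multiplicativeHash("hash", 33));
        // With multiplier 31 the result matches String.hashCode().
        System.out.println(multiplicativeHash("hash", 31) == "hash".hashCode());
    }
}
```

The result is not reduced to a table size here; a caller would still apply a modulus or a mask.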

Well-known hash functions of this kind include:
The 32-bit FNV algorithm
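int m_shift = 0;
int m_mask = 0xFFFFFFFF; // not defined in the original listing; assumed here to keep all 32 bits
public int FNVHash(byte[] data)
{
    int hash = (int) 2166136261L;
    for (byte b : data)
        hash = (hash * 16777619) ^ (b & 0xff); // mask so the byte is treated as unsigned
    if (m_shift == 0)
        return hash;
    return (hash ^ (hash >> m_shift)) & m_mask;
}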
And the improved FNV algorithm:
public static int FNVHash1(String data)
{
    final int p = 16777619;
    int hash = (int) 2166136261L;
    for (int i = 0; i < data.length(); i++)
        hash = (hash ^ data.charAt(i)) * p;
    // Extra shift-and-add mixing after the main loop.
    hash += hash << 13;
    hash ^= hash >> 7;
    hash += hash << 3;
    hash ^= hash >> 17;
    hash += hash << 5;
    return hash;
}
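The xor-then-multiply step at the core of the improved variant is commonly known as FNV-1a. A minimal sketch over raw bytes, without the extra shift-and-add mixing, might look like the following. The class, method and constant names are illustrative, and masking each byte with 0xff is a choice of this sketch so that bytes are treated as unsigned.

```java
import java.nio.charset.StandardCharsets;

final class Fnv1aDemo {
    static final int FNV_OFFSET_BASIS = (int) 2166136261L;
    static final int FNV_PRIME = 16777619;

    static int fnv1a(byte[] data) {
        int hash = FNV_OFFSET_BASIS;
        for (byte b : data) {
            hash ^= (b & 0xff);   // treat the byte as unsigned
            hash *= FNV_PRIME;    // int multiplication wraps modulo 2^32
        }
        return hash;
    }

    public static void main(String[] args) {
        byte[] input = "a".getBytes(StandardCharsets.US_ASCII);
        System.out.println(Integer.toHexString(fnv1a(input))); // prints e40c292c
    }
}
```

An empty input returns the offset basis unchanged, which makes a convenient sanity check.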
Besides multiplying by a fixed number, it is also common to multiply by a number that keeps changing, for example:
static int RSHash(String str)
{
    int b = 378551;
    int a = 63689;
    int hash = 0;
    for (int i = 0; i < str.length(); i++)
    {
        hash = hash * a + str.charAt(i);
        a = a * b; // the multiplier changes on every step
    }
    return (hash & 0x7FFFFFFF); // clear the sign bit
}
Although the Adler-32 algorithm is not as widely used as CRC32, it is probably the best-known member of the multiplicative family. For an introduction, see the RFC 1950 specification.
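As a rough illustration, Adler-32 keeps two running sums modulo 65521: the first adds up the bytes, and the second adds up the first sum, so each byte ends up weighted by its position. The sketch below is a plain, unoptimized version checked against the JDK's java.util.zip.Adler32; the class and method names are illustrative.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Adler32;

final class Adler32Demo {
    static final int MOD_ADLER = 65521; // largest prime below 2^16 (RFC 1950)

    // s1 is a running sum of the bytes, s2 is a running sum of s1.
    static long adler32(byte[] data) {
        int s1 = 1;
        int s2 = 0;
        for (byte b : data) {
            s1 = (s1 + (b & 0xff)) % MOD_ADLER;
            s2 = (s2 + s1) % MOD_ADLER;
        }
        return ((long) s2 << 16) | s1;
    }

    public static void main(String[] args) {
        byte[] input = "Wikipedia".getBytes(StandardCharsets.US_ASCII);
        Adler32 reference = new Adler32();
        reference.update(input);
        // Both lines print the same checksum.
        System.out.println(Long.toHexString(adler32(input)));
        System.out.println(Long.toHexString(reference.getValue()));
    }
}
```

Production implementations defer the modulo operation across many bytes for speed; this version takes it on every step for clarity.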
Four. Division hash
Division, like multiplication, also appears to decorrelate its inputs. However, because division is too slow, this approach has found almost no practical use. Note that the division by a prime at the end of the earlier hashes is there only to bound the range of the result. If you do not need to restrict the range, you can use the following in place of hash % prime: hash = hash ^ (hash>>10) ^ (hash>>20).
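One way to read the replacement expression: it folds the high bits of the hash into the low bits, after which the low bits can be taken with a mask instead of a division. The sketch below assumes a table whose size is a power of two and uses Java's unsigned shift (>>>); both are assumptions of this example rather than requirements stated in the article, and the names are hypothetical.

```java
final class FoldDemo {
    // Fold the high bits into the low bits, as in the expression above.
    static int fold(int hash) {
        return hash ^ (hash >>> 10) ^ (hash >>> 20);
    }

    // Map a hash to a slot of a table whose size is a power of two
    // (an assumption of this sketch), using a mask instead of a division.
    static int slot(int hash, int tableSize) {
        return fold(hash) & (tableSize - 1);
    }

    public static void main(String[] args) {
        int tableSize = 1024;
        // These two hashes differ only in bit 20, yet land in different slots.
        System.out.println(slot(0x00100000, tableSize));
        System.out.println(slot(0x00000000, tableSize));
    }
}
```

Without the fold, a plain mask would ignore bit 20 entirely and both values would share slot 0.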
Five. Table-lookup hash
The best-known example of a table-lookup hash is the CRC family of algorithms. CRC is not inherently a table-lookup algorithm, but table lookup is its fastest implementation. Other notable table-lookup hashes are Universal Hashing and Zobrist Hashing. Their tables are randomly generated.
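To make the table-lookup idea concrete, here is a sketch of a byte-at-a-time CRC-32 that precomputes a 256-entry table and then performs one lookup per input byte. It uses the common reflected polynomial 0xEDB88320 and is checked against the JDK's java.util.zip.CRC32; the class and method names are illustrative.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

final class Crc32TableDemo {
    private static final int[] TABLE = new int[256];

    static {
        // Precompute the effect of each possible byte value.
        for (int n = 0; n < 256; n++) {
            int c = n;
            for (int k = 0; k < 8; k++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
            }
            TABLE[n] = c;
        }
    }

    static long crc32(byte[] data) {
        int crc = 0xFFFFFFFF;
        for (byte b : data) {
            // One table lookup replaces eight bit-by-bit steps.
            crc = TABLE[(crc ^ b) & 0xff] ^ (crc >>> 8);
        }
        return (~crc) & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        byte[] input = "123456789".getBytes(StandardCharsets.US_ASCII);
        CRC32 reference = new CRC32();
        reference.update(input);
        // Both lines print cbf43926.
        System.out.println(Long.toHexString(crc32(input)));
        System.out.println(Long.toHexString(reference.getValue()));
    }
}
```

The table here is derived from the polynomial rather than generated randomly, which is what distinguishes CRC from Universal or Zobrist hashing.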
Six. Mixed hash
Mixed hash algorithms combine the methods described above. Common hash algorithms such as MD5 and Tiger fall into this category. They are rarely used as lookup-oriented hash functions.
Seven. Evaluation of hash algorithms
The page http://www.burtleburtle.net/bob/hash/doobs.html provides an evaluation of several popular hash algorithms. Our suggestions for choosing a hash function are as follows:
1. Hashing strings. The simplest option is the basic multiplicative hash. With a multiplier of 33 it hashes English words well (strings of fewer than 6 lowercase letters are guaranteed not to collide). A slightly more complex option is the FNV algorithm (and its improved form), which performs well on long strings in both speed and quality.
2. Hashing long arrays. You can use the algorithm at http://burtleburtle.net/bob/c/lookup3.c, which processes multiple bytes at a time and is fast.
Eight. Postscript
This article has briefly introduced hash algorithms used for lookup in practical applications. Beyond lookup, another notable application is matching within very long strings. In that setting the algorithm is called a rolling hash, because the hash must be updated incrementally as the window slides along the text. Designing a truly good hash algorithm is not easy. For application developers, choosing a suitable algorithm matters most.
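As a rough sketch of the rolling idea, the search below hashes the pattern once with a multiplicative hash, then slides a window over the text and updates the window hash in constant time per step, in the style of Rabin-Karp matching. The base 31, the class name and the method name are assumptions of this example; arithmetic simply wraps modulo 2^32.

```java
final class RollingHashDemo {
    static final int BASE = 31; // illustrative multiplier

    // Returns the index of the first occurrence of pattern in text, or -1.
    static int indexOf(String text, String pattern) {
        int n = text.length(), m = pattern.length();
        if (m == 0) return 0;
        if (m > n) return -1;

        int highPower = 1; // BASE^(m-1), wrapping modulo 2^32
        for (int i = 1; i < m; i++) highPower *= BASE;

        int target = 0, window = 0;
        for (int i = 0; i < m; i++) {
            target = target * BASE + pattern.charAt(i);
            window = window * BASE + text.charAt(i);
        }

        for (int start = 0; ; start++) {
            // Equal hashes may still be a collision, so confirm by comparing.
            if (window == target && text.regionMatches(start, pattern, 0, m)) {
                return start;
            }
            if (start + m >= n) return -1;
            // Slide the window: drop text[start], append text[start + m].
            window = (window - text.charAt(start) * highPower) * BASE
                    + text.charAt(start + m);
        }
    }

    public static void main(String[] args) {
        System.out.println(indexOf("the quick brown fox", "brown")); // prints 10
    }
}
```

The explicit comparison after a hash match is what keeps the search correct even when two different windows share a hash value.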

Nine. Arrays


Note: Although the hashes above avoid collisions to a great extent, collisions cannot be eliminated. So whatever hash function you use, you must add a method for handling collisions.
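One common way to handle collisions is separate chaining, where each slot holds a list of all keys that hash to it. The sketch below is a minimal string set built that way; the class name and its methods are hypothetical, and it uses String.hashCode() only as a stand-in for whichever hash function is chosen.

```java
import java.util.ArrayList;
import java.util.List;

final class ChainedSet {
    private final List<List<String>> buckets = new ArrayList<>();

    ChainedSet(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) buckets.add(new ArrayList<>());
    }

    // floorMod keeps the index non-negative even for negative hash codes.
    private List<String> bucketFor(String key) {
        return buckets.get(Math.floorMod(key.hashCode(), buckets.size()));
    }

    // Returns false if the key was already present.
    boolean add(String key) {
        List<String> bucket = bucketFor(key);
        if (bucket.contains(key)) return false;
        bucket.add(key);
        return true;
    }

    boolean contains(String key) {
        return bucketFor(key).contains(key);
    }

    public static void main(String[] args) {
        ChainedSet set = new ChainedSet(16);
        // "Aa" and "BB" have the same hashCode, so they share a bucket.
        set.add("Aa");
        set.add("BB");
        System.out.println(set.contains("Aa") + " " + set.contains("BB"));
    }
}
```

Both keys remain retrievable despite the collision, because the final equality check inside the bucket tells them apart.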
