Java Basics, Small Details: equals() and hashCode()

Source: Internet
Author: User

The equals() and hashCode() methods are declared in Object, so every Java object has them. To meet specific requirements we sometimes have to override both. This article looks at what the two methods do and how they relate.

Both methods are used to compare objects of the same class. They matter most in containers: a Set that stores objects of one class uses them to decide whether an object being added is a duplicate.
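As a minimal sketch of the duplicate check described above, the following compares a class that overrides both methods with one that inherits them from Object. The class names Tag and PlainTag and the helper methods are hypothetical and exist only for this illustration.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class SetDuplicateDemo {
    // Adds two Tag objects with the same name and reports how many the set kept.
    static int distinctTags() {
        Set<Tag> set = new HashSet<>();
        set.add(new Tag("java"));
        set.add(new Tag("java")); // rejected: equal to the element already stored
        return set.size();
    }

    // Same experiment with a class that inherits equals() and hashCode() from Object.
    static int distinctPlainTags() {
        Set<PlainTag> set = new HashSet<>();
        set.add(new PlainTag("java"));
        set.add(new PlainTag("java")); // kept: a different object, so not equal
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println("Tag objects kept:      " + distinctTags());
        System.out.println("PlainTag objects kept: " + distinctPlainTags());
    }
}

// Hypothetical value class: equality is defined by the name field.
final class Tag {
    final String name;

    Tag(String name) { this.name = name; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Tag)) return false;
        return Objects.equals(name, ((Tag) o).name);
    }

    @Override
    public int hashCode() { return Objects.hashCode(name); }
}

// Hypothetical class that overrides neither method.
final class PlainTag {
    final String name;

    PlainTag(String name) { this.name = name; }
}
```

The set treats the two Tag objects as one element because they are equal and share a hash code, while the two PlainTag objects are distinct identities and are both stored.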

First we need to be clear about one rule:

If two objects are equal according to equals(), their hashCode() values must be equal. If two objects are not equal according to equals(), that does not prove their hashCode() values differ: two unequal objects may still return the same hashCode(). (My understanding is that this is a hash collision produced when the hash code is generated.)

Here hashCode() is like the index entry under which a dictionary files each word, and equals() compares the individual words filed under that entry. Suppose you look up the two words "self" and "spontaneous", which in the original Chinese example both start with the same character and are therefore filed under the same index entry. If equals() compares "self" with "self", the words are equal and hashCode() must return the same value. If equals() compares "self" with "spontaneous", the result is not equal, but both words sit under the same index entry, so hashCode() is the same. If equals() compares "self" with "they", the result is not equal and hashCode() is different as well.
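The rule above can be checked with plain strings. This is a small sketch; the class name HashContractDemo and the helper sameHash are made up for the example. It relies on the fact that the distinct strings "Aa" and "BB" happen to share a String hash code.

```java
class HashContractDemo {
    // True when both strings report the same hash code.
    static boolean sameHash(String x, String y) {
        return x.hashCode() == y.hashCode();
    }

    public static void main(String[] args) {
        String a = new String("self");
        String b = new String("self");
        // equals() is true, so the hash codes must match.
        System.out.println(a.equals(b) + " / same hash: " + sameHash(a, b));

        // Different strings, identical hash code: a collision.
        System.out.println("Aa".equals("BB") + " / same hash: " + sameHash("Aa", "BB"));
    }
}
```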

Conversely: if two hashCode() values differ, equals() is guaranteed to return false; if the hashCode() values are equal, equals() may or may not return true. In the Object class, hashCode() is a native method whose default value is usually described as the address of the object (the exact derivation is left to the JVM):

public native int hashCode();

The equals() method in Object compares the addresses of the two objects. If equals() returns true, both references point to the same object, so hashCode() is of course equal too.

equals() and ==

1. For ==: applied to variables of a primitive type, it compares the stored values directly. Applied to variables of a reference type, it compares the addresses of the objects they point to.

2. For equals() (note: equals() cannot be called on variables of a primitive type): if the class does not override equals(), it compares the addresses of the objects the references point to. If the class overrides equals(), as String, Date, and others do, it compares the contents of the objects.
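The two cases above look like this in practice. This is only an illustrative sketch; EqualityDemo and sameReference are hypothetical names.

```java
class EqualityDemo {
    // Reference comparison, exactly what == does for reference types.
    static boolean sameReference(Object x, Object y) {
        return x == y;
    }

    public static void main(String[] args) {
        int x = 5;
        int y = 5;
        System.out.println(x == y); // primitives: the stored values are compared

        String s1 = new String("hello");
        String s2 = new String("hello");
        System.out.println(sameReference(s1, s2)); // two distinct objects
        System.out.println(s1.equals(s2));         // String overrides equals(): contents compared

        Object o1 = new Object();
        Object o2 = new Object();
        System.out.println(o1.equals(o2)); // Object.equals() is not overridden: references compared
    }
}
```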

The main difference is that one is an operator and the other is a method. == is typically used to compare primitive values, while equals() is used to check whether objects are equal.

The hashCode() function

In Java, when you override equals() you also need to override hashCode(). But how should hashCode() be implemented? The simplest option is to return a fixed value.

@Override
public int hashCode() {
    return 42;
}
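To see what a constant hash code means for a HashMap, here is a minimal sketch. The Key class and the build helper are hypothetical. Lookups remain correct, but because every key reports the same hash code the map can only tell keys apart through equals().

```java
import java.util.HashMap;
import java.util.Map;

class ConstantHashDemo {
    // Hypothetical key type whose hashCode() always returns 42.
    static final class Key {
        final int id;

        Key(int id) { this.id = id; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).id == id;
        }

        @Override
        public int hashCode() { return 42; }
    }

    // Fills a map with n keys that all share the same hash code.
    static Map<Key, String> build(int n) {
        Map<Key, String> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new Key(i), "value-" + i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<Key, String> map = build(1000);
        System.out.println(map.size());
        System.out.println(map.get(new Key(777)));
    }
}
```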

42 is the ultimate answer to life, the universe, and everything, so it is surely good enough to serve as an object's hash code. Returning a fixed value is legal, but it is the least recommended approach: when such an object is used as a HashMap key, every key collides, and the complexity of get degrades to O(n). 42 may be the ultimate answer, but working out its true meaning takes the Earth tens of thousands of years of computation, so an ordinary program should not use it this casually.

The traditional implementation

The traditional implementation of hashCode() takes three steps:

1. Take a non-zero int as the initial value, for example 42, and store it in result.
2. For each field you care about (the fields used in equals()), compute its hash code c. How to compute the hash code of a single field is covered later.
3. Combine the field hash codes into result: result = 31 * result + c.

The Java 7+ implementation

Objects is a utility class added in Java 7. It has a method for computing a hash:

public static int hash(Object... values)

So computing a hash code only takes the following code:

@Override
public int hashCode() {
    return Objects.hash(p1, p2, p3, p4);
}
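The two styles can be put side by side. In this sketch the Person class and the traditionalHash method are hypothetical. Objects.hash delegates to Arrays.hashCode, which starts from 1 and multiplies by 31 at each step, so the hand-written version only produces the same number when its seed is also 1; any other non-zero seed is equally valid, it just gives a different value.

```java
import java.util.Objects;

class HashStylesDemo {
    // Hypothetical class with two fields that take part in equals().
    static final class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Person)) return false;
            Person p = (Person) o;
            return age == p.age && Objects.equals(name, p.name);
        }

        // Java 7+ style.
        @Override
        public int hashCode() {
            return Objects.hash(name, age);
        }

        // Traditional style: start from a seed and fold each field in with 31.
        int traditionalHash(int seed) {
            int result = seed;
            result = 31 * result + (name != null ? name.hashCode() : 0);
            result = 31 * result + age;
            return result;
        }
    }

    public static void main(String[] args) {
        Person p = new Person("Ann", 30);
        System.out.println(p.hashCode() == p.traditionalHash(1));  // same recipe, same seed
        System.out.println(p.hashCode() == p.traditionalHash(42)); // different seed, different value
    }
}
```

Objects.hash is shorter to write, but it boxes primitives and allocates a varargs array on each call, which is one reason hand-written or IDE-generated versions are still common.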
How to calculate the hash value of each property

I think code explains this part most simply. The code below was generated automatically by IntelliJ; it mainly shows the rule used for each primitive type.

public class TestClass {
    byte b;
    char c;
    short s;
    int i;
    long l;
    float f;
    double d;
    String string;

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;

        TestClass testClass = (TestClass) o;

        if (b != testClass.b) return false;
        if (c != testClass.c) return false;
        if (s != testClass.s) return false;
        if (i != testClass.i) return false;
        if (l != testClass.l) return false;
        if (Float.compare(testClass.f, f) != 0) return false;
        if (Double.compare(testClass.d, d) != 0) return false;
        return string != null ? string.equals(testClass.string) : testClass.string == null;
    }

    @Override
    public int hashCode() {
        int result;
        long temp;
        result = (int) b;
        result = 31 * result + (int) c;
        result = 31 * result + (int) s;
        result = 31 * result + i;
        // long: fold the high 32 bits into the low 32 bits
        result = 31 * result + (int) (l ^ (l >>> 32));
        result = 31 * result + (f != +0.0f ? Float.floatToIntBits(f) : 0);
        // double: convert to its long bit pattern, then fold like a long
        temp = Double.doubleToLongBits(d);
        result = 31 * result + (int) (temp ^ (temp >>> 32));
        result = 31 * result + (string != null ? string.hashCode() : 0);
        return result;
    }
}
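The per-type rules in the generated code match what the JDK wrapper classes compute. As a sketch, the helpers below (hashLong, hashDouble, and hashFloat are hypothetical names) restate the rules and compare them with the static hashCode methods available on the wrappers since Java 8.

```java
class PrimitiveHashDemo {
    // long: XOR the high 32 bits into the low 32 bits.
    static int hashLong(long l) {
        return (int) (l ^ (l >>> 32));
    }

    // double: take the IEEE 754 bit pattern as a long, then fold it like a long.
    static int hashDouble(double d) {
        long bits = Double.doubleToLongBits(d);
        return (int) (bits ^ (bits >>> 32));
    }

    // float: the IEEE 754 bit pattern already fits in an int.
    static int hashFloat(float f) {
        return Float.floatToIntBits(f);
    }

    public static void main(String[] args) {
        System.out.println(hashLong(123456789012345L) == Long.hashCode(123456789012345L));
        System.out.println(hashDouble(3.14) == Double.hashCode(3.14));
        System.out.println(hashFloat(1.5f) == Float.hashCode(1.5f));
    }
}
```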

You can also look at the hashCode() implementations inside the JDK. Arrays is a good example:

public static int hashCode(boolean a[]) {
    if (a == null)
        return 0;

    int result = 1;
    for (boolean element : a)
        result = 31 * result + (element ? 1231 : 1237);

    return result;
}

public static int hashCode(long a[]) {
    if (a == null)
        return 0;

    int result = 1;
    for (long element : a) {
        int elementHash = (int) (element ^ (element >>> 32));
        result = 31 * result + elementHash;
    }

    return result;
}

// int, short, char, and byte arrays all follow this pattern
public static int hashCode(int a[]) {
    if (a == null)
        return 0;

    int result = 1;
    for (int element : a)
        result = 31 * result + element;

    return result;
}
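A short usage sketch of these methods follows; the class and helper names are hypothetical. It shows that an array's own equals() is inherited from Object and compares references, while Arrays.equals and Arrays.hashCode work on the contents.

```java
import java.util.Arrays;

class ArrayHashDemo {
    // Content-based hash of an int array, as computed by the JDK.
    static int contentHash(int[] a) {
        return Arrays.hashCode(a);
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3};
        int[] b = {1, 2, 3};

        System.out.println(a.equals(b));                        // reference comparison
        System.out.println(Arrays.equals(a, b));                // content comparison
        System.out.println(contentHash(a) == contentHash(b));   // equal contents, equal hash
        System.out.println(contentHash(a));                     // ((31 * 1 + 1) * 31 + 2) * 31 + 3 = 30817
        System.out.println(contentHash(null));                  // null arrays hash to 0
    }
}
```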

At this point you may have the same question I had: why 31?

Why the hashCode method uses 31 as the multiplier

The number is not declared as a named constant, so its purpose cannot be inferred from a name. With doubt and curiosity I went looking online, and after reading the material I could only say: so that is how it is. So what is the reason? The following sections unpack the puzzle of the number 31.

Before explaining in detail why String's hashCode method uses 31 as the multiplier, let's look at how String.hashCode() is implemented:

public int hashCode() {
    int h = hash;
    if (h == 0 && value.length > 0) {
        char val[] = value;

        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];
        }
        hash = h;
    }
    return h;
}

The code above is the implementation of String's hashCode method, and it is simple. The core of the method is just the three-line for loop. From that loop we can derive a formula, which is already given in the method's documentation comment:

s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]

Here s is the val array in the source code, the char array that String maintains internally. A quick derivation of the formula:

Suppose n = 3
i=0 -> h = 31 * 0 + val[0]
i=1 -> h = 31 * (31 * 0 + val[0]) + val[1]
i=2 -> h = 31 * (31 * (31 * 0 + val[0]) + val[1]) + val[2]
       h = 31*31*31*0 + 31*31*val[0] + 31*val[1] + val[2]
       h = 31^(n-1)*val[0] + 31^(n-2)*val[1] + val[2]
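The derived formula can be checked against String.hashCode() directly. This is a sketch; polynomialHash is a hypothetical helper that evaluates the polynomial term by term in int arithmetic, so overflow wraps around in the same way as in the loop.

```java
class StringHashFormulaDemo {
    // Evaluates s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] with wrapping int arithmetic.
    static int polynomialHash(String s) {
        int n = s.length();
        int sum = 0;
        for (int i = 0; i < n; i++) {
            int power = 1;
            for (int k = 0; k < n - 1 - i; k++) {
                power *= 31;
            }
            sum += s.charAt(i) * power;
        }
        return sum;
    }

    public static void main(String[] args) {
        // 'a' = 97, 'b' = 98, 'c' = 99: 97*961 + 98*31 + 99 = 96354
        System.out.println(polynomialHash("abc"));
        System.out.println("abc".hashCode());
        System.out.println(polynomialHash("hello, hashCode") == "hello, hashCode".hashCode());
    }
}
```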

The formula and its derivation are not the focus of this article; it is enough to understand them. The focus is why 31 was chosen. According to what can be found online, there are generally two reasons:

First, 31 is a prime of moderate size, which makes it one of the best primes to use as a hashCode multiplier. Similar primes such as 37, 41, and 43 are also good choices. So why 31 in particular? See the second reason.

Second, multiplication by 31 can be optimized by the JVM: 31 * i = (i << 5) - i.
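The shift identity holds for every int, including values where the multiplication overflows, because both sides wrap around modulo 2^32. A small sketch (the class and method names are hypothetical):

```java
class ShiftIdentityDemo {
    // Checks 31 * i == (i << 5) - i for a single value.
    static boolean holds(int i) {
        return 31 * i == (i << 5) - i;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 12345, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int i : samples) {
            System.out.println(i + " -> " + holds(i));
        }
    }
}
```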

Of these two reasons, the second is simple and needs no further comment; the first needs explanation. In general, a hashing algorithm is designed around a specially chosen prime. As for why a prime, I believe it reduces the collision rate of the algorithm; the rigorous reason is a question for mathematicians, and my own mathematics is not up to explaining it. As noted above, 31 is a moderately sized prime and a good multiplier. Why are other primes such as 2 and 101 (or larger primes) not good multipliers? The analysis follows.

Start with the prime 2. Assume n = 6 and substitute 2 and n into the formula above, computing only the highest-order term: 2^5 = 32, which is very small. So when the string is not very long, using 2 as the multiplier produces hash values that are not very large. The hash values are spread over a small range, the distribution is poor, and the collision rate may rise as a result.

So the prime 2 squeezes the hash values into a small range. What happens with a larger prime such as 101? From the analysis above you can probably guess. The hash values will not be confined to a small range, because 101^5 = 10,510,100,501. But this value is too large: if the hash is stored in an int, the result overflows and numeric information is lost. Although losing that information does not necessarily raise the collision rate, for now we treat 101 (and larger primes) as a poor choice. Finally, look at 31: 31^5 = 28,629,151, which is moderate compared with 32 and 10,510,100,501. Isn't that nice?
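The three magnitudes above, and the overflow for 101, can be reproduced with a few lines. This is a sketch; powLong and powInt are hypothetical helpers, with powInt deliberately using int arithmetic so that it wraps the way a hash code would.

```java
class MultiplierMagnitudeDemo {
    // base^exp computed in 64-bit arithmetic (no overflow for the values used here).
    static long powLong(long base, int exp) {
        long result = 1;
        for (int k = 0; k < exp; k++) {
            result *= base;
        }
        return result;
    }

    // base^exp computed in 32-bit arithmetic; silently wraps on overflow.
    static int powInt(int base, int exp) {
        int result = 1;
        for (int k = 0; k < exp; k++) {
            result *= base;
        }
        return result;
    }

    public static void main(String[] args) {
        for (int m : new int[] {2, 31, 101}) {
            long exact = powLong(m, 5);
            int wrapped = powInt(m, 5);
            System.out.println(m + "^5 = " + exact
                    + ", as int = " + wrapped
                    + ", overflowed = " + (exact != wrapped));
        }
    }
}
```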

The rough arithmetic above suggests that 31 is a prime of moderate size and one of the best primes to use as a hashCode multiplier. This conclusion can also be checked with detailed experiments, but first let's look at the discussion of this question on Stack Overflow: "Why does Java's hashCode() in String use 31 as a multiplier?" One of the top answers quotes a passage from Effective Java, which is also quoted here:

The value 31 was chosen because it is an odd prime. If it were even and the multiplication overflowed, information would be lost, as multiplication by 2 is equivalent to shifting. The advantage of using a prime is less clear, but it is traditional. A nice property of 31 is that the multiplication can be replaced by a shift and a subtraction for better performance: 31 * i == (i << 5) - i. Modern VMs do this sort of optimization automatically.

In other words:

31 was chosen because it is an odd prime. If an even number had been chosen and the multiplication overflowed, numeric information would be lost, because multiplying by 2 is equivalent to a shift. The advantage of choosing a prime is not particularly clear, but it is a tradition. 31 also has the nice property that the multiplication can be replaced by a shift and a subtraction for better performance: 31 * i == (i << 5) - i. Modern Java virtual machines perform this optimization automatically.

The second-ranked answer reads as follows:

As Goodrich and Tamassia point out, if you take over 50,000 English words (formed as the union of the word lists provided in two variants of Unix), using the constants 31, 33, 37, 39, and 41 will produce less than 7 collisions in each case. Knowing this, it should come as no surprise that many Java implementations choose one of these constants.

In other words:

As Goodrich and Tamassia point out, if you compute hash codes for more than 50,000 English words (the union of the word lists from two variants of Unix) with 31, 33, 37, 39, or 41 as the multiplier, each constant produces fewer than 7 collisions. So it is not surprising that Java implementations pick one of these constants, such as 31.

These two answers explain well why the number 31 appears in the Java source code. For an experimental verification of the second answer, interested readers can look at the original article I referred to: https://segmentfault.com/a/1190000010799123

It is an article packed with solid content.
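A collision count of this kind can be approximated in a few lines. The sketch below is not the experiment from the referenced article: it uses a synthetic input (every lowercase string of a given length) instead of a real word list, and the names hash, countCollisions, and allLowercase are hypothetical. A collision is counted each time a word's hash value has already been produced by an earlier word.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class MultiplierCollisionDemo {
    // Same loop as String.hashCode(), with a configurable multiplier.
    static int hash(String s, int multiplier) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = multiplier * h + s.charAt(i);
        }
        return h;
    }

    // Number of words whose hash value was already taken by an earlier word.
    // Assumes the words themselves are distinct.
    static int countCollisions(List<String> words, int multiplier) {
        Set<Integer> seen = new HashSet<>();
        for (String w : words) {
            seen.add(hash(w, multiplier));
        }
        return words.size() - seen.size();
    }

    // Synthetic input: every lowercase string of the given length.
    static List<String> allLowercase(int length) {
        List<String> out = new ArrayList<>();
        fill(new char[length], 0, out);
        return out;
    }

    private static void fill(char[] buf, int pos, List<String> out) {
        if (pos == buf.length) {
            out.add(new String(buf));
            return;
        }
        for (char c = 'a'; c <= 'z'; c++) {
            buf[pos] = c;
            fill(buf, pos + 1, out);
        }
    }

    public static void main(String[] args) {
        List<String> words = allLowercase(3);
        for (int m : new int[] {2, 31, 101}) {
            System.out.println("multiplier " + m + ": "
                    + countCollisions(words, m) + " collisions in " + words.size() + " words");
        }
    }
}
```

To get closer to the setup described in the quoted answer, the synthetic list would have to be replaced with a real dictionary file, which this sketch does not assume is available.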

Reference Links:

https://segmentfault.com/a/1190000010799123

http://www.jianshu.com/p/039c942b22c0

http://blog.csdn.net/jiangwei0910410003/article/details/22739953

JDK 8 Source Code
