Detailed explanation of the hashCode() and equals() methods in Java

Source: Internet
Author: User

1. First, the equals() and hashCode() methods are both inherited from the Object class.
The equals() method is defined in the Object class as follows:
public boolean equals(Object obj) {
    return (this == obj);
}
Clearly, this compares the addresses of the two objects (that is, whether the two references point to the same object). But we must be clear that classes such as String, Integer, and Double have already overridden the equals() method of the Object class, so calling equals() on them behaves differently. For example, in the String class (as implemented in older JDK releases):
public boolean equals(Object anObject) {
    if (this == anObject) {
        return true;
    }
    if (anObject instanceof String) {
        String anotherString = (String) anObject;
        int n = count;
        if (n == anotherString.count) {
            char v1[] = value;
            char v2[] = anotherString.value;
            int i = offset;
            int j = anotherString.offset;
            while (n-- != 0) {
                if (v1[i++] != v2[j++])
                    return false;
            }
            return true;
        }
    }
    return false;
}
Obviously, this compares content rather than addresses. Likewise, Double, Integer, and the other wrapper classes all override equals() to compare content. Primitive types, of course, are simply compared by value with ==, so there is nothing more to say about them.
We should also note the requirements that the Java language places on equals(), which must be followed:
• Symmetry: if x.equals(y) returns true, then y.equals(x) must also return true.
• Reflexivity: x.equals(x) must return true.
• Transitivity: if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) must also return true.
• Consistency: if x.equals(y) returns true, then as long as the contents of x and y remain unchanged, x.equals(y) returns true no matter how many times it is called.
• In every case, x.equals(null) returns false; x.equals(an object of a different type than x) should normally return false as well.
These five points are the rules that must be followed when overriding equals(). Violating any of them leads to unexpected results.
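To make the "unexpected results" concrete, here is a minimal sketch of a class that breaks the symmetry rule. The class CaseInsensitiveName and the helper methods are hypothetical names invented for this illustration; they are not part of the JDK.

```java
import java.util.Locale;
import java.util.Objects;

class EqualsContractDemo {
    /** Hypothetical class whose equals() breaks symmetry by also accepting plain Strings. */
    static final class CaseInsensitiveName {
        private final String name;

        CaseInsensitiveName(String name) {
            this.name = Objects.requireNonNull(name);
        }

        @Override
        public boolean equals(Object o) {
            if (o instanceof CaseInsensitiveName) {
                return name.equalsIgnoreCase(((CaseInsensitiveName) o).name);
            }
            if (o instanceof String) {
                // The flaw: String.equals() will never return the favor.
                return name.equalsIgnoreCase((String) o);
            }
            return false;
        }

        @Override
        public int hashCode() {
            return name.toLowerCase(Locale.ROOT).hashCode();
        }
    }

    static boolean isReflexive(Object x) { return x.equals(x); }

    static boolean isSymmetric(Object x, Object y) { return x.equals(y) == y.equals(x); }

    static boolean rejectsNull(Object x) { return !x.equals(null); }

    public static void main(String[] args) {
        String a = new String("java");
        String b = new String("java");
        System.out.println(isReflexive(a));          // true
        System.out.println(isSymmetric(a, b));       // true
        System.out.println(rejectsNull(a));          // true

        CaseInsensitiveName n = new CaseInsensitiveName("Java");
        System.out.println(n.equals("java"));        // true
        System.out.println("java".equals(n));        // false
        System.out.println(isSymmetric(n, "java"));  // false
    }
}
```

Because the two directions disagree, a collection may or may not find such an object depending on which side equals() happens to be called on.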
2. Second is the hashCode() method, which is defined in the Object class as follows:
public native int hashCode();
This declares a native method, whose implementation depends on the underlying platform. Of course, we can override hashCode() in the classes we write, just as String, Integer, Double, and others do. For example, the hashCode() method defined in the String class (in older JDK releases) is:
public int hashCode() {
    int h = hash;
    if (h == 0) {
        int off = offset;
        char val[] = value;
        int len = count;

        for (int i = 0; i < len; i++) {
            h = 31 * h + val[off++];
        }
        hash = h;
    }
    return h;
}
The String API documentation describes this computation as:
s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
The calculation uses int arithmetic, where s[i] is the i-th character of the string, n is the length of the string, and ^ denotes exponentiation. (The hash code of the empty string is 0.)
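The formula can be checked by hand. The sketch below recomputes it with a loop and compares the result with String.hashCode(); the class and method names (StringHashDemo, polynomialHash) are made up for this illustration.

```java
class StringHashDemo {
    /** Recomputes s[0]*31^(n-1) + ... + s[n-1] using int arithmetic (overflow wraps around). */
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            // Horner's rule: multiply the running value by 31, then add the next character.
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"", "a", "ab", "zhaoxudong"}) {
            System.out.println("\"" + s + "\": formula = " + polynomialHash(s)
                    + ", String.hashCode() = " + s.hashCode());
        }
    }
}
```

For "ab" the arithmetic is 97 * 31 + 98 = 3105; for longer strings the int value overflows and may become negative, which is expected.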

To understand the role of hashCode(), you first need to know about collections in Java.
Broadly speaking, Java collections fall into two kinds: List and Set.
Do you know the difference between them? The elements of a List are ordered and may repeat; the elements of a Set are unordered and may not repeat.
This raises a serious question: to guarantee that elements do not repeat, on what basis do we decide whether two elements are duplicates?
The answer is the Object.equals() method. However, if every new element were checked against every existing one, the number of comparisons needed to add an element would become very large once the set holds many elements.
That is, if the set already contains 1000 elements, adding the 1001st element would require calling equals() 1000 times. This would obviously reduce efficiency considerably.
Therefore, Java adopts the principle of the hash table.
A hash algorithm (also called a hashing algorithm) computes a storage position directly from the data according to a specific algorithm. Describing hash algorithms in detail would take many more articles, so I will not go into it here.
As a beginner, you can think of hashCode() as returning the location where the object is stored (in reality it is not a physical memory address; it is just an int from which a storage position is computed).
This way, when a set needs to add a new element, it first calls the element's hashCode() method, which immediately locates the position where the element should be placed.
If that position is empty, the element is stored there directly without any comparison.
If the position is already occupied, the set calls equals() to compare the new element with the existing one. If they are equal, the new element is not stored; if they are different, the new element is stored as well, and a collision-resolution strategy decides where (Java's HashMap, for example, chains it within the same bucket).
This is the problem of resolving collisions. As a result, the number of equals() calls actually made drops sharply, to almost just one or two.
Accordingly, Java specifies the following for the equals() and hashCode() methods:
1. If two objects are equal, their hashCode() values must be the same. 2. If two objects have the same hashCode() value, they are not necessarily equal. "Equal" here means equal as determined by the equals() method.
Of course, you are free to ignore these requirements, but you will then find that equal objects can appear in a Set more than once, and the efficiency of adding new elements drops sharply.
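The saving can be observed by counting equals() calls. The following is a minimal sketch, assuming a hypothetical Key class whose equals() increments a counter; the exact number of calls depends on the HashSet implementation, so the program only prints what it measures.

```java
import java.util.HashSet;
import java.util.Set;

class EqualsCallCounter {
    static int equalsCalls = 0;   // how many times Key.equals() has run

    /** Hypothetical key type used only for this illustration. */
    static final class Key {
        final int id;

        Key(int id) { this.id = id; }

        @Override
        public int hashCode() { return id; }   // distinct ids give distinct hash codes

        @Override
        public boolean equals(Object o) {
            equalsCalls++;
            return o instanceof Key && ((Key) o).id == id;
        }
    }

    /** Adds n distinct keys, then one duplicate; returns {set size, equals() calls}. */
    static int[] run(int n) {
        equalsCalls = 0;
        Set<Key> set = new HashSet<>();
        for (int i = 0; i < n; i++) {
            set.add(new Key(i));
        }
        set.add(new Key(0));   // equal to a key that is already in the set
        return new int[] {set.size(), equalsCalls};
    }

    public static void main(String[] args) {
        int[] result = run(1000);
        System.out.println("size = " + result[0] + ", equals() calls = " + result[1]);
    }
}
```

The duplicate is rejected, so the size stays at 1000, and the number of equals() calls should be far below the 1000 comparisons per insertion that a plain linear search would need.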

3. Here we need to be clear about the following:
Two objects for which equals() returns true must have equal hashCode() values.
Two objects for which equals() returns false are not guaranteed to have different hashCode() values. In other words, two objects that are not equal according to equals() may still have the same hashCode(). (As I understand it, this is because hash codes can collide when they are generated.)
Conversely, if the hashCode() values differ, equals() must return false; if the hashCode() values are equal, equals() may return either true or false.
As for where this applies, my understanding is that it holds for Object, String, and similar classes. In the Object class, hashCode() is a native method that returns a value derived from the object's identity (commonly described as its address), and equals() compares the addresses of the two objects. If equals() returns true, the two references point to the same object, so the hashCode() values are necessarily equal. In the String class, equals() compares the contents of the two objects; when two strings have the same content, the overridden hashCode() method (analyzed in point 2) computes its result from that content, so the returned hashCode() values are equal as well. In the same way, the overridden equals() and hashCode() methods in wrapper classes such as Integer and Double obey this principle. And of course a class that does not override these methods still obeys it, because it inherits equals() and hashCode() from Object.
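A well-known concrete case of equal hash codes with unequal objects is the pair of strings "Aa" and "BB": 65 * 31 + 97 and 66 * 31 + 66 both give 2112. The class name and the helper distinctCount below are invented for this sketch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class HashCollisionDemo {
    /** Returns how many distinct elements a HashSet keeps from the given values. */
    static int distinctCount(String... values) {
        Set<String> set = new HashSet<>(Arrays.asList(values));
        return set.size();
    }

    public static void main(String[] args) {
        String a = "Aa";
        String b = "BB";
        System.out.println(a.hashCode());          // 2112
        System.out.println(b.hashCode());          // 2112
        System.out.println(a.equals(b));           // false
        System.out.println(distinctCount(a, b));   // 2
    }
}
```

Both strings are kept in the set: the equal hash codes only send them to the same place, and equals() then shows that they differ.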

4. When talking about hashCode() and equals(), we cannot avoid talking about HashSet, HashMap, and Hashtable. See the following analysis:
HashSet implements the Set interface, and Set in turn extends the Collection interface; that is the hierarchy. So on what principle does HashSet store and retrieve objects?
HashSet does not allow duplicate objects, and the position of an element is not defined. How, then, does HashSet decide whether an element is a duplicate? This is the key question. After an afternoon of searching and verifying, I finally gained some insight that I would like to share: in the Java collections, the rules for deciding whether two objects are equal are:
1) Check whether the hashCode() values of the two objects are equal.
If they are not equal, the two objects are considered not equal, and we are done.
If they are equal, go to step 2).
(This step exists only to improve storage efficiency. In theory it could be omitted, but without it efficiency in practice would drop sharply, so it is treated as required here. This point is discussed further below.)
2) Check whether the two objects are equal using equals().
If they are not equal, the two objects are considered not equal.
If they are equal, the two objects are considered equal (equals() is what finally decides whether two objects are equal).
Why are there two rules? Can't we use only the first one? No, because as mentioned earlier, equals() may return false even when the hashCode() values are equal. Therefore the second rule is needed to guarantee that only non-duplicate elements are added.
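The two rules can be sketched as a toy hash set. This is a deliberately simplified illustration under my own assumptions (a fixed number of buckets, no resizing, no null elements); TinyHashSet is a made-up name and this is not how java.util.HashSet is actually coded.

```java
import java.util.ArrayList;
import java.util.List;

/** A deliberately simplified hash set; null elements are not supported. */
class TinyHashSet {
    private final List<List<Object>> buckets = new ArrayList<>();
    private int size = 0;

    TinyHashSet(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    /** Returns true if the element was added, false if an equal element was already present. */
    boolean add(Object element) {
        int hash = element.hashCode();
        List<Object> bucket = buckets.get(Math.floorMod(hash, buckets.size()));
        for (Object existing : bucket) {
            // Rule 1: different hash codes mean "not equal", so equals() is skipped.
            // Rule 2: only when the hash codes match does equals() decide.
            if (existing.hashCode() == hash && existing.equals(element)) {
                return false;
            }
        }
        bucket.add(element);
        size++;
        return true;
    }

    int size() { return size; }

    public static void main(String[] args) {
        TinyHashSet set = new TinyHashSet(16);
        System.out.println(set.add(new String("zhaoxudong")));  // true
        System.out.println(set.add(new String("zhaoxudong")));  // false: same hash, equals() is true
        System.out.println(set.add("Aa"));                      // true
        System.out.println(set.add("BB"));                      // true: same hash as "Aa", equals() is false
        System.out.println(set.size());                         // 3
    }
}
```

The short-circuit && is what keeps equals() from running whenever the cheaper hash comparison already rules out equality.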
For example, consider the following code:

import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// The class name is chosen here only so that the snippet compiles on its own.
public class StringSetTest {
    public static void main(String[] args) {
        String s1 = new String("zhaoxudong");
        String s2 = new String("zhaoxudong");
        System.out.println(s1 == s2);        // false
        System.out.println(s1.equals(s2));   // true
        System.out.println(s1.hashCode());   // s1.hashCode() equals s2.hashCode()
        System.out.println(s2.hashCode());
        Set<String> hashSet = new HashSet<>();
        hashSet.add(s1);
        hashSet.add(s2);
        /* When s1 and s2 are added, the two rules above tell us that HashSet
           considers them equal, so s2 is a duplicate and is not added again;
           the set keeps only s1. */
        Iterator<String> it = hashSet.iterator();
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
In the end, the while loop prints "zhaoxudong" only once.
The output is:
false
true
-967303459
-967303459
zhaoxudong
This is because the String class already overrides equals() and hashCode(). Therefore, according to rules 1) and 2) above, HashSet considers s1 and s2 to be equal objects, and the second one is rejected as a duplicate.
But now look at the following program:
import java.util.*;

public class HashSetTest {
    public static void main(String[] args) {
        HashSet<Student> hs = new HashSet<>();
        hs.add(new Student(1, "zhangsan"));
        hs.add(new Student(2, "Lisi"));
        hs.add(new Student(3, "wangwu"));
        hs.add(new Student(1, "zhangsan"));

        Iterator<Student> it = hs.iterator();
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}

// Note: this version overrides neither hashCode() nor equals().
class Student {
    int num;
    String name;

    Student(int num, String name) {
        this.num = num;
        this.name = name;
    }

    public String toString() {
        return num + ":" + name;
    }
}
Output (the iteration order may differ from run to run):
1:zhangsan
1:zhangsan
3:wangwu
2:Lisi
This raises a question: why did HashSet add equal elements? Doesn't this contradict the principle of HashSet? The answer is: no.
When the two newly created Student(1, "zhangsan") objects are compared by hashCode(), different hash codes are produced, so HashSet treats them as different objects. Of course, equals() also returns false for them at this point (this needs no explanation). But why are different hash codes produced? Didn't s1 and s2 produce the same hash code in the earlier comparison? The reason is that the Student class we wrote does not override hashCode() and equals(), so the comparison uses the hashCode() method inherited from Object. Remember what hashCode() in the Object class is based on!
It is a native method tied to the identity of the object (its reference). Since both objects were created with new, they are of course two distinct objects, so the values returned by their hashCode() are (almost always) different. Therefore, by the first rule, HashSet treats them as different objects and never needs the second rule. So how do we solve this problem?
The answer is: override the hashCode() and equals() methods in the Student class.
For example:
class Student {
    int num;
    String name;

    Student(int num, String name) {
        this.num = num;
        this.name = name;
    }

    @Override
    public int hashCode() {
        return num * name.hashCode();
    }

    @Override
    public boolean equals(Object o) {
        // Reject null and objects of other types instead of failing on the cast.
        if (!(o instanceof Student)) {
            return false;
        }
        Student s = (Student) o;
        return num == s.num && name.equals(s.name);
    }

    @Override
    public String toString() {
        return num + ":" + name;
    }
}
With these overrides, even though new Student(1, "zhangsan") is called twice, the hash codes obtained through the overridden hashCode() are certainly the same (there is no doubt about this).
The overridden equals() also judges the two objects to be equal, so they are treated as duplicates when added to the HashSet. Running the modified program therefore gives:
1:zhangsan
3:wangwu
2:Lisi
As we can see, the duplicate-element problem has been eliminated.
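As a variation, the same overrides can be written with the helper methods in java.util.Objects, which also cope with a null name. This is only one possible sketch, not the version used above; the wrapper class StudentSetDemo and the method sampleSet are names invented for this example.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class StudentSetDemo {
    static final class Student {
        private final int num;
        private final String name;

        Student(int num, String name) {
            this.num = num;
            this.name = name;
        }

        @Override
        public int hashCode() {
            // Combines both fields; equal students always get equal hash codes.
            return Objects.hash(num, name);
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof Student)) {   // also false when o is null
                return false;
            }
            Student other = (Student) o;
            return num == other.num && Objects.equals(name, other.name);
        }

        @Override
        public String toString() {
            return num + ":" + name;
        }
    }

    /** Adds the four students from the example above and returns the resulting set. */
    static Set<Student> sampleSet() {
        Set<Student> set = new HashSet<>();
        set.add(new Student(1, "zhangsan"));
        set.add(new Student(2, "Lisi"));
        set.add(new Student(3, "wangwu"));
        set.add(new Student(1, "zhangsan"));   // duplicate, rejected
        return set;
    }

    public static void main(String[] args) {
        System.out.println(sampleSet().size());   // 3
    }
}
```

Making the fields final is a design choice in this sketch: if a field that feeds hashCode() changed while the object sat in a HashSet, the set could no longer find it.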
Regarding overriding equals() and hashCode() in Hibernate POJO classes:
1) The focus is on equals(); overriding hashCode() is often described as merely a technical requirement (to improve efficiency).
2) Why override equals()? Because the Java collections framework uses equals() to decide whether two objects are equal.
3) Hibernate often uses a Set to hold associated objects, and a Set must not contain duplicates. When elements are added to a HashSet, the two rules discussed above decide whether an object is already present. If only the second rule were applied, overriding equals() alone would be enough in theory.
However, when a HashSet holds many elements, or the overridden equals() is complex, relying on equals() alone for every comparison would be very inefficient. That is why hashCode() is brought in to improve efficiency, and I consider it essential: HashSet always applies the first rule, so hashCode() has to be overridden consistently with equals() (both rules together decide whether an element of a HashSet is a duplicate).
For example, you could write:
public int hashCode() {
    return 1;   // effectively disables the hash code
}
The consequence is that comparing hash codes no longer distinguishes anything, because every object returns the hash code 1. Every insertion must then fall back on equals() to decide whether the element is a duplicate, which greatly reduces efficiency.
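The cost of a constant hash code can be measured in the same spirit. The sketch below assumes a hypothetical Item class that can switch between a constant and a per-object hash code and counts how often equals() runs; the exact counts depend on the HashSet implementation, so the program prints them rather than assuming them.

```java
import java.util.HashSet;
import java.util.Set;

class ConstantHashDemo {
    static int equalsCalls = 0;   // how many times Item.equals() has run

    /** Hypothetical element type; constantHash switches between the two strategies. */
    static final class Item {
        final int id;
        final boolean constantHash;

        Item(int id, boolean constantHash) {
            this.id = id;
            this.constantHash = constantHash;
        }

        @Override
        public int hashCode() {
            return constantHash ? 1 : id;   // "return 1" makes every element collide
        }

        @Override
        public boolean equals(Object o) {
            equalsCalls++;
            return o instanceof Item && ((Item) o).id == id;
        }
    }

    /** Adds n distinct items to a HashSet and returns how many times equals() ran. */
    static int countEqualsCalls(int n, boolean constantHash) {
        equalsCalls = 0;
        Set<Item> set = new HashSet<>();
        for (int i = 0; i < n; i++) {
            set.add(new Item(i, constantHash));
        }
        return equalsCalls;
    }

    public static void main(String[] args) {
        System.out.println("constant hash code: " + countEqualsCalls(5, true) + " equals() calls");
        System.out.println("distinct hash codes: " + countEqualsCalls(5, false) + " equals() calls");
    }
}
```

The set still ends up with the right elements in both cases; only the amount of work differs, and the gap widens as the number of elements grows.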
