A detailed explanation of the equals and hashCode methods in Java Sets


The equals and hashCode methods are defined in Object, so every Java object has them. Sometimes, to meet a specific requirement, we need to override both. Today we look at what these two methods do.

The equals() and hashCode() methods are used to compare objects of the same class, especially in containers such as Set, where they decide whether an object being added duplicates one that is already stored.

First we need to understand one rule:

If two objects are equal according to equals(), their hashCode() values must be equal. If two objects are not equal according to equals(), that does not prove their hashCode() values differ: two unequal objects may still share a hash code. (My understanding is that this is simply a hash collision.)
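To make the rule concrete, here is a small sketch using String, whose hash code is computed from its characters. The class name HashContractDemo and the helper sameHash are ours, chosen only for illustration.

```java
class HashContractDemo {
    // Helper (hypothetical name): true when both objects report the same hash code.
    static boolean sameHash(Object a, Object b) {
        return a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        String a = "Aa";
        String b = "BB";
        // Unequal objects may share a hash code (a collision).
        System.out.println(a.equals(b));      // false
        System.out.println(sameHash(a, b));   // true: both hash to 2112

        // Equal objects must share a hash code.
        String c = new String("Aa");
        System.out.println(a.equals(c));      // true
        System.out.println(sameHash(a, c));   // true
    }
}
```

So an equal hash code never proves equality on its own; it only says the two objects are worth comparing with equals().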

Think of hashCode() as the index character under which a word is filed in a Chinese dictionary, and of equals() as the comparison between words filed under that character. Suppose we look up 自己 ("self") and 自发 ("spontaneous"), which are both filed under the character 自. If equals() compares 自己 with 自己, the words are equal, and hashCode() necessarily gives the same value. If equals() compares 自己 with 自发, the result is not equal, yet both words sit under the same index character 自, so hashCode() is the same. If equals() compares 自己 with 他们 ("they"), the result is not equal and hashCode() is different as well.

Conversely: if the hashCode() values differ, equals() must return false; if the hashCode() values are equal, equals() may or may not return true. In the Object class, hashCode() is a native method whose value is typically derived from the object's memory address, while the equals() method in Object compares the two references. So if equals() returns true under the default implementation, both references point to the same object, and of course hashCode() is equal.
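The default behaviour inherited from Object can be checked with a few lines. This is only a sketch; the class name DefaultIdentityDemo is ours.

```java
class DefaultIdentityDemo {
    public static void main(String[] args) {
        Object a = new Object();
        Object b = new Object();
        Object alias = a;   // second reference to the same object

        System.out.println(a.equals(b));      // false: two distinct objects
        System.out.println(a.equals(alias));  // true: same reference
        // Same object, so the hash code is necessarily the same.
        System.out.println(a.hashCode() == alias.hashCode());            // true
        // For a class that does not override hashCode, the value matches
        // System.identityHashCode.
        System.out.println(a.hashCode() == System.identityHashCode(a));  // true
    }
}
```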

Hashing also makes looking up elements efficient.

If you wanted to find out whether a collection contains a given object, how would you write the code, roughly?

Usually you would take each element out in turn and compare it with the object you are looking for. As soon as equals() reports a match, you stop and return a positive result; otherwise you return a negative one. If the collection holds many elements, say 10,000, and the object is not among them, your program has to compare against all 10,000 elements before it can reach a conclusion.
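The linear search just described might look like the following sketch. The class LinearSearchDemo and its comparisons counter are hypothetical, added only so the cost can be observed.

```java
import java.util.ArrayList;
import java.util.List;

class LinearSearchDemo {
    static int comparisons;   // number of equals() calls made by the last search

    static boolean linearContains(List<Integer> items, Integer target) {
        comparisons = 0;
        for (Integer item : items) {
            comparisons++;
            if (item.equals(target)) {
                return true;   // found: stop early
            }
        }
        return false;          // every element was examined
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            items.add(i);
        }
        System.out.println(linearContains(items, -1));  // false: -1 is not in the list
        System.out.println(comparisons);                // 10000
    }
}
```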

Hashing was invented to make finding elements in a collection more efficient. The collection is divided into several storage areas. A hash code can be computed for every object (different kinds of objects may use different hash functions), and hash codes can be grouped, with each group corresponding to one storage area. From an object's hash code you can therefore tell which area it should be stored in.

HashSet is a collection that stores and retrieves objects by hashing. Internally it groups hash codes by taking the remainder after dividing by some number n (the simplest kind of hash function) and partitions the storage accordingly. The Object class defines a hashCode() method that returns the hash code of any Java object. When an object is looked up in a HashSet, Java first calls the object's hashCode() method, uses the result to find the corresponding storage area, and then compares the object with each element in that area using equals(). A conclusion is reached without traversing every element in the collection, which is why HashSet has good lookup performance. Storing an object in a HashSet is comparatively a little more expensive, because the hash code has to be computed first and the storage position determined from it.

For instances of a class to be stored correctly in a HashSet, the class must guarantee that whenever two instances are equal according to equals(), their hash codes are equal too. That is, if obj1.equals(obj2) is true, the following expression must also be true:
obj1.hashCode() == obj2.hashCode()
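As a rough illustration of the storage-area idea, here is a toy set that picks an area by taking the hash code modulo n. It is a teaching sketch under our own assumptions (the names TinyHashSet and equalsCalls are invented, null elements are not supported), not how java.util.HashSet is actually written.

```java
import java.util.ArrayList;
import java.util.List;

class TinyHashSet<T> {
    private final List<List<T>> buckets = new ArrayList<>();
    int equalsCalls = 0;   // counts comparisons, for illustration only

    TinyHashSet(int n) {
        for (int i = 0; i < n; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    private List<T> bucketFor(Object o) {
        // floorMod keeps the index non-negative even for negative hash codes
        return buckets.get(Math.floorMod(o.hashCode(), buckets.size()));
    }

    boolean contains(Object o) {
        for (T item : bucketFor(o)) {      // only one storage area is scanned
            equalsCalls++;
            if (item.equals(o)) {
                return true;
            }
        }
        return false;
    }

    boolean add(T item) {
        if (contains(item)) {
            return false;                  // duplicate: leave the set unchanged
        }
        bucketFor(item).add(item);
        return true;
    }

    public static void main(String[] args) {
        TinyHashSet<Integer> set = new TinyHashSet<>(100);
        for (int i = 0; i < 10_000; i++) {
            set.add(i);
        }
        set.equalsCalls = 0;
        System.out.println(set.contains(-1));  // false
        System.out.println(set.equalsCalls);   // 100 comparisons instead of 10000
    }
}
```

With 100 areas and 10,000 integers, a failed lookup touches only the 100 elements that share its area.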

In other words, when we override a class's equals method we must also override its hashCode method. If we do not, the inherited hashCode method of Object returns a value derived from the object's identity, which in practice is different for different objects. Overriding equals alone then has no useful effect in a hash-based collection: when the hash codes differ, equals is never called for the comparison, so the override is pointless.

If a class's hashCode() method does not meet this requirement, two instances that are equal according to equals() should not both end up in a Set, yet in a HashSet they can. Because their hashCode() values differ, the second object may be assigned to a different storage area than the first, so it is never compared with the first using equals() and is stored as well.

The hashCode() method in Object does not meet the requirement for objects stored in a HashSet, because its return value is derived from the object's memory address. The hash code of a given object stays constant throughout a run of the program, but two distinct instances will (almost always) have different default hash codes, even when their equals methods say they are equal.
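A minimal sketch of this failure follows. The class Tag is hypothetical and deliberately overrides equals without hashCode. Since default hash codes of distinct objects are only almost always different, the comments are hedged accordingly.

```java
import java.util.HashSet;
import java.util.Set;

class EqualsOnlyDemo {
    /** Hypothetical class that overrides equals but NOT hashCode. */
    static class Tag {
        final String name;

        Tag(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Tag && ((Tag) o).name.equals(name);
        }
        // hashCode() is deliberately not overridden: this breaks the contract.
    }

    public static void main(String[] args) {
        Tag a = new Tag("x");
        Tag b = new Tag("x");
        Set<Tag> set = new HashSet<>();
        set.add(a);
        set.add(b);
        System.out.println(a.equals(b));                 // true
        System.out.println(set.size());                  // almost certainly 2, not 1
        System.out.println(set.contains(new Tag("x")));  // almost certainly false
    }
}
```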

Let's take a look at a specific example:

The RectObject class:
package com.weijia.demo;

public class RectObject {
  public int x;
  public int y;

  public RectObject(int x, int y) {
    this.x = x;
    this.y = y;
  }

  @Override
  public int hashCode() {
    // The multiplier was lost in the source; 31, the conventional choice, is assumed here.
    final int prime = 31;
    int result = 1;
    result = prime * result + x;
    result = prime * result + y;
    return result;
  }

  @Override
  public boolean equals(Object obj) {
    if (this == obj)       // same reference
      return true;
    if (obj == null)
      return false;
    if (getClass() != obj.getClass())
      return false;
    final RectObject other = (RectObject) obj;
    if (x != other.x) {
      return false;
    }
    if (y != other.y) {
      return false;
    }
    return true;
  }
}

We have overridden the hashCode and equals methods inherited from Object. From these two methods we can see that if two RectObject instances have equal x and y values, their hashCode values are equal and equals returns true.
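For comparison, the same contract can be written more compactly with java.util.Objects. This is an alternative sketch, not the article's class; the name CompactRect is ours. For two int fields, Objects.hash combines the values with the multiplier 31.

```java
import java.util.Objects;

class CompactRect {   // hypothetical compact counterpart of RectObject
    final int x;
    final int y;

    CompactRect(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);   // combines both fields
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null || getClass() != obj.getClass()) return false;
        CompactRect other = (CompactRect) obj;
        return x == other.x && y == other.y;
    }

    public static void main(String[] args) {
        CompactRect a = new CompactRect(3, 3);
        CompactRect b = new CompactRect(3, 3);
        System.out.println(a.equals(b));                   // true
        System.out.println(a.hashCode() == b.hashCode());  // true
        System.out.println(a.hashCode());                  // 1057
    }
}
```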

Here is the test code:

package com.weijia.demo;

import java.util.HashSet;

public class Demo {
  public static void main(String[] args) {
    HashSet<RectObject> set = new HashSet<RectObject>();
    RectObject r1 = new RectObject(3, 3);
    RectObject r2 = new RectObject(5, 5);
    RectObject r3 = new RectObject(3, 3);
    set.add(r1);
    set.add(r2);
    set.add(r3);
    set.add(r1);
    System.out.println("size:" + set.size());
  }
}

We add four objects to the HashSet and print the size of the set. What is the result?

Run result: size:2

Why is it 2? The reason is simple. We overrode hashCode in RectObject, so two RectObject instances with equal x and y values have equal hash codes. The hash codes are compared first. The x and y values of r1 and r2 differ, so their hash codes differ and r2 is added. The x and y values of r3 are the same as those of r1, so their hash codes are equal; at that point the equals method compares r1 and r3, and because their x and y values are equal, r1 and r3 are considered equal and r3 is not added. The final add of r1 is rejected as well. So the set contains only two objects, r1 and r2.
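The order of the checks can be observed by counting calls. The sketch below uses a stand-in class Rect with invented call counters. The counts in the comments are what we expect from OpenJDK's HashMap, which compares the stored hash and the reference before calling equals.

```java
import java.util.HashSet;
import java.util.Set;

class DuplicateCheckDemo {
    static int hashCodeCalls = 0;   // illustration-only counters
    static int equalsCalls = 0;

    /** Stand-in for RectObject that counts its own method calls. */
    static class Rect {
        final int x, y;

        Rect(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public int hashCode() {
            hashCodeCalls++;
            return 31 * (31 + x) + y;
        }

        @Override
        public boolean equals(Object o) {
            equalsCalls++;
            if (this == o) return true;
            if (!(o instanceof Rect)) return false;
            Rect other = (Rect) o;
            return x == other.x && y == other.y;
        }
    }

    static int run() {
        hashCodeCalls = 0;
        equalsCalls = 0;
        Set<Rect> set = new HashSet<>();
        Rect r1 = new Rect(3, 3);
        Rect r2 = new Rect(5, 5);
        Rect r3 = new Rect(3, 3);
        set.add(r1);   // hashCode only: the set is empty
        set.add(r2);   // hashCode only: different hash, equals is skipped
        set.add(r3);   // same hash as r1, so equals decides: duplicate
        set.add(r1);   // same hash and same reference: equals is skipped
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println("size:" + run());   // size:2
        System.out.println(hashCodeCalls);     // 4
        System.out.println(equalsCalls);       // 1
    }
}
```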

Now let's comment out the hashCode method in RectObject, so that the hashCode method of Object is no longer overridden, and run the code again:

Run result: size:3

This result is also easy to explain. First the hash codes of r1 and r2 are compared. Because the hashCode method in Object returns a value derived from the object's memory address, different instances have different hash codes, so r2 is added. For the same reason the hash codes of r3 and r1 differ, so r3 is added too. The final add passes r1 itself, and r1 == r1, so it is rejected. The set ends up holding the three objects r1, r2 and r3, so the size is 3.

Next, we restore the hashCode method and comment out the body of the equals method in RectObject so that it simply returns false. Run the code:

Run result: size:3

This result is a bit unexpected. Let's analyze it:

First, r1 and r2 have different hash codes, so r2 goes into the set. Next comes r3: the hash codes of r1 and r3 are equal, so their equals method is consulted, and since equals always returns false, r1 and r3 are considered unequal. r3 and r2 have different hash codes, so nothing more needs to be said there, and r3 goes into the set. Now look at the fourth add, which passes r1 again (call it r4): r1 and r4 have equal hash codes, equals returns false, so r1 and r4 are unequal; likewise r2 and r4 are unequal, and r3 and r4 are unequal. So r4 should be added to the set and the result should be size:4. Why is it 3?

At this point we need to look at the source of HashSet. Here is the source of its add method:

/**
 * Adds the specified element to this set if it is not already present.
 * More formally, adds the specified element <tt>e</tt> to this set if
 * this set contains no element <tt>e2</tt> such that
 * <tt>(e==null ? e2==null : e.equals(e2))</tt>.
 * If this set already contains the element, the call leaves the set
 * unchanged and returns <tt>false</tt>.
 *
 * @param e element to be added to this set
 * @return <tt>true</tt> if this set did not already contain the specified
 * element
 */
public boolean add(E e) {
  return map.put(e, PRESENT)==null;
}

Here we can see that HashSet is actually implemented on top of HashMap. Let's open HashMap's put method; the source is as follows:

/**
 * Associates the specified value with the specified key in this map.
 * If the map previously contained a mapping for the key, the old
 * value is replaced.
 *
 * @param key key with which the specified value is to be associated
 * @param value value to be associated with the specified key
 * @return the previous value associated with <tt>key</tt>, or
 *         <tt>null</tt> if there was no mapping for <tt>key</tt>.
 *         (A <tt>null</tt> return can also indicate that the map
 *         previously associated <tt>null</tt> with <tt>key</tt>.)
 */
public V put(K key, V value) {
  if (key == null)
    return putForNullKey(value);
  int hash = hash(key);
  int i = indexFor(hash, table.length);
  for (Entry<K,V> e = table[i]; e != null; e = e.next) {
    Object k;
    if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
      V oldValue = e.value;
      e.value = value;
      e.recordAccess(this);
      return oldValue;
    }
  }
  modCount++;
  addEntry(hash, key, value, i);
  return null;
}

Let's look at the condition of the if statement.

It first checks whether the hash codes are equal. If they are not, the entry is skipped immediately. If they are, it then checks whether the two references are the same object (==) or whether the equals method reports them equal. Because this is an OR, it is enough for either test to succeed. This explains why the size of the set above is 3: the last r1 was not added, because r1 == r1 is true. If hashCode also returned a different value on every call, so that the stored hash never matched, the size of this set would be 4.
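Both outcomes can be reproduced with a short sketch. The classes NeverEqual and UnstableHash are invented for this illustration; UnstableHash deliberately returns a different hash code on each call, which no real class should do.

```java
import java.util.HashSet;
import java.util.Set;

class IdentityShortcutDemo {
    /** equals always returns false; hashCode is stable and based on x and y. */
    static class NeverEqual {
        final int x, y;

        NeverEqual(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public int hashCode() {
            return 31 * (31 + x) + y;
        }

        @Override
        public boolean equals(Object o) {
            return false;
        }
    }

    /** Deliberately broken: the hash code changes on every call. */
    static class UnstableHash extends NeverEqual {
        private int calls = 0;

        UnstableHash(int x, int y) {
            super(x, y);
        }

        @Override
        public int hashCode() {
            return super.hashCode() + (calls++);
        }
    }

    /** Adds r1, r2, r3 and then r1 again, as in the article's test. */
    static int sizeAfterFourAdds(NeverEqual r1, NeverEqual r2, NeverEqual r3) {
        Set<NeverEqual> set = new HashSet<>();
        set.add(r1);
        set.add(r2);
        set.add(r3);
        set.add(r1);
        return set.size();
    }

    public static void main(String[] args) {
        // Stable hash: the second add of r1 is caught by the == test.
        System.out.println(sizeAfterFourAdds(
            new NeverEqual(3, 3), new NeverEqual(5, 5), new NeverEqual(3, 3)));       // 3
        // Unstable hash: the stored hash no longer matches, so == is never reached.
        System.out.println(sizeAfterFourAdds(
            new UnstableHash(3, 3), new UnstableHash(5, 5), new UnstableHash(3, 3))); // 4
    }
}
```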

Finally, let's look at a memory leak caused by hashCode. Here is the code:

package com.weijia.demo;

import java.util.HashSet;

public class Demo {
  public static void main(String[] args) {
    HashSet<RectObject> set = new HashSet<RectObject>();
    RectObject r1 = new RectObject(3, 3);
    RectObject r2 = new RectObject(5, 5);
    RectObject r3 = new RectObject(3, 3);
    set.add(r1);
    set.add(r2);
    set.add(r3);
    r3.y = 7;  // changes r3's hash code while it is stored in the set
    System.out.println("Size before deletion:" + set.size());
    set.remove(r3);
    System.out.println("Size after deletion:" + set.size());
  }
}

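Run result (obtained with the variant of RectObject whose equals always returns false, which is what lets r3 into the set; with the original equals, r3 would be rejected as a duplicate of r1):

Size before deletion:3
Size after deletion:3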

Whoa, we have found a problem, and a big one. We called remove to delete the r3 object and assumed it was gone, but in fact it was not deleted. This is what is called a memory leak: an object we no longer use is still held in memory. Repeat this many times and memory blows up. Let's look at the source of remove:

/**
 * Removes the specified element from this set if it is present.
 * More formally, removes an element <tt>e</tt> such that
 * <tt>(o==null ? e==null : o.equals(e))</tt>,
 * if this set contains such an element.  Returns <tt>true</tt> if
 * this set contained the element (or equivalently, if this set
 * changed as a result of the call).  (This set will not contain the
 * element once the call returns.)
 *
 * @param o object to be removed from this set, if present
 * @return <tt>true</tt> if the set contained the specified element
 */
public boolean remove(Object o) {
  return map.remove(o)==PRESENT;
}

Then look at the source of HashMap's remove method:

/**
 * Removes the mapping for the specified key from this map if present.
 *
 * @param  key key whose mapping is to be removed from the map
 * @return the previous value associated with <tt>key</tt>, or
 *         <tt>null</tt> if there was no mapping for <tt>key</tt>.
 *         (A <tt>null</tt> return can also indicate that the map
 *         previously associated <tt>null</tt> with <tt>key</tt>.)
 */
public V remove(Object key) {
  Entry<K,V> e = removeEntryForKey(key);
  return (e == null ? null : e.value);
}

Now look at the source of the removeEntryForKey method:

/**
 * Removes and returns the entry associated with the specified key
 * in the HashMap.  Returns null if the HashMap contains no mapping
 * for this key.
 */
final Entry<K,V> removeEntryForKey(Object key) {
  int hash = (key == null) ? 0 : hash(key);
  int i = indexFor(hash, table.length);
  Entry<K,V> prev = table[i];
  Entry<K,V> e = prev;

  while (e != null) {
    Entry<K,V> next = e.next;
    Object k;
    if (e.hash == hash &&
        ((k = e.key) == key || (key != null && key.equals(k)))) {
      modCount++;
      size--;
      if (prev == e)
        table[i] = next;
      else
        prev.next = next;
      e.recordRemoval(this);
      return e;
    }
    prev = e;
    e = next;
  }

  return e;
}

We see that the remove method first uses the object's hashCode value to locate the object and then deletes it. We modified the y field of r3, and because the hashCode method of RectObject includes y in its calculation, the hash code of r3 changed. The remove method therefore cannot find r3, and the deletion fails. In other words, r3's hashCode changed but its storage position was not updated; it still sits at its old position, so a search using the new hashCode is bound to miss it.
The implementation behind these methods is actually very simple, as the following figure shows:

[Figure: a hash table with slots 1 to 5, each slot heading a linked list of entries]

It is a very simple linear hash table whose hash function is a mod operation. The source is as follows:

/**
 * Returns index for hash code h.
 */
static int indexFor(int h, int length) {
  return h & (length-1);
}

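This is effectively a mod operation: when the table length is a power of two, h & (length-1) selects the same slot that a non-negative remainder of h divided by length would. It is more efficient than the % operation.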
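The equivalence can be checked directly. In this sketch, indexFor is copied from the listing above, while the class name IndexForDemo and the sample values are ours. Math.floorMod is used as the reference because it always yields a non-negative result for a positive length.

```java
class IndexForDemo {
    // Same expression as HashMap's indexFor shown above.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16;   // a power of two, as the equivalence requires
        int[] samples = {0, 5, 16, 1057, -1, Integer.MIN_VALUE};
        for (int h : samples) {
            System.out.println(h + " -> slot " + indexFor(h, length)
                + ", floorMod " + Math.floorMod(h, length));
        }
        // With a length that is not a power of two the trick no longer works:
        System.out.println(indexFor(5, 10));        // 1
        System.out.println(Math.floorMod(5, 10));   // 5
    }
}
```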

In the figure, 1, 2, 3, 4, 5 are the results of the mod operation, and each slot corresponds to a linked list. To delete an Entry<K,V>, we first compute the hashCode, use it to reach the head node of the list, and then traverse the list, deleting the element whose hash code and equals both match.
The memory leak above teaches a lesson: if a field takes part in the hashCode calculation, we must not modify that field before deleting the object from the collection; otherwise serious problems follow.
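The lesson can be sketched with a class whose equals and hashCode are both implemented correctly. The class Point and the remove, mutate, re-add pattern shown at the end are our illustration, not something prescribed by the source.

```java
import java.util.HashSet;
import java.util.Set;

class MutableKeyDemo {
    /** Hypothetical mutable class with consistent equals and hashCode. */
    static class Point {
        int x, y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public int hashCode() {
            return 31 * (31 + x) + y;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Point)) return false;
            Point other = (Point) o;
            return x == other.x && y == other.y;
        }
    }

    public static void main(String[] args) {
        Set<Point> set = new HashSet<>();
        Point p = new Point(3, 4);
        set.add(p);

        p.y = 7;                               // hash code changes while p is stored
        System.out.println(set.remove(p));     // false: searched under the new hash
        System.out.println(set.size());        // 1: p is still held by the set

        p.y = 4;                               // restore the state p was stored with
        // Safer pattern: remove first, then mutate, then add again.
        System.out.println(set.remove(p));     // true
        p.y = 7;
        set.add(p);
        System.out.println(set.contains(p));   // true
    }
}
```

Making the fields that feed hashCode final avoids the problem altogether, at the cost of creating a new object for every change.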

We can also look at the hashCode and equals methods of the wrapper classes for the 8 primitive types and of String.

For wrappers such as Integer, hashCode simply returns the numeric value itself (wider types such as Long and Double fold their bits into an int). String computes its hash code with a more involved formula, but the calculation guarantees that strings with equal content have equal hash codes. The equals methods of the 8 wrapper types compare the wrapped values directly, and String's equals method compares the character content of the strings.
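A few of these values can be printed directly. This is a small sketch with a class name of our choosing; the values in the comments follow from the documented hashCode definitions of these classes.

```java
class WrapperHashDemo {
    public static void main(String[] args) {
        System.out.println(Integer.valueOf(42).hashCode());      // 42: the value itself
        System.out.println(Character.valueOf('a').hashCode());   // 97: the char code
        long big = 1L << 32;
        // Long folds its two 32-bit halves: (int) (big ^ (big >>> 32))
        System.out.println(Long.valueOf(big).hashCode());        // 1
        System.out.println(Boolean.TRUE.hashCode());             // 1231

        String s1 = "ab";
        String s2 = new String("ab");   // distinct object, same content
        System.out.println(s1 == s2);                            // false
        System.out.println(s1.equals(s2));                       // true
        // String hash: 'a' * 31 + 'b' = 97 * 31 + 98
        System.out.println(s1.hashCode());                       // 3105
        System.out.println(s1.hashCode() == s2.hashCode());      // true
    }
}
```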

That is all for this article. I hope it helps with your studies.
