Effective Java, Third Edition -- 11. Always override hashCode when you override equals


Tips
The English edition of "Effective Java, Third Edition" has been published. Many readers will have read the second edition, which is known as one of the four great Java books. But the second edition was published in 2009, nearly eight years ago, and with the releases of Java 6, 7, 8, and even 9, the Java language has changed profoundly.
I have translated the new edition into Chinese at the first opportunity, to share with everyone for study.

11. Always override hashCode when you override equals

You must override hashCode in every class that overrides equals. If you fail to do so, your class will violate the general contract for hashCode, which will prevent it from working properly in collections such as HashMap and HashSet. Here is the contract, adapted from the Object specification:

    1. When the hashCode method is invoked on an object repeatedly during an execution of an application, it must consistently return the same value, provided no information used in equals comparisons is modified. This value need not remain consistent from one execution of an application to another.
    2. If two objects are equal according to the equals(Object) method, then calling hashCode on the two objects must produce the same integer result.
    3. If two objects are unequal according to the equals(Object) method, it is not required that calling hashCode on each of the objects produce distinct results. However, the programmer should be aware that producing distinct results for unequal objects may improve the performance of hash tables.

The key provision that is violated when you fail to override hashCode is the second one: equal objects must have equal hash codes. Two distinct instances may be logically equal according to a class's equals method, but to Object's hashCode method they are just two objects with nothing much in common. Therefore, Object's hashCode method returns two seemingly random numbers instead of two equal numbers as required by the contract.

For example, suppose you use instances of the PhoneNumber class from Item 10 as keys in a HashMap:

Map<PhoneNumber, String> m = new HashMap<>();
m.put(new PhoneNumber(707, 867, 5309), "Jenny");

At this point, you might expect m.get(new PhoneNumber(707, 867, 5309)) to return "Jenny", but instead it returns null. Notice that two PhoneNumber instances are involved: one is inserted into the HashMap, and a second, equal instance is used for the attempted retrieval. Because the PhoneNumber class does not override hashCode, the two equal instances have unequal hash codes, in violation of the hashCode contract. The get method is therefore likely to look for the phone number in a different hash bucket from the one in which put stored it. Even if the two instances happen to hash to the same bucket, get will almost certainly return null, because HashMap has an optimization that caches the hash code associated with each entry and does not bother checking for object equality if the hash codes do not match.
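
The failure described above can be reproduced with a minimal sketch. BrokenPhoneNumber is a hypothetical stand-in for the Item 10 class (its constructor and equals method are assumptions made for illustration); it overrides equals but deliberately inherits hashCode from Object.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the Item 10 PhoneNumber class: it overrides
// equals but deliberately leaves hashCode inherited from Object.
final class BrokenPhoneNumber {
    private final short areaCode, prefix, lineNum;

    BrokenPhoneNumber(int areaCode, int prefix, int lineNum) {
        this.areaCode = (short) areaCode;
        this.prefix = (short) prefix;
        this.lineNum = (short) lineNum;
    }

    @Override public boolean equals(Object o) {
        if (o == this) return true;
        if (!(o instanceof BrokenPhoneNumber)) return false;
        BrokenPhoneNumber pn = (BrokenPhoneNumber) o;
        return pn.lineNum == lineNum && pn.prefix == prefix
                && pn.areaCode == areaCode;
    }
    // No hashCode override: this is the bug being demonstrated.
}

final class BrokenLookupDemo {
    public static void main(String[] args) {
        Map<BrokenPhoneNumber, String> m = new HashMap<>();
        BrokenPhoneNumber inserted = new BrokenPhoneNumber(707, 867, 5309);
        BrokenPhoneNumber probe = new BrokenPhoneNumber(707, 867, 5309);
        m.put(inserted, "Jenny");

        System.out.println(inserted.equals(probe)); // true: logically equal
        System.out.println(m.get(inserted));        // Jenny: same instance
        // Almost certainly null, because the probe's identity-based hash
        // code differs from that of the inserted instance.
        System.out.println(m.get(probe));
    }
}
```

Looking up the very instance that was inserted still works, which is why this bug can go unnoticed until a second, equal instance is used as the key.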

Fixing this problem is as simple as writing a proper hashCode method for PhoneNumber. So what should a hashCode method look like? It is trivial to write a bad one. This one, for example, is always legal but should never be used:

// The worst possible legal hashCode implementation - never use!
@Override public int hashCode() { return 42; }

It is legal because it ensures that equal objects have the same hash code. It is atrocious because it ensures that every object has the same hash code. Therefore, every object hashes to the same bucket, and hash tables degenerate to linked lists. Programs that should run in linear time instead run in quadratic time. For large hash tables, this is the difference between working and not working.
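
As a rough illustration of "legal but bad", the sketch below uses a hypothetical ConstantHashKey class (not from the book) whose hashCode always returns 42. Lookups remain correct, yet every key reports the same hash code, so the map cannot spread keys across buckets.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical key class whose hashCode is the constant from the text.
final class ConstantHashKey {
    private final int id;

    ConstantHashKey(int id) { this.id = id; }

    @Override public boolean equals(Object o) {
        return o instanceof ConstantHashKey && ((ConstantHashKey) o).id == id;
    }

    @Override public int hashCode() { return 42; } // legal, but never do this
}

final class ConstantHashDemo {
    // Number of distinct hash codes produced by keys 0..n-1.
    static int distinctHashCodes(int n) {
        Set<Integer> codes = new HashSet<>();
        for (int i = 0; i < n; i++) codes.add(new ConstantHashKey(i).hashCode());
        return codes.size();
    }

    public static void main(String[] args) {
        Map<ConstantHashKey, Integer> m = new HashMap<>();
        for (int i = 0; i < 1_000; i++) m.put(new ConstantHashKey(i), i);

        System.out.println(m.size());                        // 1000
        System.out.println(m.get(new ConstantHashKey(500))); // 500
        System.out.println(distinctHashCodes(1_000));        // 1
    }
}
```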

A good hash function tends to produce unequal hash codes for unequal instances. This is exactly what is meant by the third provision of the hashCode contract. Ideally, a hash function should distribute any reasonable collection of unequal instances uniformly across all int values. Achieving this ideal can be difficult. Fortunately, it is not too hard to achieve a fair approximation. Here is a simple recipe:

    1. Declare an int variable named result, and initialize it to the hash code c for the first significant field in your object, as computed in step 2.a below. (Recall from Item 10 that a significant field is a field that affects equals comparisons.)
    2. For every remaining significant field f in your object, do the following:

      a. Compute an int hash code c for the field:
      --I. If the field is of a primitive type, compute Type.hashCode(f), where Type is the boxed primitive class corresponding to f's type.
      --II. If the field is an object reference and this class's equals method compares the field by recursively invoking equals, recursively invoke hashCode on the field. If a more complex comparison is required, compute a "canonical representation" for this field and invoke hashCode on the canonical representation. If the value of the field is null, use 0 (some other constant would also work, but 0 is traditional).
      --III. If the field is an array, treat it as if each significant element were a separate field. That is, compute a hash code for each significant element by applying these rules recursively, and combine the values per step 2.b. If the array has no significant elements, use a constant, preferably not 0. If all elements are significant, use Arrays.hashCode.

      b. Combine the hash code c computed in step 2.a into result as follows: result = 31 * result + c;

    3. Return result.
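
The recipe can be sketched on a class that has one field of each kind it mentions. Track, its fields, and its equals method are hypothetical and exist only to show where each step applies; this is one possible reading of the recipe, not code from the book.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical class with one field of each kind covered by the recipe.
final class Track {
    private final int number;       // primitive field
    private final String title;     // object reference, may be null
    private final double[] samples; // array, all elements significant

    Track(int number, String title, double[] samples) {
        this.number = number;
        this.title = title;
        this.samples = samples.clone(); // defensive copy keeps the class immutable
    }

    @Override public boolean equals(Object o) {
        if (o == this) return true;
        if (!(o instanceof Track)) return false;
        Track t = (Track) o;
        return number == t.number
                && Objects.equals(title, t.title)
                && Arrays.equals(samples, t.samples);
    }

    @Override public int hashCode() {
        // Step 1: start with the hash code of the first significant field.
        int result = Integer.hashCode(number);
        // Step 2 for an object reference: 0 stands in for null.
        result = 31 * result + (title == null ? 0 : title.hashCode());
        // Step 2 for an array whose elements are all significant.
        result = 31 * result + Arrays.hashCode(samples);
        // Step 3: return the combined value.
        return result;
    }
}

final class TrackDemo {
    public static void main(String[] args) {
        Track a = new Track(1, "Intro", new double[] {0.5, 0.25});
        Track b = new Track(1, "Intro", new double[] {0.5, 0.25});
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
    }
}
```

Every field read by equals is also read by hashCode, and no others, which is what keeps the two methods consistent.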

When you have finished writing the hashCode method, ask yourself whether equal instances have equal hash codes. Write unit tests to verify your intuition (unless you used the AutoValue framework to generate your equals and hashCode methods, in which case you can safely omit these tests). If equal instances have unequal hash codes, figure out why and fix the problem.
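
A unit test for this can be very small. The helper below is a hypothetical sketch that uses no test framework; it checks only the first two provisions of the contract for a given pair of objects, since nothing is required of unequal objects.

```java
final class HashContractCheck {
    // Throws if a's hash code changes between two calls (first provision),
    // or if a and b are equal but report different hash codes (second provision).
    static void check(Object a, Object b) {
        if (a.hashCode() != a.hashCode())
            throw new AssertionError("hashCode is not consistent: " + a);
        if (a.equals(b) && a.hashCode() != b.hashCode())
            throw new AssertionError(
                    "equal objects, unequal hash codes: " + a + ", " + b);
    }

    public static void main(String[] args) {
        check("hash", new String("hash"));             // equal but distinct strings
        check(Integer.valueOf(7), Integer.valueOf(7)); // equal boxed integers
        check("a", "b");                               // unequal: nothing is required
        System.out.println("all checks passed");
    }
}
```

In a real project the same checks would normally live in the project's test framework and be run against instances of the class under test.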

You may exclude derived fields from the hash code computation. In other words, you can ignore any field whose value can be computed from fields included in the computation. You must exclude any fields that are not used in equals comparisons, or you risk violating the second provision of the hashCode contract.

The multiplication in step 2.b makes the result depend on the order of the fields, which yields a much better hash function if the class has multiple similar fields. For example, if the multiplication were omitted from a String hash function, all anagrams would have identical hash codes. The value 31 was chosen because it is an odd prime. If it were even and the multiplication overflowed, information would be lost, because multiplication by 2 is equivalent to shifting. The advantage of using a prime is less clear, but it is traditional. A nice property of 31 is that the multiplication can be replaced by a shift and a subtraction for better performance on some architectures: 31 * i == (i << 5) - i. Modern JVMs perform this sort of optimization automatically.
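
Both claims in this paragraph are easy to check. In the sketch below, additiveHash and multiplicativeHash are hypothetical helper names; the first simply sums the characters, while the second applies the result = 31 * result + c step to each character.

```java
final class OrderSensitivityDemo {
    // Order-insensitive hash: adds the character values, no multiplication.
    static int additiveHash(String s) {
        int result = 0;
        for (int i = 0; i < s.length(); i++) result += s.charAt(i);
        return result;
    }

    // Order-sensitive hash following step 2.b: result = 31 * result + c.
    static int multiplicativeHash(String s) {
        int result = 0;
        for (int i = 0; i < s.length(); i++) result = 31 * result + s.charAt(i);
        return result;
    }

    public static void main(String[] args) {
        // Anagrams collide when the multiplication is omitted...
        System.out.println(
                additiveHash("listen") == additiveHash("silent"));             // true
        // ...but not when it is present.
        System.out.println(
                multiplicativeHash("listen") == multiplicativeHash("silent")); // false
        // The shift-and-subtract identity holds even when int arithmetic overflows.
        int i = 123_456_789;
        System.out.println(31 * i == (i << 5) - i);                            // true
    }
}
```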

Let's apply the above approach to the PhoneNumber class:

// Typical hashCode method
@Override public int hashCode() {
    int result = Short.hashCode(areaCode);
    result = 31 * result + Short.hashCode(prefix);
    result = 31 * result + Short.hashCode(lineNum);
    return result;
}

Because this method returns the result of a simple deterministic computation whose only inputs are the three significant fields in a PhoneNumber instance, it is clear that equal PhoneNumber instances have equal hash codes. This method is, in fact, a perfectly good hashCode implementation for PhoneNumber, on par with those in the Java platform libraries. It is simple, it is reasonably fast, and it does a reasonable job of dispersing unequal phone numbers into different hash buckets.
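
To see the fix end to end, here is a sketch of the whole class with the hashCode method from above. The field names come from that method; the constructor and the equals method are assumptions made for illustration, since Item 10 is not reproduced here.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of PhoneNumber. The constructor and equals are assumed; only the
// hashCode method is taken from the text.
final class PhoneNumber {
    private final short areaCode, prefix, lineNum;

    PhoneNumber(int areaCode, int prefix, int lineNum) {
        this.areaCode = (short) areaCode;
        this.prefix = (short) prefix;
        this.lineNum = (short) lineNum;
    }

    @Override public boolean equals(Object o) {
        if (o == this) return true;
        if (!(o instanceof PhoneNumber)) return false;
        PhoneNumber pn = (PhoneNumber) o;
        return pn.lineNum == lineNum && pn.prefix == prefix
                && pn.areaCode == areaCode;
    }

    @Override public int hashCode() {
        int result = Short.hashCode(areaCode);
        result = 31 * result + Short.hashCode(prefix);
        result = 31 * result + Short.hashCode(lineNum);
        return result;
    }
}

final class FixedLookupDemo {
    public static void main(String[] args) {
        Map<PhoneNumber, String> m = new HashMap<>();
        m.put(new PhoneNumber(707, 867, 5309), "Jenny");
        // A second, equal instance now finds the entry.
        System.out.println(m.get(new PhoneNumber(707, 867, 5309))); // Jenny
    }
}
```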

Although the recipe in this item yields reasonably good hash functions, they are not state of the art. They are comparable in quality to the hash functions found in the value types of the Java platform libraries and are adequate for most uses. If you have a genuine need for hash functions that are less likely to produce collisions, see Guava's com.google.common.hash.Hashing [Guava].

The Objects class has a static method that takes an arbitrary number of objects and returns a hash code for them. This method, named hash, lets you write one-line hashCode methods whose quality is comparable to those written according to the recipe in this item. Unfortunately, they run more slowly, because they entail array creation to pass a variable number of arguments, as well as boxing and unboxing if any of the arguments are of primitive type. This style of hash function is recommended only in situations where performance is not critical. Here is a hash function for PhoneNumber written using this technique:

// One-line hashCode method - mediocre performance
@Override public int hashCode() {
    return Objects.hash(lineNum, prefix, areaCode);
}
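
For comparison, the sketch below puts the one-liner next to a hand-written counterpart on a hypothetical Point3 class. Objects.hash starts from 1 rather than from the first field's hash code, so the two methods return different values for the same object; both satisfy the contract.

```java
import java.util.Objects;

// Hypothetical value class used only to illustrate Objects.hash.
final class Point3 {
    private final int x, y, z;

    Point3(int x, int y, int z) { this.x = x; this.y = y; this.z = z; }

    @Override public boolean equals(Object o) {
        if (o == this) return true;
        if (!(o instanceof Point3)) return false;
        Point3 p = (Point3) o;
        return x == p.x && y == p.y && z == p.z;
    }

    // One-liner: boxes x, y, z into Integers and packs them into an Object[].
    @Override public int hashCode() { return Objects.hash(x, y, z); }

    // Hand-written counterpart per the recipe: no boxing, no array.
    int recipeHashCode() {
        int result = Integer.hashCode(x);
        result = 31 * result + Integer.hashCode(y);
        result = 31 * result + Integer.hashCode(z);
        return result;
    }
}

final class ObjectsHashDemo {
    public static void main(String[] args) {
        Point3 p = new Point3(1, 2, 3);
        System.out.println(p.hashCode());       // 30817
        System.out.println(p.recipeHashCode()); // 1026
        System.out.println(p.hashCode() == new Point3(1, 2, 3).hashCode()); // true
    }
}
```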

If a class is immutable and the cost of computing the hash code is significant, consider caching the hash code in the object rather than recalculating it each time it is requested. If you believe that most objects of this type will be used as hash keys, then you should calculate the hash code when the instance is created. Otherwise, you might choose to lazily initialize the hash code the first time hashCode is invoked. Some care is required to ensure that the class remains thread-safe in the presence of a lazily initialized field (Item 83). The PhoneNumber class does not merit this treatment, but it is used here just to show how it is done. Note that the initial value of the hashCode field (in this case, 0) should not be the hash code of a commonly created instance:

// hashCode method with lazily initialized cached hash code
private int hashCode; // Automatically initialized to 0

@Override public int hashCode() {
    int result = hashCode;
    if (result == 0) {
        result = Short.hashCode(areaCode);
        result = 31 * result + Short.hashCode(prefix);
        result = 31 * result + Short.hashCode(lineNum);
        hashCode = result;
    }
    return result;
}

Do not be tempted to exclude significant fields from the hash code computation to improve performance. The resulting hash function may run faster, but its poor quality may degrade the performance of hash tables to the point where they become unusable. In particular, the hash function may be confronted with a large collection of instances that differ mainly in the regions you have chosen to ignore. If this happens, the hash function will map all of these instances to a few hash codes, and programs that should run in linear time will instead run in quadratic time.

This is not just a theoretical problem. Prior to Java 2, the String hash function used at most 16 characters, evenly spaced throughout the string and starting with the first character. For large collections of hierarchical names, such as URLs, this function displayed exactly the pathological behavior described earlier.
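
The effect can be imitated with a small sketch. samplingHash below is a hypothetical function written for this illustration and is not the historical String implementation; it merely examines at most 16 evenly spaced characters. The URLs are made up and differ only in a trailing four-digit number.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.ToIntFunction;

final class SamplingHashDemo {
    // Hypothetical sampling hash: looks at no more than 16 characters, evenly
    // spaced and starting with the first. NOT the historical String code.
    static int samplingHash(String s) {
        int n = s.length();
        int step = Math.max(1, n / 16);
        int result = 0;
        for (int i = 0, used = 0; i < n && used < 16; i += step, used++)
            result = 31 * result + s.charAt(i);
        return result;
    }

    // Counts the distinct hash codes that h assigns to n made-up URLs.
    static int distinctCodes(ToIntFunction<String> h, int n) {
        Set<Integer> codes = new HashSet<>();
        for (int i = 0; i < n; i++) {
            String url = String.format(
                    "https://www.example.com/catalog/items/page/%04d", i);
            codes.add(h.applyAsInt(url));
        }
        return codes.size();
    }

    public static void main(String[] args) {
        // The sampled positions all fall inside the shared prefix.
        System.out.println(distinctCodes(SamplingHashDemo::samplingHash, 1_000)); // 1
        // String.hashCode reads every character.
        System.out.println(distinctCodes(String::hashCode, 1_000));               // 1000
    }
}
```

With every key mapped to a single hash code, a hash table holding these URLs behaves like the constant-hash case: correct, but slow.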

Do not provide a detailed specification for the value returned by hashCode, so that clients cannot reasonably depend on it; this gives you the flexibility to change it. Many classes in the Java libraries, such as String and Integer, specify the exact value returned by their hashCode method as a function of the instance value. This is not a good idea but a mistake that we are forced to live with: it impedes the ability to improve the hash function in future releases. If you leave the details unspecified and a flaw is found in the hash function, or a better hash function is discovered, you can change it in a subsequent release.

In summary, you must override hashCode every time you override equals, or your program will not run correctly. Your hashCode method must obey the general contract specified in Object and must do a reasonable job of assigning unequal hash codes to unequal instances. This is easy to achieve if you follow the recipe given earlier in this item. As mentioned in Item 10, the AutoValue framework provides a fine alternative to writing equals and hashCode methods by hand, and IDEs also provide some of this functionality.
