Java theory and practice: Defining hashCode() and equals() effectively and correctly

Every Java object has a hashCode() and an equals() method. Many classes override the default implementations of these methods to provide a higher degree of semantic comparability between object instances. In this installment of Java theory and practice, Java developer Brian Goetz shows you the rules and guidelines you should follow when creating Java classes in order to define hashCode() and equals() effectively and correctly.

Although the Java language does not directly support associative arrays (arrays that can take any object as an index), the presence of the hashCode() method in the root Object class clearly anticipates the widespread use of HashMap (and its predecessor, Hashtable). Under ideal conditions, hash-based containers offer both efficient insertion and efficient retrieval, and supporting hashing directly in the object model makes such containers easier to develop and use.

Defining the equality of objects

The Object class has two methods for making inferences about an object's identity: equals() and hashCode(). In general, if you override one of these methods, you must override both, because there are important relationships between them that must be maintained. In particular, if two objects are equal according to the equals() method, they must have the same hashCode() value (although the reverse is not generally true).

The semantics of equals() for a given class are left to the implementer; deciding what equals() means for a class is part of the design work for that class. The default implementation, provided by Object, is simply reference equality: it compares the two references with ==.

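A minimal sketch of what that default amounts to. The class name ReferenceEqualityDemo is an illustrative assumption, not a JDK type; its equals() simply spells out the behavior documented for Object.equals().

```java
// Illustrative sketch: a class that writes out what Object.equals() does by default.
class ReferenceEqualityDemo {
    // Same behavior as the inherited Object.equals(): true only for the same instance.
    @Override
    public boolean equals(Object obj) {
        return this == obj;
    }

    // Kept consistent with equals(): identity-based, like the default hashCode().
    @Override
    public int hashCode() {
        return System.identityHashCode(this);
    }

    public static void main(String[] args) {
        ReferenceEqualityDemo a = new ReferenceEqualityDemo();
        ReferenceEqualityDemo b = new ReferenceEqualityDemo();
        System.out.println(a.equals(a)); // true: same instance
        System.out.println(a.equals(b)); // false: distinct instances
    }
}
```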

Under this default implementation, two references are equal only if they refer to the exact same object. Similarly, the default implementation of hashCode() provided by Object is typically derived by mapping the memory address of the object to an integer value. Because the address space is larger than the range of int values on some architectures, two distinct objects can have the same hashCode(). If you override hashCode(), you can still use the System.identityHashCode() method to access this default value.
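
As a small illustration of that last point, the sketch below compares hashCode() with System.identityHashCode(). The class and method names are hypothetical; only System.identityHashCode() itself is a JDK call.

```java
// Sketch: System.identityHashCode() exposes the default hash even after an override.
class IdentityHashDemo {
    // Hypothetical helper: the hash that Object.hashCode() would have returned.
    static int defaultHashOf(Object o) {
        return System.identityHashCode(o);
    }

    public static void main(String[] args) {
        Object plain = new Object();
        // A class that does not override hashCode() reports the identity hash.
        System.out.println(plain.hashCode() == defaultHashOf(plain)); // true

        String text = new String("hello");
        // String overrides hashCode() so that it depends on the characters...
        System.out.println(text.hashCode() == "hello".hashCode());    // true
        // ...while the identity hash of this particular instance is still available.
        System.out.println(defaultHashOf(text));
    }
}
```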

Overriding equals(): a simple example

An identity-based implementation of equals() and hashCode() is a sensible default, but for some classes it is desirable to relax the definition of equality. For example, the Integer class defines equals() to compare the wrapped int values rather than the object references.

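A minimal sketch of that kind of value-based equals(). IntBox is a hypothetical stand-in written for illustration; it is not the JDK source of Integer.

```java
// Hypothetical stand-in for a wrapper class; not the JDK's Integer implementation.
final class IntBox {
    private final int value;

    IntBox(int value) {
        this.value = value;
    }

    int intValue() {
        return value;
    }

    // Value-based equality: equal when the other object is an IntBox holding the same int.
    @Override
    public boolean equals(Object obj) {
        return obj instanceof IntBox && value == ((IntBox) obj).intValue();
    }

    // Equal boxes hold the same int, so the int itself is a consistent hash value.
    @Override
    public int hashCode() {
        return value;
    }
}
```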

Under this definition, two Integer objects are equal only if they contain the same integer value. This, together with the fact that Integer is immutable, makes it practical to use an Integer as a key in a HashMap. This value-based approach to equality is used by all the primitive wrapper classes in the Java class library, such as Integer, Float, Character, and Boolean, as well as by String (two String objects are equal if they contain the same sequence of characters). Because these classes are immutable and implement hashCode() and equals() sensibly, they all make good hash keys.

Why override equals() and hashCode()?

What would happen if Integer did not override equals() and hashCode()? Nothing, as long as we never used an Integer as a key in a HashMap or other hash-based collection. However, if we did use such an Integer object as a key in a HashMap, we could not reliably retrieve the associated value unless the get() call used the exact same Integer instance as the put() call. This would require ensuring that our entire program uses only one Integer instance for each integer value. Needless to say, this approach would be inconvenient and error-prone.
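
The effect can be seen with a key class that keeps the inherited, identity-based methods. This is a sketch under that assumption; IdentityKey and IdentityKeyDemo are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key class that inherits identity-based equals() and hashCode().
class IdentityKey {
    final int value;

    IdentityKey(int value) {
        this.value = value;
    }
}

class IdentityKeyDemo {
    // Stores a value under one key instance and looks it up with the given probe.
    static String putThenGet(IdentityKey stored, IdentityKey probe) {
        Map<IdentityKey, String> map = new HashMap<>();
        map.put(stored, "forty-two");
        return map.get(probe);
    }

    public static void main(String[] args) {
        IdentityKey key = new IdentityKey(42);
        System.out.println(putThenGet(key, key));                 // forty-two
        System.out.println(putThenGet(key, new IdentityKey(42))); // null
    }
}
```

The second lookup fails even though both keys carry the value 42, because the two instances are never equal under the default equals().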

The interface contract for Object requires that if two objects are equal according to equals(), they must have the same hashCode() value. Why does the root Object class need hashCode() at all, when its discriminating ability is entirely subsumed by that of equals()? The hashCode() method exists purely for efficiency. The Java platform architects anticipated the importance of hash-based collection classes such as Hashtable, HashMap, and HashSet in typical Java applications, and comparing against many objects with equals() can be computationally expensive. Having every Java object support hashCode() allows efficient storage and retrieval with hash-based collections.

Requirements for implementing equals() and hashCode()

There are some restrictions on the behavior of equals() and hashCode(), which are listed in the documentation for Object. In particular, the equals() method must exhibit the following properties:

  • Symmetry: For two references, a and b, a.equals(b) if and only if b.equals(a)
  • Reflexivity: For all non-null references, a.equals(a)
  • Transitivity: If a.equals(b) and b.equals(c), then a.equals(c)
  • Consistency with hashCode(): Two equal objects must have the same hashCode() value

The specification for Object also offers a vague guideline that equals() and hashCode() be consistent: their results should be the same on subsequent invocations, provided that "no information used in equals comparisons on the object is modified." This sounds rather like "the result of the calculation should not change, unless it does." The statement is generally interpreted to mean that equality and hash value calculations should be a deterministic function of an object's state and nothing else.



What should equality mean?

The requirements that the Object class specification imposes on equals() and hashCode() are fairly easy to meet. Deciding whether, and how, to override equals() requires a little more judgment. For simple immutable value classes such as Integer (and in fact for nearly all immutable classes), the choice is fairly obvious: equality should be based on the equality of the underlying object state. In the case of Integer, the object's only state is the underlying integer value.

For mutable objects, the answer is not always so clear. Should equals() and hashCode() be based on the object's identity (like the default implementation) or on the object's state (like Integer and String)? There is no easy answer; it depends on the intended use of the class. For containers like List and Map, a reasonable argument could be made either way. Most classes in the Java class library, including the container classes, provide equals() and hashCode() implementations based on the object's state.

If an object's hashCode() value can change with its state, we must be careful when using such objects as keys in hash-based collections and make sure their state does not change while they are in use as hash keys. All hash-based collections assume that an object's hash value does not change while it is being used as a key in the collection. If a key's hash code changes while the key is in a collection, unpredictable and confusing consequences can follow. This is usually not a problem in practice, because mutable objects like a List are not commonly used as keys in a HashMap.
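
The following sketch shows one such consequence with a List used as a key. The class and method names are hypothetical; the behavior relies only on ArrayList's state-based hashCode() and on HashMap comparing stored hash values during lookup.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MutableKeyDemo {
    // Returns what the map reports for the key after the key has been mutated.
    static String lookupAfterMutation() {
        List<String> key = new ArrayList<>();
        key.add("a");
        Map<List<String>, String> map = new HashMap<>();
        map.put(key, "value");  // the entry records the hash of ["a"]
        key.add("b");           // key.hashCode() is now the hash of ["a", "b"]
        return map.get(key);    // the recorded hash no longer matches
    }

    // Looking up with a fresh list equal to the original key does not help either,
    // because the stored key object now equals ["a", "b"], not ["a"].
    static String lookupWithOriginalContents() {
        List<String> key = new ArrayList<>();
        key.add("a");
        Map<List<String>, String> map = new HashMap<>();
        map.put(key, "value");
        key.add("b");
        List<String> probe = new ArrayList<>();
        probe.add("a");
        return map.get(probe);
    }

    public static void main(String[] args) {
        System.out.println(lookupAfterMutation());        // null
        System.out.println(lookupWithOriginalContents()); // null
    }
}
```

The entry is still in the map (its size stays at 1), but it can no longer be reached through get().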

An example of a simple mutable class that defines equals() and hashCode() based on its state is Point. Two Point objects are equal if they refer to the same (x, y) coordinates, and the hash value of a Point is derived from the IEEE 754 bit representation of the x and y coordinate values.

For more complex classes, the behavior of equals() and hashCode() may even be imposed by the specification of a superclass or interface. For example, the List interface requires that a List object is equal to another object if and only if the other object is also a List and the two contain the same elements (as defined by Object.equals() on the elements) in the same order. The requirements for hashCode() are even more specific: the hashCode() value of a list must be the result of a prescribed calculation that starts from 1 and, for each element in order, multiplies the running value by 31 and adds the element's hashCode() (or 0 for a null element).

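The calculation documented for List.hashCode() can be written out as follows. The wrapper class ListHashDemo and the method name listHash are hypothetical; the loop body follows the formula in the List documentation.

```java
import java.util.Arrays;
import java.util.List;

class ListHashDemo {
    // The calculation that the List.hashCode() documentation prescribes.
    static int listHash(List<?> list) {
        int hashCode = 1;
        for (Object element : list) {
            hashCode = 31 * hashCode + (element == null ? 0 : element.hashCode());
        }
        return hashCode;
    }

    public static void main(String[] args) {
        List<Object> sample = Arrays.asList("a", null, 3);
        // Any conforming List implementation must agree with the formula.
        System.out.println(listHash(sample) == sample.hashCode()); // true
    }
}
```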

Not only does the hash value depend on the contents of the list, but the specific algorithm for combining the hash values of the individual elements is specified as well. (The String class specifies a similar algorithm for computing the hash value of a String.)



Writing your own equals() and hashCode() methods

Overriding the default equals() method is fairly easy, but overriding an equals() method that has already been overridden can be extremely tricky to do without violating either the symmetry or the transitivity requirement. When you override equals(), you should always include Javadoc comments on equals() to help those who want to extend your class do so correctly.

As a simple example, consider a small class, A, that holds a few fields.

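A minimal stand-in for such a class, offered as a sketch rather than the author's listing. The field names and types are illustrative assumptions: one field that is never null, one that may be null, and one that is not part of the object's logical state.

```java
import java.util.Objects;

// Hypothetical example class; field names and types are illustrative assumptions.
class A {
    final String someNonNullField; // part of the logical state, never null
    Integer someOtherField;        // part of the logical state, may be null
    int someNonStateField;         // bookkeeping only, not part of the logical state

    A(String someNonNullField, Integer someOtherField) {
        this.someNonNullField = Objects.requireNonNull(someNonNullField);
        this.someOtherField = someOtherField;
    }
}
```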

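How should we write the equals() method for this class? An approach that suits many situations is to check for the same reference first, then check the type of the argument, and finally compare the state fields one by one, taking care with fields that may be null.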

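A sketch of an equals() along those lines, assuming a hypothetical class A with one non-null state field, one nullable state field, and one field outside its logical state. The sketch is deliberately incomplete: a class that overrides equals() also needs a matching hashCode().

```java
import java.util.Objects;

// Hypothetical example class; field names and types are illustrative assumptions.
class A {
    final String someNonNullField;
    Integer someOtherField;
    int someNonStateField; // not part of the logical state, so equals() ignores it

    A(String someNonNullField, Integer someOtherField) {
        this.someNonNullField = Objects.requireNonNull(someNonNullField);
        this.someOtherField = someOtherField;
    }

    @Override
    public boolean equals(Object other) {
        // Not strictly necessary, but often a worthwhile shortcut.
        if (this == other) {
            return true;
        }
        // instanceof is false for null, so this also rejects a null argument.
        if (!(other instanceof A)) {
            return false;
        }
        A otherA = (A) other;
        return someNonNullField.equals(otherA.someNonNullField)
            && (someOtherField == null
                    ? otherA.someOtherField == null
                    : someOtherField.equals(otherA.someOtherField));
    }
    // A matching hashCode() override is still required; it is left out of this sketch.
}
```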

Now that we have defined equals(), we have to define hashCode() in a compatible manner. One compatible, but not very useful, way to define hashCode() is to return the same constant, such as 0, for every instance.

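A sketch of such a constant hashCode(), shown on a hypothetical key class so that the example stays self-contained. It satisfies the contract because equal objects trivially share the same hash value.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical key whose hashCode() is legal but useless for spreading entries.
final class ConstantHashKey {
    private final String name;

    ConstantHashKey(String name) {
        this.name = Objects.requireNonNull(name);
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof ConstantHashKey
            && name.equals(((ConstantHashKey) other).name);
    }

    // Every instance reports the same hash, so all keys collide in one bucket.
    @Override
    public int hashCode() {
        return 0;
    }

    public static void main(String[] args) {
        Set<ConstantHashKey> set = new HashSet<>();
        set.add(new ConstantHashKey("a"));
        set.add(new ConstantHashKey("a")); // equal to the first, not added again
        set.add(new ConstantHashKey("b"));
        System.out.println(set.size());    // 2: still correct, only slower
    }
}
```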

This approach yields terrible performance for HashMaps with a large number of entries, but it does conform to the specification. A more sensible implementation of hashCode() for A would combine the hash values of its state fields.

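One way to combine the field hashes is the multiply-by-31 scheme that List uses. The sketch below applies it to a hypothetical class A (field names and types are assumptions) and includes a matching equals() so that the two methods stay consistent.

```java
import java.util.Objects;

// Hypothetical example class; field names and types are illustrative assumptions.
class A {
    final String someNonNullField;
    Integer someOtherField;
    int someNonStateField; // deliberately ignored by equals() and hashCode()

    A(String someNonNullField, Integer someOtherField) {
        this.someNonNullField = Objects.requireNonNull(someNonNullField);
        this.someOtherField = someOtherField;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof A)) {
            return false;
        }
        A otherA = (A) other;
        return someNonNullField.equals(otherA.someNonNullField)
            && Objects.equals(someOtherField, otherA.someOtherField);
    }

    // Uses exactly the fields that equals() compares, so equal objects hash alike.
    @Override
    public int hashCode() {
        int hash = 1;
        hash = hash * 31 + someNonNullField.hashCode();
        hash = hash * 31 + (someOtherField == null ? 0 : someOtherField.hashCode());
        return hash;
    }
}
```

Because someOtherField is mutable in this sketch, instances should not be modified while they serve as hash keys.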

Note that both implementations delegate part of the computation to the equals() or hashCode() method of the class's state fields. Depending on your class, you may also want to delegate part of the computation to the equals() or hashCode() method of the superclass. For primitive fields, the associated wrapper classes offer helper functions that can help in creating hash values, such as Float.floatToIntBits.
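
A sketch of how primitive fields might be folded into the hash with those helpers. The class Reading and its fields are hypothetical; Float.floatToIntBits and Long.hashCode are JDK methods. Comparing floats by their bit patterns in equals() keeps it consistent with hashCode(), at the cost of treating 0.0f and -0.0f as different values.

```java
// Hypothetical class with primitive state, used only for illustration.
final class Reading {
    private final float celsius;
    private final long timestampMillis;

    Reading(float celsius, long timestampMillis) {
        this.celsius = celsius;
        this.timestampMillis = timestampMillis;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Reading)) {
            return false;
        }
        Reading r = (Reading) other;
        // Compare bit patterns so that equals() agrees with hashCode() below.
        return Float.floatToIntBits(celsius) == Float.floatToIntBits(r.celsius)
            && timestampMillis == r.timestampMillis;
    }

    @Override
    public int hashCode() {
        int hash = 1;
        hash = hash * 31 + Float.floatToIntBits(celsius);
        hash = hash * 31 + Long.hashCode(timestampMillis);
        return hash;
    }
}
```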

Writing an equals() method is not without pitfalls. In general, it is impractical to cleanly override equals() when extending an instantiable class that itself overrides equals(), and writing an equals() method that is intended to be overridden (as in an abstract class) is done differently from writing an equals() method for a concrete class. For examples and a more detailed explanation, see Effective Java Programming Language Guide, Item 7 (see Resources).



Room for improvement?

Building hashing into the root object class of the Java class library was a very sensible design compromise; it makes hash-based containers much easier and more efficient to use. However, several criticisms have been made of the approach to, and the implementation of, hashing and equality in the Java class library. The hash-based containers in java.util are convenient and easy to use, but may not be suitable for applications that require very high performance. While most of these issues will never be changed, they are worth keeping in mind when you design applications that rely heavily on the efficiency of hash-based containers. The criticisms include:

  • Too small a hash range. Using int instead of long as the return type of hashCode() increases the probability of hash collisions.

  • Poor distribution of hash values. The hash values for short strings and small integers are themselves small integers, and are close to the hash values of other "nearby" objects. A better-behaved hash function would distribute the hash values more evenly across the hash range.

  • No defined hashing operations. Although some classes, such as String and List, define a hash algorithm that combines the hash values of their constituent elements into a single hash value, the language specification does not define any approved means of combining the hash values of multiple objects into a new hash value. The technique used by List, String, and the example class A discussed earlier in "Writing your own equals() and hashCode() methods" is simple, but far from mathematically ideal. Nor do the class libraries offer convenience implementations of any hashing algorithm that would simplify the creation of more sophisticated hashCode() implementations.

  • Difficulty writing equals() when extending an instantiable class that already overrides equals(). The "obvious" ways to define equals() in such a subclass all fail to meet the symmetry or transitivity requirements of the equals() method. This means that you must understand the structure and implementation details of the class you are extending when you override equals(), and you may even need to expose private fields of the base class as protected, which violates the principles of good object-oriented design.
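
The last criticism can be illustrated with a sketch of the "obvious" subclass override. GridPoint and ColoredGridPoint are hypothetical classes written for this illustration, and the subclass is intentionally broken.

```java
import java.util.Objects;

// Hypothetical classes for illustration; the names are assumptions, not JDK types.
class GridPoint {
    final int x;
    final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof GridPoint)) {
            return false;
        }
        GridPoint p = (GridPoint) other;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y;
    }
}

class ColoredGridPoint extends GridPoint {
    final String color;

    ColoredGridPoint(int x, int y, String color) {
        super(x, y);
        this.color = Objects.requireNonNull(color);
    }

    // The "obvious" override: also compare the new field. This breaks symmetry,
    // because a plain GridPoint still considers itself equal to a colored one.
    @Override
    public boolean equals(Object other) {
        if (!(other instanceof ColoredGridPoint)) {
            return false;
        }
        return super.equals(other) && color.equals(((ColoredGridPoint) other).color);
    }

    @Override
    public int hashCode() {
        return 31 * super.hashCode() + color.hashCode();
    }
}
```

With p = new GridPoint(1, 2) and cp = new ColoredGridPoint(1, 2, "red"), p.equals(cp) is true while cp.equals(p) is false, so the symmetry requirement is violated.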

Concluding remarks

By defining equals() and hashCode() consistently, you can improve the usability of your classes as keys in hash-based collections. There are two approaches to defining equality and hash value: identity-based, which is the default provided by Object, and state-based, which requires overriding both equals() and hashCode(). If an object's hash value can change when its state changes, make sure its state does not change while the object is being used as a hash key.

About the author

Brian Goetz has been a professional software developer for the past 15 years.
