Java Theory and Practice: Hash

Every Java object has a hashCode() and an equals() method. Many classes override the default implementations of these methods to provide higher-level semantic equivalence between object instances. In this installment of Java Theory and Practice, Java developer Brian Goetz walks through the rules and guidelines to follow when creating Java classes so that hashCode() and equals() are defined effectively and correctly.

Although the Java language does not directly support associative arrays (arrays that can take any object as an index), the presence of the hashCode() method in the root Object class clearly anticipates the widespread use of HashMap (and its predecessor, Hashtable). Under ideal conditions, hash-based containers offer both efficient insertion and efficient retrieval; supporting hashing directly in the object model facilitates the development and use of hash-based containers.

Defining equality
The Object class has two methods for making inferences about an object's identity: equals() and hashCode(). In general, if you override one of them, you must override both, because there is an important relationship between them that must be maintained. In particular, if two objects are equal according to equals(), they must have the same hashCode() value (although the reverse is not generally true).

The semantics of equals() for a given class are left to the implementer; defining what equals() means is part of the design work for that class. The default implementation provided by Object is simply reference equality:

  public boolean equals(Object obj) { return (this == obj); }

Under this default implementation, two references are equal only if they refer to the same object. Similarly, the default implementation of hashCode() provided by Object is typically derived by mapping the object's memory address to an integer value. Because on some architectures the address space is larger than the range of int values, two distinct objects can have the same hashCode(). If you override hashCode(), you can still use the System.identityHashCode() method to obtain this default value.
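The default behavior described above can be observed with a small sketch. The class names Token and Fixed are hypothetical and exist only for illustration; the only library call used is System.identityHashCode().

```java
public class IdentityDefaults {
    // Hypothetical class that inherits equals() and hashCode() from Object.
    static final class Token {
        final String text;
        Token(String text) { this.text = text; }
    }

    // Hypothetical class that overrides hashCode() with a constant.
    static final class Fixed {
        @Override
        public int hashCode() { return 42; }
    }

    // Two distinct objects with identical state are not equal by default.
    static boolean sameStateIsEqual() {
        return new Token("a").equals(new Token("a"));
    }

    // For a class that does not override hashCode(), the inherited value
    // is the same one System.identityHashCode() reports.
    static boolean identityHashMatchesDefault() {
        Token t = new Token("a");
        return t.hashCode() == System.identityHashCode(t);
    }

    // The overridden value, independent of the identity-based one.
    static int overriddenHash() {
        return new Fixed().hashCode();
    }

    public static void main(String[] args) {
        System.out.println(sameStateIsEqual());           // false
        System.out.println(identityHashMatchesDefault()); // true
        System.out.println(overriddenHash());             // 42
        // Still available after overriding; the value varies from run to run.
        System.out.println(System.identityHashCode(new Fixed()));
    }
}
```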

Overriding equals(): a simple example
An identity-based implementation of equals() and hashCode() is a sensible default, but for some classes it is desirable to relax the definition of equality somewhat. For example, the Integer class defines equals() similarly to this:

  public boolean equals(Object obj) {
    return (obj instanceof Integer
            && intValue() == ((Integer) obj).intValue());
  }

Under this definition, two Integer objects are equal only if they contain the same integer value. This, together with Integer being immutable, makes it practical to use an Integer as a key in a HashMap. This value-based approach to equality is used by all the primitive wrapper classes in the Java class library, such as Integer, Float, Character, and Boolean, as well as by String (two String objects are equal if they contain the same sequence of characters). Because these classes are immutable and implement hashCode() and equals() sensibly, they all make good hash keys.

Why override equals() and hashCode()?
What would happen if Integer did not override equals() and hashCode()? Nothing, if we never used an Integer as a key in a HashMap or other hash-based collection. However, if we did use such an Integer object as a key in a HashMap, we could not reliably retrieve the associated value unless the get() call used the exact same Integer instance as the put() call. That would require ensuring that only a single instance of the Integer object corresponding to a particular integer value is used throughout the program. Needless to say, this approach would be inconvenient and error-prone.
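To make the retrieval problem concrete, here is a minimal sketch. IdentityInt is a hypothetical wrapper, invented for this example, that keeps the identity-based methods inherited from Object; the real Integer is used for contrast.

```java
import java.util.HashMap;
import java.util.Map;

public class IdentityKeyLookup {
    // Hypothetical wrapper that, unlike Integer, does not override
    // equals() or hashCode().
    static final class IdentityInt {
        final int value;
        IdentityInt(int value) { this.value = value; }
    }

    // Looks up with a different instance holding the same value.
    static String lookupWithNewInstance() {
        Map<IdentityInt, String> map = new HashMap<>();
        map.put(new IdentityInt(7), "seven");
        return map.get(new IdentityInt(7));
    }

    // Looks up with the very instance that was used in put().
    static String lookupWithSameInstance() {
        Map<IdentityInt, String> map = new HashMap<>();
        IdentityInt key = new IdentityInt(7);
        map.put(key, "seven");
        return map.get(key);
    }

    // Integer overrides both methods, so any equal instance finds the entry.
    static String lookupWithInteger() {
        Map<Integer, String> map = new HashMap<>();
        map.put(Integer.valueOf(1000), "thousand");
        return map.get(Integer.valueOf(1000));
    }

    public static void main(String[] args) {
        System.out.println(lookupWithNewInstance());  // null
        System.out.println(lookupWithSameInstance()); // seven
        System.out.println(lookupWithInteger());      // thousand
    }
}
```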

The interface contract for Object requires that if two objects are equal according to equals(), then they must have the same hashCode() value. Why does the root object class need hashCode() when its discriminating ability is entirely subsumed by that of equals()? The hashCode() method exists purely for efficiency. The Java platform architects anticipated the importance of hash-based collection classes (such as Hashtable, HashMap, and HashSet) in typical Java applications, and comparing against many objects with equals() can be computationally expensive. Having every Java object support hashCode() allows efficient storage and retrieval using hash-based collections.

Requirements for implementing equals() and hashCode()
There are some restrictions on the behavior of equals() and hashCode(), which are enumerated in the documentation for Object. In particular, the equals() method must exhibit the following properties:

  • Symmetry: for two references a and b, a.equals(b) if and only if b.equals(a)
  • Reflexivity: for all non-null references, a.equals(a)
  • Transitivity: if a.equals(b) and b.equals(c), then a.equals(c)
  • Consistency with hashCode(): two equal objects must have the same hashCode() value

The specification for Object also offers a vague guideline that equals() and hashCode() be consistent: their results should be the same on subsequent invocations, provided "no information used in equals comparisons on the object is modified." This sounds rather like "the result of the calculation shouldn't change, unless it does." The statement is generally interpreted to mean that equality and hash value calculations should be a deterministic function of an object's state and nothing else.

What should equality mean?
The requirements that the Object class specification imposes on equals() and hashCode() are fairly easy to meet. Deciding whether, and how, to override equals() requires a little more judgment. For simple immutable value classes such as Integer (and in fact for nearly all immutable classes), the choice is fairly obvious: equality should be based on the equality of the underlying object state. In the case of Integer, the object's only state is the underlying integer value.

For mutable objects, the answer is not always so clear. Should equals() and hashCode() be based on the object's identity (like the default implementation) or on the object's state (like Integer and String)? There is no easy answer; it depends on the intended use of the class. For containers such as List and Map, a reasonable argument could be made either way. Most classes in the Java class library, including the container classes, err on the side of providing equals() and hashCode() implementations based on the object state.

If an object's hashCode() value can change based on its state, then we must be careful when using such objects as keys in hash-based collections to ensure that their state does not change while they are in use as hash keys. All hash-based collections assume that an object's hash value does not change while it is being used as a key in the collection. If a key's hash code were to change while it was in a collection, unpredictable and confusing consequences could follow. This is usually not a problem in practice, because it is uncommon to use a mutable object such as a List as a key in a HashMap.
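The following sketch illustrates what can go wrong when a key is mutated while it sits in a HashMap. It uses an ArrayList as the key purely for demonstration; the class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MutatedKey {
    // Builds a map with one entry, then mutates the key in place.
    static Map<List<String>, String> mapWithMutatedKey(List<String> key) {
        Map<List<String>, String> map = new HashMap<>();
        key.add("a");
        map.put(key, "value"); // stored under the hash of ["a"]
        key.add("b");          // the key's hash is now that of ["a", "b"]
        return map;
    }

    // The mutated key no longer matches the hash recorded at insertion.
    static boolean foundByMutatedKey() {
        List<String> key = new ArrayList<>();
        return mapWithMutatedKey(key).containsKey(key);
    }

    // A fresh ["a"] has the recorded hash but is not equal to the stored key.
    static boolean foundByOriginalState() {
        List<String> original = new ArrayList<>();
        original.add("a");
        return mapWithMutatedKey(new ArrayList<>()).containsKey(original);
    }

    // The entry is still physically present in the map.
    static int sizeAfterMutation() {
        return mapWithMutatedKey(new ArrayList<>()).size();
    }

    public static void main(String[] args) {
        System.out.println(foundByMutatedKey());    // false
        System.out.println(foundByOriginalState()); // false
        System.out.println(sizeAfterMutation());    // 1
    }
}
```

The entry remains in the map but can no longer be reached through either the mutated key or a list with the key's original contents.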

An example of a simple mutable class that defines equals() and hashCode() based on its state is Point. Two Point objects are equal if they refer to the same (x, y) coordinates, and the hash value of a Point is derived from the IEEE 754 bit representation of the x and y coordinate values.

For more complex classes, the behavior of equals() and hashCode() may even be imposed by the specification of a superclass or interface. For example, the List interface requires that a List object is equal to another object if and only if the other object is also a List and the two contain the same elements (as defined by Object.equals() on the elements) in the same order. The requirements for hashCode() are even more specific: the hashCode() value of a list must conform to the following calculation:

  hashCode = 1;
  Iterator i = list.iterator();
  while (i.hasNext()) {
      Object obj = i.next();
      hashCode = 31*hashCode + (obj==null ? 0 : obj.hashCode());
  }

Not only is the hash value dependent on the contents of the list, but the specific algorithm for combining the hash values of the individual elements is specified as well. (The String class specifies a similar algorithm for computing the hash value of a String.)
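As a quick check of the calculation above, the sketch below recomputes the hash by hand and compares it with what the library returns. The helper names specifiedHash and stringHash are hypothetical; the String loop follows the formula documented for String.hashCode(), which starts from 0 and folds in each character with the same multiplier of 31.

```java
import java.util.Arrays;
import java.util.List;

public class ListHashCheck {
    // Recomputes the hash exactly as the List specification describes.
    static int specifiedHash(List<?> list) {
        int hashCode = 1;
        for (Object obj : list) {
            hashCode = 31 * hashCode + (obj == null ? 0 : obj.hashCode());
        }
        return hashCode;
    }

    // The analogous per-character calculation documented for String.
    static int stringHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        List<Object> list = Arrays.asList("x", null, 3);
        System.out.println(specifiedHash(list) == list.hashCode());         // true
        System.out.println(stringHash("hashing") == "hashing".hashCode());  // true
    }
}
```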

Writing your own equals() and hashCode() methods
Overriding the default equals() method is fairly easy, but overriding an already overridden equals() method can be extremely tricky to do without violating the symmetry or transitivity requirement. When overriding equals(), you should always include Javadoc comments on equals() to help those who might want to extend your class do so correctly.

As a simple example, consider the following class:

  class A {
    final B someNonNullField;
    C someOtherField;
    int someNonStateField;
  }

How would we write the equals() method for this class? The following approach is suitable for many situations:

  public boolean equals(Object other) {
    // Not strictly necessary, but often a good optimization
    if (this == other)
      return true;
    if (!(other instanceof A))
      return false;
    A otherA = (A) other;
    return
      (someNonNullField.equals(otherA.someNonNullField))
        && ((someOtherField == null)
            ? otherA.someOtherField == null
            : someOtherField.equals(otherA.someOtherField));
  }

Now that we have defined equals(), we must define hashCode() in a compatible manner. One compatible, but not very useful, way to define hashCode() is this:

  public int hashCode() { return 0; }

This approach yields terrible performance for a HashMap with a large number of entries, but it does conform to the specification. A more sensible implementation of hashCode() for A would look like this:

  public int hashCode() {
    int hash = 1;
    hash = hash * 31 + someNonNullField.hashCode();
    hash = hash * 31
                + (someOtherField == null ? 0 : someOtherField.hashCode());
    return hash;
  }
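Put together, the equals() and hashCode() methods for A can be exercised as in the sketch below. The article leaves the types B and C unspecified, so String stands in for both here; that substitution, the constructor, and the helper methods are assumptions made only so the example runs.

```java
import java.util.HashSet;
import java.util.Set;

public class ClassADemo {
    // Stand-in for class A; String replaces the unspecified types B and C.
    static final class A {
        final String someNonNullField;
        String someOtherField;   // may be null
        int someNonStateField;   // deliberately ignored by equals() and hashCode()

        A(String nonNull, String other, int nonState) {
            this.someNonNullField = nonNull;
            this.someOtherField = other;
            this.someNonStateField = nonState;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof A)) return false;
            A otherA = (A) other;
            return someNonNullField.equals(otherA.someNonNullField)
                && (someOtherField == null
                    ? otherA.someOtherField == null
                    : someOtherField.equals(otherA.someOtherField));
        }

        @Override
        public int hashCode() {
            int hash = 1;
            hash = hash * 31 + someNonNullField.hashCode();
            hash = hash * 31 + (someOtherField == null ? 0 : someOtherField.hashCode());
            return hash;
        }
    }

    // Instances that differ only in the non-state field are equal and hash alike.
    static boolean equalAndSameHash() {
        A first = new A("key", null, 1);
        A second = new A("key", null, 2);
        return first.equals(second) && first.hashCode() == second.hashCode();
    }

    // A HashSet therefore keeps only one of two state-equal instances.
    static int distinctCount() {
        Set<A> set = new HashSet<>();
        set.add(new A("key", "x", 1));
        set.add(new A("key", "x", 2));
        set.add(new A("key", "y", 1)); // different state
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(equalAndSameHash()); // true
        System.out.println(distinctCount());    // 2
    }
}
```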

Note that both of these implementations delegate part of the computation to the equals() or hashCode() methods of the class's state fields. Depending on your class, you may also want to delegate part of the computation to the equals() or hashCode() method of the superclass. For primitive fields, the associated wrapper classes provide helper functions that can help in creating hash values, such as Float.floatToIntBits.
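For primitive fields, one possible approach is sketched below. Sample is a hypothetical class; it compares floating-point fields by their bit patterns so that equals() and hashCode() agree, and it folds 64-bit values into 32 bits with an XOR of the two halves. This is one reasonable convention, not the only one.

```java
public class PrimitiveFieldHash {
    // Hypothetical value class whose state is entirely primitive.
    static final class Sample {
        final float ratio;
        final double weight;
        final long id;

        Sample(float ratio, double weight, long id) {
            this.ratio = ratio;
            this.weight = weight;
            this.id = id;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Sample)) return false;
            Sample s = (Sample) other;
            // Compare bit patterns so equals() agrees with hashCode():
            // NaN equals NaN, and 0.0f differs from -0.0f.
            return Float.floatToIntBits(ratio) == Float.floatToIntBits(s.ratio)
                && Double.doubleToLongBits(weight) == Double.doubleToLongBits(s.weight)
                && id == s.id;
        }

        @Override
        public int hashCode() {
            long w = Double.doubleToLongBits(weight);
            int hash = 1;
            hash = hash * 31 + Float.floatToIntBits(ratio);
            hash = hash * 31 + (int) (w ^ (w >>> 32));   // fold 64 bits into 32
            hash = hash * 31 + (int) (id ^ (id >>> 32)); // same folding for long
            return hash;
        }
    }

    static boolean nanEqualsNan() {
        Sample a = new Sample(Float.NaN, 1.5, 9L);
        Sample b = new Sample(Float.NaN, 1.5, 9L);
        return a.equals(b) && a.hashCode() == b.hashCode();
    }

    static boolean zeroEqualsNegativeZero() {
        return new Sample(0.0f, 1.5, 9L).equals(new Sample(-0.0f, 1.5, 9L));
    }

    public static void main(String[] args) {
        System.out.println(nanEqualsNan());           // true
        System.out.println(zeroEqualsNegativeZero()); // false
    }
}
```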

Writing an equals() method is not without pitfalls. In general, it is impractical to cleanly override equals() when extending an instantiable class that itself overrides equals(), and writing an equals() method that is intended to be overridden (such as in an abstract class) is done differently from writing an equals() method for a concrete class. For examples and more detail on why this is so, see Effective Java Programming Language Guide, Item 7 (listed in the references).
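A small sketch of the pitfall, using hypothetical Point and ColorPoint classes in the style commonly used to illustrate this problem: the subclass adds state to the comparison, and the "obvious" override breaks symmetry.

```java
public class SymmetryPitfall {
    // Hypothetical instantiable base class with state-based equality.
    static class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }

        @Override
        public int hashCode() {
            return 31 * x + y;
        }
    }

    // Hypothetical subclass that adds a field to the comparison.
    static class ColorPoint extends Point {
        final String color;

        ColorPoint(int x, int y, String color) {
            super(x, y);
            this.color = color;
        }

        @Override
        public boolean equals(Object o) {
            // The "obvious" override: only another ColorPoint can be equal.
            if (!(o instanceof ColorPoint)) return false;
            return super.equals(o) && color.equals(((ColorPoint) o).color);
        }
        // hashCode() is inherited; equal ColorPoints share x and y.
    }

    static boolean pointEqualsColorPoint() {
        return new Point(1, 2).equals(new ColorPoint(1, 2, "red"));
    }

    static boolean colorPointEqualsPoint() {
        return new ColorPoint(1, 2, "red").equals(new Point(1, 2));
    }

    public static void main(String[] args) {
        System.out.println(pointEqualsColorPoint()); // true
        System.out.println(colorPointEqualsPoint()); // false, so symmetry is violated
    }
}
```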

Room for improvement?
Building hashing into the root object class of the Java class library was a very sensible design compromise: it makes using hash-based containers much easier and more efficient. However, several criticisms have been made of the approach to, and implementation of, hashing and equality in the Java class library. The hash-based containers in java.util are convenient and easy to use, but may not be suitable for applications that require very high performance. While most of these issues will never be changed, they are worth keeping in mind when designing applications that rely heavily on the efficiency of hash-based containers. The criticisms include:

  • Too small a hash range. The use of int, instead of long, as the return type of hashCode() increases the possibility of hash collisions.

  • Bad distribution of hash values. The hash values of short strings and small integers are themselves small integers, and are close to the hash values of other "nearby" objects. A better-behaved hash function would distribute hash values more evenly across the hash range.

  • No defined hashing operations. Although some classes, such as String and List, define a hash algorithm for combining the hash values of their constituent elements into a single hash value, the language specification does not define any approved means of combining the hash values of multiple objects into a new hash value. The technique used by List, String, and the example class A discussed earlier is simple, but far from mathematically ideal. Nor do the class libraries offer convenience implementations of any hashing algorithm that would simplify the creation of more sophisticated hashCode() implementations.

  • Difficulty writing equals() when extending an instantiable class that already overrides equals(). The "obvious" ways to define equals() when extending such a class all fail to meet the symmetry or transitivity requirements of the equals() method. This means that you must understand the structure and implementation details of the classes you are extending when overriding equals(), and may even need to expose private fields in the base class as protected, which violates principles of good object-oriented design.
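On the point about convenience implementations, it may be worth noting that releases after this article was written (Java 7 and later) added java.util.Objects, whose hash() and equals() helpers cover the simple field-combining case. The sketch below uses a hypothetical Pair class and assumes such a JDK is available; Objects.hash() applies the same 31-based combination as the hand-written hashCode() for class A, so it addresses convenience rather than hash quality.

```java
import java.util.Objects;

public class ObjectsHelperDemo {
    // Hypothetical two-field value class written with the java.util.Objects helpers.
    static final class Pair {
        final String first;  // non-null
        final String second; // may be null

        Pair(String first, String second) {
            this.first = Objects.requireNonNull(first);
            this.second = second;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Pair)) return false;
            Pair p = (Pair) other;
            // Objects.equals() handles the null case for us.
            return first.equals(p.first) && Objects.equals(second, p.second);
        }

        @Override
        public int hashCode() {
            return Objects.hash(first, second);
        }
    }

    // The hand-written combination used for class A, for comparison.
    static int manualHash(String first, String second) {
        int hash = 1;
        hash = hash * 31 + first.hashCode();
        hash = hash * 31 + (second == null ? 0 : second.hashCode());
        return hash;
    }

    static boolean helperMatchesManual(String first, String second) {
        return new Pair(first, second).hashCode() == manualHash(first, second);
    }

    public static void main(String[] args) {
        System.out.println(helperMatchesManual("k", null)); // true
        System.out.println(helperMatchesManual("k", "v"));  // true
        System.out.println(new Pair("k", null).equals(new Pair("k", null))); // true
    }
}
```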

Summary
By defining equals() and hashCode() consistently, you can improve the usability of your classes as keys in hash-based collections. There are two approaches to defining equality and hash value: identity-based, which is the default provided by Object, and state-based, which requires overriding both equals() and hashCode(). If an object's hash value can change when its state changes, be sure you do not allow its state to change while it is being used as a hash key.
