Overview
Java's built-in concept of hash codes [hash code] is limited to 32 bits, and it provides no separation between hash algorithms and the data they act on, so one algorithm cannot easily be swapped for another. In addition, hashCode implementations written in the built-in style are often of poor quality, in part because they end up depending on other poor-quality hashCode implementations, including those in many JDK classes.
Object.hashCode implementations tend to be very fast, but they are weak at preventing collisions and make no promise about bit dispersion. That still leaves them well suited to hash tables: extra collisions cost only a slight performance penalty, and poor dispersion is easily corrected with a secondary hash function (which all reasonable hash table implementations in Java apply). For the many uses of hashing beyond a simple hash table, however, Object.hashCode almost always falls short, which is why the com.google.common.hash package exists.
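The weak collision resistance mentioned above can be observed with the JDK alone. The following is a minimal sketch (the class and method names are hypothetical, chosen only for illustration): the strings "Aa" and "BB" share a hash code, and because String.hashCode is a simple polynomial, every string built from the same number of such blocks collides as well.

```java
import java.util.HashSet;
import java.util.Set;

public class HashCodeCollisions {
    // Builds every string made of `blocks` two-character blocks,
    // where each block is either "Aa" or "BB".
    static Set<String> collidingStrings(int blocks) {
        Set<String> result = new HashSet<>();
        result.add("");
        for (int i = 0; i < blocks; i++) {
            Set<String> next = new HashSet<>();
            for (String prefix : result) {
                next.add(prefix + "Aa");
                next.add(prefix + "BB");
            }
            result = next;
        }
        return result;
    }

    // Counts how many distinct hashCode values the given strings produce.
    static int distinctHashCodes(Set<String> strings) {
        Set<Integer> codes = new HashSet<>();
        for (String s : strings) {
            codes.add(s.hashCode());
        }
        return codes.size();
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() == "BB".hashCode());
        Set<String> strings = collidingStrings(10);
        System.out.println(strings.size() + " strings, "
            + distinctHashCodes(strings) + " distinct hash code(s)");
    }
}
```

A hash table tolerates this by falling back to equals() within a bucket, but anything that treats the hash value itself as an identity for the data cannot.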
Composition of the hash package
The Javadoc for this package lists many different classes, but it does not make clear how they fit together. Before introducing the individual classes, let's look at the following code example:
    HashFunction hf = Hashing.md5();
    HashCode hc = hf.newHasher()
        .putLong(id)
        .putString(name, Charsets.UTF_8)
        .putObject(person, personFunnel)
        .hash();
HashFunction
HashFunction is a pure, stateless function that maps an arbitrary block of data to a fixed number of bits. It guarantees that equal inputs always produce equal outputs, and that unequal inputs produce unequal outputs as often as possible.
Hasher
An instance of HashFunction can provide a stateful Hasher, which offers a fluent syntax for adding data to the hash computation and then retrieving the hash value. A Hasher accepts any primitive type, byte arrays, slices of byte arrays, character sequences, character sequences in a specific charset, and so on, as well as any object for which a Funnel implementation is given.
Hasher implements the PrimitiveSink interface, which defines a fluent-style API for objects that accept a stream of primitive values.
Funnel
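A Funnel describes how to decompose a particular object type into primitive field values and write them to a PrimitiveSink. For example, suppose we have a class like this:
    class Person {
        final int id;
        final String firstName;
        final String lastName;
        final int birthYear;

        // Constructor added so the final fields are initialized.
        Person(int id, String firstName, String lastName, int birthYear) {
            this.id = id;
            this.firstName = firstName;
            this.lastName = lastName;
            this.birthYear = birthYear;
        }
    }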
The corresponding Funnel implementation might be:
    Funnel<Person> personFunnel = new Funnel<Person>() {
        @Override
        public void funnel(Person person, PrimitiveSink into) {
            // Write each field to the sink in a fixed order.
            into
                .putInt(person.id)
                .putString(person.firstName, Charsets.UTF_8)
                .putString(person.lastName, Charsets.UTF_8)
                .putInt(person.birthYear);
        }
    };
Note: putString("abc", Charsets.UTF_8).putString("def", Charsets.UTF_8) is exactly equivalent to putString("ab", Charsets.UTF_8).putString("cdef", Charsets.UTF_8), because both produce the same sequence of bytes. This can lead to unexpected hash collisions. Adding some form of delimiter between fields helps eliminate them.
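The boundary problem described in this note is a property of any hash that consumes a concatenated byte stream, so it can be reproduced without Guava. The sketch below is an illustration under stated assumptions: it uses the JDK's MessageDigest as a stand-in for a Hasher, the class and method names are hypothetical, and a 4-byte length prefix is only one possible form of delimiter.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

public class DelimiterDemo {
    // Hashes the fields by feeding their UTF-8 bytes back to back,
    // so the boundaries between fields are not part of the input.
    static byte[] hashConcatenated(String... fields) {
        MessageDigest md = newSha256();
        for (String field : fields) {
            md.update(field.getBytes(StandardCharsets.UTF_8));
        }
        return md.digest();
    }

    // Hashes the fields with each one preceded by its byte length
    // (4 bytes, big-endian), which makes the boundaries part of the input.
    static byte[] hashLengthPrefixed(String... fields) {
        MessageDigest md = newSha256();
        for (String field : fields) {
            byte[] bytes = field.getBytes(StandardCharsets.UTF_8);
            md.update(ByteBuffer.allocate(4).putInt(bytes.length).array());
            md.update(bytes);
        }
        return md.digest();
    }

    private static MessageDigest newSha256() {
        try {
            return MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("concatenated digests equal: " + Arrays.equals(
            hashConcatenated("abc", "def"), hashConcatenated("ab", "cdef")));
        System.out.println("length-prefixed digests equal: " + Arrays.equals(
            hashLengthPrefixed("abc", "def"), hashLengthPrefixed("ab", "cdef")));
    }
}
```

With a Guava Hasher, the analogous step would presumably be a putInt(...) of the length before each putString(...) inside the Funnel.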
HashCode
Once a Hasher has been given all of its input, its hash() method returns a HashCode instance (the behavior of calling hash() more than once on the same Hasher is unspecified). HashCode supports equality testing and provides the asInt(), asLong(), and asBytes() methods. In addition, writeBytesTo(array, offset, maxLength) writes the first maxLength bytes of the hash value into the byte array.
Bloom Filter [BloomFilter]
The Bloom filter is an elegant application of hashing, and one that cannot be implemented simply on top of Object.hashCode(). In short, a Bloom filter is a probabilistic data structure that lets you test whether an object is definitely not in the filter, or has probably been added to it. The Wikipedia page on Bloom filters gives a comprehensive introduction, and we also recommend a tutorial on GitHub.
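To make the "definitely not" versus "probably" distinction concrete, here is a minimal sketch of a Bloom filter for strings. It is an illustration only and not Guava's implementation: the class name, the fixed bit count, and the choice of deriving the bit positions from two 32-bit slices of a SHA-256 digest are all assumptions made for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.BitSet;

public class TinyBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    public TinyBloomFilter(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Sets every bit position derived from the element.
    public void put(String element) {
        for (int index : indexesFor(element)) {
            bits.set(index);
        }
    }

    // Returns false if any derived bit is unset (the element was never added);
    // returns true if all are set (the element was added, or other elements
    // happened to set the same bits).
    public boolean mightContain(String element) {
        for (int index : indexesFor(element)) {
            if (!bits.get(index)) {
                return false;
            }
        }
        return true;
    }

    // Derives numHashes bit positions as h1 + i * h2, reduced modulo numBits.
    private int[] indexesFor(String element) {
        ByteBuffer digest = ByteBuffer.wrap(sha256(element));
        int h1 = digest.getInt();
        int h2 = digest.getInt();
        int[] indexes = new int[numHashes];
        for (int i = 0; i < numHashes; i++) {
            indexes[i] = Math.floorMod(h1 + i * h2, numBits);
        }
        return indexes;
    }

    private static byte[] sha256(String s) {
        try {
            return MessageDigest.getInstance("SHA-256")
                .digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TinyBloomFilter filter = new TinyBloomFilter(1024, 3);
        filter.put("alice");
        System.out.println(filter.mightContain("alice"));
        System.out.println(filter.mightContain("carol"));
    }
}
```

The sketch needs several well-dispersed hash values per element, which is more than a single 32-bit Object.hashCode() can reasonably supply.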
The Guava hash package has a built-in Bloom filter implementation that you can use as long as you provide a Funnel. You obtain a BloomFilter<T> with the create(Funnel funnel, int expectedInsertions, double falsePositiveProbability) method; if the last argument is omitted, the default false positive probability [falsePositiveProbability] is 3%. BloomFilter<T> provides boolean mightContain(T) and put(T), whose meanings are self-explanatory.
    // expectedInsertions: how many elements you expect to put into the filter
    BloomFilter<Person> friends = BloomFilter.create(personFunnel, expectedInsertions, 0.01);
    for (Person friend : friendsList) {
        friends.put(friend);
    }

    // much later
    if (friends.mightContain(dude)) {
        // the probability of reaching this point when dude is not a friend is 1%
        // here we could trigger some asynchronous loading while doing a further, exact check
    }
Hashing class
The Hashing class provides a number of hash functions, as well as utility methods for operating on HashCode objects.
Provided hash functions
md5()
murmur3_128()
murmur3_32()
sha1()
sha256()
sha512()
goodFastHash(int bits)
HashCode operations
Method | Description
HashCode combineOrdered(Iterable<HashCode>) | Combines hash codes in an ordered fashion: if two hash codes produced by this method are equal, it is likely that each was computed from the same hash codes in the same order.
HashCode combineUnordered(Iterable<HashCode>) | Combines hash codes in an unordered fashion: if two hash codes produced by this method are equal, it is likely that each was computed from the same hash codes in some order.
int consistentHash(HashCode, int buckets) | Returns a consistent hash of the hash code for the given number of buckets. The method guarantees a minimum of remapping as the number of buckets grows. See the article on consistent hashing.
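The "minimum of remapping" property can be illustrated with a small self-contained sketch. The code below uses the jump consistent hash algorithm published by Lamping and Veach as a stand-in; it is not claimed to be what Guava's consistentHash does internally, and the class and method names are hypothetical.

```java
public class ConsistentHashSketch {
    // Maps a 64-bit key to a bucket in [0, buckets) using jump consistent hash.
    static int bucketFor(long key, int buckets) {
        if (buckets <= 0) {
            throw new IllegalArgumentException("buckets must be positive");
        }
        long b = -1;
        long j = 0;
        while (j < buckets) {
            b = j;
            // Advance a linear congruential generator seeded by the key.
            key = key * 2862933555777941757L + 1;
            // Jump forward to the next candidate bucket.
            j = (long) ((b + 1) * ((double) (1L << 31) / (double) ((key >>> 33) + 1)));
        }
        return (int) b;
    }

    // Counts the keys in [0, keys) whose bucket changes when the
    // number of buckets goes from `from` to `to`.
    static int movedKeys(int keys, int from, int to) {
        int moved = 0;
        for (long key = 0; key < keys; key++) {
            if (bucketFor(key, from) != bucketFor(key, to)) {
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        int keys = 10_000;
        System.out.println("keys moved when growing from 10 to 11 buckets: "
            + movedKeys(keys, 10, 11) + " of " + keys);
    }
}
```

When a bucket is added, each key either stays where it was or moves to the new bucket, so only a fraction of the keys (about one in eleven in this run, if the hash is well distributed) needs to be relocated. A plain key % buckets scheme would instead move most keys.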
Original article. If you repost it, please credit the source: the Concurrent Programming Network (ifeve.com). Link to this article: [Google Guava] 10-Hash