Thinking in Java: container in-depth study

Source: Internet
Author: User
Tags: addAll, BitSet, Comparable, new, Set, ConcurrentModificationException


1. In the container taxonomy diagram, dashed boxes represent abstract classes. Many class names in the diagram begin with Abstract; these are simply tools that partially implement a particular interface, so you can inherit from the matching Abstract class when you create your own container.

Practical methods in Collections. A few frequently used ones:
1. reverse(List): reverses the order of the elements
2. rotate(List, int distance): moves all elements backward by distance positions, wrapping the end elements around to the front (can be done with three reversals)
3. sort(List, Comparator): sorts according to the supplied Comparator
4. copy(List dest, List src): copies the elements (references) from src to dest
5. fill(List, T x): same as Arrays.fill; every position holds the same reference
6. disjoint(Collection, Collection): returns true if the two collections have no elements in common
7. frequency(Collection, Object x): returns the number of elements in the Collection equal to x
8. binarySearch(): binary search (the list must already be sorted)
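
The methods listed above can be tried out with a short sketch like the following. The class name CollectionsUtilitiesDemo and the sample data are made up for illustration; only the java.util.Collections calls come from the standard library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CollectionsUtilitiesDemo {
    // Returns a fresh, modifiable list so every method starts from the same data.
    static List<Integer> sample() {
        return new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
    }

    static List<Integer> reversed() {
        List<Integer> list = sample();
        Collections.reverse(list);
        return list;
    }

    static List<Integer> rotatedBy(int distance) {
        List<Integer> list = sample();
        // The element at index i moves to index (i + distance) mod size.
        Collections.rotate(list, distance);
        return list;
    }

    static List<Integer> sortedDescending() {
        List<Integer> list = sample();
        Collections.sort(list, Collections.reverseOrder());
        return list;
    }

    public static void main(String[] args) {
        System.out.println(reversed());
        System.out.println(rotatedBy(2));
        System.out.println(sortedDescending());
        System.out.println(Collections.frequency(Arrays.asList(1, 2, 2, 3), 2));
        System.out.println(Collections.disjoint(Arrays.asList(1, 2), Arrays.asList(3, 4)));
        // binarySearch requires the list to be sorted in ascending order.
        System.out.println(Collections.binarySearch(sample(), 4));
    }
}
```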

Functional methods of Collection:
boolean add(T): optional method. Returns false if the argument was not added to the container
boolean addAll(Collection): optional method. Returns true if any element was added
void clear(): optional method
boolean contains(Object): returns true if the container already holds the argument
boolean containsAll(Collection)
boolean isEmpty()
Iterator iterator()
boolean remove(Object): optional method
boolean removeAll(Collection): optional method
boolean retainAll(Collection): keeps only the intersection of the two containers; optional method
int size()
Object[] toArray()
T[] toArray(T[] a): returns an array containing all the elements in the container
PS: There is no get() method for random access here, because Collection also covers Set, and a Set maintains its own internal order (which makes random access meaningless).

2. The methods above that add and remove elements are optional in the Collection interface. This means an implementing class does not have to provide a working definition for them; it may throw UnsupportedOperationException instead.

Unsupported operations:
1. Arrays.asList() produces a List backed by a fixed-size array. Only operations that do not change the size of the array are supported; any method that would change the size of the underlying data structure throws UnsupportedOperationException (here Arrays.asList() returns a wrapper class for List). If you want an ordinary container that supports all the methods, pass the result as an argument to a new container (through addAll(), a constructor, and so on).
2. The "unmodifiable" methods in the Collections class (the Collections.unmodifiable* methods) wrap the container in a proxy class that rejects any attempt to modify the container.
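
A minimal sketch of both cases follows. The class and helper names (UnsupportedOperationsDemo, addIsRejected) are invented for this example; the behavior shown is that of the standard Arrays.asList() and Collections.unmodifiableList() calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class UnsupportedOperationsDemo {
    // Returns true if add() is rejected with UnsupportedOperationException.
    static boolean addIsRejected(List<String> list) {
        try {
            list.add("x");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> fixedSize = Arrays.asList("a", "b", "c");
        fixedSize.set(0, "z");                        // allowed: the size does not change
        System.out.println(addIsRejected(fixedSize)); // size change is rejected

        // Copying into a new container gives a list that supports every method.
        List<String> ordinary = new ArrayList<>(fixedSize);
        System.out.println(addIsRejected(ordinary));

        // The unmodifiable wrapper rejects every modification.
        List<String> readOnly = Collections.unmodifiableList(ordinary);
        System.out.println(addIsRejected(readOnly));
    }
}
```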

Set and storage order:
1: Set (interface): every element added to a Set must define equals() so that uniqueness can be established. The Set interface does not guarantee that it maintains its elements in any particular order
2: HashSet: elements stored in a HashSet must define hashCode()
3: TreeSet: keeps its elements in sorted order; elements must implement the Comparable interface
4: LinkedHashSet: has the lookup speed of a HashSet and internally uses a linked list to maintain the insertion order of the elements; elements must define hashCode()
Strictly speaking, hashCode() is required only for the hash-based sets. As a matter of good programming style, however, you should always override hashCode() whenever you override equals().
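
The ordering differences can be seen with a small sketch such as this one. SetOrderDemo and inOrderOf are hypothetical names, and the sample strings are arbitrary.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetOrderDemo {
    // Adds the same elements (including one duplicate) and returns them
    // in the iteration order of the given Set implementation.
    static List<String> inOrderOf(Set<String> set) {
        Collections.addAll(set, "pear", "apple", "fig", "apple");
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        System.out.println(inOrderOf(new HashSet<String>()));       // order is unspecified
        System.out.println(inOrderOf(new TreeSet<String>()));       // sorted by compareTo()
        System.out.println(inOrderOf(new LinkedHashSet<String>())); // insertion order
    }
}
```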

3. If an object is used in any kind of sorted container, such as a SortedSet (TreeSet is the implementation discussed here), it must implement the Comparable interface.
PS: When implementing compareTo(), do not use the form return i - i2. This is a programming error, because it ignores the possibility that i - i2 overflows. It should be written as:
return (arg.i < i ? -1 : (arg.i == i ? 0 : 1));
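
To make the overflow concrete, here is a small sketch comparing the two styles. The class and method names are made up for illustration; Integer.compare() is the standard library equivalent of the safe form.

```java
class OverflowSafeCompare {
    // Broken: the subtraction overflows when the operands are far apart.
    static int compareBySubtraction(int a, int b) {
        return a - b;
    }

    // Safe: explicit comparisons never overflow.
    static int compareSafely(int a, int b) {
        return a < b ? -1 : (a == b ? 0 : 1);
    }

    public static void main(String[] args) {
        int a = Integer.MIN_VALUE;
        int b = 1;
        // a is smaller than b, yet the subtraction wraps around to a positive number.
        System.out.println(compareBySubtraction(a, b) > 0);
        System.out.println(compareSafely(a, b));
        System.out.println(Integer.compare(a, b));
    }
}
```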

4. If the elements stored in a HashSet do not redefine hashCode(), placing them in any hash-based implementation may produce duplicate values, and this does not even cause a runtime error. The only reliable way to catch the problem is to write a unit test.
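
A sketch of the failure mode, using two hypothetical element classes (NoHash and WithHash) that differ only in whether hashCode() is overridden:

```java
import java.util.HashSet;
import java.util.Set;

class MissingHashCodeDemo {
    // equals() is overridden but hashCode() is not, so two equal objects
    // normally get different (identity-based) hash codes.
    static final class NoHash {
        final int id;
        NoHash(int id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof NoHash && ((NoHash) o).id == id;
        }
    }

    // Both methods are overridden consistently.
    static final class WithHash {
        final int id;
        WithHash(int id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof WithHash && ((WithHash) o).id == id;
        }
        @Override public int hashCode() { return Integer.hashCode(id); }
    }

    static int sizeWithoutHashCode() {
        Set<NoHash> set = new HashSet<>();
        set.add(new NoHash(1));
        set.add(new NoHash(1));
        return set.size();   // typically 2: the "duplicate" is silently accepted
    }

    static int sizeWithHashCode() {
        Set<WithHash> set = new HashSet<>();
        set.add(new WithHash(1));
        set.add(new WithHash(1));
        return set.size();   // the second add is recognized as a duplicate
    }

    public static void main(String[] args) {
        System.out.println(sizeWithoutHashCode());
        System.out.println(sizeWithHashCode());
    }
}
```

A unit test that asserts the expected set size is what would expose the first case.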

5. SortedSet means "elements are sorted according to their comparison function", not "elements are kept in the order in which they were inserted".

6. PriorityQueue: the order of its elements is likewise controlled by implementing Comparable.

Understanding Map:
1: HashMap: an implementation based on a hash table. The capacity and load factor can be set through the constructor to tune the performance of the container.
2: LinkedHashMap: like HashMap, except that when you iterate over it, the key-value pairs come back in insertion order or in least-recently-used (LRU) order. It is only slightly slower than HashMap, except for iteration, where it is faster.
3: TreeMap: an implementation based on a red-black tree. When you view the keys or the key-value pairs, they are sorted (as determined by Comparable or a Comparator). It is the only Map here with a subMap() method, which returns a portion of the tree.
4: WeakHashMap: a map of weak keys that allows the objects referred to by the map to be released. If no reference to a key exists outside the map, that key can be reclaimed by the garbage collector.
5: IdentityHashMap: a hash map that uses == instead of equals() to compare keys.
PS: Every key must have an equals() method. If the key is used in a hashed Map, it must also have a proper hashCode(). If the key is used in a TreeMap, it must implement Comparable.

7. The result of a Map's values() method can be printed directly; the method produces a Collection containing all of the values in the Map. Because these Collections are backed by the Map, any change made to the Collection is reflected in the associated Map.
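
A short sketch of this "backed by the Map" behavior, with a made-up class name and sample data:

```java
import java.util.Map;
import java.util.TreeMap;

class MapViewsDemo {
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new TreeMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        return map;
    }

    // Removes a value through the values() view and returns the underlying map.
    static Map<String, Integer> afterRemovingValue(int value) {
        Map<String, Integer> map = sample();
        map.values().remove(Integer.valueOf(value)); // the matching entry leaves the map too
        return map;
    }

    public static void main(String[] args) {
        System.out.println(sample().values());
        System.out.println(afterRemovingValue(2));
    }
}
```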

8. LinkedHashMap hashes all of its elements for speed, but returns the key-value pairs in insertion order when you traverse it. In addition, a LinkedHashMap can be configured in the constructor to use an access-based least-recently-used (LRU) algorithm, so that elements that have not been accessed appear at the front of the list. This makes it easy to write programs that periodically clean up elements to save space.
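
The access-order constructor can be sketched as follows. The names LruOrderDemo, keysAfterAccess and boundedCache are hypothetical; the three-argument constructor and removeEldestEntry() are part of java.util.LinkedHashMap. The bounded cache is one possible way to do the cleanup, not the only one.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LruOrderDemo {
    static List<String> keysAfterAccess() {
        // Arguments: initial capacity, load factor, accessOrder = true (LRU order).
        Map<String, Integer> map = new LinkedHashMap<>(16, 0.75f, true);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        map.get("a");                          // "a" becomes the most recently used
        return new ArrayList<>(map.keySet());  // least recently used comes first
    }

    // A size-bounded cache: the least recently used entry is dropped on overflow.
    static <K, V> Map<K, V> boundedCache(final int maxEntries) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(keysAfterAccess());
        Map<String, Integer> cache = boundedCache(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");
        cache.put("c", 3);                     // evicts "b", the least recently used
        System.out.println(cache.keySet());
    }
}
```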

Hashing and hash codes:
1. Object's hashCode() method generates a hash code, which by default is typically computed from the object's address.
2. You might think it is enough to write an appropriate hashCode() to override the default version. It still will not work unless you also override equals(), which is likewise part of Object. HashMap uses equals() to determine whether the current key is the same as a key already present in the table.

PS: A correct equals() method must satisfy the following 5 conditions:
1) Reflexivity: x.equals(x) must return true
2) Symmetry: x.equals(y) returns true if and only if y.equals(x) returns true
3) Transitivity: if x.equals(y) and y.equals(z) return true, then x.equals(z) returns true
4) Consistency: if x.equals(y) returns true, then repeated calls to x.equals(y) keep returning true as long as neither x nor y changes
5) For any non-null reference x, x.equals(null) must return false

Hashing for speed:
1. The bottleneck is the speed of looking up the key. One solution is to keep the keys sorted and then use Collections.binarySearch() to perform the lookup.
2. Hashing goes further: it stores the key somewhere so that it can be found very quickly. The fastest data structure for storing a group of elements is an array, so an array is used to represent information about the key. That information is the hash code, which is used to index into the array.
3. Collisions are usually handled by external chaining: the array does not hold the values directly but holds a list of values, and the values in each list are searched linearly with equals().
4. Java's hash tables use a size that is a power of 2. On modern processors, division and remainder are the slowest operations. With a table length that is a power of 2, a bit mask can be used instead of division. Because get() is the most frequently used operation and the % operation that computes the remainder is its most expensive part, using a power of 2 eliminates this overhead.
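
The steps above (hash code as array index, external chaining, power-of-2 mask) can be put together in a minimal sketch. ChainedHashMapSketch is a made-up teaching class with a fixed capacity of 16; the real java.util.HashMap does considerably more, such as mixing the hash bits and resizing.

```java
import java.util.ArrayList;
import java.util.List;

class ChainedHashMapSketch<K, V> {
    private static final int CAPACITY = 16;   // must be a power of two for the mask to work

    private static final class Entry<K, V> {
        final K key;
        V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    // Each array slot (bucket) holds a list of entries: external chaining.
    @SuppressWarnings("unchecked")
    private final List<Entry<K, V>>[] buckets = (List<Entry<K, V>>[]) new List[CAPACITY];

    // For a power-of-two length, masking gives the same bucket as a
    // non-negative remainder, without a division.
    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    public V put(K key, V value) {
        int index = indexFor(key.hashCode(), buckets.length);
        if (buckets[index] == null) {
            buckets[index] = new ArrayList<>();
        }
        for (Entry<K, V> entry : buckets[index]) {   // linear search with equals()
            if (entry.key.equals(key)) {
                V old = entry.value;
                entry.value = value;
                return old;
            }
        }
        buckets[index].add(new Entry<>(key, value));
        return null;
    }

    public V get(K key) {
        int index = indexFor(key.hashCode(), buckets.length);
        if (buckets[index] == null) {
            return null;
        }
        for (Entry<K, V> entry : buckets[index]) {
            if (entry.key.equals(key)) {
                return entry.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ChainedHashMapSketch<Integer, String> map = new ChainedHashMapSketch<>();
        map.put(1, "one");
        map.put(17, "seventeen");   // 17 & 15 == 1, so this collides with key 1
        System.out.println(map.get(1) + " " + map.get(17));
    }
}
```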

9. The most important factor in designing hashCode() is this: whenever hashCode() is called on the same object, it should produce the same value.

If your hashCode() method depends on mutable data in the object, the user must be careful, because changing that data produces a different hash code and therefore a different key.

You should also not make hashCode() depend on unique object information, especially the value of this. That makes for a very bad hashCode(), because it is then impossible to create a new key identical to the one used to put() the original key-value pair.

10. For hashCode() to be useful, it must be fast and meaningful; that is, it must generate the hash code from the contents of the object.

The hash code does not have to be unique. However, hashCode() and equals() together must fully determine the identity of the object.
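
One way to follow these guidelines is sketched below. GridCell is a hypothetical immutable key class; Objects.hash() is used here for brevity and is one of several reasonable ways to combine fields.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class GridCell {
    // Immutable fields: the hash code can never change after construction.
    private final int row;
    private final int column;

    GridCell(int row, int column) {
        this.row = row;
        this.column = column;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof GridCell)) return false;   // also covers o == null
        GridCell other = (GridCell) o;
        return row == other.row && column == other.column;
    }

    @Override
    public int hashCode() {
        // Based only on the contents that equals() compares.
        return Objects.hash(row, column);
    }

    public static void main(String[] args) {
        Map<GridCell, String> labels = new HashMap<>();
        labels.put(new GridCell(1, 2), "start");
        // A new, equal key finds the entry because the hash is content-based.
        System.out.println(labels.get(new GridCell(1, 2)));
    }
}
```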

Performance:
1. Avoid Vector. It remains in the library only to support legacy code (it works alongside the other lists only because it was adapted to the List interface for forward compatibility).
2. Iterating over a TreeSet is usually faster than iterating over a HashSet. Also, insertions into a LinkedHashSet are more expensive than into a HashSet, because of the extra cost of maintaining the linked list.
3. The performance of Hashtable is roughly equivalent to that of HashMap. HashMap is intended to replace Hashtable, so the two use the same underlying storage and lookup mechanism.
4. IdentityHashMap has completely different performance, because it uses == rather than equals() to compare elements.
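
The difference between == and equals() for keys can be shown with a short sketch. IdentityVsEqualsDemo and countKeys are made-up names.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

class IdentityVsEqualsDemo {
    // Inserts two keys that are equal by equals() but are distinct objects.
    static int countKeys(Map<String, Integer> map) {
        String first = new String("key");
        String second = new String("key");
        map.put(first, 1);
        map.put(second, 2);
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(countKeys(new HashMap<String, Integer>()));         // keys compared with equals()
        System.out.println(countKeys(new IdentityHashMap<String, Integer>())); // keys compared with ==
    }
}
```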

Performance factors for HashMap:
Capacity: the number of buckets (slots) in the table.
Initial capacity: the number of buckets the table has when it is created. Both HashMap and HashSet have constructors that let you specify the initial capacity.
Size: the number of entries currently stored in the table.
Load factor: size / capacity.

Both HashMap and HashSet have constructors that let you specify the load factor. When the load reaches the load factor, the container automatically increases its capacity (the number of buckets). It does so by roughly doubling the capacity and redistributing the existing objects into the new set of buckets (this is called rehashing).
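
As a rough illustration of the arithmetic, the sketch below computes the size at which a table would be expected to grow. The helper resizeThreshold is hypothetical and only mirrors the definition load factor = size / capacity; the actual growth policy is an implementation detail of HashMap (which may, for instance, round a requested capacity up to a power of 2).

```java
import java.util.HashMap;
import java.util.Map;

class LoadFactorSketch {
    // Size at which load (size / capacity) reaches the load factor.
    static int resizeThreshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        // Constructor arguments: initial capacity, load factor.
        Map<String, Integer> tuned = new HashMap<>(64, 0.5f);
        tuned.put("example", 1);

        System.out.println(resizeThreshold(16, 0.75f)); // HashMap's documented defaults
        System.out.println(resizeThreshold(64, 0.5f));
    }
}
```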

Synchronization control for a Collection or Map:
1. The Collections class has a way to automatically synchronize an entire container. The syntax is similar to that of the "unmodifiable" methods.

List<String> a = Collections.synchronizedList(new ArrayList<String>(data));
As you can see, it is best to pass the newly created container directly to the appropriate "synchronized" method. That way there is no chance of accidentally exposing the unsynchronized version.
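
A small sketch of the wrapper in use with two threads follows. The class and method names are invented for the example. Note that only individual calls such as add() are synchronized by the wrapper; iteration over the list would still need manual synchronization.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SynchronizedListDemo {
    // Two threads each add perThread elements; returns the final size.
    static int addFromTwoThreads(final int perThread) {
        // The unsynchronized ArrayList is never exposed outside the wrapper.
        final List<Integer> list = Collections.synchronizedList(new ArrayList<Integer>());
        Runnable task = new Runnable() {
            @Override
            public void run() {
                for (int i = 0; i < perThread; i++) {
                    list.add(i);
                }
            }
        };
        Thread first = new Thread(task);
        Thread second = new Thread(task);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(addFromTwoThreads(1000));
    }
}
```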

Fail fast:
The Java containers have a protection mechanism that prevents more than one process from modifying the contents of the same container at the same time. The Java container library uses a fail-fast mechanism for this.

It watches for any changes to the container other than the ones your own process is responsible for. If it detects that someone else has modified the container, it immediately throws ConcurrentModificationException. For example: after an iterator has been obtained from the container, something is added to the container.
ConcurrentHashMap, CopyOnWriteArrayList and CopyOnWriteArraySet use techniques that avoid ConcurrentModificationException.
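
The example in the text (add to the container after obtaining an iterator) can be sketched like this, in a single thread. FailFastDemo and failsFast are hypothetical names.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class FailFastDemo {
    // Returns true if the iterator detects the modification and fails fast.
    static boolean failsFast(List<String> list) {
        list.add("a");
        Iterator<String> it = list.iterator();
        list.add("b");              // modify the container after taking the iterator
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsFast(new ArrayList<String>()));
        // CopyOnWriteArrayList iterates over a snapshot, so no exception is thrown.
        System.out.println(failsFast(new CopyOnWriteArrayList<String>()));
    }
}
```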

Holding references:
1. The java.lang.ref library contains a set of classes that are especially useful when you have large objects that could exhaust memory: SoftReference, WeakReference and PhantomReference. When the object being examined by the garbage collector is reachable only through one of these Reference objects, the different derived classes provide different levels of indirection for the garbage collector.
2. Suppose you want to keep holding a reference to an object so that you can reach it later, but you also want to allow the garbage collector to release it. In that case, use a Reference object. You can keep using the object, and it may still be released when memory is exhausted.
3. The Reference object acts as an intermediary (proxy) between you and the ordinary reference, and there must be no ordinary references to the object. Only then can the object be released.
4. For the differences between SoftReference, WeakReference and PhantomReference and what each is for, see the JVM notes.
5. With SoftReference and WeakReference, you can choose whether to place them on a ReferenceQueue (the tool used for cleanup before an object is reclaimed; see the JVM notes). A PhantomReference can only be built on a ReferenceQueue.
6. WeakHashMap: see the JVM notes. It is used to hold WeakReferences. In such a mapping, only one instance of each value is stored, which saves space. WeakHashMap also allows the garbage collector to clean up keys and values automatically, which makes it very convenient.
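
A minimal sketch of a WeakReference follows. The class name is made up. Garbage collection timing is not deterministic, so the sketch only relies on the one thing that is guaranteed: while an ordinary (strong) reference exists, the object is not released.

```java
import java.lang.ref.WeakReference;

class WeakReferenceSketch {
    static boolean reachableWhileStronglyHeld() {
        Object payload = new Object();                        // ordinary (strong) reference
        WeakReference<Object> ref = new WeakReference<>(payload);
        System.gc();                                          // only a hint to the JVM
        // payload is still strongly referenced here, so get() must return it.
        return ref.get() == payload;
    }

    public static void main(String[] args) {
        System.out.println(reachableWhileStronglyHeld());

        WeakReference<Object> weakOnly = new WeakReference<>(new Object());
        System.gc();
        // With no strong reference left, get() may return null after a collection.
        System.out.println(weakOnly.get() == null ? "released" : "still present");
    }
}
```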

Java 1.0/1.1 containers (for reference):
1. Enumeration: the old version of an iterator. Its interface is smaller than Iterator's, but new code should use Iterator whenever possible. You can produce an Enumeration from a Collection with the Collections.enumeration() method.
2. Hashtable: as mentioned earlier, Hashtable is very similar to HashMap. Prefer HashMap (and when synchronization across threads is needed, there are many other options).
3. BitSet: a good choice if you want to store a large amount of on/off information efficiently. It is efficient only in terms of space, however; if you need efficient access, a BitSet is slightly slower than a native array. The minimum size of a BitSet is that of a long, 64 bits, so if you are storing something smaller, such as 8 bits, a BitSet wastes some space. Like a normal container, a BitSet expands as elements are added. Overall, if you have a fixed set of flags that you can name, EnumSet is generally the better choice, because it lets you work with names rather than numeric bit positions, which reduces errors.
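
The contrast between numeric bit positions and named flags can be sketched as follows. FlagsDemo and the Permission enum are hypothetical.

```java
import java.util.BitSet;
import java.util.EnumSet;

class FlagsDemo {
    // Hypothetical named flags for the EnumSet variant.
    enum Permission { READ, WRITE, EXECUTE }

    // Flags addressed by numeric position: easy to mix up 0, 1 and 2.
    static BitSet bitFlags() {
        BitSet bits = new BitSet();
        bits.set(0);
        bits.set(2);
        return bits;
    }

    // The same flags addressed by name.
    static EnumSet<Permission> namedFlags() {
        return EnumSet.of(Permission.READ, Permission.EXECUTE);
    }

    public static void main(String[] args) {
        System.out.println(bitFlags());
        System.out.println(bitFlags().size());   // bits of storage actually allocated
        System.out.println(namedFlags());
    }
}
```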
