This article is reproduced from http://www.ibm.com/developerworks/cn/java/j-lo-set-operation/index.html#ibm-pcon
In real-world project development there are many objects, and managing them efficiently and conveniently is an important factor in a program's performance and maintainability. Java provides the collections framework to solve this kind of problem. Linear lists, linked lists, hash tables, and the like are common data structures, and the JDK already provides a series of classes that implement them. All of these classes are in the java.util package. Listing 1 shows the relationships between the collection classes.
Listing 1: Relationship between collection classes
Collection
├ List
│ ├ LinkedList
│ ├ ArrayList
│ └ Vector
│   └ Stack
└ Set
Map
├ Hashtable
├ HashMap
└ WeakHashMap
This article summarizes how to use the collections framework. Note that all the code in this article is based on JDK 7.
Collection interfaces
Collection interface
Collection is the most basic collection interface. A Collection represents a group of objects, which are called the Collection's elements. Some Collections allow duplicate elements and others do not; some keep their elements ordered and others do not. The JDK provides no class that implements Collection directly; its classes implement sub-interfaces of Collection, such as List and Set. By convention, every class that implements the Collection interface should provide two standard constructors: a parameterless constructor that creates an empty Collection, and a constructor with a single Collection parameter that creates a new Collection containing the same elements as the one passed in. The latter constructor lets the user copy a Collection.
How do you traverse every element of a Collection?
Whatever the actual type of a Collection, it supports an iterator() method that returns an iterator, which can be used to visit each element of the Collection in turn. Typical usage is as follows:
Iterator it = collection.iterator(); // obtain an iterator
while (it.hasNext()) {
    Object obj = it.next(); // get the next element
}
Two of the interfaces derived from the Collection interface are List and Set.
The main methods provided by the Collection interface are:
- boolean add(Object o) adds an object to the collection;
- boolean remove(Object o) removes the specified object;
- int size() returns the number of elements in the collection;
- boolean contains(Object o) tests whether the specified object is in the collection;
- boolean isEmpty() tests whether the collection is empty;
- Iterator iterator() returns an iterator;
- boolean containsAll(Collection c) tests whether the collection contains all the elements of collection c;
- boolean addAll(Collection c) adds all the elements of collection c to the collection;
- void clear() removes all elements from the collection;
- boolean removeAll(Collection c) removes from the collection the elements that are also in collection c;
- boolean retainAll(Collection c) removes from the collection the elements that are not in collection c.
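The bulk methods listed above can be combined with the copy constructor to compute set-like results without touching the original collection. The following is a minimal sketch; the class name CollectionBasicsDemo and its helper methods are made up for illustration and are not part of the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;

// Hypothetical demo class, not part of the original article.
class CollectionBasicsDemo {
    // Returns the elements of a that also occur in b; a is left untouched.
    static Collection<String> intersection(Collection<String> a, Collection<String> b) {
        Collection<String> result = new ArrayList<String>(a); // copy constructor
        result.retainAll(b);                                   // keep only elements also in b
        return result;
    }

    // Returns the elements of a that do not occur in b; a is left untouched.
    static Collection<String> difference(Collection<String> a, Collection<String> b) {
        Collection<String> result = new ArrayList<String>(a);
        result.removeAll(b);                                   // drop elements that are in b
        return result;
    }

    public static void main(String[] args) {
        Collection<String> a = Arrays.asList("x", "y", "z");
        Collection<String> b = Arrays.asList("y", "z", "w");
        System.out.println(intersection(a, b));                // [y, z]
        System.out.println(difference(a, b));                  // [x]
        System.out.println(a.containsAll(intersection(a, b))); // true
    }
}
```

Copying first matters here because retainAll and removeAll modify the collection they are called on.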
List interface
A List is an ordered Collection; this interface gives precise control over where each element is inserted. The user can access elements by index (the element's position in the List, similar to an array subscript), much as with a Java array. Unlike the Set described below, a List allows duplicate elements.
In addition to the iterator() method required by the Collection interface, List provides a listIterator() method that returns a ListIterator. Compared with the standard Iterator interface, ListIterator has additional methods, such as add(), that allow adding, removing, and setting elements and traversing forward or backward. Common classes that implement the List interface are LinkedList, ArrayList, Vector, and Stack.
The main methods provided by the List interface are:
- void add(int index, Object element) inserts an object at the specified position;
- boolean addAll(int index, Collection c) inserts the elements of collection c at the specified position;
- Object get(int index) returns the element at the specified position in the list;
- int indexOf(Object o) returns the position of the first occurrence of element o;
- Object remove(int index) removes the element at the specified position;
- Object set(int index, Object element) replaces the element at position index with element and returns the element that was replaced.
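As an illustration of what ListIterator adds over Iterator, the sketch below modifies a list while walking it forward and then walks a list backward. The class and method names are hypothetical and chosen only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

// Hypothetical demo class, not part of the original article.
class ListIteratorDemo {
    // Upper-cases every element and inserts a "-" marker after each one.
    static List<String> decorate(List<String> source) {
        List<String> list = new ArrayList<String>(source);
        ListIterator<String> it = list.listIterator();
        while (it.hasNext()) {
            String s = it.next();
            it.set(s.toUpperCase()); // replace the element just returned
            it.add("-");             // insert directly after it
        }
        return list;
    }

    // Collects the elements from the tail to the head.
    static List<String> reversed(List<String> source) {
        List<String> result = new ArrayList<String>();
        ListIterator<String> it = source.listIterator(source.size()); // start past the end
        while (it.hasPrevious()) {
            result.add(it.previous());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(decorate(Arrays.asList("a", "b")));      // [A, -, B, -]
        System.out.println(reversed(Arrays.asList("a", "b", "c"))); // [c, b, a]
    }
}
```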
Map interface
Map does not extend the Collection interface. A Map provides a mapping from keys to values: it cannot contain duplicate keys, and each key maps to at most one value. The Map interface provides three collection views, so the contents of a Map can be treated as a set of keys, a collection of values, or a set of key-value mappings.
The main methods provided by Map are:
- boolean equals(Object o) compares the specified object with this map for equality;
- Object remove(Object key) removes the mapping for the specified key;
- Object put(Object key, Object value) associates the specified value with the specified key.
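The three views correspond to the keySet(), values(), and entrySet() methods of Map. The sketch below, with a made-up class name, touches each of them and also shows that a view is backed by the map, so removing a key through the view removes the mapping.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not part of the original article.
class MapViewsDemo {
    // Adds up all values through the values() view.
    static int sumValues(Map<String, Integer> map) {
        int sum = 0;
        for (Integer v : map.values()) {
            sum += v;
        }
        return sum;
    }

    // Removes a key through the keySet() view and returns the map's new size.
    static int removeKeyThroughView(Map<String, Integer> map, String key) {
        map.keySet().remove(key); // the view is backed by the map itself
        return map.size();
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<String, Integer>();
        map.put("one", 1);
        map.put("two", 2);
        map.put("three", 3);
        for (Map.Entry<String, Integer> e : map.entrySet()) { // key-value view
            System.out.println(e.getKey() + " = " + e.getValue());
        }
        System.out.println(sumValues(map));                   // 6
        System.out.println(removeKeyThroughView(map, "two")); // 2
    }
}
```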
RandomAccess interface
RandomAccess is a marker interface and declares no methods of its own. A List class that implements RandomAccess is generally taken to support fast random access; the primary purpose of the interface is to identify such List implementations. Array-based List implementations implement RandomAccess, while linked-list implementations do not, because an array can be accessed quickly at any position, whereas random access to a linked list requires traversing it. The benefit of the interface is that an application can find out whether the List it is processing supports fast random access and handle different kinds of List differently to improve performance.
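A minimal sketch of the idea: test the list with instanceof RandomAccess and pick an indexed loop or an iterator accordingly. The class RandomAccessDemo and its methods are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

// Hypothetical demo class, not part of the original article.
class RandomAccessDemo {
    // Names the traversal strategy that suits the given list.
    static String strategy(List<?> list) {
        return (list instanceof RandomAccess) ? "indexed" : "iterator";
    }

    // Sums the list, using get(i) only when random access is cheap.
    static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            for (int i = 0, n = list.size(); i < n; i++) {
                total += list.get(i);
            }
        } else {
            for (Integer value : list) { // sequential access through an iterator
                total += value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> array = new ArrayList<Integer>(Arrays.asList(1, 2, 3));
        List<Integer> linked = new LinkedList<Integer>(Arrays.asList(1, 2, 3));
        System.out.println(strategy(array) + " " + sum(array));   // indexed 6
        System.out.println(strategy(linked) + " " + sum(linked)); // iterator 6
    }
}
```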
Introduction to collection classes
LinkedList class
LinkedList implements the List interface and allows null elements. In addition, LinkedList provides extra get, remove, and insert methods that operate on the head or tail of the list. These operations allow a LinkedList to be used as a stack, a queue, or a double-ended queue (deque). Note that LinkedList has no synchronized methods: if multiple threads access a List at the same time, they must synchronize access themselves. One workaround is to construct a synchronized List when the List is created, for example:
List list = Collections.synchronizedList(new LinkedList(...));
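To illustrate the head and tail operations, here is a small sketch that drives one LinkedList as a queue and another as a stack through the Deque interface. The class and method names are hypothetical.

```java
import java.util.Deque;
import java.util.LinkedList;

// Hypothetical demo class, not part of the original article.
class LinkedListDequeDemo {
    // First in, first out: add at the tail, remove from the head.
    static String drainAsQueue(String... items) {
        Deque<String> queue = new LinkedList<String>();
        for (String s : items) {
            queue.addLast(s);
        }
        StringBuilder out = new StringBuilder();
        while (!queue.isEmpty()) {
            out.append(queue.removeFirst());
        }
        return out.toString();
    }

    // Last in, first out: add at the head, remove from the head.
    static String drainAsStack(String... items) {
        Deque<String> stack = new LinkedList<String>();
        for (String s : items) {
            stack.addFirst(s);
        }
        StringBuilder out = new StringBuilder();
        while (!stack.isEmpty()) {
            out.append(stack.removeFirst());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(drainAsQueue("a", "b", "c")); // abc
        System.out.println(drainAsStack("a", "b", "c")); // cba
    }
}
```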
ArrayList class
ArrayList implements a resizable array. It allows all elements, including null. The size, isEmpty, get, and set methods run in constant time. The add method runs in amortized constant time, that is, adding n elements takes O(n) time. The other methods run in linear time.
Each ArrayList instance has a capacity, which is the size of the array used to store the elements. The capacity grows automatically as new elements are added. When a large number of elements is about to be inserted, you can call the ensureCapacity method beforehand to increase the ArrayList's capacity and improve insertion efficiency. Like LinkedList, ArrayList is unsynchronized.
The main methods provided by ArrayList are:
- boolean add(Object o) appends the specified element to the end of the list;
- void add(int index, Object element) inserts the specified element at the specified position in the list;
- boolean addAll(Collection c) appends the specified collection to the end of the list;
- boolean addAll(int index, Collection c) inserts the specified collection at the specified position in the list;
- void clear() removes all the elements from the list;
- Object clone() returns a copy of the list instance;
- boolean contains(Object o) tests whether the list contains the element;
- void ensureCapacity(int minCapacity) increases the capacity of the list, if necessary, so that it can hold at least minCapacity elements;
- Object get(int index) returns the element at the specified position in the list;
- int indexOf(Object elem) returns the index of the specified element in the list;
- int size() returns the number of elements in the list.
Vector class
Vector is very similar to ArrayList, except that Vector is synchronized. An Iterator created by a Vector has the same interface as one created by an ArrayList, but Vector's synchronization does not protect an iteration that is in progress: if an Iterator has been created and is in use while another thread changes the state of the Vector (for example, by adding or removing elements), the next call on the Iterator throws ConcurrentModificationException, so this exception must be handled.
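The fail-fast behavior can be reproduced without threads. The sketch below modifies a Vector behind an iterator's back in a single thread, which keeps the outcome deterministic; with real concurrent threads the detection is best-effort only. The class FailFastDemo is made up for this example.

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.Vector;

// Hypothetical demo class, not part of the original article.
class FailFastDemo {
    // Returns true if the iterator notices a modification made behind its back.
    static boolean iteratorFailsAfterModification() {
        Vector<Integer> vector = new Vector<Integer>();
        vector.add(1);
        vector.add(2);
        vector.add(3);
        Iterator<Integer> it = vector.iterator();
        it.next();
        vector.add(4); // structural change that does not go through the iterator
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removing through the iterator itself is allowed; returns the remaining size.
    static int removeEvensThroughIterator() {
        Vector<Integer> vector = new Vector<Integer>();
        for (int i = 1; i <= 4; i++) {
            vector.add(i);
        }
        for (Iterator<Integer> it = vector.iterator(); it.hasNext();) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
        return vector.size();
    }

    public static void main(String[] args) {
        System.out.println(iteratorFailsAfterModification()); // true
        System.out.println(removeEvensThroughIterator());     // 2
    }
}
```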
Stack class
Stack extends Vector and implements a last-in, first-out stack. Stack provides five additional methods that allow a Vector to be used as a stack: the basic push and pop methods, the peek method, which returns the element at the top of the stack, the empty method, which tests whether the stack is empty, and the search method, which finds the position of an element in the stack. A newly created Stack is empty.
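The five methods can be exercised in a few lines. This is only an illustrative sketch with a made-up class name; note that search returns a 1-based distance from the top of the stack.

```java
import java.util.Stack;

// Hypothetical demo class, not part of the original article.
class StackDemo {
    // Runs the five Stack methods and records what each one returns.
    static String run() {
        Stack<String> stack = new Stack<String>();
        StringBuilder log = new StringBuilder();
        log.append(stack.empty());                 // true: a new stack is empty
        stack.push("a");
        stack.push("b");
        stack.push("c");
        log.append(' ').append(stack.peek());      // c: top element, not removed
        log.append(' ').append(stack.search("a")); // 3: distance from the top, 1-based
        log.append(' ').append(stack.pop());       // c: top element, removed
        log.append(' ').append(stack.size());      // 2
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run()); // true c 3 c 2
    }
}
```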
Set interface
A Set is a Collection that contains no duplicate elements: for any two elements e1 and e2 in it, e1.equals(e2) is false. A Set contains at most one null element. The constructors of a Set are therefore constrained: the Set they create must contain no duplicates, even if the Collection passed in does. Mutable objects must be handled with care: if a mutable element changes its state while it is in a Set, problems can result.
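Both points can be seen with HashSet. The sketch below, whose class and method names are hypothetical, first builds a set from a collection with duplicates and then shows an element becoming unfindable after it is mutated inside the set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class, not part of the original article.
class MutableSetElementDemo {
    // The copy constructor silently drops duplicates.
    static int distinctCount(Collection<String> source) {
        return new HashSet<String>(source).size();
    }

    // Returns {found before mutation, found after mutation}.
    static boolean[] lookupBeforeAndAfterMutation() {
        Set<List<Integer>> set = new HashSet<List<Integer>>();
        List<Integer> element = new ArrayList<Integer>();
        element.add(1);
        set.add(element);
        boolean before = set.contains(element);
        element.add(2); // changes element.hashCode() while it sits in the set
        boolean after = set.contains(element);
        return new boolean[] { before, after };
    }

    public static void main(String[] args) {
        System.out.println(distinctCount(Arrays.asList("a", "a", "b"))); // 2
        boolean[] found = lookupBeforeAndAfterMutation();
        System.out.println(found[0] + " " + found[1]);                   // true false
    }
}
```

The element is still stored in the set after the mutation, but it was filed under its old hash code, so a lookup with the new hash code misses it.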
Hashtable class
Hashtable implements the Map interface with a hash table of key-value mappings. Any non-null object can be used as a key or as a value. Use put(key, value) to add data and get(key) to retrieve it; the time cost of these two basic operations is constant.
Hashtable tunes performance through two parameters, initial capacity and load factor. The default load factor of 0.75 usually gives a good balance between time and space. Increasing the load factor saves space but increases lookup time, which affects operations such as get and put. As a simple example of using Hashtable, put the three numbers 1, 2, and 3 into a Hashtable under the keys "one", "two", and "three"; the code is shown in Listing 2.
Listing 2. Hashtable Example
Hashtable<String, Integer> numbers = new Hashtable<String, Integer>();
numbers.put("one", Integer.valueOf(1));
numbers.put("two", Integer.valueOf(2));
numbers.put("three", Integer.valueOf(3));
To take out a number, such as 2, retrieve it with the corresponding key, as shown in Listing 3.
Listing 3. Reading data from Hashtable
Integer n = (Integer) numbers.get("two");
System.out.println("two = " + n);
Because the position of a value is determined by computing the hash function of its key, any object used as a key must implement the hashCode and equals methods. Both methods are inherited from the root class Object. If you use a custom class as a key, take care: by the definition of the hash function, if two objects are equal, that is, obj1.equals(obj2) is true, their hash codes must be the same, but if two objects are different their hash codes are not necessarily different. When two different objects have the same hash code, this is called a collision. Collisions increase the time cost of hash table operations, so a well-defined hashCode() method speeds up hash table operations.
If equal objects have different hash codes, hash table operations give unexpected results (a get that is expected to find a value returns null). To avoid this problem, override the equals method and the hashCode method together rather than only one of them.
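A minimal sketch of a custom key that overrides both methods consistently. The Point class is invented for this example, and the hash formula 31 * x + y is just one reasonable choice, not a prescribed one.

```java
import java.util.Hashtable;
import java.util.Map;

// Hypothetical demo class, not part of the original article.
class PointKeyDemo {
    // Illustrative immutable key type.
    static final class Point {
        private final int x;
        private final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof Point)) {
                return false;
            }
            Point other = (Point) o;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() {
            return 31 * x + y; // equal points always produce equal hash codes
        }
    }

    // Stores a value under one Point and reads it back with an equal Point.
    static String lookupWithEqualKey() {
        Map<Point, String> table = new Hashtable<Point, String>();
        table.put(new Point(1, 2), "found");
        return table.get(new Point(1, 2));
    }

    public static void main(String[] args) {
        System.out.println(lookupWithEqualKey()); // found
    }
}
```

If hashCode were left at the default from Object, the second Point would usually hash to a different value and the lookup would return null.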
HashMap class
HashMap is similar to Hashtable, except that HashMap is unsynchronized and allows null, both as a value and as a key. However, when a HashMap is treated as a Collection (the values() method returns a Collection), the time cost of iterating over it is proportional to the capacity of the HashMap. Therefore, if iteration performance matters, do not set the initial capacity of the HashMap too high or the load factor too low.
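The difference in null handling can be checked directly. In this sketch (class name made up), HashMap accepts a null key and a null value, while Hashtable rejects a null key with NullPointerException.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

// Hypothetical demo class, not part of the original article.
class NullKeyDemo {
    // HashMap stores one null key and any number of null values.
    static String hashMapAcceptsNulls() {
        Map<String, String> map = new HashMap<String, String>();
        map.put(null, "value for null key");
        map.put("k", null);
        return map.get(null) + " / " + map.containsKey("k");
    }

    // Hashtable throws NullPointerException for a null key.
    static boolean hashtableRejectsNullKey() {
        Map<String, String> table = new Hashtable<String, String>();
        try {
            table.put(null, "v");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(hashMapAcceptsNulls());     // value for null key / true
        System.out.println(hashtableRejectsNullKey()); // true
    }
}
```

Because a HashMap value may be null, containsKey is the reliable way to tell a missing key from a key mapped to null.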
WeakHashMap class
WeakHashMap is a variant of HashMap that holds its keys through weak references: if a key is no longer referenced from outside the map, the key can be reclaimed by the GC.
Collection Class Practice
ArrayList, Vector, and LinkedList all derive from AbstractList, which implements the List interface directly and extends AbstractCollection. ArrayList and Vector are implemented with an array. ArrayList synchronizes none of its methods, so it is not thread-safe; most of Vector's methods are synchronized, making it a thread-safe implementation. LinkedList uses a doubly linked list data structure made up of a series of entries; each entry always contains three parts: the element content, a reference to the predecessor entry, and a reference to the successor entry.
ArrayList needs to expand its capacity when the number of elements exceeds the size of the current array. Expansion involves a large amount of array copying, which ultimately calls System.arraycopy(). Because LinkedList uses a linked structure, it does not need to maintain a capacity, but every added element requires creating a new Entry object and performing more assignments; under frequent calls this has some impact on performance, and the continual creation of new objects consumes a certain amount of resources. Because an array is contiguous, appending at the end only triggers expansion and array copying when space runs out.
ArrayList is array-based, and an array is a contiguous block of memory. Inserting an element at an arbitrary position inevitably means that all elements after that position have to be shifted, so such insertions are inefficient; insert at the tail whenever possible. With LinkedList, inserting data does not degrade performance in this way.
After every successful element deletion, ArrayList has to reorganize its array: the nearer the front the deleted element is, the higher the reorganization cost, and the nearer the end, the lower the cost. To remove an element in the middle, LinkedList has to traverse half of the list.
Listing 4. Code using ArrayList and LinkedList
import java.util.ArrayList;
import java.util.LinkedList;

public class ArrayListAndLinkedList {
    public static void main(String[] args) {
        // Append 5,000,000 elements to an ArrayList
        long start = System.currentTimeMillis();
        ArrayList<Object> list = new ArrayList<Object>();
        Object obj = new Object();
        for (int i = 0; i < 5000000; i++) {
            list.add(obj);
        }
        long end = System.currentTimeMillis();
        System.out.println(end - start);

        // Append 5,000,000 elements to a LinkedList
        start = System.currentTimeMillis();
        LinkedList<Object> list1 = new LinkedList<Object>();
        Object obj1 = new Object();
        for (int i = 0; i < 5000000; i++) {
            list1.add(obj1);
        }
        end = System.currentTimeMillis();
        System.out.println(end - start);

        // Insert 1,000 elements at the head of the ArrayList
        start = System.currentTimeMillis();
        Object obj2 = new Object();
        for (int i = 0; i < 1000; i++) {
            list.add(0, obj2);
        }
        end = System.currentTimeMillis();
        System.out.println(end - start);

        // Insert 1,000 elements at the head of the LinkedList
        start = System.currentTimeMillis();
        Object obj3 = new Object();
        for (int i = 0; i < 1000; i++) {
            list1.add(0, obj3);
        }
        end = System.currentTimeMillis();
        System.out.println(end - start);

        // Remove the first element of the ArrayList
        start = System.currentTimeMillis();
        list.remove(0);
        end = System.currentTimeMillis();
        System.out.println(end - start);

        // Remove an element from inside the LinkedList
        start = System.currentTimeMillis();
        list1.remove(250000);
        end = System.currentTimeMillis();
        System.out.println(end - start);
    }
}
Listing 5. Run output
639
1296
6969
0
0
15
HashMap applies a hash algorithm to the key and maps the hash value to a memory address, so that the data for a key can be fetched directly. In HashMap the underlying data structure is an array, and the so-called memory address is an index into that array. HashMap's high performance depends on the following points:
- The hash algorithm must be efficient;
- Mapping a hash value to a memory address (array index) must be fast;
- The corresponding value can be fetched directly from the memory address (array index).
HashMap is in fact an array of linked lists. Given this linked-list-based implementation, as long as the hashCode() and hash() methods are good enough to keep collisions to a minimum, operating on a HashMap is almost equivalent to random access into an array, and performance is good. If hashCode() or hash() is poor, however, and collisions are frequent, the HashMap in effect degenerates into a few linked lists; operating on it is then equivalent to traversing a linked list, and performance is poor.
One drawback of HashMap is that it is unordered: elements put into a HashMap come out in no particular order when the HashMap is traversed. If you want elements to keep their insertion order, you can use LinkedHashMap instead.
LinkedHashMap extends HashMap and keeps its efficiency; on top of HashMap it adds an internal linked list that records the order of the elements.
HashMap uses a hash algorithm to make put() and get() as fast as possible. TreeMap is a completely different Map implementation. Functionally, TreeMap is more powerful than HashMap: it implements the SortedMap interface, which means it can sort its elements. TreeMap's performance is slightly lower than HashMap's. If elements need to be sorted, HashMap cannot do it, whereas iterating over a TreeMap yields the elements in sorted order. LinkedHashMap orders elements by the order in which they were added to the collection or accessed, while TreeMap orders them by their keys (as determined by a Comparator or by Comparable).
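The three orderings can be compared side by side. The sketch below uses a made-up class MapOrderDemo; the access-ordered variant relies on the LinkedHashMap(int, float, boolean) constructor with its last argument set to true.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class, not part of the original article.
class MapOrderDemo {
    // Fills the given map in a fixed order and reports its iteration order.
    static String keysInOrder(Map<String, Integer> map) {
        map.put("banana", 2);
        map.put("apple", 1);
        map.put("cherry", 3);
        return map.keySet().toString();
    }

    // Access order: the most recently accessed key moves to the end.
    static String accessOrderedKeys() {
        Map<String, Integer> map = new LinkedHashMap<String, Integer>(16, 0.75f, true);
        map.put("banana", 2);
        map.put("apple", 1);
        map.put("cherry", 3);
        map.get("banana"); // touching banana makes it the newest entry
        return map.keySet().toString();
    }

    public static void main(String[] args) {
        // Insertion order: [banana, apple, cherry]
        System.out.println(keysInOrder(new LinkedHashMap<String, Integer>()));
        // Natural key order: [apple, banana, cherry]
        System.out.println(keysInOrder(new TreeMap<String, Integer>()));
        // Access order: [apple, cherry, banana]
        System.out.println(accessOrderedKeys());
    }
}
```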
The code in Listing 6 shows how to use TreeMap to order business data.
Listing 6. Sorting with TreeMap
import java.util.Map;
import java.util.TreeMap;

public class Student implements Comparable<Student> {
    public String name;
    public int score;

    public Student(String name, int score) {
        this.name = name;
        this.score = score;
    }

    // Tells TreeMap how to sort: ascending by score
    @Override
    public int compareTo(Student o) {
        if (o.score < this.score) {
            return 1;
        } else if (o.score > this.score) {
            return -1;
        }
        return 0;
    }

    @Override
    public String toString() {
        StringBuffer sb = new StringBuffer();
        sb.append("name:");
        sb.append(name);
        sb.append(" ");
        sb.append("score:");
        sb.append(score);
        return sb.toString();
    }

    public static void main(String[] args) {
        TreeMap<Student, StudentDetailInfo> map = new TreeMap<Student, StudentDetailInfo>();
        Student s1 = new Student("1", 100);
        Student s2 = new Student("2", 99);
        Student s3 = new Student("3", 97);
        Student s4 = new Student("4", 91);
        map.put(s1, new StudentDetailInfo(s1));
        map.put(s2, new StudentDetailInfo(s2));
        map.put(s3, new StudentDetailInfo(s3));
        map.put(s4, new StudentDetailInfo(s4));

        // Print the students from s4 (inclusive) up to s2 (exclusive)
        Map<Student, StudentDetailInfo> map1 = map.subMap(s4, s2);
        for (Student key : map1.keySet()) {
            System.out.println(key + "->" + map.get(key));
        }
        System.out.println("subMap end");

        // Print the students whose scores are lower than s1's
        map1 = map.headMap(s1);
        for (Student key : map1.keySet()) {
            System.out.println(key + "->" + map.get(key));
        }
        System.out.println("subMap end");

        // Print the students whose scores are not lower than s1's
        map1 = map.tailMap(s1);
        for (Student key : map1.keySet()) {
            System.out.println(key + "->" + map.get(key));
        }
        System.out.println("subMap end");
    }
}

class StudentDetailInfo {
    Student s;

    public StudentDetailInfo(Student s) {
        this.s = s;
    }

    @Override
    public String toString() {
        return s.name + "'s detail information";
    }
}
Listing 7: Running the output
name:4 score:91->4's detail information
name:3 score:97->3's detail information
subMap end
name:4 score:91->4's detail information
name:3 score:97->3's detail information
name:2 score:99->2's detail information
subMap end
name:1 score:100->1's detail information
subMap end
The distinguishing feature of WeakHashMap is that when nothing other than the map itself refers to a key, the map automatically discards the entry. The code in Listing 8 declares two Map objects, a HashMap and a WeakHashMap, and puts the two objects a and b into both maps. When a is removed from the HashMap and both a and b are set to null, the entry for a in the WeakHashMap is reclaimed automatically. The reason is that once a has been removed from the HashMap and the variable a set to null, nothing except the WeakHashMap refers to the a object, so the WeakHashMap discards it automatically. The variable b is also set to null, but the HashMap still refers to the b object, so the WeakHashMap keeps b.
Listing 8. WeakHashMap sample code
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.WeakHashMap;

public class WeakHashMapTest {
    public static void main(String[] args) throws Exception {
        String a = new String("a");
        String b = new String("b");
        Map weakmap = new WeakHashMap();
        Map map = new HashMap();
        map.put(a, "aaa");
        map.put(b, "bbb");
        weakmap.put(a, "aaa");
        weakmap.put(b, "bbb");
        map.remove(a);
        a = null;
        b = null;
        System.gc();
        Iterator i = map.entrySet().iterator();
        while (i.hasNext()) {
            Map.Entry en = (Map.Entry) i.next();
            System.out.println("map:" + en.getKey() + ":" + en.getValue());
        }
        Iterator j = weakmap.entrySet().iterator();
        while (j.hasNext()) {
            Map.Entry en = (Map.Entry) j.next();
            System.out.println("weakmap:" + en.getKey() + ":" + en.getValue());
        }
    }
}
Listing 9: Running the output
map:b:bbb
weakmap:b:bbb
WeakHashMap removes its stale internal entries mainly through the expungeStaleEntries method, which is how it releases memory automatically. Essentially, this method is called whenever the contents of the WeakHashMap are accessed, and it clears the entries whose keys are no longer referenced externally. But if WeakHashMaps are created in advance and never accessed before the GC runs, does their memory never get freed?
Listing 10. WeakHashMapTest1
import java.util.ArrayList;
import java.util.List;
import java.util.WeakHashMap;

public class WeakHashMapTest1 {
    public static void main(String[] args) throws Exception {
        List<WeakHashMap<byte[][], byte[][]>> maps = new ArrayList<WeakHashMap<byte[][], byte[][]>>();
        // The loop bound is missing from this copy of the listing; 1000 is an assumed value.
        for (int i = 0; i < 1000; i++) {
            WeakHashMap<byte[][], byte[][]> d = new WeakHashMap<byte[][], byte[][]>();
            d.put(new byte[1000][1000], new byte[1000][1000]);
            maps.add(d);
            System.gc();
            System.err.println(i);
        }
    }
}
Run the code in Listing 10 without changing any JVM parameters. Because the default Java heap is 64 MB, the program throws an out-of-memory error.
Listing 11. Run output
241
242
243
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at WeakHashMapTest1.main(WeakHashMapTest1.java:10)
Sure enough, this time WeakHashMap does not automatically release the unused memory for us. The code in Listing 12, by contrast, does not run out of memory.
(Reposter's note: I use 64-bit JDK 1.8, and running the program above did not produce an out-of-memory error. The program did run for a long time, however, so the memory was presumably not being freed.)
Listing 12. WeakHashMapTest2
import java.util.ArrayList;
import java.util.List;
import java.util.WeakHashMap;

public class WeakHashMapTest2 {
    public static void main(String[] args) throws Exception {
        List<WeakHashMap<byte[][], byte[][]>> maps = new ArrayList<WeakHashMap<byte[][], byte[][]>>();
        // The loop bound is missing from this copy of the listing; 1000 is an assumed value.
        for (int i = 0; i < 1000; i++) {
            WeakHashMap<byte[][], byte[][]> d = new WeakHashMap<byte[][], byte[][]>();
            d.put(new byte[1000][1000], new byte[1000][1000]);
            maps.add(d);
            System.gc();
            System.err.println(i);
            // Accessing each map through size() lets it expunge its stale entries
            for (int j = 0; j < i; j++) {
                System.err.println(j + " size " + maps.get(j).size());
            }
        }
    }
}
The output shows that the test runs normally, with no out-of-memory problem.
In short, WeakHashMap does not release its unused internal objects automatically when you do nothing with it; it releases them when you access its contents.
WeakHashMap implements weak references because its Entry<K,V> extends WeakReference<K>. The class definition and constructor of WeakHashMap$Entry<K,V> are shown in Listing 13.
Listing 13. WeakHashMap class definition
private static class Entry<K,V> extends WeakReference<K> implements Map.Entry<K,V> {
    Entry(K key, V value, ReferenceQueue<K> queue, int hash, Entry<K,V> next) {
        super(key, queue);
        this.value = value;
        this.hash = hash;
        this.next = next;
    }
    // remainder of the class omitted
}
Note the statement that calls the parent class constructor, "super(key, queue);": only the key is passed in, so the key is weakly referenced, while the value is held by a direct strong reference in this.value. At System.gc(), the byte array used as the key is reclaimed, while the value remains (the value is strongly reachable from the Entry, the Entry from the map, and the map from the ArrayList).
Each pass of the for loop creates a new WeakHashMap. After the put operation, the GC reclaims the byte array held as the key by the WeakReference and reports the event to the ReferenceQueue, but nothing afterwards triggers the WeakHashMap to process the ReferenceQueue. The WeakReference that wrapped the key therefore remains in the WeakHashMap, and so does its value.
So when is the value cleared? Comparing the sample programs in Listing 10 and Listing 12 shows that the call maps.get(j).size() in Listing 12 triggers the reclamation of the values. How? The WeakHashMap source shows that the size method calls expungeStaleEntries, which walks the queue of entries whose keys the JVM has reclaimed and sets each such entry's value to null so that its memory can be reclaimed. The effect is that the key is cleared during GC, and the value is cleared when the WeakHashMap is next accessed after the key has been cleared.
The WeakHashMap class is not synchronized; a synchronized WeakHashMap can be constructed with the Collections.synchronizedMap method. Each key object is stored indirectly as the referent of a weak reference, so a key is removed automatically only after the garbage collector has cleared the weak references to it, both inside and outside the map. Note that the value objects in a WeakHashMap are held by ordinary strong references. Care should therefore be taken that value objects do not strongly refer to their own keys, directly or indirectly, since that prevents the keys from being discarded. A value object may refer to its key indirectly through the WeakHashMap itself: a value object may strongly refer to some other key object whose associated value object, in turn, strongly refers to the key of the first value object.
One way to deal with this problem is to wrap the values themselves in WeakReferences before inserting them, for example m.put(key, new WeakReference(value)), and to unwrap them on each get. The iterators returned by all of this class's collection view methods are fail-fast: if the map is structurally modified at any time after the iterator is created, in any way except through the iterator's own remove method, the iterator throws ConcurrentModificationException. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.
Note that the fail-fast behavior of an iterator cannot be guaranteed: in general, no hard guarantees can be made in the presence of unsynchronized concurrent modification.
Summary
From the preceding introduction and sample code we can conclude the following. If stack or queue operations are involved, consider using a List. For fast insertion and deletion of elements, use LinkedList. For fast random access to elements, use ArrayList. If the program runs in a single-threaded environment, or access happens in only one thread, consider the unsynchronized classes, which are more efficient. If multiple threads may operate on an instance at the same time, use the synchronized classes. Pay special attention to hash table operations: an object used as a key must override the equals and hashCode methods correctly. Try to return the interface rather than the concrete type, for example List rather than ArrayList, so that if ArrayList later has to be replaced by LinkedList the client code does not need to change. This is the idea of programming to an abstraction.
This article shares application-level experience only. Later articles will go into the source-level implementation and the algorithms behind it in depth; interested readers are invited to follow them.
Reposted: the java.util collections explained.