Differences in data structures that are common in Java

Source: Internet
Author: User
Tags array length

A data structure is a particular way of storing multiple pieces of data. There are many such structures: arrays, queues, lists, stacks, hash tables, and so on. They do not perform alike: some insert and query quickly but delete relatively slowly; others delete and insert quickly but query slowly. Choose the one that suits the operations you actually perform.

ArrayList and Vector store data in an array whose length is larger than the number of elements actually stored, which leaves room to add and insert elements. Both allow an element to be accessed directly by index, but inserting data involves memory operations such as shifting array elements, so indexing is fast and insertion is slow. Because Vector uses synchronized methods (thread safety), it usually performs worse than ArrayList. LinkedList stores data in a doubly linked list: accessing an element by index requires traversing forward or backward, but inserting only requires updating the references of the items before and after, so insertion is faster.

Different data structures perform differently for different operations. In the industry this kind of performance analysis is expressed in "Big O" notation. A simple analysis follows.

Performance of ArrayList (an array-based list) for CRUD operations:

1: Insert: capacity expansion is not considered here. When the array does have to expand, performance is worse, because a new array is created and the elements are copied into it.

2: Delete: the elements after the deleted one are all moved forward.

If you delete the last element: 1 operation.

If you delete the first element: N operations.

Average: (N+1)/2 operations.

3: Modify: 1 operation.

4: Query: if you query an element by index, 1 operation. If you query for the first/last occurrence of an element by value: 1 operation if the element is in the first position, N operations if it is in the last position. Average: (N+1)/2 operations.

So an array-based list (ArrayList) is relatively slow at deletion, and relatively fast at insertion at the end, modification, and index-based query.
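The cost of deleting from an array can be sketched as follows. `ArrayShiftDemo` is a hypothetical, minimal array-backed list written only to count element moves; it is not the actual implementation of java.util.ArrayList. It counts shifted elements only, so the figures are one lower than the operation counts above, which include the removal itself.

```java
import java.util.Arrays;

// Hypothetical minimal array-backed list, written only to count element moves.
class ArrayShiftDemo {
    private final int[] data;
    private int size;

    ArrayShiftDemo(int... initial) {
        data = Arrays.copyOf(initial, initial.length);
        size = initial.length;
    }

    // Removes the element at index and returns how many elements were shifted left.
    int removeAt(int index) {
        checkIndex(index);
        int moved = size - index - 1;
        System.arraycopy(data, index + 1, data, index, moved);
        size--;
        return moved;
    }

    int get(int index) {
        checkIndex(index);
        return data[index];
    }

    int size() {
        return size;
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
    }

    public static void main(String[] args) {
        ArrayShiftDemo list = new ArrayShiftDemo(10, 20, 30, 40, 50);
        System.out.println(list.removeAt(4)); // last element: nothing to shift
        System.out.println(list.removeAt(0)); // first element: every remaining element shifts
    }
}
```

Removing from the end moves nothing, while removing from the front moves every remaining element, which is why the average cost grows with the size of the list.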

A simple analysis of LinkedList's performance:

1: Insert: in a singly linked list, inserting at the head takes 1 operation and inserting at the tail takes N operations. In a doubly linked list, inserting at the head or the tail takes 1 operation, but inserting in the middle takes about N/2 operations to reach the position.

2: Modify: N/2 operations.

3: Query: getting the first or last element takes 1 operation. Querying a middle element takes N/2 operations.

4: Delete: N/2 + 1 operations.

Operations on the first and last elements are very fast. For deleting or inserting in the middle, changing the previous and next references is faster than the element shifting that ArrayList performs.
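To illustrate why a linked list only changes references, here is a minimal sketch. `NodeDemo` and its `Node` type are hypothetical names for this example and are not the internals of java.util.LinkedList.

```java
// Hypothetical doubly linked node, only to show which references change.
class NodeDemo {
    static final class Node {
        final int value;
        Node prev;
        Node next;

        Node(int value) {
            this.value = value;
        }
    }

    // Links 'fresh' in directly after 'node'; no other element is moved.
    static void insertAfter(Node node, Node fresh) {
        fresh.prev = node;
        fresh.next = node.next;
        if (node.next != null) {
            node.next.prev = fresh;
        }
        node.next = fresh;
    }

    // Unlinks 'node' by joining its two neighbours to each other.
    static void unlink(Node node) {
        if (node.prev != null) {
            node.prev.next = node.next;
        }
        if (node.next != null) {
            node.next.prev = node.prev;
        }
        node.prev = null;
        node.next = null;
    }

    // Walks forward from 'head' and renders the values as "1,2,3".
    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(n.value);
        }
        return sb.toString();
    }
}
```

Once the position is known, an insert or delete touches a fixed number of references regardless of list length; the N/2 cost in the analysis comes from walking to that position.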

Linear lists, linked lists, and hash tables are common data structures. For Java development, the JDK provides a series of classes that implement these basic data structures. They are all in the java.util package.

List and Set inherit from the Collection interface:
Collection
├List
│├LinkedList
│├ArrayList
│└Vector
│ └Stack
└Set
Map
├Hashtable
├HashMap
└WeakHashMap

Collection interface
Collection is the most basic collection interface. A Collection represents a group of objects, known as its elements. Some collections allow duplicate elements and others do not; some are ordered and others are not. The JDK provides no class that implements Collection directly; instead it provides more specific sub-interfaces, such as List and Set, together with classes that implement them.
By convention, every class that implements Collection should provide two standard constructors: a no-argument constructor that creates an empty collection, and a constructor with a Collection parameter that creates a new collection containing the same elements as the one passed in. The latter constructor lets the user copy a collection.
How do you traverse every element of a collection? Regardless of its actual type, every Collection supports an iterator() method that returns an iterator, which visits the elements one at a time. Typical usage is as follows:
Iterator it = collection.iterator(); // get an iterator
while (it.hasNext()) {
    Object obj = it.next(); // get the next element
}
The two interfaces derived from Collection that are covered here are List and Set.
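With generics, the same traversal can be written without casts. The sketch below assumes Java 5 or later; `IterationDemo` and its method names are made up for this example.

```java
import java.util.Collection;
import java.util.Iterator;

class IterationDemo {
    // Concatenates the elements using an explicit Iterator.
    static String joinWithIterator(Collection<String> items) {
        StringBuilder sb = new StringBuilder();
        Iterator<String> it = items.iterator();
        while (it.hasNext()) {
            sb.append(it.next());
        }
        return sb.toString();
    }

    // Same traversal with the enhanced for loop, which calls iterator() internally.
    static String joinWithForEach(Collection<String> items) {
        StringBuilder sb = new StringBuilder();
        for (String item : items) {
            sb.append(item);
        }
        return sb.toString();
    }
}
```

Both methods accept any Collection, so the caller can pass a List or a Set without changing the traversal code.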

List interface
List is an ordered collection; this interface gives precise control over where each element is inserted. Elements can be accessed by index (the element's position in the list, similar to an array subscript), much like a Java array.
Unlike Set, described below, List allows duplicate elements.
In addition to the iterator() method required by the Collection interface, List provides a listIterator() method that returns a ListIterator. Compared with the standard Iterator interface, ListIterator has additional methods such as add() and set(), which allow elements to be added, removed, and replaced, and the list to be traversed forward or backward.
The common classes that implement the List interface are LinkedList, ArrayList, Vector, and Stack.
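A short sketch of what ListIterator adds over Iterator: replacing and inserting during traversal, and walking backward. `ListIteratorDemo` and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ListIterator;

class ListIteratorDemo {
    // Upper-cases every element in place and inserts a marker right after "b".
    static List<String> edit(List<String> input) {
        List<String> list = new ArrayList<>(input);
        ListIterator<String> it = list.listIterator();
        while (it.hasNext()) {
            String s = it.next();
            it.set(s.toUpperCase()); // replace the element just returned
            if (s.equals("b")) {
                it.add("inserted");  // insert at the current cursor position
            }
        }
        return list;
    }

    // Returns the elements in reverse order by traversing backward from the end.
    static List<String> reversed(List<String> input) {
        List<String> out = new ArrayList<>();
        ListIterator<String> it = input.listIterator(input.size());
        while (it.hasPrevious()) {
            out.add(it.previous());
        }
        return out;
    }
}
```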

LinkedList class
LinkedList implements the List interface and allows null elements. In addition, LinkedList provides extra get, remove, and insert methods that operate at the head or tail of the list. These operations allow a LinkedList to be used as a stack, a queue, or a double-ended queue (deque).
Note that LinkedList is not synchronized. If multiple threads access a list at the same time, you must synchronize access yourself. One approach is to wrap the list in a synchronized list when it is created:
List list = Collections.synchronizedList(new LinkedList(...));
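The head and tail operations can be sketched like this, assuming Java 6 or later, where LinkedList also implements the Deque interface. `LinkedListDemo` is a hypothetical name.

```java
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedList;
import java.util.List;

class LinkedListDemo {
    // Stack use: push and pop both work at the head, so order is last-in-first-out.
    static String asStack() {
        Deque<Integer> stack = new LinkedList<>();
        stack.push(1);
        stack.push(2);
        stack.push(3);
        return "" + stack.pop() + stack.pop() + stack.pop();
    }

    // Queue use: add at the tail, remove from the head, so order is first-in-first-out.
    static String asQueue() {
        Deque<Integer> queue = new LinkedList<>();
        queue.addLast(1);
        queue.addLast(2);
        queue.addLast(3);
        return "" + queue.removeFirst() + queue.removeFirst() + queue.removeFirst();
    }

    // Wraps a LinkedList so that each individual method call is synchronized.
    static List<Integer> newSynchronizedList() {
        return Collections.synchronizedList(new LinkedList<Integer>());
    }
}
```

Note that the synchronized wrapper protects single calls only; iterating over the wrapped list still requires the caller to synchronize on it manually.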

ArrayList class
ArrayList implements a resizable array. It allows all elements, including null.
The size, isEmpty, get, and set methods run in constant time. The add method runs in amortized constant time, that is, adding n elements takes O(n) time. The other methods run in linear time.
Each ArrayList instance has a capacity, which is the size of the array used to store the elements. The capacity increases automatically as elements are added, but the growth policy is not specified. When you need to insert a large number of elements, you can call the ensureCapacity method beforehand to increase the capacity and improve insertion efficiency.
Like LinkedList, ArrayList is unsynchronized.
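A minimal sketch of reserving capacity before a bulk insert. `CapacityDemo` is a hypothetical name; the actual saving depends on the JDK's growth policy, which the specification leaves open.

```java
import java.util.ArrayList;

class CapacityDemo {
    // Fills a list with 0..n-1 after reserving room for all n elements.
    static ArrayList<Integer> fill(int n) {
        ArrayList<Integer> list = new ArrayList<>();
        list.ensureCapacity(n); // reserve room up front so add() need not regrow the array
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }
}
```

When the final size is known at construction time, `new ArrayList<>(n)` has the same effect.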

Vector class
Vector is very similar to ArrayList, but Vector is synchronized. An Iterator created by a Vector has the same interface as one created by an ArrayList. However, even though Vector is synchronized, if one thread has created an iterator and is using it while another thread changes the state of the Vector (for example, by adding or removing elements), a ConcurrentModificationException is thrown the next time an iterator method is called, so this exception must be handled.
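The fail-fast behaviour can be reproduced without threads. The sketch below is a single-threaded simplification (an assumption made to keep the result deterministic): the Vector is modified directly while an iterator is in use. `FailFastDemo` is a hypothetical name.

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.Vector;

class FailFastDemo {
    // Returns true if the iterator detects a change made behind its back.
    static boolean modifyWhileIterating() {
        Vector<String> v = new Vector<>();
        v.add("a");
        v.add("b");
        v.add("c");
        Iterator<String> it = v.iterator();
        it.next();
        v.add("d"); // structural change made outside the iterator
        try {
            it.next(); // the iterator checks for modification here
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
}
```

With several threads the check is best-effort only, so the usual remedy is to synchronize on the Vector for the whole iteration rather than to rely on catching the exception.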

Stack class
Stack extends Vector and implements a last-in-first-out stack. It provides five additional methods that allow a Vector to be used as a stack: the basic push and pop methods; peek, which returns the top element; empty, which tests whether the stack is empty; and search, which finds the position of an element in the stack. A newly created Stack is empty.
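The five methods in one short sketch. `StackDemo` is a hypothetical name; note that search() counts from the top of the stack, starting at 1.

```java
import java.util.Stack;

class StackDemo {
    static String run() {
        Stack<String> s = new Stack<>();
        boolean startsEmpty = s.empty(); // a new Stack is empty
        s.push("a");
        s.push("b");
        s.push("c");
        String top = s.peek();           // looks at the top without removing it
        int pos = s.search("a");         // 1-based distance from the top
        String popped = s.pop();         // removes and returns the top
        return startsEmpty + " " + top + " " + pos + " " + popped + " " + s.size();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "true c 3 c 2"
    }
}
```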

Set interface
Set is a Collection that contains no duplicate elements: for any two elements e1 and e2, e1.equals(e2) is false. A Set contains at most one null element.
It follows that the constructors of a Set are constrained: the set they create must not contain duplicate elements, so duplicates in the Collection passed in are kept only once.
Note: mutable objects must be handled with care. If a mutable element of a Set changes its state so that o1.equals(o2) becomes true for two elements, problems will result.
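A small sketch of the no-duplicates rule, using HashSet as the implementation (a choice made for this example; the text does not name one). `SetDemo` is a hypothetical name.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

class SetDemo {
    // Copying a collection into a set keeps each distinct element once.
    static int distinctCount(Collection<String> items) {
        return new HashSet<>(items).size();
    }

    // add() returns false when an equal element is already present.
    static boolean secondAddSucceeds(String element) {
        Set<String> set = new HashSet<>();
        set.add(element);
        return set.add(element);
    }
}
```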

Map interface
Note that Map does not extend the Collection interface. A Map provides a mapping from keys to values. It cannot contain duplicate keys, and each key maps to at most one value. The Map interface provides three collection views: the contents of a Map can be viewed as a set of keys, a collection of values, or a set of key-value mappings.
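The three views correspond to keySet(), values(), and entrySet(). A minimal sketch, with `MapViewsDemo` and its sample data as assumptions for illustration:

```java
import java.util.HashMap;
import java.util.Map;

class MapViewsDemo {
    static Map<String, Integer> sample() {
        Map<String, Integer> m = new HashMap<>();
        m.put("one", 1);
        m.put("two", 2);
        m.put("three", 3);
        return m;
    }

    // Key view: total number of characters across all keys.
    static int totalKeyLength(Map<String, Integer> m) {
        int total = 0;
        for (String key : m.keySet()) {
            total += key.length();
        }
        return total;
    }

    // Value view: sum of all values.
    static int sumOfValues(Map<String, Integer> m) {
        int sum = 0;
        for (int value : m.values()) {
            sum += value;
        }
        return sum;
    }

    // Entry view: every entry's value matches what get() returns for its key.
    static boolean entriesConsistent(Map<String, Integer> m) {
        for (Map.Entry<String, Integer> e : m.entrySet()) {
            if (!e.getValue().equals(m.get(e.getKey()))) {
                return false;
            }
        }
        return true;
    }
}
```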

Hashtable class
Hashtable implements the Map interface as a hash table that maps keys to values. Any non-null object can be used as a key or as a value.
Use put(key, value) to add data and get(key) to retrieve it. The time cost of these two basic operations is constant, provided the hash function distributes the keys well.
Hashtable tunes performance through two parameters: initial capacity and load factor. The default load factor of 0.75 usually offers a good balance between time and space. Increasing the load factor saves space but increases lookup time, which affects operations such as get and put.
A simple example of using Hashtable is to store the numbers 1, 2, and 3 under the keys "one", "two", and "three":
Hashtable<String, Integer> numbers = new Hashtable<>();
numbers.put("one", 1);
numbers.put("two", 2);
numbers.put("three", 3);
To retrieve a number, say 2, use the corresponding key:
Integer n = numbers.get("two");
System.out.println("two = " + n);
Because the position of a value is determined by computing the hash function of its key, any object used as a key must implement the hashCode and equals methods. Both are inherited from the root class Object. If you use a custom class as a key, take great care. By the contract of the hash function, if two objects are equal, i.e. obj1.equals(obj2) is true, their hashCode values must be the same; but if two objects are different, their hashCode values are not necessarily different. When two different objects have the same hashCode, this is called a collision. Collisions increase the time cost of hash table operations, so define hashCode() so that collisions are as rare as possible, which speeds up hash table operations.
If equal objects have different hashCode values, hash table operations produce unexpected results (for example, get returns null when a value was expected). To avoid this problem, keep one rule in mind: override both equals and hashCode, never just one of them.
Hashtable is synchronized.
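A sketch of a custom key class that overrides both methods consistently. `KeyDemo` and `Point` are hypothetical names, and Objects.hash assumes Java 7 or later.

```java
import java.util.Hashtable;
import java.util.Objects;

class KeyDemo {
    // Hypothetical immutable key type: equal points always share a hashCode.
    static final class Point {
        private final int x;
        private final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof Point)) {
                return false;
            }
            Point other = (Point) o;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y); // built from the same fields as equals
        }
    }

    // Looks a value up with a different but equal key instance.
    static String lookup() {
        Hashtable<Point, String> table = new Hashtable<>();
        table.put(new Point(1, 2), "found");
        return table.get(new Point(1, 2));
    }
}
```

If hashCode were left at the Object default, the second `new Point(1, 2)` would normally hash to a different bucket and the lookup would return null.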

HashMap class
HashMap is similar to Hashtable, except that HashMap is unsynchronized and allows null, both as a value and as a key. However, when a HashMap is treated as a Collection (the values() method returns a Collection), the time cost of iterating over it is proportional to the capacity of the HashMap. Therefore, if iteration performance matters, do not set the initial capacity too high or the load factor too low.
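The difference in null handling can be checked directly. `NullKeyDemo` is a hypothetical name for this sketch.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class NullKeyDemo {
    // HashMap stores one null key and any number of null values.
    static boolean hashMapAcceptsNulls() {
        Map<String, String> map = new HashMap<>();
        map.put(null, "value for null key");
        map.put("key", null);
        return "value for null key".equals(map.get(null))
                && map.containsKey("key")
                && map.get("key") == null;
    }

    // Hashtable rejects a null key with a NullPointerException.
    static boolean hashtableRejectsNullKey() {
        try {
            new Hashtable<String, String>().put(null, "value");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }
}
```

Because get() returns null both for a missing key and for a key mapped to null, containsKey() is the reliable way to tell the two cases apart in a HashMap.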

WeakHashMap class
WeakHashMap is a variant of HashMap that holds its keys through weak references: if a key is no longer referenced from outside the map, its entry can be reclaimed by the garbage collector.

Summary
If you need stack or queue operations, consider using a List. If you need fast insertion and deletion of elements, use LinkedList; if you need fast random access to elements, use ArrayList.
If the program is single-threaded, or a collection is accessed from only one thread, consider the unsynchronized classes, which are more efficient. If multiple threads may operate on a collection at the same time, use the synchronized classes.
Pay special attention to hash table operations: objects used as keys must override the equals and hashCode methods correctly.

Prefer returning the interface rather than the concrete type, for example List rather than ArrayList, so that if ArrayList later has to be replaced with LinkedList, the client code does not need to change. This is programming to an abstraction.
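A minimal sketch of this advice. `NameSource` and its sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class NameSource {
    // The return type is the List interface; ArrayList is an implementation detail
    // that could be swapped for LinkedList without affecting callers.
    static List<String> loadNames() {
        List<String> names = new ArrayList<>();
        names.add("Ada");
        names.add("Linus");
        return names;
    }

    // Client code depends only on List, so it is unaffected by such a swap.
    static int countNames() {
        List<String> names = loadNames();
        return names.size();
    }
}
```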

Synchronization
Vector is synchronized: the methods of this class that access its elements are thread-safe. ArrayList is unsynchronized, so it is not thread-safe. Because synchronization costs execution efficiency, ArrayList is a good choice when you do not need a thread-safe collection, as it avoids the unnecessary performance overhead of synchronization.
Data growth
In terms of internal implementation, both ArrayList and Vector use an array to hold the objects in the collection. When you add an element to either type and the number of elements exceeds the current length of the internal array, the array has to be extended: by default Vector doubles the array length, while ArrayList grows it by 50%. So in the end the collection takes up more space than you actually need. If you are going to store a large amount of data in a collection, Vector therefore has some advantage, because you can avoid unnecessary resource overhead by setting the initial size of the collection and, for Vector, its capacity increment.
Usage patterns
In ArrayList and Vector, it takes the same, constant amount of time to look up data at a specified position (by index) or to add or remove an element at the end of the collection; we write this as O(1). However, if an element is added or removed anywhere else in the collection, the time grows linearly: O(n-i), where n is the number of elements in the collection and i is the index at which the element is added or removed. Why? Because all elements after position i have to be shifted. What does all this mean?
It means that if you only look up elements at particular positions, or only add and remove elements at the end of the collection, you can use either Vector or ArrayList. For other operations, you may want to choose a different collection class. For example, LinkedList takes the same amount of time, O(1), to add or remove an element at any position once that position has been reached, but it is slower at indexing an element: O(i), where i is the index position. ArrayList is also easy to use, because you can simply use an index instead of creating an iterator object. LinkedList also creates an object for each inserted element, so be aware that this brings additional overhead.
Finally, for programs with high efficiency requirements, a plain array is recommended instead of Vector or ArrayList, because using an array avoids synchronization, extra method calls, and unnecessary reallocation of space.
