"Java" Java Collection framework brief analysis of source code and data structure--list

Source: Internet
Author: User

Preface

Previously I divided the collections framework into Collection and Map, based on whether the stored content is single-column or double-column (key-value). That distinction is not really correct, because Set is actually a double-column structure underneath.

Looking back at the collections framework now, I can see a lot of things that I could not see before.

I would now split the framework into two parts: List on one side, and Set and Map on the other, since Set and Map are almost the same thing.


First, the data structure

In fact, I cannot go into much depth here.

A data structure is the set of relationships among a group of data elements.

Logical structure -- the logical relationships among the data, which is what "data structure" usually means. Logical structures fall into roughly four kinds: linear structures, set structures, tree structures, and graph structures.

Physical structure -- whichever of the four logical structures is used, the data ultimately has to be stored in physical memory; in other words, every logical structure is built on a physical structure. There are only two physical structures: sequential storage and linked storage. Sequential storage, most commonly an array, needs one contiguous block of memory. Linked storage does not need contiguous memory, but each data object has to hold the memory address of the next one.
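As a rough illustration of the two physical structures, the sketch below stores the same values once in an array and once in a hand-rolled chain of nodes. The names StorageSketch, Cell, sumSequential, sumLinked, and chainOfThree are made up for this example and do not come from the JDK.

```java
final class StorageSketch {
    /** Hypothetical node type: one value plus a reference to the next node. */
    static final class Cell {
        final int value;
        Cell next; // null for the last node in the chain

        Cell(int value) {
            this.value = value;
        }
    }

    /** Sequential storage: elements sit side by side and are reached by index. */
    static int sumSequential(int[] data) {
        int sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    /** Linked storage: each element is reached by following the previous one's reference. */
    static int sumLinked(Cell head) {
        int sum = 0;
        for (Cell c = head; c != null; c = c.next) {
            sum += c.value;
        }
        return sum;
    }

    /** Builds the chain 1 -> 2 -> 3 and returns its head. */
    static Cell chainOfThree() {
        Cell a = new Cell(1);
        Cell b = new Cell(2);
        Cell c = new Cell(3);
        a.next = b;
        b.next = c;
        return a;
    }
}
```

The array version can jump straight to any index, while the chained version can only reach the third value by passing through the first two.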

Second, List

List is a linear list (the name is sometimes translated as "linked list", which is too narrow). Its logical structure is linear, and its physical structure can be either sequential or linked.

In the Java Collections Framework, some List implementations use sequential storage and others use linked storage. The common ArrayList and Vector are sequential lists backed by arrays, while LinkedList is a linked list backed by linked storage.

1. ArrayList and Vector

The similarities and differences between ArrayList and Vector have been discussed before and will not be repeated here (see my earlier and rather superficial post, "Set frame and map base"). Let's go straight to the ArrayList source.

ArrayList stores its data in an array (as are HashSet and HashMap, which come later). An array needs contiguous memory, and in return lookup and modification by index are very efficient: O(1). But because it is array-based, the element type is fixed and the length cannot change after initialization. So what problems does ArrayList have to solve?

A. How is dynamic expansion implemented when the array is no longer large enough?

B. How are additions and deletions carried out, and how well do they perform?

/**
 * The array buffer into which the elements of the ArrayList are stored.
 * The capacity of the ArrayList is the length of this array buffer. Any
 * empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA
 * will be expanded to DEFAULT_CAPACITY when the first element is added.
 */
transient Object[] elementData; // non-private to simplify nested class access

Why is the element type of the array Object? Whether or not you use generics, the elements in the collections framework are stored as Object. This is visible in the source and can also be confirmed through reflection: generics only exist at compile time and have been erased by run time, while reflection reads the runtime information of an object. Interested readers can look into this themselves; I will not go into detail here. Also, remember this array, elementData. There is no need to worry that elementData will cause a NullPointerException. If you pass an initial capacity to the constructor, the array is allocated in the constructor. If you do not, it is an empty array of length 0, and space is allocated when the first element is added. That part of the code is trivial and is not shown here; please read the source yourself.
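The erasure claim can be checked with a small experiment. The sketch below is only an illustration, and ErasureSketch and addIntegerToStringList are hypothetical names: it calls add(Object) reflectively to put an Integer into a List<String>, which the compiler would never allow directly.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

final class ErasureSketch {
    /** Adds an Integer to a List<String> by calling add(Object) reflectively. */
    static List<String> addIntegerToStringList() {
        List<String> names = new ArrayList<>();
        names.add("a");
        try {
            // At run time the method is add(Object); the <String> parameter has been erased.
            Method add = ArrayList.class.getMethod("add", Object.class);
            add.invoke(names, Integer.valueOf(42));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        List<?> names = addIntegerToStringList();
        // Read the element as Object so that no cast to String is attempted.
        Object second = names.get(1);
        System.out.println(names.size() + " elements, second is " + second.getClass().getName());
    }
}
```

The call succeeds because the backing array is an Object[]; a ClassCastException would only appear later, at the point where some code reads that element as a String.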

1.1 Dynamic expansion

Since ArrayList is built on an array, and the element type and length of an array cannot change once it has been initialized, how does ArrayList grow dynamically? When the number of stored elements exceeds the array length and no larger array is supplied, an ArrayIndexOutOfBoundsException occurs. ArrayList's approach is to create a larger array and copy the data over from the original one. With time complexity O(n) and space complexity S(n), this is costly on both counts, and next to its very fast lookup and modification it is painful to watch. The core expansion code is posted below.

private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    // The code above determines the size of the new array.
    // The line below creates the new array and copies the data over from the old one.
    elementData = Arrays.copyOf(elementData, newCapacity);
}

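To get a feel for the growth rate, the sketch below repeats only the oldCapacity + (oldCapacity >> 1) step from grow. It deliberately ignores the minCapacity and MAX_ARRAY_SIZE checks, and GrowthSketch, nextCapacity, and capacities are names invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class GrowthSketch {
    /** Same arithmetic as the grow method: the old capacity plus half of it. */
    static int nextCapacity(int oldCapacity) {
        return oldCapacity + (oldCapacity >> 1);
    }

    /** Capacities reached by growing repeatedly from the given start (hypothetical helper). */
    static List<Integer> capacities(int start, int steps) {
        List<Integer> result = new ArrayList<>(steps + 1);
        int capacity = start;
        result.add(capacity);
        for (int i = 0; i < steps; i++) {
            capacity = nextCapacity(capacity);
            result.add(capacity);
        }
        return result;
    }

    public static void main(String[] args) {
        // Starting from 10, the capacities are 10, 15, 22, 33, 49.
        System.out.println(capacities(10, 4));
    }
}
```

Each expansion adds roughly 50%, so the number of full copies grows only logarithmically with the number of elements, even though every single copy is O(n).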
Arrays.copyOf (source below) creates the new array and copies the data, but it is not the end of the trail. Arrays.copyOf(elementData, newCapacity) takes care of creating the new array, using reflection when the array type is not Object[]. The actual copying is done by a native method, System.arraycopy, which copies data efficiently. If the src and dest arrays are the same array, it behaves as if the data went through a temporary array; if they are different, it copies directly (the implementation is C++ and is not posted here). This method shows up in every sequential storage structure in the Java Collections Framework. It shines in sequential storage, so remember it and use it!

public static <T,U> T[] copyOf(U[] original, int newLength, Class<? extends T[]> newType) {
    @SuppressWarnings("unchecked")
    T[] copy = ((Object)newType == (Object)Object[].class)
        ? (T[]) new Object[newLength]
        : (T[]) Array.newInstance(newType.getComponentType(), newLength);
    // Above: create the new array (by reflection unless the type is Object[]).
    // Below: copy the data with the native static method System.arraycopy.
    System.arraycopy(original, 0, copy, 0,
                     Math.min(original.length, newLength));
    return copy;
}
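Both calls are easy to try in isolation. The following is a minimal sketch, with ArrayCopySketch, grow, and shiftRight as made-up names: the first method grows a String[] with Arrays.copyOf, and the second uses System.arraycopy with the same array as source and destination, which is the overlapping case ArrayList relies on.

```java
import java.util.Arrays;

final class ArrayCopySketch {
    /** Grows an array with Arrays.copyOf; the extra slots are filled with null. */
    static String[] grow(String[] source, int newLength) {
        return Arrays.copyOf(source, newLength);
    }

    /** Shifts the elements at indexes [from, count) one slot to the right, in place. */
    static void shiftRight(Object[] data, int from, int count) {
        // Source and destination ranges overlap; System.arraycopy is specified to
        // behave as if the source range had first been copied to a temporary array.
        System.arraycopy(data, from, data, from + 1, count - from);
    }
}
```

For example, grow(new String[] {"a", "b"}, 4) returns a String[] of length 4 (the component type is preserved), and shiftRight on {"a", "b", "c", null} with from = 1 and count = 3 leaves {"a", "b", "b", "c"}, with index 1 ready to be overwritten.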

ArrayList is an array implementation on contiguous memory; take a look at its three constructors. Given its physical storage characteristics and its implementation, my advice when using this structure is to pass the number of elements to the constructor if you know it in advance. Ideally ArrayList is used as a mostly static structure that is only searched. Do not keep changing it, because that is too inefficient; if you really need frequent changes, please choose another structure.
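Passing the expected size to the constructor looks like this. It is only a sketch of the advice above, with PresizeSketch and filled as hypothetical names; ArrayList(int initialCapacity) itself is one of the three real constructors.

```java
import java.util.ArrayList;
import java.util.List;

final class PresizeSketch {
    /** Fills a list whose backing array is allocated once, up front. */
    static List<Integer> filled(int expectedSize) {
        // The argument is the initial capacity, not the size: the list starts empty.
        List<Integer> list = new ArrayList<>(expectedSize);
        for (int i = 0; i < expectedSize; i++) {
            list.add(i); // no expansion is needed while size stays within expectedSize
        }
        return list;
    }
}
```

With the capacity known up front, none of the add calls has to go through the create-and-copy path described in the previous section.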

1.2 Data additions and deletions

When elementData is too small, it expands itself. When elementData is large enough, how do adding, deleting, and modifying work? Before every add, ArrayList checks whether the array is large enough and expands it if not; what happens once there is enough room is the topic of this section.

ArrayList has two kinds of add. The relative one appends directly to the end (using the member variable size, the number of objects currently stored in the array) with time complexity O(1), ignoring the occasional expansion. The absolute one inserts the data at a specified position with worst-case time complexity O(n), because the data from that position onward must be shifted back by one before the new element is put in.

// Relative: append after the existing data
public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    elementData[size++] = e; // see here
    return true;
}

// Absolute: store the data at the specified position
public void add(int index, E element) {
    rangeCheckForAdd(index);
    // expand dynamically if needed
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    // Look here: very similar to the Arrays.copyOf implementation.
    // Yes, this is System.arraycopy, the native method that shines in Java's sequential storage.
    // With enough capacity, shift the data from index onward back by one, then put the element in.
    // Compared with linked storage, this is inefficient.
    System.arraycopy(elementData, index, elementData, index + 1,
                     size - index);
    elementData[index] = element;
    size++;
}

Adding data at a given position is already inefficient, so what about deleting? Deletion is inefficient too. There are two kinds of deletion: deleting the data at a specified position, and deleting a given object. For a specified position, the data behind it only has to be moved forward by one to overwrite it, with worst-case time complexity O(n). Note that this method returns the deleted value, which is very useful and is something only List offers. For a specified object, the array first has to be traversed to find where the object is stored, and then the data behind it is moved forward. The search and the shift together touch about n elements, so the time complexity is still O(n), but every step of the search costs an equals call on top of the shift. So if you know the position, prefer deleting by position. And of course, do not forget System.arraycopy in Java's sequential storage.

// Delete at a specified position
public E remove(int index) {
    rangeCheck(index);

    modCount++;
    E oldValue = elementData(index);

    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // clear to let GC do its work

    return oldValue;
}

// Delete a specified object: entry point
public boolean remove(Object o) {
    if (o == null) {
        for (int index = 0; index < size; index++)
            if (elementData[index] == null) {
                fastRemove(index); // find where the object is stored, then delete
                return true;
            }
    } else {
        for (int index = 0; index < size; index++)
            if (o.equals(elementData[index])) {
                fastRemove(index); // find where the object is stored, then delete
                return true;
            }
    }
    return false;
}

// Delete a specified object: the real implementation
private void fastRemove(int index) {
    modCount++;
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // clear to let GC do its work
}
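The two remove overloads are easy to mix up when the list holds Integer values. A small sketch, with RemoveSketch and its method names invented for this example, shows how the argument type selects between them:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RemoveSketch {
    /** A fresh, modifiable list containing 10, 20, 30, 1. */
    static List<Integer> sample() {
        return new ArrayList<>(Arrays.asList(10, 20, 30, 1));
    }

    /** An int argument selects remove(int index): deletes position 1 and returns the old value. */
    static Integer removeAtIndexOne(List<Integer> list) {
        return list.remove(1);
    }

    /** An Integer argument selects remove(Object o): deletes the first element equal to 1. */
    static boolean removeValueOne(List<Integer> list) {
        return list.remove(Integer.valueOf(1));
    }
}
```

On the sample list, removeAtIndexOne returns 20 and leaves 10, 30, 1, while removeValueOne returns true and leaves 10, 20, 30.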

2. LinkedList

ArrayList is based on sequential storage, while LinkedList is based on linked storage and is a doubly linked list. The former needs contiguous memory; the latter does not, but each element has to hold references to the next element and the previous one (except at the two ends, where the missing neighbor is null).

What do we need to discuss about LinkedList? It is definitely an oddity: it is a linked list, a first-in-first-out queue, and a last-in-first-out stack all at once. Here I only discuss it as a list, otherwise there is too much to cover.

A. How references to memory addresses are implemented.

B. Adding, deleting, modifying, and finding.
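The three roles correspond to three interfaces that LinkedList implements: List, Queue, and Deque. A quick sketch of each role follows; LinkedListRolesSketch and its method names are hypothetical, while the interface methods are the standard ones.

```java
import java.util.Deque;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

final class LinkedListRolesSketch {
    /** As a List: positional access. */
    static String asList() {
        List<String> list = new LinkedList<>();
        list.add("a");
        list.add("b");
        list.add("c");
        return list.get(1);
    }

    /** As a first-in-first-out queue: the first element offered is the first polled. */
    static String asQueue() {
        Queue<String> queue = new LinkedList<>();
        queue.offer("a");
        queue.offer("b");
        return queue.poll();
    }

    /** As a last-in-first-out stack: the last element pushed is the first popped. */
    static String asStack() {
        Deque<String> stack = new LinkedList<>();
        stack.push("a");
        stack.push("b");
        return stack.pop();
    }
}
```

Here asList returns "b", asQueue returns "a", and asStack returns "b".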

2.1 A data structure linked by address references

A class in Java is a custom data type, and a variable of a class type holds a reference to an object. Most importantly, once an object has been instantiated, memory has been allocated for it and the reference to it stays valid for as long as the object is in use. Assigning one object variable to another therefore just copies the reference (for local variables, the value held on the stack), so both variables point to the same object in memory. That is how an address reference is implemented so smoothly.

Of course, a collection can only hold objects, and a stored object lives in heap memory. So how can the values we actually care about be chained together? Every value that is stored is wrapped: the value itself goes into a data field, and the references to the previous and the next node in the list are kept in other fields. Let's look directly at the source, at Node, the nested class defined in LinkedList.

private static class Node<E> {
    E item;
    Node<E> next;
    Node<E> prev;

    Node(Node<E> prev, E element, Node<E> next) {
        this.item = element;
        this.next = next;
        this.prev = prev;
    }
}

It is that simple. The incoming data is wrapped in a Node object, which means it is assigned to the node's item field, and the references to the neighboring data are stored in next and prev.
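To see the wrapping in action outside of LinkedList, here is a minimal sketch that links three nodes by hand and then walks them in both directions. NodeSketch, buildChain, and walkBothWays are made-up names, and the Node class is a simplified stand-in for the private one shown above.

```java
final class NodeSketch {
    /** Simplified stand-in for LinkedList's private Node class. */
    static final class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(Node<E> prev, E element, Node<E> next) {
            this.item = element;
            this.next = next;
            this.prev = prev;
        }
    }

    /** Links a <-> b <-> c by hand and returns the head node. */
    static Node<String> buildChain() {
        Node<String> a = new Node<>(null, "a", null);
        Node<String> b = new Node<>(a, "b", null);
        a.next = b;
        Node<String> c = new Node<>(b, "c", null);
        b.next = c;
        return a;
    }

    /** Walks forward to the tail, then backward to the head, collecting the items. */
    static String walkBothWays(Node<String> head) {
        StringBuilder out = new StringBuilder();
        Node<String> tail = head;
        for (Node<String> n = head; n != null; n = n.next) {
            out.append(n.item);
            tail = n;
        }
        for (Node<String> n = tail; n != null; n = n.prev) {
            out.append(n.item);
        }
        return out.toString();
    }
}
```

Calling walkBothWays(buildChain()) yields "abccba": the forward pass follows next, the backward pass follows prev.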

2.2 Adding, deleting, modifying, and finding

For efficiency, LinkedList keeps two member variables, first and last. As a linked storage structure, LinkedList is quite strong at appending, inserting, and deleting, and much better at these than a sequential storage structure.

Adding, deleting, and modifying share the same logic and have time complexity O(1) once the node has been located. Anyone who understands how a linked list works will have no trouble, the source can be understood in seconds, and you could probably write it without looking. Adding is used as the example here. There are again two kinds of add: appending an element, and adding an element at a specified position.

// Append an element
void linkLast(E e) {
    final Node<E> l = last;
    // the incoming data is wrapped in a Node
    final Node<E> newNode = new Node<>(l, e, null);
    last = newNode;
    if (l == null)
        first = newNode;
    else
        l.next = newNode;
    size++;
    modCount++;
}

// Add an element at a specified position
public void add(int index, E element) {
    checkPositionIndex(index);

    // inserting at the end follows the same logic as add(e)
    if (index == size)
        linkLast(element);
    // otherwise the node has to be linked in, which is just a few reference reassignments
    else
        linkBefore(element, node(index));
}

void linkBefore(E e, Node<E> succ) {
    // assert succ != null;
    final Node<E> pred = succ.prev;
    // the constructor points newNode's next at succ
    final Node<E> newNode = new Node<>(pred, e, succ);
    succ.prev = newNode;
    if (pred == null)
        first = newNode;
    else
        // what used to point at succ now points at newNode
        pred.next = newNode;
    size++;
    modCount++;
}

Finding does not perform nearly as well. The time complexity is O(n). Finding by index, shown below, walks from whichever end is closer to the index; finding by element walks down from the head node until it reaches an equal element.

Node<E> node(int index) {
    // assert isElementIndex(index);

    if (index < (size >> 1)) {
        Node<E> x = first;
        for (int i = 0; i < index; i++)
            x = x.next;
        return x;
    } else {
        Node<E> x = last;
        for (int i = size - 1; i > index; i--)
            x = x.prev;
        return x;
    }
}

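The cost of node(index) can be read straight off its two loops. The sketch below only counts the link hops the method would make; NodeLookupSketch and hops are names invented for this illustration and are not part of LinkedList.

```java
final class NodeLookupSketch {
    /** Number of link hops node(index) performs in a list of the given size. */
    static int hops(int index, int size) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        if (index < (size >> 1)) {
            return index;            // walking forward from first
        } else {
            return size - 1 - index; // walking backward from last
        }
    }
}
```

For a list of 10 elements, indexes 0 and 9 need no hops, while indexes 4 and 5 need 4 hops each. The worst case is about size / 2, which is still O(n).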
That more or less covers the basics of List.

Third, Comparison

What is being compared is really the pros and cons of sequential storage versus linked storage.

Add: ArrayList is O(1) for a relative add and O(n) for an absolute add, which needs System.arraycopy. For LinkedList, add(E e) is O(1) thanks to the last node. add(int, E) first has to walk to the given index; the insertion itself only reassigns a few references and is O(1), but once the walk is included the whole operation comes to O(n).

Delete: ArrayList is O(n) and needs System.arraycopy. For LinkedList the unlinking alone is O(1), but the cost of finding the node cannot be ignored, so it ends up O(n) as well.

Modify: ArrayList is O(1) by position and O(n) by element, because it has to traverse and compare. LinkedList has to traverse whether a position or an element is given; the modification itself is only O(1), but with the traversal it comes to O(n).

Find: ArrayList is O(1) by position and O(n) by element; LinkedList is O(n) either way.

In the end it comes down to the overhead of copying arrays versus the overhead of traversing nodes. Choose the structure according to the characteristics of the business data: if the data is stable, with frequent lookups and few additions and deletions, choose ArrayList; otherwise choose LinkedList. If both kinds of operation occur and neither clearly dominates, I still suggest ArrayList, because it looks nicer.



Note:

If there are mistakes in this article, please do not hesitate to correct them. Thank you!
