Data structures and algorithms (1): sequential storage of linear tables, and implementing ArrayList and Vector

Source: Internet
Author: User

As application-level developers, we rarely write data structures and algorithms directly, yet they are everywhere in our development work: the collections framework we use, sorting, searching, and so on. The programming language provides APIs for all of these, but we still need to understand how they work internally in order to use them well.

The data structures introduced in this series include arrays, linked lists, stacks, queues, hash tables, binary trees, binary search trees, balanced binary trees, AVL trees, red-black trees, Huffman trees, tries, heaps, segment trees, k-d trees, and disjoint sets (union-find).

The introduction of data structures will also be interspersed with some of the commonly used collection classes in Java, such as ArrayList, Vector, LinkedList, Stack, Queue, PriorityQueue, ArrayDeque, HashMap, TreeMap and so on.

The algorithms introduced include selection sort, bubble sort, quicksort, insertion sort (direct insertion sort, binary insertion sort), heap sort, merge sort, bucket sort, radix sort, counting sort, binary search, union-find, graph traversal, minimum spanning trees, shortest paths, etc.

Basic Concepts

Let's look at some of the concepts of data structures first.

What is a data structure? The study of the logical relationships among data, how data is stored, and the operations performed on it is what we call data structures.

Logical structure of data

The association relationships that exist between data elements, regardless of where they are stored in the computer, are referred to as the logical structure of the data.

The logical structure of data falls roughly into four kinds:

  • Set: the data elements have no relationship other than "belonging to the same set".
  • Linear structure: a "one-to-one" relationship between data elements.
  • Tree structure: a "one-to-many" relationship between data elements.
  • Graph (or mesh) structure: a "many-to-many" relationship between data elements.

Storage structure of data

For the different logical structures, there are usually two physical storage structures underneath (that is, two ways data elements are laid out in the computer's storage space):

  • Sequential storage structure (a sequential list, backed by an array)
  • Linked storage structure (a linked list)

[Figure: mind map summarizing the concepts above]

Linear Table

Commonly used data structures can be divided into linear structures and nonlinear structures. The main linear structure is the linear table (also called a linear list); the main nonlinear structures are trees and graphs.

From the introduction to logical structures above, we know that the elements of a linear structure have a "one-to-one" relationship: except for the first and last elements, every data element has exactly one predecessor and one successor (note that a circular linked list is also a linear structure, although its first and last elements are connected).

Each element of a linear table must have the same structure (the elements can be simple data or complex data, but complex data must share the same internal structure).

Basic operations of a linear table

  • Initialize
  • Insert an element
  • Insert an element at a specified position
  • Delete an element
  • Delete the element at a specified position
  • Return the element at a specified position
  • Return the position of a specified element
  • Return the length of the linear table
  • Determine whether the linear table is empty
  • Clear the linear table

Linear tables have two main storage structures: sequential storage (a sequential list) and linked storage (a linked list). This article focuses on sequential storage; linked storage is covered in the next article.

Sequential storage means storing the elements of a linear table, in order, in a contiguous set of storage units. In other words, the physical order of the data elements in a sequentially stored linear table matches their logical order. As a result, inserting or deleting an element in the middle of a sequentially stored linear table requires moving the element at that position and all the elements after it.

Inserting a new element in the middle of a sequentially stored linear table

To insert, first move the element at the target position and all the elements after it back one place, freeing a slot for the new element; then write the new element into that slot.

Suppose we insert an element at index=2:

Move the element at index=2 and all the elements after it back one cell, freeing a position for the new element.

Insert the new element at index=2.
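
The two insertion steps can be sketched on a plain int array as follows. This is a minimal illustration, not the article's implementation: the class name InsertShiftDemo and the method insertAt are hypothetical, and the sketch assumes the array still has at least one free slot (expansion is handled separately).

```java
import java.util.Arrays;

final class InsertShiftDemo {
    /**
     * Inserts value at index in a sequential list held in array, where only
     * the first size slots are in use. Assumes size < array.length.
     */
    static void insertAt(int[] array, int size, int index, int value) {
        if (index < 0 || index > size) {
            throw new IndexOutOfBoundsException("index: " + index + ", size: " + size);
        }
        // Step 1: move the element at index and everything after it one slot to the right.
        System.arraycopy(array, index, array, index + 1, size - index);
        // Step 2: write the new element into the freed slot.
        array[index] = value;
    }

    public static void main(String[] args) {
        int[] array = {10, 20, 30, 40, 0}; // 4 elements in use, 1 free slot
        insertAt(array, 4, 2, 99);
        System.out.println(Arrays.toString(array)); // [10, 20, 99, 30, 40]
    }
}
```

The cost is proportional to the number of elements after the insertion point, which is why inserting near the front of a long list is the expensive case.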

Deleting an element in the middle of a sequentially stored linear table

Deleting an element in the middle of a sequentially stored linear table is similar to insertion, with the elements moving in the opposite direction.

Suppose we delete the element at index=2:

Remove the element at index=2.

Move all the elements after index=2 one cell to the left.
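
Deletion can be sketched the same way. Again this is only an illustration under assumptions: DeleteShiftDemo and removeAt are hypothetical names, and the list is modelled as an int array plus a size that the caller tracks.

```java
import java.util.Arrays;

final class DeleteShiftDemo {
    /**
     * Removes the element at index from a sequential list held in array
     * (first size slots in use) and returns the new size.
     */
    static int removeAt(int[] array, int size, int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index + ", size: " + size);
        }
        // Move every element after index one slot to the left, overwriting the removed one.
        System.arraycopy(array, index + 1, array, index, size - index - 1);
        // Clear the slot that is no longer in use.
        array[size - 1] = 0;
        return size - 1;
    }

    public static void main(String[] args) {
        int[] array = {10, 20, 30, 40, 50};
        int size = removeAt(array, 5, 2);
        System.out.println(Arrays.toString(array)); // [10, 20, 40, 50, 0]
        System.out.println(size);                   // 4
    }
}
```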

A sequentially stored linear table keeps its elements in an array, so when the capacity is not enough for a new element, the array must be expanded. Expansion creates a new, larger array and copies the data from the old array into it.

Implementing ArrayList

ArrayList in Java is a sequentially stored linear table. Below we implement a simple ArrayList.

The basic operations of a linear table listed above can be extracted into a Java interface, List.java:

public interface List<T> {
    void add(T t);
    void add(int index, T t);
    T get(int index);
    int indexOf(T t);
    boolean remove(T t);
    T remove(int index);
    void clear();
    int size();
}

The implementation classes are as follows:

import java.util.Arrays;

public class ArrayList<T> implements List<T> {

    private static final int DEFAULT_SIZE = 16;

    private Object[] array;
    private int capacity;
    private int size;

    public ArrayList() {
        capacity = DEFAULT_SIZE;
        array = new Object[capacity];
    }

    public ArrayList(int size) {
        capacity = 1;
        // Set the capacity to the smallest power of 2 that is not less than size
        while (capacity < size) {
            capacity <<= 1;
        }
        array = new Object[capacity];
    }

    /**
     * Expansion
     *
     * @param expectCapacity the minimum capacity required
     */
    private void ensureSize(int expectCapacity) {
        // Expand only if the current capacity is less than the expected capacity
        if (expectCapacity > capacity) {
            // Keep doubling capacity until it is no less than expectCapacity
            while (capacity < expectCapacity) {
                capacity <<= 1;
            }
            array = Arrays.copyOf(array, capacity);
        }
    }

    /**
     * Check for out-of-bounds
     *
     * @param index the index to check
     * @param size  the largest index allowed
     */
    private void checkIndexOutOfBound(int index, int size) {
        if (index < 0 || index > size) {
            throw new IndexOutOfBoundsException("Index out of bounds: " + index);
        }
    }

    /**
     * Add an element at the last position
     *
     * @param t the element to add
     */
    @Override
    public void add(T t) {
        add(size, t);
    }

    /**
     * Add an element at a specific location
     *
     * @param index the position to insert at
     * @param t     the element to add
     */
    @Override
    public void add(int index, T t) {
        checkIndexOutOfBound(index, size);
        // Make room for one more element (size + 1, not capacity + 1,
        // otherwise the array would be doubled on every add)
        ensureSize(size + 1);
        System.arraycopy(array, index, array, index + 1, size - index);
        array[index] = t;
        size++;
    }

    /**
     * Get the element at a location
     *
     * @param index the position to read
     * @return the element at index
     */
    @Override
    @SuppressWarnings("unchecked")
    public T get(int index) {
        checkIndexOutOfBound(index, size - 1);
        return (T) array[index];
    }

    /**
     * Get the location of an element
     *
     * @param t the element to look for
     * @return the index of the first match, or -1 if there is none
     */
    @Override
    public int indexOf(T t) {
        for (int i = 0; i < size; i++) {
            Object obj = array[i];
            if (obj == null && t == null) {
                return i;
            }
            if (obj != null && obj.equals(t)) {
                return i;
            }
        }
        return -1;
    }

    /**
     * Delete an element
     *
     * @param t the element to delete
     * @return true if the element was found and deleted
     */
    @Override
    public boolean remove(T t) {
        int index = indexOf(t);
        if (index != -1) {
            remove(index);
        }
        return index != -1;
    }

    /**
     * Delete the element at a location
     *
     * @param index the position to delete
     * @return the deleted element
     */
    @Override
    @SuppressWarnings("unchecked")
    public T remove(int index) {
        checkIndexOutOfBound(index, size - 1);
        T oldValue = (T) array[index];
        int copySize = size - index - 1;
        // Deleting the last element (index == size - 1) needs no copy
        if (copySize > 0) {
            System.arraycopy(array, index + 1, array, index, copySize);
        }
        // After the elements move left, set the last slot to null to avoid a memory leak
        array[--size] = null;
        return oldValue;
    }

    /**
     * Empty all elements
     */
    @Override
    public void clear() {
        for (int i = 0; i < size; i++) {
            array[i] = null;
        }
        size = 0;
    }

    @Override
    public int size() {
        return size;
    }

    @Override
    public String toString() {
        if (size == 0) {
            return "[]";
        }
        StringBuilder sb = new StringBuilder();
        sb.append('[');
        for (int i = 0; i < size; i++) {
            sb.append(array[i]).append(", ");
        }
        // Drop the trailing ", " (two characters)
        sb.setLength(sb.length() - 2);
        sb.append(']');
        return sb.toString();
    }
}

Comparison with ArrayList in the JDK

The code above implements the basic operations of a linear table; the code comments explain the details, so they are not repeated here. The main differences from the JDK's ArrayList are as follows.

1. foreach iteration

The code above does not support foreach iteration. To support it, the java.lang.Iterable interface needs to be implemented, as follows:

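// Requires java.util.Iterator and java.util.NoSuchElementException,
// and List<T> must extend Iterable<T>.
@Override
public Iterator<T> iterator() {
    return new MyIterator();
}

class MyIterator implements Iterator<T> {
    private int index = 0;

    @Override
    public boolean hasNext() {
        return index < size();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return get(index++);
    }
}

2. Expansion mechanism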

Our implementation keeps doubling the capacity (capacity * 2) until it is no less than expectCapacity. The expansion mechanism of ArrayList in the JDK is different: the default capacity is 10, and when the required capacity exceeds the current capacity, the new capacity is oldCapacity + (oldCapacity >> 1), that is, roughly 1.5 times the old capacity. The JDK ArrayList expansion code (as in JDK 8):

private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity);
}
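
To make the difference between the two growth policies concrete, the sketch below lists the capacities each one passes through before reaching a target. It is a simplified model under assumptions: GrowthDemo and its methods are hypothetical names, and the 1.5x version ignores the minCapacity and MAX_ARRAY_SIZE branches of the JDK code.

```java
import java.util.ArrayList;
import java.util.List;

final class GrowthDemo {
    /** Capacities visited when doubling from start until target is reached. */
    static List<Integer> doubling(int start, int target) {
        requireAtLeastTwo(start);
        List<Integer> steps = new ArrayList<>();
        int capacity = start;
        steps.add(capacity);
        while (capacity < target) {
            capacity <<= 1;
            steps.add(capacity);
        }
        return steps;
    }

    /** Capacities visited when growing by oldCapacity + (oldCapacity >> 1). */
    static List<Integer> oneAndAHalf(int start, int target) {
        requireAtLeastTwo(start);
        List<Integer> steps = new ArrayList<>();
        int capacity = start;
        steps.add(capacity);
        while (capacity < target) {
            capacity = capacity + (capacity >> 1);
            steps.add(capacity);
        }
        return steps;
    }

    // With start < 2 the 1.5x step adds nothing and the loop would never end.
    private static void requireAtLeastTwo(int start) {
        if (start < 2) {
            throw new IllegalArgumentException("start must be at least 2");
        }
    }

    public static void main(String[] args) {
        System.out.println(doubling(16, 100));    // [16, 32, 64, 128]
        System.out.println(oneAndAHalf(10, 100)); // [10, 15, 22, 33, 49, 73, 109]
    }
}
```

Doubling needs fewer copies to reach a given size but can leave more unused slots; the 1.5x policy copies more often and wastes less space.
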
3. Fail-fast mechanism

When we delete a collection element inside a foreach loop, a java.util.ConcurrentModificationException is thrown. For example:

for (int i : list) {
    list.remove(i);
}

ArrayList's parent class AbstractList has a modCount field. Each time the ArrayList is structurally modified (for example, when calling add or remove actually adds or removes elements), modCount is incremented. Each iterator has an expectedModCount field whose initial value is modCount. The iterator's next and remove methods check whether expectedModCount and modCount are equal, and immediately throw a java.util.ConcurrentModificationException if they are not.

A foreach loop is essentially iteration with an iterator. Deleting through the list during the iteration increments modCount, so the following call to next finds that expectedModCount and modCount are unequal and throws the exception.

The workaround is to use the iterator for both the iteration and the delete operation:
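
The check described above can be sketched with a tiny list of our own. This is not the JDK source: FailFastSketch is a hypothetical, heavily simplified class (fixed capacity, int elements only) that keeps just the modCount and expectedModCount bookkeeping.

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical, simplified list used only to illustrate the modCount check. */
final class FailFastSketch implements Iterable<Integer> {
    private final int[] data = new int[16]; // fixed capacity keeps the sketch short
    private int size;
    private int modCount; // incremented on every structural modification

    void add(int value) {
        data[size++] = value;
        modCount++;
    }

    void removeLast() {
        size--;
        modCount++;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int cursor = 0;
            // Snapshot of modCount taken when the iterator is created.
            private final int expectedModCount = modCount;

            @Override
            public boolean hasNext() {
                return cursor != size;
            }

            @Override
            public Integer next() {
                if (modCount != expectedModCount) {
                    throw new ConcurrentModificationException();
                }
                if (cursor >= size) {
                    throw new NoSuchElementException();
                }
                return data[cursor++];
            }
        };
    }

    public static void main(String[] args) {
        FailFastSketch list = new FailFastSketch();
        list.add(1);
        list.add(2);
        list.add(3);
        try {
            for (int value : list) {
                list.removeLast(); // structural change behind the iterator's back
            }
        } catch (ConcurrentModificationException e) {
            System.out.println("fail-fast triggered");
        }
    }
}
```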

Iterator<Integer> iterator = list.iterator();
while (iterator.hasNext()) {
    int i = iterator.next();
    iterator.remove(); // the iterator performs the delete operation
    System.out.println(i);
}
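
The two approaches can be compared side by side in a small runnable sketch using java.util.ArrayList. The class and method names here (RemoveWhileIterating, removeInForeach, removeWithIterator) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

final class RemoveWhileIterating {
    /** Removes even numbers inside a foreach loop; returns true if fail-fast fired. */
    static boolean removeInForeach(List<Integer> list) {
        try {
            for (Integer value : list) {
                if (value % 2 == 0) {
                    list.remove(value); // remove(Object): modifies the list directly
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    /** Removes even numbers through the iterator, which keeps its bookkeeping in sync. */
    static void removeWithIterator(List<Integer> list) {
        Iterator<Integer> iterator = list.iterator();
        while (iterator.hasNext()) {
            if (iterator.next() % 2 == 0) {
                iterator.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> first = new ArrayList<>(Arrays.asList(0, 1, 2, 3, 4));
        System.out.println(removeInForeach(first)); // true

        List<Integer> second = new ArrayList<>(Arrays.asList(0, 1, 2, 3, 4));
        removeWithIterator(second);
        System.out.println(second); // [1, 3]
    }
}
```

One detail to watch with a list of Integer: calling remove with an int argument selects remove(int index) and deletes by position, while an Integer argument selects remove(Object) and deletes by value. The sketch uses an Integer loop variable so that it removes by value.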

This scheme works in a single thread. With multiple threads it can still trigger fail-fast, as in the following code:

private static void iteratorMultipleThread() {
    java.util.ArrayList<Integer> list = new java.util.ArrayList<>();
    // The loop bound was lost in the source; 10 is an assumed value.
    for (int i = 0; i < 10; i++) {
        list.add(i);
    }

    new Thread(new Runnable() {
        @Override
        public void run() {
            Iterator<Integer> iterator = list.iterator();
            while (iterator.hasNext()) {
                System.out.println("iteration element: " + iterator.next());
                try {
                    Thread.sleep(10); // give up the CPU so the other thread can run
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        }
    }).start();

    new Thread(new Runnable() {
        @Override
        public void run() {
            Iterator<Integer> iterator = list.iterator();
            while (iterator.hasNext()) {
                int i = iterator.next();
                if (i % 2 == 0) {
                    iterator.remove();
                    System.out.println("remove element: " + i);
                }
            }
        }
    }).start();
}

Cause Analysis:

From the JDK ArrayList source we can see that each call to list.iterator() creates a new iterator. As described above, each iterator has an expectedModCount whose initial value is modCount.
The iterator's next and remove methods check whether expectedModCount and modCount are equal, and immediately throw a java.util.ConcurrentModificationException if they are not.

We start two threads, so there are two iterators. As soon as one thread performs a remove, the other thread's next iteration step (its call to next) triggers fail-fast.

To fix this, you can put the iterator operations into a synchronized block. For example:
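
Because thread timing makes the failure hard to reproduce on demand, the same interaction can be sketched deterministically in a single thread with two iterators over one java.util.ArrayList. The names TwoIteratorsDemo, reader and remover are made up for this illustration; each iterator stands in for one of the two threads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

final class TwoIteratorsDemo {
    /** Returns true if the reader iterator fails after the other iterator removes an element. */
    static boolean readerFailsAfterRemoval() {
        List<Integer> list = new ArrayList<>(Arrays.asList(0, 1, 2, 3));
        Iterator<Integer> reader = list.iterator();  // plays the role of the iterating thread
        Iterator<Integer> remover = list.iterator(); // plays the role of the removing thread

        reader.next();    // reader has seen element 0
        remover.next();
        remover.remove(); // bumps modCount; only remover's expectedModCount is updated

        try {
            reader.next(); // reader's expectedModCount is now stale
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(readerFailsAfterRemoval()); // true
    }
}
```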

private static void fixIteratorMultipleThread() {
    java.util.ArrayList<Integer> list = new java.util.ArrayList<>();
    // The loop bound was lost in the source; 10 is an assumed value.
    for (int i = 0; i < 10; i++) {
        list.add(i);
    }

    // Dedicated lock object shared by both threads
    // (locking on a String literal is discouraged).
    final Object flag = new Object();

    new Thread(new Runnable() {
        @Override
        public void run() {
            synchronized (flag) {
                Iterator<Integer> iterator = list.iterator();
                while (iterator.hasNext()) {
                    System.out.println("iteration: " + iterator.next());
                    try {
                        Thread.sleep(10); // sleeps while still holding the lock
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
            }
        }
    }).start();

    new Thread(new Runnable() {
        @Override
        public void run() {
            synchronized (flag) {
                Iterator<Integer> iterator = list.iterator();
                while (iterator.hasNext()) {
                    int i = iterator.next();
                    if (i % 2 == 0) {
                        iterator.remove();
                        System.out.println("remove: " + i);
                    }
                }
            }
        }
    }).start();
}

Although synchronized solves this problem, it sacrifices concurrency: there are two threads, but one must finish before the other can run. Instead of synchronized, you can use CopyOnWriteArrayList, which supports concurrent access. For example, the following code keeps inserting elements while the list is being iterated (one thread inserts, the other iterates):

private static void fixIteratorMultipleThread2() {
    CopyOnWriteArrayList<Integer> list = new CopyOnWriteArrayList<>();
    // Swap in the line below to compare against a plain ArrayList:
    // java.util.ArrayList<Integer> list = new java.util.ArrayList<>();
    Thread thread = new Thread(new Runnable() {
        @Override
        public void run() {
            int i = 1;
            while (true) {
                list.add(i++);
                System.out.println(i + "---------add");
            }
        }
    });
    // Daemon thread: the endless insert loop ends when the JVM exits
    thread.setDaemon(true);
    thread.start();

    Thread thread2 = new Thread(new Runnable() {
        @Override
        public void run() {
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            for (int value : list) {
                System.out.println("------" + value);
            }
        }
    });
    thread2.start();
}
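
Why the iteration above does not fail can be shown without threads: an iterator of CopyOnWriteArrayList works on the snapshot of the backing array that existed when the iterator was created, so later additions neither break it nor become visible to it. The sketch below is a single-threaded illustration with made-up names (SnapshotIterationDemo, iterateWhileAdding).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

final class SnapshotIterationDemo {
    /** Adds to the list while iterating over it; returns the values the iteration saw. */
    static List<Integer> iterateWhileAdding(CopyOnWriteArrayList<Integer> list) {
        List<Integer> seen = new ArrayList<>();
        for (Integer value : list) {
            list.add(value + 100); // each add copies the backing array
            seen.add(value);
        }
        return seen;
    }

    public static void main(String[] args) {
        CopyOnWriteArrayList<Integer> list = new CopyOnWriteArrayList<>(Arrays.asList(1, 2, 3));
        System.out.println(iterateWhileAdding(list)); // [1, 2, 3]
        System.out.println(list);                     // [1, 2, 3, 101, 102, 103]
    }
}
```

The price is that every write copies the whole array, so this class suits lists that are read far more often than they are modified.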

Next, the Vector class.

Vector is very similar to ArrayList. The difference is that Vector is thread-safe and ArrayList is not.
Vector achieves thread safety in a simple way: its main methods, including those of its iterator, use the synchronized keyword. The ArrayList fail-fast demo above also applies to Vector: even if the ArrayList is replaced with a Vector, a ConcurrentModificationException is still thrown (the reason is the same and is not repeated here).

Summary

Sequentially stored linear tables such as ArrayList and Vector store their elements in a contiguous set of storage units (an array).

Because a sequentially stored linear table is backed by a fixed-length array, some space is wasted.

The logical structure and the physical structure of a sequentially stored linear table are consistent, so looking up an element by index and reading it are very efficient.

Adding or deleting an element in the middle of an ArrayList or Vector performs poorly, because all the elements after that position have to be moved.

If we know in advance how many elements will be stored, we can pass the capacity to the constructor to save memory, for example new ArrayList(2). With the no-argument constructor, an array of length 10 (DEFAULT_CAPACITY = 10) is created the first time the add method is called.

When multithreading is not a concern, ArrayList is the more efficient choice.

