Rereading Introduction to Algorithms: Foundations of Algorithms


Insertion Sort

Insertion sort is an efficient algorithm for small amounts of data. The principle:

  1. Throughout the process, the elements of the array are divided into two parts: a sorted part A and an unsorted part B
  2. In each insertion step, one value is taken from the unsorted part B and inserted into the sorted part A
  3. The insertion step compares the element taken from B against the elements of A from largest to smallest; as long as an element of A is larger than the element taken from B, that element of A is moved one position to the right

The code is as follows:

```java
public static void insertionSort(int[] arr) {
    // An array that is null or has only one element does not need sorting
    if (arr == null || arr.length <= 1) {
        return;
    }
    // Assume the first element already forms the sorted part A; take each element
    // from the unsorted part B in turn and insert it into A by comparison
    for (int j = 1; j < arr.length; j++) {
        int key = arr[j];
        int i = j - 1;
        for (; i >= 0; i--) {
            if (arr[i] > key) {
                arr[i + 1] = arr[i];
            } else {
                break;
            }
        }
        arr[i + 1] = key;
    }
}
```

A point that is easy to get wrong: the final statement must be arr[i + 1] = key. Consider the case where every element in A is smaller than the key taken from B: the inner loop breaks immediately, i is still j - 1, so i + 1 = j and arr[i + 1] = key correctly leaves the key where it already is. Hence arr[i + 1] = key.
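A quick usage check (the main method and the sample values below are my own addition, not from the original article):

```java
public static void main(String[] args) {
    int[] arr = {31, 41, 59, 26, 41, 58};
    insertionSort(arr);
    System.out.println(java.util.Arrays.toString(arr)); // [26, 31, 41, 41, 58, 59]
}
```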

Loop invariants

Loop invariants are mainly used to help us understand why an algorithm is correct. To prove correctness with a loop invariant, we must show that it satisfies three properties:

  1. Initialization: the invariant is true before the first iteration of the loop
  2. Maintenance: if the invariant is true before an iteration of the loop, it remains true before the next iteration
  3. Termination: when the loop ends, the invariant gives us a useful property that helps prove the algorithm is correct

A loop invariant is similar to mathematical induction. The difference is that in induction the inductive step is applied infinitely, while here the loop terminates and the induction stops.

Verifying insertion sort with the loop invariant
    • Initialization: as the code above shows, before the loop the sorted part A contains only one element, and a one-element array is of course sorted. That is, the loop invariant holds at initialization

    • Maintenance: consider one iteration of the loop. The key is taken from the unsorted part B. While arr[i] > key, position i + 1 in A is not the right place for key, so arr[i] is moved one position to the right (arr[i + 1] = arr[i]). This continues until some arr[i] <= key is found, at which point arr[i + 1] is the correct position for key and arr[i + 1] = key is executed. Because A was already sorted, shifting the larger elements to the right does not disturb its order, and once key is inserted at the correct position, the new sorted part A plus key still satisfies the loop invariant

    • Termination: in the code, j is the index of the leftmost element of the unsorted part B. The loop ends when j = arr.length, that is, when A contains all the elements of the array, and the invariant tells us they are sorted. So the invariant holds at termination as well, which completes the correctness argument (the sketch below checks these properties at run time)
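As a sketch of my own (not from the original article), the three properties can be checked mechanically with Java assertions: the invariant "arr[0..j-1] is sorted" is asserted before every iteration of the outer loop and once more at termination. Run with `java -ea` to enable assertions.

```java
// A minimal sketch, assuming the insertion sort above; the helper
// isSortedPrefix and this checked variant are my own additions.
public static void insertionSortChecked(int[] arr) {
    if (arr == null || arr.length <= 1) {
        return;
    }
    for (int j = 1; j < arr.length; j++) {
        // Initialization (j == 1) and maintenance (j > 1) of the invariant
        assert isSortedPrefix(arr, j) : "invariant broken before iteration j=" + j;
        int key = arr[j];
        int i = j - 1;
        for (; i >= 0 && arr[i] > key; i--) {
            arr[i + 1] = arr[i];
        }
        arr[i + 1] = key;
    }
    // Termination: the whole array is the sorted part A
    assert isSortedPrefix(arr, arr.length) : "array not sorted at termination";
}

private static boolean isSortedPrefix(int[] arr, int len) {
    for (int i = 1; i < len; i++) {
        if (arr[i - 1] > arr[i]) {
            return false;
        }
    }
    return true;
}
```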

The complete flowchart of the sorting process is as follows (the original figure is not reproduced here):

Analyzing algorithms

Two major concepts are involved:

  1. Input size: the best notion depends on the problem being studied. For many problems, such as sorting or computing discrete Fourier transforms, the most natural measure is the number of items in the input. For many other problems, such as multiplying two integers, the best measure of input size is the total number of bits needed to represent the input in ordinary binary notation. Sometimes the input size is better described by more than one number: if an algorithm's input is a graph, for example, the input size is more appropriately described by the numbers of vertices and edges in the graph.
  2. Running time: the number of primitive operations or steps the algorithm performs. The notion of a step is defined so as to be machine-independent.

We can use \(c_i\) to denote the cost of executing line \(i\) of the code, and \(t_j\) to denote the number of times a line executes for a given \(j\). Assume the input array has size \(n\). Here is a line-by-line cost analysis of the insertion sort code above:

| Code | Cost | Times executed |
|------|------|----------------|
| `for (int j = 1; j < arr.length; j++) {` | \(c_1\) | \(n\) |
| `int key = arr[j];` | \(c_2\) | \(n - 1\) |
| `int i = j - 1;` | \(c_3\) | \(n - 1\) |
| `for (; i >= 0; i--) { if (arr[i] > key) {` | \(c_4\) | \(\sum_{j=2}^{n} t_j\) |
| `arr[i + 1] = arr[i];` | \(c_5\) | \(\sum_{j=2}^{n} (t_j - 1)\) |
| `arr[i + 1] = key;` | \(c_6\) | \(n - 1\) |

Note: the header of the for loop executes one more time than its body, because the loop test is evaluated once more on the final check that exits the loop.

From the table above we can see that the key factors affecting the algorithm's efficiency are lines 4 and 5. In the best case the array is already sorted: every \(t_j\) is 1, and the total time is a linear function of \(n\). In the worst case the array is exactly in reverse order: the number of times lines 4 and 5 execute depends on \(j\), the sum works out to \((2 + 3 + 4 + \dots + n)\), and the result grows as \(n^2\).
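To spell out the worst-case arithmetic (a worked step of my own, consistent with the table above): with the array in reverse order, \(t_j = j\), so

\[\sum_{j=2}^{n} t_j = \sum_{j=2}^{n} j = \frac{n(n+1)}{2} - 1,\]

and the \(c_4\) and \(c_5\) terms contribute on the order of \(n^2\). In the best case \(t_j = 1\), line 5 never runs, and the total \(c_1 n + (c_2 + c_3 + c_4 + c_6)(n - 1)\) is linear in \(n\).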

Worst-case and average-case analysis

The best case of an algorithm often only reflects wishful thinking and has little practical research value. We care more about the worst case and the average case. The worst case is an upper bound on the program's running time, so it prepares us for the worst. The average case tells us the running time most likely to occur when the input is not deliberately constructed, and can be regarded as the time cost closest to natural use. In many cases the average case is roughly as bad as the worst case. For insertion sort, for example, we can assume that lines 4 and 5 run about \(j/2\) times on average, and the result is still found to be proportional to \(n^2\).
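Sketching that average-case claim (my own worked step, under the stated assumption that the inner loop scans about half of the sorted part):

\[\sum_{j=2}^{n} \frac{j}{2} = \frac{1}{2}\left(\frac{n(n+1)}{2} - 1\right) \approx \frac{n^2}{4},\]

which is still quadratic in \(n\).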

Order of growth

What really interests us is not the exact value of the polynomial expressing the running time, but the rate of growth (order of growth) of the running time. For a polynomial running-time expression, we therefore keep only its highest-order term. Here we introduce the Theta notation: \(\Theta\).

We write the worst-case running time of insertion sort as \(\Theta(n^2)\). If one algorithm has a lower order of growth than another, we usually consider the algorithm with the lower order of growth to be more efficient.

Designing algorithms: divide and conquer

Sometimes a problem is hard to solve as a whole, so we consider dividing the big problem into smaller ones. If the small problems are small enough to be solved directly, and the solutions of the small problems are finally merged, the whole problem is solved. I think this is much like today's division of labor: a company's business involves many aspects, each of which can be regarded as a small-scale problem. The company develops normally as long as everyone performs their own duties and solves the problem on their own side, while the boss grasps the overall picture and is responsible for integrating the "solutions" from all sides.

Enough digression. Let's look directly at the three steps of a divide-and-conquer algorithm:

  1. Divide: break the original problem into a number of sub-problems that are smaller instances of the original problem
  2. Conquer: solve each sub-problem recursively
  3. Combine: merge the solutions of the sub-problems into the solution of the original problem

One of the most common forms of divide and conquer is the recursive call: a method repeatedly calls itself, with the problem size shrinking on each call until a base case is reached and the method returns. Merge sort is a typical application of the divide-and-conquer idea. Here is the merge sort code:

```java
public static void mergeSort(int[] arr, int startIndex, int endIndex) {
    // endIndex is exclusive, so a range with fewer than two elements needs no sorting
    if (endIndex <= startIndex + 1) {
        return;
    }
    int midIndex = (endIndex + startIndex) / 2;
    // Recursively solve the sub-problems; each is half the size of the original
    mergeSort(arr, startIndex, midIndex);
    mergeSort(arr, midIndex, endIndex);
    // Combine the sub-solutions: [startIndex, midIndex) and [midIndex, endIndex) are now sorted
    merge(arr, startIndex, midIndex, endIndex);
}

private static void merge(int[] arr, int startIndex, int midIndex, int endIndex) {
    /*
     * Merge strategy:
     * 1. Create two new arrays holding the sorted left half and the sorted right half
     * 2. Starting from the beginning of both arrays, repeatedly copy the smaller
     *    element back into the corresponding position of the original array
     * 3. When one of the two arrays is exhausted, copy the remaining elements of
     *    the other array directly behind
     */
    // Step 1
    int[] leftArr = new int[midIndex - startIndex];
    int[] rightArr = new int[endIndex - midIndex];
    System.arraycopy(arr, startIndex, leftArr, 0, leftArr.length);
    System.arraycopy(arr, midIndex, rightArr, 0, rightArr.length);
    // Step 2
    int k = startIndex; // write index in the original array
    int i = 0;          // read index in the left array
    int j = 0;          // read index in the right array
    while (i < leftArr.length && j < rightArr.length) {
        // Copy the smaller element to position k and advance that array's index
        if (leftArr[i] < rightArr[j]) {
            arr[k++] = leftArr[i++];
        } else {
            arr[k++] = rightArr[j++];
        }
    }
    // Step 3
    if (i < leftArr.length) {
        System.arraycopy(leftArr, i, arr, k, leftArr.length - i);
    } else if (j < rightArr.length) {
        System.arraycopy(rightArr, j, arr, k, rightArr.length - j);
    }
}
```

Three points in the above code are worth noting (a brief usage sketch follows the list):

  1. The termination condition of the mergeSort recursion
  2. How midIndex is computed
  3. When copying arrays we use System.arraycopy, a native method with better efficiency; Arrays.copyOf ultimately calls this same native method, as can be seen directly in the JDK source code
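A brief usage sketch (the sample array and main method are my own addition). Note that endIndex is exclusive, so the whole array is sorted by passing arr.length:

```java
public static void main(String[] args) {
    int[] arr = {5, 2, 4, 7, 1, 3, 2, 6};
    mergeSort(arr, 0, arr.length);
    System.out.println(java.util.Arrays.toString(arr)); // [1, 2, 2, 3, 4, 5, 6, 7]
}
```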
Merge sort flowchart

The flowchart is roughly as follows (the original figure is not reproduced here): in the original array, the shaded part represents the portion not yet sorted, while in the left and right arrays the shaded portion represents the part already processed. In Introduction to Algorithms, a sentinel element is placed at the end of the left and right arrays to detect when they are exhausted; in the source code above we judge directly by index instead.

Of course, the whole process can also be represented as a tree (the original figure is not reproduced here).


Analyzing merge sort

Following the steps of merge sort, let's derive the worst-case running time step by step:

  1. Divide: each divide step just computes the middle index of the part to be sorted, which takes constant time, so \(D(n) = \Theta(1)\)
  2. Conquer: we recursively solve two sub-problems of size \(n/2\), which costs \(2T(n/2)\)
  3. Combine: each merge traverses the elements of the left and right arrays, so \(C(n) = \Theta(n)\)

To analyze the total running time we add \(D(n)\) and \(C(n)\), which amounts to adding \(\Theta(n)\) and \(\Theta(1)\); the sum is still a linear function of \(n\), so we can still express it as \(\Theta(n)\). Adding this to the conquer term from step 2 gives the recurrence \(T(n)\) for the worst-case running time of merge sort:

\[T(n) = \begin{cases} \Theta(1) & \text{if } n = 1 \\ 2T(n/2) + \Theta(n) & \text{if } n > 1 \end{cases}\]

Using a constant \(c\) for the per-element cost, the recurrence simplifies to:

\[T(n) = \begin{cases} c & \text{if } n = 1 \\ 2T(n/2) + cn & \text{if } n > 1 \end{cases}\]

To solve the recurrence we use a recursion tree. For convenience we assume the problem size is an exact power of 2 (the tree is then a complete binary tree). The recursion tree analysis is as follows (the original figure is not reproduced here):

It is not hard to see that each level of the tree costs \(cn\) in total. So the key to the complexity of the whole algorithm is how many levels there are. The top-level problem has size \(n\), the bottom-level problems have size 1, and each level halves the problem size, so the tree has \(\log_2 n + 1\) levels, written simply as \(\lg n + 1\). The total cost of the recursion is therefore \(cn\lg n + cn\). Ignoring the low-order term and the constant factor, the worst-case running time is \(\Theta(n\lg n)\).
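Spelled out (a worked step of my own, using the level count above):

\[T(n) = cn \cdot (\lg n + 1) = cn\lg n + cn = \Theta(n\lg n)\]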

Exercises and reflections on the basics: recursive binary search

Binary search is also a kind of divide-and-conquer thinking: it gradually reduces the scale of the problem until the target element is found. The only difference is that this algorithm does not need to combine the small-scale solutions at the end.

```java
// Assumes arr is sorted in ascending order; upIndex is exclusive
private static int binarySearch(int[] arr, int lowIndex, int upIndex, int searchValue) {
    // An empty range means the value was not found
    if (upIndex < lowIndex + 1) {
        return -1;
    }
    int midIndex = (lowIndex + upIndex) / 2;
    if (arr[midIndex] == searchValue) {
        return midIndex; // equal: return the current index
    } else if (arr[midIndex] < searchValue) {
        // smaller than the target: search the right half (larger values)
        return binarySearch(arr, midIndex + 1, upIndex, searchValue);
    } else {
        // greater than the target: search the left half (smaller values)
        return binarySearch(arr, lowIndex, midIndex, searchValue);
    }
}
```

The time complexity of the algorithm is determined by how many times it can split into a sub-problem: each step just compares the middle element with the target, which takes \(\Theta(1)\). So the worst-case time complexity of binary search is \(\Theta(\lg n)\).
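Written as a recurrence (a worked step of my own): each call does constant work and recurses on one half of the range, so

\[T(n) = T(n/2) + \Theta(1) \;\Rightarrow\; T(n) = \Theta(\lg n)\]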

Optimizing insertion sort efficiency with binary search

From the worst-case analysis of insertion sort above, the worst case occurs when the input array is exactly the reverse of the desired order. For the element at index i, i comparisons and moves are still needed. Binary search can speed up finding where an element should be inserted, but if the input array is reversed, the number of moves does not change, so the worst-case running time of insertion sort cannot truly be optimized.
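To make the point concrete, here is a sketch of insertion sort that locates the insertion position with binary search (my own illustration, not code from the original article). Comparisons drop to \(O(\lg j)\) per element, but the element moves remain, so the worst case stays \(\Theta(n^2)\):

```java
public static void binaryInsertionSort(int[] arr) {
    if (arr == null || arr.length <= 1) {
        return;
    }
    for (int j = 1; j < arr.length; j++) {
        int key = arr[j];
        // Binary search for the leftmost position in arr[0..j-1] where key belongs
        int low = 0;
        int high = j;
        while (low < high) {
            int mid = (low + high) / 2;
            if (arr[mid] > key) {
                high = mid;
            } else {
                low = mid + 1;
            }
        }
        // The moves are unchanged: still up to j elements shifted right by one
        System.arraycopy(arr, low, arr, low + 1, j - low);
        arr[low] = key;
    }
}
```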

Bubble sort

The name of bubble sort comes from the fact that its process resembles bubbling: sorting from small to large, adjacent pairs of elements are compared and exchanged until the smallest element bubbles up to the leftmost end of the unsorted part. The Java implementation is as follows:

```java
private static void bubbleSort(int[] arr) {
    // i is the leftmost index of the unsorted part; after each pass the
    // smallest remaining element has bubbled to position i, and i advances
    for (int i = 0; i < arr.length - 1; i++) {
        // j walks from the right end of the unsorted part toward i,
        // comparing and swapping adjacent pairs
        for (int j = arr.length - 1; j > i; j--) {
            if (arr[j] < arr[j - 1]) {
                int temp = arr[j];
                arr[j] = arr[j - 1];
                arr[j - 1] = temp;
            }
        }
    }
}
```

The process of bubble sort is basically similar to that of selection sort, except that selection sort selects the smallest element in each traversal of the unsorted part, while bubble sort compares the unsorted part pairwise. The worst-case time complexity is \(\Theta(n^2)\).
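For comparison, here is a selection sort sketch (my own addition, not from the original article): each pass scans the unsorted part once and performs a single swap, rather than repeatedly exchanging adjacent pairs:

```java
private static void selectionSort(int[] arr) {
    for (int i = 0; i < arr.length - 1; i++) {
        // Find the index of the smallest element in the unsorted part arr[i..]
        int minIndex = i;
        for (int j = i + 1; j < arr.length; j++) {
            if (arr[j] < arr[minIndex]) {
                minIndex = j;
            }
        }
        // Swap it into position i
        int temp = arr[i];
        arr[i] = arr[minIndex];
        arr[minIndex] = temp;
    }
}
```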

Using insertion sort for small subarrays in merge sort

Although the worst-case running time of merge sort is \(\Theta(n\lg n)\) and the worst-case running time of insertion sort is \(\Theta(n^2)\), the constant factors in insertion sort can make it run faster in practice on many machines when \(n\) is small. It therefore makes sense to coarsen the leaves of the recursion by using insertion sort within merge sort once sub-problems become sufficiently small. Consider a modification of merge sort in which \(n/k\) sublists of length \(k\) are sorted with insertion sort and then merged with the standard merge mechanism, where \(k\) is a value to be determined.

  1. Show that insertion sort can sort the \(n/k\) sublists, each of length \(k\), in \(\Theta(nk)\) worst-case time.
  2. Show how to merge these sublists in \(\Theta(n\lg(n/k))\) worst-case time.
  3. Given that the modified algorithm runs in \(\Theta(nk + n\lg(n/k))\) worst-case time, what is the largest value of \(k\), as a function of \(n\) and expressed in \(\Theta\) notation, for which the modified algorithm has the same running time as standard merge sort?
  4. How should we choose \(k\) in practice?

The answers, in order (a code sketch of the modified sort follows this list):

    1. Sorting one sublist of length \(k\) with insertion sort takes \(\Theta(k^2)\) in the worst case, so the \(n/k\) sublists take \(\Theta((n/k) \cdot k^2) = \Theta(nk)\)
    2. The divide-and-conquer recursion now bottoms out at groups of \(k\) elements, so the recursion tree has \(\lg(n/k) + 1\) levels, and merging each level costs \(cn\); the worst-case merge time is therefore \(\Theta(n\lg(n/k))\)
    3. For the modified merge sort to match the original running time we need \(\Theta(n\lg n) = \Theta(nk + n\lg(n/k))\) \(\rightarrow\) \(\Theta(k + \lg(n/k)) = \Theta(\lg n)\) \(\rightarrow\) the largest value of \(k\) is \(\Theta(\lg n)\)
    4. In practice, \(k\) should be chosen as the largest sublist length for which insertion sort is faster than merge sort. Intuitively, with \(k = 1\) the algorithm degenerates into a standard merge sort; to gain efficiency we enlarge \(k\) until insertion sort on \(k\) elements is still faster than merge sort, while on \(k + 1\) elements it is no faster
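A minimal sketch of this modification (my own code; the cutoff constant K = 16 and the helper insertionSortRange are illustrative names, not from the original article, and merge refers to the method defined earlier):

```java
// In practice K would be tuned empirically, as discussed in answer 4
private static final int K = 16;

public static void mergeSortWithCutoff(int[] arr, int startIndex, int endIndex) {
    // Small sub-problems are handled by insertion sort ("coarsening the leaves")
    if (endIndex - startIndex <= K) {
        insertionSortRange(arr, startIndex, endIndex);
        return;
    }
    int midIndex = (endIndex + startIndex) / 2;
    mergeSortWithCutoff(arr, startIndex, midIndex);
    mergeSortWithCutoff(arr, midIndex, endIndex);
    merge(arr, startIndex, midIndex, endIndex); // merge() as defined earlier
}

// Insertion sort on the half-open range [startIndex, endIndex)
private static void insertionSortRange(int[] arr, int startIndex, int endIndex) {
    for (int j = startIndex + 1; j < endIndex; j++) {
        int key = arr[j];
        int i = j - 1;
        for (; i >= startIndex && arr[i] > key; i--) {
            arr[i + 1] = arr[i];
        }
        arr[i + 1] = key;
    }
}
```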
