Sorting (1): merge sort, quicksort, and priority queues (illustrated)

Source: Internet
Author: User

Sort (1)


Basic Sorting Algorithm
Selection sort


Idea: first, find the smallest element in the array and exchange it with the first element. Then find the smallest element among the remaining elements and exchange it with the second element. Repeat until the entire array is sorted.

 

[Legend]


In the figure, the X axis is an array index, and the Y axis is the value of the elements to be sorted.

 

Selection sort has two distinctive features:

The running time is insensitive to the input. Scanning the array to find the smallest element provides no information for the next scan. This property is a disadvantage in some cases: no matter what the initial state of the array is, the algorithm does the same amount of work.

Data movement is minimal. Each exchange changes the values of two array elements, so sorting uses only N exchanges, and the number of exchanges grows linearly with the array size. (None of the other algorithms we study has this feature.)

 

[For an array of length N, selection sort uses about N²/2 comparisons and N exchanges.]
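The original gives no code for selection sort; as a minimal sketch (using a plain int array and a hypothetical class name Selection, rather than the Comparable[] and less()/exch() helpers used elsewhere in this article), it might look like:

```java
import java.util.Arrays;

public class Selection {
    // Selection sort: repeatedly pick the smallest remaining element
    // and exchange it into position i — exactly one exchange per i.
    public static void sort(int[] a) {
        int N = a.length;
        for (int i = 0; i < N; i++) {
            int min = i; // index of the smallest element in a[i..N-1]
            for (int j = i + 1; j < N; j++)
                if (a[j] < a[min]) min = j;
            int t = a[i]; a[i] = a[min]; a[min] = t;
        }
    }

    public static void main(String[] args) {
        int[] a = {5, 2, 9, 1, 5, 6};
        sort(a);
        System.out.println(Arrays.toString(a)); // [1, 2, 5, 5, 6, 9]
    }
}
```

Note that the inner scan always runs to the end of the array regardless of input order, which is the input-insensitivity described above.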

 

Bubble sort


Idea: repeatedly traverse the sequence to be sorted, comparing two adjacent elements at a time and exchanging them if they are in the wrong order. The traversal is repeated until no more exchanges are needed, which means the sequence is sorted. The algorithm's name comes from the way smaller elements slowly "float" to the top of the sequence through exchanges (while larger elements slowly sink to the bottom).

 

The bubble sort algorithm operates as follows:

1. Compare adjacent elements. If the first is bigger than the second, exchange them.

2. Do the same for every pair of adjacent elements, from the first pair at the start to the last pair at the end. After this pass, the last element is the largest.

3. Repeat the above steps for all elements except the last one.

4. Continue repeating for fewer and fewer elements until no pair remains to compare.

 

[Legend]


In the figure, the X axis is an array index, and the Y axis is the value of the elements to be sorted.

From the figure, we can see that bubble sort orders the array gradually from the back to the front: the largest element sinks to the bottom first, then the second largest, and so on.


void bubbleSort(Comparable[] a) {
    // exchanged records whether a pass performed any exchange; if a full
    // pass moves no element, the array is sorted and bubbling stops.
    boolean exchanged = true;
    int N = a.length;
    // at most N-1 bubbling passes (stop early when a pass makes no exchange)
    for (int i = 1; i < N && exchanged; i++) {
        exchanged = false;
        // one pass over positions 0 .. N-1-i: the largest value in this
        // range sinks to the bottom of the range
        for (int j = 0; j < N - i; j++) {
            if (less(a[j+1], a[j])) { // adjacent pair out of order
                exch(a, j, j+1);      // exchange a[j] and a[j+1]
                exchanged = true;     // record that an exchange happened
            }
        }
    }
}

 

Features:

Bubble sort has the same asymptotic running time as insertion sort, but the number of exchanges the two methods need differs greatly: in the worst case, bubble sort requires O(n²) exchanges, while insertion sort requires at most O(n) exchanges. Typical implementations of bubble sort also take O(n²) operations even on an already-sorted sequence, while insertion sort needs only O(n) operations in that case.

For this reason, many implementations avoid bubble sort in favor of insertion sort. If a flag in the inner loop records whether any exchange occurred, bubble sort's best-case complexity can also be reduced to O(n): an already-sorted sequence needs no exchanges at all (as in the code above).

 

 

Insertion sort


Idea: compare the i-th element with the ordered elements on its left, locate the appropriate position, and insert it there. To make room for the element being inserted, all larger elements must first be moved one position to the right.


The specific algorithm is described as follows:

1. Start with the first element; it can be considered already sorted.

2. Take the next element and scan the sorted element sequence from back to front.

3. If the (sorted) element is greater than the new element, move it one position to the right.

4. Repeat step 3 until a sorted element smaller than or equal to the new element is found.

5. Insert the new element at that position.

6. Repeat steps 2-5.

 

As in selection sort, all elements to the left of the current index are ordered, but their final positions are not yet known: they may still be moved to make room for smaller elements. When the index reaches the right end of the array, the sort is complete. Insertion sort never accesses elements to the right of the current index, while selection sort never accesses elements to its left.

The time insertion sort requires depends on the initial order of the input: sorting an already-ordered (or nearly ordered) array is much faster than sorting a random or reverse-ordered one.

 

[Legend]

A column of numbers being sorted by insertion, one step at a time.


void sort(Comparable[] a) { // sort a[] in ascending order
    int N = a.length;
    for (int i = 1; i < N; i++) {
        // insert a[i] among a[i-1], a[i-2], a[i-3], ...
        for (int j = i; j >= 1 && less(a[j], a[j-1]); j--)
            exch(a, j, j-1);
    }
}

// As index i moves from left to right, the elements to its left are always ordered, so the sort is complete when i reaches the right end of the array.

Improvement: it is not hard to speed insertion sort up significantly. In the inner loop, shift the larger elements to the right instead of always exchanging two elements, so that the element being inserted is written only once (this halves the number of array accesses).
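A sketch of this improvement, assuming a plain int array (the shift-based inner loop below replaces the exch() calls of the earlier version; the class name Insertion is illustrative):

```java
import java.util.Arrays;

public class Insertion {
    // Insertion sort with half-exchanges: larger elements are shifted
    // right, and the element being inserted is written exactly once.
    public static void sort(int[] a) {
        int N = a.length;
        for (int i = 1; i < N; i++) {
            int v = a[i]; // element to insert
            int j = i;
            while (j > 0 && a[j - 1] > v) {
                a[j] = a[j - 1]; // shift the larger element right
                j--;
            }
            a[j] = v; // single write into the vacated slot
        }
    }

    public static void main(String[] args) {
        int[] a = {3, 7, 4, 9, 5, 2, 6, 1};
        sort(a);
        System.out.println(Arrays.toString(a)); // [1, 2, 3, 4, 5, 6, 7, 9]
    }
}
```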

 

[On average, insertion sort requires ~N²/4 comparisons and ~N²/4 exchanges.]

[When there are few inversions (pairs of out-of-order elements), insertion sort may well be faster than any other sorting algorithm!]

 

[Insertion sort is very efficient for partially ordered arrays and well suited to small arrays. It also appears as an intermediate step in more advanced sorting algorithms.]

 


Shellsort


Shellsort, also known as the diminishing-increment sort, is a more efficient improved version of insertion sort.

Shellsort improves on insertion sort based on the following two of its properties:

1. Insertion sort is efficient on data that is almost in sorted order, approaching linear running time.

2. Insertion sort is slow on large disordered arrays, because it exchanges only adjacent elements, so an element can move from one end of the array to the other only one position at a time.

 

Shellsort improves on insertion sort simply by exchanging non-adjacent elements so as to make parts of the array ordered, finally sorting the partially ordered array with insertion sort. (First make the array locally ordered, then finish with one insertion-sort pass over the nearly ordered array.)

 

Idea: make the elements taken at any interval h from the array ordered; such an array is called h-ordered. If h is large at the start of the sort, we can move elements long distances, which makes it easier to achieve h-ordering for smaller values of h.



We only need to change the distance elements move in the insertion sort code from 1 to h. Shellsort thus reduces to a process just like insertion sort but with varying increments.

 

[Legend]

 

void sort(Comparable[] a) { // sort a[] in ascending order
    int N = a.length;
    int h = 1;
    // choose a suitable initial interval h: 1, 4, 13, 40, 121, 364, 1093, ...
    while (h < N/3) h = 3*h + 1;
    while (h >= 1) { // h decreases after each pass
        // make the array h-ordered
        for (int i = h; i < N; i++) {
            // insert a[i] among a[i-h], a[i-2*h], a[i-3*h], ...
            for (int j = i; j >= h && less(a[j], a[j-h]); j -= h)
                exch(a, j, j-h);
        }
        h = h/3;
    }
}

Shellsort is more efficient because it balances subarray size against subarray order. At the beginning of the sort, each subarray is very short; later in the sort, the array is partially ordered — both situations suit insertion sort.

 

The performance of shellsort depends not only on the increments h but also on the mathematical properties of the increment sequence. In practice, the 3*h + 1 increment sequence used above is good enough.

[In the worst case, the number of comparisons shellsort makes is proportional to N^1.5]

 

Shellsort needs little code and no extra memory. If you need to solve a sorting problem and no system sort is available (for example, in code running on an embedded system), start with shellsort, and only then consider whether it is worth replacing it with a more sophisticated sorting algorithm.

 


Merge sort


Idea: to sort an array, first (recursively) split it into two halves and sort each, then merge the results.

 

[Legend]

An example of merge sort: sorting a linked list of random points.



void sort(Comparable[] a) {
    aux = new Comparable[a.length]; // allocate the auxiliary array once
    sort(a, 0, a.length - 1);
}

void sort(Comparable[] a, int lo, int hi) { // sort a[lo..hi]
    if (hi <= lo) return; // a single element ends the recursion
    int mid = lo + (hi - lo)/2;
    sort(a, lo, mid);      // sort the left half
    sort(a, mid+1, hi);    // sort the right half
    merge(a, lo, mid, hi); // merge the results
}

// abstract in-place merge
void merge(Comparable[] a, int lo, int mid, int hi) {
    // merge a[lo..mid] and a[mid+1..hi]
    int i = lo, j = mid + 1;
    for (int k = lo; k <= hi; k++) // copy a[lo..hi] to aux[lo..hi]
        aux[k] = a[k];
    for (int k = lo; k <= hi; k++) { // merge back into a[lo..hi]
        if      (i > mid)              a[k] = aux[j++];
        else if (j > hi)               a[k] = aux[i++];
        else if (less(aux[j], aux[i])) a[k] = aux[j++];
        else                           a[k] = aux[i++];
    }
}

Improvement: handling small subproblems differently can improve most recursive algorithms, because recursion makes method calls too frequent on small subproblems, so improving how those are handled improves the whole algorithm. (That is, increase the granularity of the recursion so that it stops on small ranges rather than recursing all the way down to a single element.)

Insertion sort is very simple and is therefore likely to be faster than merge sort on small arrays. Using insertion sort to handle small subarrays typically shortens merge sort's running time by 10% to 15%.
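A sketch of merge sort with this insertion-sort cutoff, assuming int arrays; the class name MergeWithCutoff and the threshold CUTOFF = 15 are illustrative choices, not values from the original:

```java
public class MergeWithCutoff {
    private static final int CUTOFF = 15; // illustrative threshold
    private static int[] aux;

    public static void sort(int[] a) {
        aux = new int[a.length];
        sort(a, 0, a.length - 1);
    }

    private static void sort(int[] a, int lo, int hi) {
        // small range: hand it to insertion sort instead of recursing further
        if (hi <= lo + CUTOFF) { insertionSort(a, lo, hi); return; }
        int mid = lo + (hi - lo) / 2;
        sort(a, lo, mid);
        sort(a, mid + 1, hi);
        merge(a, lo, mid, hi);
    }

    private static void insertionSort(int[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++)
            for (int j = i; j > lo && a[j] < a[j - 1]; j--) {
                int t = a[j]; a[j] = a[j - 1]; a[j - 1] = t;
            }
    }

    private static void merge(int[] a, int lo, int mid, int hi) {
        for (int k = lo; k <= hi; k++) aux[k] = a[k];
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if      (i > mid)         a[k] = aux[j++];
            else if (j > hi)          a[k] = aux[i++];
            else if (aux[j] < aux[i]) a[k] = aux[j++];
            else                      a[k] = aux[i++];
        }
    }

    public static void main(String[] args) {
        int[] a = new int[100];
        for (int i = 0; i < 100; i++) a[i] = (i * 37) % 100; // permutation of 0..99
        sort(a);
        System.out.println(a[0] + " " + a[99]); // 0 99
    }
}
```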

 


Bottom-up merge sort


Recursive merge sort is a classic application of the divide-and-conquer idea in algorithm design. We can also write the method iteratively, bottom-up: first merge the tiny arrays, then merge the results into larger subarrays.

First merge the elements in pairs, then merge those pairs four at a time, then eight at a time, and so on.
void sort(Comparable[] a) {
    int N = a.length;
    aux = new Comparable[N]; // allocate the auxiliary array
    for (int sz = 1; sz < N; sz = sz + sz)           // sz: subarray size
        for (int lo = 0; lo < N - sz; lo += sz + sz) // lo: subarray start
            merge(a, lo, lo + sz - 1, Math.min(lo + sz + sz - 1, N - 1));
}

The bottom-up method is well suited to data organized as a linked list: only the links need to be reorganized, so the list can be sorted in place (no new list nodes are created).

 

[Merge sort is an asymptotically optimal comparison-based sorting algorithm]

(That is, its worst-case number of comparisons, ~N lg N, matches the minimum number of comparisons any comparison-based sorting algorithm needs in the worst case.)

 

A disadvantage of merge sort: the extra space it needs is proportional to N.

 


Quicksort


Idea: quicksort is a partitioning sort. It divides an array into two subarrays around a partitioning element and then sorts the two parts independently.

The general strategy is to take a[lo] arbitrarily as the partitioning element — the one that will end up in its final position. Then scan from the left end of the array until an element greater than or equal to it is found, and scan from the right end of the array until an element less than or equal to it is found.

These two elements are clearly out of place, so we exchange them. Continuing this way guarantees that no element to the left of pointer i is greater than the partitioning element, and no element to the right of pointer j is less than it. When the two pointers meet, we simply exchange the partitioning element a[lo] with the rightmost element of the left subarray (a[j]) and return j.

 

void sort(Comparable[] a, int lo, int hi) {
    if (hi <= lo) return;
    int j = partition(a, lo, hi); // partition
    sort(a, lo, j-1);  // sort the left half
    sort(a, j+1, hi);  // sort the right half
}

// partitioning
int partition(Comparable[] a, int lo, int hi) {
    int i = lo, j = hi + 1; // left and right scan pointers
    Comparable v = a[lo];   // partitioning element
    while (true) { // scan right and left, check for pointer crossing, exchange
        while (less(a[++i], v)) if (i == hi) break;
        while (less(v, a[--j])) if (j == lo) break;
        if (i >= j) break;
        exch(a, i, j);
    }
    exch(a, lo, j); // put the partitioning element v into its final position
    return j;       // with a[lo..j-1] <= a[j] <= a[j+1..hi]
}

Quicksort's strengths are that it sorts in place (it needs only a small auxiliary stack) and that the time required to sort an array of length N is proportional to N lg N.

Disadvantage: the program can be extremely inefficient when the partitioning is unbalanced.

 

Improvement: switch to insertion sort

Like most recursive sorting algorithms, quicksort is easy to improve based on the following two observations:

For small arrays, quicksort is slower than insertion sort.

Because of recursion, quicksort's sort() method calls itself even on tiny subarrays.

Therefore, switch to insertion sort when sorting small subarrays.

Replace the statement if (hi <= lo) return; in sort()

with: if (hi <= lo + M) { Insertion.sort(a, lo, hi); return; }

 


● Quicksort with three-way partitioning


Arrays containing large numbers of duplicate elements appear frequently in practice.

With many duplicates, standard quicksort's recursion keeps producing subarrays made up entirely of equal elements. This leaves great room for improvement — potentially from the current linearithmic performance down to linear.

A simple idea is to partition the array into three parts, holding the elements smaller than, equal to, and greater than the partitioning element.

 


void sort(Comparable[] a, int lo, int hi) {
    if (hi <= lo) return;
    int lt = lo, i = lo + 1, gt = hi;
    Comparable v = a[lo]; // partitioning element
    while (i <= gt) {
        int cmp = a[i].compareTo(v);
        if      (cmp < 0) exch(a, lt++, i++); // a[i] < v: move it into a[lo..lt-1]
        else if (cmp > 0) exch(a, i, gt--);   // a[i] > v: move it into a[gt+1..hi]
        else              i++;                // a[i] == v: leave it in place
    }
    // now a[lo..lt-1] < v = a[lt..gt] < a[gt+1..hi]
    sort(a, lo, lt-1);
    sort(a, gt+1, hi);
}

This partitioning groups all elements equal to the partitioning element together so that they are excluded from the subarrays handled by the recursive calls. For arrays with many duplicate elements, this method is far more efficient than standard quicksort.

 


Priority queue


Many applications need to process ordered elements, but they do not necessarily need them fully ordered, or sorted all at once. In many cases we collect some elements, process the one with the largest current key, collect more elements, process the one with the new largest key, and so on.

A suitable data structure for this pattern must support two operations: delete the maximum element, and insert an element. Such a data type is called a priority queue.

 

The binary heap data structure implements the basic priority-queue operations well.

A binary tree is heap-ordered when every node is greater than or equal to its two children. (This is a max-heap.)

It is very convenient to represent a binary heap as a complete binary tree, and a complete binary tree can be represented by an array alone, without pointers. The method is to place the nodes of the tree into the array in level order: the root goes at position 1, its children at positions 2 and 3, their children at positions 4, 5, 6, and 7, and so on.
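As a sketch of this array encoding (the class and method names are illustrative): a node at index k has its parent at k/2 and its children at 2k and 2k+1, so heap order can be checked with simple index arithmetic:

```java
public class HeapIndexing {
    // Check heap order in the array representation: for every node k > 1,
    // its parent at k/2 must be >= it (the children of k sit at 2k and 2k+1).
    public static boolean isMaxHeap(int[] pq, int N) {
        for (int k = 2; k <= N; k++)
            if (pq[k/2] < pq[k]) return false;
        return true;
    }

    public static void main(String[] args) {
        // A max-heap stored level by level; index 0 is unused.
        // index:      1  2  3  4  5  6  7
        int[] pq = {0, 9, 7, 8, 3, 5, 6, 4};
        System.out.println(isMaxHeap(pq, 7));   // true
        int[] bad = {0, 9, 7, 8, 3, 5, 6, 10};  // 10 exceeds its parent 8
        System.out.println(isMaxHeap(bad, 7));  // false
    }
}
```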


Heap Algorithm


We use a private array pq[] of length N+1 to represent a heap of size N. We do not use pq[0]; the heap elements occupy pq[1] through pq[N].

 

We will encounter two situations:

When a node's priority increases (or a new element is added at the bottom of the heap), we need to restore heap order from the bottom up.

When a node's priority decreases (for example, when the root is replaced by a smaller element), we need to restore heap order from the top down.

 

· Restoring heap order bottom-up (swim)


If the heap order is broken because a node has become larger than its parent, we exchange it with its parent to repair the heap.

After the exchange, the node is larger than both of its children, but it may still be larger than its new parent. We restore order the same way over and over, moving the node up until we reach a parent that is larger. (The node "swims" up.)

 

void swim(int k) {
    while (k > 1 && less(k/2, k)) {
        exch(k/2, k);
        k = k/2;
    }
}

· Restoring heap order top-down (sink)


If the heap order is broken because a node has become smaller than one of its children, we repair the heap by exchanging it with the larger of its two children.

The exchange may break heap order again at the child, so we keep fixing it the same way, moving the node down until both of its children are smaller than it or it reaches the bottom of the heap. (The node "sinks" down.)

 

void sink(int k) {
    while (2*k <= N) {
        int j = 2*k;
        if (j < N && less(j, j+1)) j++; // j: index of the larger child
        if (!less(k, j)) break;
        exch(k, j);
        k = j;
    }
}
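The article shows swim() and sink() but not the queue operations built on them; a minimal sketch of a complete max-priority queue over an int array (fixed capacity for simplicity, names illustrative) might look like:

```java
public class MaxPQ {
    private int[] pq;   // heap elements in pq[1..N]; pq[0] unused
    private int N = 0;

    public MaxPQ(int capacity) { pq = new int[capacity + 1]; }

    public void insert(int v) {
        pq[++N] = v; // add at the bottom of the heap...
        swim(N);     // ...and restore heap order bottom-up
    }

    public int delMax() {
        int max = pq[1];
        exch(1, N--); // move the last element to the root...
        sink(1);      // ...and restore heap order top-down
        return max;
    }

    private void swim(int k) {
        while (k > 1 && pq[k/2] < pq[k]) { exch(k/2, k); k = k/2; }
    }

    private void sink(int k) {
        while (2*k <= N) {
            int j = 2*k;
            if (j < N && pq[j] < pq[j+1]) j++; // larger child
            if (pq[k] >= pq[j]) break;
            exch(k, j);
            k = j;
        }
    }

    private void exch(int i, int j) { int t = pq[i]; pq[i] = pq[j]; pq[j] = t; }

    public static void main(String[] args) {
        MaxPQ pq = new MaxPQ(10);
        int[] xs = {3, 9, 1, 7, 5};
        for (int x : xs) pq.insert(x);
        // delMax returns the elements in descending order
        for (int i = 0; i < xs.length; i++)
            System.out.print(pq.delMax() + " "); // 9 7 5 3 1
    }
}
```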



Heap sort


We can turn any priority queue into a sorting method: insert all the elements into a minimum-oriented priority queue, then repeatedly call the delete-minimum operation to remove them in order.

 

1. Heap construction:

Build a heap from the N given elements using the sink() function, proceeding from right to left. We only need to scan half of the array's elements, because the subheaps of size 1 can be skipped; we call sink() on each node, moving toward the front, and the scan ends once sink() has been called on position 1.

(Constructing a heap of N elements with sink operations requires fewer than 2N comparisons and fewer than N exchanges.)

(If instead we traversed the array from left to right using swim(), the time needed would be proportional to N log N.)

 

for (int k = N/2; k >= 1; k--)
    sink(a, k, N);

 

2. The sortdown:

The main work of heap sort happens in this second phase. Here we repeatedly delete the largest element from the heap and place it in the position vacated at the end of the array as the heap shrinks.

This process resembles selection sort (repeatedly selecting an extreme value), but it needs far fewer comparisons, because the heap provides an efficient way to find the largest element in the unsorted part.

 

while (N > 1) {
    exch(a, 1, N--); // exchange the maximum element at the heap root with the last heap node
    sink(a, 1, N);   // sink the new root to restore heap order
}

 

Complete heap sort code (using only the sink operation):

void sort(Comparable[] a) {
    int N = a.length;
    for (int k = N/2; k >= 1; k--) // heap construction
        sink(a, k, N);
    while (N > 1) {
        exch(a, 1, N--); // exchange the heap root (maximum) with the last heap node
        sink(a, 1, N);   // sink the new root to restore heap order
    }
}

Characteristics: heap sort is the only method we know that makes optimal use of both space and time — even in the worst case it uses ~2N lg N comparisons and constant extra space.

It is popular when space is very tight (for example, in embedded systems), because a few lines of code give good performance. But modern applications rarely use it, because it cannot exploit caching: array elements are seldom compared with nearby elements, so its cache miss rate is much higher than that of algorithms in which most comparisons are between neighboring elements, such as quicksort, merge sort, and even shellsort.

 

Application:

The top-M problem:

In some data-processing applications, the total volume of data is too large to sort (perhaps too large even to load into memory). If you need to pick the ten largest elements out of a billion, do you really want to sort a billion-element array? With a priority queue, a queue that can hold just ten elements is enough.

 

[Example]

Find the 100 largest numbers.

A:

Split the 100 million data records into 100 files (each file holding about 1 million records).

Apply a priority queue to each file: find the top 100 in each file using a heap of 100 elements (use a min-heap to find the largest values, and a max-heap to find the smallest). For the top-100-largest case: first build a min-heap from the first 100 elements, then scan the remaining data, comparing each value with the heap top. If a value is larger than the heap top, replace the top with it and restore the min-heap. When the scan ends, the 100 elements in the heap are that file's top 100.

After finding the top 100 of each file, combine the top-100 lists of the 100 files — 10,000 values in total — and apply the same method to those to obtain the overall top 100.
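A sketch of the in-memory core of this idea using Java's built-in java.util.PriorityQueue, which is a min-heap by default (the stream data and the value of M here are illustrative):

```java
import java.util.PriorityQueue;

public class TopM {
    // Keep the M largest values seen so far in a min-heap of size M:
    // the heap top is the smallest of the current top M, so any new
    // value larger than it displaces it.
    public static int[] topM(int[] stream, int M) {
        PriorityQueue<Integer> pq = new PriorityQueue<>(); // min-heap
        for (int x : stream) {
            if (pq.size() < M) pq.offer(x);
            else if (x > pq.peek()) { pq.poll(); pq.offer(x); }
        }
        int[] result = new int[pq.size()];
        for (int i = 0; i < result.length; i++) result[i] = pq.poll();
        return result; // ascending order
    }

    public static void main(String[] args) {
        int[] stream = new int[1000];
        for (int i = 0; i < 1000; i++)
            stream[i] = (i * 7919) % 1000; // a permutation of 0..999
        int[] top = topM(stream, 3);
        for (int v : top) System.out.print(v + " "); // 997 998 999
    }
}
```

Only M elements are ever held in memory, which is what makes the file-by-file scheme above feasible.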

 


[Note] The animated figures everyone praises are copied from Wikipedia :-)


Part 2 will cover: key-indexed counting, radix sort, bucket sort, bitmaps, and the loser tree.



 
