Summary of Sorting Algorithms

Sorting divides into internal sorting and external sorting. An internal sort keeps all the records in memory while sorting; an external sort is used when the data set is too large to hold all the records in memory at once, so external storage must be accessed during the sorting process.

The eight sorts discussed here are all internal sorts.

When n is large, an algorithm with O(n·log₂n) time complexity should be used: quick sort, heap sort, or merge sort.

Quick sort is currently considered the best comparison-based internal sorting method; when the keys to be sorted are randomly distributed, quick sort has the shortest average running time.

The summary table referred to here comes from a slide deck (the image is not reproduced). It covers all the common sorting algorithms in data structures. A few questions to keep in mind:

1. What is the idea of each algorithm?

2. Is each algorithm stable? What is its time complexity?

3. Under what circumstances does each algorithm hit its best case, and under what circumstances its worst case?

4. What is the specific implementation of each algorithm?

These are the most basic, and also the most frequently tested, questions about sorting algorithms. Here is my summary.

**First, direct insertion sort (insertion sort).**

1. **Pseudo-code of the algorithm** (shown as an image in the original post, not reproduced here; it is easy to follow).

2. **Idea**: On each pass, take the next element k and insert it into the already-sorted prefix A[1...i]. Starting from the back of A[1...i], compare k with each element while moving forward; once an element A[x] <= k is found, insert k just after A[x], shifting the larger elements one position back beforehand to make room.

3. **Time complexity of the algorithm.**

Best case: the input is **already in ascending order** (small to large), so each element needs only one comparison and no moves: n - 1 comparisons in total, so the time complexity is O(n).

Worst case: the input is in **reverse order**, so each element must be compared with every element before it, giving about n²/2 comparisons in total, so O(n²).

Average: O(n²).

4. **Stability.**

Understanding beats rote memorization, so let's analyze it. Stability means that when two equal elements appear, their relative order is preserved after sorting; this matters mainly when sorting by multiple keys. In insertion sort, suppose k1 is already in the sorted part and an equal element k2 arrives: the comparison stops at k1 and k2 is inserted right after it (there is no reason to insert it in front of k1, which would require extra moves!). So **insertion sort is stable.**

5. **Code (C version)**

**Second, Shell sort (insertion sort)**

1. **Idea**: Shell sort is also an insertion-sort method; it is essentially grouped insertion sort. First pick an integer d1 < n as the first gap and split the records into d1 groups, putting records whose indices differ by a multiple of d1 into the same group, and run a direct insertion sort within each group. Then repeat with a second gap d2, and so on, until the gap dt = 1 (dt < dt-1 < ... < d2 < d1), at which point all records form one group and a final direct insertion sort finishes the job.

For example, divide N records into D sub-sequences:

{r[0], r[d], r[2d], ..., r[kd]}

{r[1], r[1+d], r[1+2d], ..., r[1+kd]}

...

{r[d-1], r[2d-1], r[3d-1], ..., r[(k+1)d-1]}

Note: with d = 5, the first insertion starts at a[d], which is compared with a[d-d]; then a[d+1] is compared with a[(d+1)-d], and so on. After this pass, the original sequence has been handled as d interleaved groups.

2. **Time complexity.**

**Best case**: the performance of Shell sort depends heavily on the chosen gap sequence d, and no provably best way to pick the gaps is known (several good sequences exist, but none is proven optimal). So the exact best-case time complexity is not known.

**Worst case**: depends on the gaps; with Shell's original gap sequence (n/2, n/4, ..., 1) it is O(n²).

**Average**: roughly O(n^1.3) for good gap sequences, often loosely quoted as O(n·log n).

3, **stability** .

A single insertion sort is stable and does not change the relative order of equal elements. In Shell sort, however, equal elements may each move within different groups' insertion sorts, so their relative order can end up disturbed; therefore **Shell sort is unstable**. (A handy rule of thumb, easy to remember: a sort that exchanges non-adjacent elements is very likely to be unstable.)

4. **Code (C version)**

**Third, bubble sort (swap sort)**

1. **Basic idea**: Repeatedly compare adjacent records and swap them when they are out of order, so that records with the smallest keys gradually "float" upward to the "surface" like bubbles in water.

2. **Time complexity**

**Best case**: the input is already in order; with an early-exit flag, only one pass of n - 1 comparisons is needed, so O(n).

**Worst case**: reverse order, requiring (n-1) + (n-2) + ... + 1 comparisons, so O(n²).

3. **Stability**

Only adjacent elements are swapped during the sort, and two equal neighbors are never swapped, so their relative order never changes: the **bubble sort algorithm is stable**!

4. **Code (C version)**

**Fourth, quick sort (exchange sort)**

1. **Idea**: An improvement on bubble sort. Choose one of the n records as the pivot (usually the first record) and partition the sequence into two parts: all records with keys smaller than the pivot's go into the front part, all records with larger keys go into the back part, and the pivot lands between the two parts (this is called homing the record). One such pass is one step of quick sort; each part is then sorted the same way.

Explanation: the core idea is to put the small elements on the left and the large ones on the right, achieving a partition.

2. **Algorithmic complexity**

**Best case**: each partition splits the sequence into two roughly equal halves, so the recursion is about log n levels deep and each level does O(n) work: O(n·log n).

**Worst case**: when the input is already (almost) sorted, each partition peels off only one element and quick sort degenerates toward bubble sort, taking about n² comparisons: O(n²).

**3. Stability**

Because elements are exchanged with the pivot across a distance, the original order of equal elements can be disrupted. For example, sorting the sequence 5 3 3 4 3 8 9 10 11 can change the relative order of the 3s. So **quick sort is not stable!**

4. **Code (C version)**

**Fifth, direct selection sort (selection sort)**

1. **Idea**: Each pass scans the unsorted region for the record with the smallest key and swaps it with the first record of that region, so the sorted prefix grows by one element per pass.

2. **Time complexity.**

**Best case**: 0 swaps, but every pass must still scan the whole unsorted region to find the minimum, about n²/2 comparisons in total, so O(n²). Its selling point is the minimal number of swaps!

**Worst case and average**: O(n²).

3. **Stability**

Since each pass swaps the smallest element x of the unsorted region A with the first element of A, the swap can jump over other elements at a distance and break the relative order of equal elements, so **selection sort is not stable!**

4. **Code (C version)**

**Sixth, heap sort**

1. **Idea**: Use the parent-child relationship inherent in a complete binary tree (stored in an array) to select the record with the largest (or smallest) key from the current unsorted region. In a min-heap, for example, the root is the smallest element, and larger elements tend to sit near the bottom of the heap.

2. **Complexity of the algorithm**

**Worst case**: O(n·log n), and the average case is close to the worst case, which makes heap sort a consistently good sorting algorithm.

3. **Stability**

Heap sort keeps re-adjusting the heap, swapping elements across long distances, so it is an **unstable sort**!

4. **Code (C version; it is easier to understand after reading the code!)**

**Seventh, merge sort**

1. **Idea**: Recursively split the sequence into two halves, sort each half, and then merge the two sorted halves into one ordered sequence.

2. **Algorithmic time complexity**

**Best case**: one merge pass costs O(n), and about log n passes are needed, so O(n·log n).

**Worst case**: close to the average, also O(n·log n).

**Explanation**: a file of length n needs ⌈log₂n⌉ two-way merge passes, each taking O(n) time, so the time complexity is O(n·log n) in both the best and the worst case.

3. **Stability**

The most important feature of merge sort is that it is a **stable** sorting algorithm: the merge step never changes the relative order of equal elements.

4. Its drawback is the O(n) extra space it requires, but **it is well suited to sorting linked lists.**

5. **Code (omitted in the original)**

**Eighth, radix sort**

1. **Idea**: A non-comparative sort. It sorts by digit position: first by the ones digit, then by the tens digit, and so on. (The original post illustrates this with an example image, not reproduced here.)

2. **Complexity of the algorithm**

Distribution takes O(n) and collection takes O(r), where r is the number of buckets (linked lists); with r = 10, for example, there are ten lists labeled 0 through 9 that classify the sequence. Let d be the number of digits of the largest key (e.g. if the maximum is 1234, then d = 4); d is also the number of distribute-collect passes. The overall time complexity is therefore O(d·(n + r)).

3. **Stability**

The radix sorting process never changes the relative order of equal elements, so it is **stable!**

4. **When it applies**: If the range of the keys in a sequence is known (say 1 to 1000), quick sort or heap sort needs O(n·log n), but radix sort achieves O(d·(n + 10)) = O(n). In that situation it is the fastest!

5. **Code (omitted in the original)**

Summary: each algorithm has the conditions it suits best, and this article is only a review of the basics. For anything unclear, please consult a textbook.

Note: This article was reprinted from: http://blog.csdn.net/whuslei/article/details/6442755
