Summary of seven internal sorting algorithms (insertion sort, Shell sort, bubble sort, simple selection sort, quicksort, merge sort, and heap sort)

Source: Internet
Author: User

Preface:

Sorting is an important operation in computer programming: it rearranges an arbitrary sequence of data elements (records) into a sequence ordered by key. Mastering the common sorting algorithms is therefore well worth the effort. In everything below we assume that all sort keys are integers and that function arguments have already been validated. Each algorithm gets only a brief description and a concrete implementation, without deeper analysis.


Basic knowledge:


Sorting methods fall into two classes, depending on how many records are to be sorted and therefore which kind of storage the sorting process involves. Internal sorting is sorting in which all the records to be sorted are held in the computer's random-access memory. External sorting is sorting in which the number of records is so large that memory cannot hold them all at once, so external storage must be accessed during the sort.

Sorting generally requires two basic operations: 1. comparing two keys; 2. moving a record from one location to another. Operation 1 is necessary for most sorting methods, while operation 2 can be avoided by changing how the records are stored.

The sequence of records to be sorted can be stored in one of three ways: 1. The records are stored in a group of storage units with consecutive addresses. 2. The records are stored in a static linked list, where the order between records is indicated by pointers, so sorting only modifies pointers and never moves records. 3. The records are stored in a group of storage units with consecutive addresses, together with an address vector that holds the storage location of each record. During sorting the records themselves are not moved; only their "addresses" in the address vector are moved, and after sorting the records are rearranged according to the values in the address vector.
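The third storage method can be sketched as follows. This is only a minimal illustration: the `Record` type, its `payload` field, and the function name `SortAddressVector` are assumptions introduced here, and direct insertion sort is used simply because it is the shortest to write. Only the integers in `addr` move; the records stay where they are.

```c
/* Hypothetical record type, for illustration only. */
typedef struct {
    int key;          /* sorting key */
    char payload[32]; /* stands in for data that is costly to move */
} Record;

/* Fill and sort the address vector addr[0..n-1] so that
   rec[addr[0]].key <= rec[addr[1]].key <= ... <= rec[addr[n-1]].key.
   The records themselves are never moved. */
void SortAddressVector(const Record rec[], int addr[], int n)
{
    int i, j;
    for (i = 0; i < n; i++)
        addr[i] = i;               /* initially addr[i] refers to record i */
    for (i = 1; i < n; i++) {
        int cur = addr[i];
        for (j = i; j > 0 && rec[addr[j - 1]].key > rec[cur].key; j--)
            addr[j] = addr[j - 1]; /* shift an index, not a record */
        addr[j] = cur;
    }
}
```

After the call, walking `addr` from front to back visits the records in key order; the records can then be rearranged in a single final pass if physical order matters.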


Algorithm analysis:


1. Insertion sort:

Basic idea: insert one record into an already sorted table, producing a new sorted table with one more record.

The time complexity is O(n^2). If the sequence to be sorted is already in ascending order, the time complexity improves to O(n). In terms of space, only auxiliary storage for a single record is required.


A. Direct insertion sort

Sample Code 1:

void InsertionSort(ElementType A[], int N)
{
    int j, p;
    ElementType Tmp; // auxiliary space for one record
    for (p = 1; p < N; p++) {
        Tmp = A[p];
        // shift larger records right to open a slot in the sorted prefix
        for (j = p; j > 0 && A[j - 1] > Tmp; j--)
            A[j] = A[j - 1];
        A[j] = Tmp;
    }
}

Sample Code 2:

void InsertionSort(ElementType A[], int N)
{
    for (int i = 1; i < N; i++) {
        ElementType Tmp = A[i]; // auxiliary space for one record
        int j = i - 1;
        // compare against Tmp, not A[i]: A[i] is overwritten by the first shift
        while (j > -1 && A[j] > Tmp) {
            A[j + 1] = A[j]; // shift the larger record one position right
            --j;
        }
        A[j + 1] = Tmp;
    }
}

Direct insertion sort is simple and easy to implement. When the number of records n is small, it is a good sorting method; when n is large, it is not recommended. Its running time is dominated by the "compare" and "move" operations, so several variants that reduce the number of comparisons or moves have been built on top of it: binary insertion sort, 2-way insertion sort, and table insertion sort.


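B. Binary insertion sort

The basic operation of insertion sort is to search an ordered table and insert into it, so the search step can be done with binary search. Insertion sort performed this way is called binary insertion sort. It reduces the number of key comparisons but not the number of record moves, so the time complexity remains O(n^2).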
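No sample code is given for this variant, so here is a minimal sketch in the style of the direct insertion samples. The function name `BinaryInsertionSort` and the `typedef` of `ElementType` to `int` are assumptions made for the example (integer keys, as stated in the preface).

```c
typedef int ElementType; /* assumption: integer keys */

void BinaryInsertionSort(ElementType A[], int N)
{
    int i, j, low, high, mid;
    ElementType Tmp;
    for (i = 1; i < N; i++) {
        Tmp = A[i];
        low = 0;
        high = i - 1;
        /* binary search in the sorted prefix A[0..i-1] for the insertion point */
        while (low <= high) {
            mid = low + (high - low) / 2;
            if (Tmp < A[mid])
                high = mid - 1;
            else
                low = mid + 1; /* equal keys go to the right, keeping the sort stable */
        }
        /* shift A[low..i-1] one position right, then insert */
        for (j = i; j > low; j--)
            A[j] = A[j - 1];
        A[low] = Tmp;
    }
}
```

The search costs about log2(i) comparisons per record instead of up to i, but the shifting loop is unchanged, which is why the overall order of growth does not improve.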



2. Shell sort

Basic idea: first divide the whole sequence of records into several subsequences and sort each of them by direct insertion; once the records of the whole sequence are "basically ordered", perform one direct insertion sort over all the records. Shell sort is thus just an improved insertion sort, which is why the insertion sort above is also called direct insertion sort.

Feature: the subsequences are not contiguous segments; each one consists of the records that are a fixed increment apart. The algorithm works by comparing elements a certain distance from each other, and that distance shrinks as the algorithm proceeds until the final pass compares adjacent elements.


Sample Code:

void Shellsort(ElementType A[], int N)
{
    int i, j, Increment;
    ElementType Tmp;
    for (Increment = N / 2; Increment > 0; Increment /= 2) {
        /* insertion sort over elements that are Increment apart */
        for (i = Increment; i < N; i++) {
            Tmp = A[i];
            for (j = i; j >= Increment; j -= Increment) {
                if (Tmp < A[j - Increment])
                    A[j] = A[j - Increment];
                else
                    break;
            }
            A[j] = Tmp;
        }
    }
}

The increment sequence used in the example above is the one suggested by Shell: start with N/2 and halve the increment on each pass. With Shell's increments, the worst-case running time of Shell sort is O(n^2).


3. Bubble sort

Basic idea: first compare the key of the first record with that of the second; if they are out of order, exchange the two records. Then compare the keys of the second and third records, and so on, until the (n-1)th and nth records have been compared. This process is the first bubble pass; its result is that the record with the largest key ends up in the last position. The second pass performs the same operation on the first n-1 records, which places the record with the second-largest key in position n-1. In general, the i-th pass compares adjacent keys from position 1 to position n-i+1 and exchanges adjacent records that are out of order, so that the record with the largest key among the first n-i+1 records is moved to position n-i+1. The sort can stop as soon as a pass completes without performing any exchange.


Sample Code 1:

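void bubblesort(ElementType A[], int N)
{
    int i, j;
    ElementType tmp;
    for (i = 0; i < N - 1; i++) {
        /* the last i records are already in place; stopping at N-1-i keeps A[j+1] in bounds */
        for (j = 0; j < N - 1 - i; j++) {
            if (A[j] > A[j + 1]) {
                tmp = A[j];
                A[j] = A[j + 1];
                A[j + 1] = tmp;
            }
        }
    }
}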

Sample Code 2:

#include <stdbool.h>

void bubblesort(ElementType a[], int n)
{
    int j;
    bool flag; /* true if the previous pass exchanged anything */
    ElementType tmp;
    flag = true;
    while (flag) {
        flag = false;
        for (j = 1; j < n; j++) {
            if (a[j - 1] > a[j]) {
                tmp = a[j - 1];
                a[j - 1] = a[j];
                a[j] = tmp;
                flag = true;
            }
        }
        n--; /* the largest remaining record is now in its final position */
    }
}

The time complexity of bubble sort is O(n^2). In practice it is suitable only when the amount of data is small.


4. Simple selection sort

Basic idea: on the i-th pass (i = 1, 2, ..., n-1), select the record with the smallest key among the remaining n-i+1 records and make it the i-th record of the ordered sequence. Simple selection sort resembles direct insertion sort in that both divide the data into an ordered area and an unordered area. The difference is that direct insertion sort takes the first element of the unordered area and inserts it into the ordered area to form a larger ordered area, whereas simple selection sort picks the smallest element of the unordered area and places it at the end of the ordered area.


Sample Code:

void Selectsort(int a[], int n)
{
    int i, j, nMinIndex, tmp;
    for (i = 0; i < n; i++) {
        /* find the smallest element of the unordered area a[i..n-1] */
        nMinIndex = i;
        for (j = i + 1; j < n; j++)
            if (a[j] < a[nMinIndex])
                nMinIndex = j;
        /* put it at the end of the ordered area */
        tmp = a[i];
        a[i] = a[nMinIndex];
        a[nMinIndex] = tmp;
    }
}

The time complexity of simple selection sort is O(n^2).


5. Quicksort

Basic idea: quicksort is an improvement on bubble sort. One pass splits the records to be sorted into two independent parts such that the keys of all records in one part are no larger than the keys of all records in the other; the two parts are then sorted separately in the same way, which leaves the whole sequence ordered.

Concretely, one pass uses two indices, low and high, initialized to the first and last positions of the range, and takes the key of the pivot record as pivotkey. First search from the position indicated by high toward the front for the first record whose key is less than pivotkey and exchange it with the pivot record; then search from the position indicated by low toward the back for the first record whose key is greater than pivotkey and exchange it with the pivot record. Repeat these two steps until low = high.


Sample Code:

void Swap(ElementType *left, ElementType *right)
{
    ElementType temp = *left;
    *left = *right;
    *right = temp;
}

int Partition(ElementType A[], int low, int high)
{
    ElementType pivotkey = A[low]; /* the first record is the pivot */
    while (low < high) {
        while (low < high && A[high] >= pivotkey)
            high--;
        Swap(&A[low], &A[high]);
        while (low < high && A[low] <= pivotkey)
            low++;
        Swap(&A[low], &A[high]);
    }
    return low; /* final position of the pivot */
}

void QSort(ElementType A[], int low, int high)
{
    int pivotloc;
    if (low < high) {
        pivotloc = Partition(A, low, high);
        QSort(A, low, pivotloc - 1);
        QSort(A, pivotloc + 1, high);
    }
}

void QuickSort(ElementType A[], int low, int high)
{
    QSort(A, low, high);
}


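The average running time of quicksort is O(n log n); among the sorting methods of that order it is considered to have the best average performance. In the worst case, for example when the input is already sorted and the first record is taken as the pivot, it degrades to O(n^2).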


6. Merge sort

Basic idea: merging combines two or more ordered tables into one new ordered table. 2-way merge sort: an initial sequence of n records can be regarded as n ordered subsequences of length 1. Merging them in pairs yields about n/2 ordered subsequences of length 2 (the last one may have length 1); these are merged in pairs again, and so on, until a single ordered sequence of length n is obtained.


Sample Code:

#include <stdio.h>
#include <stdlib.h>

/* Lpos = start of left half, Rpos = start of right half */
void Merge(ElementType A[], ElementType TmpArray[], int Lpos, int Rpos, int RightEnd)
{
    int i, LeftEnd, NumElements, TmpPos;
    LeftEnd = Rpos - 1;
    TmpPos = Lpos;
    NumElements = RightEnd - Lpos + 1;
    /* main loop */
    while (Lpos <= LeftEnd && Rpos <= RightEnd)
        if (A[Lpos] <= A[Rpos])
            TmpArray[TmpPos++] = A[Lpos++];
        else
            TmpArray[TmpPos++] = A[Rpos++];
    while (Lpos <= LeftEnd) /* Copy rest of first half */
        TmpArray[TmpPos++] = A[Lpos++];
    while (Rpos <= RightEnd) /* Copy rest of second half */
        TmpArray[TmpPos++] = A[Rpos++];
    /* Copy TmpArray back */
    for (i = 0; i < NumElements; i++, RightEnd--)
        A[RightEnd] = TmpArray[RightEnd];
}

void MSort(ElementType A[], ElementType TmpArray[], int Left, int Right)
{
    int Center;
    if (Left < Right) {
        Center = (Left + Right) / 2;
        MSort(A, TmpArray, Left, Center);
        MSort(A, TmpArray, Center + 1, Right);
        Merge(A, TmpArray, Left, Center + 1, Right);
    }
}

void Mergesort(ElementType A[], int N)
{
    ElementType *TmpArray;
    TmpArray = (ElementType *)malloc(N * sizeof(ElementType));
    if (TmpArray == NULL) {
        fprintf(stderr, "no space for tmp array!\n");
        return;
    }
    MSort(A, TmpArray, 0, N - 1);
    free(TmpArray);
}

Merge sort is relatively efficient. For a sequence of length n, splitting it down to single elements takes about log n levels, and merging the ordered subsequences at each level takes O(n) time in total, so the time complexity is O(n log n). Nevertheless it is seldom used for main-memory sorting. The main problem is that merging two sorted tables requires linear extra memory, and the algorithm spends additional work copying data to a temporary array and back, which slows the sort considerably.
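The basic idea above describes the merging bottom-up (runs of length 1, then 2, then 4, and so on), whereas the sample code reaches the same result recursively. A non-recursive version that follows the description literally might look like the sketch below. The names `MergeRuns` and `MergesortBottomUp`, the half-open index ranges, and the integer return code are choices made for this example, not part of the original sample.

```c
#include <stdlib.h>
#include <string.h>

typedef int ElementType; /* assumption: integer keys */

/* Merge the sorted runs src[lo..mid-1] and src[mid..hi-1] into dst[lo..hi-1]. */
static void MergeRuns(const ElementType src[], ElementType dst[],
                      int lo, int mid, int hi)
{
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        dst[k++] = (src[i] <= src[j]) ? src[i++] : src[j++];
    while (i < mid)
        dst[k++] = src[i++];
    while (j < hi)
        dst[k++] = src[j++];
}

/* Returns 0 on success, -1 if the temporary array cannot be allocated. */
int MergesortBottomUp(ElementType A[], int N)
{
    ElementType *Tmp;
    int width, lo;
    if (N < 2)
        return 0;
    Tmp = malloc((size_t)N * sizeof *Tmp);
    if (Tmp == NULL)
        return -1;
    /* width is the length of the runs that are already sorted */
    for (width = 1; width < N; width *= 2) {
        for (lo = 0; lo < N; lo += 2 * width) {
            int mid = (lo + width < N) ? lo + width : N;
            int hi = (lo + 2 * width < N) ? lo + 2 * width : N;
            MergeRuns(A, Tmp, lo, mid, hi); /* a lone trailing run is just copied */
        }
        memcpy(A, Tmp, (size_t)N * sizeof *Tmp);
    }
    free(Tmp);
    return 0;
}
```

The outer loop runs about log n times and each pass touches every element once, which matches the O(n log n) estimate above. The temporary array and the copy back after each pass are the linear extra memory and copying cost mentioned in the text.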


7. Heap sort

A heap is a complete binary tree in which either every node's value is greater than or equal to the values of its left and right children (a max-heap, or "large top heap"), or every node's value is less than or equal to the values of its left and right children (a min-heap).


Heap sort is sorting by means of a heap. The basic idea is to build the sequence to be sorted into a max-heap, so that the maximum of the whole sequence is at the root. Remove it (in practice, swap it with the last element of the heap array, so that the last element now holds the maximum), then rebuild the remaining n-1 elements into a heap, which brings the second-largest of the n elements to the root. Repeating this yields an ordered sequence. The time complexity is O(n log n), better than the O(n^2) of bubble sort, simple selection sort, and direct insertion sort.


A note on space: a naive heap sort that repeatedly deletes the root and stores it in a second array has the problem that it uses an additional array, so storage needs are doubled. Copying the second array back to the first takes only O(n) and does not significantly affect the running time; the issue is purely one of space. Swapping the root with the last element of the shrinking heap, as described above, avoids the extra array.


Sample Code:

#define LeftChild(i) (2 * (i) + 1)

void Swap(ElementType *pa, ElementType *pb)
{
    /* exchange the values pointed to; swapping the local pointers would do nothing */
    ElementType tmp = *pa;
    *pa = *pb;
    *pb = tmp;
}

/* Sift A[i] down within the heap A[0..N-1] */
void PercDown(ElementType A[], int i, int N)
{
    int Child;
    ElementType Tmp;
    for (Tmp = A[i]; LeftChild(i) < N; i = Child) {
        Child = LeftChild(i);
        if (Child != N - 1 && A[Child + 1] > A[Child])
            Child++; /* pick the larger child */
        if (Tmp < A[Child])
            A[i] = A[Child];
        else
            break;
    }
    A[i] = Tmp;
}

void Heapsort(ElementType A[], int N)
{
    int i;
    for (i = N / 2; i >= 0; i--) /* BuildHeap */
        PercDown(A, i, N);
    for (i = N - 1; i > 0; i--) {
        Swap(&A[0], &A[i]); /* DeleteMax */
        PercDown(A, 0, i);
    }
}


Summary:


Although seven internal sorting methods are presented above, internal sorting methods can be grouped into a few categories: insertion sorts (direct insertion sort, Shell sort), exchange sorts (bubble sort, quicksort), selection sorts (simple selection sort, heap sort), merge sort, and radix sort.
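Radix sort is the one category listed here that has no sample code in this article. For completeness, here is a minimal sketch of a least-significant-digit radix sort. It rests on several assumptions that are not in the original text: keys are non-negative `int` values, the radix is 10, and the function name `RadixSort` and its return code are chosen for the example.

```c
#include <stdlib.h>

/* LSD radix sort for non-negative integers, radix 10.
   Returns 0 on success, -1 if the output buffer cannot be allocated. */
int RadixSort(int A[], int N)
{
    int *Out;
    int i, max, exp;
    if (N < 2)
        return 0;
    Out = malloc((size_t)N * sizeof *Out);
    if (Out == NULL)
        return -1;
    max = A[0];
    for (i = 1; i < N; i++)
        if (A[i] > max)
            max = A[i];
    /* one stable counting pass per decimal digit of the largest key */
    for (exp = 1; max / exp > 0; exp *= 10) {
        int count[10] = {0};
        for (i = 0; i < N; i++)
            count[(A[i] / exp) % 10]++;
        for (i = 1; i < 10; i++)
            count[i] += count[i - 1];   /* prefix sums give end positions */
        for (i = N - 1; i >= 0; i--)    /* walking backwards keeps the pass stable */
            Out[--count[(A[i] / exp) % 10]] = A[i];
        for (i = 0; i < N; i++)
            A[i] = Out[i];
        if (exp > max / 10)
            break;                      /* last digit done; avoids overflow in exp *= 10 */
    }
    free(Out);
    return 0;
}
```

No keys are compared with each other; each pass only distributes records by one digit, so the work is proportional to the number of digits times n.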


Comparison of internal sorting methods:


1. In terms of average time performance, quicksort is the best and takes the least time. In the worst case, however, its time performance is inferior to heap sort and merge sort. Comparing the latter two, when n is large merge sort takes less time than heap sort, but it needs the most auxiliary storage.

2. The simple sorts comprise all the insertion sorts except Shell sort, plus bubble sort and simple selection sort. Of these, direct insertion sort is the simplest. When the records in the sequence are "basically ordered" or n is small, it is the best sorting method, and it is therefore often combined with other sorting methods such as quicksort and merge sort.

3. The time complexity of radix sort can be written as O(d * n), where d is the number of digits in a key. It is best suited to sequences with large n and short keys. If the keys are long and the "most significant digit" differs for most records in the sequence, the sequence can first be split into several "small" subsequences by most significant digit, and each subsequence can then be sorted by direct insertion.

4. In terms of stability, radix sort is a stable internal sorting method, and the simple sorting methods with O(n^2) time complexity are generally stable as well (direct insertion sort and bubble sort are; simple selection sort implemented with swaps, as in the sample above, is not). Merge sort is also stable. Quicksort, heap sort, Shell sort, and other methods with better time performance are unstable.
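Stability is easy to check by sorting records that carry a tag alongside the key. The sketch below uses a hypothetical `Item` type and two function names introduced only for this example. It suggests that direct insertion sort keeps equal keys in their original order, while a selection sort that exchanges records, like the sample in section 4, can reorder them.

```c
/* Hypothetical record: records with equal keys are told apart by tag. */
typedef struct {
    int key;
    char tag;
} Item;

/* Direct insertion sort on Item records, ordered by key. */
void InsertionSortItems(Item a[], int n)
{
    int i, j;
    for (i = 1; i < n; i++) {
        Item tmp = a[i];
        /* strict > means an equal key is never moved past an earlier one */
        for (j = i; j > 0 && a[j - 1].key > tmp.key; j--)
            a[j] = a[j - 1];
        a[j] = tmp;
    }
}

/* Simple selection sort on Item records, using exchanges. */
void SelectionSortItems(Item a[], int n)
{
    int i, j, min;
    for (i = 0; i < n - 1; i++) {
        min = i;
        for (j = i + 1; j < n; j++)
            if (a[j].key < a[min].key)
                min = j;
        if (min != i) {
            /* this long-distance exchange can jump a[i] over an equal key */
            Item tmp = a[i];
            a[i] = a[min];
            a[min] = tmp;
        }
    }
}
```

For the input (2,a), (2,b), (1,c), the insertion sort yields (1,c), (2,a), (2,b), while the selection sort yields (1,c), (2,b), (2,a): the two records with key 2 have changed places.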

