20 basic algorithms required by intermediate programmers (1) -- quick sort

Source: Internet
Author: User

Entry-level study of algorithms generally begins with "sorting" and "searching". Sorting algorithms and their sibling, search algorithms, are the foundation of many complex algorithms and many complex systems. For example, virtual memory management in Linux, one of the kernel's most complex subsystems, is built on the red-black tree search structure; Solaris uses the AVL tree; MySQL indexes are based on the B-tree; P2P technology is built on DHT (distributed hash table) lookups... (to be supplemented)

Today we introduce the top 20 basic algorithms that intermediate programmers must understand. By walking through the basic sorting and search algorithms, we can review the fundamentals of computer algorithms and pave the way for the more complex ones. Let's start with sorting.

Before looking at individual sorting algorithms, I want to make a simple comparison -- first an empirical one (more in line with my personality). With a small test program we can compare the running time of the nine classical sorting algorithms from data structures. My environment was:

VC6.0 (release build) + Windows 2000 Pro + 128 MB DDR + Pentium 4 (1.6 GHz)

In a multitasking operating system, the scheduler affects the measurements, so the values below are slightly corrected. To obtain more accurate numbers, one would average over many runs.

Comparison of sorting algorithm experiments (unit: seconds)

 

Method                     1K     10K     100K     200K     100K forward  100K reverse
Bubble sort                0      0.422   44.790   188.462  0             31.459
Bidirectional bubble sort  0      0.281   30.335   131.771  0             27.568
Quick sort                 0      0       0.016    0.047    5.095         7.002
Direct selection sort      0      0.141   16.878   79.332   16.785        33.242
Heap sort                  0      0       0.031    0.109    0.031         0.015
Direct insertion sort      0      0.047   8.705    57.800   0             24.865
Shell sort                 0      0       0.047    0.110    0.015         0.015
Merge sort                 0      0       0.031    0.094    0.032         0.032
Radix sort                 0      0       0.47     0.109    0.047         0.046

(The last two columns use 100K elements that are already in forward order and in reverse order, respectively.)

Now that we have an empirical feel for the numbers, let's look at them rationally:

(1) Stability comparison
Insertion sort, bubble sort, binary tree sort, two-way merge sort, and the linear sorts are stable (a short illustration follows after these comparisons).
Selection sort, Shell sort, quick sort, and heap sort are unstable.

(2) Time complexity comparison
Insertion sort, bubble sort, and selection sort have time complexity O(n^2).
The other nonlinear sorts have time complexity O(n log2 n).
The linear sorts have time complexity O(n). (The linear sorts include counting sort and radix sort, which will be introduced later.)

(3) Auxiliary space comparison
The linear sorts and two-way merge sort use O(n) auxiliary space; the other sorts use O(1).

(4) Other comparisons
Insertion sort and bubble sort are slow in general, but when the sequence is partially or fully ordered they can be very fast -- and in exactly that case quick sort slows down.
When n is small and stability is not required, use selection sort; when stability is required, use insertion sort or bubble sort.
If the keys of the records to be sorted fall within an obviously limited range and space permits, bucket sort is suitable.
When n is large, the keys are random, and stability is not required, use quick sort.
When n is large, the keys may be partially ordered, stability is required, and space permits, use merge sort.
When n is large, the keys may be partially ordered, and stability is not required, use heap sort.
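Before moving on, here is a concrete illustration of the stability point from (1) above (my addition, not from the original article): under a stable sort, records with equal keys keep their original relative order. The sketch uses the standard library's std::stable_sort; any of the stable algorithms listed above would behave the same way.

#include <algorithm>
#include <iostream>
#include <utility>

int main()
{
    // Records with duplicate keys; the letter marks the original order.
    std::pair<int, char> items[] = { {2, 'a'}, {1, 'b'}, {2, 'c'}, {1, 'd'} };

    // A stable sort must keep {1,'b'} before {1,'d'} and {2,'a'} before {2,'c'}.
    std::stable_sort(items, items + 4,
        [](const std::pair<int, char>& x, const std::pair<int, char>& y)
        { return x.first < y.first; });

    for (const auto& p : items)
        std::cout << p.first << p.second << " ";  // prints: 1b 1d 2a 2c
    std::cout << "\n";
    return 0;
}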

In summary, the most common case in practice is that n is large, the elements are random, and stability is generally not required -- which leads us to quick sort.

The quick sort algorithm works as follows. First, check the number of items in the data list; if it is less than two, return immediately. Otherwise, select a partition point (pivot) and divide the data into two parts: items smaller than the pivot go into one group, the rest into the other. Then sort the two groups recursively. Generally the pivot is selected at random, so that whether or not your data is already partially sorted, the two sublists you split off come out roughly similar in size.

#include <iostream>

void Run(int* pData, int left, int right)
{
    int i = left, j = right;
    int middle = pData[(left + right) / 2];  // take the middle element as pivot
    int iTemp;
    do
    {
        while ((pData[i] < middle) && (i < right))  // scan from the left for an element >= pivot
            i++;
        while ((pData[j] > middle) && (j > left))   // scan from the right for an element <= pivot
            j--;
        if (i <= j)  // found a pair to swap
        {
            iTemp = pData[i];
            pData[i] = pData[j];
            pData[j] = iTemp;
            i++;
            j--;
        }
    } while (i <= j);  // stop once the two scans cross (one partition done)

    if (left < j)   // values remain on the left: recurse into the left part
        Run(pData, left, j);
    if (right > i)  // values remain on the right: recurse into the right part
        Run(pData, i, right);
}

void QuickSort(int* pData, int Count)
{
    Run(pData, 0, Count - 1);
}

int main()
{
    int data[] = {10, 9, 8, 7, 6, 5, 4};
    QuickSort(data, 7);
    for (int i = 0; i < 7; i++)
        std::cout << data[i] << " ";
    std::cout << "\n";
    return 0;
}

Quick sort is an improvement on bubble sort: each exchange eliminates multiple inversions at once, reducing the scans and swaps that reverse-ordered data would otherwise cost. In the best case its time complexity is O(n log2 n), i.e., every partition splits the sequence into two equal halves. In the worst case, however, the complexity is O(n^2): every partition leaves one subsequence empty and the other of size n-1 (the 100K forward- and reverse-order runs in the experiment behaved exactly like this; had the test program used the middle element of each subsequence as the partition point, forward and reverse order would instead have been the optimal cases). The forward-order results also show quick sort running slower than bubble sort, mainly because that bubble sort detects an already-sorted sequence and stops early. Some books explain the name this way: in theory, if the sequence can be split evenly every time, it is the fastest sorting algorithm, hence "quick sort". Even though an even split is hard to guarantee, in terms of average performance it is still the fastest of the comparison-based internal sorting algorithms.
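Since the worst case comes from an unlucky partition point, a common mitigation -- shown here as a sketch of my own, not part of the original program, which takes the middle element -- is to pick the pivot at random, which makes the O(n^2) case unlikely for any fixed input ordering:

#include <cstdlib>  // rand(); seed with srand() if desired

// Variant of the Run() above that picks a random pivot instead of the
// middle element. The partition scheme is unchanged.
void RunRandomPivot(int* pData, int left, int right)
{
    int i = left, j = right;
    int middle = pData[left + rand() % (right - left + 1)];  // random pivot value
    int iTemp;
    do
    {
        while ((pData[i] < middle) && (i < right)) i++;
        while ((pData[j] > middle) && (j > left))  j--;
        if (i <= j)
        {
            iTemp = pData[i]; pData[i] = pData[j]; pData[j] = iTemp;
            i++; j--;
        }
    } while (i <= j);

    if (left < j)  RunRandomPivot(pData, left, j);
    if (right > i) RunRandomPivot(pData, i, right);
}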

 

The core idea of bubble sort is to scan the data list looking for two adjacent items that are out of order; when such a pair is found, swap them and continue scanning. Repeat until all items are in order:

#include <iostream>

void BubbleSort(int* pData, int Count)
{
    int iTemp;
    for (int i = 1; i < Count; i++)
    {
        // each pass sinks the smallest remaining element to position i-1
        for (int j = Count - 1; j >= i; j--)
        {
            if (pData[j] < pData[j-1])
            {
                iTemp = pData[j-1];
                pData[j-1] = pData[j];
                pData[j] = iTemp;
            }
        }
    }
}

int main()
{
    int data[] = {10, 9, 8, 7, 6, 5, 4};
    BubbleSort(data, 7);
    for (int i = 0; i < 7; i++)
        std::cout << data[i] << " ";
    std::cout << "\n";
    return 0;
}

In the best case, the result is obtained after only n-1 comparisons (the best case being forward-ordered input, as the forward-order column of the experiment shows), but in the worst case -- reverse order, or a small value sitting at the end -- this sinking algorithm needs n(n-1)/2 comparisons. So in general, and especially on reverse-ordered data, its performance is poor. It is a sorting algorithm that is very sensitive to how ordered the data already is.
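Note that the BubbleSort listing above always performs all n(n-1)/2 comparisons; the n-1 best case (and the 0-second forward-order timing) presupposes a version that stops when a pass makes no exchange. A minimal sketch of that variant, reconstructed here rather than taken from the article:

// Bubble sort with an exchange flag: if a full pass swaps nothing, the
// sequence is already sorted and we can stop early. On forward-ordered
// input this needs only n-1 comparisons, matching the analysis above.
void BubbleSortEarlyExit(int* pData, int Count)
{
    for (int i = 1; i < Count; i++)
    {
        bool swapped = false;
        for (int j = Count - 1; j >= i; j--)
        {
            if (pData[j] < pData[j-1])
            {
                int iTemp = pData[j-1];
                pData[j-1] = pData[j];
                pData[j] = iTemp;
                swapped = true;
            }
        }
        if (!swapped)  // no exchange in this pass: already sorted
            return;
    }
}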
 
Ordinary bubble sort is unidirectional; there is also a bidirectional bubble algorithm that additionally works in reverse. The two-way bubble code looks complicated, but after a careful look you can see it is simply a back-and-forth wave. The author believes this version saves some exchanges compared with plain bubbling:

#include <iostream>

void Bubble2Sort(int* pData, int Count)
{
    int iTemp;
    int left = 1;
    int right = Count - 1;
    int t = left;  // last swap position; initialized so a pass without swaps still terminates
    do
    {
        // forward pass: sink smaller values toward the front
        for (int i = right; i >= left; i--)
        {
            if (pData[i] < pData[i-1])
            {
                iTemp = pData[i];
                pData[i] = pData[i-1];
                pData[i-1] = iTemp;
                t = i;
            }
        }
        left = t + 1;

        // backward pass: float larger values toward the back
        for (int i = left; i < right + 1; i++)
        {
            if (pData[i] < pData[i-1])
            {
                iTemp = pData[i];
                pData[i] = pData[i-1];
                pData[i-1] = iTemp;
                t = i;
            }
        }
        right = t - 1;
    } while (left <= right);
}

int main()
{
    int data[] = {10, 9, 8, 7, 6, 5, 4};
    Bubble2Sort(data, 7);
    for (int i = 0; i < 7; i++)
        std::cout << data[i] << " ";
    std::cout << "\n";
    return 0;
}

This is an improvement on bubble sort (sink, then float back). Its best and worst cases are similar to bubble sort's, but it is generally somewhat better, because it avoids a long run of comparisons caused by a single out-of-place value. For example, on the sequence (2, 3, ..., n-1, n, 1), bubble sort needs n(n-1)/2 comparisons, while this sort takes only three passes, for a total of (n-1) + (n-2) + (n-3) comparisons: the first pass moves the 1 up to the first position, the second pass confirms the rest, and the third pass finds no data exchange, so the sort ends. However, it is still very sensitive to data orderliness and is only suitable for data that is already close to sorted.

 

So far this article has covered quick sort and its stable relative, bubble sort. Next we introduce three simple sorts; they are easy to implement and serve here as comparison baselines for quick sort.

(1) Direct selection sort

Algorithm idea: find the smallest value in the data list and swap it into the first position; then find the second smallest and swap it into the second position, and so on.

This method matches how people tend to sort by hand: pick the smallest value and exchange it with the first element, then pick the smallest of what remains and exchange it with the second.

#include <iostream>

void SelectSort(int* pData, int Count)
{
    int iTemp;
    int iPos;
    for (int i = 0; i < Count - 1; i++)
    {
        // find the smallest element in pData[i..Count-1]
        iTemp = pData[i];
        iPos = i;
        for (int j = i + 1; j < Count; j++)
        {
            if (pData[j] < iTemp)
            {
                iTemp = pData[j];
                iPos = j;
            }
        }
        // swap it into position i
        pData[iPos] = pData[i];
        pData[i] = iTemp;
    }
}

int main()
{
    int data[] = {10, 9, 8, 7, 6, 5, 4};
    SelectSort(data, 7);
    for (int i = 0; i < 7; i++)
        std::cout << data[i] << " ";
    std::cout << "\n";
    return 0;
}

In simple selection sort the number of comparisons is fixed: n(n-1)/2. So no input ordering gives it outstanding performance (the 100K forward- and reverse-order columns show it is time-consuming either way; the difference is only the data-movement time), which means it is insensitive to data orderliness. Although it makes many comparisons, it performs very few data exchanges, so in practice it usually comes out faster than bubble sort.

(2) Direct insertion sort

The insertion method is a little more involved. Its basic principle is that of drawing cards: take the next card, find its place among the cards already in hand, insert it there, and move on to the next. Algorithm idea: after the (i-1)-th step, L[1..i-1] is sorted. The i-th step inserts L[i] into its proper position in L[1..i-1], so that L[1..i] is again a sorted sequence. To do this we can compare sequentially: first compare L[i] with L[i-1]; if L[i-1] <= L[i], then L[1..i] is already sorted and the i-th step is done. Otherwise swap L[i] with L[i-1] and continue comparing with L[i-2], until a position j (1 <= j <= i-1) is found such that L[j] <= L[j+1].

 

#include <iostream>

void InsertSort(int* pData, int Count)
{
    int iTemp;
    int iPos;
    for (int i = 1; i < Count; i++)
    {
        // shift larger elements right, then drop pData[i] into its slot
        iTemp = pData[i];
        iPos = i - 1;
        while ((iPos >= 0) && (iTemp < pData[iPos]))
        {
            pData[iPos+1] = pData[iPos];
            iPos--;
        }
        pData[iPos+1] = iTemp;
    }
}

int main()
{
    int data[] = {10, 9, 8, 7, 6, 5, 4};
    InsertSort(data, 7);
    for (int i = 0; i < 7; i++)
        std::cout << data[i] << " ";
    std::cout << "\n";
    return 0;
}

Simple insertion sort removes at most one inversion per comparison, so its efficiency class is the same as bubble sort's. It is nevertheless faster, because bubble sort exchanges values while insertion sort merely shifts them; thus direct insertion beats bubble sort. Direct insertion is also extremely sensitive to data orderliness: on sorted input it needs only n-1 comparisons, while in the worst case it needs n(n-1)/2.

(3) Merge sort

The merge operation combines two already-sorted sequences into one.
Take the sequence {6, 202, 100, 301, 38, 8, 1}:

Initial state:  [6] [202] [100] [301] [38] [8] [1]     comparisons
Pass i = 1:     [6 202] [100 301] [8 38] [1]           3
Pass i = 2:     [6 100 202 301] [1 8 38]               4
Pass i = 3:     [1 6 8 38 100 202 301]                 4
Total:                                                 11

void MergeSort(int array[], int first, int last)
{
    if (first < last)
    {
        int mid = (first + last) / 2;
        MergeSort(array, first, mid);     // sort the left half
        MergeSort(array, mid + 1, last);  // sort the right half
        Merge(array, first, mid, last);   // merge the two sorted halves
    }
}
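The listing calls a Merge() routine that the article never shows. Here is a minimal sketch of what it might look like, assuming a temporary buffer for the merged run (this is the O(n) auxiliary space discussed below; the implementation details are my reconstruction):

#include <vector>

// Merge the two adjacent sorted ranges array[first..mid] and
// array[mid+1..last] into one sorted range via a temporary buffer.
void Merge(int array[], int first, int mid, int last)
{
    std::vector<int> temp;
    temp.reserve(last - first + 1);
    int i = first, j = mid + 1;
    while (i <= mid && j <= last)  // <= keeps equal keys in order (stability)
        temp.push_back(array[i] <= array[j] ? array[i++] : array[j++]);
    while (i <= mid)  temp.push_back(array[i++]);   // drain the left run
    while (j <= last) temp.push_back(array[j++]);   // drain the right run
    for (int k = 0; k < (int)temp.size(); k++)      // copy the merged run back
        array[first + k] = temp[k];
}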

 

As you can see, the merge algorithm is recursive and very simple to express. Merge sort is not an in-place sort: it needs as much auxiliary space as the sequence being sorted. When it is used to merge two sequences that are already sorted, it has an unbeatable advantage. Its time complexity is O(n log2 n) in both the best and the worst case, and it is insensitive to data orderliness. If the records themselves carry a lot of data, moving them is expensive and merge sort is less suitable -- but if you sort an index to the records instead, as sketched below, the results are excellent.
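To make the index idea concrete, here is a small sketch (my addition; the type and names are illustrative): sort an array of indices by key, so only integers move while the large records stay put.

#include <algorithm>
#include <iostream>
#include <string>

struct BigRecord       // stand-in for a record whose payload is expensive to move
{
    int key;
    std::string payload;  // imagine this field is large
};

int main()
{
    BigRecord records[] = { {3, "ccc"}, {1, "aaa"}, {2, "bbb"} };

    // Sort the indices by key; the records themselves never move.
    int index[] = {0, 1, 2};
    std::sort(index, index + 3,
        [&](int a, int b) { return records[a].key < records[b].key; });

    for (int i = 0; i < 3; i++)
        std::cout << records[index[i]].key << " " << records[index[i]].payload << "\n";
    return 0;
}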

Finally, here is the most elegant quick sort code I found on the Internet, written with a C++ class and a function template, to close this article:

// MyData.h file
class CMyData
{
public:
    CMyData();
    CMyData(int index, const char* strData);
    virtual ~CMyData();

    int m_iIndex;
    int GetDataSize() const { return m_iDataSize; }
    const char* GetData() const { return m_strDataMember; }

    // operators overloaded here so the template sort can assign and compare objects
    CMyData& operator=(const CMyData& srcData);
    bool operator<(const CMyData& data) const;
    bool operator>(const CMyData& data) const;

private:
    char* m_strDataMember;
    int m_iDataSize;
};

// MyData.cpp file
#include <cstring>  // strlen, strcpy
#include "MyData.h"

CMyData::CMyData() : m_iIndex(0), m_strDataMember(0), m_iDataSize(0)
{
}

CMyData::~CMyData()
{
    delete[] m_strDataMember;
    m_strDataMember = 0;
}

CMyData::CMyData(int index, const char* strData)
    : m_iIndex(index), m_strDataMember(0), m_iDataSize(0)
{
    m_iDataSize = strlen(strData);
    m_strDataMember = new char[m_iDataSize + 1];
    strcpy(m_strDataMember, strData);
}

CMyData& CMyData::operator=(const CMyData& srcData)
{
    m_iIndex = srcData.m_iIndex;
    m_iDataSize = srcData.GetDataSize();
    delete[] m_strDataMember;  // release the old buffer (fixes a leak in the original)
    m_strDataMember = new char[m_iDataSize + 1];
    strcpy(m_strDataMember, srcData.GetData());
    return *this;
}

bool CMyData::operator<(const CMyData& data) const
{
    return m_iIndex < data.m_iIndex;
}

bool CMyData::operator>(const CMyData& data) const
{
    return m_iIndex > data.m_iIndex;
}

// main program
#include <iostream>
#include "MyData.h"

template <class T>
void Run(T* pData, int left, int right)
{
    int i = left, j = right;
    T middle, iTemp;
    middle = pData[(left + right) / 2];  // take the middle element as pivot;
                                         // this assignment and the comparisons
                                         // below call our overloaded operators
    do
    {
        while ((pData[i] < middle) && (i < right))  // scan from the left for an element >= pivot
            i++;
        while ((pData[j] > middle) && (j > left))   // scan from the right for an element <= pivot
            j--;
        if (i <= j)  // found a pair to swap
        {
            iTemp = pData[i];
            pData[i] = pData[j];
            pData[j] = iTemp;
            i++;
            j--;
        }
    } while (i <= j);  // stop once the two scans cross

    if (left < j)   // recurse into the non-empty left part
        Run(pData, left, j);
    if (right > i)  // recurse into the non-empty right part
        Run(pData, i, right);
}

template <class T>
void QuickSort(T* pData, int Count)
{
    Run(pData, 0, Count - 1);
}

int main()
{
    CMyData data[] = {
        CMyData(8, "xulion"),    CMyData(7, "sanzoo"),
        CMyData(6, "wangjun"),   CMyData(5, "vckbase"),
        CMyData(4, "jacky2000"), CMyData(3, "cwally"),
        CMyData(2, "vcuser"),    CMyData(1, "isdong")
    };
    QuickSort(data, 8);
    for (int i = 0; i < 8; i++)
        std::cout << data[i].m_iIndex << " " << data[i].GetData() << "\n";
    std::cout << "\n";
    return 0;
}
