First, bubble sort
The principle is pairwise exchange: larger values drift step by step toward one end, like bubbles rising through water. The sort is simple enough to need little explanation.
Here is the code:

#include <stdio.h>

void BubbleSort(int arr[], int len)
{
    int i, j;
    for (i = 0; i < len - 1; ++i)
    {
        for (j = 0; j < len - 1 - i; ++j)   /* elements already bubbled to the end are skipped */
        {
            if (arr[j] > arr[j + 1])
            {
                int tmp = arr[j];
                arr[j] = arr[j + 1];
                arr[j + 1] = tmp;
            }
        }
    }
}

void Show(int arr[], int len)
{
    for (int i = 0; i < len; i++)
    {
        printf("%d ", arr[i]);
    }
    printf("\n");
}

int main()
{
    /* the original array literal was garbled; sample values used instead */
    int arr[] = {-1, 22, 3, 13, 3, 7, 6, 41, 25, 12, 5, 10, 1, 30};
    int len = sizeof(arr) / sizeof(arr[0]);

    BubbleSort(arr, len);
    Show(arr, len);
    return 0;
}
Bubble Sort Summary:
Time complexity: average case O(n²), best case O(n), worst case O(n²)
Space complexity: O(1)
Stability: Stable
Next, another exchange-class sort, and a considerably more involved one: quick sort.
Second, quick sort
Quick sort earns its name: speed is its strength. As the saying goes, in all the martial arts under heaven, only speed is unbeatable. So it is well worth studying.
The partition step involves three things: a left pointer, a right pointer, and a base reference value (the pivot).
The idea is quite simple: a single pass, during which the left and right pointers move toward each other until they coincide, finds the cutting point of the array.
Suppose the array to be sorted is 12, 32, 44, 3, -1, 52.
First step: take the number in the left position of the array (12) as the base reference.
Second step: scan from the right position of the array toward the left until a number smaller than base is found.
When it is found (-1), assign it to the left position (that is, -1 overwrites 12).
The array is now: -1, 32, 44, 3, -1, 52,
and the left and right pointers sit on the front and back copies of -1 respectively.
Third step: scan from the left position of the array toward the right until a number larger than base is found.
When it is found (32), assign it to the right position (that is, 32 overwrites the duplicate -1).
The array is now: -1, 32, 44, 3, 32, 52,
and the left and right pointers sit on the front and back copies of 32 respectively.
Fourth step: repeat the second and third steps until the left and right pointers coincide.
Finally, write base (12) into the coincidence position.
The array is now: -1, 3, 12, 44, 32, 52, which completes one partition.
Fifth step: 12 has now settled into its final place inside the array: every number to the left of 12 is smaller than 12, and every number to the right is larger than 12.
Taking 12 as the split point, apply the first through fourth steps recursively to the left and right parts; when the recursion finishes, the quick sort is complete.
int Partition(int arr[], int low, int high)   /* returns the pivot's final position */
{
    int i = low;
    int j = high;
    int tmp = arr[i];
    while (i < j)
    {
        while (i < j && arr[j] >= tmp) j--;
        arr[i] = arr[j];
        while (i < j && arr[i] <= tmp) i++;
        arr[j] = arr[i];
    }
    arr[i] = tmp;
    return i;
}

void QuickSort(int arr[], int low, int high)   /* quick sort, recursive implementation */
{
    if (low < high)
    {
        int key = Partition(arr, low, high);
        QuickSort(arr, low, key - 1);
        QuickSort(arr, key + 1, high);
    }
}
Three, optimizing quick sort
First, the summary for the plain quick sort:
Time complexity: average case O(n log₂ n), best case O(n log₂ n), worst case O(n²)
Space complexity: O(log₂ n) on average for the recursion stack, O(n) in the worst case
Stability: unstable
Quick sort's disadvantages are that it is unstable and that the deep recursion consumes stack space, giving it a nontrivial space cost.
As you can see, quick sort has good average performance, but its worst-case complexity is the same as insertion sort's: O(n²). For example, to sort the sequence 5, 4, 3, 2, 1 into 1, 2, 3, 4, 5, each partition places only one element in its correct position and fails to split the data into two parts. The recursion tree degenerates into a skewed tree of depth n, and the time complexity becomes O(n²). So as long as we keep the initial sequence from being (nearly) ordered, good performance is assured.
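To make the degenerate case concrete, here is a small sketch (the function names and the depth counter are illustrative, not from the original post) that instruments the recursion and shows that on the input 5, 4, 3, 2, 1 every partition strips off only one element, so the recursion reaches depth n:

```c
/* Fill-from-both-ends partition, same scheme as in the text:
 * the first element is the pivot; returns its final index. */
int PartitionDemo(int arr[], int low, int high)
{
    int tmp = arr[low];
    while (low < high)
    {
        while (low < high && arr[high] >= tmp) high--;
        arr[low] = arr[high];
        while (low < high && arr[low] <= tmp) low++;
        arr[high] = arr[low];
    }
    arr[low] = tmp;
    return low;
}

int g_max_depth = 0;   /* deepest recursion level reached */

/* Quick sort instrumented to record its recursion depth. */
void QuickSortDepth(int arr[], int low, int high, int depth)
{
    if (depth > g_max_depth)
        g_max_depth = depth;
    if (low < high)
    {
        int k = PartitionDemo(arr, low, high);
        QuickSortDepth(arr, low, k - 1, depth + 1);
        QuickSortDepth(arr, k + 1, high, depth + 1);
    }
}
```

Running QuickSortDepth on {5, 4, 3, 2, 1} with an initial depth of 1 drives g_max_depth to 5, i.e. to n, whereas a balanced input of the same size stays near log₂ n.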
1. How to choose the pivot element
For a divide-and-conquer algorithm, efficiency is highest when each partition splits the sequence into two sub-sequences of equal length. In other words, choosing the pivot matters: the pivot determines the lengths of the two sub-sequences and therefore has a decisive effect on the efficiency of the whole algorithm.
Ideally, the chosen pivot would split the sequence into two equal-length sub-sequences, though this is hard to achieve in practice.
Method one: fixed pivot (basic quick sort)
Idea: take the first or last element of the sequence as the pivot.
If the input sequence is random, the running time is acceptable. But if the array is already ordered, this choice gives the worst possible split: each partition shrinks the sequence to be sorted by only one element, quick sort degenerates into something like bubble sort, and the time complexity becomes Θ(n²). Moreover, ordered or partially ordered input is quite common in practice. Using the first element as the pivot is therefore a poor idea, and to avoid it the following two pivot-selection methods were introduced.
Method two: random pivot
Idea: take an arbitrary element of the sequence to be sorted as the pivot.
Motivation: when the sequence is partially ordered, a fixed pivot makes quick sort degrade badly; choosing the pivot at random alleviates this.
This is a relatively safe strategy: because the pivot's position is random, the resulting splits will not consistently be bad ones. (When all elements of the array are equal, the worst case still occurs and the time complexity is still O(n²).) In fact, the probability that randomized quick sort hits its theoretical worst case is only about 1/2ⁿ, so for the vast majority of inputs it achieves the expected O(n log n) running time. As one predecessor memorably put it, randomized quick sort can satisfy nearly all of the sorting needs of a lifetime.
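A common way to implement the random pivot (a sketch; RandPartition and RandQuickSort are my names, not standard ones): swap a randomly chosen element into the low position, then run the ordinary partition unchanged.

```c
#include <stdlib.h>

/* Fill-from-both-ends partition from the text: arr[low] is the pivot. */
int PartitionR(int arr[], int low, int high)
{
    int i = low, j = high;
    int tmp = arr[i];
    while (i < j)
    {
        while (i < j && arr[j] >= tmp) j--;
        arr[i] = arr[j];
        while (i < j && arr[i] <= tmp) i++;
        arr[j] = arr[i];
    }
    arr[i] = tmp;
    return i;
}

/* Random-pivot variant: swap a randomly chosen element into the
 * low position, then partition as usual. */
int RandPartition(int arr[], int low, int high)
{
    int r = low + rand() % (high - low + 1);
    int t = arr[low]; arr[low] = arr[r]; arr[r] = t;
    return PartitionR(arr, low, high);
}

void RandQuickSort(int arr[], int low, int high)
{
    if (low < high)
    {
        int key = RandPartition(arr, low, high);
        RandQuickSort(arr, low, key - 1);
        RandQuickSort(arr, key + 1, high);
    }
}
```

Seed the generator once, e.g. srand((unsigned)time(NULL)) from <time.h>, before sorting, so the pivot choices differ between runs; the sorted result is of course the same either way.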
Method 3: median-of-three
Motivation: although a random pivot reduces the likelihood of bad splits, the worst case is still O(n²). To alleviate this further, median-of-three pivot selection was introduced.
Analysis: the best split divides the sequence to be sorted into two equal-length sub-sequences, which would require the true median of the sequence, i.e. the n/2-th smallest value. But computing the median is expensive and would noticeably slow the sort down. An estimate of the median can be obtained by picking three elements and using their middle value as the pivot. In practice randomness does not help much here, so the usual approach is to take the median of the three elements at the left, right, and center positions. Median-of-three eliminates the bad behavior on pre-sorted input and reduces quick sort's comparison count by roughly 14%.
Example: the sequence to be sorted is 8 1 4 9 6 3 5 2 7 0.
The leftmost element is 8, the rightmost is 0, and the middle one is 6.
Sorting these three numbers and taking the middle one as the pivot gives a pivot of 6.
Note: instead of three elements, the pivot can be chosen from five or more elements around the left, middle, and right positions. In general this is the median-of-(2t+1) partitioning method; the three-element version is called median-of-three.
Concrete idea: look at the elements in the low, mid, and high positions of the sequence to be sorted, take the middle one of the three as the pivot, and move it to the low position so that the ordinary first-element partition can use it.
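This idea can be sketched as follows (MedianOfThree and SwapInt are illustrative names): compare the three samples and swap the median into arr[low]. On the example sequence 8 1 4 9 6 3 5 2 7 0 above, this places the 6 at the front.

```c
/* Swap helper (illustrative). */
void SwapInt(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Order the samples at low, mid, and high so that the median of the
 * three ends up at arr[low], ready to serve as the pivot for the
 * usual first-element partition. */
void MedianOfThree(int arr[], int low, int high)
{
    int mid = low + (high - low) / 2;
    if (arr[mid] > arr[high]) SwapInt(&arr[mid], &arr[high]);  /* larger of mid/high to high */
    if (arr[low] > arr[high]) SwapInt(&arr[low], &arr[high]);  /* overall max now at high */
    if (arr[mid] > arr[low]) SwapInt(&arr[mid], &arr[low]);    /* median now at low */
}
```

After the call, arr[low] holds the median of the three samples, so a partition that pivots on the first element automatically uses it.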
2. Two optimizations
Optimization one: switch to insertion sort once a sub-sequence has been split down to a certain size
Reason: for very small and partially ordered arrays, insertion sort beats quick sort. Once a sub-sequence has been split down to a certain size, continuing to partition it is less efficient than simply insertion-sorting it, so insertion sort takes over.
Cutoff size: a sub-sequence length of n = 10 works well, although any cutoff between roughly 5 and 20 produces similar results; the cutoff also avoids some harmful degenerate cases.
For a random array, median-of-three plus the insertion-sort cutoff improves efficiency somewhat. For an array made up of many equal elements, however, it is of no use: each partition still peels off only one element, the insertion sort has no chance to help, and no time saving appears. In other words, median-of-three with an insertion-sort cutoff still cannot handle arrays with many repeated elements.
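Optimization one can be sketched like this (HybridQuickSort, InsertionSort, and PartitionH are illustrative names; the cutoff of 10 follows the suggestion above):

```c
#define CUTOFF 10   /* below this length, fall back to insertion sort */

/* Straight insertion sort on arr[low..high]. */
void InsertionSort(int arr[], int low, int high)
{
    for (int i = low + 1; i <= high; i++)
    {
        int key = arr[i];
        int j = i - 1;
        while (j >= low && arr[j] > key)
        {
            arr[j + 1] = arr[j];
            j--;
        }
        arr[j + 1] = key;
    }
}

/* Fill-from-both-ends partition from the text. */
int PartitionH(int arr[], int low, int high)
{
    int i = low, j = high;
    int tmp = arr[i];
    while (i < j)
    {
        while (i < j && arr[j] >= tmp) j--;
        arr[i] = arr[j];
        while (i < j && arr[i] <= tmp) i++;
        arr[j] = arr[i];
    }
    arr[i] = tmp;
    return i;
}

/* Quick sort that falls back to insertion sort on short ranges. */
void HybridQuickSort(int arr[], int low, int high)
{
    if (high - low + 1 <= CUTOFF)
    {
        InsertionSort(arr, low, high);   /* also handles empty ranges */
        return;
    }
    int key = PartitionH(arr, low, high);
    HybridQuickSort(arr, low, key - 1);
    HybridQuickSort(arr, key + 1, high);
}
```

The length check runs before the partition, so every range at or below the cutoff, including the empty ranges produced by unbalanced splits, goes straight to insertion sort.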
Optimization two: after each partition, gather the elements equal to the key together, and exclude them from the subsequent partitions
Example:
Sequence to be sorted: 1 4 6 7 6 6 7 6 8 6
Median-of-three selects the pivot: the 6 at subscript 4.
After swapping the pivot to the front, the sequence to be partitioned is: 6 4 6 7 1 6 7 6 8 6
Pivot key: 6
Result of one partition when the elements equal to the key are not grouped: 1 4 6 6 7 6 7 6 8 6
The next two sub-sequences are then: 1 4 6 and 7 6 7 6 8 6
Result of the same partition when the elements equal to the key are grouped together: 1 4 6 6 6 6 6 7 8 7
The next two sub-sequences are then: 1 4 and 7 8 7
Comparing the two, we can see that gathering the elements equal to the key after each partition shrinks the sub-sequences and reduces the number of further partitions, so efficiency improves considerably.
Process: there are two steps.
First, during partitioning, move the elements equal to the key to the two ends of the array.
Second, after partitioning, move those elements back to either side of the pivot.
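The two steps above can be folded into a single three-way partition (often called the Dutch national flag scheme); this is an alternative way to realize the same optimization, not the exact procedure the text describes. The sketch below uses the first element as the pivot for simplicity instead of median-of-three, and QuickSort3Way is an illustrative name:

```c
/* Three-way quick sort: elements equal to the pivot are gathered in
 * the middle band and are never partitioned again. */
void QuickSort3Way(int arr[], int low, int high)
{
    if (low >= high) return;

    int pivot = arr[low];
    int lt = low;        /* arr[low .. lt-1]  < pivot */
    int gt = high;       /* arr[gt+1 .. high] > pivot */
    int i = low;         /* arr[lt .. i-1]   == pivot */

    while (i <= gt)
    {
        if (arr[i] < pivot)
        {
            int t = arr[lt]; arr[lt] = arr[i]; arr[i] = t;
            lt++; i++;
        }
        else if (arr[i] > pivot)
        {
            int t = arr[gt]; arr[gt] = arr[i]; arr[i] = t;
            gt--;        /* the swapped-in element has not been examined yet */
        }
        else
        {
            i++;         /* equal to pivot: leave it in the middle band */
        }
    }
    QuickSort3Way(arr, low, lt - 1);   /* recurse only on the < and > bands */
    QuickSort3Way(arr, gt + 1, high);
}
```

On the example sequence 1 4 6 7 6 6 7 6 8 6, the five 6s end up in the middle band after one pass over that range and never take part in another partition.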
Internal sorting summary ---- exchange-class sorts (bubble sort, quick sort)