Sorting and searching


A sorting algorithm is an algorithm that arranges a sequence of data in a particular order.

Stability of the sorting algorithm

Stability: a stable sorting algorithm maintains the relative order of records with equal key values. That is, if a sorting algorithm is stable, then whenever two records r and s have equal keys and r appears before s in the original list, r will also appear before s in the sorted list.

Stability is not an issue when equal elements are indistinguishable, as with plain integers. However, suppose the following pairs of numbers are to be sorted by their first component.

(4, 1)  (3, 1)  (3, 7)  (5, 6)

In this situation, two different results are possible: one maintains the relative order of records with equal keys, the other does not:

(3, 1)  (3, 7)  (4, 1)  (5, 6)  (order maintained)
(3, 7)  (3, 1)  (4, 1)  (5, 6)  (order changed)

An unstable sorting algorithm may change the relative order of records with equal keys, whereas a stable sorting algorithm never does. An unstable algorithm can be made stable by a special implementation: one way is to artificially extend the key comparison, so that two objects that compare equal on the original key are ordered, as a final criterion, by the positions of their entries in the original data. Remember, however, that this ordering usually incurs an additional space burden.
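As a minimal Python sketch of this key-extension idea (the function name stable_key_sort is illustrative; Python's built-in sort happens to be stable already, so the extra index here only demonstrates the technique):

def stable_key_sort(pairs):
    # decorate each record with its original index, so records with equal
    # first components are ordered by position under any comparison sort
    decorated = [(pair[0], index, pair) for index, pair in enumerate(pairs)]
    decorated.sort()  # ties on the key are broken by the original index
    return [pair for _, _, pair in decorated]

print(stable_key_sort([(4, 1), (3, 1), (3, 7), (5, 6)]))
# [(3, 1), (3, 7), (4, 1), (5, 6)] -- the two records with key 3 keep their order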

Bubble sort

Bubble sort is a simple sorting algorithm. It repeatedly walks through the sequence to be sorted, compares two adjacent elements at a time, and swaps them if they are in the wrong order. The passes over the sequence are repeated until no more swaps are needed, which means the sequence is sorted. The algorithm gets its name because smaller elements slowly "float" to the top of the sequence through the swaps.

The bubble sort algorithm works as follows:

    • Compare adjacent elements. If the first is larger than the second (for ascending order), swap the two.
    • Do the same work for every pair of adjacent elements, from the first pair at the start to the last pair at the end. When this pass is done, the final element is the largest.
    • Repeat the above steps for all elements except the last one.
    • Keep repeating over fewer and fewer elements each time, until there are no pairs of numbers left to compare.
Analysis of bubble sort

In the first pass, adjacent elements are compared and swapped one step at a time, so the largest element ends the pass at the end of the sequence. A total of n-1 such bubbling passes are needed, and each successive pass requires one fewer comparison than the last:

def bubble_sort(alist):
    for j in range(len(alist) - 1, 0, -1):
        # j is the number of comparisons needed in this pass, decreasing
        for i in range(j):
            if alist[i] > alist[i + 1]:
                alist[i], alist[i + 1] = alist[i + 1], alist[i]

li = [54, 26, 93, 17, 77, 31, 44, 55, 20]
bubble_sort(li)
print(li)
Time complexity
    • Optimal time complexity: O(n) (if one full pass finds no elements to swap, the sequence is already sorted and the sort can end early; see the sketch below)
    • Worst time complexity: O(n²)
    • Stability: Stable
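The O(n) best case relies on the early-exit check just mentioned; here is a sketch of that common variant (the swapped flag is an addition, not part of the code above):

def bubble_sort_optimized(alist):
    for j in range(len(alist) - 1, 0, -1):
        swapped = False
        for i in range(j):
            if alist[i] > alist[i + 1]:
                alist[i], alist[i + 1] = alist[i + 1], alist[i]
                swapped = True
        # a complete pass with no swaps means the list is already sorted
        if not swapped:
            return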

Selection sort

Selection sort is a simple and intuitive sorting algorithm. It works as follows: first find the smallest (or largest) element in the unsorted sequence and place it at the start of the sorted sequence; then keep finding the smallest (or largest) element among the remaining unsorted elements and place it at the end of the sorted sequence. Continue until all elements are sorted.

The main advantage of selection sort is related to data movement. If an element is already in its correct final position, it is never moved. Each swap moves at least one element into its final position, so sorting a table of n elements uses at most n-1 swaps. Among sorting methods that rely entirely on swaps to move elements, selection sort is a very good one, as the swap-counting sketch below illustrates.
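Here the swap counter is added purely for demonstration; it is not part of the implementation given further down.

def selection_sort_count(alist):
    # selection sort that also reports how many swaps it made
    swaps = 0
    n = len(alist)
    for i in range(n - 1):
        min_index = i
        for j in range(i + 1, n):
            if alist[j] < alist[min_index]:
                min_index = j
        if min_index != i:
            alist[i], alist[min_index] = alist[min_index], alist[i]
            swaps += 1  # at most one swap per pass, so at most n-1 in total
    return swaps

li = [54, 26, 93, 17, 77, 31, 44, 55, 20]
print(selection_sort_count(li))  # never more than 8 for these 9 elements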

Selection sort analysis


def selection_sort(alist):
    n = len(alist)
    # n-1 selection passes are required
    for i in range(n - 1):
        # record the position of the minimum
        min_index = i
        # select the minimum from position i+1 through the end
        for j in range(i + 1, n):
            if alist[j] < alist[min_index]:
                min_index = j
        # if the selected element is not already in place, swap it in
        if min_index != i:
            alist[i], alist[min_index] = alist[min_index], alist[i]

alist = [54, 226, 93, 17, 77, 31, 44, 55, 20]
selection_sort(alist)
print(alist)
Time complexity
    • Optimal time complexity: O(n²)
    • Worst time complexity: O(n²)
    • Stability: Unstable (consider, for example, an ascending sort that selects the maximum each pass: a long-range swap can carry a record past an equal one)

Insertion sort

Insertion sort is a simple and intuitive sorting algorithm. It works by building an ordered sequence: for each piece of unsorted data, scan the sorted sequence from back to front, find the appropriate position, and insert it. In an in-place implementation, the scan from back to front repeatedly shifts the sorted elements one position back to make room for the element being inserted.

Insertion sort analysis

def insert_sort(alist):
    # start from the second position (index 1) and insert forward
    for i in range(1, len(alist)):
        # compare with the preceding elements; swap while smaller
        for j in range(i, 0, -1):
            if alist[j] < alist[j - 1]:
                alist[j], alist[j - 1] = alist[j - 1], alist[j]
            else:
                # the element is in place; stop early so a sorted input is O(n)
                break

alist = [54, 26, 93, 17, 77, 31, 44, 55, 20]
insert_sort(alist)
print(alist)
Time complexity
    • Optimal time complexity: O(n) (for an ascending sort of a sequence that is already ascending)
    • Worst time complexity: O(n²)
    • Stability: Stable

Quick Sort

Quick sort, also called partition-exchange sort, divides the data to be sorted into two independent parts in one pass, such that every element of one part is smaller than every element of the other. Each part is then quick-sorted in the same way, and the whole process is carried out recursively so that the entire data set becomes an ordered sequence.

The steps are:

    1. Pick an element from the sequence, called the "pivot";
    2. Reorder the sequence so that all elements smaller than the pivot come before it and all elements larger than the pivot come after it (equal elements may go to either side). After this partition, the pivot sits in its final position in the middle of the sequence. This is called the partition operation;
    3. Recursively sort the sub-sequence of elements smaller than the pivot and the sub-sequence of elements larger than the pivot.

In the base case of the recursion, the sequence has size zero or one and is therefore already sorted. Although the algorithm keeps recursing, it always terminates, because in each iteration it puts at least one element into its final position.

Quick sort analysis

def quick_sort(alist, start, end):
    """Quick sort"""
    # recursion exit condition
    if start >= end:
        return
    # take the starting element as the pivot whose position we must find
    mid = alist[start]
    # low is the left-to-right cursor on the left of the sequence
    low = start
    # high is the right-to-left cursor on the right of the sequence
    high = end
    while low < high:
        # while the cursors have not met and the element at high is not
        # smaller than the pivot, move high to the left
        while low < high and alist[high] >= mid:
            high -= 1
        # place the element at high into the low position
        alist[low] = alist[high]
        # while the cursors have not met and the element at low is
        # smaller than the pivot, move low to the right
        while low < high and alist[low] < mid:
            low += 1
        # place the element at low into the high position
        alist[high] = alist[low]
    # after the loop, low and high coincide at the pivot's correct position
    alist[low] = mid
    # quick sort the sub-sequence to the left of the pivot
    quick_sort(alist, start, low - 1)
    # quick sort the sub-sequence to the right of the pivot
    quick_sort(alist, low + 1, end)

alist = [54, 26, 93, 17, 77, 31, 44, 55, 20]
quick_sort(alist, 0, len(alist) - 1)
print(alist)
Time complexity
    • Optimal time complexity: O(n log n)
    • Worst time complexity: O(n²)
    • Stability: Unstable

That quick sort costs O(n log n) time on average is not obvious from the outset. But it is not hard to observe that the partition operation visits each element of the array once per pass, using O(n) time; in the version that uses concatenation, that operation is also O(n).

In the best case, every partition splits the sequence into two nearly equal fragments, so each recursive call handles a sequence half the size. Only log n nested calls are therefore needed before reaching sequences of size 1, which means the depth of the call tree is O(log n). Calls at the same level of the tree process disjoint parts of the original sequence, so each level of calls needs only O(n) time in total (each invocation carries some constant overhead, but since there are only O(n) calls per level, this is absorbed into the O(n) factor). The result is that the algorithm uses only O(n log n) time.
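The concatenation version mentioned above can be sketched briefly in Python; this list-building form is illustrative, not the in-place implementation shown earlier.

def quick_sort_concat(alist):
    # base case: a sequence of size 0 or 1 is already sorted
    if len(alist) <= 1:
        return alist
    pivot = alist[0]
    # the partition visits every element once: O(n) work per level
    smaller = [x for x in alist[1:] if x < pivot]
    larger = [x for x in alist[1:] if x >= pivot]
    # the concatenation is likewise O(n)
    return quick_sort_concat(smaller) + [pivot] + quick_sort_concat(larger)

print(quick_sort_concat([54, 26, 93, 17, 77, 31, 44, 55, 20]))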


Shell sort

Shell sort is a kind of insertion sort, also known as diminishing increment sort, and is a more efficient improvement on the straight insertion sorting algorithm. Shell sort is not a stable sorting algorithm. The method is named after D. L. Shell, who proposed it in 1959. Shell sort groups the records by a certain increment of their subscripts and sorts each group with straight insertion sort; as the increment gradually decreases, each group contains more and more keys, and when the increment is reduced to 1 the whole file forms a single group and the algorithm terminates.

Shell sort process

The basic idea of Shell sort is to lay the array out in a table and sort the columns separately, repeating the process each time with longer columns (a smaller step size and fewer columns). In the end, the whole table is a single column. Converting the array into a table is just a good way to understand the algorithm; the algorithm itself sorts within the array.

For example, suppose we have a group of numbers [13 14 94 33 82 25 59 94 65 23 45 27 73 25 39 10]. If we start with a step size of 5, we can describe the algorithm better by arranging the list in a table of 5 columns, like this (elements in the same column are one step, 5 positions, apart):

13 14 94 33 82
25 59 94 65 23
45 27 73 25 39
10

Then we sort each column:

10 14 73 25 23
13 27 94 33 39
25 59 94 65 82
45

Reading the four rows together in order, we get: [10 14 73 25 23 13 27 94 33 39 25 59 94 65 82 45]. At this point 10 has already moved to its correct position. Next, sort with a step size of 3:

10 14 73
25 23 13
27 94 33
39 25 59
94 65 82
45

After sorting the columns, this becomes:

10 14 13
25 23 33
27 25 59
39 65 73
45 94 82
94

Finally, sort with a step size of 1 (this is just an ordinary insertion sort).
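To see the columns concretely, here is a small sketch that extracts the step-size columns for any gap; the helper name gap_columns is purely illustrative.

def gap_columns(alist, gap):
    # elements whose indices differ by exactly `gap` form one column
    return [alist[start::gap] for start in range(gap)]

nums = [13, 14, 94, 33, 82, 25, 59, 94, 65, 23, 45, 27, 73, 25, 39, 10]
for column in gap_columns(nums, 5):
    print(column)
# [13, 25, 45, 10], [14, 59, 27], [94, 94, 73], [33, 65, 25], [82, 23, 39]

Each of these columns is what the algorithm insertion-sorts independently at that step size.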

Shell sort analysis

def shell_sort(alist):
    n = len(alist)
    # initial step size
    gap = n // 2
    while gap > 0:
        # insertion sort with the current step size
        for i in range(gap, n):
            j = i
            while j >= gap and alist[j - gap] > alist[j]:
                alist[j - gap], alist[j] = alist[j], alist[j - gap]
                j -= gap
        # shrink to a new step size
        gap = gap // 2

alist = [54, 26, 93, 17, 77, 31, 44, 55, 20]
shell_sort(alist)
print(alist)

Time complexity
    • Optimal time complexity: varies with the step sequence
    • Worst time complexity: O(n²)
    • Stability: Unstable

Merge sort

Merge sort is a very typical application of the divide-and-conquer method. The idea of merge sort is to decompose the array recursively and then merge the sorted pieces back together.

Once the array has been decomposed down to single elements, pairs of ordered arrays are merged. The basic idea is to compare the first elements of the two arrays, take whichever is smaller, and move the corresponding pointer back by one; keep comparing until one array is empty, and finally copy over the remainder of the other array.

Analysis of merge sort

def merge_sort(alist):
    if len(alist) <= 1:
        return alist
    # split the list into two halves
    num = len(alist) // 2
    left = merge_sort(alist[:num])
    right = merge_sort(alist[num:])
    # merge the sorted halves
    return merge(left, right)

def merge(left, right):
    """Merge operation: combine the two ordered arrays left[] and right[]
    into one large ordered array."""
    # index pointers into left and right
    l, r = 0, 0
    result = []
    while l < len(left) and r < len(right):
        # <= keeps the sort stable: on ties, take from left first
        if left[l] <= right[r]:
            result.append(left[l])
            l += 1
        else:
            result.append(right[r])
            r += 1
    # copy whatever remains of the non-empty array
    result += left[l:]
    result += right[r:]
    return result

alist = [54, 26, 93, 17, 77, 31, 44, 55, 20]
sorted_alist = merge_sort(alist)
print(sorted_alist)

Time complexity
    • Optimal time complexity: O(n log n)
    • Worst time complexity: O(n log n)
    • Stability: Stable
Comparison of common sorting algorithms

Algorithm        Optimal time   Worst time    Stability
Bubble sort      O(n)           O(n²)         Stable
Selection sort   O(n²)          O(n²)         Unstable
Insertion sort   O(n)           O(n²)         Stable
Quick sort       O(n log n)     O(n²)         Unstable
Shell sort       gap-dependent  O(n²)         Unstable
Merge sort       O(n log n)     O(n log n)    Stable

Search

A search is an algorithmic process for finding a particular item in a collection of items. The usual answer to a search is True or False, according to whether the item exists. Common search methods include sequential search, binary search, binary tree search, and hash search.
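Of these, sequential search is the simplest; here is a minimal sketch (the function name sequential_search is illustrative):

def sequential_search(alist, item):
    # examine each element in turn until the item is found
    for element in alist:
        if element == item:
            return True
    return False

print(sequential_search([54, 26, 93, 17], 93))  # True
print(sequential_search([54, 26, 93, 17], 99))  # False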

Binary search

Binary search, also called half-interval search, has the advantages of few comparisons, fast lookup, and good average performance; its disadvantages are that it requires the table to be ordered, and that insertion and deletion are difficult. The method is therefore suited to ordered lists that change rarely but are searched often. First, assume that the elements in the table are arranged in ascending order. Compare the key at the middle position of the table with the search key; if they are equal, the search succeeds. Otherwise, the middle record splits the table into a front and a back sub-table: if the middle key is greater than the search key, search the front sub-table further; otherwise, search the back sub-table. Repeat the process until a record satisfying the condition is found, making the search successful, or until the sub-table no longer exists, at which point the search fails.

Binary search implementation (non-recursive):

def binary_search(alist, item):
    first = 0
    last = len(alist) - 1
    while first <= last:
        midpoint = (first + last) // 2
        if alist[midpoint] == item:
            return True
        elif item < alist[midpoint]:
            last = midpoint - 1
        else:
            first = midpoint + 1
    return False

testlist = [0, 1, 2, 8, 13, 17, 19, 32, 42]
print(binary_search(testlist, 3))
print(binary_search(testlist, 13))

(Recursive implementation)

def binary_search(alist, item):
    if len(alist) == 0:
        return False
    midpoint = len(alist) // 2
    if alist[midpoint] == item:
        return True
    if item < alist[midpoint]:
        # search the left half
        return binary_search(alist[:midpoint], item)
    # search the right half
    return binary_search(alist[midpoint + 1:], item)

testlist = [0, 1, 2, 8, 13, 17, 19, 32, 42]
print(binary_search(testlist, 3))
print(binary_search(testlist, 13))

Time complexity
    • Optimal time complexity: O(1)
    • Worst time complexity: O(log n)
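
In practice, Python's standard library also provides binary search through the bisect module; here is a small membership-test sketch using bisect_left (the helper name contains is illustrative, not a bisect API):

import bisect

def contains(sorted_list, item):
    # bisect_left returns the leftmost position where item could be inserted
    i = bisect.bisect_left(sorted_list, item)
    return i < len(sorted_list) and sorted_list[i] == item

testlist = [0, 1, 2, 8, 13, 17, 19, 32, 42]
print(contains(testlist, 3))   # False
print(contains(testlist, 13))  # True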

