Sorting Algorithms, the Powerful Duo: Heap Sort and Merge Sort

Source: Internet
Author: User
Tags: sorts

Heap Sort: Prerequisite Knowledge, Trees and Binary Trees

The tree is a very common non-linear data structure. A data structure is a collection of data elements together with their storage relationships and the operations defined on them. A tree structure resembles a tree in nature: it has a root and several branches and leaves. The root or trunk is the first level, the branches growing out of the trunk form the second level, and so on, level by level; the nodes at the ends of the branches are called leaves, so a tree structure is hierarchical. The naming follows human kinship: the nodes below a node are called its children, and the node above it is called its parent. The root has no parent, and leaves have no children. In data structures, the strict mathematical definition of a tree is given recursively.

The standard definition of a tree:

A tree is a finite set of n (n ≥ 0) nodes, where:

(1) Each element of the set is called a node;
(2) There is one specific node known as the root node (root);
(3) The remaining nodes outside the root are divided into m (m ≥ 0) disjoint finite sets T1, T2, ..., Tm, where each set Ti (1 ≤ i ≤ m) is itself a tree, called a subtree of the original tree.

The tree has the following characteristics:
(1) Each node has 0 or more child nodes.
(2) There is only one parent node for each child node.
(3) A node that does not have a parent node is called the root node.

Some terminology for trees

Degree of a node: the number of subtrees a node has is called the degree of the node;

Leaf node (terminal node): a node with degree zero is called a leaf node;

Non-terminal node (branch node): a node with nonzero degree;

Parent node: if a node has child nodes, it is called the parent of those children;

Child node: the root of a subtree of a node is called a child of that node;

Sibling nodes: nodes with the same parent are called siblings;

Height or depth of a tree: define the level of the root as 1, and the level of every other node as its parent's level plus 1. The maximum level over all nodes in the tree is called the depth of the tree.

Level of a node: counting from the root, the root is level 1, the root's children are level 2, and so on;

Degree of a tree: the largest node degree in a tree is called the degree of the tree;

Ancestors of a node: all nodes on the path from the root to that node;

Descendants: any node in the subtree rooted at a node is called a descendant of that node.

Forest: a collection of m (m ≥ 0) disjoint trees is called a forest;

Summarized as follows:

1. A tree is a data structure;
2. A tree is a data structure that can be defined recursively;

3. A tree is a collection of n nodes:
If n = 0, the set is empty and this is an empty tree;
If n > 0, one node serves as the root of the tree, and the remaining nodes can be divided into m disjoint sets, each of which is itself a tree.

4. A tree is made up of the root, internal nodes, and leaves. Every node has a unique path connecting it to the root (otherwise there would be a cycle). The path starts at the root and ends at that node, and the length of the path is called the depth of the node. The maximum depth over all nodes is called the height (or depth) of the tree. On this path, the node just above a node is its predecessor, the parent, and its children are its successors. Because the predecessor is unique, the in-degree of every non-root node is 1, so we only need to care about the out-degree: the number of children of a node, which is called the degree of the node.
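The recursive picture above can be sketched with a minimal node class. This is only an illustrative sketch: the names `TreeNode`, `degree`, and `height` are assumptions, not from the original.

```python
class TreeNode:
    """A minimal tree node: a value plus a list of children."""
    def __init__(self, value):
        self.value = value
        self.children = []   # zero or more child nodes

    def add_child(self, node):
        self.children.append(node)

    def degree(self):
        """Degree of the node: the number of its subtrees (children)."""
        return len(self.children)

    def height(self):
        """Height of the tree rooted here; a single node has height 1."""
        if not self.children:
            return 1
        return 1 + max(child.height() for child in self.children)


# root (level 1) with two children (level 2) and one grandchild (level 3)
root = TreeNode("root")
a, b = TreeNode("a"), TreeNode("b")
root.add_child(a)
root.add_child(b)
a.add_child(TreeNode("leaf"))
print(root.degree(), root.height())  # 2 3
```

Note how the height computes itself recursively, exactly mirroring the recursive definition of the tree.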

A special and commonly used tree: the binary tree

A binary tree is a finite set of n (n ≥ 0) nodes: an ordered tree in which each node has at most two subtrees. It is either the empty set, or consists of a root and two disjoint binary trees called the left subtree and the right subtree.

Characteristics:
(1) A binary tree is an ordered tree: even when a node has only one subtree, it must be distinguished as the left or the right subtree;
(2) The degree of each node in a binary tree cannot exceed 2; it can only be 0, 1, or 2;
(3) A binary tree has 5 basic forms: the empty tree, a root with no subtrees, a root with only a left subtree, a root with only a right subtree, and a root with both left and right subtrees.

Summary: a binary tree is a tree in which each node has at most two children, and whose subtrees are ordered into left and right; in other words, an ordered tree of degree at most 2 (each node forks at most twice).


Recursive definition of binary tree:
A binary tree is an ordered tree in which any node has at most two children, called the left child and the right child respectively. The trees rooted at the left child L and the right child R of the root are themselves binary trees, called the left subtree and the right subtree of the root.

Two kinds of special binary trees
⑴ Full binary tree:
If a binary tree of depth k has exactly 2^k - 1 nodes, that is, every layer i holds 2^(i-1) nodes, it is called a full binary tree.
⑵ Complete binary tree:
If a binary tree is full on every level except possibly the lowest, the leaf nodes on the lowest level are packed to the left, and any node on that level with only one child has it as a left child, then the binary tree is called a complete binary tree.
In other words, in a full binary tree every branch node has exactly two children. A complete binary tree may be "unfilled", but missing children can only appear at the bottom right: a node with a single child must have it on the left.
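The layer counts above are just a geometric series, which can be checked in a couple of lines (a small sketch, nothing beyond the formulas already stated):

```python
# A full binary tree of depth k has 2**(i-1) nodes on layer i,
# so the total is the geometric series 1 + 2 + 4 + ... = 2**k - 1.
for k in range(1, 6):
    per_layer = [2 ** (i - 1) for i in range(1, k + 1)]
    assert sum(per_layer) == 2 ** k - 1
    print(k, per_layer, sum(per_layer))
```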


How to store binary trees

  A complete binary tree can be stored "sequentially" in a list; through the subscripts, a parent can find its children and a child can find its parent. With the parent node at index i:
Parent-to-left-child subscript relation: left = 2i + 1
Parent-to-right-child subscript relation: right = 2i + 2
(conversely, the parent of the node at index j is at (j - 1) // 2)
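These subscript relations can be verified directly on a small array-stored complete binary tree. This is a sketch; the helper names `left`, `right`, and `parent` are illustrative, not from the original.

```python
# Array-stored complete binary tree: for a node at index i,
# left child = 2*i + 1, right child = 2*i + 2, parent = (i - 1) // 2.
def left(i):
    return 2 * i + 1

def right(i):
    return 2 * i + 2

def parent(i):
    return (i - 1) // 2

tree = ["A", "B", "C", "D", "E", "F"]  # nodes stored in level order
i = 1                                  # node "B"
print(tree[left(i)], tree[right(i)])   # D E
print(tree[parent(left(i))])           # B: the child's parent is i itself
```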

Heap Sort

The heap here (a binary heap) is not the heap of heap memory, but a data structure. A heap can be regarded as a complete binary tree, and one "excellent" property of a complete binary tree is that every layer except the lowest is full. This allows the heap to be represented by an array, with each node corresponding to an element of the array.
There are two types of binary heaps: the max-heap (big-top heap) and the min-heap (small-top heap).
Max-heap: a complete binary tree in which every node is no smaller than its children;
    Feature: the largest element appears at the root; the element at the root of a max-heap must be the maximum value in the heap.
Min-heap: a complete binary tree in which every node is no larger than its children;
    Feature: the smallest element appears at the root.

Note: when the left and right subtrees of a node are both heaps but the node itself is not a heap, it can be transformed into a heap by a single downward adjustment (sift-down).
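The heap property on the array representation can be checked in a few lines (a sketch; `is_max_heap` is an illustrative name):

```python
def is_max_heap(a):
    """True if list a, read as a complete binary tree, is a max-heap."""
    n = len(a)
    for i in range(n // 2):          # only non-leaf nodes have children
        if 2 * i + 1 < n and a[i] < a[2 * i + 1]:
            return False             # left child beats its parent
        if 2 * i + 2 < n and a[i] < a[2 * i + 2]:
            return False             # right child beats its parent
    return True

print(is_max_heap([9, 5, 8, 2, 3, 7]))  # True
print(is_max_heap([1, 5, 8]))           # False: root is smaller than a child
```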

  Heap sort is a sorting method that exploits the properties of a max-heap or min-heap. The overall time complexity of heap sort is O(n log n). Taking the max-heap as an example: the heap sort algorithm seizes on this property of the heap, each time taking the top element of the heap, placing it at the end of the sequence, then re-adjusting the remaining elements into a max-heap, and so on, finally obtaining a sorted sequence.

Core idea:
Build the sequence to be sorted into a max-heap; at this point the maximum value of the whole sequence is the root of the heap. Swap it with the last element of the heap array, then rebuild the remaining n-1 elements into a max-heap. After the remainder is adjusted into a max-heap, remove the maximum at the top again and adjust the remainder once more. Repeat this process until only one element remains, finally producing an ordered sequence.
Heap sort procedure:
1. Build the heap (heapify);
2. The top of the heap is the largest element;
3. Remove the heap top and place the last element of the heap at the top; the structure can be restored to a heap with a single downward adjustment;
4. The top of the heap is now the second largest element;
5. Repeat step 3 until the heap is empty.

Building the heap:
For an unordered sequence, start from the last non-leaf subtree, sift it down, and work upward subtree by subtree until the root is processed.
Extracting the numbers:
Swap the heap top with the last element of the current heap, take the old top as output, shrink the heap by one, sift down again, and repeat until every number has been extracted.

Algorithm implementation:

import random
import time


def cal_time(func):
    """Decorator that prints how long a function takes to run."""
    def wrapper(*args, **kwargs):
        t1 = time.time()
        x = func(*args, **kwargs)
        t2 = time.time()
        print("%s running time: %s secs." % (func.__name__, t2 - t1))
        return x
    return wrapper


def sift(data, low, high):
    """Sift-down adjustment.
    data: the list
    low:  index of the root of the subtree being adjusted
    high: index of the last node of the subtree being adjusted
    """
    i = low              # root of the subtree
    j = 2 * i + 1        # left child of the root
    tmp = data[i]        # value of the root; i now marks an empty slot
    while j <= high:     # j must stay within the heap
        if j != high and data[j] < data[j + 1]:
            j += 1       # j points to the larger child
        if tmp < data[j]:        # the child beats the displaced root value
            data[i] = data[j]    # move the larger value up
            i = j                # i points to the new empty slot
            j = 2 * i + 1        # new left child; loop again with the new i, j
        else:
            break        # the displaced value beats both candidates
    data[i] = tmp        # write the value into its final position


@cal_time
def heap_sort(data):
    """Heap sort."""
    n = len(data)
    # Build the heap: n//2 - 1 is the index of the last non-leaf node;
    # sift down every subtree from there up to the root.
    for i in range(n // 2 - 1, -1, -1):
        sift(data, i, n - 1)
    # Extract the numbers one by one. Each round, high is the index of
    # the last element of the current heap.
    for high in range(n - 1, -1, -1):
        data[0], data[high] = data[high], data[0]  # swap heap top and tail (in place, saving space)
        sift(data, 0, high - 1)                    # one number has left the heap, so the new tail is high - 1
    return data


li = list(range(100000))
random.shuffle(li)
heap_sort(li)

The running time of heap sort is mainly consumed by the repeated sifting in the initial heap construction and in rebuilding the heap.
The time complexity of the initial heap construction is O(n).
During the formal sorting, the time complexity of rebuilding the heap is O(n log n).
So the overall time complexity of heap sort is O(n log n).

Heap sort is insensitive to the initial order of the records, so its best, worst, and average time complexity are all O(n log n). Its performance is better than bubble sort, simple selection sort, and direct insertion sort. In space complexity, it needs only one temporary unit for swapping. However, because the comparisons and swaps jump around the array, heap sort is an unstable sorting method. In addition, since building the initial heap requires many comparisons, heap sort is not well suited to sequences with few elements.

Top-n: given n numbers (n > 10000), design an algorithm to get the 10 largest, in order. Application scenario: a Top-10 ranking list. Solution: take the first 10 elements of the list and build a min-heap; the heap top is then the 10th largest number so far. Iterate through the rest of the original list: ignore an element if it is less than the heap top; if it is greater than the heap top, replace the heap top with that element and adjust the heap once. After iterating through all elements of the list, pop the numbers out of the heap. Time complexity: O(n log m).

def sift_small(data, low, high):
    """Min-heap sift-down (note: the max-heap sift defined earlier
    would keep the 10 smallest elements instead, so a mirror-image
    version is needed here)."""
    i = low
    j = 2 * i + 1
    tmp = data[i]
    while j <= high:
        if j != high and data[j] > data[j + 1]:
            j += 1                    # j points to the smaller child
        if tmp > data[j]:
            data[i] = data[j]
            i = j
            j = 2 * i + 1
        else:
            break
    data[i] = tmp


def topn(li, n):
    heap = li[0:n]
    # Build a min-heap from the first n elements
    for i in range(n // 2 - 1, -1, -1):
        sift_small(heap, i, n - 1)
    # Traverse the rest of the list
    for i in range(n, len(li)):
        if li[i] > heap[0]:
            heap[0] = li[i]
            sift_small(heap, 0, n - 1)
    # Extract the numbers (the result comes out in ascending order)
    for i in range(n - 1, -1, -1):
        heap[0], heap[i] = heap[i], heap[0]
        sift_small(heap, 0, i - 1)
    return heap

This solves the problem of finding the top few largest numbers!
Python's built-in module: heapq

Using the heapq module to implement heap sort (each number pushed onto the heap is adjusted upward; heapq maintains a min-heap):

from heapq import heappush, heappop
import heapq

def heapsort(li):
    h = []
    for value in li:
        heappush(h, value)
    return [heappop(h) for i in range(len(h))]

Using the heapq module for top-k:

heapq.nlargest(10, li)  # the 10 largest numbers
Merge Sort

Principle:
Divide the original list into sublists and sort each sublist, then keep merging sublists pairwise, keeping the result ordered after each merge, until everything is merged into one ordered list.

Decompose: split the list smaller and smaller until each piece is a single element.
A single element is ordered.
Merge: merge two ordered lists; the merged lists get larger and larger.

Algorithm implementation:

#!/usr/bin/env python
# _*_ coding:utf-8 _*_
# Merge sort


def merge(li, low, mid, high):
    """One merge step: li[low:mid+1] and li[mid+1:high+1] are each sorted;
    merge them into a single sorted block."""
    i = low
    j = mid + 1
    ltmp = []                          # temporary list for the block low..high
    while i <= mid and j <= high:      # both halves still have numbers
        if li[i] <= li[j]:
            ltmp.append(li[i])
            i += 1
        else:                          # li[i] > li[j]
            ltmp.append(li[j])
            j += 1
    while i <= mid:                    # the left half still has numbers
        ltmp.append(li[i])
        i += 1
    while j <= high:                   # the right half still has numbers
        ltmp.append(li[j])
        j += 1
    li[low:high + 1] = ltmp            # write the merged block back to the original list


def merge_sort(li, low, high):
    """Recursive bisection ("dichotomy"), then merging."""
    if low < high:
        mid = (low + high) // 2        # middle index
        merge_sort(li, low, mid)       # recursively sort the left half
        merge_sort(li, mid + 1, high)  # recursively sort the right half
        merge(li, low, mid, high)      # merge the two sorted halves
    return li


li = [4, 6, 3, 8, 2, 5, 7]
high = len(li) - 1
print(merge_sort(li, 0, high))

Merge sort is insensitive to the distribution of the original elements and has a time complexity of O(n log n).
Merge sort needs a certain amount of auxiliary space (the temporary list) for recursion and for storing results, so its space complexity is O(n).
There are no jumps in merge sort, only pairwise comparisons, so it is a stable sort.
In summary, merge sort is a relatively efficient and stable algorithm that trades memory for speed.
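Stability can be seen directly in the merge step: when keys tie, the `<=` comparison takes from the left run first, so records with equal keys keep their original relative order. A small sketch (the record format of (key, label) pairs is illustrative):

```python
# Merge two runs of (key, label) records that are each sorted by key.
def stable_merge(left, right):
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i][0] <= right[j][0]:   # "<=" is what makes the merge stable
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    return out + left[i:] + right[j:]   # append whichever run remains

a = [(1, "a1"), (3, "a2")]
b = [(1, "b1"), (2, "b2")]
print(stable_merge(a, b))
# (1, "a1") stays ahead of (1, "b1"): equal keys keep their order
```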

Quick sort, heap sort, merge sort: a summary

The time complexity of all three sorting algorithms is O(n log n).

In general, in terms of run time:
Quick Sort < Merge sort < heap sort

Disadvantages of the three sorting algorithms:
Quick sort: low efficiency in extreme cases
Merge sort: requires additional memory overhead
Heap sort: relatively slow among the fast O(n log n) sorting algorithms
