7. Advanced Sorting:
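Shell sort: simple to implement, neither very fast nor very slow, not stable
Insertion sort: makes too many copies and behaves badly in extreme situations (for example, a small item far to the right)
N-increment sorting: gradually decrease the interval h; Knuth's interval sequence is generated by h = 3*h + 1 (1, 4, 13, 40, ...) and reduced by h = (h - 1) / 3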
0 1 2 3 4 5 6 7 8 9 10 11 12
The smallest item is certainly in the first group
The second smallest item is certainly in the second group or the first group
h=4,8
h=4
outer=4
temp = array[4]
inner=4
0 and 4 are compared
0 and 4 swap positions
outer=5
temp = array[5]
inner=5
4 and 0
5 and 1
6 and 2
7 and 3
8 and 4
9 and 5
Does the algorithm not repeat the comparison?
Gradually decrease the interval; it must finally equal 1, so the last pass is an ordinary insertion sort
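The passes described above can be sketched as follows. This is a minimal illustration, not the notes' original code: the function name shell_sort is an assumption, and the names outer, inner, temp and h simply follow the trace above. For 13 items the intervals used are 4 and then 1.

```python
def shell_sort(array):
    """Sort a list in place with Shell sort, using Knuth's interval sequence."""
    n = len(array)
    # Grow h with h = 3*h + 1 (1, 4, 13, 40, ...) while it stays below n.
    h = 1
    while 3 * h + 1 < n:
        h = 3 * h + 1
    while h > 0:
        # h-sort: an insertion sort over items that are h apart.
        for outer in range(h, n):
            temp = array[outer]
            inner = outer
            while inner >= h and array[inner - h] > temp:
                array[inner] = array[inner - h]  # copy, do not swap
                inner -= h
            array[inner] = temp
        h = (h - 1) // 3  # shrink the interval; the last pass has h = 1
    return array
```

With h = 1 the inner loops are exactly an insertion sort, but by then the data is almost sorted, so few copies are needed.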
Partitioning: the pivot
Items smaller than the pivot value are guaranteed to end up on the left, items larger than the pivot value on the right
Extreme case: all data is smaller (or all larger) than the pivot value, so the pass accomplishes nothing
Efficiency of partitioning: O(N)
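A minimal sketch of one partitioning pass, assuming the pivot is passed in as a plain value (it need not be in the array). The function name partition and the variable names lo and hi are illustrative. Each item is examined once, which is where O(N) comes from.

```python
def partition(array, left, right, pivot):
    """Rearrange array[left..right] so items < pivot precede items >= pivot.

    Returns the index of the first item of the right-hand group
    (right + 1 if every item is smaller than the pivot).
    """
    lo, hi = left, right
    while True:
        while lo <= hi and array[lo] < pivot:    # skip items already on the correct side
            lo += 1
        while hi >= lo and array[hi] >= pivot:
            hi -= 1
        if lo >= hi:                             # pointers crossed: done
            break
        array[lo], array[hi] = array[hi], array[lo]
    return lo
```

In the extreme case the returned index is left or right + 1: one group is empty and the pass did no useful work.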
Quick sort:
Choice of pivot: the key value of an actual data item; any data item may be picked
Essence: the purpose of the swaps is to put the pivot into its final position
Ideal: the median value of the data items should be selected as the pivot. Worst case: O(N^2)
Choosing the right pivot becomes the key: computing the median of all the data: no; picking at random: no; taking the median of three data items: yes
Handling small partitions: use insertion sort for small partitions
Eliminating recursion: systems now handle method invocation efficiently, so removing recursion brings little gain
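The points above (median-of-three pivot, insertion sort for small partitions) can be combined as in the sketch below. The cutoff value 10 and all function names are assumptions made for illustration; the three items examined are assumed to be the first, middle and last of the partition.

```python
CUTOFF = 10  # assumed threshold: smaller partitions go to insertion sort


def insertion_sort(array, left, right):
    for outer in range(left + 1, right + 1):
        temp = array[outer]
        inner = outer
        while inner > left and array[inner - 1] > temp:
            array[inner] = array[inner - 1]
            inner -= 1
        array[inner] = temp


def median_of_three(array, left, right):
    """Sort the first, middle and last items; park the median at right - 1."""
    center = (left + right) // 2
    if array[left] > array[center]:
        array[left], array[center] = array[center], array[left]
    if array[left] > array[right]:
        array[left], array[right] = array[right], array[left]
    if array[center] > array[right]:
        array[center], array[right] = array[right], array[center]
    array[center], array[right - 1] = array[right - 1], array[center]
    return array[right - 1]


def quick_sort(array, left=0, right=None):
    """Sort array[left..right] in place."""
    if right is None:
        right = len(array) - 1
    if right - left + 1 < CUTOFF:
        insertion_sort(array, left, right)
        return
    pivot = median_of_three(array, left, right)
    # array[left] <= pivot and array[right - 1] == pivot act as sentinels.
    left_ptr, right_ptr = left, right - 1
    while True:
        left_ptr += 1
        while array[left_ptr] < pivot:
            left_ptr += 1
        right_ptr -= 1
        while array[right_ptr] > pivot:
            right_ptr -= 1
        if left_ptr >= right_ptr:
            break
        array[left_ptr], array[right_ptr] = array[right_ptr], array[left_ptr]
    # Put the pivot into its final position.
    array[left_ptr], array[right - 1] = array[right - 1], array[left_ptr]
    quick_sort(array, left, left_ptr - 1)
    quick_sort(array, left_ptr + 1, right)
```

Because the median of three is never the smallest or largest of the three, already-sorted input no longer triggers the O(N^2) case.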
Radix sort: sorts without comparing items; can use the computer's high-speed bit operations
First pass: sort on the first (lowest) digit; second pass: sort on the second digit; ....
Efficiency of radix sort:
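A minimal sketch of a least-significant-digit radix sort, assuming non-negative integer keys and base 10. The function name radix_sort and the bucket lists are illustrative. The number of passes equals the number of digits in the largest key, and every pass copies all N items.

```python
def radix_sort(values, base=10):
    """Return a sorted copy of a list of non-negative integers."""
    result = list(values)
    if not result:
        return result
    largest = max(result)
    place = 1  # 1s digit first, then 10s, 100s, ...
    while place <= largest:
        buckets = [[] for _ in range(base)]
        for value in result:
            digit = (value // place) % base
            buckets[digit].append(value)  # appending keeps each pass stable
        result = [value for bucket in buckets for value in bucket]
        place *= base
    return result
```

No two keys are ever compared with each other; only digits are extracted and used as bucket numbers.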
8. Binary Tree: ordered array (fast search, but making room for an insertion takes time), linked list (fast insertion, slow search)
Ordered linked list
Common terminology: path, root (there is exactly one path from the root to any node), parent node, child node, leaf node, subtree, visiting (performing some action on a node), traversal, level, key
What a tree consists of: edges (references) and nodes (objects). Multiway tree ----> special case: binary tree
Binary search tree: the key of a node's left child is smaller than the node's key, and the key of its right child is larger
An analogy: Hierarchical file structure
How binary search trees work:
Unbalanced tree: nodes are missing left or right children; the tree becomes unbalanced when keys are inserted in ascending or descending order
Node class concept: the data items are abstracted into a node object
Tree class: Root node
Node class: data item, left child node, right child node
Change and delete
Tree search efficiency: O(log2 N); the time to find a node depends on the level the node is on
Inserting into the tree:
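The Node and Tree classes described above, with find and insert, might look like the sketch below. The attribute and method names are assumptions; duplicate keys are sent to the right subtree here, which is one possible convention.

```python
class Node:
    def __init__(self, key, data=None):
        self.key = key        # key used for ordering
        self.data = data      # the data item
        self.left = None      # left child node
        self.right = None     # right child node


class Tree:
    def __init__(self):
        self.root = None

    def find(self, key):
        """Return the node holding key, or None. Cost: one step per level."""
        current = self.root
        while current is not None and current.key != key:
            current = current.left if key < current.key else current.right
        return current

    def insert(self, key, data=None):
        new_node = Node(key, data)
        if self.root is None:
            self.root = new_node
            return
        current = self.root
        while True:
            parent = current
            if key < current.key:
                current = current.left
                if current is None:       # fell off the tree: attach here
                    parent.left = new_node
                    return
            else:
                current = current.right
                if current is None:
                    parent.right = new_node
                    return
```

Inserting keys in ascending order with this code produces a chain of right children, which is the unbalanced case noted above.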
Traversal: preorder, inorder, postorder, all done recursively; preorder yields a prefix expression, postorder a postfix expression
Finding the minimum and maximum: the leftmost node is the minimum, the rightmost node is the maximum
The nature of the tree structure: in every subtree the leftmost node is the minimum and the rightmost node is the maximum; any tree that satisfies this requirement qualifies
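A short sketch of the three recursive traversals and the minimum/maximum walk. The Node class here is a stripped-down stand-in with hypothetical names, and the traversals return lists of keys rather than printing them. The expression tree in the usage note represents A*(B+C).

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def pre_order(node):
    if node is None:
        return []
    return [node.key] + pre_order(node.left) + pre_order(node.right)


def in_order(node):
    if node is None:
        return []
    return in_order(node.left) + [node.key] + in_order(node.right)


def post_order(node):
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.key]


def minimum(node):
    while node.left is not None:   # keep going left
        node = node.left
    return node.key


def maximum(node):
    while node.right is not None:  # keep going right
        node = node.right
    return node.key
```

For the tree Node("*", Node("A"), Node("+", Node("B"), Node("C"))), pre_order gives the prefix form * A + B C and post_order gives the postfix form A B C + *. On a binary search tree, in_order visits the keys in ascending order.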
To delete a node: 1. The node is a leaf: remove it from the tree
2. The node has one child: change the parent node's reference to point to that child
3. The node has two children: removing it breaks the tree, so the gap must be filled with the node's inorder successor. That's so smart.
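The three cases can be sketched as below. This is a recursive variant that copies the successor's key into the deleted node rather than relinking nodes, which is an assumption about implementation style and not necessarily how the notes' source does it; all names are illustrative.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(node, key):
    """Insert key into the subtree rooted at node; return the subtree root."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    return node


def delete(node, key):
    """Delete key from the subtree rooted at node; return the new subtree root."""
    if node is None:
        return None
    if key < node.key:
        node.left = delete(node.left, key)
    elif key > node.key:
        node.right = delete(node.right, key)
    elif node.left is None:
        return node.right            # cases 1 and 2: leaf, or only a right child
    elif node.right is None:
        return node.left             # case 2: only a left child
    else:
        # Case 3: the successor is the smallest key in the right subtree.
        successor = node.right
        while successor.left is not None:
            successor = successor.left
        node.key = successor.key
        node.right = delete(node.right, successor.key)
    return node


def in_order(node):
    if node is None:
        return []
    return in_order(node.left) + [node.key] + in_order(node.right)
```

The successor has no left child by construction, so removing it from the right subtree always falls into case 1 or 2.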
Binary tree efficiency: in big O notation, O(log2 N); in a full tree most of the nodes are on the bottom level
Representing a tree with an array: the left child of the node at index is at 2*index+1, the right child at 2*index+2; deleting nodes is impractical
Duplicate key issues
Principle of compression and decompression:
Huffman coding: reduce the number of bits used to represent the most common characters; no code may be a prefix of any other code
Frequency table: the characters that occur most often should get the fewest bits
Creating the Huffman tree; decoding:
Creating the Huffman code
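A compact sketch of these steps: frequency table, Huffman tree, code table, encoding and decoding. It assumes the text contains at least two distinct characters; a leaf is represented by the character itself and an internal node by a (left, right) tuple, and all function names are hypothetical. The standard-library heapq module serves as the priority queue.

```python
import heapq
from collections import Counter


def build_huffman_tree(text):
    """Return the Huffman tree for text (needs >= 2 distinct characters)."""
    freq = Counter(text)  # frequency table
    # Heap entries are (frequency, unique tie-breaker, tree).
    heap = [(count, i, ch) for i, (ch, count) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)   # the two least frequent trees
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next_id, (t1, t2)))
        next_id += 1
    return heap[0][2]


def build_codes(tree, prefix="", table=None):
    """Walk the tree: left edge = '0', right edge = '1'."""
    if table is None:
        table = {}
    if isinstance(tree, str):
        table[tree] = prefix
    else:
        build_codes(tree[0], prefix + "0", table)
        build_codes(tree[1], prefix + "1", table)
    return table


def encode(text, table):
    return "".join(table[ch] for ch in text)


def decode(bits, tree):
    out = []
    node = tree
    for bit in bits:
        node = node[0] if bit == "0" else node[1]
        if isinstance(node, str):   # reached a leaf: emit it, restart at the root
            out.append(node)
            node = tree
    return "".join(out)
```

Because characters sit only at leaves, no code can be a prefix of another, which is what lets decode work without separators between codes.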
9. Red-Black Tree: insertion follows certain rules, with the purpose of keeping the tree balanced
Binary tree problem: inserting ordered data behaves badly and produces an unbalanced tree; efficiency degrades from O(log2 N) to O(N)
Remedy for imbalance: the red-black tree. Feature: nodes have a color
Red-black rules: Each node is either red or black
The root is always black
If the node is red, the child node must be black
Each path from the root to the leaf node must contain the same number of black nodes, that is, the black height must be the same
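The four rules can be checked mechanically, as in the sketch below. It only validates colors and black height (it does not check key ordering, and it does not perform insertion or rotations); the Node class, the color constants and the function names are assumptions. Null children are counted as black.

```python
RED, BLACK = "red", "black"


class Node:
    def __init__(self, key, color, left=None, right=None):
        self.key = key
        self.color = color
        self.left = left
        self.right = right


def black_height(node):
    """Return the black height of a subtree; raise ValueError on a violation."""
    if node is None:
        return 1  # null children count as black
    if node.color not in (RED, BLACK):
        raise ValueError("node is neither red nor black")
    if node.color == RED:
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                raise ValueError("red node has a red child")
    left_height = black_height(node.left)
    right_height = black_height(node.right)
    if left_height != right_height:
        raise ValueError("paths contain different numbers of black nodes")
    return left_height + (1 if node.color == BLACK else 0)


def is_red_black(root):
    if root is not None and root.color != BLACK:
        return False  # the root must be black
    try:
        black_height(root)
    except ValueError:
        return False
    return True
```

A black root with a single black child fails the check: the path through the child contains more black nodes than the path to the missing sibling.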
Duplicate keys and balance:
Fixing violations: change the colors of nodes; perform rotations
Non-balanced tree
In short, the purpose is to keep the tree roughly balanced; the process is quite cumbersome
Other balanced trees: AVL tree, in which the heights of a node's left and right subtrees cannot differ by more than 1
10. 2-3-4 Tree and external storage
Each node has at most 4 child nodes and 3 data items; all leaf nodes are always on the same level
Non-leaf nodes: the number of child nodes is always one more than the number of data items in the node
An empty node does not exist
The keys of all nodes in the child1 subtree are greater than key0 and less than key1
The keys of all nodes in the child2 subtree are greater than key1 and less than key2
Searching a 2-3-4 tree:
Insert:
Node splitting: if a full node is met on the way down to the insertion position, it must be split in order to maintain balance
Data moves up and to the right
Splitting on the way down: the parent of any node being split is guaranteed not to be full
When to split: when it's full, it splits.
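A minimal sketch of a single split, assuming a node stores its keys and children in two Python lists. The class name Node234 and the functions split_child and split_root are hypothetical. The middle key moves up into the parent and the largest key moves right into a new sibling, which matches "up and to the right".

```python
class Node234:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []          # at most 3 keys, in ascending order
        self.children = children or []  # empty for a leaf, else len(keys) + 1

    def is_full(self):
        return len(self.keys) == 3


def split_child(parent, index):
    """Split the full node parent.children[index]; parent must not be full."""
    child = parent.children[index]
    assert child.is_full() and not parent.is_full()
    key_a, key_b, key_c = child.keys
    # The largest key and the two rightmost children go to a new sibling.
    right = Node234([key_c], child.children[2:])
    child.keys = [key_a]
    child.children = child.children[:2]
    # The middle key moves up into the parent.
    parent.keys.insert(index, key_b)
    parent.children.insert(index + 1, right)


def split_root(root):
    """Splitting a full root adds a new root, so the tree grows at the top."""
    new_root = Node234([], [root])
    split_child(new_root, 0)
    return new_root
```

Because full nodes are split on the way down, the assertion on the parent always holds by the time a child is split.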
2-3-4 trees can be transformed into red-black trees: rotations and color changes are equivalent to node splits
Storage requirements: only about 5/7 of the available space is used; red-black trees use storage more efficiently than 2-3-4 trees
2-3 tree: node splitting: in a 2-3-4 tree the new data item is inserted after all splits are completed, whereas in a 2-3 tree the new data item must take part in the split
The purpose of splitting is to maintain balance.
External storage: data stored outside RAM, on disk
Fast find, insert, and delete
Main memory: a byte is accessed in a tiny fraction of a second
Disk: about 10 ms per access (1 ms is one thousandth of a second): the read/write head must move to the correct track and the disk must rotate to the correct position, at about 10,000 revolutions per minute
Data is stored in blocks, and one block is accessed at a time
Sequential ordering: sort all records by a key
Searching: binary search
Insertion: too slow, because records must be moved across many blocks, which takes frequent read and write operations
B-tree: a multiway tree data structure for external storage, credited to R. Bayer and E. M. McCreight
One disk block per node; a link is the number of a block in the file (int type)
B-tree of order 17
B-tree efficiency: searching reads about log9 N blocks; delete and insert: 5+6 block accesses, far fewer than 500,000
Index: a list of key / block-number pairs; the index is much smaller than the actual file records and can be kept entirely in memory
Find:
Inserting:
Multilevel index
Index too large for memory: store it on disk as a B-tree whose data items hold a key and a pointer to a block in the main file
Complex search criteria: sequential search
External file sort: sort the records inside each block, then read two sorted pieces at a time and merge them into one ordered piece (merge sort)
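The idea can be sketched with in-memory lists standing in for disk blocks, which is an assumption made to keep the example runnable; real code would read and write blocks of a file. The function names are illustrative.

```python
def merge_sorted(a, b):
    """Merge two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    merged.extend(a[i:])   # at most one of these two is non-empty
    merged.extend(b[j:])
    return merged


def external_sort(blocks):
    """Sort each block internally, then merge the sorted runs two at a time."""
    runs = [sorted(block) for block in blocks]
    while len(runs) > 1:
        next_runs = []
        for i in range(0, len(runs), 2):
            if i + 1 < len(runs):
                next_runs.append(merge_sorted(runs[i], runs[i + 1]))
            else:
                next_runs.append(runs[i])   # odd run out: carry it over
        runs = next_runs
    return runs[0] if runs else []
```

Each round halves the number of runs, so a file of B blocks needs about log2 B merging rounds, and every round reads and writes each record once.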
11. Hash table
The hash table is faster than the tree
Cons: array-based, so hard to expand; performance degrades very severely when the hash table is almost full
Hashing: the key is used directly as the array index, or converted to one by a hash function
Key as index into an array
Uses: dictionaries, compilers (the symbol table is a hash table)
Relating words to array indices: add the letter codes (digits) of the word
50,000 words: adding the codes cannot keep the words apart (too few indices); multiplying by powers of 27 produces far too many array indices
Hashing: take the remainder (modulo)
Hash function: converts (hashes) a number in a large range into a number in a small range
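The three schemes above can be compared with a small sketch. The letter coding (space = 0, a = 1 ... z = 26) and the function names are assumptions, and the input is assumed to consist of lowercase letters and spaces only.

```python
def letter_code(ch):
    """Assumed coding: space = 0, 'a' = 1, ..., 'z' = 26."""
    return 0 if ch == " " else ord(ch) - ord("a") + 1


def add_digits(word):
    """Scheme 1: add the codes. The range is too small, so many words collide."""
    return sum(letter_code(ch) for ch in word)


def powers_of_27(word):
    """Scheme 2: treat the word as a base-27 number. Unique, but a huge range."""
    value = 0
    for ch in word:
        value = value * 27 + letter_code(ch)
    return value


def hash_word(word, array_size):
    """Scheme 3: base-27 value squeezed into 0..array_size-1 with the remainder."""
    value = 0
    for ch in word:
        value = (value * 27 + letter_code(ch)) % array_size
    return value
```

Taking the remainder at every step gives the same result as reducing the full base-27 number once at the end, while keeping the intermediate values small.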
Array twice the needed size: on average there is one word for every two array cells, and some cells hold no word
Collision: two words hash to the same array index; hopefully not too many words share one index
Collision resolution: open addressing (extra vacant cells reduce collisions) and separate chaining
Open addressing: search for an empty cell (linear probing, quadratic probing, double hashing)
Linear probing: clustering leads to longer probe lengths, so reaching the cells at the end of a probe sequence is time consuming. Load factor: the ratio of the number of data items in the hash table to the length of the table
Quadratic probing: probes cells farther and farther apart, the offset being the square of the step number; suffers from secondary clustering
Double hashing: the probe sequence depends on the key: stepSize = constant - (key % constant)
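A minimal sketch of the three probe sequences in one small table class. It assumes non-negative integer keys, a prime table size (11) and the constant 5 for the second hash; the class and method names are hypothetical, and deletion is left out. The step size is computed as constant - (key % constant) so that it can never be 0.

```python
class OpenAddressTable:
    def __init__(self, size=11, method="linear", constant=5):
        self.size = size            # a prime size lets double hashing reach every cell
        self.method = method        # "linear", "quadratic" or "double"
        self.constant = constant    # used only by double hashing; smaller than size
        self.cells = [None] * size

    def _probe_sequence(self, key):
        start = key % self.size
        step_size = self.constant - (key % self.constant)  # always 1..constant
        for step in range(self.size):
            if self.method == "linear":
                offset = step                  # x, x+1, x+2, ...
            elif self.method == "quadratic":
                offset = step * step           # x, x+1, x+4, x+9, ...
            else:
                offset = step * step_size      # x, x+s, x+2s, ... with s from the key
            yield (start + offset) % self.size

    def insert(self, key):
        """Store key in the first empty cell of its probe sequence."""
        for index in self._probe_sequence(key):
            if self.cells[index] is None:
                self.cells[index] = key
                return index
        raise OverflowError("no empty cell found")

    def find(self, key):
        """Return the index of key, or None if an empty cell is hit first."""
        for index in self._probe_sequence(key):
            if self.cells[index] is None:
                return None
            if self.cells[index] == key:
                return index
        return None
```

The keys 1, 12 and 23 all hash to cell 1 in a table of size 11: linear probing places them next to each other in cells 1, 2 and 3, while the other two methods spread them out.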
Separate chaining:
Hash function: quick to compute; random keys
Folding: break the key into groups of digits and add the groups
Efficiency of hashing: constant time for the hash function + the probe length
Linear probing (L = load factor): successful search P = (1 + 1/(1-L)) / 2; unsuccessful search P = (1 + 1/(1-L)^2) / 2
Separate chaining: successful search 1 + loadFactor/2; unsuccessful search 1 + loadFactor
Quadratic probing and double hashing: successful search -log2(1 - loadFactor) / loadFactor; unsuccessful search 1 / (1 - loadFactor)
Open addressing vs. separate chaining: if the capacity is known in advance and the load factor stays below 0.5, open addressing is fine; if the number of items is unknown, separate chaining is better
Hashing and external storage: Index
12. Heap
Priority queue: insertion and deletion take O(log N) time. The heap is a complete binary tree in which each node's key is larger than (or equal to) the keys of its child nodes
Weakly ordered: does not support ordered traversal and does not support searching for a key
Remove: remove the root, move the last node into the root's position, then trickle it down
Insert: put the new node in the last position, then trickle it up
The trickling does not really swap nodes; copying instead reduces the number of copies
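A minimal array-based max-heap illustrating insert, remove and the copy-instead-of-swap trickling described above. The class name Heap and its method names are assumptions; the children of index i are taken to be at 2*i+1 and 2*i+2.

```python
class Heap:
    def __init__(self):
        self.items = []

    def insert(self, key):
        self.items.append(key)                  # new node goes in the last position
        self._trickle_up(len(self.items) - 1)

    def remove(self):
        """Remove and return the largest key (the root)."""
        root = self.items[0]
        last = self.items.pop()
        if self.items:
            self.items[0] = last                # last node moves to the root
            self._trickle_down(0)
        return root

    def _trickle_up(self, index):
        bottom = self.items[index]              # hold the node aside
        parent = (index - 1) // 2
        while index > 0 and self.items[parent] < bottom:
            self.items[index] = self.items[parent]   # one copy, not a 3-copy swap
            index = parent
            parent = (parent - 1) // 2
        self.items[index] = bottom

    def _trickle_down(self, index):
        top = self.items[index]
        size = len(self.items)
        while index < size // 2:                # while the node has at least one child
            left = 2 * index + 1
            right = left + 1
            if right < size and self.items[right] > self.items[left]:
                larger = right
            else:
                larger = left
            if top >= self.items[larger]:
                break
            self.items[index] = self.items[larger]
            index = larger
        self.items[index] = top
```

Inserting N items and then calling remove N times returns them in descending order, which is the idea behind heapsort.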
Efficiency of the heap: O(log2 N)
Heapsort: more copies than quicksort; time complexity O(N*logN)
13. Graphs
Edges and vertices
Connections between vertices: adjacency, paths
Connected graphs and non-connected graphs
Undirected graphs
Euler's Seven Bridges of Königsberg problem
Data structures and algorithms------