Chapter 10: Elementary Data Structures
Stack: can be implemented with an array.
Queue: can be implemented with an array.
Pointers and objects: can be represented with multiple arrays. The free list can be maintained as a stack.
Rooted trees:
Binary tree: left and right child pointers.
Unbounded branching: left-child, right-sibling representation.
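As a minimal sketch of the first two items, the stack and queue below are stored in fixed-size arrays (Python lists used as arrays). The class and method names are illustrative assumptions, not taken from the notes.

```python
class ArrayStack:
    """Fixed-capacity stack stored in an array (illustrative names)."""

    def __init__(self, capacity):
        self.items = [None] * capacity
        self.top = 0  # number of stored elements, also the next free index

    def is_empty(self):
        return self.top == 0

    def push(self, x):
        if self.top == len(self.items):
            raise OverflowError("stack overflow")
        self.items[self.top] = x
        self.top += 1

    def pop(self):
        if self.is_empty():
            raise IndexError("stack underflow")
        self.top -= 1
        return self.items[self.top]


class ArrayQueue:
    """Fixed-capacity circular queue; indices wrap around the array."""

    def __init__(self, capacity):
        self.items = [None] * capacity
        self.head = 0  # index of the oldest element
        self.size = 0  # number of stored elements

    def enqueue(self, x):
        if self.size == len(self.items):
            raise OverflowError("queue overflow")
        tail = (self.head + self.size) % len(self.items)
        self.items[tail] = x
        self.size += 1

    def dequeue(self):
        if self.size == 0:
            raise IndexError("queue underflow")
        x = self.items[self.head]
        self.head = (self.head + 1) % len(self.items)
        self.size -= 1
        return x
```

Every operation runs in O(1) time because it only touches one array cell and a couple of counters.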
Chapter 11: Hash Tables
Arrays (direct addressing): reserve one position for every possible key.
Hash table: used when the number of keys actually stored is much smaller than the number of possible keys, for example for dictionary operations.
Resolving hash collisions: chaining, open addressing.
11.2 Hash Tables
With chaining, under the assumption of simple uniform hashing, both a successful and an unsuccessful search take average time Θ(1 + α), where α = n/m is the load factor (n stored keys, m slots).
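A minimal sketch of a chained hash table follows. It assumes Python's built-in hash() as the hash function and uses Python lists as chains in place of linked lists; the class name ChainedHashTable is hypothetical.

```python
class ChainedHashTable:
    """Hash table with chaining; each slot holds a list of (key, value) pairs."""

    def __init__(self, m):
        self.m = m                            # number of slots
        self.slots = [[] for _ in range(m)]   # one chain per slot
        self.n = 0                            # number of stored keys

    def _h(self, key):
        # Assumption: the built-in hash() stands in for the hash function.
        return hash(key) % self.m

    def insert(self, key, value):
        chain = self.slots[self._h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:                      # key already present: overwrite
                chain[i] = (key, value)
                return
        chain.append((key, value))
        self.n += 1

    def search(self, key):
        # Scans one chain only; its expected length is the load factor.
        for k, v in self.slots[self._h(key)]:
            if k == key:
                return v
        return None

    def delete(self, key):
        chain = self.slots[self._h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                self.n -= 1
                return True
        return False

    def load_factor(self):
        return self.n / self.m
```

A search hashes the key (the "1" in Θ(1 + α)) and then walks one chain whose expected length is α.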
11.3 Hash Functions
A good hash function should (approximately) satisfy the simple uniform hashing assumption: each key is equally likely to hash to any of the m slots, independently of which slots the other keys have hashed to.
Information about the distribution of the keys can be exploited, for example so that closely related keys such as "pt" and "pts" do not collide.
A good approach derives the hash value in a way that is independent of any patterns that may exist in the data, for example the division method.
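The division method computes h(k) = k mod m. The sketch below assumes integer keys; the slot counts 16 and 17 are chosen only to illustrate why a power of 2 is a poor choice for m when the keys follow a pattern.

```python
def division_hash(k, m):
    """Division method: map the integer key k to one of m slots."""
    return k % m


# Keys that are all multiples of 16.
keys = (16, 32, 48)

# With m a power of 2, only the low-order bits of k matter, so these keys collide.
print([division_hash(k, 16) for k in keys])  # [0, 0, 0]

# With a prime m, the same keys spread over different slots.
print([division_hash(k, 17) for k in keys])  # [16, 15, 14]
```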
Some applications of hash functions may require properties stronger than simple uniform hashing, for example that similar keys yield clearly different hash values; universal hashing can provide this.
Universal hash functions: a collection H of hash functions is called universal if, for each pair of distinct keys k, l ∈ U, the number of hash functions h ∈ H satisfying h(k) = h(l) is at most |H|/m, where m is the number of slots.
For example: let p be a prime large enough that every key k lies in [0, p-1], and let Z_p = {0, 1, ..., p-1} and Z_p* = {1, 2, ..., p-1}. Define
h_ab(k) = ((a·k + b) mod p) mod m
H_pm = {h_ab : a ∈ Z_p*, b ∈ Z_p}
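The family above can be sketched directly: pick a and b at random once, then use the resulting function for all keys. The function names and the small parameters p = 17, m = 6 are assumptions made for illustration.

```python
import random


def h_ab(k, a, b, p, m):
    """One member of the family: ((a*k + b) mod p) mod m."""
    return ((a * k + b) % p) % m


def random_universal_hash(p, m, rng=None):
    """Draw a in Z_p* and b in Z_p at random and return the function h_ab."""
    rng = rng or random.Random()
    a = rng.randrange(1, p)  # a in {1, ..., p-1}
    b = rng.randrange(0, p)  # b in {0, ..., p-1}
    return lambda k: h_ab(k, a, b, p, m)


# With p = 17, m = 6, a = 3, b = 4: (3*8 + 4) mod 17 = 11, and 11 mod 6 = 5.
print(h_ab(8, a=3, b=4, p=17, m=6))  # 5
```

Because a and b are chosen at random, no fixed set of keys can be bad for every run: for any two distinct keys, at most a 1/m fraction of the functions in the family makes them collide.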
11.4 Open Addressing
All elements are stored in the hash table itself, so the load factor is at most 1. The probe sequence is a permutation of 0, 1, ..., m-1 determined by h(k, i), where i is the probe number. No pointers are used, which saves space.
However, once keys are deleted by marking their slots as DELETED, search time no longer depends on the load factor α. When keys must be deleted, chaining is the more appropriate way to resolve collisions.
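The sketch below shows insertion, search, and deletion with a DELETED marker. It assumes a simple probe function (h'(k) + i) mod m with Python's built-in hash() as h', and it assumes that insert is never called with a key that is already stored; the class name is hypothetical.

```python
DELETED = object()  # sentinel marking a slot whose key was removed


class OpenAddressTable:
    """Open-addressing table storing keys directly in the slot array."""

    def __init__(self, m):
        self.m = m
        self.slots = [None] * m  # None means the slot was never used

    def _probe(self, key, i):
        # Assumption: a simple probe function stands in for h(k, i).
        return (hash(key) + i) % self.m

    def insert(self, key):
        for i in range(self.m):
            j = self._probe(key, i)
            # Both never-used and DELETED slots can take a new key.
            if self.slots[j] is None or self.slots[j] is DELETED:
                self.slots[j] = key
                return j
        raise OverflowError("hash table overflow")

    def search(self, key):
        for i in range(self.m):
            j = self._probe(key, i)
            if self.slots[j] is None:      # never-used slot ends the search
                return None
            if self.slots[j] is not DELETED and self.slots[j] == key:
                return j
            # A DELETED slot does not end the search; keep probing.
        return None

    def delete(self, key):
        j = self.search(key)
        if j is not None:
            self.slots[j] = DELETED        # mark instead of emptying the slot
        return j
```

A slot cannot simply be reset to empty on deletion, because that would cut off the probe sequence of keys inserted after it. The price is that searches must step over DELETED slots, so their cost depends on how many slots were ever used rather than on the current load factor.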
Uniform hashing: the probe sequence of each key is equally likely to be any of the m! permutations of 0, 1, ..., m-1. This is difficult to achieve and is only approximated in practice (for example by double hashing).
Three techniques: linear probing, quadratic probing, double hashing.
Linear probing: h(k, i) = (h'(k) + i) mod m. Runs of occupied slots grow longer over time (primary clustering), so searches take longer and longer.
Quadratic probing: h(k, i) = (h'(k) + c1·i + c2·i^2) mod m. The initial position determines the entire probe sequence.
Double hashing: h(k, i) = (h1(k) + i·h2(k)) mod m.
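The three probe functions can be compared side by side. This sketch assumes integer keys, h'(k) = h1(k) = k mod m, h2(k) = 1 + (k mod (m - 1)), and the arbitrary constants c1 = c2 = 1; all of these are illustrative choices rather than part of the notes.

```python
def linear_probe(k, i, m):
    return (k % m + i) % m


def quadratic_probe(k, i, m, c1=1, c2=1):
    # c1 and c2 are arbitrary here; in general they must be chosen
    # together with m so that the sequence visits every slot.
    return (k % m + c1 * i + c2 * i * i) % m


def double_hash_probe(k, i, m):
    h1 = k % m
    h2 = 1 + (k % (m - 1))  # never 0; relatively prime to m when m is prime
    return (h1 + i * h2) % m


m = 13
print([linear_probe(14, i, m) for i in range(4)])       # [1, 2, 3, 4]
print([quadratic_probe(14, i, m) for i in range(4)])    # [1, 3, 7, 0]
print([double_hash_probe(14, i, m) for i in range(4)])  # [1, 4, 7, 10]
```

Keys 14 and 27 both start at slot 1 when m = 13. Under linear and quadratic probing they then follow identical probe sequences, while under double hashing the step sizes differ (3 and 4), so the sequences diverge after the first probe.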
11.5 Perfect Hashing
Hashing has good average-case performance.
In particular, perfect hashing provides excellent worst-case performance, O(1) per search, when the set of keys is static (unchanged once stored, such as the reserved words of a programming language).
A perfect hashing scheme uses two levels of hash tables, with universal hashing at each level. The second level uses a secondary hash table for each slot instead of a linked list; by carefully choosing the secondary hash functions h_j, we guarantee that there are no collisions at the second level.
Theorem 11.9: If n keys are stored in a hash table of size m = n^2 using a hash function h chosen at random from a universal class of hash functions, then the probability that any collision occurs is less than 1/2.
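The bound in Theorem 11.9 follows from a short counting argument, sketched here with X denoting the number of colliding pairs (a symbol introduced only for this derivation). There are n(n-1)/2 pairs of keys, and under universal hashing each pair collides with probability at most 1/m = 1/n^2.

```latex
\begin{aligned}
E[X] &\le \binom{n}{2} \cdot \frac{1}{n^2}
      = \frac{n(n-1)}{2} \cdot \frac{1}{n^2}
      = \frac{n-1}{2n} < \frac{1}{2}, \\
\Pr\{X \ge 1\} &\le E[X] < \frac{1}{2}
      \quad \text{(Markov's inequality)}.
\end{aligned}
```

So a randomly chosen h is collision-free with probability greater than 1/2, and a few random trials are expected to find such a function.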
Theorem 11.10: If n keys are stored in a hash table of size m = n using a hash function h chosen at random from a universal class of hash functions, then
E[Σ_{j=0}^{m-1} n_j^2] < 2n, where n_j is the number of keys hashing to slot j.
Corollary 1: If the size of each secondary hash table is m_j = n_j^2, the expected total storage required for all secondary hash tables is less than 2n.
Corollary 2: The probability that the total storage used for all secondary hash tables equals or exceeds 4n is less than 1/2.
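Putting the two levels together, a construction can retry random hash functions until the bounds above hold. The sketch below assumes a non-empty list of distinct integer keys in [0, p-1] and draws functions of the form ((a·k + b) mod p) mod size; the function names and the example key set are illustrative assumptions.

```python
import random


def build_perfect_hash(keys, p, rng=None):
    """Build a two-level table for a static, non-empty set of distinct keys."""
    rng = rng or random.Random()
    n = len(keys)

    def draw(size):
        # Random member of the family ((a*k + b) mod p) mod size.
        a, b = rng.randrange(1, p), rng.randrange(0, p)
        return lambda k: ((a * k + b) % p) % size

    # Level 1: m = n slots; retry until secondary storage stays below 4n.
    while True:
        h = draw(n)
        buckets = [[] for _ in range(n)]
        for k in keys:
            buckets[h(k)].append(k)
        if sum(len(bucket) ** 2 for bucket in buckets) < 4 * n:
            break

    # Level 2: slot j gets a table of size n_j^2; retry until collision-free.
    secondary = []
    for bucket in buckets:
        size = len(bucket) ** 2
        if size == 0:
            secondary.append((None, []))
            continue
        while True:
            hj = draw(size)
            table = [None] * size
            for k in bucket:
                if table[hj(k)] is not None:
                    break                     # collision: draw a new h_j
                table[hj(k)] = k
            else:
                secondary.append((hj, table))
                break
    return h, secondary


def lookup(key, h, secondary):
    """Worst-case O(1): one first-level and one second-level hash evaluation."""
    hj, table = secondary[h(key)]
    return hj is not None and table[hj(key)] == key


keys = [10, 22, 37, 40, 52, 60, 70, 72, 75]
h, secondary = build_perfect_hash(keys, p=101)
print(all(lookup(k, h, secondary) for k in keys))  # True
```

By Corollary 2 each first-level attempt succeeds with probability greater than 1/2, and by Theorem 11.9 so does each second-level attempt, so the expected number of retries is small.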
Introduction to Algorithms notes: Chapters 10 to 11, data structures (1): hashing