Introduction to Algorithms notes: Chapters 10 and 11, data structures (1), hashing

Last Update:2017-08-16 Source: Internet

Author: User


Chapter 10: Basic data structures

Stack: can be represented by an array

Queue: can be represented by an array
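A minimal sketch of both array representations, assuming a fixed capacity chosen up front. The class and method names (ArrayStack, ArrayQueue, push, pop, enqueue, dequeue) are illustrative and do not come from the notes; the queue wraps around the end of the array so that freed positions are reused.

```python
class ArrayStack:
    """Fixed-capacity stack stored in a Python list used as an array."""

    def __init__(self, capacity):
        self.items = [None] * capacity
        self.top = 0  # number of stored elements, also the next free index

    def is_empty(self):
        return self.top == 0

    def push(self, x):
        if self.top == len(self.items):
            raise OverflowError("stack overflow")
        self.items[self.top] = x
        self.top += 1

    def pop(self):
        if self.is_empty():
            raise IndexError("stack underflow")
        self.top -= 1
        return self.items[self.top]


class ArrayQueue:
    """Fixed-capacity circular queue stored in a Python list used as an array."""

    def __init__(self, capacity):
        self.items = [None] * capacity
        self.head = 0  # index of the oldest element
        self.size = 0  # number of stored elements

    def enqueue(self, x):
        if self.size == len(self.items):
            raise OverflowError("queue overflow")
        tail = (self.head + self.size) % len(self.items)  # wrap around
        self.items[tail] = x
        self.size += 1

    def dequeue(self):
        if self.size == 0:
            raise IndexError("queue underflow")
        x = self.items[self.head]
        self.head = (self.head + 1) % len(self.items)  # wrap around
        self.size -= 1
        return x
```

Both structures perform every operation in O(1) time because only an index or two changes per call.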

Pointers and objects: can be represented with multiple arrays. The free list of unused slots can be managed as a stack.
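One way to picture this is a doubly linked list kept in three parallel arrays, where array indices stand in for pointers. The sketch below is an illustration under that assumption; the names MultiArrayList, NIL, _allocate and _release are hypothetical. Unused slots are chained through the `next` array, and allocation and release push and pop at the front of that chain, which is why the free list behaves like a stack.

```python
class MultiArrayList:
    """Doubly linked list stored in three parallel arrays: key, next, prev."""

    NIL = -1  # plays the role of a null pointer

    def __init__(self, capacity):
        self.key = [None] * capacity
        self.next = [i + 1 for i in range(capacity)]  # chain all slots together
        if capacity > 0:
            self.next[-1] = self.NIL
        self.prev = [self.NIL] * capacity
        self.head = self.NIL
        self.free = 0 if capacity > 0 else self.NIL  # top of the free-list stack

    def _allocate(self):
        if self.free == self.NIL:
            raise MemoryError("out of space")
        x = self.free
        self.free = self.next[x]  # pop from the free list
        return x

    def _release(self, x):
        self.key[x] = None
        self.next[x] = self.free  # push onto the free list
        self.free = x

    def insert_front(self, k):
        x = self._allocate()
        self.key[x] = k
        self.next[x] = self.head
        self.prev[x] = self.NIL
        if self.head != self.NIL:
            self.prev[self.head] = x
        self.head = x
        return x  # the index acts as the "pointer" to the new element

    def delete(self, x):
        p, n = self.prev[x], self.next[x]
        if p != self.NIL:
            self.next[p] = n
        else:
            self.head = n
        if n != self.NIL:
            self.prev[n] = p
        self._release(x)

    def keys(self):
        out, x = [], self.head
        while x != self.NIL:
            out.append(self.key[x])
            x = self.next[x]
        return out
```

Because the free list is a stack, the slot released most recently is the first one handed out again.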

Rooted trees:

Binary tree: each node stores pointers to its left and right children

Unbounded branching: left-child, right-sibling representation
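A small sketch of the left-child, right-sibling idea, assuming each node keeps only two links besides its parent no matter how many children it has. The names Node, add_child and children are made up for this illustration.

```python
class Node:
    """Tree node with unbounded branching, using O(1) pointers per node."""

    def __init__(self, key):
        self.key = key
        self.parent = None
        self.left_child = None      # leftmost child of this node
        self.right_sibling = None   # next sibling to the right


def add_child(parent, child):
    """Attach child as the new leftmost child of parent in O(1) time."""
    child.parent = parent
    child.right_sibling = parent.left_child
    parent.left_child = child


def children(node):
    """Return the keys of node's children, from leftmost to rightmost."""
    out = []
    c = node.left_child
    while c is not None:
        out.append(c.key)
        c = c.right_sibling
    return out
```

Walking the children of a node means following `left_child` once and then `right_sibling` until it runs out.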

Chapter 11: Hash tables

Direct-address tables (arrays): reserve one slot for every possible key

Hash table: used when the number of keys actually stored is much smaller than the number of possible keys, for example to support dictionary operations

Resolving hash collisions: chaining, open addressing

11.2 Hash tables

With chaining, under the assumption of simple uniform hashing, both a successful and an unsuccessful search take Θ(1+α) time on average, where α = n/m is the load factor (n stored keys, m slots).
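A rough sketch of collision resolution with one chain per slot, to make the load factor concrete. It assumes Python lists as the chains (a linked list would serve equally well) and the built-in hash() reduced modulo m as the hash function; the class name ChainedHashTable is hypothetical.

```python
class ChainedHashTable:
    """Hash table that resolves collisions with one chain per slot."""

    def __init__(self, m=8):
        self.m = m                            # number of slots
        self.slots = [[] for _ in range(m)]   # one chain per slot
        self.n = 0                            # number of stored keys

    def _slot(self, key):
        return hash(key) % self.m

    def insert(self, key, value):
        chain = self.slots[self._slot(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:                      # key already present: overwrite
                chain[i] = (key, value)
                return
        chain.append((key, value))
        self.n += 1

    def search(self, key):
        for k, v in self.slots[self._slot(key)]:
            if k == key:
                return v
        return None

    def delete(self, key):
        chain = self.slots[self._slot(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                self.n -= 1
                return True
        return False

    def load_factor(self):
        return self.n / self.m                # alpha = n / m
```

A search scans one chain only, and under simple uniform hashing the expected chain length is α, which is where the 1+α in the bound comes from.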

11.3 Hash functions

A good hash function should (approximately) satisfy the assumption of simple uniform hashing: each key is equally likely to hash to any of the m slots, independently of where the other keys have hashed to.

Information about the distribution of keys can be useful in the design. For example, closely related keys such as "pt" and "pts" should not collide.

A good approach derives the hash value in a way that is expected to be independent of any patterns that might exist in the data, for example the division method, h(k) = k mod m.
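As an illustration of the division method, the sketch below first interprets a character string as a radix-128 integer and then takes the remainder modulo m. The helper names string_to_int and division_hash are assumptions of this example, and m = 701 is just one choice of a prime that is not close to a power of 2.

```python
def string_to_int(s, radix=128):
    """Interpret an ASCII string as an integer written in the given radix."""
    k = 0
    for ch in s:
        k = k * radix + ord(ch)
    return k


def division_hash(k, m):
    """Division method: the hash value is the remainder of k divided by m."""
    return k % m


M = 701  # a prime not too close to an exact power of 2

slot_pt = division_hash(string_to_int("pt"), M)
slot_pts = division_hash(string_to_int("pts"), M)
```

With these choices the closely related keys "pt" and "pts" land in different slots.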

Some applications of hash functions may require properties stronger than simple uniform hashing, for example that similar keys yield hash values that are far apart. Universal hashing often provides such properties.

Universal hashing: a finite collection H of hash functions mapping the universe U of keys to {0, 1, ..., m-1} is called universal if, for each pair of distinct keys k, l ∈ U, the number of hash functions h ∈ H with h(k) = h(l) is at most |H|/m, where m is the number of slots.

For example: let p be a prime large enough that every key k lies in [0, p-1], and let Z_p = {0, 1, ..., p-1} and Z_p* = {1, 2, ..., p-1}. Define

h_ab(k) = ((a·k + b) mod p) mod m

H_pm = {h_ab : a ∈ Z_p*, b ∈ Z_p}
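The family above can be sketched as follows: drawing a and b at random picks one function from it, and for small p the collision bound can be checked by brute force over every (a, b) pair. The function names make_universal_hash and count_colliding_functions are invented for this sketch.

```python
import random


def make_universal_hash(p, m, rng=random):
    """Pick one function h_ab at random from the family for prime p and m slots."""
    a = rng.randrange(1, p)  # a is drawn from {1, ..., p-1}
    b = rng.randrange(0, p)  # b is drawn from {0, ..., p-1}

    def h(k):
        return ((a * k + b) % p) % m

    return h


def count_colliding_functions(p, m, k, l):
    """Count the (a, b) pairs for which keys k and l hash to the same slot."""
    return sum(
        1
        for a in range(1, p)
        for b in range(p)
        if ((a * k + b) % p) % m == ((a * l + b) % p) % m
    )


# The family has p * (p - 1) functions, so a universal family allows
# at most p * (p - 1) / m of them to collide on any pair of distinct keys.
p, m = 17, 6
bound = p * (p - 1) / m
worst = max(
    count_colliding_functions(p, m, k, l)
    for k in range(p)
    for l in range(k + 1, p)
)
```

For p = 17 and m = 6, the worst pair of keys stays within the |H|/m bound, as the definition requires.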

11.4 Open addressing

All elements are stored in the hash table itself, so the load factor α never exceeds 1. The probe sequence for a key k is a permutation of 0, 1, ..., m-1 determined by h(k, i), where i is the probe number. Not using pointers saves space.

However, deletion is awkward: a deleted slot has to be marked DELETED rather than emptied, and once that is done search times no longer depend on the load factor α. When keys must be deleted, chaining is therefore the more appropriate way to resolve collisions.
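A minimal sketch of the deletion problem, assuming linear probing as the probe sequence and the built-in hash() as the auxiliary hash function. The sentinel objects EMPTY and DELETED and the class name OpenAddressTable are illustrative. A search must pass over DELETED slots, because the key it is looking for may have been inserted after probing past that slot.

```python
EMPTY = object()    # slot has never held a key
DELETED = object()  # slot held a key that was later removed


class OpenAddressTable:
    """Open-addressing hash table (linear probing) with a DELETED marker."""

    def __init__(self, m):
        self.m = m
        self.slots = [EMPTY] * m

    def _probe(self, key):
        start = hash(key) % self.m
        for i in range(self.m):           # i is the probe number
            yield (start + i) % self.m

    def insert(self, key):
        for j in self._probe(key):
            if self.slots[j] is EMPTY or self.slots[j] is DELETED:
                self.slots[j] = key       # DELETED slots can be reused
                return j
        raise OverflowError("hash table overflow")

    def search(self, key):
        for j in self._probe(key):
            if self.slots[j] is EMPTY:    # a never-used slot ends the search
                return None
            if self.slots[j] is not DELETED and self.slots[j] == key:
                return j
        return None

    def delete(self, key):
        j = self.search(key)
        if j is not None:
            self.slots[j] = DELETED       # mark the slot, do not empty it
        return j
```

Since DELETED slots still lengthen probe sequences, the cost of a search depends on how many slots were ever occupied, not only on how many keys are stored now.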

Uniform hashing: the probe sequence of each key is equally likely to be any of the m! permutations of 0, 1, ..., m-1. This is difficult to implement, so in practice it is only approximated (for example by double hashing).

Three techniques: linear probing, quadratic probing, double hashing.

Linear probing: h(k, i) = (h'(k) + i) mod m. Over time, runs of occupied slots grow longer, so searches take longer (primary clustering).

Quadratic probing: h(k, i) = (h'(k) + c1·i + c2·i²) mod m. The initial probe position determines the entire probe sequence (secondary clustering).

Double hashing: h(k, i) = (h1(k) + i·h2(k)) mod m.
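The three probe functions can be written side by side for comparison. This is a sketch with made-up helper names (linear_probe, quadratic_probe, double_hash, probe_sequence); the auxiliary functions h'(k) = k mod m, h1(k) = k mod 13 and h2(k) = 1 + (k mod 11) are example choices, not prescribed by the notes. Note that quadratic probing only visits every slot when c1, c2 and m are chosen suitably, which this sketch does not enforce.

```python
def linear_probe(h_aux, m):
    """h(k, i) = (h'(k) + i) mod m"""
    return lambda k, i: (h_aux(k) + i) % m


def quadratic_probe(h_aux, m, c1, c2):
    """h(k, i) = (h'(k) + c1*i + c2*i^2) mod m"""
    return lambda k, i: (h_aux(k) + c1 * i + c2 * i * i) % m


def double_hash(h1, h2, m):
    """h(k, i) = (h1(k) + i*h2(k)) mod m"""
    return lambda k, i: (h1(k) + i * h2(k)) % m


def probe_sequence(h, k, m):
    """List the slots examined for key k, for probe numbers 0 .. m-1."""
    return [h(k, i) for i in range(m)]


# Example auxiliary hash functions (assumptions of this sketch).
lin = linear_probe(lambda k: k % 13, 13)
quad = quadratic_probe(lambda k: k % 11, 11, 1, 3)
dbl = double_hash(lambda k: k % 13, lambda k: 1 + (k % 11), 13)

lin_seq = probe_sequence(lin, 14, 13)
quad_seq = probe_sequence(quad, 10, 11)
dbl_seq = probe_sequence(dbl, 14, 13)
```

With linear and quadratic probing, two keys with the same initial slot share the whole probe sequence. With double hashing the step size h2(k) also depends on the key, so such keys usually diverge after the first probe.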

11.5 Perfect hashing

Hashing has good average-case performance.

In particular, perfect hashing provides excellent worst-case performance, O(1) memory accesses per search, when the set of keys is static (it never changes once stored, such as the reserved words of a programming language).

A perfect hashing scheme uses two levels of hashing, with universal hashing at each level. The second level uses a secondary hash table per slot instead of a linked list, and each secondary hash function h_j is chosen carefully so that there are no collisions at the second level.
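A rough sketch of the two-level scheme for a static set of distinct integer keys in [0, p-1]. It assumes both levels draw functions of the form ((a·k + b) mod p) mod m, retries the first level until the total secondary size is below 4n, and retries each second-level function until its slot has no collisions. The names build_perfect_table and contains, the seed parameter, and the sample key set are choices made for this illustration.

```python
import random


def _random_hash(p, m, rng):
    """Draw h(k) = ((a*k + b) mod p) mod m with random a in [1, p-1], b in [0, p-1]."""
    a = rng.randrange(1, p)
    b = rng.randrange(0, p)
    return lambda k: ((a * k + b) % p) % m


def build_perfect_table(keys, p, seed=0):
    """Build a two-level table for distinct integer keys in [0, p-1]."""
    rng = random.Random(seed)
    n = len(keys)
    # Level 1: hash into n slots, retrying until secondary storage is < 4n.
    while True:
        h = _random_hash(p, n, rng)
        buckets = [[] for _ in range(n)]
        for k in keys:
            buckets[h(k)].append(k)
        if sum(len(b) ** 2 for b in buckets) < 4 * n:
            break
    # Level 2: slot j gets a table of size n_j^2 and a collision-free h_j.
    secondary = []
    for bucket in buckets:
        mj = len(bucket) ** 2
        if mj == 0:
            secondary.append((None, []))
            continue
        while True:
            hj = _random_hash(p, mj, rng)
            table = [None] * mj
            collided = False
            for k in bucket:
                s = hj(k)
                if table[s] is not None:
                    collided = True   # try another h_j
                    break
                table[s] = k
            if not collided:
                break
        secondary.append((hj, table))
    return h, secondary


def contains(structure, k):
    """Look up k with one first-level and one second-level hash evaluation."""
    h, secondary = structure
    hj, table = secondary[h(k)]
    return bool(table) and table[hj(k)] == k


KEYS = [10, 22, 37, 40, 52, 60, 70, 72, 75]
perfect = build_perfect_table(KEYS, p=101)
```

The retries only happen while building. Once the structure exists, every lookup costs a constant number of steps, because no second-level table contains a collision.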

Theorem 11.9: If n keys are stored in a hash table of size m = n² using a hash function h chosen at random from a universal class of hash functions, then the probability that any collision occurs is less than 1/2.
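The 1/2 bound in Theorem 11.9 follows from a short counting argument, sketched here with X denoting the number of colliding pairs (a symbol introduced for this sketch). There are n(n-1)/2 pairs of keys, each pair collides with probability at most 1/m under a universal class, and Markov's inequality turns the expectation into a probability.

```latex
\begin{align*}
\mathrm{E}[X] &\le \binom{n}{2} \cdot \frac{1}{m}
             = \frac{n(n-1)}{2} \cdot \frac{1}{n^2}
             = \frac{n-1}{2n} < \frac{1}{2}, \\
\Pr\{X \ge 1\} &\le \frac{\mathrm{E}[X]}{1} < \frac{1}{2}.
\end{align*}
```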

Theorem 11.10: If n keys are stored in a hash table of size m = n using a hash function h chosen at random from a universal class of hash functions, then

E[Σ n_j²] < 2n, where the sum runs over the slots j = 0, 1, ..., m-1 and n_j is the number of keys hashing to slot j.

Corollary 1: If, in addition, the size of each secondary hash table is set to m_j = n_j², then the expected amount of storage required for all secondary hash tables is less than 2n.

Corollary 2: Under the same conditions, the probability that the total storage used for the secondary hash tables equals or exceeds 4n is less than 1/2.


