Hash + linked list

Source: Internet
Author: User

The simplest hash table is built from an array plus linked lists. The implementation is very simple, but it captures the core idea of hashing.

#ifndef _HASH_H_
#define _HASH_H_

typedef struct _listnode {
    struct _listnode *prev;
    struct _listnode *next;
    void *data;
} ListNode;

typedef ListNode *List;
typedef ListNode *Position;

typedef struct _hashtbl {
    int tablesize;
    List *thelists;
} HashTbl;

int Hash(void *key, int tablesize);
HashTbl *InitHash(int tablesize);
void Insert(void *key, HashTbl *hashtable);
Position Find(void *key, HashTbl *hashtable);
void Destory(HashTbl *hashtable);
void *Retrieve(Position p);

#endif

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "hash.h"

/* Weighted sum of the characters of a NUL-terminated string key. */
int Hash(void *key, int tablesize)
{
    const unsigned char *s = key;
    int i;
    int hval = 0;

    for (i = 1; *s != 0; i++)
        hval += *s++ * i;
    return hval % tablesize;
}

/* Allocates the table and one dummy head node per bucket. */
HashTbl *InitHash(int tablesize)
{
    int i;
    HashTbl *hashtable;

    hashtable = malloc(sizeof(HashTbl));
    if (NULL == hashtable) {
        printf("hashtable malloc error\n");
        return NULL;
    }
    hashtable->tablesize = tablesize;
    hashtable->thelists = malloc(sizeof(List) * tablesize);
    if (NULL == hashtable->thelists) {
        printf("hashtable malloc error\n");
        free(hashtable);
        return NULL;
    }
    for (i = 0; i < tablesize; i++) {
        hashtable->thelists[i] = malloc(sizeof(ListNode));
        if (NULL == hashtable->thelists[i]) {
            printf("hashtable malloc error\n");
            while (i-- > 0)             /* release what was allocated so far */
                free(hashtable->thelists[i]);
            free(hashtable->thelists);
            free(hashtable);
            return NULL;
        } else {
            hashtable->thelists[i]->next = NULL;
            hashtable->thelists[i]->prev = NULL;
            hashtable->thelists[i]->data = NULL;
        }
    }
    return hashtable;
}

Position Find(void *key, HashTbl *hashtable)
{
    int i;
    List l;
    Position p;

    i = Hash(key, hashtable->tablesize);
    l = hashtable->thelists[i];
    p = l->next;
    /* Compare the key contents, not the pointer values. */
    while (p != NULL && strcmp(p->data, key) != 0)
        p = p->next;
    return p;
}

void Insert(void *key, HashTbl *hashtable)
{
    Position p, tmp;
    List l;

    p = Find(key, hashtable);
    if (NULL == p) {
        tmp = malloc(sizeof(ListNode));
        if (NULL == tmp) {
            printf("malloc error\n");
            return;
        }
        /* Link the new node in right after the dummy head. */
        l = hashtable->thelists[Hash(key, hashtable->tablesize)];
        tmp->data = key;
        tmp->next = l->next;
        if (l->next != NULL)
            l->next->prev = tmp;
        tmp->prev = l;
        l->next = tmp;
    } else
        printf("the key already exist\n");
}

void *Retrieve(Position p)
{
    return p->data;
}

void Destory(HashTbl *hashtable)
{
    int i;
    List l;
    Position tmp, tmp2;

    for (i = 0; i < hashtable->tablesize; i++) {
        l = hashtable->thelists[i];
        tmp = l->next;
        while (tmp != NULL) {
            tmp2 = tmp->next;
            free(tmp);
            tmp = tmp2;
        }
        free(l);
    }
    free(hashtable->thelists);
    free(hashtable);
}

int main(void)
{
    HashTbl *hashtable;
    Position p;

    /* The table size was lost in the source text; 11 is an assumed value. */
    hashtable = InitHash(11);
    if (NULL == hashtable)
        return 1;
    Insert("a", hashtable);
    Insert("b", hashtable);
    Insert("b", hashtable);     /* reports that the key already exists */
    p = Find("a", hashtable);
    if (p != NULL)
        printf("%s\n", (char *)Retrieve(p));
    Destory(hashtable);
    return 0;
}

http://blog.csdn.net/dndxhej/article/details/7396841

A hash table is a mapping from a set A to another set B. A mapping is a correspondence in which each element of A corresponds to exactly one element of B. Conversely, an element of B may correspond to several elements of A. If each element of B corresponds to only one element of A, the mapping is called a one-to-one mapping. Such correspondences are common in everyday life, for example:

A -> B

Person -> ID number

Date -> zodiac sign

Of the two mappings above, person to ID number is a one-to-one mapping. In a hash table, this process of correspondence is called hashing. If element a of A corresponds to element b of B, then a is called the key and b is called the hash value of a.

The hash value of Wei Xiaobao

Mathematically, a mapping is equivalent to a function f(x): A -> B, such as f(x) = 3x + 2. The core of a hash table is the hash function, which specifies how the elements of set A correspond to the elements of set B. For example:

A: three-digit integer    hash(x) = x % 10    B: one-digit integer
104                       ->                  4
876                       ->                  6
192                       ->                  2

In this correspondence, the hash function is hash(x) = x % 10. That is, given a three-digit number, we take its last digit as its hash value.
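The last-digit mapping can be written as a one-line C function. This is only a minimal sketch: the name last_digit_hash is made up for illustration, and it assumes a non-negative input.

```c
/* Maps a non-negative integer to its last decimal digit.
 * The name last_digit_hash is hypothetical, chosen for this sketch. */
int last_digit_hash(int x)
{
    return x % 10;
}
```

Called on 104, 876 and 192, it yields the three hash values listed in the table.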

Hashing is widely used in computer science. For example:

FCS in Ethernet: see "Small speakers start broadcasting (Ethernet and WiFi protocols)"

Checksum in the IP protocol: see "My best (IP protocol explained)"

Hash value in git: see "Version Management Kingdoms"

In the applications above, a hash value is used to represent the key. In git, for example, the file content is the key and the SHA algorithm is the hash function, which maps the file content to a fixed-length string (the hash value). If the content of the file changes, the corresponding string changes too. Git can therefore tell whether a file has changed by comparing the short hash values instead of the full contents.
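The idea of detecting a change by comparing hash values can be sketched in C. Note the assumptions: this uses 32-bit FNV-1a, a simple non-cryptographic hash, purely as a stand-in for SHA, and the helper content_changed is a hypothetical name. It is not how git itself is implemented.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a: a simple non-cryptographic hash, used here only as a
 * stand-in for SHA (assumption of this sketch). */
uint32_t fnv1a(const void *data, size_t len)
{
    const unsigned char *bytes = data;
    uint32_t h = 2166136261u;           /* FNV offset basis */
    size_t i;

    for (i = 0; i < len; i++) {
        h ^= bytes[i];
        h *= 16777619u;                 /* FNV prime */
    }
    return h;
}

/* Returns 1 if the content no longer matches a previously stored hash.
 * The name content_changed is hypothetical. */
int content_changed(uint32_t stored_hash, const void *data, size_t len)
{
    return fnv1a(data, len) != stored_hash;
}
```

Only the short stored hash has to be kept around; the old content itself is not needed for the comparison. Because of possible collisions, an equal hash does not strictly prove that the content is unchanged.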

Another example is a computer's login password, which is usually a string of characters. For security reasons, the computer does not save the string itself but its hash value (using MD5, SHA, or another algorithm as the hash function). The next time the user logs in, he or she enters the password string. If the hash value of the entered string equals the stored hash value, the user is assumed to have entered the correct password. In this way, even if an attacker breaks into the database of password records, all he can see is the password hash values. The hash functions used here are strongly one-way: it is difficult to infer the key from the hash value. As a result, the attacker cannot learn the user's password.

(Many earlier password leaks happened because the websites stored plaintext passwords rather than hash values; see "Multiple websites involved in the CSDN leak; plaintext passwords become the focus of controversy".)

Note that hashing only requires a mapping from A to B; it does not require the mapping to be one-to-one. It is therefore possible for two different keys to have the same hash value. This situation is called a hash collision. A checksum in a network protocol, for example, can collide: the content being verified differs from the original text, yet it produces the same checksum (hash value) as the original. Likewise, the MD5 algorithm is commonly used to compute the hash value of a password, but researchers have shown that MD5 collisions can be constructed, that is, different inputs that produce the same hash value, which opens a serious security hole in the system. (See hash collision.)

Hash and search

Hash tables are widely used in search. Set A is the set of objects to search for, set B is the set of storage locations, and the hash function maps each search object to its storage location. In this way, we can find an object's location by hashing. A common arrangement is to let set B be the subscripts of an array. Since an array element can be accessed directly by its subscript (random access, with complexity O(1)), the cost of a search depends only on the complexity of the hash function.

For example, we use a person's name (a string) as the key and the array subscript as the hash value. Each array element stores a pointer to a record (the person's name and phone number).

The following is a simple hash function:

#define HASHSIZE 1007

/* By Vamei
 * hash function
 */
int hash(char *p)
{
    int value = 0;
    while ((*p) != '\0') {
        value = value + (int)(*p);  // convert char to int, and sum
        p++;
    }
    return (value % HASHSIZE);      // won't exceed HASHSIZE
}

Hash value of "Vamei": 498

Hash value of "Obama": 480

We can create an array records of size HASHSIZE for storing the records. HASHSIZE is usually chosen to be a prime number so that the hash values are distributed more evenly (strictly speaking, 1007 = 19 × 53 is not prime; 1009 would be). To search for the "Vamei" record, we hash the name to get the hash value 498 and then read records[498] directly.
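The lookup through the array can be sketched as follows. The struct record layout and the helpers store and lookup are assumptions made for illustration; collisions are deliberately ignored here, so a later record with the same hash value simply overwrites an earlier one.

```c
#include <stddef.h>

#define HASHSIZE 1007

/* Hypothetical record layout: a name and a phone number. */
struct record {
    const char *name;
    const char *phone;
};

/* One slot per possible hash value; unused slots stay NULL. */
static struct record *records[HASHSIZE];

/* The additive hash from the text. */
static int hash(const char *p)
{
    int value = 0;
    while (*p != '\0') {
        value += (unsigned char)*p;
        p++;
    }
    return value % HASHSIZE;
}

/* Stores a record in the slot selected by the hash of its name.
 * Collisions are not handled in this sketch. */
void store(struct record *r)
{
    records[hash(r->name)] = r;
}

/* Returns whatever record sits in the slot for this name, or NULL.
 * The stored name is not compared, so this is only safe without collisions. */
struct record *lookup(const char *name)
{
    return records[hash(name)];
}
```

After storing a record for "Vamei", lookup("Vamei") reads records[498] in a single step, with no scan over the array.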

(666666 is Obama's phone number and 111111 is Vamei's. Both are pure fiction; please do not take them seriously.)

Hash search

If we do not use a hash and simply search an array, we have to visit the records one by one until we find the target, which has complexity O(n). Why is there such a difference? An array supports random access, but a subscript that is assigned arbitrarily has nothing to do with the value of the element, so we have to check the elements one after another. A hash function restricts which elements may be stored at each subscript. The key and the hash function thus give us prior knowledge with which to pick the right subscript. In the absence of hash collisions, a single access is guaranteed to reach the element we want.

Conflict

A hash table has to deal with hash conflicts. For example, with the hash function above, "Obama" and "Oaamb" have the same hash value, so they conflict. How do we resolve this?
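The conflict is easy to reproduce. Because the hash function only sums the character codes, any rearrangement of the same letters gives the same value. In the sketch below, the helper same_hash is a hypothetical name added for illustration.

```c
#define HASHSIZE 1007

/* The additive hash from the text: sum of character codes modulo HASHSIZE. */
int hash(const char *p)
{
    int value = 0;
    while (*p != '\0') {
        value += (unsigned char)*p;
        p++;
    }
    return value % HASHSIZE;
}

/* Returns 1 if the two keys hash to the same value (hypothetical helper). */
int same_hash(const char *a, const char *b)
{
    return hash(a) == hash(b);
}
```

Both "Obama" and "Oaamb" hash to 480, so same_hash reports a conflict for that pair.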

One solution is to store the conflicting records in a linked list and let the hash value point to that list. This is called open hashing:

Open hashing

When searching, we first locate the linked list from the hash value and then walk through the list, comparing keys, until we find the record. Other data structures can be used in place of the linked list.
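A minimal sketch of open hashing with singly linked chains might look like this. The names struct node, chain_insert and chain_find are assumptions for illustration. The sketch does not copy the strings, does not check for duplicate keys, and never frees the nodes.

```c
#include <stdlib.h>
#include <string.h>

#define HASHSIZE 1007

/* Hypothetical chain node holding one record. */
struct node {
    const char *name;
    const char *phone;
    struct node *next;
};

/* Each slot is the head of a chain; NULL means the chain is empty. */
static struct node *table[HASHSIZE];

static int hash(const char *p)
{
    int value = 0;
    while (*p != '\0') {
        value += (unsigned char)*p;
        p++;
    }
    return value % HASHSIZE;
}

/* Adds a record at the front of its chain.
 * Returns 0 on success, -1 if memory allocation fails. */
int chain_insert(const char *name, const char *phone)
{
    int h = hash(name);
    struct node *n = malloc(sizeof *n);

    if (n == NULL)
        return -1;
    n->name = name;
    n->phone = phone;
    n->next = table[h];
    table[h] = n;
    return 0;
}

/* Walks the chain selected by the hash value and compares keys.
 * Returns the phone number, or NULL if the name is not present. */
const char *chain_find(const char *name)
{
    struct node *n;

    for (n = table[hash(name)]; n != NULL; n = n->next)
        if (strcmp(n->name, name) == 0)
            return n->phone;
    return NULL;
}
```

With this layout, "Obama" and "Oaamb" end up in the same chain at slot 480, and the key comparison in chain_find tells them apart.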

Open hashing needs pointers. Sometimes we want to avoid pointers in order to keep the advantages of random access, so we use closed hashing to resolve conflicts.

Closed hashing

In this case, we store the records in the array itself. When there is a conflict, we place the conflicting record in a slot of the array that is still idle. In the figure, Oaamb is inserted after Obama and also hashes to position 480. Because 480 is occupied, Oaamb probes for the next idle position (by adding 1 to the hash value) and is recorded there.

The key to closed hashing is how to probe for the next position. Above, we added 1 to the hash value, but there are other ways. In general, on the i-th probe we examine position(i) = (h(x) + f(i)) % HASHSIZE. Adding 1 each time is equivalent to setting f(i) = i. When searching, we use position(i) to probe the places where the record might be, until the record is found or an idle slot shows that it is absent.
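The probing scheme can be sketched in C as below, stepping one slot at a time so that the i-th probe lands at h(x) + i. The names probe, closed_insert and closed_find are assumptions for illustration, and deletion is not supported in this sketch.

```c
#include <stddef.h>
#include <string.h>

#define HASHSIZE 1007

struct record {
    const char *name;   /* NULL marks an idle slot */
    const char *phone;
};

static struct record slots[HASHSIZE];

static int hash(const char *p)
{
    int value = 0;
    while (*p != '\0') {
        value += (unsigned char)*p;
        p++;
    }
    return value % HASHSIZE;
}

/* Probe function f(i); returning i steps one slot at a time. */
static int f(int i)
{
    return i;
}

/* Follows position(i) = (h(x) + f(i)) % HASHSIZE and returns the slot that
 * holds name or, if name is absent, the first idle slot on its probe path.
 * Returns -1 if the table is full and name is not in it. */
static int probe(const char *name)
{
    int h = hash(name);
    int i;

    for (i = 0; i < HASHSIZE; i++) {
        int pos = (h + f(i)) % HASHSIZE;
        if (slots[pos].name == NULL || strcmp(slots[pos].name, name) == 0)
            return pos;
    }
    return -1;
}

/* Stores the record and returns its slot index, or -1 if the table is full. */
int closed_insert(const char *name, const char *phone)
{
    int pos = probe(name);

    if (pos < 0)
        return -1;
    slots[pos].name = name;
    slots[pos].phone = phone;
    return pos;
}

/* Returns the phone number, or NULL if the name is not stored. */
const char *closed_find(const char *name)
{
    int pos = probe(name);

    if (pos < 0 || slots[pos].name == NULL)
        return NULL;
    return slots[pos].phone;
}
```

Inserting "Obama" and then "Oaamb" places them in slots 480 and 481. Swapping in a different f(i) changes the probe sequence without touching the rest of the code.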

(Different choices of f(i) give different results; we will not go into further depth here.)

If the array is nearly full, closed hashing needs many more probes to find an empty slot. This greatly reduces the efficiency of insertions and searches. In that case, we need to increase HASHSIZE and move the existing records into a new, larger array. This operation is called rehashing.
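Rehashing can be sketched as follows. Everything here is an assumption made for illustration: the struct table layout, the function names, the starting size of 7, and the policy of growing to 2 * size + 1 once the table would be more than half full. Because the hash value depends on the table size, every record has to be re-inserted rather than copied to the same index.

```c
#include <stdlib.h>
#include <string.h>

struct record {
    const char *name;       /* NULL marks an idle slot */
    const char *phone;
};

struct table {
    struct record *slots;
    size_t size;            /* number of slots */
    size_t used;            /* number of occupied slots */
};

static size_t hash(const char *p, size_t size)
{
    size_t value = 0;
    while (*p != '\0')
        value += (unsigned char)*p++;
    return value % size;
}

/* Linear-probing insert; assumes the table has at least one idle slot. */
static void place(struct table *t, const char *name, const char *phone)
{
    size_t pos = hash(name, t->size);

    while (t->slots[pos].name != NULL && strcmp(t->slots[pos].name, name) != 0)
        pos = (pos + 1) % t->size;
    if (t->slots[pos].name == NULL)
        t->used++;
    t->slots[pos].name = name;
    t->slots[pos].phone = phone;
}

/* Moves every record into a new array of new_size slots.
 * Returns 0 on success, -1 on a bad size or allocation failure. */
int rehash(struct table *t, size_t new_size)
{
    struct table bigger;
    size_t i;

    if (new_size <= t->used)
        return -1;
    bigger.slots = malloc(new_size * sizeof *bigger.slots);
    if (bigger.slots == NULL)
        return -1;
    bigger.size = new_size;
    bigger.used = 0;
    for (i = 0; i < new_size; i++)
        bigger.slots[i].name = NULL;
    for (i = 0; i < t->size; i++)       /* re-insert under the new size */
        if (t->slots[i].name != NULL)
            place(&bigger, t->slots[i].name, t->slots[i].phone);
    free(t->slots);
    *t = bigger;
    return 0;
}

/* Inserts a record, growing the table first if it would pass half full. */
int table_insert(struct table *t, const char *name, const char *phone)
{
    if (t->slots == NULL || 2 * (t->used + 1) > t->size) {
        size_t new_size = t->size ? 2 * t->size + 1 : 7;
        if (rehash(t, new_size) != 0)
            return -1;
    }
    place(t, name, phone);
    return 0;
}

/* Returns the phone number, or NULL if the name is not stored. */
const char *table_find(const struct table *t, const char *name)
{
    size_t pos;

    if (t->size == 0)
        return NULL;
    pos = hash(name, t->size);
    while (t->slots[pos].name != NULL) {
        if (strcmp(t->slots[pos].name, name) == 0)
            return t->slots[pos].phone;
        pos = (pos + 1) % t->size;
    }
    return NULL;
}
```

Starting from an empty table initialised as {NULL, 0, 0}, the first insertion allocates 7 slots and the fourth grows the table to 15. Growing before the table fills up keeps probe sequences short, and it guarantees that the search loop in table_find always reaches an idle slot.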

http://www.cnblogs.com/vamei/archive/2013/03/24/2970339.html
