Linear probing for handling hash collisions

Source: Internet
Author: User

A hash table is a data structure that is accessed directly by storage location according to a key (key value). That is, it locates a record by computing a function of the key, mapping the data being looked up to a position in the table, which speeds up the lookup. This mapping function is called a hash function, and the array that holds the records is called a hash table. (Excerpt from Wikipedia)

Different keys may produce the same hash address, that is, K1 != K2 but F(K1) = F(K2). This phenomenon is called a collision, also known as a hash collision.
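As a small illustration of a collision, the sketch below uses a character-sum hash, the same idea as the string functor in the code later in this article. The names `charSum` and `slotFor` and the table size of 10 are assumptions made for the example, and ASCII character codes are assumed.

```cpp
#include <cstddef>
#include <string>

// Character-sum hash: adds up the byte values of the string.
// Two strings made of the same characters in a different order always collide.
std::size_t charSum(const std::string& s) {
    std::size_t sum = 0;
    for (unsigned char c : s) sum += c;
    return sum;
}

// Hypothetical helper: maps a key to a slot in a table with `capacity` slots.
std::size_t slotFor(const std::string& key, std::size_t capacity) {
    return charSum(key) % capacity;
}
```

With these definitions, `slotFor("ab", 10)` and `slotFor("ba", 10)` return the same slot even though the keys differ, which is exactly the K1 != K2, F(K1) = F(K2) situation.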

    • There are many ways to handle hash collisions:

    1. Closed hashing (open addressing)

    2. Separate chaining (hash buckets)

    3. Prime table

    4. String hashing algorithm
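For contrast with closed hashing, here is a minimal sketch of the second method in the list, where every slot is a bucket holding a list of keys. The class `ChainedSet` and its member names are hypothetical and not part of the article's code; it stores `int` keys only and assumes a bucket count greater than zero.

```cpp
#include <cstddef>
#include <list>
#include <vector>

// Hash buckets: colliding keys share a bucket instead of probing for another slot.
class ChainedSet {
public:
    explicit ChainedSet(std::size_t bucketCount) : buckets_(bucketCount) {}

    // Inserts `key`; returns false if it was already present.
    bool insert(int key) {
        std::list<int>& bucket = buckets_[indexOf(key)];
        for (int k : bucket)
            if (k == key) return false;
        bucket.push_back(key);
        return true;
    }

    bool contains(int key) const {
        const std::list<int>& bucket = buckets_[indexOf(key)];
        for (int k : bucket)
            if (k == key) return true;
        return false;
    }

    // Number of keys stored in the bucket that `key` maps to.
    std::size_t bucketSize(int key) const {
        return buckets_[indexOf(key)].size();
    }

private:
    std::size_t indexOf(int key) const {
        return static_cast<std::size_t>(key) % buckets_.size();
    }
    std::vector<std::list<int>> buckets_;
};
```

In a set with 10 buckets, inserting 9 and 19 puts both keys in the same bucket, so a collision costs a short list walk rather than a search for a free slot.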

Here we discuss linear probing, the simplest form of closed hashing. Once this method is understood, its ideas make the other methods easier to follow.

    • Linear probing

Definition: the hash function hash(key) gives the position of the key in the linear sequence (the table). If that position already holds a key, a hash collision has occurred, and the probe moves on to the next position i (with i less than the size of the linear sequence), repeating until it reaches a position that holds no key.
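The definition above can be sketched as follows. `ProbeTable` is a hypothetical, fixed-capacity table of non-negative `int` keys that uses `key % capacity` as its hash; it assumes a capacity greater than zero and is not the article's `HashTable` class.

```cpp
#include <cstddef>
#include <vector>

// Minimal open-addressing table (illustrative only).
struct ProbeTable {
    std::vector<int> slots;
    std::vector<bool> used;

    explicit ProbeTable(std::size_t capacity)
        : slots(capacity, 0), used(capacity, false) {}

    // Returns the slot where `key` was stored, or slots.size() if the table is full.
    std::size_t insert(int key) {
        const std::size_t cap = slots.size();
        std::size_t index = static_cast<std::size_t>(key) % cap;  // hash(key)
        for (std::size_t i = 0; i < cap; ++i) {
            if (!used[index]) {          // free position: store the key here
                slots[index] = key;
                used[index] = true;
                return index;
            }
            index = (index + 1) % cap;   // collision: probe the next position
        }
        return cap;                      // every position is occupied
    }
};
```

In a table of capacity 10, inserting 89, 18, 49, 58 and 9 in that order places them in slots 9, 8, 0, 1 and 2: 49 collides with 89 at slot 9 and moves on to slot 0, and the later keys walk past the slots that are already taken.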



// HashTable.h
#pragma once
#include <cstddef>
#include <iostream>
#include <string>
using namespace std;

// State of each slot in the table
enum State { EMPTY, EXIST, DELETE };

// Default hash functor: for keys that convert directly to size_t
template<class T>
struct DefaultFunc
{
    size_t operator()(const T& data)
    {
        return (size_t)data;
    }
};

// String hash functor: sums the characters of the string
struct StringFunc
{
    size_t operator()(const string& str)
    {
        size_t sum = 0;
        for (size_t i = 0; i < str.size(); ++i)
        {
            sum += str[i];
        }
        return sum;
    }
};

template<class K, class FuncModel = DefaultFunc<K>>
class HashTable
{
public:
    HashTable();
    HashTable(const size_t size);
    bool Push(const K& data);                     // insert
    bool Remove(const K& data);                   // delete
    size_t Find(const K& data);                   // look up; returns (size_t)-1 if absent
    bool Alter(const K& data, const K& newdata);  // modify
    void Print();                                 // print the hash table
protected:
    size_t HashFunc(const K& data);               // hash function
    void Swap(HashTable<K, FuncModel>& x);
protected:
    K* _table;         // hash table
    State* _state;     // state table
    size_t _size;
    size_t _capacity;
    FuncModel _hf;     // selects the default hash function or the string hash function
};


.cpp file

// HashTable.cpp
#define _CRT_SECURE_NO_WARNINGS 1
#include <cstddef>
#include <utility>
#include "HashTable.h"

template<class K, class FuncModel>
HashTable<K, FuncModel>::HashTable()
    : _table(NULL), _state(NULL), _size(0), _capacity(0)
{}

template<class K, class FuncModel>
HashTable<K, FuncModel>::HashTable(const size_t size)
    : _table(new K[size]), _state(new State[size]), _size(0), _capacity(size)
{
    // Initialize _state with a plain loop: memset() fills bytes, not enum values.
    for (size_t i = 0; i < _capacity; i++)
    {
        _state[i] = EMPTY;
    }
}

template<class K, class FuncModel>
size_t HashTable<K, FuncModel>::HashFunc(const K& data)
{
    // Take the hash modulo the capacity to find the position in the table.
    // In practice it is best for the modulus to be a prime number.
    return _hf(data) % _capacity;
}

template<class K, class FuncModel>
void HashTable<K, FuncModel>::Swap(HashTable<K, FuncModel>& x) // exchange two hash tables
{
    std::swap(_table, x._table);
    std::swap(_state, x._state);
    std::swap(_size, x._size);
    std::swap(_capacity, x._capacity);
}

template<class K, class FuncModel>
bool HashTable<K, FuncModel>::Push(const K& data)
{
    if (_size * 10 >= _capacity * 8) // keep the load factor below 0.8
    {
        HashTable<K, FuncModel> tmp(2 * _capacity + 2);
        for (size_t i = 0; i < _capacity; ++i)
        {
            if (_state[i] == EXIST)
            {
                size_t index = tmp.HashFunc(_table[i]); // rehash with the new capacity
                while (tmp._state[index] == EXIST)
                {
                    if (++index == tmp._capacity) index = 0; // wrap around at the end
                }
                tmp._table[index] = _table[i];
                tmp._state[index] = EXIST;
                tmp._size++;
            }
        }
        Swap(tmp);
    }
    size_t index = HashFunc(data);
    while (_state[index] == EXIST)
    {
        if (++index == _capacity) index = 0; // wrap around at the end
    }
    _table[index] = data;
    _state[index] = EXIST;
    _size++;
    return true;
}

template<class K, class FuncModel>
void HashTable<K, FuncModel>::Print()
{
    for (size_t i = 0; i < _capacity; ++i)
    {
        cout << "_table[" << i << "]: ";
        if (_state[i] == EXIST)
            cout << _table[i] << " - exist";
        else if (_state[i] == DELETE)
            cout << _table[i] << " - delete";
        else
            cout << "empty";
        cout << endl;
    }
}

template<class K, class FuncModel>
bool HashTable<K, FuncModel>::Remove(const K& data)
{
    size_t index = Find(data);
    if (index == (size_t)-1) // not found (index 0 is a valid position)
        return false;
    _state[index] = DELETE;
    _size--;
    return true;
}

template<class K, class FuncModel>
size_t HashTable<K, FuncModel>::Find(const K& data)
{
    if (_capacity == 0)
        return (size_t)-1;
    size_t index = HashFunc(data);
    size_t time = _capacity;
    while (time--)
    {
        if (_state[index] == EMPTY) // an empty slot ends the probe sequence
            break;
        if (_state[index] == EXIST && _table[index] == data)
            return index;
        if (++index == _capacity) index = 0;
    }
    return (size_t)-1;
}

template<class K, class FuncModel>
bool HashTable<K, FuncModel>::Alter(const K& data, const K& newdata)
{
    if (!Remove(data)) // marks the old key as DELETE and updates _size
        return false;
    return Push(newdata);
}

The following are some issues to be aware of in the implementation:

    1. With linear probing, a probe sometimes starts near the end of the hash table and is still colliding when it reaches the last slot. The colliding key then has to go into the first part of the table, so the index is simply reset to 0 once the end of the table is reached. Simple and crude.



    2. Deleting data from the hash table is a weak (lazy) deletion: the data itself is not erased, only its entry in _state is set to DELETE.

    3. When the load factor exceeds 0.8, the table grows. The higher the load factor, the more hash collisions occur and the more probes each operation needs, and the CPU cache miss rate rises sharply. Load factor a = number of elements in the table / length of the hash table.
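The three points above can be seen together in a small sketch. `TinyTable`, `Slot` and `kNotFound` are hypothetical names for this example only; the table stores non-negative `int` keys, has a fixed capacity greater than zero, never grows, and does not check for duplicate keys.

```cpp
#include <cstddef>
#include <vector>

enum class Slot { Empty, Exist, Deleted };

const std::size_t kNotFound = static_cast<std::size_t>(-1);

class TinyTable {
public:
    explicit TinyTable(std::size_t capacity)
        : keys_(capacity, 0), state_(capacity, Slot::Empty), size_(0) {}

    // Point 3: load factor = stored elements / table length.
    double loadFactor() const {
        return static_cast<double>(size_) / static_cast<double>(keys_.size());
    }

    // Stores `key` in the first slot that is not occupied; false if the table is full.
    bool insert(int key) {
        std::size_t index = home(key);
        for (std::size_t i = 0; i < keys_.size(); ++i) {
            if (state_[index] != Slot::Exist) {
                keys_[index] = key;
                state_[index] = Slot::Exist;
                ++size_;
                return true;
            }
            index = next(index);
        }
        return false;
    }

    // Skips Deleted slots and stops at the first Empty one.
    std::size_t find(int key) const {
        std::size_t index = home(key);
        for (std::size_t i = 0; i < keys_.size(); ++i) {
            if (state_[index] == Slot::Empty) return kNotFound;
            if (state_[index] == Slot::Exist && keys_[index] == key) return index;
            index = next(index);
        }
        return kNotFound;
    }

    // Point 2: lazy deletion, only the state changes.
    bool erase(int key) {
        std::size_t index = find(key);
        if (index == kNotFound) return false;
        state_[index] = Slot::Deleted;
        --size_;
        return true;
    }

private:
    std::size_t home(int key) const {
        return static_cast<std::size_t>(key) % keys_.size();
    }
    // Point 1: after the last slot, the probe continues at index 0.
    std::size_t next(std::size_t index) const {
        return index + 1 == keys_.size() ? 0 : index + 1;
    }

    std::vector<int> keys_;
    std::vector<Slot> state_;
    std::size_t size_;
};
```

With capacity 10, inserting 3, 13 and 23 fills slots 3, 4 and 5. After `erase(13)`, `find(23)` still returns 5 because slot 4 is marked as deleted rather than empty; if it were reset to empty, the probe would stop there and 23 would appear to be missing.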



Two notes on the code:

    • Here I separate the template declaration from its definition, which involves separate compilation of templates. If separate compilation of templates is unclear, see the author's blog post http://helloleex.blog.51cto.com/10728491/1769994

    • To make the code more reusable, I use a functor to distinguish between hashing the default types (built-in types, custom types) and the string type, which makes the call more flexible.
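A minimal sketch of the functor idea follows. `DefaultFunc` and `StringFunc` mirror the functors in the article's header, while the free function `slotIndex` is a hypothetical stand-in for the table's hash step, added only to show how the template parameter selects the functor.

```cpp
#include <cstddef>
#include <string>

// Default functor: for keys that convert directly to size_t (int, char, ...).
template <class T>
struct DefaultFunc {
    std::size_t operator()(const T& data) const {
        return static_cast<std::size_t>(data);
    }
};

// String functor: sums the characters of the string.
struct StringFunc {
    std::size_t operator()(const std::string& str) const {
        std::size_t sum = 0;
        for (std::size_t i = 0; i < str.size(); ++i)
            sum += static_cast<unsigned char>(str[i]);
        return sum;
    }
};

// Hypothetical helper: the second template parameter picks the hash functor,
// and defaults to DefaultFunc<K> when the caller does not name one.
template <class K, class FuncModel = DefaultFunc<K> >
std::size_t slotIndex(const K& key, std::size_t capacity) {
    FuncModel hf;
    return hf(key) % capacity;
}
```

A call such as `slotIndex<int>(42, 10)` uses the default functor, while `slotIndex<std::string, StringFunc>("ab", 10)` switches to the string functor without any change to the helper itself.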

This article is from the "Straw Sunshine" blog; please be sure to keep this source: http://helloleex.blog.51cto.com/10728491/1770568
