Applications of collision-free hashing

Source: Internet
Author: User

As we all know, hashing is fast. It is fast enough that we dare to claim O(1) time complexity, and in fame it rivals quicksort (so far the only sort that gets to carry "quick" in its name). It is also extremely useful, which accounts for its special standing. Yet most discussions of hashing spend their time on how to construct hash functions and how to avoid collisions, and then drag in MD5. That makes hashing look like something you rarely use in everyday code. So let's put it bluntly: if the keys cannot collide in the first place, there is no need for chaining (the "zipper" method) or any other collision handling. There is not even a need for a hash function, and the lookup is as fast as it gets.
What this article covers involves essentially no collisions at all; it is what people usually call "direct addressing". Here we look at how it is used in ordinary programs. (Note 1: all code in this article is C. Note 2: I am no expert, so all the examples are easy problems; take them for what they are.) This article is only meant to start the discussion. If you have better ideas, please share them!


Let's start with something very common:
If n is greater than 5, print "Yes"; otherwise, print "No". Needless to say, the obvious way to write it is:

if (n > 5) {printf("Yes");} else {printf("No");}

Or, written in one line:

printf(n > 5 ? "Yes" : "No");

Now, written with a hash table:

const char *hash[2] = { "No", "Yes" };
printf("%s", hash[n > 5]);

It looks a bit strange, and it is less convenient than the one-liner above. Using a hash table here is admittedly awkward, so let's look at the following problems.


1. Hash by ASCII code
Remove the specified letters from a string.
For example, removing the letters in T: "aeiou" from S: "Hello World" gives "Hll Wrld".
This simple problem is no challenge for experts, but it is a good place to start.
Most people would first write it like this:
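char s[] = "hello world";
char t[] = "aeiou";
int i, j, len;
/* For each character of s, scan t for a match */
for (i = len = 0; s[i]; i++) {
    for (j = 0; t[j]; j++) {
        if (s[i] == t[j])
            break;
    }
    if (!t[j])              /* reached the end of t: no match, keep it */
        s[len++] = s[i];
}
s[len] = 0;
printf("%s", s);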
That version uses an inner for loop (or a long condition such as if (s[i] != 'a' && ...)). How about using a hash table instead?
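char s[] = "hello world";
char t[] = "aeiou";
char hash[128];             /* one slot per ASCII code; assumes ASCII input */
int i, len;
memset(hash, 0, sizeof(hash));
for (i = 0; t[i]; i++) {
    hash[(unsigned char)t[i]] = 1;      /* mark the letters to remove */
}
for (i = len = 0; s[i]; i++) {
    if (!hash[(unsigned char)s[i]])     /* one lookup replaces the inner loop */
        s[len++] = s[i];
}
s[len] = 0;
printf("%s", s);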
The code is not noticeably shorter and it uses more space, but the time complexity drops from O(|S|*|T|) to O(|S| + |T|).
As the example shows, initializing this kind of hash table (not a set) is simple, and initialization and lookup use the same operation: both go through hash[s[i]]. The pattern can be reused as a template.
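To illustrate the "template" remark, here is a minimal sketch that wraps the same table-driven idea in a function. The name remove_chars and the 256-entry table (so that any byte value is a valid subscript, not only ASCII) are assumptions made for this example, not part of the original code.

```c
#include <string.h>

/* Remove from s, in place, every character that also appears in t.
   Returns the new length of s.  (Hypothetical helper for illustration.) */
size_t remove_chars(char *s, const char *t)
{
    unsigned char table[256];   /* one slot per possible byte value */
    size_t i, len = 0;

    memset(table, 0, sizeof(table));
    for (i = 0; t[i]; i++)
        table[(unsigned char)t[i]] = 1;      /* mark characters to drop */

    for (i = 0; s[i]; i++)
        if (!table[(unsigned char)s[i]])     /* direct lookup, no inner loop */
            s[len++] = s[i];
    s[len] = '\0';
    return len;
}
```

Calling remove_chars on a buffer holding "Hello World" with t set to "aeiou" leaves "Hll Wrld" in the buffer.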
Speaking of which, take a look at a POJ problem: POJ 2339, Rock, Scissors, Paper. It is a plain simulation in which rock, scissors and paper fight back and forth, and the main work is checking the four neighbors of each cell. Done the naive way, that means writing 3*4 = 12 if statements, or three very long ones strung together, which is enough to make you airsick. With a hash table, a single if statement is enough:
// Initialization, just a few lines
char T[128];
T['R'] = 'P'; T['P'] = 'S'; T['S'] = 'R';   // T[x] is whatever beats x

// Check whether any of the four neighbors beats this cell;
// the following can be written as a separate function
char c = T[s[i][j]];
if (s[i-1][j] == c || s[i+1][j] == c || s[i][j-1] == c || s[i][j+1] == c)
    return c;
else
    return s[i][j];
See how sharply the amount of code drops? And it costs nothing in speed, so what's not to like? Next, consider Tianti 1487: a hash table makes short work of it and is not slow either. As for 1230, it is pure hashing from start to finish.
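The fragment for POJ 2339 can be sketched as a self-contained function. The grid size (ROWS, COLS), the names beaten_by, init_table and next_cell, and the explicit bounds checks are all assumptions added for this example; the original fragment does not show how the grid borders are handled.

```c
#include <string.h>

#define ROWS 3   /* hypothetical grid size, for illustration only */
#define COLS 3

static char beaten_by[128];   /* beaten_by[x] is the life form that beats x */

void init_table(void)
{
    memset(beaten_by, 0, sizeof(beaten_by));
    beaten_by['R'] = 'P';   /* paper beats rock */
    beaten_by['P'] = 'S';   /* scissors beat paper */
    beaten_by['S'] = 'R';   /* rock beats scissors */
}

/* Occupant of cell (i, j) after one round: if any of the four
   neighbors beats the current occupant, that neighbor takes over. */
char next_cell(char g[ROWS][COLS], int i, int j)
{
    char c = beaten_by[(unsigned char)g[i][j]];

    if ((i > 0        && g[i - 1][j] == c) ||
        (i + 1 < ROWS && g[i + 1][j] == c) ||
        (j > 0        && g[i][j - 1] == c) ||
        (j + 1 < COLS && g[i][j + 1] == c))
        return c;
    return g[i][j];
}
```

A full round would call next_cell for every cell and write the results into a second grid, so that all cells are updated from the same snapshot.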


2. Bitwise hash
Here we take a closer look at POJ 2777, Count Color. I believe this easy problem is well known, and that everyone can see it is solved bit by bit: each color corresponds to one bit, and with at most 30 colors a single int is enough. As you may have guessed, bitwise hashing is, to put it plainly, a way to save space, but it also happens to be the better fit here. (A side note: we all know this problem is solved with a segment tree, but the lowbit of a binary indexed tree can be used to count the number of colors. Space is limited, so no code is posted here; see for yourself at: http://blog.csdn.net/sg_siqing/article/details/12209027)
However, bitwise hashing has a drawback: each bit can only represent two states, which limits where it can be used. It only fits problems in which every key has exactly two states. Another drawback is that it needs a hash function (shifts and masks) to work out which bit to touch, so it is less direct than indexing by ASCII code, where the value is used as the subscript as-is. So I do not recommend it in general, unless you really need to save that 7/8 of the space.
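As a rough illustration of the one-bit-per-color idea, the sketch below stores a set of colors numbered 1 to 30 in a single unsigned int. The type name colorset and the function names are hypothetical, and the sketch assumes unsigned int has at least 32 bits. It shows only the bit handling, not the segment tree that the problem itself needs.

```c
/* Bit (c - 1) is set when color c (1..30) is present. */
typedef unsigned int colorset;

colorset add_color(colorset set, int c)
{
    return set | (1u << (c - 1));
}

int has_color(colorset set, int c)
{
    return (int)((set >> (c - 1)) & 1u);
}

/* Merging the colors of two segments is a single OR. */
colorset union_colors(colorset a, colorset b)
{
    return a | b;
}

/* Count the colors by clearing the lowest set bit until none remain. */
int count_colors(colorset set)
{
    int n = 0;
    while (set) {
        set &= set - 1;
        n++;
    }
    return n;
}
```

The shift in add_color and has_color is the "hash function" that the text mentions: a small computation is needed to get from the key to its bit, unlike plain array indexing.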


3. Data Structure
I) The evolved form: the trie
This is nothing exotic, just the trie ("dictionary tree") that everyone is familiar with. However I look at it, it looks like hashing to me; maybe this is the so-called "multi-level hash". A hash table already wastes space, so doesn't a trie built out of so many hash tables waste even more?! But for strings, its lookup efficiency is extremely high.
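To make the "many small hash tables" view concrete, here is a minimal trie sketch in which every node holds a directly addressed array of 26 child pointers. It assumes keys consist only of lowercase letters 'a' to 'z'; the names trie_node, trie_new, trie_insert and trie_contains are chosen for this example.

```c
#include <stdlib.h>

#define ALPHABET 26   /* assumption: keys contain only 'a'..'z' */

struct trie_node {
    struct trie_node *child[ALPHABET];  /* a tiny direct-address table */
    int is_word;                        /* nonzero if a key ends here */
};

/* calloc zero-fills the node, which yields null child pointers on
   common platforms. */
struct trie_node *trie_new(void)
{
    return calloc(1, sizeof(struct trie_node));
}

/* Returns 0 on success, -1 if memory runs out. */
int trie_insert(struct trie_node *root, const char *word)
{
    for (; *word; word++) {
        int k = *word - 'a';            /* letter -> subscript */
        if (!root->child[k]) {
            root->child[k] = trie_new();
            if (!root->child[k])
                return -1;
        }
        root = root->child[k];
    }
    root->is_word = 1;
    return 0;
}

int trie_contains(const struct trie_node *root, const char *word)
{
    for (; *word; word++) {
        root = root->child[*word - 'a'];
        if (!root)
            return 0;
    }
    return root->is_word;
}
```

Each step of a lookup is one array access, so a key of length L is found in O(L) regardless of how many keys are stored. The price is 26 pointers per node, which is the space waste the text complains about.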
Ii) Auxiliary linear table
What does that mean? Should a hash be paired with a linear table? Yes. Sometimes you do not need to bother with a hash function at all. Keep a linear table, store each item in it, and refer to the item by its subscript. Collisions are not a concern, because there can never be any. To store an item, append it at the end of the table. Deleting is just as convenient: move the last item into the vacated slot. Both operations are O(1), although the subscripts that point at the moved item and the table length have to be updated. Besides, deletions are usually rare, and in some problems nothing ever needs to be deleted.
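The append and delete-by-moving-the-last-item operations described above might look like the following sketch. It assumes integer keys in the range 0 to MAX_KEY - 1, and the structure and function names are hypothetical. The pos array plays the role of the hash: it maps a key directly to its subscript in the table.

```c
#define MAX_KEY 1000   /* hypothetical bound on key values */

struct key_list {
    int items[MAX_KEY]; /* stored keys, in no particular order */
    int pos[MAX_KEY];   /* pos[k] = subscript of k in items, or -1 */
    int len;
};

void list_init(struct key_list *l)
{
    int i;
    l->len = 0;
    for (i = 0; i < MAX_KEY; i++)
        l->pos[i] = -1;
}

/* Append key at the end. Returns 1 if added, 0 if invalid or present. */
int list_add(struct key_list *l, int key)
{
    if (key < 0 || key >= MAX_KEY || l->pos[key] != -1)
        return 0;
    l->pos[key] = l->len;
    l->items[l->len++] = key;
    return 1;
}

/* Delete key by moving the last item into its slot, in O(1).
   Returns 1 if removed, 0 if the key was not present. */
int list_remove(struct key_list *l, int key)
{
    int hole, last;

    if (key < 0 || key >= MAX_KEY || l->pos[key] == -1)
        return 0;
    hole = l->pos[key];
    last = l->items[l->len - 1];
    l->items[hole] = last;   /* fill the hole with the last item */
    l->pos[last] = hole;     /* update the moved item's subscript */
    l->pos[key] = -1;
    l->len--;
    return 1;
}
```

Because the order of items changes on deletion, this only suits cases where the table is an unordered collection.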
Enough talk; let's look at a problem. POJ 1451, T9, is a trie problem. My solution uses two tries, one hash table, and one linear table (don't laugh). The hash table links the two tries, and each node of the second trie holds a subscript that points into the linear table. That way essentially every operation is O(1), and everything is linked by subscripts. As it happens, Tianti 1985, the GameZ game ranking system, can be handled the same way: a balanced binary tree plus a hash table solves it without any trouble.
As before, only links to the code are given here:
Poj1451: http://blog.csdn.net/sg_siqing/article/details/12207153
Wikioi1985: http://blog.csdn.net/sg_siqing/article/details/14647649


4. Cryptographic applications
What people mostly talk about online is MD5, which is said to be a hash. I have never implemented it or read about it in detail, so I cannot say much about it. But I have written DES encryption, and that code definitely uses tables as hashes! If you are interested, have a look; I will not elaborate here: http://blog.csdn.net/sg_siqing/article/details/21085471


5. Collisions
Most collisions come from string processing, and most such problems involve a million or more strings, that is, at least 10^6. An int, however, can hold values up to 2^31, which is on the order of 10^9, although allocating an array that large is somewhat difficult. So let's take a shortcut: find a string hash algorithm online whose values are spread over a range of 10^9. The hash value range is then roughly 100 times the number of strings or more, and the number of collisions drops considerably. If only the hash values are computed and the original strings are not kept, a balanced binary tree can record the hash values to reduce the space cost, at the price of an extra log n factor in time. Note: this approach is not guaranteed to work. It is only a hypothesis, because collisions can still occur even with 100 times the space, and whether they do depends on how the string hash function computes its values.
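To put a number on that caveat, the sketch below estimates how many colliding pairs to expect. It assumes an idealized hash that spreads n keys uniformly and independently over m values, in which case the expected number of colliding pairs is n(n-1)/(2m). The function name is hypothetical, and a real string hash may well do worse than this idealized model.

```c
/* Expected number of colliding pairs when n keys are hashed
   uniformly at random into m possible values: n(n-1)/(2m). */
double expected_colliding_pairs(double n, double m)
{
    return n * (n - 1.0) / (2.0 * m);
}
```

With n = 10^6 strings and m = 10^9 hash values, the estimate comes to roughly 500 colliding pairs, so collisions become rare but do not disappear. That is consistent with the note above that the approach is not guaranteed to work.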


6. Advantages and Disadvantages
Disadvantages:
I) It wastes a great deal of space;
Ii) It is not obvious at first glance, so readability is low;
Iii) It usually needs another data structure alongside it to provide a list of keys or values.
......
Now for the advantages:
I) It is fast, as everyone knows;
Ii) It is easy to use, and each operation is basically O(1);
Iii) It condenses and simplifies the code. Similar branches can be folded into one statement, avoiding long chains of if or switch tests;
......


7. Key
As we all know, binary search compares values by size and eventually arrives at the matching subscript. Hashing does the opposite: it uses the value directly as the subscript and reaches the corresponding state in one step. The key to hashing is therefore the conversion between value and subscript.
At the beginning of this article I said we would not discuss how to handle collisions. Why? Because hashing on its own is a very handy thing. Once collision handling is added, the amount of code soars and the gain is no longer worth the cost. Without collisions, on the other hand, it is easy to use and the code is short and lean (isn't that true of all good algorithms?!).
A few comparable cases: the segment tree and the binary indexed tree ("tree array") are like brothers, yet as long as you do not need both range update and range query, I believe more people will pick the binary indexed tree. Its lowbit also extracts the lowest set bit of a number's binary representation, which the segment tree does not offer. A pair of stacks (not a double-ended stack) can implement O(1) manual memory allocation and reclamation, or answer the maximum/minimum-in-stack problem. The two-pointer technique from quicksort can find the K-th largest/smallest element in O(n) on average, or find two numbers in a sorted array that add up to a given value... Every data structure and algorithm has something it does well, so rather than forcing it into what it is bad at, let it do what it is good at!
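As a small example of the value-to-subscript conversion, the sketch below counts occurrences of integers known to lie in a fixed range by shifting each value into a valid subscript. The range limits and the function names are assumptions made for illustration.

```c
#define MIN_VALUE (-1000)   /* hypothetical value range */
#define MAX_VALUE 1000
#define RANGE (MAX_VALUE - MIN_VALUE + 1)

/* The whole "hash function": shift the value so the smallest maps to 0. */
int to_index(int value) { return value - MIN_VALUE; }
int to_value(int index) { return index + MIN_VALUE; }

/* Returns the most frequent value in a[0..n-1], n > 0; on a tie the
   smallest such value wins. All values must lie in the range above. */
int most_frequent(const int *a, int n)
{
    int count[RANGE] = {0};
    int i, best = 0;

    for (i = 0; i < n; i++)
        count[to_index(a[i])]++;        /* value used as subscript */
    for (i = 1; i < RANGE; i++)
        if (count[i] > count[best])
            best = i;
    return to_value(best);              /* subscript back to value */
}
```

No comparison between elements ever happens while counting; each value lands in its slot in one step, which is the contrast with binary search that the paragraph draws.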


8. Hash?
Some people may say that this is not hashing at all; it is called indexing. I say: call it whatever you like. Nobody has given it a proper name, so I just call it hashing. I use it, I call it by my own name, and let others say what they will, hahaha...
