String matching (hash algorithm)

Source: Internet
Author: User

Hash functions should be familiar to most readers.

This time we use a hash function to implement string matching.

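First, consider binary numbers.

Any binary number can be converted to decimal by multiplying each digit by the matching power of 2 and adding up the results. Taking the binary number 1101101 as an example:

1101101 (base 2) = 1*2^6 + 1*2^5 + 0*2^4 + 1*2^3 + 1*2^2 + 0*2^1 + 1*2^0 = 109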
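The conversion can also be done one digit at a time, which is the same shape the hash recurrence takes. A minimal C++ sketch, where the function name binary_to_decimal is only an illustrative choice and the value is assumed to fit in 64 bits:

```cpp
#include <string>

// Convert a string of '0'/'1' characters to its numeric value.
// Assumption: the value fits in 64 bits (at most 64 digits).
unsigned long long binary_to_decimal(const std::string& bits) {
    unsigned long long value = 0;
    for (char c : bits) {
        // Shift the digits seen so far one place to the left,
        // then append the new digit.
        value = value * 2 + static_cast<unsigned long long>(c - '0');
    }
    return value;
}
```

Calling binary_to_decimal("1101101") gives 109. Replacing 2 with another base x and the digits with character codes gives a polynomial string hash.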

Hashing works on the same principle, with the string treated as a number in base x. For each prefix (suffixes work too; the author is used to 1-based indexing and so prefers prefixes) we compute hash[i] = hash[i-1] * x + s[i], where 1 <= i <= n and hash[0] = 0.

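In general,

hash[i] = s[1] * x^(i-1) + s[2] * x^(i-2) + ... + s[i-1] * x + s[i]

The hash value for the interval [l, r] is:

hash(l, r) = hash[r] - hash[l-1] * x^(r-l+1)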

But what if n is large? Won't the value overflow?

So we store the hash values in unsigned long long. When a value overflows it automatically wraps around, which is the same as taking it modulo 2^64. This means two different strings can end up with the same hash value, but the probability is very low (though you cannot rule out bad luck).
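As a small illustration of the wraparound, here is a hedged C++ sketch of a whole-string polynomial hash. The function name poly_hash is hypothetical; the base 163 and the mapping of 'a' to 0 follow the code used later in this article, and lowercase input is assumed.

```cpp
#include <cstdint>
#include <string>

// Polynomial hash of a lowercase string. Unsigned 64-bit arithmetic wraps
// around on overflow, so the result is implicitly taken modulo 2^64.
// Assumption: s contains only the letters 'a'..'z'.
std::uint64_t poly_hash(const std::string& s, std::uint64_t base = 163) {
    std::uint64_t h = 0;
    for (char c : s) {
        h = h * base + static_cast<std::uint64_t>(c - 'a');
    }
    return h;
}
```

Because 'a' maps to 0, strings such as "a" and "aa" both hash to 0, so hash values should only be compared between strings of equal length, which is what the substring comparisons in this article do.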

So we can test two strings for equality by comparing their hash values.

Code for computing the polynomial hash:

#include <cstring>

typedef unsigned long long ull;
const int N = 100000 + 5;
const ull base = 163;
char s[N];          // the string is stored 1-based, starting at s[1]
ull hash[N], p[N];  // hash[i]: hash of the prefix s[1..i]; p[i]: base^i

void init() { // precompute the powers of base and the prefix hash values
    p[0] = 1;
    hash[0] = 0;
    int n = strlen(s + 1);
    for (int i = 1; i <= 100000; i++) p[i] = p[i-1] * base;
    for (int i = 1; i <= n; i++) hash[i] = hash[i-1] * base + (s[i] - 'a');
}

ull get(int l, int r, ull g[]) { // hash of the substring s[l..r], taken from the prefix hash array g
    return g[r] - g[l-1] * p[r-l+1];
}
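To see the interval formula at work, here is a self-contained sketch that wraps the same computation in a small struct. The names PrefixHash and kBase are hypothetical and not from the original code; positions are 1-based to match hash[i] above, and lowercase input is assumed.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

const std::uint64_t kBase = 163;  // same base as in the article's code

// Hypothetical helper: prefix hashes of one lowercase string.
struct PrefixHash {
    std::vector<std::uint64_t> h;  // h[i]: hash of the first i characters
    std::vector<std::uint64_t> p;  // p[i]: kBase^i (modulo 2^64)

    explicit PrefixHash(const std::string& s)
        : h(s.size() + 1, 0), p(s.size() + 1, 1) {
        for (std::size_t i = 1; i <= s.size(); ++i) {
            p[i] = p[i - 1] * kBase;
            h[i] = h[i - 1] * kBase + static_cast<std::uint64_t>(s[i - 1] - 'a');
        }
    }

    // Hash of the substring at positions l..r (1-based, inclusive).
    std::uint64_t get(std::size_t l, std::size_t r) const {
        return h[r] - h[l - 1] * p[r - l + 1];
    }
};
```

For the string "abcabc", get(1, 3) and get(4, 6) are equal because both cover "abc", while get(2, 4) covers "bca" and differs. Each query is O(1) after O(n) preprocessing.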

Now let's look at a problem.

Problem statement:

A message consists of a ciphertext followed by its plaintext, but the recipient does not know where the ciphertext ends and the plaintext begins, so you are asked to restore the message. The plaintext part may be shorter than the ciphertext, in which case you must complete it. The first line of each test case is the conversion table for the cipher: the i-th letter of the table is translated into the i-th letter of the alphabet. In the second sample, for example, w is translated into b.

Approach:

Translate the whole string with the conversion table, then find the longest suffix of the original string that equals a prefix of the translated string. (Note: because the plaintext part is never longer than the ciphertext part, the match cannot be longer than half of the original length.)
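Before adding hashing, the approach can be sketched as a brute-force reference that compares substrings directly. This is not the article's solution, only a quadratic-time illustration; the function name restore and its parameters are hypothetical, and the table is assumed to be a permutation of the 26 lowercase letters.

```cpp
#include <cstddef>
#include <string>

// Brute-force sketch (quadratic time, no hashing).
// table: 26 letters; table[i] is the ciphertext letter that is translated
//        into the plaintext letter 'a' + i.
// s:     the received string, a full ciphertext followed by a possibly
//        shortened plaintext.
std::string restore(const std::string& table, const std::string& s) {
    char decode[26];
    for (int i = 0; i < 26; ++i) {
        decode[table[i] - 'a'] = static_cast<char>('a' + i);
    }

    // Translate every character as if the whole string were ciphertext.
    std::string translated = s;
    for (char& ch : translated) ch = decode[ch - 'a'];

    const std::size_t n = s.size();
    // Try the shortest possible ciphertext first, i.e. the longest match.
    for (std::size_t cipher_len = (n + 1) / 2; cipher_len < n; ++cipher_len) {
        const std::size_t len = n - cipher_len;  // length of received plaintext
        if (s.compare(n - len, len, translated, 0, len) == 0) {
            return s.substr(0, cipher_len) + translated.substr(0, cipher_len);
        }
    }
    // No plaintext was received: the whole string is ciphertext.
    return s + translated;
}
```

With the identity table "abcdefghijklmnopqrstuvwxyz" and the input "abcdab", this returns "abcdabcd". The hashed solution replaces the direct substring comparison with an O(1) comparison of hash values.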

This is an easy hash problem (accepted code attached):

#include <cstdio>
#include <cstring>
#include <algorithm>
using namespace std;

typedef unsigned long long ull;
const int N = 100000 + 5;
const ull base = 163;
ull hash1[N], hash2[N], p[N];  // hash1: original string, hash2: translated string, p[i]: base^i
char s[N], t[30];              // s: received string (1-based), t: conversion table
int T;                         // number of test cases
int c[30];                     // c[x]: index of the letter that ciphertext letter x translates to

void init() { // precompute the powers of base
    p[0] = 1;
    for (int i = 1; i <= 100000; i++) p[i] = p[i-1] * base;
}

ull get(int l, int r, ull g[]) { // hash of positions l..r from the prefix hash array g
    return g[r] - g[l-1] * p[r-l+1];
}

void work() {
    for (int i = 0; i < 26; i++) c[t[i] - 'a'] = i;
    int n = strlen(s + 1);
    hash1[0] = hash2[0] = 0;
    for (int i = 1; i <= n; i++) {
        hash1[i] = hash1[i-1] * base + (s[i] - 'a');
        hash2[i] = hash2[i-1] * base + (c[s[i] - 'a']);
    }
    int ans = n;  // length of the ciphertext; n if no plaintext was received
    for (int i = n; i < n * 2; i++) {
        if (i & 1) continue;
        int tmp = i / 2;      // candidate ciphertext length, at least half of n
        int len = n - tmp;    // length of the received plaintext for this candidate
        ull s1 = get(1, len, hash2);          // prefix of the translated string
        ull s2 = get(n - len + 1, n, hash1);  // suffix of the original string
        if (s1 == s2) {
            ans = tmp;
            break;
        }
    }
    for (int i = 1; i <= ans; i++) printf("%c", s[i]);
    for (int i = 1; i <= ans; i++) printf("%c", c[s[i] - 'a'] + 'a');
    puts("");
}

int main() {
    scanf("%d", &T);
    init();
    while (T--) {
        scanf("%s%s", t, s + 1);
        work();
    }
    return 0;
}

  
