BF (brute-force) algorithm for keyword matching: Python implementation

Source: Internet
Author: User

    #!/usr/bin/python
    # -*- coding: UTF-8 -*-
    # filename: BF
    import time

    # Target string t and pattern string p.
    t = "this is a big apple, this is a big apple, this is a big apple, this is a big apple."
    p = "apple"

    # Alternative, longer target text; assign a keyword to p before using it.
    # t = "Why is vector space model? In fact, we can regard each word as a dimension, and the word frequency as its value (directed), that is, vector, in this way, the word and frequency of each article constitute an I-dimensional spatial graph. The similarity between the two documents is the closeness of the two spatial graphs. Assuming that the article only has two dimensions, a spatial graph can be drawn in a plane Cartesian coordinate system, and the reader can imagine two articles with only two words for understanding."
    # p = ""

    i = 0      # current alignment: p[0] is placed under t[i]
    count = 0  # number of occurrences of p found in t
    start = time.time()
    while i <= len(t) - len(p):
        j = 0
        # Compare p with t character by character, starting at position i.
        while j < len(p) and t[i + j] == p[j]:
            j = j + 1
        # Every character of p matched: one occurrence found.
        if j == len(p):
            count = count + 1
        # Shift p one position to the right and compare again from p[0].
        i = i + 1
    print(count)
    print(time.time() - start)

Algorithm idea: the target string t and the pattern string p are compared character by character. If the characters at the current position match, the next pair is compared. If they differ, p shifts one position to the right and the comparison starts again from the first character of p.
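The idea above can be sketched as a small function that reports where each match starts. This is only an illustration under assumptions: the name bf_find_all and the sample strings are made up for the example, and treating an empty pattern as "no match" is a choice the article does not specify.

```python
def bf_find_all(t, p):
    """Return the start index of every occurrence of p in t (overlaps included)."""
    positions = []
    if not p:
        # Assumption: an empty pattern is treated as matching nowhere.
        return positions
    # Try every alignment of p against t, moving one position at a time.
    for i in range(len(t) - len(p) + 1):
        j = 0
        # Compare from the first character of p until a mismatch or the end of p.
        while j < len(p) and t[i + j] == p[j]:
            j += 1
        if j == len(p):
            positions.append(i)
    return positions


print(bf_find_all("this is a big apple", "apple"))
print(bf_find_all("aaaa", "aa"))
```

Returning positions rather than a count makes it easy to check the result against Python's built-in str.find on small inputs.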
Algorithm features: p slides from left to right along the fixed string t. At each position, the comparison starts at the leftmost character of p and proceeds to the right against the corresponding characters of t. The sliding distance of p is always 1, which makes the BF algorithm less efficient than algorithms such as BM and KMP, which can skip several positions at once.
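The worst-case time complexity of this algorithm is O(len(t) * len(p)). The space complexity is O(len(t) + len(p)), which is the storage for the two strings; the matching itself uses only a constant amount of extra space.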
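To see where the time bound comes from, the sketch below counts character comparisons on an input chosen to be a bad case: a text of repeated "a" and a pattern that differs only in its last character. The function name bf_count_comparisons and the sizes n and m are assumptions made for this illustration.

```python
def bf_count_comparisons(t, p):
    """Run BF matching and return (number of matches, character comparisons)."""
    matches = 0
    comparisons = 0
    for i in range(len(t) - len(p) + 1):
        j = 0
        while j < len(p):
            comparisons += 1  # one comparison of t[i + j] with p[j]
            if t[i + j] != p[j]:
                break
            j += 1
        if j == len(p):
            matches += 1
    return matches, comparisons


n, m = 1000, 10
t = "a" * n
p = "a" * (m - 1) + "b"  # mismatch occurs only at the last character of p
print(bf_count_comparisons(t, p))
```

Each of the n - m + 1 alignments costs m comparisons here, so the total is (n - m + 1) * m, which grows like len(t) * len(p).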
 
