Given a keyword library of about 5,000 entries, what is the fastest way to find which of those keywords appear in an article (about 800 words)?
Reply content:
Build a hash table from the keywords, then check whether each word in the article is in it.
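A minimal sketch of the hash-table idea, assuming the article can be split into words on whitespace and punctuation (true for English, not for unsegmented Chinese text). The function name `match_keywords` and the sample data are made up for illustration.

```python
import re

def match_keywords(article, keywords):
    """Return the keywords that occur as whole words in the article."""
    # Assumption: words are delimited by non-word characters.
    words = re.findall(r"\w+", article.lower())
    # Set membership is a hash lookup, so each word costs O(1) on average.
    return set(words) & keywords

keywords = {"shoes", "boots", "sandals"}  # hypothetical keyword library
article = "Our store sells shoes and boots, but not hats."
found = match_keywords(article, keywords)
```

This only finds single-word keywords; a multi-word keyword such as "women's shoes" would need one of the substring-based approaches from the other replies.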
J-j-jieba: use jieba (literally "stutter"), the Chinese word segmentation library.
Build a tree of the keywords (a trie, i.e. a prefix tree).
For a concrete implementation, see this article.
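Since the linked article is not available here, the following is only a rough sketch of what a keyword tree could look like, assuming it is a character-level trie stored in nested dicts. The names `build_trie`, `find_keywords` and `END` are illustrative, not taken from that article.

```python
END = None  # sentinel key marking "a keyword ends at this node"

def build_trie(keywords):
    """Build a character trie as nested dicts."""
    root = {}
    for word in keywords:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = word
    return root

def find_keywords(article, trie):
    """Return every keyword that occurs as a substring of the article."""
    found = set()
    for start in range(len(article)):
        node = trie
        # Walk the trie from this start position until no branch matches.
        for i in range(start, len(article)):
            node = node.get(article[i])
            if node is None:
                break
            if END in node:
                found.add(node[END])
    return found

trie = build_trie(["shoes", "women's shoes", "hat"])
found = find_keywords("cheap women's shoes on sale", trie)
```

The match is by substring, not by word, so no segmentation step is needed. The walk from each start position is bounded by the longest keyword, which keeps the cost roughly proportional to article length times maximum keyword length.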
Judging by the size of your keyword library and the length of the article, the data is very small. If the business requirement really is only this big, I don't think there is any need to do word segmentation, bring in external libraries, and so on.
Simply looping over the words in the keyword library and matching each one against the article is good enough. Of course, you can also check whether the library contains words where one includes another, such as "shoes" and "women's shoes", build a tree from them, and then match that against the article. This assumes your library actually contains such words and that the business requirements allow it.
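The plain loop could look something like this sketch, which assumes substring matching is acceptable. `scan_keywords` and the sample data are illustrative names only.

```python
def scan_keywords(article, keywords):
    """Return the keywords that appear anywhere in the article, in library order."""
    # One substring search per keyword: about 5,000 searches over a short text.
    return [kw for kw in keywords if kw in article]

library = ["shoes", "women's shoes", "boots"]  # hypothetical keyword library
hits = scan_keywords("cheap women's shoes on sale", library)
```

Here both "shoes" and "women's shoes" are reported, which is the overlap case the reply mentions. Whether the shorter keyword should be kept or dropped is a business decision.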
Precompile the keyword library into an automaton, then run the article through it. In practice, this can be implemented with a regular expression.
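One way to approximate this in Python is to compile all keywords into a single alternation pattern once and reuse it for every article. Note that Python's `re` module is a backtracking engine rather than a true automaton, so this is a sketch of the idea, not a guaranteed linear-time matcher. `compile_keywords` is a made-up helper name, and it assumes a non-empty keyword list.

```python
import re

def compile_keywords(keywords):
    """Compile the keyword library into one regular expression."""
    # Longest first, so "women's shoes" is tried before "shoes".
    ordered = sorted(keywords, key=len, reverse=True)
    # re.escape keeps any special characters in keywords literal.
    return re.compile("|".join(re.escape(k) for k in ordered))

pattern = compile_keywords(["shoes", "women's shoes", "boots"])
found = set(pattern.findall("cheap women's shoes and boots on sale"))
```

Because `findall` returns non-overlapping matches, a keyword contained inside a longer match ("shoes" inside "women's shoes") is not reported separately here, unlike with the loop or trie approaches.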