How does a PHP-based CMS extract keywords automatically? The process can be divided into the following three steps:
1, The PHP-based CMS runs a word segmentation algorithm over the title and content, splitting them into words and extracting candidate keywords together with their frequencies
For word segmentation, the two main algorithms are ICTCLAS and the hidden Markov model (HMM) approach from the Chinese Academy of Sciences (CAS). Both are fairly advanced, have a real learning curve, and only support C++/Java. For PHP there are currently two segmenters worth recommending: PSCWS and HTTPCWS.
SCWS released its official 1.0.0 version on 2008-03-08, and the latest version is now 1.0.4. PSCWS is its PHP version.
HTTPCWS was developed by Zhang Yan and was previously called PHPCWS. PHPCWS first calls the "ICTCLAS 3.0 shared-edition Chinese word segmentation algorithm" API for an initial segmentation pass, then applies its own "reverse maximum matching algorithm" for segmentation and word merging, and adds punctuation filtering to produce the final segmentation result. Currently, only Linux/Unix systems are supported.
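To make step 1 more concrete, here is a minimal sketch of reverse maximum matching followed by a frequency count. It is not the PHPCWS or PSCWS implementation; the function names `rmmSegment` and `countFrequencies`, the tiny in-memory dictionary, and the 4-character window are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: reverse maximum matching against a small word list.
// Scans from the end of the text and, at each position, prefers the longest
// dictionary word that ends there.
function rmmSegment(string $text, array $dict, int $maxLen = 4): array
{
    $lookup = array_flip($dict); // word => index, for O(1) membership checks
    // Split into UTF-8 characters without depending on the mbstring extension.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    $words = [];
    $end = count($chars);

    while ($end > 0) {
        $matched = false;
        // Try the longest candidate first, shrinking down to two characters.
        for ($len = min($maxLen, $end); $len > 1; $len--) {
            $candidate = implode('', array_slice($chars, $end - $len, $len));
            if (isset($lookup[$candidate])) {
                $words[] = $candidate;
                $end -= $len;
                $matched = true;
                break;
            }
        }
        if (!$matched) {
            // No dictionary word ends here: emit the single character.
            $words[] = $chars[$end - 1];
            $end--;
        }
    }

    // Words were collected back to front, so restore reading order.
    return array_reverse($words);
}

// Hypothetical helper: count how often each word occurs, most frequent first.
function countFrequencies(array $words): array
{
    $freq = array_count_values($words);
    arsort($freq);
    return $freq;
}

$words = rmmSegment('研究生命起源', ['研究', '研究生', '生命', '起源']);
echo implode(' / ', $words), "\n"; // 研究 / 生命 / 起源
```

Scanning from the end is what keeps "生命" intact in this example; a forward scan with the same dictionary would greedily take "研究生" first. A real segmenter would also need a full dictionary and punctuation handling, which this sketch leaves out.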
2, The PHP-based CMS compares the extracted results with an existing thesaurus to find the best-matching keywords
What matters most here is the thesaurus: we can define our own, or use an existing, mature one.
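The thesaurus comparison in step 2 could look roughly like the sketch below. It assumes the candidates arrive as a word => frequency array and the thesaurus is a flat list of accepted terms; `matchThesaurus` and its `$limit` parameter are hypothetical names, not part of any particular CMS.

```php
<?php
// Hypothetical helper: keep only candidates that appear in the thesaurus,
// then return the most frequent ones.
function matchThesaurus(array $freq, array $thesaurus, int $limit = 5): array
{
    // array_flip turns the term list into keys so we can intersect by key.
    $hits = array_intersect_key($freq, array_flip($thesaurus));
    arsort($hits); // highest frequency first
    return array_slice(array_keys($hits), 0, $limit);
}

$freq = ['we' => 9, 'php' => 5, 'cms' => 3, 'today' => 2];
$thesaurus = ['php', 'cms', 'keyword'];
echo implode(', ', matchThesaurus($freq, $thesaurus)), "\n"; // php, cms
```

Note that "we" is dropped even though it is the most frequent word, simply because it is not in the thesaurus. A self-defined thesaurus gives this kind of control at the cost of having to maintain it.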
3, The PHP-based CMS then compares these two sets of keywords to pick the ones that best fit the current content
This stage has to be handled case by case. Today's PHP-based CMSs each have their own keyword extraction system. The most widely circulated one online is the word segmentation source from DedeCMS. I also tested it in my own popcms, and the results are good overall. However, meaningless words such as "we" get extracted and are ranked as keywords because their frequency is so high, and sometimes HTML space entities are picked up as keywords, so it still needs improvement. As an auxiliary feature, though, it is good enough.
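The two problems mentioned above (high-frequency filler words such as "we", and HTML spaces leaking into the keywords) can usually be reduced by cleaning the input before segmentation and filtering the output afterwards. The following is only a sketch under that assumption, not the DedeCMS code; `cleanHtml`, `dropStopWords`, and the stop word list are hypothetical.

```php
<?php
// Hypothetical helper: strip markup and normalise whitespace before segmentation.
function cleanHtml(string $html): string
{
    $text = strip_tags($html);
    // Decode entities so that &nbsp; becomes a real U+00A0 character.
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // Collapse runs of whitespace and non-breaking spaces into one space.
    $text = preg_replace('/[\s\x{00A0}]+/u', ' ', $text);
    return trim($text);
}

// Hypothetical helper: remove words that carry no meaning as keywords.
function dropStopWords(array $words, array $stopWords): array
{
    return array_values(array_diff($words, $stopWords));
}

echo cleanHtml('<p>PHP&nbsp;CMS&nbsp;&nbsp;keywords</p>'), "\n"; // PHP CMS keywords
echo implode(', ', dropStopWords(['we', 'php', 'we', 'cms'], ['we', 'the'])), "\n"; // php, cms
```

Running the cleanup first means the segmenter never sees `&nbsp;`, and filtering stop words before counting keeps them from dominating the frequency ranking.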
In addition, the automatic keyword extraction features of PHP-based CMSs and Discuz are also very powerful.