How PHP-based CMSs automatically extract keywords

Source: Internet
Author: User
The way a PHP-based CMS automatically extracts keywords can be broken down into the following three steps:

1. The CMS uses a word segmentation algorithm to split the title and content into words, extracting candidate keywords and their frequencies

For word segmentation, the two main options are ICTCLAS and the hidden Markov models from the Chinese Academy of Sciences (CAS). Both are sophisticated, have a real barrier to entry, and only support C++/Java. Two PHP-oriented tools worth recommending at present are PSCWS and HTTPCWS.

SCWS released its official 1.0.0 version on 2008-03-08, and the latest version is now 1.0.4. PSCWS is its PHP version.

HTTPCWS was developed by Zhang Yan and was previously called PHPCWS. PHPCWS first calls the API of the "ICTCLAS 3.0 shared-edition Chinese word segmentation algorithm" for an initial segmentation pass, then applies a self-written "reverse maximum matching algorithm" to segment and merge words, and adds punctuation filtering to produce the final segmentation result. Currently, only Linux/Unix systems are supported.
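To make the "reverse maximum matching" idea concrete, here is a minimal sketch in PHP. It is not the PHPCWS implementation: the function names, the toy dictionary, and the four-character maximum word length are all assumptions made for illustration. It scans from the end of the text and, at each position, takes the longest dictionary word that ends there.

```php
<?php
// Hypothetical helper: split a UTF-8 string into an array of characters
// without requiring the mbstring extension.
function utf8Chars(string $text): array
{
    return preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
}

// Sketch of reverse maximum matching (an illustration, not PHPCWS code).
// $maxWordLen is an assumed upper bound on dictionary word length.
function reverseMaxMatch(string $text, array $dictionary, int $maxWordLen = 4): array
{
    $dict  = array_flip($dictionary); // word => index, for fast lookups
    $chars = utf8Chars($text);
    $words = [];
    $end   = count($chars);

    while ($end > 0) {
        $matched = false;
        // Try the longest candidate first, shrinking down to two characters.
        for ($len = min($maxWordLen, $end); $len > 1; $len--) {
            $candidate = implode('', array_slice($chars, $end - $len, $len));
            if (isset($dict[$candidate])) {
                $words[] = $candidate;
                $end -= $len;
                $matched = true;
                break;
            }
        }
        if (!$matched) {
            // No dictionary word ends here: emit the single character.
            $words[] = $chars[$end - 1];
            $end--;
        }
    }

    // Words were collected back to front, so restore reading order.
    return array_reverse($words);
}

// Toy dictionary, made up for this example.
$dictionary = ['研究', '研究生', '生命', '起源'];
$tokens = reverseMaxMatch('研究生命起源', $dictionary);
// $tokens is ['研究', '生命', '起源']
```

Scanning from the end is what keeps "生命" intact here; a forward longest-match scan over the same dictionary would take "研究生" first and leave "命" stranded.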

2. The CMS compares the extracted results with an existing thesaurus to find the best-matching keywords

The key factor here is the thesaurus. We can define our own, or use an existing, mature one.
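A minimal sketch of this comparison step, assuming the segmenter has already produced a flat list of tokens. The function name matchThesaurus, the sample tokens, and the sample thesaurus are hypothetical. It counts token frequencies, keeps only tokens that appear in the thesaurus, and returns the most frequent ones.

```php
<?php
// Sketch: keep only tokens that appear in the thesaurus and rank them
// by frequency. The function name and the $limit default are assumptions.
function matchThesaurus(array $tokens, array $thesaurus, int $limit = 5): array
{
    $allowed = array_flip($thesaurus); // word => index, for fast lookups
    $counts  = [];

    foreach ($tokens as $token) {
        if (isset($allowed[$token])) {
            $counts[$token] = ($counts[$token] ?? 0) + 1;
        }
    }

    arsort($counts); // highest frequency first, keys preserved
    return array_slice($counts, 0, $limit, true);
}

// Made-up sample data for illustration.
$sampleTokens    = ['php', 'cms', 'php', 'the', 'keyword', 'php', 'cms'];
$sampleThesaurus = ['php', 'cms', 'keyword', 'segmentation'];
$topKeywords     = matchThesaurus($sampleTokens, $sampleThesaurus, 2);
// $topKeywords is ['php' => 3, 'cms' => 2]
```

A custom thesaurus can simply be a PHP array loaded from a file or database table; the lookup logic stays the same.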

3. The CMS then compares these two sets of keywords to find the ones that best match the current content

This stage calls for case-by-case analysis. Current PHP-based CMSs each have their own keyword extraction system. One of the most widely circulated online is the word segmentation source code from DedeCMS. I also tested it on my own popcms, and the results were good overall. However, meaningless words such as "we" get extracted and ranked as keywords because their frequency is so high, and sometimes HTML space entities are extracted as keywords, so it needs improvement. As an auxiliary feature, though, it is good enough.
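The two problems described above (high-frequency filler words and HTML spaces becoming keywords) can be reduced with a cleanup pass. The sketch below is an illustration only, not DedeCMS code: the function names cleanText and dropStopWords and the stop word list are assumptions, and the explode() call merely stands in for a real segmenter.

```php
<?php
// Sketch: strip markup and normalize whitespace before segmentation.
function cleanText(string $html): string
{
    $text = strip_tags($html);
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // &nbsp; decodes to U+00A0; fold it and any whitespace run into one space.
    $text = preg_replace('/[\s\x{00A0}]+/u', ' ', $text);
    return trim($text);
}

// Sketch: remove blank tokens and tokens listed as stop words.
function dropStopWords(array $tokens, array $stopWords): array
{
    $stop = array_flip($stopWords);
    $kept = array_filter($tokens, function ($token) use ($stop) {
        return trim($token) !== '' && !isset($stop[$token]);
    });
    return array_values($kept); // reindex from 0
}

// Whitespace splitting stands in for a real word segmenter here.
$cleaned     = cleanText('<p>We&nbsp;love&nbsp;&nbsp;PHP</p>');
$cleanTokens = dropStopWords(explode(' ', strtolower($cleaned)), ['we']);
// $cleaned is 'We love PHP'; $cleanTokens is ['love', 'php']
```

Running the cleanup before segmentation keeps entities out of the token stream, while the stop word filter runs after segmentation so that frequency counts are not dominated by filler words.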

In addition, the automatic keyword extraction features of other PHP-based CMSs and of Discuz are also very powerful.

