How to automate classification of keywords

Source: Internet
Author: User

Classifying keywords has always been a fairly painful task, especially when the keyword set is very large, because manual classification takes a long time. Machine semantic analysis is in fact workable. Human review remains the most accurate approach, but a machine preprocessing pass beforehand can reduce the manual workload. The Small Head (Xiaonaodai) bid price adjustment software team walks through the approach below:

To classify keywords automatically, you first need a base thesaurus table. The thesaurus table has a group field that assigns each root word to a group. You also need a keyword grouping table, which records the group of each keyword and is built on top of the thesaurus.
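As a rough illustration of the two tables, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (thesaurus, keyword_group, root, group_id) and the sample rows are assumptions made for this example; the article does not specify a schema.

```python
import sqlite3

# In-memory database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE thesaurus (          -- base thesaurus: one row per (root, group)
    root     TEXT    NOT NULL,
    group_id INTEGER NOT NULL,
    PRIMARY KEY (root, group_id)
);
CREATE TABLE keyword_group (      -- grouping table: one row per keyword
    keyword  TEXT    PRIMARY KEY,
    group_id INTEGER NOT NULL
);
""")

# Hypothetical sample data.
conn.executemany(
    "INSERT INTO thesaurus VALUES (?, ?)",
    [("tom tom", 2), ("gps", 2), ("navi", 2)],
)
conn.execute("INSERT INTO keyword_group VALUES (?, ?)", ("tom tom gps navi", 2))

rows = conn.execute(
    "SELECT root FROM thesaurus WHERE group_id = 2 ORDER BY root"
).fetchall()
print(rows)  # [('gps',), ('navi',), ('tom tom',)]
```

The composite primary key on the thesaurus table lets one root belong to more than one group, which the scoring step relies on.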

With these in place, grouping can be automated. Take a long-tail keyword such as "Tom Tom, GPs Navi": first segment it against the thesaurus, which yields three words: Tom Tom | gps | navi.
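The segmentation step can be sketched as follows. This is a minimal example that assumes a greedy longest-match strategy over whitespace-separated tokens; the article does not say which matching strategy it uses for this step, and the function name segment and the max_words parameter are hypothetical.

```python
import re

def segment(keyword, thesaurus, max_words=3):
    """Split a keyword into thesaurus roots using greedy longest-match."""
    # Lowercase and drop punctuation so "Tom Tom," matches "tom tom".
    tokens = re.findall(r"\w+", keyword.lower())
    roots = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first, then shrink it.
        for size in range(min(max_words, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + size])
            if candidate in thesaurus:
                roots.append(candidate)
                i += size
                break
        else:
            # Token is not in the thesaurus: skip it.
            i += 1
    return roots

thesaurus = {"tom tom", "gps", "navi"}
print(segment("Tom Tom, GPs Navi", thesaurus))  # ['tom tom', 'gps', 'navi']
```

Trying the longest candidate first keeps multi-word roots such as "tom tom" intact instead of splitting them into two single tokens.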

Next, look these words up in the thesaurus table. Suppose one of them appears in Group 1, while all three appear in Group 2. Treat the number of thesaurus entries in a group as DF and the number of matches in that group as TF, and score each group in TF/IDF fashion as TF × 1/DF. With 3 entries in Group 1 and 7 in Group 2, Group 1 scores 1 × 1/3 and Group 2 scores 3 × 1/7. Since 1/3 < 3/7, the keyword is assigned to Group 2.
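The arithmetic above can be checked with a short sketch. It follows the article's simplified score (matches in a group divided by the size of that group), which is not the textbook TF-IDF formula. The function name score_groups and every thesaurus entry other than "tom tom", "gps" and "navi" are made up to reach the group sizes of 3 and 7 used in the example.

```python
from collections import Counter
from fractions import Fraction

def score_groups(roots, thesaurus_rows):
    """Score each group as TF * 1/DF, as described in the article.

    thesaurus_rows: list of unique (root, group) pairs.
    TF = number of the keyword's roots found in the group.
    DF = number of thesaurus entries in the group.
    """
    wanted = set(roots)
    group_size = Counter(group for _, group in thesaurus_rows)
    hits = Counter(group for root, group in thesaurus_rows if root in wanted)
    return {g: Fraction(tf, group_size[g]) for g, tf in hits.items()}

# Hypothetical thesaurus: Group 1 has 3 entries, Group 2 has 7.
rows = [
    ("tom tom", 1), ("watch", 1), ("strap", 1),
    ("tom tom", 2), ("gps", 2), ("navi", 2), ("map", 2),
    ("route", 2), ("satellite", 2), ("car", 2),
]

scores = score_groups(["tom tom", "gps", "navi"], rows)
best = max(scores, key=scores.get)
print(best)  # 2
```

Using Fraction keeps the comparison exact: Group 1 scores 1/3 and Group 2 scores 3/7.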

If the groups are made fine-grained enough, a third-layer aggregation table can be built on top of the grouping table to aggregate the groups again. This brings together content that does not belong to the same vertical, which helps when gathering related content and distributing internal links.

So how do you build the thesaurus and the grouping table?

First, collect keywords from Baidu or 360. The advantage of bidding keywords is that, when you pull them down, they are already strongly related to each other. What remains is word segmentation, which is a two-step job: finding candidate words and counting their frequency.

The idea for finding words is as follows. Put all the keywords together and apply forward minimum segmentation, one step at a time. Set the word-length threshold according to the characteristics of your industry. Start matching from the shortest candidates, count their frequency, then gradually increase the length. For example, with 100 keywords in total, if the frequency of a candidate segment exceeds 70%, treat it as a word; if it is below 30%, treat it as not a word. After the loop finishes, deduplicate the high-frequency words. The result is the base thesaurus you need.
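A minimal sketch of this frequency-based word discovery follows. It makes several assumptions that the article leaves open: candidates are whitespace-token n-grams (for Chinese text you would slide over characters instead), "frequency" means the share of keywords containing the candidate, and candidates between the two thresholds are set aside for manual review. The function name discover_roots and its parameters are hypothetical.

```python
from collections import Counter

def discover_roots(keywords, min_len=1, max_len=3, accept=0.7, reject=0.3):
    """Return (accepted, undecided) candidate roots by frequency share."""
    accepted, undecided = set(), set()  # sets deduplicate automatically
    total = len(keywords)
    tokenized = [kw.lower().split() for kw in keywords]
    # Start from the shortest candidates and gradually increase the length.
    for n in range(min_len, max_len + 1):
        containing = Counter()
        for tokens in tokenized:
            # Count each candidate at most once per keyword.
            grams = {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
            containing.update(grams)
        for gram, count in containing.items():
            share = count / total
            if share > accept:
                accepted.add(gram)
            elif share >= reject:
                undecided.add(gram)  # between thresholds: review by hand
    return accepted, undecided

keywords = ["tom tom gps", "tom tom navi", "tom tom gps navi", "tom tom map"]
accepted, undecided = discover_roots(keywords)
print(sorted(accepted))   # ['tom', 'tom tom']
print(sorted(undecided))  # ['gps', 'navi', 'tom gps', 'tom tom gps']
```

With only four sample keywords the thresholds behave coarsely; on a real keyword set the cutoffs and the maximum length would need tuning for the industry, as the text notes.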

Next, group the base thesaurus. To do so, count which thesaurus roots are hit together by each of the long-tail keywords collected earlier, and pull out the roots that are hit together by a large number of long-tail keywords. These roots form a basic group. The idea is the same as the automatic grouping described above, except that it is applied to the base thesaurus, and a certain amount of manual intervention is needed to keep the data accurate.
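One way to sketch this co-occurrence grouping is to count how often each pair of roots is hit by the same keyword, then merge roots whose pair count reaches a threshold. The union-find merge, the min_cooccurrence threshold, and the sample data are assumptions for illustration. Unlike the thesaurus described earlier, this sketch puts each root in exactly one group, so the output is only a starting point for manual review.

```python
from collections import Counter
from itertools import combinations

def group_roots(segmented_keywords, min_cooccurrence=2):
    """Group roots that are hit together by enough keywords."""
    pair_counts = Counter()
    roots = set()
    for kw_roots in segmented_keywords:
        unique = sorted(set(kw_roots))
        roots.update(unique)
        pair_counts.update(combinations(unique, 2))

    # Union-find: each root starts in its own group.
    parent = {r: r for r in roots}

    def find(r):
        while parent[r] != r:
            parent[r] = parent[parent[r]]  # path compression
            r = parent[r]
        return r

    # Merge the groups of roots that co-occur often enough.
    for (a, b), count in pair_counts.items():
        if count >= min_cooccurrence:
            parent[find(a)] = find(b)

    groups = {}
    for r in roots:
        groups.setdefault(find(r), set()).add(r)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical keywords, already segmented into thesaurus roots.
segmented = [
    ["tom tom", "gps"],
    ["tom tom", "navi"],
    ["tom tom", "gps", "navi"],
    ["iphone", "case"],
    ["iphone", "case", "leather"],
]
print(group_roots(segmented))
# [['case', 'iphone'], ['gps', 'navi', 'tom tom'], ['leather']]
```

Roots that never reach the threshold, such as "leather" here, stay in single-member groups and are natural candidates for manual assignment.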

The above was prepared by the Small Head (Xiaonaodai) bid price adjustment software team. Trial registration: http://vip.xiaonaodai.com/index.php?act=register&fromid=7.
Inquiries: QQ 928122192; hotline: 025-68781265
