Interpreting the TF-IDF Algorithm and Its Application in SEO

Source: Internet
Author: User


TF-IDF is a weighting technique commonly used in information retrieval and text mining. SEO practitioners rely on it often, although many may not know it by name. The most intuitive way to understand it is as a site's "keyword density."

Getting straight to the point, here is how TF-IDF is calculated.

Formula:

TF: term frequency

IDF: inverse document frequency

TF-IDF = TF * IDF

Term frequency (TF) is the number of times a term appears on a page divided by the page's total word count. If an article contains 200 words and the term "site optimization" appears 4 times, then TF = 4/200 = 0.02.

Inverse document frequency (IDF) measures how rare a term is across all indexed pages. If N pages contain the term and the index holds M pages in total, then IDF = lg(M/N), where lg is the base-10 logarithm. Assuming "site optimization" appears on 2,000 pages out of a total of 100 million, IDF = lg(100,000,000/2,000) = 4.69897, and the final score is TF-IDF = 0.02 * 4.69897 = 0.0939794.
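The arithmetic above can be checked with a few lines of Python. This is a minimal sketch, not any search engine's actual implementation; the function names `tf`, `idf`, and `tf_idf` are chosen here for illustration, and the base-10 logarithm is assumed because it reproduces the figures in the text.

```python
import math

def tf(term_count: int, total_words: int) -> float:
    """Term frequency: occurrences of the term divided by the page's word count."""
    return term_count / total_words

def idf(total_pages: int, pages_with_term: int) -> float:
    """Inverse document frequency, using the base-10 logarithm as in the text."""
    return math.log10(total_pages / pages_with_term)

def tf_idf(term_count: int, total_words: int,
           total_pages: int, pages_with_term: int) -> float:
    """TF-IDF score of one term on one page."""
    return tf(term_count, total_words) * idf(total_pages, pages_with_term)

# "site optimization": 4 occurrences in 200 words, found on 2,000 of 100 million pages
score = tf_idf(4, 200, 100_000_000, 2_000)
print(round(score, 7))  # 0.0939794
```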

This only settles how relevant a single page is. For SEO, the TF value alone is not enough: the terms that earn a page the most weight are the highly distinctive ones. For example, if a search engine indexes 1 trillion pages, practically every one of them will contain function words such as "of", "is", and "in". These high-frequency words are called noise words or stop words. Search engines discard them, so their weight should effectively be 0. The calculation: IDF = log(1 trillion / 1 trillion) = log 1 = 0, so TF-IDF = TF * 0 = 0.
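The stop-word case can be sketched in the same way. The `STOP_WORDS` set and the `content_terms` helper below are hypothetical and exist only for illustration; real search engines maintain their own lists and tokenizers.

```python
import math

# Hypothetical stop-word list for illustration only.
STOP_WORDS = {"the", "is", "in", "of", "and"}

def idf(total_pages: int, pages_with_term: int) -> float:
    """Inverse document frequency with a base-10 logarithm."""
    return math.log10(total_pages / pages_with_term)

def content_terms(query: str) -> list[str]:
    """Lowercase a whitespace-separated query and drop its stop words."""
    return [word for word in query.lower().split() if word not in STOP_WORDS]

total_pages = 10**12  # 1 trillion indexed pages

# A word that appears on every page carries no weight: log10(1) = 0.
stop_word_idf = idf(total_pages, total_pages)

# Only the distinctive terms of a query are left to be scored.
terms = content_terms("the skills of SEO")
```

Here `stop_word_idf` evaluates to 0.0, so a stop word's TF-IDF is 0 regardless of how often it occurs, which is why dropping it before scoring changes nothing.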

In practice, a search engine segments the query into terms and calculates the weight of each term separately. Take the query "SEO site optimization skills" as an example.

Suppose "SEO" appears on 20 million indexed pages, "site optimization" on 10 million, and "skills" on 500 million.

The search engine's index is assumed to hold 10 billion pages in total.

On a page of www.ruihess.com with 400 words in total, "SEO" appears 8 times, "site optimization" 10 times, and "skills" 16 times.

So their respective term frequencies are:

TF(SEO) = 8/400 = 0.02

TF(site optimization) = 10/400 = 0.025

TF(skills) = 16/400 = 0.04

Any stop words in the query have a weight of 0, as mentioned above.

For the query "SEO site optimization skills", the combined term frequency of this page is therefore: TF(total) = 0.02 + 0.025 + 0.04 = 0.085.

The inverse document frequencies are:

IDF(SEO) = log(10,000,000,000/20,000,000) = 2.69897

IDF(site optimization) = log(10,000,000,000/10,000,000) = 3

IDF(skills) = log(10,000,000,000/500,000,000) = 1.30103

Each term's contribution to the page's weight and relevance for the query "SEO site optimization skills" is then:

TF-IDF(SEO) = 0.02 * 2.69897 = 0.0539794

TF-IDF(site optimization) = 0.025 * 3 = 0.075

TF-IDF(skills) = 0.04 * 1.30103 = 0.0520412

This shows that although "skills" appears more often on the page, it is less distinctive than "SEO" and "site optimization", so it contributes less to the page's weight than its frequency alone would suggest.
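The per-term contributions can be recomputed with a short script. This is a sketch under the assumptions stated earlier: a 400-word page, an index of 10 billion pages, and 20 million, 10 million, and 500 million matching pages for the three terms. The names `terms`, `tf_idf`, and `ranked` are illustrative only.

```python
import math

TOTAL_INDEXED_PAGES = 10_000_000_000  # assumed size of the index
PAGE_WORD_COUNT = 400                 # assumed word count of the page

# term -> (occurrences on the page, indexed pages containing the term)
terms = {
    "SEO": (8, 20_000_000),
    "site optimization": (10, 10_000_000),
    "skills": (16, 500_000_000),
}

def tf_idf(count: int, pages_with_term: int) -> float:
    """TF-IDF of one term, with a base-10 logarithm for the IDF part."""
    tf = count / PAGE_WORD_COUNT
    idf = math.log10(TOTAL_INDEXED_PAGES / pages_with_term)
    return tf * idf

scores = {term: tf_idf(*stats) for term, stats in terms.items()}

# Sort terms from largest to smallest contribution.
ranked = sorted(scores, key=scores.get, reverse=True)
for term in ranked:
    print(f"{term}: {scores[term]:.7f}")
```

With these inputs, "skills" has the highest term frequency of the three but ends up last in `ranked`, because its low IDF outweighs its frequency.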

The more distinctive a term is, the stronger its predictive power and the larger its weight, and vice versa. Seeing "site optimization" probably gives you a basic idea of what the page is about, whereas seeing "skills" alone tells you little about its topic.

Of course, this is only one component of a search engine's algorithm. To raise a page's weight, it has to be combined with markup such as H tags, and the words surrounding a keyword also add to the score. "Surrounding" here means within the same tag, for example: SEO site optimization skills are mainly an introduction to some search engine optimization ideas

Design by www.ruihess.com
