How Do Search Engines Judge Whether an Article Is Reprinted?

Source: Internet
Author: User
Keywords: search engine, article reprint

Recently many readers have asked me why an article that was indexed gets dropped by the search engine the next day. My usual answer is: 1. your article is not original; 2. the site that collected your article carries more weight than your own site; 3. the content you collected is duplicated too heavily across the Internet. Content is always critical to both search engines and users. Search engines aim to show users the content they want, so high-quality content is what search engines need and also what users need. This article is mainly meant to help you with this problem, and I hope it is useful to everyone!

In building a website, content and external links are king, and compared with link building, adding content is sometimes the bigger headache for webmasters. Everyone knows original content is a good thing, but it is not easy to come by: writing it yourself takes a lot of time, and the result is not necessarily good. Blindly collecting and repeating other people's content, on the other hand, turns your site into a spam site that search engines hate. Pseudo-original content emerged between these two. So-called pseudo-original content is, simply put, a way of deceiving the search engine into treating reprinted content as original.

1. Original articles are undoubtedly of very high quality, but you have to let search engines know that the article originated on your site.

2. Original articles generally suit blogs well, but for a site that is not a blog, where would so many original articles come from? This is where pseudo-original content comes in, and it takes some skill. Here are some important suggestions for producing it:

(1) Change the title so that it is unique. For example, if you find an article titled "How I was successful", search Baidu to see whether that title already exists. If it does, change it to one that does not, such as "How Success Is Forged".

(2) Simply shuffle the paragraphs, but keep the result logical, or readers will not understand what the article is about.

(3) Choose a highly original article as your source. If the article is already duplicated widely across the Internet, I advise you to give it up and choose another.

(4) Combine several articles into one. For example, if you want to write on a topic that interests you, search for how others have written about it, read widely for reference, and then work your own views into the article in your own words. Even a "pseudo" article should show some personality of its own. Zhanghangfeng's blog, for example, often offers its own views and comments on the latest Internet news.

(5) Another lazy method: pick any article from a Google search, then search for it on Baidu. If Google has it and Baidu does not, the article can be used.

Pseudo-original methods generally work on the original author's article by changing the title, replacing words with synonyms, adding or removing some sentences, rewriting the first and last paragraphs, and reordering paragraphs, all to make the result differ from the original. Many people think that once these steps are done the article becomes unique, the search engine can no longer recognize it as someone else's, and the pseudo-original passes as original. But I cannot help asking: is this not wishful thinking on our part? Can search engines really not recognize it? How do they judge whether an article is reprinted or original?

In fact, a simple model can explain how a search engine determines whether content is reprinted. The search engine divides two similar pieces of content in its database, A and B, into n independent blocks each and compares them. When the number of identical blocks exceeds a threshold m set by the search engine, it treats A and B as reprints of each other. Dividing the content into n blocks corresponds to the search engine's word segmentation technology, and judging whether the repeated blocks exceed the threshold m corresponds to its indexing technology. Of course, the values of n and m are set by each search engine's own algorithm, differ from one search engine to another, and are not known to us, but we can still learn a lot of useful things from the model above.
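The model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: sentences stand in for the search engine's real segmentation, and the names split_into_blocks, shared_blocks, and is_reprint, as well as the sample texts and the value of m, are hypothetical. Real search engines do not publish their block sizes or thresholds.

```python
import re

def split_into_blocks(text):
    """Split text into a set of normalized sentence-level blocks.

    Assumption: sentences stand in for the engine's real segmentation.
    """
    pieces = re.split(r"[.!?]+", text)
    return {" ".join(p.lower().split()) for p in pieces if p.strip()}

def shared_blocks(a, b):
    """Count the blocks that appear in both texts, regardless of position."""
    return len(split_into_blocks(a) & split_into_blocks(b))

def is_reprint(a, b, m):
    """Treat A and B as reprints when the shared block count exceeds m."""
    return shared_blocks(a, b) > m

article_a = "Content is king. Links matter too. Write for users."
article_b = "Links matter too. Content is king. Something entirely new."

print(shared_blocks(article_a, article_b))    # number of identical blocks
print(is_reprint(article_a, article_b, m=1))  # verdict for a threshold of 1
```

Because the blocks are compared as sets, their position in the text plays no part in the result, which matches the model's claim that only the number of identical blocks matters.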

First, the values of n and m determine how well the search engine can identify reprinted content. The larger n is and the smaller m is, the stronger that ability; conversely, the weaker. These two values are constrained by the algorithm and by the resources it consumes, so search engines will not blindly pursue high detection ability.
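To see why finer blocks help detection, here is a rough sketch that uses overlapping runs of consecutive words as blocks. This is an assumption made for illustration, not a description of any real search engine; word_blocks, shared_blocks, and the sample sentences are hypothetical. A smaller block_size yields more blocks, which corresponds to a larger n in the model.

```python
def word_blocks(text, block_size):
    """Return the set of overlapping runs of `block_size` consecutive words."""
    words = text.lower().split()
    return {tuple(words[i:i + block_size])
            for i in range(len(words) - block_size + 1)}

def shared_blocks(a, b, block_size):
    """Count the word runs that occur in both texts."""
    return len(word_blocks(a, block_size) & word_blocks(b, block_size))

original = "high quality content is what search engines and users both need"
# Same sentence with a single synonym swapped in ("users" -> "readers").
edited = "high quality content is what search engines and readers both need"

for block_size in (2, 4, 8):
    total = len(word_blocks(original, block_size))
    shared = shared_blocks(original, edited, block_size)
    print(f"block_size={block_size}: {shared} of {total} blocks still match")
```

With small blocks, a single replaced word disturbs only a few of the many blocks, so most of them still match and a low threshold m is easily exceeded. With large blocks, the same edit breaks most of the few blocks there are. The price of finer blocks is that there are more of them to store and compare, which is the resource cost mentioned above.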

Secondly, the model shows how well the pseudo-original methods mentioned above work against search engines. Search engines judge duplication block by block, and the order of the content does not matter, so reordering paragraphs certainly does not work. Whether the other methods (adding, removing, replacing, and rewriting content) work depends to some extent on the values of n and m. Search engine algorithms are by now quite mature and quite effective at detecting duplicate content, so simply adding, deleting, or replacing part of the content will not make a search engine treat it as original.
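The effect of each pseudo-original method can be checked against the block model with a small experiment. Again this is a sketch under assumptions: sentences serve as blocks, the threshold m = 2 is arbitrary, and blocks, shared, and the sample texts are hypothetical names and data chosen for illustration.

```python
import re

def blocks(text):
    """Split text into a set of normalized sentence-level blocks (an assumption)."""
    pieces = re.split(r"[.!?]+", text)
    return {" ".join(p.lower().split()) for p in pieces if p.strip()}

def shared(a, b):
    """Count the blocks two texts have in common, ignoring order."""
    return len(blocks(a) & blocks(b))

source = ("Original content is valuable. It takes time to write. "
          "Scraped pages get filtered. Readers notice quality.")
variants = {
    # Same sentences, different order.
    "reordered": ("Readers notice quality. Scraped pages get filtered. "
                  "It takes time to write. Original content is valuable."),
    # One sentence altered and one sentence appended.
    "lightly edited": ("Original content is valuable. It takes a long time to write. "
                       "Scraped pages get filtered. Readers notice quality. "
                       "Thanks for reading."),
    # Every sentence rewritten.
    "rewritten": ("Writing your own material pays off. Drafting is slow work. "
                  "Copied sites are dropped from the index. "
                  "Visitors can tell the difference."),
}

m = 2  # hypothetical threshold
for name, text in variants.items():
    count = shared(source, text)
    print(f"{name}: {count} shared blocks, flagged as reprint: {count > m}")
```

In this toy setup the reordered and lightly edited versions still share most blocks with the source, and only the fully rewritten version falls below the threshold.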

Simply put, for the search engine to treat our content as original, it must differ clearly from the source, which means most of it must be changed. Site weight is accumulated over time, and with steady effort your site's weight will improve day by day. Finally, I still recommend writing more original content, since pseudo-original content is also indirect plagiarism.

This article is an original work by the webmaster of http://www.codetk.com. Please respect the author's labor and intellectual property rights and keep this notice when reprinting. Thank you!
