Google employee reveals techniques for preventing website cheating

Google researcher Wu
Ever since search engines appeared, there has been cheating (spam) aimed at their page rankings. As a result, the pages users find at the top of the search results are not necessarily of high quality. As the saying goes, all that glitters is not gold.
Search engine cheating takes many forms, but it has only one purpose: to use improper means to raise the ranking of one's own pages. The most common method in the early days was to repeat keywords. For example, a website selling digital cameras would repeatedly list all kinds of digital camera brands, such as Nikon, Canon and Kodak. To keep readers from seeing the clutter of keywords, clever cheaters often used tiny fonts and text in the same color as the background to hide them. In fact, this approach is easily detected and corrected by search engines.
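The keyword repetition described above can be illustrated with a very simple density check. This is only a minimal sketch, not how any real search engine works: the function names `keyword_density` and `looks_stuffed` and the 0.25 threshold are assumptions made for this example.

```python
import re
from collections import Counter


def keyword_density(text: str, top_n: int = 3) -> list[tuple[str, float]]:
    """Return the top_n most frequent words with their share of all words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        return []
    counts = Counter(words)
    total = len(words)
    return [(word, count / total) for word, count in counts.most_common(top_n)]


def looks_stuffed(text: str, threshold: float = 0.25) -> bool:
    """Flag text in which one word exceeds the (hypothetical) density threshold."""
    density = keyword_density(text, top_n=1)
    return bool(density) and density[0][1] > threshold


# Illustrative inputs, invented for this example.
normal = "We review digital cameras and compare image quality, battery life and price."
stuffed = "nikon canon kodak nikon canon kodak nikon canon kodak cheap nikon canon kodak"

normal_flag = looks_stuffed(normal)    # most frequent word is 2 of 12 words
stuffed_flag = looks_stuffed(stuffed)  # most frequent word is 4 of 13 words
```

A real detector would also have to handle common words, page length, and text hidden through fonts or colors, none of which this sketch attempts.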
After PageRank was introduced, cheaters discovered that the more links point to a web page, the higher it is likely to rank, and a business of selling and buying links soon emerged. For example, someone creates hundreds of sites that have no real content, only links to their clients' sites. This approach is much more sophisticated than repeating keywords, but it is still not hard to detect. Those who claim to help others raise their rankings must sell a large number of links to stay in business, so they easily give themselves away. (It is like counterfeit money: once a particular kind of counterfeit is circulating in large quantities, its source is easy to trace.) Later, all sorts of other cheating methods appeared, which we will not go into here.
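To see why bought links help, and why they also leave a trace, here is a minimal sketch using a textbook-style PageRank power iteration on a toy graph. The graph, the page names, and the helper `orphan_inlink_share` are hypothetical and invented for illustration; this is not Google's actual ranking or spam detection.

```python
def pagerank(links: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """Basic PageRank by power iteration; links maps page -> pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


def orphan_inlink_share(links: dict[str, list[str]], target: str) -> float:
    """Fraction of target's inbound links coming from pages nobody links to."""
    linked_to = {t for targets in links.values() for t in targets}
    sources = [p for p, targets in links.items() if target in targets]
    if not sources:
        return 0.0
    orphans = [p for p in sources if p not in linked_to]
    return len(orphans) / len(sources)


# A small honest web: a -> b -> c -> a, plus a client page nobody links to.
honest_web = {"a": ["b"], "b": ["c"], "c": ["a"], "client": ["a"]}

# The same web after the client buys links from 20 content-free farm sites.
farm_web = dict(honest_web)
for i in range(20):
    farm_web[f"farm{i}"] = ["client"]

rank_before = pagerank(honest_web)["client"]
rank_after = pagerank(farm_web)["client"]
farm_share = orphan_inlink_share(farm_web, "client")
```

In this toy graph the client's score rises once the farm sites link to it, yet every one of its inbound links comes from a page that nothing else links to, which is the kind of regularity that makes bulk link selling detectable.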
A few years ago, the first thing I did after joining Google was to work on eliminating web cheating. The first person at Google to notice search engine cheating was Matt Cutts, who began studying the issue a few months before I joined; later, Singh, Martin and I joined in one after another. After months of hard work, we cleared out about half of the cheaters. (Of course, catching cheaters was never that efficient again afterwards.) Some of these sites have since reformed, but many simply switched to a different cheating method and carried on, so fighting cheating has become a long-term cat-and-mouse game. Although no method solves the problem once and for all, Google can, for essentially any known cheating method, detect and remove it within a certain period of time, which keeps the proportion of cheating sites consistently small.
The method for catching cheaters is much like noise removal in signal processing. Readers who have studied information theory or have experience with signal processing may know the following fact: if we use a cell phone in a car with a noisy engine, the other person may not be able to hear us clearly, but if we know the frequency of the engine noise, we can add a signal opposite to that noise and easily cancel it, so that the person on the other end does not hear the car at all. In fact, some high-end phones now have this ability to detect and eliminate noise. The process of eliminating noise can be summarized as follows:

In the diagram, the original signal is mixed with the noise, which is mathematically equivalent to convolving two signals. The process of noise cancellation is a deconvolution process. This is not a hard problem in signal processing: first, the frequency of the car engine is fixed; second, noise at this frequency repeats, so collecting a few seconds of signal is enough to process it. Broadly speaking, as long as the noise is not completely random and is correlated over time, it can be detected and eliminated. (In fact, completely random, uncorrelated Gaussian white noise is the hardest to eliminate.)
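The idea that repeating, fixed-frequency noise can be estimated from a short recording and then removed can be sketched as follows. This is a simplified illustration under stated assumptions: the noise is modeled as a single sinusoid added to the signal (a simpler additive model than the general deconvolution mentioned above), its frequency is known in advance, and the function name `cancel_tone` and all numeric values are invented for this example.

```python
import math


def cancel_tone(samples: list[float], freq_hz: float,
                sample_rate: float) -> list[float]:
    """Estimate a sinusoid of known frequency and subtract it from the samples."""
    n = len(samples)
    sin_ref = [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]
    cos_ref = [math.cos(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]
    # Over a whole number of periods the two references are orthogonal,
    # so each coefficient is a simple projection of the samples onto them.
    a = 2.0 / n * sum(s * r for s, r in zip(samples, sin_ref))
    b = 2.0 / n * sum(s * r for s, r in zip(samples, cos_ref))
    return [s - a * sr - b * cr for s, sr, cr in zip(samples, sin_ref, cos_ref)]


SAMPLE_RATE = 8000.0   # samples per second (assumed)
N = 8000               # one second of audio
ENGINE_HZ = 100.0      # assumed known engine frequency

# A stand-in for the voice: a quiet 440 Hz tone.
voice = [0.5 * math.sin(2 * math.pi * 440.0 * i / SAMPLE_RATE) for i in range(N)]
# Engine noise: louder, fixed frequency, unknown amplitude and phase.
engine = [2.0 * math.sin(2 * math.pi * ENGINE_HZ * i / SAMPLE_RATE + 0.7)
          for i in range(N)]
noisy = [v + e for v, e in zip(voice, engine)]

cleaned = cancel_tone(noisy, ENGINE_HZ, SAMPLE_RATE)
max_error = max(abs(c - v) for c, v in zip(cleaned, voice))
```

The estimate works only because the noise keeps the same frequency for the whole recording. If the noise were truly random from sample to sample, there would be no fixed pattern to project onto, which matches the remark about white noise being the hardest case.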
What search engine cheaters do is like adding noise to a phone signal: it throws the ranking of search results into disorder. However, this kind of artificially added noise is not hard to eliminate, because a cheater's method cannot be random (otherwise it would not improve the ranking). Moreover, cheaters cannot switch methods every day, which means their methods are correlated over time. Therefore, those who work on search engine ranking algorithms can collect cheating information over a period of time, catch the cheaters, and restore the original rankings. Of course, this process takes time, just as it takes time to collect samples of car engine noise, and during that period the cheaters may enjoy some gains. As a result, some people see their site's ranking rise in the short term after so-called optimization (in fact, cheating) and conclude that the optimization works. But before long they find that the ranking has dropped sharply. This is not because the search engine was lenient before and is strict now; it simply shows that catching cheaters takes a certain amount of time, and these cheating sites had not yet been detected.
It is also important to emphasize that Google's process of catching cheaters and restoring the original rankings is entirely automatic (with no personal likes or dislikes involved), just as a phone's noise cancellation is automatic. For a site to rank near the top over the long term, it needs to do a good job on its content and also draw a clear line between itself and cheating websites.
This article is from Google Blackboard
