The Beauty of Mathematics Series, Part 17: All That Glitters Is Not Gold

Cheating (spam) has long been aimed at search engine web page rankings, so users find that the top-ranked pages in a search engine are not necessarily of high quality. As the saying goes, all that glitters is not gold.
Although there are many ways to cheat a search engine, they all have one purpose: to use improper means to raise the ranking of one's own web pages. The most common early method was keyword repetition. For example, a website selling digital cameras would list the brands of various digital cameras, such as Nikon, Canon, and Kodak, over and over. To keep readers from seeing a large number of annoying keywords, crafty cheaters often used tiny fonts and text in the same color as the background to hide them. In fact, this approach is easily detected and corrected by search engines.
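To make the idea concrete, here is a minimal sketch of one way repeated keywords could be spotted: measure what share of a page's words is taken up by its few most frequent terms. The function name `top_term_share`, the choice of three terms, and the 0.5 threshold are all assumptions made for illustration; nothing here describes how any real search engine does it.

```python
import re
from collections import Counter

def top_term_share(text, top_k=3):
    """Share of all words on a page taken up by its top_k most frequent terms.

    Hypothetical heuristic for illustration only.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        return 0.0
    counts = Counter(words)
    top_total = sum(count for _, count in counts.most_common(top_k))
    return top_total / len(words)

# Hypothetical threshold, chosen only for this example.
STUFFING_THRESHOLD = 0.5

stuffed_page = "Buy cameras here. " + "nikon canon kodak " * 40
normal_page = ("This review compares the battery life and image quality "
               "of three compact cameras we tested over two weeks of travel.")

for name, page in [("stuffed", stuffed_page), ("normal", normal_page)]:
    share = top_term_share(page)
    print(name, round(share, 3), share > STUFFING_THRESHOLD)
```

A page stuffed with three brand names scores close to 1, while ordinary prose scores far lower, which is why this kind of cheating is so easy to catch.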
After PageRank appeared, cheaters found that the more links point to a web page, the higher it is likely to rank. A business dedicated to buying and selling links then emerged. For example, someone creates hundreds of websites that have no substantive content and do nothing but link to their customers' websites. This method is much more sophisticated than keyword repetition, but it is still not very difficult to detect. Because the websites that help others raise their rankings have to sell a large number of links to stay in business, they are easily exposed. (This is like counterfeit money: once a large amount of it is in circulation, it is easy to trace it back to its source.) There are many other cheating methods, which we will not go into here.
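The link-selling pattern described above can be sketched in a few lines. The example below assumes a toy link graph in which each site records its word count and its outbound links; the field names `words` and `outlinks`, both function names, and the thresholds are hypothetical and chosen only for illustration. It flags sites with almost no content and a very large number of outbound links, then measures how much of a target's inbound links come from such sites.

```python
def find_link_sellers(sites, max_words=50, min_outlinks=100):
    """Return names of sites with thin content and many outbound links."""
    return {
        name
        for name, info in sites.items()
        if info["words"] <= max_words and len(info["outlinks"]) >= min_outlinks
    }

def seller_link_share(sites, target, sellers):
    """Fraction of the inbound links to `target` that come from flagged sellers."""
    inbound = [name for name, info in sites.items() if target in info["outlinks"]]
    if not inbound:
        return 0.0
    return sum(1 for name in inbound if name in sellers) / len(inbound)

# Toy data: five content-free "farm" sites that each link to 120 customers,
# plus one ordinary blog with real content.
customers = {f"customer{i}" for i in range(120)}
sites = {f"farm{i}": {"words": 10, "outlinks": set(customers)} for i in range(5)}
sites["honest-blog"] = {"words": 2000, "outlinks": {"customer0", "news-site"}}

sellers = find_link_sellers(sites)
print(sorted(sellers))
print(seller_link_share(sites, "customer0", sellers))
print(seller_link_share(sites, "news-site", sellers))
```

The sketch mirrors the counterfeit-money remark: the more links a seller has to hand out, the more it stands out from ordinary sites.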
A few years ago, the first thing I did after joining Google was to work on eliminating web cheating. Matt Cutts was the first person at Google to notice search engine cheating, and he began studying the problem several months before I joined. Later, Singh, Martin, and I joined in. After several months of effort, we cleared out about half of the cheaters. (Of course, catching cheaters was never that efficient again afterwards.) Some of these websites mended their ways, but many others simply switched to another way of cheating. Catching cheaters has therefore become a long-running cat-and-mouse game. Although there is no way to solve the problem once and for all, Google can generally detect and clear out any known cheating method within a certain period of time, so the share of cheating websites is always kept very small.
The method for catching cheaters is similar to noise removal in signal processing. Readers who have studied information theory and have experience in signal processing may know the following fact. If we make a mobile phone call in a car with a very loud engine, the other party may not be able to hear us clearly. However, if we know the frequency of the car engine, we can add a signal opposite to the engine noise and easily cancel it, so that the listener does not hear the car at all. As a matter of fact, some high-end mobile phones can already detect and eliminate noise in this way. The noise elimination process can be summarized as follows:
In the figure, the original signal is mixed with noise, which is mathematically equivalent to the convolution of two signals. The noise elimination process is a deconvolution process. This is not a hard problem in signal processing. First, the frequency of the car engine is fixed. Second, noise at this frequency occurs repeatedly, so collecting a few seconds of signal is enough for processing. Broadly speaking, noise can be detected and eliminated as long as it is not completely random and is correlated with itself. (In fact, Gaussian white noise, which is completely random and uncorrelated, is the hardest kind to eliminate.)
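The cancellation idea can be illustrated with a short sketch. To keep it small, it assumes the simplest case, in which the engine noise is a single tone of known frequency added to the voice signal, rather than the more general convolution model mentioned above. It collects one second of samples, estimates the amplitude and phase of the engine tone by projecting the samples onto a cosine and a sine at that frequency, and then subtracts the estimated tone. The function names, sample rate, and frequencies are assumptions made for the example.

```python
import math

def estimate_tone(samples, freq_hz, sample_rate):
    """Estimate the cosine and sine amplitudes of a tone at freq_hz.

    The estimate is accurate when the window covers a whole number of periods.
    """
    n = len(samples)
    cos_sum = 0.0
    sin_sum = 0.0
    for i, x in enumerate(samples):
        angle = 2.0 * math.pi * freq_hz * i / sample_rate
        cos_sum += x * math.cos(angle)
        sin_sum += x * math.sin(angle)
    return 2.0 * cos_sum / n, 2.0 * sin_sum / n

def cancel_tone(samples, freq_hz, sample_rate):
    """Subtract the estimated tone, i.e. add the 'opposite' signal."""
    a, b = estimate_tone(samples, freq_hz, sample_rate)
    cleaned = []
    for i, x in enumerate(samples):
        angle = 2.0 * math.pi * freq_hz * i / sample_rate
        cleaned.append(x - (a * math.cos(angle) + b * math.sin(angle)))
    return cleaned

# Assumed values: 8 kHz sampling, a 120 Hz engine hum, and a 440 Hz tone
# standing in for the voice. One second of samples is collected.
SAMPLE_RATE = 8000
ENGINE_HZ = 120.0
n = SAMPLE_RATE
voice = [math.sin(2.0 * math.pi * 440.0 * i / SAMPLE_RATE) for i in range(n)]
engine = [3.0 * math.sin(2.0 * math.pi * ENGINE_HZ * i / SAMPLE_RATE + 0.7)
          for i in range(n)]
noisy = [v + e for v, e in zip(voice, engine)]

cleaned = cancel_tone(noisy, ENGINE_HZ, SAMPLE_RATE)
worst_error = max(abs(c - v) for c, v in zip(cleaned, voice))
print("largest remaining error:", worst_error)
```

The estimate works only because the hum is regular and repeats throughout the window. Completely random noise has no such structure to project onto, which is the point made about Gaussian white noise.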
What search engine cheaters do is like adding noise to a mobile phone signal: it throws the search rankings into disorder. However, this artificial noise is not difficult to eliminate, because the cheater's method cannot be random (otherwise it would not raise the ranking). Moreover, cheaters cannot switch to a new method every day, which means the cheating is correlated over time. Developers of search engine ranking algorithms can therefore collect cheating evidence over a period of time, catch the cheaters, and restore the original rankings. Of course, this process takes time, just as it takes time to collect the car engine noise. During this time, the cheater may enjoy some gains. This is why some people see their websites rank near the top for a short while after so-called optimization (in fact, cheating) and conclude that the optimization works. Before long, however, they find that the ranking has fallen a long way. This does not mean that the search engine was lenient before and is strict now. It means that catching cheaters takes some time, and that the cheating website simply had not been detected yet.
It should also be emphasized that Google's process of catching cheaters and restoring a website's original ranking is completely automatic (no personal likes or dislikes are involved), just as noise elimination on a mobile phone is automatic. A website that wants to rank near the top for the long term needs to offer good content and keep its distance from cheating websites.