How Alibaba's SEO relies on technical analysis

Before 2007, after several years of effort, Alibaba had already taken SEO to a very high level. At that time the leader of the SEO team came from a technical background, so we had a wealth of technical means for analyzing and solving SEO problems, and achieved very good results. Because the business still exists, I can only talk about a few of the less sensitive examples.

When Google Webmaster Tools first came out, many channels on our site could not pass verification with the file Google asked you to upload. The engineers helped check a lot of possibilities and suspected some redirect had been misconfigured, but after searching through plenty of material they found no solution that matched the symptoms. Verifying with the meta tag posed no technical problem; it was the file method that failed. So our SEO team stepped in to help the engineers locate the fault. My colleague Gupo soon found where the problem was: it lay in the wildcard resolution (the site's catch-all URL handling).

The specific process is this:

With wildcard resolution, no matter what URL you piece together, the site returns a normal page. For example, suppose the site's root directory uses wildcard resolution and http://www.xxxxxx.com/a.html is a normal URL on the site. If you then type in a random, nonexistent URL such as http://www.xxxxxx.com/adasdsadw.html, or even http://www.xxxxxx.com/@####¥¥.html, the site's CMS still returns a normal page.
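To make this concrete, here is a minimal sketch (our illustration, not Alibaba's tooling) that probes for catch-all behavior by requesting a path that cannot plausibly exist; the domain is a placeholder.

```python
import urllib.request
import urllib.error
import uuid

def has_catch_all(base_url: str) -> bool:
    """Return True if a clearly nonexistent URL still answers HTTP 200."""
    probe = f"{base_url}/{uuid.uuid4().hex}.html"  # random, surely-missing page
    try:
        with urllib.request.urlopen(probe, timeout=10) as resp:
            return resp.status == 200  # a 200 here means every URL "exists"
    except urllib.error.HTTPError:
        return False  # 404/410 etc. -- the site answers correctly

print(has_catch_all("http://www.xxxxxx.com"))
```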

Large websites often do this, mostly for business reasons. But a site set up this way could never pass the file-verification step in Webmaster Tools. Why?

If verification only checked that the uploaded file's URL returned a page, anyone could add such a site to their own Webmaster Tools account. In practice that cannot happen, because Google verifies not only that the file you uploaded exists, but also that a file which should not exist really does not exist. After checking the file you uploaded, Google then requests a page named google404errorpage.html. Google reckons the odds of a file actually named google404errorpage.html sitting in your site's root directory are essentially zero, so if it detects that this page "exists", verification fails: the wildcard resolution has given you away, and to protect your site, Google will not let the verification pass.
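Here is a sketch of that two-step check as we inferred it from the behavior above; this is our reconstruction, not Google's code, and the token filename google12345.html is a placeholder.

```python
import urllib.request
import urllib.error

def url_exists(url: str) -> bool:
    """True if the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        return False

def verify_ownership(site: str, verification_file: str) -> bool:
    # Step 1: the token file the owner uploaded must exist ...
    if not url_exists(f"{site}/{verification_file}"):
        return False
    # Step 2: ... and a page that should never exist must really be absent.
    # If google404errorpage.html also "exists", the site answers 200 to
    # everything, so the token proves nothing and verification is refused.
    if url_exists(f"{site}/google404errorpage.html"):
        return False
    return True

print(verify_ownership("http://www.xxxxxx.com", "google12345.html"))
```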

None of this analysis could be found through public channels. The Google webmaster quality guidelines now simply tell you to return a 4xx status code for nonexistent pages (http://www.google.com/support/webmasters/bin/answer.py?hl=cn&answer=35638), and that rule was added only recently; before it there was no relevant information to consult. So how did my colleague pinpoint the problem at once? Because the server access logs faithfully record Google's verification process: pull up a stretch of the logs for the relevant directory and the extra request is right there.
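Finding that probe in the logs can be as simple as the sketch below, which assumes a common Apache/Nginx combined log format; the file name and patterns are illustrative.

```python
import re

# pattern for the request and status fields of a combined-format log line
LINE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})')

with open("access.log", encoding="utf-8", errors="replace") as log:
    for line in log:
        if "Googlebot" not in line:  # only Google's requests interest us
            continue
        m = LINE.search(line)
        if m and "google404errorpage" in m.group("path"):
            # the tell-tale probe: Google asking for a page that should
            # not exist, and the status code we answered it with
            print(m.group("path"), m.group("status"))
```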

Without log analysis, who would have imagined such a process was going on? Even today plenty of websites cannot pass file verification; now you know to check whether wildcard resolution is the culprit, or to dig through the logs and see. Another time, after a site redesign, traffic suddenly dropped. Many factors affect SEO traffic, so which one caused the decline? My former supervisor Ben, through his own analysis, concluded that the problem was in the URL.

At the time the URLs looked like this: http://www.alibaba.com/bin/buyoffer/mp3.html. I suspect most people would see nothing unusual in this URL, but back then it had a fatal problem.

Around 2002, Google's crawler was still immature. To avoid falling into crawl loops, it would not only skip URLs carrying unnecessary parameters, but also refuse to crawl certain specific directories, among them CGI directories and similar paths such as /bin/. Anyone who has worked with CGI knows that cgi-bin is where CGI programs live, and crawling such a directory makes no sense. /bin/ is also the default folder name in various other systems and languages, and those directories should hold no pages worth crawling, so the search engine blocked crawling of them. The folder name we had defined happened to be /bin/, so Google simply would not crawl that directory.
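A toy reconstruction of that kind of path-based skip rule might look like this; the blocked-directory list and function name are our assumptions, not Google's actual crawler logic.

```python
from urllib.parse import urlparse

BLOCKED_DIRS = {"bin", "cgi-bin"}  # directories an early crawler might skip

def should_crawl(url: str) -> bool:
    """False if any path segment matches a blocked directory name."""
    segments = urlparse(url).path.strip("/").split("/")
    return not any(seg in BLOCKED_DIRS for seg in segments)

print(should_crawl("http://www.alibaba.com/bin/buyoffer/mp3.html"))    # False
print(should_crawl("http://www.alibaba.com/trade/buyoffer/mp3.html"))  # True
```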

After the directory name was changed to /trade/, traffic recovered immediately. Even today Baidu keeps such files under an ordinary directory rather than anything like /bin/, for example http://www.baidu.com/search/robots.html. I believe that even now hardly anyone would dare suspect Google itself was the problem; people go hunting among hundreds of factors for a plausible-sounding reason while the real cause stays hidden. But Ben, through technical analysis and experiment, reached a compelling conclusion. I have run into similar situations several times since, and because their experience inspired me, I have done things that others could not understand but that brought the site a great deal of traffic.

Technical analysis is also part of your competitiveness when fighting competitors for traffic. Here is a not-entirely-appropriate example:

When sitemap.xml had just come out, we built our own sitemap.xml file, but we had never produced a sitemap that large before, and on a big site the weighting settings in particular take real care. So we wanted to consult the file of a major foreign competitor. We first obtained the file's address, but the link simply would not open and always returned a 404 error. Going through a foreign proxy server gave the same result. Finally, by imitating Google's crawler, we fetched the file normally. It turned out this competitor also took SEO very seriously: to keep its sitemap.xml from being seen by others, it served the file only when the visiting user agent was Google's crawler, and since browser users are easy to identify, it blocked browser access.
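The fetch itself can be as simple as presenting Googlebot's user-agent string, as in the sketch below; the target URL is a placeholder, and a site that also checks crawler IPs or reverse DNS would not be fooled this easily.

```python
import urllib.request

SITEMAP = "http://www.competitor-example.com/sitemap.xml"  # placeholder URL
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Request the file while claiming to be Google's crawler.
req = urllib.request.Request(SITEMAP, headers={"User-Agent": GOOGLEBOT_UA})
with urllib.request.urlopen(req, timeout=10) as resp:
    print(resp.status)        # 200 if the UA check is the only gate
    print(resp.read()[:200])  # first bytes of the sitemap
```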

"How To learn SEO" a article, told the study of SEO to learn from the site and search engine-related technology began. And this article is to let you see how the specific application. Alibaba is the first to do SEO of the group of people, as early as in China do not know what is SEO when it has involved a lot of technical issues, and immediately achieved overwhelming advantage. Although they are not doing SEO for some reason now, but their contribution to the website is very big. My personal point of view: In a way, SEO is the achievement of Alibaba.

Original address: http://www.9you8.com.
