The definitive answer to why Baidu spider leaves 200 0 64 in web logs, correcting the myths spread online

Source: Internet
Author: User
Background: a recent project revision required switching to a new domain name. Since then I have been analyzing the spider and user access logs every day to catch abnormal requests and site errors. Without further ado, on to the topic.
Steps:
No1. After the revision, set up the server environment, tuned the configuration parameters, and tested that the new domain name opens correctly.
No2. Spent 1-2 days getting the site indexed by Baidu and Google. (Note: I placed a link on a homepage with very high weight.)
No3. After 10 days, Baidu had indexed only the home page and the inner pages had not moved, while Google had indexed tens of thousands of pages.
No4. Checked the logs. Google's entries all end in 200 0 0, while Baidu's all end in 200 0 64.
No5. Searched through a lot of material to make sense of these status codes. It came mainly from A5 and chinaz, where people who do not understand the issue also post confused claims. As a result, what I found had little scientific basis.

The main claims circulating online are as follows. I will answer them one by one.
1. It is a sign of being K'd (banned from the index). This claim mainly comes from webmasters whose sites have been K'd. They say that whenever 200 0 64 shows up, Baidu is about to K your site.
A: Hmm. This is pure pseudoscience. What evidence is there? Has Baidu officially said so? Look at it scientifically: 200 0 means the request was served successfully, as everyone understands. Looking up 64 on MSDN gives "the specified network name is no longer available". I have developed in C++ for 3 years and C# for 4 years, and I also do network development: this error mainly appears when a connection is reset or closed in the middle of TCP communication. Whenever one side disconnects abnormally, the other side gets an exception, and the program has to handle it. IIS is a program, and Baidu spider is a program; both sides handle this exception, and IIS records it as 200 0 64. I have observed that ordinary browsers also produce 200 0 64 for the same reason: while debugging, the log shows this status code whenever the browser breaks the connection.
Besides, this is a new domain name and I have not over-optimized it, so what would there be to penalize?
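To make the reading of the three numbers concrete, here is a minimal Python sketch. It assumes the log line ends with the fields sc-status, sc-substatus and sc-win32-status in that order, which is how the entries in this article appear to be laid out. The function name and the small lookup table are hypothetical helpers, and only the two Win32 codes discussed here are listed.

```python
# Illustrative lookup: only the two Win32 codes discussed in this article.
WIN32_STATUS = {
    0: "ERROR_SUCCESS (the operation completed successfully)",
    64: "ERROR_NETNAME_DELETED (the specified network name is no longer available)",
}


def decode_status_triplet(log_line):
    """Name the last three whitespace-separated fields of a log line.

    Assumes the line ends with: sc-status sc-substatus sc-win32-status.
    """
    sc_status, sc_substatus, sc_win32_status = (
        int(field) for field in log_line.split()[-3:]
    )
    return {
        "sc-status": sc_status,
        "sc-substatus": sc_substatus,
        "sc-win32-status": sc_win32_status,
        "meaning": WIN32_STATUS.get(sc_win32_status, "not listed here"),
    }


# The leading fields are elided; only the trailing triplet matters here.
print(decode_status_triplet("... 200 0 64"))
```

Read this way, 200 0 64 says the HTTP response itself was a success and only the transport underneath it ended abnormally.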

2. The 64 refers to a 64-bit operating system. This claim is widely spread online.
A: Nonsense. The Chinese Internet is full of junk like this.

3. The problem appears after gzip compression is enabled.
A: I will focus on this one. I will not go into how gzip works, only whether it can be the cause. Google and most browsers on the market support gzip, and Baidu's spider supports it too. Baidu's official site confirms this, and Baidu's official search engine optimization guide also recommends the practice. My server had gzip enabled, and most of the log entries looked like this:

2012-02-23 00:11:18 w3sv00001308376 192.168.206.2 GET http://www.51dianzhu.com/forum.php mod=viewthread&tid=59286&extra=page%3D1&page=1& 80 - 123.125.71.98 Mozilla/5.0+(compatible;+Baiduspider/2.0;++http://www.baidu.com/search/spider.html) 200 0 64
2012-02-23 00:18:26 w3sv00001308376 192.168.206.2 GET http://www.51dianzhu.com/index.php - 80 - 123.125.71.110 Mozilla/5.0+(compatible;+Baiduspider/2.0;++http://www.baidu.com/search/spider.html) 200 0 64
2012-02-23 01:37:23 w3sv1_1308376 192.168.206.2 GET http://www.51dianzhu.com/archiver/index.php action=TID&value=90013& 80 - 123.125.71.56 Mozilla/5.0+(compatible;+Baiduspider/2.0;++http://www.baidu.com/search/spider.html) 200 0 64

I disabled gzip for tracking and observation. The logs found on the next day are as follows:
2012-02-24 01:46:05 w3sv00001308376 192.168.206.2 GET http://www.51dianzhu.com//archiver/index.php action=FID&value=64& 80 - 123.125.71.22 Mozilla/5.0+(compatible;+Baiduspider/2.0;++http://www.baidu.com/search/spider.html) 200 0 0
2012-02-24 01:46:08 w3sv00001308376 192.168.206.2 GET http://www.51dianzhu.com//plugin.php id=vgallery:vgallery&tion=view&vid=59 80 - 123.125.71.16 Mozilla/5.0+(compatible;+Baiduspider/2.0;++http://www.baidu.com/search/spider.html) 200 0 0
2012-02-24 01:38:54 w3sv00001308376 192.168.206.2 GET http://www.51dianzhu.com//forum.php mod=viewthread&tid=90290&extra=page%3D1&page=1& 80 - 123.125.71.114 Mozilla/5.0+(compatible;+Baiduspider/2.0;++http://www.baidu.com/search/spider.html) 200 0
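Comparing a day with gzip on against a day with gzip off is easier with counts than by eye. The following Python sketch tallies the trailing status triplet per crawler. It assumes each entry ends with three numeric fields, as in the entries above, and that the crawler can be recognized by a substring of the user agent. The function name, the crawler list and the sample lines (which use the placeholder host example.com) are hypothetical, not real log data.

```python
from collections import Counter


def tally_by_crawler(lines, crawlers=("baiduspider", "googlebot")):
    """Count (crawler, 'status substatus win32-status') pairs in log lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip comment lines, and lines that do not end in three numbers
        # (for example an entry that was cut off).
        if line.startswith("#") or len(fields) < 3:
            continue
        if not all(field.isdigit() for field in fields[-3:]):
            continue
        triplet = " ".join(fields[-3:])
        lowered = line.lower()
        for name in crawlers:
            if name in lowered:
                counts[(name, triplet)] += 1
    return counts


# Made-up sample lines in the same general shape as the entries above.
sample = [
    "2012-02-23 00:11:18 GET /forum.php - 80 - 192.0.2.1 Mozilla/5.0+(compatible;+Baiduspider/2.0;++http://www.baidu.com/search/spider.html) 200 0 64",
    "2012-02-23 00:12:02 GET /index.php - 80 - 192.0.2.2 Mozilla/5.0+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html) 200 0 0",
    "2012-02-24 01:46:05 GET /index.php - 80 - 192.0.2.1 Mozilla/5.0+(compatible;+Baiduspider/2.0;++http://www.baidu.com/search/spider.html) 200 0 0",
]

for (crawler, triplet), count in sorted(tally_by_crawler(sample).items()):
    print(crawler, triplet, count)
```

Running it once per daily log file would show whether the share of 200 0 64 entries for a given crawler really moves when gzip is switched off.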

Why? My analysis is as follows:
1. When Baidu requests a page, it receives the gzip-compressed stream and decompresses it. During this process it does not promptly read the rest of the response and goes off to do its own work. The server side therefore sees an exception: the connection is reset and the network name becomes unavailable. Google does a very good job here and follows the full process. In practice this does no harm, because Baidu has already obtained what it wanted.
2. Baidu fails to decompress the gzip content when requesting the page. Haha, that would be rather poor. This is what most webmasters worry about most, and Baidu has not commented on it. I do not think this is the case.
3. Some people say: I have not enabled gzip, so why do I still see 200 0 64? Because if your content has not changed, the spider only checks the first part of the content stream when crawling and then closes the connection directly. On the server side this shows up as the network name being unavailable, that is, 64. Baidu does this to improve crawl efficiency.
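The mechanism behind points 1 and 3, a client that reads only the start of a response and then hangs up, can be reproduced locally. The Python sketch below is only a simulation under stated assumptions: it uses a plain TCP socket on 127.0.0.1 rather than IIS or a real spider, the function names and payload size are hypothetical, and it shows the exception raised on the sending side, not the status code IIS would write to its log.

```python
import socket
import threading


def serve_once(listener, payload, result):
    """Accept one connection and try to send the whole payload."""
    conn, _ = listener.accept()
    try:
        conn.sendall(payload)
        result["outcome"] = "sent everything"
    except ConnectionError as exc:
        # Covers connection reset, broken pipe and connection aborted.
        result["outcome"] = "client went away: " + type(exc).__name__
    finally:
        conn.close()


def demo_early_close(payload_size=32 * 1024 * 1024):
    """Simulate a client that reads the first bytes and then disconnects."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    result = {}
    server = threading.Thread(
        target=serve_once,
        args=(listener, b"x" * payload_size, result),
        daemon=True,
    )
    server.start()

    client = socket.create_connection(listener.getsockname())
    client.recv(1024)  # read only the beginning of the response
    client.close()     # hang up while the server is still sending

    server.join(timeout=10)
    listener.close()
    return result.get("outcome")


if __name__ == "__main__":
    print(demo_early_close())
```

The payload is deliberately much larger than the socket buffers so that the server is still sending when the client closes. The response was on its way without any problem, and only the end of the transfer was cut short by the peer.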

Based on my own analysis, I suggest disabling gzip first, although this is really just for peace of mind. The claims online that 200 0 64 is a precursor to being K'd, or that it refers to a 64-bit system, have no evidence behind them, so ignore them.
In addition, according to my own tests, Baidu does have an evaluation period for new sites, ranging from one week to one month.
