Bo: Search crawlers should follow the rules; violating the protocol will cause chaos

Source: Internet
Author: User
Keywords: Crawl, Bo


Qihoo 360's recently launched comprehensive search has been exposed as disregarding the internationally accepted robots protocol: it crawls content from Baidu, Google and other search engines, and as a result, intranet information that many sites bar search engines from crawling for security and privacy reasons has been leaked. Senior Internet observer Bo pointed out that running a search engine means complying with the rules of the game recognized by the search industry. Ignoring and wantonly violating those rules is the real unfair competition, and if the law and government regulators do not stop this behavior in time, it will cause chaos in the industry.

A search engine works by running a crawler, or spider, program that automatically collects web pages on the Internet and extracts the relevant information. For reasons of network security and privacy, each site sets up its own robots protocol to tell search engines which content it is willing to have indexed and which it is not. Search engines then crawl within the permissions the protocol grants them. The robots protocol has become an international practice that all search engines must follow. It is like visiting someone else's home: you knock first and get permission before entering the living room, and you cannot enter the private rooms or wander around the house unless your host further permits and invites you.
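As a rough illustration of how a well-behaved crawler might consult a site's robots protocol before fetching a page, here is a minimal sketch using Python's standard `urllib.robotparser` module. The rules, the `example.com` URLs, the `ExampleBot` crawler name and the `load_rules` helper are all assumptions made up for this example; they are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an illustrative site (not from any real site).
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /private/
"""


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set a crawler can query."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# A compliant crawler asks for permission before every fetch.
for url in (
    "https://example.com/index.html",
    "https://example.com/admin/users",
):
    verdict = "crawl" if rules.can_fetch("ExampleBot", url) else "skip"
    print(verdict, url)
```

The point is that the check is voluntary: nothing in the protocol technically stops a crawler that never makes it, which is why compliance rests on convention.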

Therefore, when 360 comprehensive search, newly launched two weeks ago, ignored the protocol and directly captured data it was not authorized to crawl, the practice was widely questioned by industry insiders.

It is understood that the robots.txt files on Baidu's websites do not authorize the 360 search crawler, yet 360 search ignored this setting and crawled without authorization. Most of the pages that content source sites prohibit search engines from crawling involve back-end databases on the server, user privacy, passwords and other such information. This means that when 360 ignores the robots.txt settings of a content source site, private information on the server that should not be discoverable can be found, or even displayed directly in search results.
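The kind of setting described here, where a site refuses one named crawler while leaving others permitted, can be sketched as follows. This is only an illustration under assumptions: `BlockedBot`, `OtherBot`, the `example.com` URLs and the `crawlable` helper are hypothetical, and the rules are not Baidu's actual robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical per-crawler rules: one named crawler is denied everything,
# all other crawlers are only kept out of /private/.
PER_CRAWLER_ROBOTS_TXT = """\
User-agent: BlockedBot
Disallow: /

User-agent: *
Disallow: /private/
"""

URLS = [
    "https://example.com/",
    "https://example.com/questions/1",
    "https://example.com/private/accounts",
]


def crawlable(robots_txt: str, agent: str, urls: list[str]) -> list[str]:
    """Return only the URLs that the given crawler is permitted to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(agent, url)]


print("BlockedBot may fetch:", crawlable(PER_CRAWLER_ROBOTS_TXT, "BlockedBot", URLS))
print("OtherBot may fetch:", crawlable(PER_CRAWLER_ROBOTS_TXT, "OtherBot", URLS))
```

A crawler that honors these rules fetches nothing when it identifies itself as the blocked agent, while any other crawler still stays out of the private area.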

Zhou Hongyi has been unable to deny the accusation that 360 violated the protocol, but he countered that it was unfair competition for Baidu to ban 360's crawler in its robots protocol. Bo says the protocol gives websites the right to ban any search crawler, which has nothing to do with unfair competition. It is 360's disregard of the industry's accepted rules that is the real unfair competition.

"Running a search engine means complying with the rules of the game recognized by the search industry. Ignoring and wantonly violating those rules is the real unfair competition," Bo said. In his view, Baidu did not prohibit all crawlers from crawling its question-and-answer, Zhidao and Tieba content. It only banned crawlers that do not follow the rules and pose potential security risks, which is a reasonable measure to protect market order and user privacy. He pointed out that in 2008 Taobao also banned the Baidu crawler, and Baidu strictly adhered to the protocol and stopped crawling Taobao content. It did not use a claim of unfair competition by Taobao as a pretext for violating the protocol.

360 has always boasted that it is doing search in an innovative way. Bo gave his view: "How does a search engine that does not even comply with the basic rules of the game have the nerve to label itself 'innovative'?" Perhaps in Zhou's dictionary, ignoring the rules equals innovation. Bo said that if such behavior is not stopped in time by law and government regulation, then today 360 illegally crawls Baidu content, and tomorrow it could grab the large volume of private community information on Renren. Other sites and search engines could follow suit: Etao, which is blocked by Jingdong, could likewise grab its competitors' merchandise information. If this continues, the entire Internet industry will be thrown into turmoil.
