Blocking crawlers in Nginx

Source: Internet
Author: User

Simulate a crawler request:

curl -I -A 'Baiduspider' hello.net

The result:

HTTP/1.1 200 OK
Server: nginx
Date: Wed, 07:26:48 GMT

A response like the one above means crawlers are allowed.

If crawlers are blocked, the first line is HTTP/1.1 403 Forbidden instead.

----------------------------------------------------------------------------------------

Method 1

Add the rule to the server block. Separate multiple HTTP user agents with a pipe (|).

server {
    # ~* is a case-insensitive regular expression match
    if ($http_user_agent ~* "qihoobot|baiduspider|Googlebot") {
        return 403;
    }
}

To reject requests whose user agent is wget, add the following:
## Block HTTP user agent: wget ##
if ($http_user_agent ~* "Wget") {
    return 403;
}
## Block software download user agents ##
if ($http_user_agent ~* "LWP::Simple|BBBike|wget") {
    return 403;
}
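
As an alternative to chaining several if blocks, the same user agents can be collected in a map. This is a different technique from the one above, shown only as a minimal sketch: the variable name $block_ua, the listen port, and the use of hello.net as server_name are assumptions for illustration.

```nginx
# The map block belongs in the http context; it is not allowed inside server.
# $block_ua is a hypothetical variable name chosen for this sketch.
map $http_user_agent $block_ua {
    default          0;
    # A leading ~* marks a case-insensitive regular expression.
    ~*qihoobot       1;
    ~*baiduspider    1;
    ~*googlebot      1;
    ~*wget           1;
}

server {
    listen 80;              # assumed port
    server_name hello.net;  # assumed from the curl example

    # The condition is false when the variable is empty or "0".
    if ($block_ua) {
        return 403;
    }
}
```

With this layout, adding or removing a user agent means editing one line in the map rather than a regular expression inside an if condition.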

Method 2

Use a robots.txt file. For example, the following asks all crawlers not to crawl any part of the site. It is not very effective, because crawlers follow robots.txt only voluntarily.

User-agent: *
Disallow: /
Method 3: a separate configuration file

Go to the conf directory under the Nginx installation directory and save the following code as agent_deny.conf:
cd /usr/local/nginx/conf
vim agent_deny.conf


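# Block scraping by tools such as Scrapy
if ($http_user_agent ~* "Scrapy|Curl|HttpClient") {
    return 403;
}


# Block the listed user agents and requests with an empty user agent
# (~ is case-sensitive, so the names must match the real user agent strings)
if ($http_user_agent ~ "FeedDemon|JikeSpider|^$") {
    return 403;
}


# Block request methods other than GET, HEAD, and POST
if ($request_method !~ ^(GET|HEAD|POST)$) {
    return 403;
}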
Then, in the site's configuration, insert the following line after "location / {":
include agent_deny.conf;
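
Put together, the site configuration might look like the sketch below. The listen port, server_name, root, and index values are placeholders assumed for illustration; only the include line comes from the steps above. A relative include path is resolved against the directory that holds the main configuration file, which is why agent_deny.conf was saved in conf.

```nginx
server {
    listen 80;              # assumed port
    server_name hello.net;  # assumed from the curl example

    location / {
        # Pulls in the if/return 403 rules saved in conf/agent_deny.conf.
        include agent_deny.conf;

        root  html;         # placeholder document root
        index index.html;   # placeholder index file
    }
}
```

After editing, nginx -t checks the configuration syntax and nginx -s reload applies it. Repeating the curl request from the start of the article should then return 403 for a blocked user agent.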


Finally, Method 1 is the recommended approach.
