How to block specific user agents in Nginx


This article explains how to block specific user agents in Nginx, and how to maintain a blacklist of blocked user agents for easier management.

The modern Internet breeds a wide variety of malicious bots and web crawlers, such as malware bots, spambots, and content scrapers, which secretly scan your website, probing for potential vulnerabilities, harvesting email addresses, or simply stealing your content. Most of these bots can be identified by their "User-Agent" signature string.

To keep such bots away from your website, you could try listing their user agents in the robots.txt file. Unfortunately, robots.txt only works for bots that choose to comply with it; malicious bots can easily ignore robots.txt and scan your website at will.

Another way to block a specific bot is to configure your web server to reject requests that carry a particular User-Agent string. This article describes how to block specific user agents on the Nginx web server.

Blacklisting specific user agents in Nginx

To set up user-agent blocking, open the Nginx configuration file for your website and find the server block. The file's location varies depending on your Nginx setup and Linux distribution (for example, /etc/nginx/nginx.conf, /etc/nginx/sites-enabled/, /usr/local/nginx/conf/nginx.conf, /etc/nginx/conf.d/).

The Code is as follows:

server {
    listen 80 default_server;
    server_name xmodulo.com;
    root /usr/share/nginx/html;
    ....
}

After opening the configuration file and finding the server block, add the following if statements somewhere inside the block.

The Code is as follows:

server {
    listen 80 default_server;
    server_name xmodulo.com;
    root /usr/share/nginx/html;

    # case-sensitive matching
    if ($http_user_agent ~ (Antivirx|Arian)) {
        return 403;
    }

    # case-insensitive matching
    if ($http_user_agent ~* (netcrawl|npbot|malicious)) {
        return 403;
    }

    ....
}

As you can see, these if statements use a regular expression to match any bad user-agent string, and return HTTP status code 403 for anything that matches. $http_user_agent is a built-in variable containing the User-Agent string of an HTTP request. The '~' operator matches the user-agent string case-sensitively, while the '~*' operator matches case-insensitively. The '|' operator is a logical OR, so you can list as many user-agent keywords in the if statements as you like and block them all.
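If you are unsure how the two operators differ, you can approximate them outside Nginx with grep, where plain grep -E behaves like '~' and grep -iE behaves like '~*'. The user-agent string below is a made-up example:

```shell
# Hypothetical user-agent string, chosen so the case differs from the pattern
ua="Mozilla/5.0 (compatible; NetCrawl/1.0)"

# '~' (case-sensitive): "netcrawl" does not match "NetCrawl"
echo "$ua" | grep -qE '(netcrawl|npbot|malicious)' && echo blocked || echo allowed
# prints "allowed"

# '~*' (case-insensitive): "netcrawl" matches "NetCrawl"
echo "$ua" | grep -qiE '(netcrawl|npbot|malicious)' && echo blocked || echo allowed
# prints "blocked"
```

This is why the case-insensitive '~*' form is usually the safer choice for blacklists: bots are free to vary the capitalization of their User-Agent string.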

After modifying the configuration file, you must reload Nginx to activate the blocking:

$ sudo /path/to/nginx -s reload

You can test the user-agent blocking with wget and its "--user-agent" option:

$ wget --user-agent "malicious bot" http://<nginx-ip-address>
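Alternatively, curl can run the same test; its -A option sets the User-Agent header, and -w "%{http_code}" prints just the HTTP status code, which should be 403 once the block is in place (the address placeholder is the same as above):

```
$ curl -A "malicious bot" -s -o /dev/null -w "%{http_code}\n" http://<nginx-ip-address>/
```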

Managing the user-agent blacklist in Nginx

So far, I have demonstrated how to block HTTP requests from a few user agents in Nginx. But what if you have many different kinds of crawler bots to block?

Since the user-agent blacklist can grow very large, it is not a good idea to keep it inside the Nginx server block. Instead, you can create a separate file that lists all blocked user agents. For example, let's create /etc/nginx/useragent.rules and define a map of all blocked user agents in the following format.

$ sudo vi /etc/nginx/useragent.rules

The Code is as follows:

map $http_user_agent $badagent {
    default        0;
    ~*malicious    1;
    ~*backdoor     1;
    ~*netcrawler   1;
    ~Antivirx      1;
    ~Arian         1;
    ~webbandit     1;
}

As in the previous configuration, the '~*' operator matches keywords case-insensitively, while the '~' operator matches them with a case-sensitive regular expression. The "default 0" line means that any user agent not matching one of the listed entries is allowed (mapped to 0).

Next, open the Nginx configuration file of your website, find the http block, and add the following line somewhere inside it.

The Code is as follows:

http {
    .....
    include /etc/nginx/useragent.rules;
}

Note that the include directive must appear before the server block (which is why we add it inside the http block).
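For reference, the nesting looks like this: the include (which defines the $badagent variable via the map) sits at the http level, before any server block that uses it. The server_name is carried over from the earlier examples:

```
http {
    include /etc/nginx/useragent.rules;   # defines $badagent

    server {
        listen 80 default_server;
        server_name xmodulo.com;
        ....
    }
}
```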

Now, open the part of the Nginx configuration that defines your server, and add the following if statement:

The Code is as follows:

server {
    ....
    if ($badagent) {
        return 403;
    }
    ....
}

Finally, reload Nginx:

$ sudo /path/to/nginx -s reload

Now, any user agent whose string contains a keyword listed in /etc/nginx/useragent.rules will be automatically blocked by Nginx.
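To watch the blocking in action, you can look for 403 responses in the access log. The path below is a common default but varies by distribution, so adjust it for your setup:

```
$ sudo grep ' 403 ' /var/log/nginx/access.log | tail -n 5
```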
