Learning notes for Rewrite rules in Nginx

Source: Internet
Author: User

Route rewriting is an important basic function of Web servers. It lets you structure URLs and make them more semantic (good for SEO). In addition, shared URLs can become invalid when an application's routes change, and rewriting routes solves this problem effectively.

Used appropriately, the Rewrite function brings many benefits. The Rewrite function in Nginx is based on Perl-compatible regular expressions, so the PCRE library must be installed before compiling and installing Nginx. Rewrite is implemented by ngx_http_rewrite_module, so make sure this module is included in your build.

Rewrite rules

The core of Rewrite is regular expressions. Therefore, to use Nginx Rewrite skillfully, you must be familiar with regular expressions. We recommend a good regular expression testing tool, Regex Match Tracer.

In addition to regular expressions, Nginx has a number of built-in Rewrite commands and related variables, which together provide a complete set of functions.

if command

The if command supports conditional evaluation. The basic syntax is as follows:

if (condition) {
    # do something
}

In the code above, the curly brackets delimit the scope, that is, the configuration applied when the condition is true. A condition can be a variable name: if the value of the variable is an empty string or "0", the condition is false; otherwise it is true. (Before Nginx 1.0.1, any string starting with "0" was treated as false.)

if ($isTrue) {
    # do something
}

You can also use "=" and "!=" to compare a variable with a string, as follows:

if ($request_method = POST) {
    return 405;
}

Note: When comparing with a string, you do not need to enclose the string in quotation marks.

In addition to the methods above, you can match against regular expressions. The following operators are available: "~" performs a case-sensitive match; "~*" performs a case-insensitive match; "!~" performs a case-sensitive match and negates the result; likewise, "!~*" performs a case-insensitive match and negates the result. Within a regular expression, parentheses capture parts of the matched value, and the captured values are available through $1, $2, and so on:

if ($http_user_agent ~ MSIE) {
    # Check whether the browser's User-Agent contains the string "MSIE"
}

if ($http_cookie ~* "id=([^;]+)(?:;|$)") {
    # The captured values are available as $1, $2, and so on, for example:
    # set $id $1;   # save the captured value in $id for later use
}

Note: Regular expressions generally do not require quotation marks. However, if a regular expression contains '}' or ';', quotation marks must be added to the entire expression.
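
As a rough sketch of the quoting rule above, the fragment below matches a four-digit year in the URI. The regular expression contains "{" and "}", so the whole expression is quoted. The /archive/ path and the $year variable are hypothetical names chosen for illustration.

```nginx
# Hypothetical fragment, placed inside a server or location block.
if ($uri ~ "^/archive/([0-9]{4})/") {
    set $year $1;    # first capture group: the four-digit year
}
```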

The if command can also test whether the requested file or directory exists. The code is as follows:

if (-f $request_filename) {
    # Determine whether the requested file exists
}

if (!-f $request_filename) {
    # Determine whether the requested file does not exist
}

As the code above shows, -f tests whether the requested file exists. Similar operators test for directories and other cases:

-f: true if the requested file exists. Prefix it with "!" (!-f) to negate the test.
-d: true if the requested directory exists. Prefix it with "!" (!-d) to negate the test. Example:

if (-d $request_filename) {
    # Determine whether the requested directory exists
}

-e: true if the requested file, directory, or symbolic link exists.
-x: true if the requested file is executable.
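
A common use of these tests, sketched here under assumptions, is to hand every request that does not map to an existing file or directory to a single entry script. The /index.php entry point is a hypothetical name, not something from the original notes, and the sketch assumes that file exists under the document root.

```nginx
location / {
    # If nothing exists at the requested path (no file, directory, or symlink),
    # rewrite the request to a hypothetical front controller.
    if (!-e $request_filename) {
        rewrite ^ /index.php last;
    }
}
```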

break command

The break command stops processing of the current set of rewrite-module commands (if, set, rewrite, return, and so on). When Nginx encounters it while processing a request, the rewrite-module commands that follow it in the same scope are not executed, and request processing continues in the current location. Commands from other modules are not affected.

The basic syntax is as follows:

break;

Example:

location / {
    if ($slow) {          # the braces open a new scope
        set $id $1;       # before break: executed
        break;
        set $skipped 1;   # rewrite-module command after break: not executed
        limit_rate 10k;   # not a rewrite-module command, so it still applies
    }

    # Other non-rewrite configuration (still applies)
}

return command

The return command stops processing the request and returns a response status code directly to the client. Any configuration after this command is not processed for the request. It is usually used together with the if command in server and location blocks. The syntax is as follows:

return code [text];
return code URL;
return URL;

code: the status code returned to the client. It can be any HTTP status code from 0 to 999 (the non-standard code 444 closes the connection without sending a response).
text: the response body returned to the client. Variables are supported.
URL: the URL address returned to the client.

Generally, the return command is used for domain name redirection. The code is as follows:

if ($true) {
    return 301 http://www.111cn.net;
}
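
A minimal sketch of two of the forms listed above. The server name example.com and the /health path are placeholders, not taken from the original notes.

```nginx
server {
    listen 80;
    server_name example.com;          # placeholder name

    # "return code text": answer directly with a status code and a body.
    location = /health {
        default_type text/plain;
        return 200 "ok\n";
    }

    # "return code URL": redirect everything else, keeping path and query string.
    location / {
        return 301 https://$host$request_uri;
    }
}
```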

rewrite command

This command uses a regular expression to change the URI. Several rewrite commands can exist in the same scope; they are matched and processed in order.

This command can be configured in a server block or a location block. Its syntax is as follows:

rewrite regex replacement [flag];

regex: the regular expression matched against the URI. Use parentheses "()" to capture the parts you want to reuse.

The following example is incorrect:

rewrite myblog.net http://www.111cn.net permanent;

Tip: The URI that rewrite receives does not contain the host address, so regex cannot match the host part of the URL.

In the example above, a request for http://myblog.net/source is not rewritten, because the URI that the rewrite command receives is "/source", which does not contain "myblog.net".

In addition, the request parameters are not part of the URI that rewrite receives. For example:

http://111cn.net/source?arg1=value1&arg2=value2

The URI that rewrite receives is "/source", without "arg1=value1&arg2=value2".

replacement: the string that replaces the matched URI when the match succeeds. If the replacement string starts with "http://" or "https://", no further processing of the URI takes place; the rewritten URL is returned directly to the client as a redirect.

If you want to match on the host, test it in an if condition:

if ($host = '111cn.net') {
    rewrite ^.*$ http://www.111cn.net$request_uri? permanent;
}

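
The trailing "?" in the replacement above matters because Nginx keeps the original request arguments on the rewritten URI by default, and $request_uri already contains them. A small sketch of both behaviors, using hypothetical /old, /drop, and /new paths:

```nginx
# Hypothetical paths; place inside a server block.

# /old/page?arg1=value1  ->  /new/page?arg1=value1
# (the original arguments are kept)
rewrite ^/old/(.*)$ /new/$1 last;

# /drop/page?arg1=value1  ->  /new/page
# (a replacement ending in "?" discards the original arguments)
rewrite ^/drop/(.*)$ /new/$1? last;
```
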
flag: sets how processing continues after the URI is rewritten. It can be one of the following:

last: stops processing the rewrite commands in the current block and searches for a location that matches the rewritten URI. In other words, the rewritten URI is run through the location blocks of the server block again, which gives other location blocks the chance to handle it. The following example illustrates this:

location / {
    rewrite ^(/111cn/.*)/media/(.*)\..*$ $1/mp3/$2.mp3 last;
    rewrite ^(/111cn/.*)/audio/(.*)\..*$ $1/mp3/$2.ra last;
}

If a request matches the rule on the second line, Nginx runs location matching again for the rewritten URI across all location blocks.

break: uses the rewritten URI as the new URI and continues processing in the current block. The rewritten address is handled in the current location block and is not passed to other location blocks. See the following example:

location /111cn/ {
    rewrite ^(/111cn/.*)/media/(.*)\..*$ $1/mp3/$2.mp3 break;
    rewrite ^(/111cn/.*)/audio/(.*)\..*$ $1/mp3/$2.ra break;
}

If the URI matches the rule on the second line and is rewritten, Nginx skips the remaining rewrite commands and continues processing the new URI inside this location block. The new URI always stays in the same location block.

redirect: returns the rewritten URI to the client with status code 302, a temporary redirect. It is mainly used when the replacement string does not start with "http://" or "https://".

permanent: returns the rewritten URI to the client with status code 301, a permanent redirect.

When using flags, pay attention to how they interact with the surrounding location. Consider the break example above: what if break is changed to last? Because the rewritten URI still begins with /111cn/, it matches the same location again and an endless loop can occur. In that case, Nginx returns a 500 error after 10 cycles.
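
The redirect and permanent flags can be sketched as follows. The /promo/, /legacy/, and target paths are hypothetical; because the replacements do not start with "http://" or "https://", the flag is what makes Nginx answer with a redirect instead of rewriting internally.

```nginx
server {
    listen 80;
    server_name www.111cn.net;

    # 302 temporary redirect: /promo/x -> /campaign/x
    rewrite ^/promo/(.*)$ /campaign/$1 redirect;

    # 301 permanent redirect: /legacy/x -> /current/x
    rewrite ^/legacy/(.*)$ /current/$1 permanent;
}
```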

rewrite_log command

This command enables or disables logging of URL rewrite processing. The syntax is as follows:

rewrite_log on | off;

The default value is off. When set to on, log entries related to URL rewriting are written at the notice level to the log file configured by the error_log command.
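
A sketch of turning the rewrite log on for one server. The log path and the rewrite rule are assumptions; what matters is that error_log is set to the notice level (or a more verbose one), otherwise the rewrite messages are filtered out.

```nginx
server {
    listen 80;
    server_name www.111cn.net;

    error_log /var/log/nginx/rewrite-debug.log notice;   # assumed path
    rewrite_log on;

    # Hypothetical rule whose processing gets logged.
    rewrite ^/old/(.*)$ /new/$1 last;
}
```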

set command

The set command sets the value of a variable. Its syntax is:

set $variable value;

variable: the variable name. It must start with the "$" symbol and must not have the same name as a built-in Nginx variable.
value: the value of the variable. It can be a string, other variables, or a combination of both.
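
As an illustration of the value forms described above, the sketch below stores a regex capture in one variable and then builds a second variable from a string combined with the first. The /item/ and /data/items/ paths and both variable names are hypothetical.

```nginx
location ~ ^/item/([0-9]+)$ {
    set $item_id $1;                                 # value taken from another variable
    set $item_path "/data/items/${item_id}.json";    # string combined with a variable
    rewrite ^ $item_path last;                       # use the computed value
}
```

The ${item_id} form only marks where the variable name ends, so the ".json" suffix is not read as part of the name.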

uninitialized_variable_warn command

This command controls whether a warning is logged when an uninitialized variable is used (on by default). Its syntax is as follows:

uninitialized_variable_warn on | off;

Common global variables of Rewrite

Nginx global variables are often used when configuring Rewrite. They are recorded here for easy reference (the examples assume the request URL http://www.111cn.net:8081/server/source?arg1=value1&arg2=value2):

Variable name: description (example)
$args: the request parameters in the request URL, for example arg1=value1&arg2=value2
$content_length: the Content-Length field of the request header
$content_type: the Content-Type field of the request header
$document_root: the root path for the current request
$document_uri: the URI of the current request, without the request parameters, for example /server/source
$host: the host name of the request (from the request line or the Host header field), for example www.111cn.net
$http_user_agent: the User-Agent information of the requesting client
$http_cookie: the cookie information sent by the client
$limit_rate: the limit on the connection rate, that is, the value configured by the limit_rate command
$remote_addr: the client IP address
$remote_port: the client-side port of the connection between the client and the server
$remote_user: the client user name
$request_body_file: the name of the local temporary file that holds the request body sent on to the backend server
$request_method: the method of the client request, such as GET, POST, and OPTIONS
$request_filename: the file path of the currently requested resource
$request_uri: the URI of the current request including the request parameters, for example /server/source?arg1=value1&arg2=value2
$query_string: same meaning as $args
$scheme: the request scheme, http or https
$server_protocol: the protocol version of the client request, for example "HTTP/1.0" or "HTTP/1.1"
$server_addr: the server IP address
$server_name: the name of the server that accepted the request
$server_port: the port of the server that accepted the request, for example 8081
$uri: same meaning as $document_uri
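
One way to see what these variables contain for a given request is a small diagnostic location that echoes them back. This is only a sketch; the /debug-vars path is a hypothetical name and the choice of variables is arbitrary.

```nginx
location = /debug-vars {
    default_type text/plain;
    # Return a handful of the variables above as plain text.
    return 200 "host=$host uri=$uri args=$args request_uri=$request_uri server_port=$server_port\n";
}
```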

Use of Rewrite

ngx_http_rewrite_module is the Rewrite module of the Nginx server. Functions such as pathInfo handling and common Nginx reverse proxy setups can be implemented through Rewrite. The following examples illustrate how the commands above are used.

Domain name redirection

Rewrite can redirect to a first-level domain name or to a multi-level domain name. Example:


# Example 1
...
server {
    listen 80;
    server_name 111cn.net;
    rewrite ^/ http://www.111cn.net/;   # domain redirect
    ...
}
...

# Example 2
...
server {
    listen 80;
    server_name 111cn.net www.111cn.net;
    if ($host ~ myweb\.net) {
        rewrite ^(.*) http://www.111cn.org$1 permanent;   # multi-domain redirect
    }
}

# Example 3
...
server {
    listen 80;
    server_name demo1.111cn.net demo2.111cn.net;
    if ($http_host ~* ^(.*)\.111cn\.net$) {
        rewrite ^(.*) http://demo.111cn.net$1;   # redirect to a third-level domain name
    }
}
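
For a whole-domain redirect similar to Example 1, a comparable effect can be sketched with the return command, which avoids evaluating a regular expression. This is an alternative offered for comparison, not the configuration from the original notes; unlike Example 1, it answers with 301 and keeps the original path and query string.

```nginx
server {
    listen 80;
    server_name 111cn.net;
    # Permanent redirect to the www host, preserving path and query string.
    return 301 http://www.111cn.net$request_uri;
}
```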

Domain name mirroring

A mirror website is an identical copy of a website placed on another server under its own independent URL. The website on one server is called the main site, and the others are mirror sites. A mirror site can be seen as a copy of the main site: it can act as a backup when the main site has a problem, it can improve response speed for users in different regions, and it can share the traffic load to relieve problems such as network bandwidth congestion.

The Rewrite function in Nginx makes domain name mirroring easy to set up. The principle is simple: rewrite the URLs of the different mirrors to the specified URL. The following is an example configuration:

server {
    ...
    listen 80;
    server_name google.111cn.net;
    rewrite ^(.*) http://www.google.com$1 last;
}
server {
    ...
    listen 81;
    server_name bings.111cn.net;
    rewrite ^(.*) http://bings.cn$1 last;
}

Of course, a mirror can also be set up for a single directory:


server {
    listen 80;
    server_name cdn.111cn.net;
    location ^~ /source {
        ...
        rewrite ^/source(.*) http://cdn.google.com/websrc21_1 last;
    }
}

server {
    listen 81;
    server_name cdn1.111cn.net;
    rewrite ^(.*) http://cdn.baidu.com/;

    location ^~ /source2 {
        ...
        rewrite ^/source2(.*) http://cdn.baidu.com/websrc21_1 last;
    }
}

Automatically add a trailing "/" to directory URLs

If a default resource file is set for the website, the client can access it without specifying a file name. For example, a request for http://www.111cn.net directly returns the "/index.html" file.

When a second-level directory is accessed, however, the URL has to end with "/". Users cannot be expected to always type the trailing slash, so the Rewrite function can be used to add "/" automatically when it is missing:


server {
    ...
    listen 81;
    server_name www.111cn.net;
    location ^~ /bbs {
        ...
        if (-d $request_filename) {
            rewrite ^/(.*)([^/])$ http://$host/$1$2/ permanent;
        }
    }
}

Directory merging

Search engine optimization improves the ranking of a website by taking advantage of search engine indexing rules. Directory merging is one way to improve SEO. For example, suppose a resource on a website is stored at the following path:

[root]/server/12/34/56/78/9.html

To access this resource directly, the URL would have to spell out the same deep directory structure. With Rewrite, the resource can instead be exposed as http://www.111cn.net/server-12-34-56-78-9.html. The configuration is as follows:


server {
    ...
    listen 80;
    server_name www.111cn.net;
    location ^~ /server {
        ...
        rewrite ^/server-([0-9]+)-([0-9]+)-([0-9]+)-([0-9]+)-([0-9]+)\.html$ /server/$1/$2/$3/$4/$5.html last;
        break;
    }

}

Anti-leech

Leeching damages the legitimate interests of the original website and places an additional burden on the original server. First, let's look at how anti-leech protection works.

When a client requests resources from a server, the server generally does not send all resources at once, in order to save network bandwidth and improve response speed. For a web page, for example, the page's text file is returned first; as the client parses the page, it downloads the referenced resource files such as images, style sheets, and executable scripts. If a page references resource files that are stored not on its own server but on someone else's server, that constitutes leeching.

To prevent leeching, you need to know about the Referer header field of the HTTP request, which carries, in URL format, the address of the page from which the current page or file was requested. Through this field, the server can detect the source of a request for a resource. If the value of the Referer field is not a URL of your own site, you can take measures to block the request. However, the Referer value can be forged, so this method cannot block all leeching.

Nginx provides the valid_referers command, which examines the Referer header field and sets the $invalid_referer variable accordingly. If the Referer value does not match any value configured in valid_referers, $invalid_referer is set to 1. The syntax of the valid_referers command is:

valid_referers none | blocked | server_names | string ...;

none: matches when the Referer header field is absent.
blocked: matches when the Referer header field is present but its value has been deleted or masked by a firewall or proxy server. In this case, the value does not start with "http://" or "https://".
server_names: matches when the Referer header field contains one of the server names of this server.
string: one or more server names (optionally with a URI prefix) that the Referer value is checked against. The wildcard "*" is supported since Nginx 0.5.33.

With the valid_referers command and the $invalid_referer variable, anti-leech protection can be implemented with the Rewrite function. There are two implementation schemes: 1. based on the resource file type; 2. based on the requested directory.

The following is based on the resource file type:


server {
    ...
    listen 80;
    server_name www.111cn.net;
    location ~* ^.+\.(gif|jpg|png|swf|flv|rar|zip)$ {
        ...
        valid_referers none blocked server_names *.111cn.net;
        if ($invalid_referer) {
            # break: serve the placeholder from this location; without it the
            # rewritten .jpg URI would match this location again and loop
            rewrite ^/ /yun_qi_img/default.jpg break;
        }
    }

}

The following configures anti-leech protection based on directories:


server {
    ...
    listen 80;
    server_name www.111cn.net;
    location /file/ {
        ...
        root /server/file/;
        valid_referers none blocked server_names *.myblog.net;

        if ($invalid_referer) {
            rewrite ^/ /yun_qi_img/default.jpg;
        }
    }

}
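
Instead of rewriting to a placeholder image, a simpler variant is to refuse the request outright. A minimal sketch, assuming a 403 response is acceptable for leeched requests:

```nginx
location ~* \.(gif|jpg|png|swf|flv|rar|zip)$ {
    valid_referers none blocked server_names *.111cn.net;
    if ($invalid_referer) {
        return 403;    # leeched request: answer with "Forbidden"
    }
}
```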
