Nginx regular expression matching

Source: Internet
Author: User
Tags php source code regular expression

1. nginx configuration basics

1. Regular Expression Matching

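~ Case-sensitive match

~* Case-insensitive match

!~ and !~* Negated case-sensitive and case-insensitive match, respectively

^ Matches the start of the string

$ Matches the end of the string

\ Escape character; it lets you match ., *, ? and other metacharacters literally

* Matches zero or more repetitions of the preceding element, so .* matches any run of characters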

2. File and directory matching

-f and !-f test whether a file exists.

-d and !-d test whether a directory exists.

-e and !-e test whether a file or directory exists.

-x and !-x test whether a file is executable.
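
As a minimal sketch of how these file tests are typically used, the fragment below answers with an error when the path a request maps to is not the expected kind of object. The host name, document root, directory names and status codes are assumptions made for illustration; the server block is meant to sit inside the http context.

```nginx
server {
    listen 80;
    server_name example.com;          # hypothetical host name
    root /var/www/example;            # hypothetical document root

    location /downloads/ {
        # !-f is true when no regular file exists at the mapped path.
        if (!-f $request_filename) {
            return 404;
        }
    }

    location /archive/ {
        # -d is true when the mapped path is a directory.
        if (-d $request_filename) {
            return 403;
        }
    }
}
```

Only return is used inside if here, which keeps the sketch clear of the side effects that other directives can have inside an if block.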

Example location blocks:

location = / {
    # Matches only the request for /.
}

location / {
    # Matches any request, because every request URI starts with /. However, regular expression rules and longer prefix rules take precedence.
}

location ^~ /images/ {
    # Matches any request starting with /images/ and stops searching. No regular expression is tested.
}

location ~* \.(gif|jpg|jpeg)$ {
    # Matches requests ending in .gif, .jpg or .jpeg.
}

Getting started

1. The if directive
Any Nginx built-in variable can be tested against a regular expression with the if directive, and actions can then be taken depending on the result, as shown below:

The code is as follows:
if ($http_user_agent ~ MSIE) {
    rewrite ^(.*)$ /msie/$1 break;
}

if ($http_cookie ~* "id=([^;]+)(?:;|$)") {
    set $id $1;
}

The ~* and ~ operators match a value against a regular expression:

1. ~ performs a case-sensitive match.
2. ~* performs a case-insensitive match (a pattern written for firefox also matches FireFox).
3. !~ and !~* mean "does not match".
Nginx modules define many built-in variables. The most common ones come from the HTTP core module, and these variables can be matched with regular expressions.
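
A minimal sketch of two more checks on HTTP core variables follows: $request_method with the negated operator and $http_user_agent with the case-insensitive operator. The host name, the list of allowed methods and the user-agent pattern are assumptions chosen only for illustration.

```nginx
server {
    listen 80;
    server_name example.com;     # hypothetical host name

    # !~ : true when the method is NOT one of GET, HEAD or POST.
    if ($request_method !~ ^(GET|HEAD|POST)$ ) {
        return 405;
    }

    # ~* : case-insensitive, so "CURL", "Curl" and "curl" all match.
    if ($http_user_agent ~* "curl|wget") {
        return 403;
    }
}
```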

2. Directives that accept regular expressions
location
View Wikipedia: location
This is probably the directive most often used with regular expressions:

The code is as follows:

location ~ .*\.php?$ {
    fastcgi_pass 127.0.0.1:9000;
    fastcgi_index index.php;
    fastcgi_param SCRIPT_FILENAME /data/wwwsite/test.com/$fastcgi_script_name;
    include fcgi.conf;
}

Almost every LEMP host has a block like the one above. The matching rules are similar to those of the if directive, but location has three additional modifiers, ^~, = and @, and it has no negation operator !. The three modifiers work as follows:

1. ^~ is followed by a string. Once that string matches, Nginx stops testing regular expressions (in the location directive, a regular expression match otherwise takes priority). Take location ^~ /images/ as an example: you want special handling for the /images/ directory, such as adding an expires header and hotlink protection, while images outside this directory should only get the expires header. The latter is handled by another location, for example location ~* \.(gif|jpg|jpeg)$. When a request for /images/1.jpg arrives, how does Nginx decide which location handles it? The result depends on the ^~ modifier. If you write location /images/, Nginx matches the request to location ~* \.(gif|jpg|jpeg)$, which is not what you want. With ^~ added, once the /images/ string matches, Nginx stops searching the locations that use regular expressions.
2. = requests an exact match. For example, location = / matches only requests whose URI is /. A request for /index.html is looked up as usual and does not match it. You can of course write both location = / and location /, so that /index.html matches the latter. If your site receives many requests for /, this speeds up their handling.
3. @ names a location, that is, defines a custom location. It cannot be reached from outside and is used only for internal redirects generated by Nginx, mainly through error_page and try_files.
Note that none of these three modifiers can be followed by a regular expression. The configuration check passes without any warning, but the pattern is not matched as a regular expression.
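
The three modifiers can be seen side by side in a sketch like the following. The host name, document root, 30-day expiry, the name @fallback and the backend address are all assumptions chosen for illustration; the server block belongs inside the http context.

```nginx
server {
    listen 80;
    server_name example.com;              # hypothetical host name
    root /var/www/example;                # hypothetical document root

    # "=" : used only when the URI is exactly "/".
    location = / {
        index index.html;
    }

    # "^~" : prefix match that suppresses the regular expression locations,
    # so /images/1.jpg is handled here and not by the regex block below.
    location ^~ /images/ {
        expires 30d;
        valid_referers none blocked example.com;
        if ($invalid_referer) {
            return 403;                   # simple hotlink protection
        }
    }

    # Images anywhere else only get the expires header.
    location ~* \.(gif|jpg|jpeg)$ {
        expires 30d;
    }

    # Everything else: try the file, then the directory, then the
    # named location.
    location / {
        try_files $uri $uri/ @fallback;
    }

    # "@" : named location, reachable only through internal redirects.
    location @fallback {
        proxy_pass http://127.0.0.1:8080; # hypothetical backend
    }
}
```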
To sum up, the location directive selects a match in the following order:

1. Locations with the = modifier are checked first. If the request URI matches one exactly, that location's configuration is used.
2. Prefix string matching is performed and the longest match is remembered. If that location has the ^~ modifier, the search stops and its configuration is used.
3. Regular expression locations are tested in the order in which they appear in the configuration file. The first one that matches is used.
4. If a regular expression matched the request URI, the location of that regular expression is used. Otherwise, the result of step 2 is used.
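
To make the order concrete, here is a small sketch with one location of each kind, followed by the block Nginx would select for a few sample requests. The letters A to D, the port and the sample URIs are labels invented for this illustration; each block returns its letter so the choice can be observed with any HTTP client.

```nginx
server {
    listen 8080;                      # hypothetical port

    location = / {                    # A: exact match
        return 200 "A\n";
    }

    location / {                      # B: plain prefix match
        return 200 "B\n";
    }

    location ^~ /images/ {            # C: prefix match that skips regexes
        return 200 "C\n";
    }

    location ~* \.(gif|jpg|jpeg)$ {   # D: regular expression
        return 200 "D\n";
    }
}

# Request             Selected  Reason
# /                   A         step 1: exact match
# /index.html         B         no regex matches, longest prefix is "/"
# /documents/1.JPG    D         step 3: the regex wins over the prefix "/"
# /images/1.jpg       C         step 2: "^~" stops the search
```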
server_name
View Wikipedia: server_name
server_name configures virtual hosts based on domain names or IP addresses. This directive also accepts regular expressions, but note that a regular expression here takes no other modifier and must start with ~:

The code is as follows:
server {
    server_name www.example.com ~^www\d+\.example\.com$;
}

Regular expressions in the server_name directive may contain captures that can be referenced later. For advanced usage, see the article on using regular expressions in server_name.
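
As a hedged sketch of what such a reference can look like, the block below uses a named capture in server_name and reuses it as a variable. The capture name domain and the /sites/ directory are assumptions for illustration, and named captures require a PCRE build that supports them.

```nginx
server {
    listen 80;

    # Matches example.com and www.example.com alike; whatever follows the
    # optional "www." is stored in $domain.
    server_name ~^(www\.)?(?<domain>.+)$;

    # Each site is served from its own directory, e.g. /sites/example.com.
    root /sites/$domain;
}
```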

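fastcgi_split_path_info
View Wikipedia: fastcgi_split_path_info
This directive splits the request URI so that SCRIPT_FILENAME (SCRIPT_NAME) and PATH_INFO can be set according to the CGI standard. Its regular expression divides the URI into two parts through two captures: the first becomes $fastcgi_script_name and the second becomes $fastcgi_path_info. For example: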

The code is as follows:
location ~ ^.+\.php {
    (...)
    fastcgi_split_path_info ^(.+\.php)(.*)$;
    fastcgi_param SCRIPT_FILENAME /path/to/php$fastcgi_script_name;
    fastcgi_param PATH_INFO $fastcgi_path_info;
    fastcgi_param PATH_TRANSLATED $document_root$fastcgi_path_info;
    (...)
}

The first capture (.+\.php), prefixed with /path/to/php, is used as SCRIPT_FILENAME, and the second capture (.*) becomes PATH_INFO. For example, if the full request URI is /show.php/article/0001, then in the example above SCRIPT_FILENAME is /path/to/php/show.php and PATH_INFO is /article/0001.
This directive is commonly used with frameworks (such as CodeIgniter) that produce clean URIs through PATH_INFO.

gzip_disable
View Wikipedia: gzip_disable
Uses a regular expression, matched against the User-Agent header, to specify the browsers for which gzip compression is disabled.

gzip_disable "msie6";

rewrite
View Wikipedia: rewrite
This directive is also used very often. It takes a complete regular expression, which may contain captures:

The code is as follows:

rewrite "/photos/([0-9]{2})([0-9]{2})([0-9]{2})" /path/to/photos/$1/$1$2/$1$2$3.png;

It is usually used together with if:

The code is as follows:

if ($host ~* www\.(.*)) {
    set $host_without_www $1;
    rewrite ^(.*)$ http://$host_without_www$1 permanent; # $1 is '/foo', not 'www.mydomain.com/foo'
}
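
The same www-stripping redirect can also be written without if by putting the regular expression in server_name, which ties it back to the server_name section above. This is only a sketch: the capture name bare_host is hypothetical, and plain HTTP on port 80 is assumed.

```nginx
server {
    listen 80;

    # Any host name beginning with "www." is handled here; the remainder
    # is captured as $bare_host.
    server_name ~^www\.(?<bare_host>.+)$;

    # $request_uri keeps the original path and query string.
    return 301 http://$bare_host$request_uri;
}
```

With this block, a request for http://www.mydomain.com/foo would be answered with a permanent redirect to http://mydomain.com/foo.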

How to match Chinese characters with regular expressions in Nginx
First, make sure that PCRE was compiled with the --enable-utf8 option. If it was not, recompile PCRE. You can then use a regular expression like this in the Nginx configuration file: "(*UTF8)^/([\x{4e00}-\x{9fbf}]+)$". Note the quotation marks and the leading (*UTF8), which tells the regular expression to switch to UTF-8 mode.

The code is as follows: Copy code

[Root @ backup conf] # pcretest
PCRE version 8.10

Re>/^ [x {4e00}-x {9fbf}] +/8
Data> test
0: x {6d4b} x {8bd5}
Data> Nginx module reference manual Chinese version
No match
Data> Reference Manual Chinese version
0: x {53c2} x {8003} x {624b} x {518c} x {4e2d} x {6587} x {7248}

A wrong location order causes .php source code to be downloaded instead of executed

Consider the example below (a fragment of a server block, with WordPress installed in several directories):
============================================

The code is as follows:

location / {
    try_files $uri $uri/ /index.html;
}

location /user1/ {
    try_files $uri $uri/ /user1/index.php?q=$uri&$args;
}

location ~* ^/(user2|user3)/ {
    try_files $uri $uri/ /$1/index.php?q=$uri&$args;
}

location ~ \.php$ {
    fastcgi_pass 127.0.0.1:9000;
    fastcgi_index index.php;
    include fastcgi_params;
}

============================================

This nginx.conf fragment does not seem to have any problems, but in fact:
Requests to /user1/ execute the PHP program normally.
Requests to /user2/ or /user3/ do not execute the program; its source code is downloaded instead.

Why? Can you see the difference?
/user1/ is an ordinary prefix location.
/user2/ and /user3/ are matched by a location that uses a regular expression.

The problem is that the location for /user2/ and /user3/ uses a regular expression, and regular expression locations are tested in the order in which they are written. A request such as /user2/index.php matches that location first, so it never reaches location ~ \.php$ and the file is served as plain content. The order of these blocks therefore matters.

The fix is to move the location ~ \.php$ {...} block up so that it comes before the other regular expression location.

Correct code example:
============================================

The code is as follows:

location / {
    try_files $uri $uri/ /index.html;
}

location /user1/ {
    try_files $uri $uri/ /user1/index.php?q=$uri&$args;
}

location ~ \.php$ {
    fastcgi_pass 127.0.0.1:9000;
    fastcgi_index index.php;
    include fastcgi_params;
}

location ~* ^/(user2|user3)/ {
    try_files $uri $uri/ /$1/index.php?q=$uri&$args;
}

============================================

[Note] Ordinary prefix location blocks do not depend on their order in the file. If you run into a similar problem, try adjusting the order of the location blocks that use regular expressions and test again.
