In this article I will introduce how nginx processes location blocks. An nginx configuration file usually contains many locations. Configuration directives fall into three scopes: main, server, and location. These three scopes are not nested in a containment relationship but are independent of one another: a directive whose scope is main cannot be written inside a server or location block, while a single module may define directives in any of the three scopes. In addition, each module has three levels of configuration: main, srv, and loc. A module's main-level configuration is shared by all servers and locations, its srv-level configuration is shared by all locations in that server, and each location has only its own independent loc-level configuration. This is why a module's srv-level and loc-level configurations need to be merged while its main-level configuration does not. This may sound a little convoluted, but once you distinguish main, server, and location as a scope level from main, server, and location as an entity (much like the difference between an adjective and a noun), the nginx configuration is not hard to understand.
In general, when a request URL arrives, nginx resolves it to a location that will handle it. Depending on how the location is configured, this resolution takes one of two forms: string matching or regular expression matching. The simplest way to organize locations would be to keep them in a linked list and traverse it for every URL, but that is far too inefficient for a high-performance server like nginx. Instead, nginx organizes the string-matching locations into a ternary string-sorted tree, and builds it with balance in mind. The source code that implements this is described in detail later in the article.
First, let me introduce the location types and matching rules, using the example from the nginx wiki (http://wiki.nginx.org/HttpCoreModule#location):
location = / {
    # matches the query / only.
    [ configuration A ]
}
location / {
    # matches any query, since all queries begin with /, but regular
    # expressions and any longer conventional blocks will be
    # matched first.
    [ configuration B ]
}
location ^~ /images/ {
    # matches any query beginning with /images/ and halts searching,
    # so regular expressions will not be checked.
    [ configuration C ]
}
location ~* \.(gif|jpg|jpeg)$ {
    # matches any request ending in gif, jpg, or jpeg. However, all
    # requests to the /images/ directory will be handled by
    # Configuration C.
    [ configuration D ]
}
location @named {
    # Such locations are not used during normal processing of requests,
    # they are intended only to process internally redirected requests
    # (for example error_page, try_files).
    [ configuration E ]
}
The example above contains 5 different types of location. The 4th one, with the "~*" prefix, is a location that requires regular expression matching ("~" is the case-sensitive form). When resolving a URL, nginx applies different precedence to these types. The general rules are:
1. Look for an exact string match against the locations with the "=" prefix. If one matches, stop and use that location's configuration.
2. Match the URL as a string against the remaining non-regex locations and remember the longest matching prefix. If that longest match is a location with the "^~" prefix, stop and use it.
3. Try the regex locations in the order in which they appear in the configuration file. If one matches, stop and use that location's configuration; otherwise use the longest string match found in step 2.
For example, the following requests are resolved like this:
1. / -> exactly matches the 1st location, matching stops, Configuration A is used.
2. /some/other/url -> the prefix string matches the 2nd location; regex matching is then tried and obviously fails, so the 2nd location's Configuration B is used.
3. /images/1.jpg -> the prefix string first matches the 2nd location, but the 3rd location's prefix also matches and is the longest string match in the configuration file for this URL. Because that location has the "^~" prefix, no regex matching is performed, and Configuration C is used.
4. /some/other/path/to/1.jpg -> the prefix string again matches the 2nd location; regex matching is then tried and succeeds, so Configuration D is used.
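The three rules and the four requests above can be sketched in C as follows. This is a simplified illustration under stated assumptions, not nginx's implementation: it scans a small array instead of a tree, uses POSIX regex.h instead of PCRE, and every name in it (loc_kind, location, match_location) is hypothetical. The named location E is left out because it takes no part in normal request matching.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    LOC_EXACT,            /* "="  */
    LOC_PREFIX,           /* no modifier */
    LOC_PREFIX_NOREGEX,   /* "^~" */
    LOC_REGEX_ICASE       /* "~*" */
} loc_kind;

typedef struct {
    loc_kind    kind;
    const char *pattern;
    char        config;   /* 'A'..'D', as in the example above */
} location;

static const location locations[] = {
    { LOC_EXACT,          "/",                  'A' },
    { LOC_PREFIX,         "/",                  'B' },
    { LOC_PREFIX_NOREGEX, "/images/",           'C' },
    { LOC_REGEX_ICASE,    "\\.(gif|jpg|jpeg)$", 'D' },
};

/* Returns the configuration letter chosen for uri, or '\0' if none. */
static char match_location(const char *uri)
{
    size_t          n = sizeof(locations) / sizeof(locations[0]);
    const location *best = NULL;
    size_t          best_len = 0, i;

    assert(uri != NULL);

    /* Rules 1 and 2: exact match wins; otherwise keep the longest prefix. */
    for (i = 0; i < n; i++) {
        const location *l = &locations[i];
        size_t          len = strlen(l->pattern);

        if (l->kind == LOC_EXACT) {
            if (strcmp(uri, l->pattern) == 0) {
                return l->config;
            }
            continue;
        }
        if (l->kind == LOC_REGEX_ICASE) {
            continue;
        }
        if (strncmp(uri, l->pattern, len) == 0
            && (best == NULL || len > best_len))
        {
            best = l;
            best_len = len;
        }
    }

    /* Rule 2, second half: "^~" on the longest prefix skips the regexes. */
    if (best != NULL && best->kind == LOC_PREFIX_NOREGEX) {
        return best->config;
    }

    /* Rule 3: regexes in configuration order; first hit wins. */
    for (i = 0; i < n; i++) {
        regex_t re;
        int     hit;

        if (locations[i].kind != LOC_REGEX_ICASE) {
            continue;
        }
        if (regcomp(&re, locations[i].pattern,
                    REG_EXTENDED | REG_ICASE | REG_NOSUB) != 0)
        {
            continue;
        }
        hit = (regexec(&re, uri, 0, NULL, 0) == 0);
        regfree(&re);
        if (hit) {
            return locations[i].config;
        }
    }

    return best != NULL ? best->config : '\0';
}
```

Compiling the regex on every call keeps the sketch short; a real server would compile each pattern once at configuration time.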
The nginx URL matching rules are arguably not ideal. In most cases a URL is first matched against the string locations and only then against the regex locations, whereas if the regex matching were done first and string matching only when no regex matched, the string matching could often be skipped. Either way, let us look at how the nginx source implements this. Before describing the matching process, I will first introduce how nginx organizes locations internally. During the configuration parsing phase, nginx stores the string-matching locations and the regex-matching locations in the following 2 fields of ngx_http_core_loc_conf_t, the loc configuration structure of the HTTP core module:
    ngx_http_location_tree_node_t   *static_locations;
#if (NGX_PCRE)
    ngx_http_core_loc_conf_t       **regex_locations;
#endif
As the types of these 2 fields show, the string-matching locations are organized into a location tree, while the regex-matching locations are just an array.
The location tree and the regex_locations array are built in ngx_http_block():
    /* create location trees */

    for (s = 0; s < cmcf->servers.nelts; s++) {

        clcf = cscfp[s]->ctx->loc_conf[ngx_http_core_module.ctx_index];

        if (ngx_http_init_locations(cf, cscfp[s], clcf) != NGX_OK) {
            return NGX_CONF_ERROR;
        }

        if (ngx_http_init_static_location_trees(cf, clcf) != NGX_OK) {
            return NGX_CONF_ERROR;
        }
    }
After the configuration has been read, all servers are saved in the servers array of the HTTP core module's main configuration, and the locations of each server are saved, in the order in which they appear, in the locations queue of the HTTP core module's loc configuration. The code above first sorts and classifies the locations of each server, which happens in the ngx_http_init_locations() function:
static ngx_int_t
ngx_http_init_locations(ngx_conf_t *cf, ngx_http_core_srv_conf_t *cscf,
    ngx_http_core_loc_conf_t *pclcf)
{
    ...

    locations = pclcf->locations;

    ...

    /*
     * Sort the locations by type. After sorting, the queue is laid out as:
     * (exact_match or inclusive) (sorted; if an exact_match location has
     * the same name as an inclusive one, the exact_match comes first)
     * | regex (not sorted) | named (sorted) | noname (not sorted)
     */
    ngx_queue_sort(locations, ngx_http_cmp_locations);

    named = NULL;
    n = 0;
#if (NGX_PCRE)
    regex = NULL;
    r = 0;
#endif

    for (q = ngx_queue_head(locations);
         q != ngx_queue_sentinel(locations);
         q = ngx_queue_next(q))
    {
        lq = (ngx_http_location_queue_t *) q;

        clcf = lq->exact ? lq->exact : lq->inclusive;

        /*
         * Locations may be nested inside other locations, so the nested
         * locations below the current one are processed recursively.
         */
        if (ngx_http_init_locations(cf, NULL, clcf) != NGX_OK) {
            return NGX_ERROR;
        }

#if (NGX_PCRE)

        if (clcf->regex) {
            r++;

            if (regex == NULL) {
                regex = q;
            }

            continue;
        }

#endif

        if (clcf->named) {
            n++;

            if (named == NULL) {
                named = q;
            }

            continue;
        }

        if (clcf->noname) {
            break;
        }
    }

    if (q != ngx_queue_sentinel(locations)) {
        ngx_queue_split(locations, q, &tail);
    }

    /*
     * If there are named locations, save them in the named_locations
     * array of the owning server.
     */
    if (named) {
        clcfp = ngx_palloc(cf->pool,
                           (n + 1) * sizeof(ngx_http_core_loc_conf_t *));
        if (clcfp == NULL) {
            return NGX_ERROR;
        }

        cscf->named_locations = clcfp;

        for (q = named;
             q != ngx_queue_sentinel(locations);
             q = ngx_queue_next(q))
        {
            lq = (ngx_http_location_queue_t *) q;

            *(clcfp++) = lq->exact;
        }

        *clcfp = NULL;

        ngx_queue_split(locations, named, &tail);
    }

#if (NGX_PCRE)

    /*
     * If there are regex locations, save them in the regex_locations array
     * of the enclosing HTTP core module loc configuration. They are stored
     * in a different place from the named locations because a named
     * location can only exist at server level, while a regex location can
     * also be a nested location.
     */
    if (regex) {

        clcfp = ngx_palloc(cf->pool,
                           (r + 1) * sizeof(ngx_http_core_loc_conf_t *));
        if (clcfp == NULL) {
            return NGX_ERROR;
        }

        pclcf->regex_locations = clcfp;

        for (q = regex;
             q != ngx_queue_sentinel(locations);
             q = ngx_queue_next(q))
        {
            lq = (ngx_http_location_queue_t *) q;

            *(clcfp++) = lq->exact;
        }

        *clcfp = NULL;

        ngx_queue_split(locations, regex, &tail);
    }

#endif

    return NGX_OK;
}
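The ordering that the sort step produces (string locations first and sorted by name with "=" ahead of a same-named prefix location, then regex locations in configuration order, then named, then noname) can be sketched with a standalone comparator. This is only an illustration: demo_location, demo_kind, and demo_cmp_locations are hypothetical names, strcmp() stands in for nginx's file-name comparison, and an explicit order field is used as a tie-breaker because qsort() is not guaranteed to be stable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* The enum order is the order of the groups in the sorted result. */
typedef enum { KIND_STATIC, KIND_REGEX, KIND_NAMED, KIND_NONAME } demo_kind;

typedef struct {
    const char *name;
    demo_kind   kind;
    int         exact;    /* 1 for "=" locations, 0 otherwise */
    int         order;    /* position in the configuration file */
} demo_location;

static int demo_cmp_locations(const void *a, const void *b)
{
    const demo_location *x = a;
    const demo_location *y = b;
    int                  rc;

    /* Different groups: static < regex < named < noname. */
    if (x->kind != y->kind) {
        return (x->kind < y->kind) ? -1 : 1;
    }

    /* Static and named locations are sorted by name. */
    if (x->kind == KIND_STATIC || x->kind == KIND_NAMED) {
        rc = strcmp(x->name, y->name);
        if (rc != 0) {
            return rc;
        }
        /* Same name: the exact ("=") location goes first. */
        if (x->exact != y->exact) {
            return x->exact ? -1 : 1;
        }
    }

    /* Regex and noname locations keep their configuration order. */
    return (x->order > y->order) - (x->order < y->order);
}

static void demo_sort_locations(demo_location *locs, size_t n)
{
    assert(locs != NULL || n == 0);
    qsort(locs, n, sizeof(locs[0]), demo_cmp_locations);
}
```

Keeping the regex group in configuration order matters because, as rule 3 above says, the first regex that matches wins.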
The steps above save the regex-matching locations. The location tree itself is built in ngx_http_init_static_location_trees():
static ngx_int_t
ngx_http_init_static_location_trees(ngx_conf_t *cf,
    ngx_http_core_loc_conf_t *pclcf)
{
    ngx_queue_t                *q, *locations;
    ngx_http_core_loc_conf_t   *clcf;
    ngx_http_location_queue_t  *lq;

    locations = pclcf->locations;

    if (locations == NULL) {
        return NGX_OK;
    }

    if (ngx_queue_empty(locations)) {
        return NGX_OK;
    }

    /* Again because of nested locations, recursion is needed here. */
    for (q = ngx_queue_head(locations);
         q != ngx_queue_sentinel(locations);
         q = ngx_queue_next(q))
    {
        lq = (ngx_http_location_queue_t *) q;

        clcf = lq->exact ? lq->exact : lq->inclusive;

        if (ngx_http_init_static_location_trees(cf, clcf) != NGX_OK) {
            return NGX_ERROR;
        }
    }

    /*
     * Join the inclusive and exact locations in the queue that have the
     * same name: if an exact_match location and an ordinary string
     * location share a name, they are combined into one node and stored
     * under that node's exact and inclusive fields respectively. This
     * step removes duplicates in preparation for building the sorted tree.
     */
    if (ngx_http_join_exact_locations(cf, locations) != NGX_OK) {
        return NGX_ERROR;
    }

    /*
     * Recursively, for each location node, collect the locations whose
     * names have the current node's name as a prefix and save them under
     * the current node's list field.
     */
    ngx_http_create_locations_list(locations, ngx_queue_head(locations));

    /* Recursively build the location tree. */
    pclcf->static_locations = ngx_http_create_locations_tree(cf, locations, 0);
    if (pclcf->static_locations == NULL) {
        return NGX_ERROR;
    }

    return NGX_OK;
}
After the ngx_http_init_location () function processing, the locations queue is already sorted out, the main work of the process of establishing the three-fork tree is ngx_http_create_locations_list () and NGX_HTTP_ Create_locations_tree (), all 2 functions are recursive functions, and the 1th function recursively locations each node in the queue, gets the location prefixed by the name of the current node, and is saved under the current node's list field, for example, For the following location:
location /xyz {
}
location = /xyz {
}
location /xyza {
}
location /xyzab {
}
location /xyzb {
}
location /abc {
}
location /efg {
}
location /efgaa {
}
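After sorting, the order is /abc, /efg, /efgaa, =/xyz, /xyz, /xyza, /xyzab, /xyzb. After the same-named locations are joined, it becomes /abc, /efg, /efgaa, /xyz, /xyza, /xyzab, /xyzb. The result of ngx_http_create_locations_list() is:
/abc
/efg    list: /efgaa
/xyz    list: /xyza (list: /xyzab), /xyzb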
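The grouping can be sketched on a plain sorted array instead of nginx's queues. The sketch below is an assumption-laden simplification: it treats every entry as a prefix (inclusive) location, and demo_group_by_prefix, demo_is_prefix, and DEMO_MAX are hypothetical names. For each name it records the index of the nearest earlier name that is a prefix of it, which is the node whose list it would end up in, or -1 if it stays in the top-level queue.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX 64   /* hypothetical upper bound on the number of names */

static int demo_is_prefix(const char *prefix, const char *s)
{
    return strncmp(prefix, s, strlen(prefix)) == 0;
}

/*
 * names[0..n) must be sorted and free of duplicates. On return,
 * parent[i] is the index of the location whose list holds names[i],
 * or -1 when names[i] remains in the top-level queue.
 */
static void demo_group_by_prefix(const char **names, int n, int *parent)
{
    int stack[DEMO_MAX];   /* chain of currently open prefixes */
    int depth = 0;
    int i;

    assert(n >= 0 && n <= DEMO_MAX);

    for (i = 0; i < n; i++) {
        /* Close every open prefix that does not cover names[i]. */
        while (depth > 0
               && !demo_is_prefix(names[stack[depth - 1]], names[i]))
        {
            depth--;
        }

        parent[i] = (depth > 0) ? stack[depth - 1] : -1;
        stack[depth++] = i;
    }
}
```

Because the input is sorted, all names sharing a prefix are adjacent, so a single pass with a stack is enough; nginx reaches the same grouping by splitting and re-joining its queues.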
Finally, let us look at the ngx_http_create_locations_tree() function:
static ngx_http_location_tree_node_t *
ngx_http_create_locations_tree(ngx_conf_t *cf, ngx_queue_t *locations,
    size_t prefix)
{
    ...

    /* The root node is the middle node of the locations queue. */
    q = ngx_queue_middle(locations);

    lq = (ngx_http_location_queue_t *) q;
    len = lq->name->len - prefix;

    node = ngx_palloc(cf->pool,
                      offsetof(ngx_http_location_tree_node_t, name) + len);
    if (node == NULL) {
        return NULL;
    }

    node->left = NULL;
    node->right = NULL;
    node->tree = NULL;
    node->exact = lq->exact;
    node->inclusive = lq->inclusive;

    node->auto_redirect = (u_char) ((lq->exact && lq->exact->auto_redirect)
                           || (lq->inclusive && lq->inclusive->auto_redirect));

    node->len = (u_char) len;
    ngx_memcpy(node->name, &lq->name->data[prefix], len);

    /* Split the queue at the middle node. */
    ngx_queue_split(locations, q, &tail);

    if (ngx_queue_empty(locations)) {
        /*
         * ngx_queue_split() insures that if left part is empty,
         * then right one is empty too
         */
        goto inclusive;
    }

    /* Build the left subtree from the left half of locations. */
    node->left = ngx_http_create_locations_tree(cf, locations, prefix);
    if (node->left == NULL) {
        return NULL;
    }

    ngx_queue_remove(q);

    if (ngx_queue_empty(&tail)) {
        goto inclusive;
    }

    /* Build the right subtree from the right half of locations. */
    node->right = ngx_http_create_locations_tree(cf, &tail, prefix);
    if (node->right == NULL) {
        return NULL;
    }

inclusive:

    if (ngx_queue_empty(&lq->list)) {
        return node;
    }

    /* Build the tree subtree from the list queue. */
    node->tree = ngx_http_create_locations_tree(cf, &lq->list, prefix + len);
    if (node->tree == NULL) {
        return NULL;
    }

    return node;
}
The location tree node structure, ngx_http_location_tree_node_s:
struct ngx_http_location_tree_node_s {
    ngx_http_location_tree_node_t   *left;
    ngx_http_location_tree_node_t   *right;
    ngx_http_location_tree_node_t   *tree;

    ngx_http_core_loc_conf_t        *exact;
    ngx_http_core_loc_conf_t        *inclusive;

    u_char                           auto_redirect;
    u_char                           len;
    u_char                           name[1];
};
The location tree uses the three fields left, right, and tree. It is a ternary string-sorted tree, and if only the left and right subtrees of a node are considered, it is a balanced tree. Building it resembles building a balanced binary tree: the nodes are sorted first and then inserted in the order in which a binary search would visit them. The tree subtree of a node is itself a balanced sorted tree, built from the list that ngx_http_create_locations_list() attached to that node. Because the node's name is a prefix of every name in its tree subtree, the names in that subtree do not need to store the common prefix, and a lookup that descends into the tree subtree does not need to compare the parent's part of the string again.
The ngx_http_create_locations_tree() function is clearly written. It takes the locations queue as a parameter and returns a ternary tree whose root is the middle node of the queue. The left subtree is the location tree built from the left half of the queue, the right subtree is built from the right half, and the tree subtree is built from the root node's list queue.
The location tree that is finally built looks as follows (for readability, the complete name of each tree node is shown):
/efg
    left:  /abc
    right: /xyz
        tree: /xyzb
            left: /xyza
                tree: /xyzab
    tree:  /efgaa
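The construction and a lookup can be sketched together in standalone C. This is a simplified model under several assumptions, not nginx's code: it works on a sorted array of unique names rather than queues, treats every entry as a prefix location (so there is no exact/inclusive distinction and no auto_redirect), allocates from a small static pool, and uses hypothetical names throughout (demo_node, demo_build, demo_balance, demo_find).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX 64   /* hypothetical upper bound on the number of names */

typedef struct demo_node {
    struct demo_node *left, *right, *tree;
    const char       *full;   /* complete location name */
    const char       *name;   /* name with the inherited prefix stripped */
    size_t            len;    /* length of name */
} demo_node;

static demo_node demo_pool[DEMO_MAX];
static int       demo_used;

static demo_node *demo_build(const char **names, int lo, int hi,
                             size_t prefix);

/* Build a balanced left/right tree over the siblings tops[a..b). */
static demo_node *demo_balance(const char **names, const int *tops,
                               const int *ends, int a, int b, size_t prefix)
{
    demo_node *node;
    int        m;

    if (a >= b) {
        return NULL;
    }

    m = a + (b - a) / 2;                  /* middle sibling is the root */
    assert(demo_used < DEMO_MAX);
    node = &demo_pool[demo_used++];

    node->full = names[tops[m]];
    node->name = node->full + prefix;     /* drop the common prefix */
    node->len = strlen(node->name);
    node->left = demo_balance(names, tops, ends, a, m, prefix);
    node->right = demo_balance(names, tops, ends, m + 1, b, prefix);
    /* Names prefixed by this node go into its tree subtree. */
    node->tree = demo_build(names, tops[m] + 1, ends[m], prefix + node->len);
    return node;
}

/* names[lo..hi) is sorted; all entries share the first prefix bytes. */
static demo_node *demo_build(const char **names, int lo, int hi,
                             size_t prefix)
{
    int tops[DEMO_MAX], ends[DEMO_MAX];
    int k = 0, i = lo, j;

    assert(hi - lo <= DEMO_MAX);

    /* Split the range into siblings, each followed by its own children. */
    while (i < hi) {
        j = i + 1;
        while (j < hi && strncmp(names[i], names[j], strlen(names[i])) == 0) {
            j++;
        }
        tops[k] = i;
        ends[k] = j;
        k++;
        i = j;
    }

    return demo_balance(names, tops, ends, 0, k, prefix);
}

/* Longest-prefix lookup; returns the full name matched, or NULL. */
static const char *demo_find(const demo_node *node, const char *uri)
{
    const char *best = NULL;
    size_t      len = strlen(uri), n;
    int         rc;

    while (node != NULL) {
        n = (len < node->len) ? len : node->len;
        rc = memcmp(uri, node->name, n);

        if (rc != 0) {
            node = (rc < 0) ? node->left : node->right;
        } else if (len > node->len) {
            best = node->full;            /* remember, then look deeper */
            uri += n;
            len -= n;
            node = node->tree;
        } else if (len == node->len) {
            return node->full;            /* whole URI consumed */
        } else {
            node = node->left;            /* URI is shorter than the name */
        }
    }

    return best;
}
```

The lookup shows the benefit described above: after descending into a tree subtree, only the remaining suffix of the URI is compared, never the part already matched by the parent.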
PS: About location modifiers
1. =
This modifier matches the specified pattern exactly. The pattern is limited to a plain string; regular expressions cannot be used here.
Example:
server {
    server_name jb51.net;
    location = /abcd {
        [...]
    }
}
Match cases:
http://jb51.net/abcd # exact match
http://jb51.net/ABCD # also matches if the system running the nginx server is not case sensitive, such as Windows
http://jb51.net/abcd?param1&param2 # matches; the query string (the ?param1&param2 after /abcd) is ignored
http://jb51.net/abcd/ # does not match because of the trailing slash; nginx does not consider this an exact match
http://jb51.net/abcde # does not match because it is not an exact match
2. (None)
The location modifier can be omitted, and nginx will still match the pattern. In this case it matches any URI that begins with the specified pattern. Note that the pattern can only be a plain string; regular expressions cannot be used.
Example:
server {
    server_name jb51.net;
    location /abcd {
        [...]
    }
}
Match cases:
http://jb51.net/abcd # exact match
http://jb51.net/ABCD # also matches if the system running the nginx server is not case sensitive, such as Windows
http://jb51.net/abcd?param1&param2 # matches; the query string (the ?param1&param2 after /abcd) is ignored
http://jb51.net/abcd/ # matches; the trailing slash is still within the matching range
http://jb51.net/abcde # still matches because the URI begins with the pattern
3. ~
This location modifier is case sensitive, and the pattern must be a regular expression.
Example:
server {
    server_name jb51.net;
    location ~ ^/abcd$ {
        [...]
    }
}
Match cases:
http://jb51.net/abcd # exact match
http://jb51.net/ABCD # does not match; ~ is case sensitive
http://jb51.net/abcd?param1&param2 # matches; the query string (the ?param1&param2 after /abcd) is ignored
http://jb51.net/abcd/ # does not match because of the trailing slash; it does not satisfy the regular expression ^/abcd$
http://jb51.net/abcde # does not match the regular expression ^/abcd$
Note: on systems that are not case sensitive, such as Windows, ~ and ~* behave the same; this is mainly due to the operating system.
4. ~*
Similar to ~, but this location modifier is case insensitive. The pattern must be a regular expression.
Example:
server {
    server_name jb51.net;
    location ~* ^/abcd$ {
        [...]
    }
}
Match cases:
http://jb51.net/abcd # exact match
http://jb51.net/ABCD # matches; this is the case-insensitive behavior
http://jb51.net/abcd?param1&param2 # matches; the query string (the ?param1&param2 after /abcd) is ignored
http://jb51.net/abcd/ # does not match because of the trailing slash; it does not satisfy the regular expression ^/abcd$
http://jb51.net/abcde # does not match the regular expression ^/abcd$
5. ^~
Matching is similar to 2. (None): any URI that begins with the specified pattern matches. The difference is that if this match succeeds, nginx stops looking for other location blocks to match, so the regex locations are not checked (see the location matching order above).
6. @
This defines a named location block that cannot be accessed by an external client. It can only be reached through nginx internal configuration directives such as try_files or error_page.