A Brief Analysis of Variables in Nginx Configuration Files

Source: Internet
Author: User

Nginx's configuration file uses a miniature programming language, and many real-world Nginx configuration files are effectively small programs. Leaving aside whether it is "Turing complete", its design, at least as far as I have observed, is heavily influenced by both Perl and the Bourne shell. Compared with the configuration notations of other web servers such as Apache and Lighttpd, this is arguably one of Nginx's distinguishing features. Since it is a programming language, it naturally has "variables" (odd functional languages such as Haskell being the exception, of course).
Readers familiar with imperative programming languages such as Perl, the Bourne shell, and C/C++ know that a variable is simply a container for a "value". In many programming languages a value can be a number such as 3.14, a string such as hello world, or even a complex data structure such as an array or a hash table. In the Nginx configuration, however, a variable can hold only one type of value, because only one type exists: the string.
For example, our nginx.conf file has the following line of configuration:

set $a "hello world";

We used the set configuration directive of the standard ngx_rewrite module to assign a value to the variable $a; specifically, we assigned it the string hello world.
We see that the Nginx variable name is preceded by a $ symbol, which the notation requires. All Nginx variables must carry the $ prefix when referenced in the Nginx configuration file. This notation is similar to that of languages such as Perl and PHP.
Although a variable prefix such as $ may make orthodox Java and C# programmers uncomfortable, the benefit of this notation is obvious: variables can be embedded directly in string constants to construct new strings.

set $a hello;
set $b "$a, $a";

Here we construct the value of $b from the value of the existing variable $a, so after these two directives have run in sequence, the value of $a is hello and the value of $b is hello, hello. This technique is known in the Perl world as "variable interpolation", and it makes a dedicated string concatenation operator unnecessary. We will use the same term here.
Let's take a look at a more complete configuration example:

server {
  listen 8080;

  location /test {
    set $foo hello;
    echo "foo: $foo";
  }
}

This example omits the outermost http and events configuration blocks of nginx.conf. Requesting the /test interface from the command line with the HTTP client curl, we get:

$ curl 'http://localhost:8080/test'
foo: hello

Here we use the echo configuration directive of the third-party ngx_echo module to output the value of the $foo variable as the response body of the current request.
We see that the parameters of the echo directive also support variable interpolation. Note, however, that not all configuration directives do: whether a directive's parameters allow "variable interpolation" depends on the module that implements the directive.
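For readers who do not have the third-party ngx_echo module installed, roughly the same experiment can be sketched with the return directive of the standard ngx_rewrite module, whose text parameter also accepts variables. This is only a sketch for comparison and is not the configuration used in the rest of this article. The \n escape is there because, unlike echo, return does not append a newline to the body.

```nginx
server {
  listen 8080;

  location /test {
    set $foo hello;
    # return is a standard ngx_rewrite directive; its text argument is
    # interpolated, so $foo is replaced by its value in the response body.
    return 200 "foo: $foo\n";
  }
}
```

Both set and return belong to ngx_rewrite, so they run in the order in which they are written inside the location.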
If we want to output a string containing a literal dollar character ($) with the echo directive, is there a way to escape the special $ character? The answer is no (at least as of Nginx 1.0.10, the stable version at the time of writing). Fortunately, we can work around this restriction: use a configuration directive from a module that does not support "variable interpolation" to construct an Nginx variable whose value is $, and then use that variable in echo. Look at the following example:

geo $dollar {
  default "$";
}

server {
  listen 8080;

  location /test {
    echo "This is a dollar sign: $dollar";
  }
}

The test results are as follows:

$ curl 'http://localhost:8080/test'
This is a dollar sign: $

Here we use the geo configuration directive provided by the standard ngx_geo module to assign the string "$" to the variable $dollar, so that we can simply refer to $dollar wherever a dollar sign is needed. In fact, the most common use of the ngx_geo module is to assign a value to an Nginx variable based on the client's IP address; here we merely borrow it to assign the dollar sign to $dollar "unconditionally".
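To illustrate the more typical usage just mentioned, here is a minimal sketch that sets a variable according to the client address. The variable name $is_internal and the address ranges are assumptions made up for this illustration; they do not come from the original example.

```nginx
# Hypothetical example: $is_internal and the ranges below are illustrative only.
geo $is_internal {
  default      0;   # any client not matched by a more specific entry
  127.0.0.1    1;   # a single address
  10.0.0.0/8   1;   # a network in CIDR notation
}

server {
  listen 8080;

  location /test {
    echo "internal: $is_internal";
  }
}
```

With this sketch, a request sent to http://127.0.0.1:8080/test from the same machine would be expected to print internal: 1, while a client outside the listed ranges would get internal: 0.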
Variable interpolation has one more special case: when the referenced variable name is immediately followed by characters that could themselves be part of a variable name (letters, digits, or underscores), we need a special notation to remove the ambiguity. For example:

server {
  listen 8080;

  location /test {
    set $first "hello ";
    echo "${first}world";
  }
}

Here, we reference the variable $first in the parameter of the echo directive, immediately followed by the word world. If we wrote "$firstworld" directly, Nginx's "variable interpolation" engine would read it as a reference to the variable $firstworld. To resolve this, Nginx's string notation allows the variable name after $ to be enclosed in curly braces, as in ${first} here. The output of the example above is:

$ curl 'http://localhost:8080/test'
hello world

The set directive (like the geo directive mentioned earlier) does more than assign a value: it also has the side effect of creating the Nginx variable, which happens automatically when the variable being assigned does not yet exist. In the earlier set $a example, for instance, if the variable $a has not been created yet, the set directive creates this user variable automatically. If we use a variable's value without creating it, Nginx reports an error. For example:

server {
  listen 8080;

  location /bad {
    echo $foo;
  }
}

The Nginx server then refuses to load the configuration:

[emerg] unknown "foo" variable

Yes, we cannot even start the service!
Interestingly, Nginx variables are created and assigned at completely different times. A variable can be created only while Nginx loads its configuration, that is, at startup, whereas assignment happens only while a request is actually being processed. This means that using a variable without creating it makes startup fail, and it also means that we cannot create new Nginx variables dynamically at request time.
Once an Nginx variable has been created, its name is visible throughout the entire Nginx configuration, even across the server configuration blocks of different virtual hosts. Let's look at an example:

server {
  listen 8080;

  location /foo {
    echo "foo = [$foo]";
  }

  location /bar {
    set $foo 32;
    echo "foo = [$foo]";
  }
}

Here we create the variable $foo with the set directive in location /bar. The variable is therefore visible throughout the configuration file, so we can reference it directly in location /foo without Nginx reporting an error.
The following are the results of accessing the two interfaces with the curl tool on the command line:

$ curl 'http://localhost:8080/foo'
foo = []
$ curl 'http://localhost:8080/bar'
foo = [32]
$ curl 'http://localhost:8080/foo'
foo = []

As this example shows, because the set directive appears in location /bar, the assignment is performed only for requests that access /bar. When the /foo interface is requested, we always get an empty $foo, because a user variable that has never been assigned yields an empty string when it is output.
Another important property visible in this example is that, although the name of an Nginx variable is visible across the whole configuration, each request has its own independent copy of all variables, or rather of the containers that hold their values, and these copies do not interfere with one another. For example, when we request the /bar interface, $foo is assigned the value 32, but this does not affect the value of $foo in a subsequent request to /foo (it is still empty!), because each request has its own copy of $foo.
One of the most common mistakes Nginx newcomers make is to treat Nginx variables as something shared globally among requests, as "global variables". In fact, the lifetime of an Nginx variable cannot cross request boundaries.
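The claim that a variable name is visible even across virtual hosts can be sketched with two server blocks. The second port (8081) and the location names /set and /get are hypothetical choices for this illustration only.

```nginx
server {
  listen 8080;

  location /set {
    set $foo 32;            # creates $foo at load time, assigns it per request
    echo "foo = [$foo]";
  }
}

server {
  listen 8081;              # hypothetical second virtual host

  location /get {
    # $foo was created by the set directive in the other server block,
    # so referencing it here does not trigger the unknown variable error.
    echo "foo = [$foo]";
  }
}
```

Under the rules described above, this configuration should load successfully, and a request to http://localhost:8081/get should print foo = [], because the variable exists but no assignment ever runs for that request.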

Another common misconception about Nginx variables is that the lifetime of a variable's container is bound to the location configuration block. It is not. Let's look at an example that involves an "internal jump":

server {
  listen 8080;
  location /foo {
    set $a hello;
    echo_exec /bar;
  }
  location /bar {
    echo "a = [$a]";
  }
}

Here in location /foo, we use the echo_exec directive provided by the third-party module ngx_echo to initiate an "internal jump" to location /bar. An "internal jump" means moving from one location to another inside the server while a request is being processed. This differs from an "external jump" using HTTP status codes 301 and 302, because the latter requires the cooperation of the HTTP client, and on the client side the user can see the request URL change in the browser's address bar. An internal jump is like the exec command in the Bourne shell (or Bash): there is no return. Another similar example is the goto statement in C.
Because it is an internal jump, the request being processed is still the original one; only the current location has changed, so the same set of Nginx variable containers remains in use. In the example above, if we request the /foo interface, the workflow is as follows: first the set directive in location /foo assigns the string hello to $a, then echo_exec initiates an internal jump into location /bar, and finally the value of $a is output. Because $a is still the original $a, we can expect this line of output to contain hello. The test confirms this:

$ curl localhost:8080/foo
a = [hello]

But if we access the /bar interface directly from the client, we get an empty $a, because it relies on location /foo to initialize $a. From the example above we see that a request uses the same copy of the Nginx variables throughout its processing, even when it passes through several different location configuration blocks. This is also our first encounter with the concept of an "internal jump". It is worth mentioning that the rewrite directive of the standard ngx_rewrite module can also initiate an "internal jump"; the example above can be rewritten with rewrite as follows:

server {
  listen 8080;
  location /foo {
    set $a hello;
    rewrite ^ /bar;
  }
  location /bar {
    echo "a = [$a]";
  }
}


The effect is exactly the same as with echo_exec. We will cover more uses of the rewrite directive later, such as issuing "external jumps" like 301 and 302.
From the examples above, we can see that the lifetime of an Nginx variable's value container is bound to the request being processed, not to the location.
All the variables we have seen so far are Nginx variables created implicitly by the set directive. These are commonly called "user-defined variables", or simply "user variables". Alongside them there are naturally "predefined variables" provided by the Nginx core and the various Nginx modules, also called "built-in variables".
The most common use of Nginx built-in variables is to obtain information about the request or the response. For example, the built-in variable $uri provided by the ngx_http_core module gives the URI of the current request (decoded, and without request parameters), while $request_uri gives the original URI of the request (undecoded, and including the request parameters). Take a look at the following example:

location /test {
  echo "uri = $uri";
  echo "request_uri = $request_uri";
}

For simplicity, even the server configuration block is omitted here; as in all the previous examples, we are still listening on port 8080. In this example we output the values of $uri and $request_uri to the response body. Let's test the /test interface with several different requests:

$ curl 'http://localhost:8080/test'
uri = /test
request_uri = /test
$ curl 'http://localhost:8080/test?a=3&b=4'
uri = /test
request_uri = /test?a=3&b=4
$ curl 'http://localhost:8080/test/hello%20world?a=3&b=4'
uri = /test/hello world
request_uri = /test/hello%20world?a=3&b=4
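Because $request_uri keeps the original, undecoded path together with the request parameters, it is a natural fit when the client should be sent to the same address on another host. The following is a minimal sketch using the return directive of the standard ngx_rewrite module; the host name www.example.com is a placeholder and does not come from this article.

```nginx
server {
  listen 8080;

  # Hypothetical target host. $request_uri carries the undecoded path and
  # the query string, so /test/hello%20world?a=3&b=4 is passed on unchanged.
  return 301 http://www.example.com$request_uri;
}
```

Using $uri here instead would send the decoded path and drop the request parameters, which is usually not what a redirect should do.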

Another especially common built-in variable is not a single variable at all but a family with infinitely many members: all variables whose names begin with arg_. We will call this the $arg_XXX variable group. One example is $arg_name, whose value is the raw, undecoded value of the URI parameter named name in the current request. Let's look at a more complete example:

location /test {
  echo "name: $arg_name";
  echo "class: $arg_class";
}

The/test interface is then requested using a variety of parameter combinations on the command line:

$ curl 'http://localhost:8080/test'
name: 
class: 
$ curl 'http://localhost:8080/test?name=Tom&class=3'
name: Tom
class: 3
$ curl 'http://localhost:8080/test?name=hello%20world&class=9'
name: hello%20world
class: 9

In fact, $arg_name matches not only the parameter name but also NAME, Name, and so on:

$ curl 'http://localhost:8080/test?NAME=Marry'
name: Marry
class: 
$ curl 'http://localhost:8080/test?Name=Jimmy'
name: Jimmy
class: 

Before matching, Nginx automatically converts the parameter names in the original request to all lowercase.
If you want to decode sequences such as %XX in URI parameter values, you can use the set_unescape_uri directive provided by the third-party ngx_set_misc module:

location /test {
  set_unescape_uri $name $arg_name;
  set_unescape_uri $class $arg_class;
  echo "name: $name";
  echo "class: $class";
}

Now let's look at the effect:

$ curl 'http://localhost:8080/test?name=hello%20world&class=9'
name: hello world
class: 9

The space has been decoded!
As this example shows, the set_unescape_uri directive, like set, also creates Nginx variables automatically. We will introduce the ngx_set_misc module in detail later.
Variables like $arg_XXX have an unlimited number of possible names, so they do not correspond to any container that holds a value. Moreover, such variables are handled specially inside the Nginx core, and third-party Nginx modules cannot provide built-in variables with this kind of magic.
The Nginx core has several other variable groups similar to $arg_XXX: $cookie_XXX for cookie values, $http_XXX for request headers, and $sent_http_XXX for response headers. We will not go through them one by one here; interested readers can refer to the official documentation of the ngx_http_core module.
Note that many built-in variables are read-only, such as $uri and $request_uri, which we have just introduced. Assigning to read-only variables should be strictly avoided because it has unexpected consequences, for example:

location /bad {
  set $uri /blah;
  echo $uri;
}

This problematic configuration makes Nginx report a baffling error at startup:

[emerg] the duplicate "uri" variable in ...

Attempting to overwrite some other read-only built-in variables, such as the $arg_XXX variables, may even crash the process in certain Nginx versions.
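Reading the other variable groups named earlier is safe and works just like reading $arg_XXX. A minimal sketch follows; the cookie name session is a hypothetical choice for illustration, while $http_user_agent corresponds to the User-Agent request header.

```nginx
location /test {
  # $http_XXX: request header XXX, lowercased with dashes turned into underscores
  echo "user agent: $http_user_agent";
  # $cookie_XXX: value of the cookie named XXX ("session" is illustrative)
  echo "session cookie: $cookie_session";
}
```

It could be exercised with something like curl -A 'demo' -b 'session=abc' 'http://localhost:8080/test', where -A sets the User-Agent header and -b sends the cookie.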
Some built-in variables do support being overwritten; one example is $args. When read, this variable returns the URL parameter string of the current request (the part after the question mark in the request URL, if any); when assigned, it modifies that parameter string directly. Let's take a look at an example:

location /test {
  set $orig_args $args;
  set $args "a=3&b=4";
  echo "original args: $orig_args";
  echo "args: $args";
}

Here we first save the original URL parameter string in the $orig_args variable, then overwrite the $args variable to modify the current URL parameter string, and finally use the echo directive to output the values of $orig_args and $args. Let's test the /test interface:

$ curl 'http://localhost:8080/test'
original args: 
args: a=3&b=4
$ curl 'http://localhost:8080/test?a=0&b=1&c=2'
original args: a=0&b=1&c=2
args: a=3&b=4

In the first test we did not supply any URL parameter string, so the $orig_args variable is empty in the output. In both the first and the second test, whether or not we supplied a URL parameter string, location /test forcibly rewrote it to a=3&b=4.
It is worth pointing out that $args, like $arg_XXX, does not use a value container of its own. When we read $args, Nginx runs a small piece of code that reads the data from the place where the Nginx core stores the current URL parameter string; when we overwrite $args, Nginx runs another small piece of code that writes to that same place. Other parts of Nginx also read from that place whenever they need the current URL parameter string, so our changes to $args affect all of them. Let's take a look at an example:

location /test {
  set $orig_a $arg_a;
  set $args "a=5";
  echo "original a: $orig_a";
  echo "a: $arg_a";
}

Here we first store the value of the built-in variable $arg_a, that is, the value of the URL parameter a in the original request, in the user variable $orig_a. We then assign to the built-in variable $args to rewrite the parameter string of the current request to a=5, and finally use the echo directive to output the values of $orig_a and $arg_a. Because modifying the built-in variable $args directly changes the URL parameter string of the current request, the built-in $arg_XXX variables naturally change with it. The test results confirm this:

$ curl 'http://localhost:8080/test?a=3'
original a: 3
a: 5

We see that because the URL parameter string of the original request is a=3, the initial value of $arg_a is 3; but after we overwrite the $args variable, the URL parameter string is forcibly changed to a=5, so the final value of $arg_a automatically becomes 5. Let's look at another example, in which modifying the $args variable affects the standard HTTP proxy module ngx_proxy:

server {
  listen 8080;
  location /test {
    set $args "foo=1&bar=2";
    proxy_pass http://127.0.0.1:8081/args;
  }
}
server {
  listen 8081;
  location /args {
    echo "args: $args";
  }
}


Here we have defined two virtual hosts in the http configuration block. The first virtual host listens on port 8080, and its /test interface unconditionally changes the URL parameter string of the current request to foo=1&bar=2 by overwriting the $args variable. The /test interface then sets up a reverse proxy with the proxy_pass directive of the ngx_proxy module, pointing to the HTTP service /args on port 8081 of the local machine. By default, when the ngx_proxy module forwards an HTTP request to the remote HTTP service, it automatically forwards the URL parameter string of the current request as well. The HTTP service on local port 8081 is provided by the second virtual host we defined. In location /args of the second virtual host, we use the echo directive to output the URL parameter string of the current request, in order to check which parameter string the /test interface actually forwarded through the ngx_proxy module. Let's access the /test interface of the first virtual host:

$ curl 'http://localhost:8080/test?blah=7'
args: foo=1&bar=2

We see that although the request itself supplied the URL parameter string blah=7, location /test forcibly rewrote it to foo=1&bar=2. The proxy_pass directive then forwarded the rewritten parameter string to the /args interface configured on the second virtual host, which output its URL parameter string. So the assignment to the $args variable also successfully affects the behavior of the ngx_proxy module.
The special code that runs when a variable is read is called a "get handler" in Nginx, and the special code that runs when a variable is overwritten is called a "set handler". Different Nginx modules typically prepare different get and set handlers for their variables, which is what makes these variables behave so magically. This technique is not uncommon in the computing world. In object-oriented programming, for example, the designer of a class generally does not expose its member variables directly to the class's users, but instead provides two methods for reading and writing each member variable. Such methods are often called "accessors".
