Extending Nginx with Lua

Source: Internet
Author: User

I. Overview

Nginx is a high-performance, lightweight web server built for high concurrency. Apache is still the most widely deployed web server, but among the world's top 1,000 web servers Nginx has a 22.4% share. Nginx uses a modular architecture: most of the functionality in the official distribution is provided by modules, such as the HTTP module and the mail module.

By writing extension modules you can turn Nginx into an all-purpose application server and implement features such as login checks, JS merging, and even database access in the front-end reverse proxy layer. However, Nginx modules must be written in C and must follow a series of complex rules. Above all, writing a C module requires familiarity with the Nginx source code, which daunts many developers.

agentzh and chaoslawful of Taobao developed the ngx_lua module, which embeds the Lua interpreter in Nginx so that business logic can be implemented in Lua scripts. Because Lua is compact, fast, and has built-in coroutines, this greatly reduces the cost of implementing business logic while preserving the ability to serve highly concurrent traffic. This article introduces ngx_lua and some of the problems I encountered while using it in projects.

II. Preparation

First, let us look at a few features of Nginx, which will make it easier to explain the related features of ngx_lua.


Nginx Process Model

Nginx uses a multi-process model with a single master and multiple workers. The master handles external signals, reads the configuration file, and initializes the workers. Each worker process uses a single-threaded, non-blocking event model (the event loop) to listen on ports and to process and respond to client requests; it also handles signals from the master. Because a worker handles every event on a single thread, the main loop must never block; otherwise the worker's responsiveness drops sharply.

How Nginx processes HTTP requests

On the surface, when Nginx processes a client request it first selects a server block based on the Host request header and the IP address and port of the connection, then finds the location that matches the requested URI, and that location handles the request. Internally, Nginx divides the processing of a request into several phases that run in a fixed order: NGX_HTTP_POST_READ_PHASE comes first and NGX_HTTP_LOG_PHASE comes last.

<span style= "font-size:10px;" >
NGX_HTTP_POST_READ_PHASE,       // 0  Reads the request
NGX_HTTP_SERVER_REWRITE_PHASE,  // 1  Handles rewrites in the global (server block) scope
NGX_HTTP_FIND_CONFIG_PHASE,     // 2  Finds the matching location by URI, then sets the request's (r) fields from its loc_conf
NGX_HTTP_REWRITE_PHASE,         // 3  Handles rewrites in the location scope
NGX_HTTP_POST_REWRITE_PHASE,    // 4  Post-rewrite: validation and cleanup before handing over to later modules
NGX_HTTP_PREACCESS_PHASE,       // 5  Coarse-grained access handling, such as rate limiting
NGX_HTTP_ACCESS_PHASE,          // 6  Fine-grained access handling, such as access control and authorization; the resulting action is usually left to later modules
NGX_HTTP_POST_ACCESS_PHASE,     // 7  Acts on the access code produced by the access modules above
NGX_HTTP_TRY_FILES_PHASE,       // 8  The try_files directive: takes several paths and falls through to the next one when the current path is not found
NGX_HTTP_CONTENT_PHASE,         // 9  Content handlers
NGX_HTTP_LOG_PHASE              // 10 Logging

Handlers can be registered in each phase, and processing a request means running the handlers registered in each phase in turn. A configuration directive provided by an Nginx module normally registers and runs in only one of these phases.

For example, the set directive belongs to the rewrite module and runs in the rewrite phase, while deny and allow run in the access phase.
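
To make the phase model concrete, here is a minimal sketch in plain Lua. It is not Nginx code: the phase names are a subset of the list above, and `register` and `run_request` are hypothetical helpers invented for this illustration. It only shows that handlers run in phase order, regardless of the order in which they were registered.

```lua
-- Phase names taken from the list above (a subset, in execution order).
local PHASES = { "rewrite", "access", "content", "log" }

local handlers = {}            -- phase name -> list of handler functions
for _, phase in ipairs(PHASES) do
    handlers[phase] = {}
end

-- Hypothetical helper: attach a handler to one phase.
local function register(phase, fn)
    table.insert(handlers[phase], fn)
end

-- Hypothetical helper: run every handler, phase by phase.
local function run_request(request)
    local trace = {}
    for _, phase in ipairs(PHASES) do
        for _, fn in ipairs(handlers[phase]) do
            table.insert(trace, fn(request))
        end
    end
    return table.concat(trace, " -> ")
end

-- Registration order is deliberately scrambled.
register("content", function(r) return "content:echo" end)
register("access",  function(r) return "access:allow" end)
register("rewrite", function(r) return "rewrite:set" end)

print(run_request({ uri = "/demo" }))
-- prints: rewrite:set -> access:allow -> content:echo
```

The "log" phase has no handler here, so it contributes nothing to the trace, much as a phase with no registered module is simply passed through.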
Sub-request (subrequest)

There are actually two kinds of "request" in the Nginx world: the "main request" and the "sub-request". A main request is one initiated by an HTTP client outside Nginx; accessing Nginx from a browser, for example, creates a main request. A sub-request is a cascaded request that Nginx issues internally on behalf of the request it is currently processing. A sub-request looks like an HTTP request, but it has nothing to do with the HTTP protocol or even with network communication. It is an abstract call inside Nginx that lets you break the task of a main request into several finer-grained internal requests, access multiple location interfaces concurrently or serially, and let those locations cooperate to complete the main request. The concept is relative: any sub-request can issue further sub-requests of its own, and can even recurse (that is, call itself).

When a request issues a sub-request, Nginx terminology calls the former the "parent request" of the latter.

location /main {
    echo_location /foo;    # echo_location sends a sub-request to the given location
    echo_location /bar;
}
location /foo {
    echo foo;
}
location /bar {
    echo bar;
}

Output:
$ curl 'localhost/main'
foo
bar

Here the main location issues two sub-requests, to /foo and /bar, much like making two function calls.

Sub-requests communicate within the same virtual host, so when the Nginx core executes one it only calls a few C functions, with no network or UNIX socket communication involved. This is why sub-requests are so efficient.

Coroutine

Coroutines resemble threads, but differ from them in the following ways:

1. A coroutine is not an OS thread, so creating one and switching between them costs less than with threads.

2. A coroutine has its own stack, local variables, and so on, just like a thread, but its stack is simulated in user space, so creation and switching are very cheap.

3. A multithreaded program runs several threads concurrently, so several control flows execute at the same moment. Coroutines emphasize cooperation: another coroutine gets to run only when the current one voluntarily gives up execution, so at any moment exactly one of the coroutines is running.

4. Because only one coroutine runs at a time, access to a critical section needs no lock, whereas multithreaded code must lock.

5. A multithreaded program is not controllable because it has multiple control flows, whereas the execution order of coroutines is defined by the developer and is therefore controllable.

Each Nginx worker process sits on top of an event model such as epoll or kqueue and wraps it into coroutines, with one coroutine handling each request. This matches Lua's built-in coroutine model exactly, so although executing Lua in ngx_lua adds some overhead relative to C, high concurrency is still guaranteed.
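
The points in the list above can be tried out in standard Lua, without Nginx. The following is a minimal sketch; the `worker` function and the round-robin loop are made up for this illustration. Two coroutines update a shared counter without any lock, and the interleaving is fully determined by where each one yields.

```lua
-- Two cooperative tasks share one counter without any lock:
-- only one coroutine runs at a time, and it keeps running until it yields.
local counter = 0
local log = {}

local function worker(name, steps)
    return coroutine.create(function()
        for i = 1, steps do
            counter = counter + 1            -- no lock needed
            table.insert(log, name .. i)
            coroutine.yield()                -- give up control voluntarily
        end
    end)
end

local tasks = { worker("a", 2), worker("b", 2) }

-- A trivial round-robin scheduler: resume each live task in turn.
local alive = #tasks
while alive > 0 do
    alive = 0
    for _, co in ipairs(tasks) do
        if coroutine.status(co) ~= "dead" then
            assert(coroutine.resume(co))
            alive = alive + 1
        end
    end
end

print(table.concat(log, " "), counter)
```

The log always reads `a1 b1 a2 b2` and the counter ends at 4: the order is decided by the scheduler loop and the yield points, not by an OS scheduler.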

III. How ngx_lua works

ngx_lua embeds Lua in Nginx so that Nginx can run Lua scripts and handle requests concurrently without blocking. Lua has built-in coroutines, which make it easy to turn asynchronous callbacks into ordinary sequential calls. ngx_lua delegates IO operations in Lua to Nginx's event model, which makes them non-blocking. Developers write their program sequentially; when an IO operation would block, ngx_lua automatically suspends the code and saves its context, hands the IO operation to Nginx's event mechanism, and restores the context when the operation completes so that the program continues. All of this is transparent to the user program.

Each Nginx worker process holds one Lua interpreter or LuaJIT instance, shared by all requests that the worker handles. Each request's context is separated by a Lua lightweight thread (a coroutine), which keeps requests independent of one another. ngx_lua uses a "one-coroutine-per-request" model: for each user request it starts a coroutine to run the user code that handles the request, and destroys the coroutine when the request is finished. Each coroutine has its own global environment (variable space), which inherits from the globally shared, read-only "common data". Any variable that user code injects into the global space therefore does not affect other requests and is released when the request finishes. In effect, all user code runs in a sandbox whose lifetime is the same as the request's.

Thanks to Lua coroutines, ngx_lua needs very little memory even when handling 10,000 concurrent requests. According to tests, ngx_lua needs only about 2 KB of memory per request, and even less with LuaJIT. This makes ngx_lua well suited to building scalable, highly concurrent services.
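
The suspend-and-resume mechanism described above can be sketched in standard Lua. This is only a model under stated assumptions, not how ngx_lua is implemented: `fake_io`, `pending`, and `handle` are hypothetical names, and the "event loop" is a simple queue. The point is that the handler reads as sequential code even though it is suspended at every IO call.

```lua
-- Hypothetical event loop state: a queue of pending "IO operations".
local pending = {}

-- Hypothetical non-blocking call: register the operation, then yield.
-- To the caller it looks like an ordinary function that returns a value.
local function fake_io(key)
    table.insert(pending, { co = coroutine.running(), key = key })
    return coroutine.yield()
end

local output = {}

-- Request handler written as plain sequential code, with no callbacks.
local function handle(id)
    local a = fake_io(id .. ":first")
    local b = fake_io(id .. ":second")
    table.insert(output, a .. "+" .. b)
end

-- One coroutine per request; each runs until its first IO call.
for _, id in ipairs({ "req1", "req2" }) do
    local co = coroutine.create(handle)
    assert(coroutine.resume(co, id))
end

-- The event loop: complete operations one by one and resume their owners.
while #pending > 0 do
    local op = table.remove(pending, 1)
    assert(coroutine.resume(op.co, "<" .. op.key .. ">"))
end

print(table.concat(output, "\n"))
```

Both requests make progress in turn while the other is waiting, yet neither handler contains any scheduling logic of its own.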

Typical applications

The official website lists:

· Mashing up and processing the outputs of various Nginx upstreams (proxy, drizzle, postgres, redis, memcached, etc.) in Lua
· Doing arbitrarily complex access control and security checks in Lua before requests actually reach the upstream backends
· Manipulating response headers in an arbitrary way (by Lua)
· Fetching backend information from external storage backends (like redis, memcached, mysql, postgresql) and using that information to choose which upstream backend to access on the fly
· Coding up arbitrarily complex web applications in a content handler using synchronous but still non-blocking access to the database backends and other storage
· Doing very complex URL dispatch in Lua at the rewrite phase
· Using Lua to implement advanced caching mechanisms for Nginx subrequests and arbitrary locations
Hello Lua
# nginx.conf
worker_processes 4;

events {
    worker_connections 1024;
}

http {
    server {
        listen 80;
        server_name localhost;

        location = /lua {
            content_by_lua '
                ngx.say("Hello, Lua!")
            ';
        }
    }
}
Output:
$ curl 'localhost/lua'
Hello, Lua!

This gives us a very simple ngx_lua application. Developing the same simple module in C would take an estimated 100 lines of code or so, which shows how productive development with ngx_lua is.

Benchmark
To show the high-concurrency capability of ngx_lua, we compare it with Nginx serving a static file and with Node.js. The response body is "Hello world!", 151 bytes in total. The test uses ab -n 60000, averaged over 10 runs.
As the chart shows, ngx_lua has the highest RPS at every concurrency level, staying at roughly 10,000 RPS. Nginx serving a static file is slightly slower because of disk IO, and Node.js is the slowest of the three. This simple test demonstrates the high-concurrency capability of ngx_lua. The ngx_lua developers also ran a comparison against nginx + FPM + PHP and Node.js, in which ngx_lua reached 28,000 RPS, Node.js a little over 10,000, and PHP came last at 6,000. Some configuration that I did not tune may be the reason ngx_lua's RPS was not that high in my test.


Installing ngx_lua
You can install ngx_lua by downloading the module source and compiling it into Nginx, but using OpenResty is recommended. OpenResty is a bundle that packages Nginx with a large number of third-party modules, such as HttpLuaModule, HttpRedis2Module, and HttpEchoModule. It saves you from downloading the modules one by one and is very convenient to install.
ngx_openresty bundle: openresty
./configure --with-luajit && make && make install
By default, the ngx_lua module in OpenResty uses the standard Lua 5.1 interpreter; pass --with-luajit to use LuaJIT.
Using ngx_lua
The ngx_lua module provides configuration directives and the Nginx API.
Configuration directives: used in the Nginx configuration in the same way as the set and proxy_pass directives; each directive has contexts in which it may be used.
Nginx API: used in Lua scripts to access Nginx variables and to call functions provided by Nginx.
The following examples illustrate the common directives and APIs.

Configuration directives

set_by_lua and set_by_lua_file

Like the set directive, these set an Nginx variable and run in the rewrite phase, except that the value is computed and returned by a Lua script.
Syntax: set_by_lua $res <lua-script-str> [$arg1 $arg2 ...]

Configuration:

location = /adder {
    set_by_lua $res "
        local a = tonumber(ngx.arg[1])
        local b = tonumber(ngx.arg[2])
        return a + b" $arg_a $arg_b;

    echo $res;
}
Output:
$ curl 'localhost/adder?a=25&b=75'
100

set_by_lua_file runs a Lua script stored outside the Nginx configuration, which avoids heavy escaping in the configuration file.

Configuration:

location = /adder {
    set_by_lua_file $res "conf/adder.lua" $arg_a $arg_b;
    echo $res;
}
</span>


adder.lua:

local a = tonumber(ngx.arg[1])
local b = tonumber(ngx.arg[2])
return a + b

Output:
$ curl 'localhost/adder?a=25&b=75'
100

access_by_lua and access_by_lua_file

These run in the access phase and are used for access control. Nginx's native allow and deny work on IP addresses; with access_by_lua you can implement complex access control, for example querying a database to verify a user name and password.

Configuration:

location /auth {
    access_by_lua '
        if ngx.var.arg_user == "ntes" then
            return
        else
            ngx.exit(ngx.HTTP_FORBIDDEN)
        end
    ';
    echo 'Welcome ntes';
}
Output:
$ curl 'localhost/auth?user=ntes'
Welcome ntes

$ curl 'localhost/auth?user=sohu'
<html>
<head><title>403 Forbidden</title></head>
<body bgcolor="white">
<center><h1>403 Forbidden</h1></center>
...
</body>
</html>

rewrite_by_lua and rewrite_by_lua_file

These implement URL rewriting and run in the rewrite phase.

Configuration:

location = /foo {
    rewrite_by_lua 'ngx.exec("/bar")';
    echo 'in foo';
}
location = /bar {
    echo 'in bar';
}
Output:
$ curl 'localhost/foo'
in bar

content_by_lua and content_by_lua_file

A content handler runs in the content phase and generates the HTTP response. Because the content phase can have only one handler, these directives cannot be used together with the echo module in the same location; in my tests, content_by_lua overrides echo. The usage is the same as in the earlier Hello Lua example.


Configuration (responding directly):

location = /lua {
    content_by_lua 'ngx.say("Hello, Lua!")';
}

Output:
$ curl 'localhost/lua'
Hello, Lua!

Configuration (accessing Nginx variables in Lua):
location = /hello {
    content_by_lua '
        local who = ngx.var.arg_who
        ngx.say("Hello, ", who, "!")
    ';
}

Output:
$ curl 'localhost/hello?who=world'
Hello, world!

Nginx API
The Nginx API is exposed to Lua in two packages, ngx and ndk. For example, ngx.var.NGX_VAR_NAME accesses an Nginx variable. This section focuses on ngx.location.capture and ngx.location.capture_multi.

ngx.location.capture
Syntax: res = ngx.location.capture(uri, options?)
Issues a synchronous but non-blocking Nginx subrequest. Through a subrequest you can make a non-blocking internal request to another location, which may be configured to serve files from a directory or to use another C module such as ngx_proxy, ngx_fastcgi, ngx_memc, ngx_postgres, ngx_drizzle, or even ngx_lua itself. A subrequest only mimics the HTTP interface and carries no extra HTTP or TCP transport overhead; it runs at the C level and is very efficient. A subrequest is different from an HTTP 301/302 redirection (via ngx.redirect) and from an internal redirection (via ngx.exec).

Configuration:
location = /other {
    echo 'Hello, world!';
}

# Lua non-blocking IO
location = /lua {
    content_by_lua '
        local res = ngx.location.capture("/other")
        if res.status == 200 then
            ngx.print(res.body)
        end
    ';
}

Output:
$ curl 'http://localhost/lua'
Hello, world!

In fact, a location can be invoked either by an external HTTP request or by an internal sub-request. Each location is like a function, and issuing a sub-request is like a non-blocking function call. This makes for a very powerful programming model; later we will see how locations are used to communicate without blocking with backend memcached and Redis servers.
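
The "location as function" analogy can be modelled in a few lines of standard Lua. This is a sketch under stated assumptions, not the ngx_lua API: `locations` and `capture` are hypothetical stand-ins, and the real ngx.location.capture is non-blocking and goes through Nginx rather than a Lua table.

```lua
-- Hypothetical stand-in for Nginx locations: URI -> handler function.
local locations = {}

-- Hypothetical stand-in for ngx.location.capture: call the handler and
-- wrap the result in a table with `status` and `body`, like a response.
local function capture(uri)
    local handler = locations[uri]
    if not handler then
        return { status = 404, body = "" }
    end
    return { status = 200, body = handler() }
end

locations["/other"] = function()
    return "Hello, world!"
end

-- A location can itself issue sub-requests, just as a function calls others.
locations["/lua"] = function()
    local res = capture("/other")
    if res.status == 200 then
        return res.body
    end
    return ""
end

print(capture("/lua").body)
print(capture("/missing").status)
```

Seen this way, a configuration file full of internal locations is close to a library of functions that content handlers can compose.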
ngx.location.capture_multi

Syntax: res1, res2, ... = ngx.location.capture_multi({ {uri, options?}, {uri, options?}, ... })
Works like ngx.location.capture, but issues several sub-requests in parallel without blocking. The call returns once all sub-requests have completed, so its running time is that of the slowest sub-request, not the sum of all of them.

Configuration:
# Issue several sub-requests at the same time
location = /moon {
    echo 'moon';
}
location = /earth {
    echo 'earth';
}
location = /lua {
    content_by_lua '
        local res1, res2 = ngx.location.capture_multi({ {"/moon"}, {"/earth"} })
        if res1.status == 200 then
            ngx.print(res1.body)
        end
        ngx.print(",")
        if res2.status == 200 then
            ngx.print(res2.body)
        end
    ';
}


Output:
$ curl 'http://localhost/lua'
moon,earth
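
The claim that the total time equals the slowest sub-request rather than the sum can be checked with a small simulation in standard Lua. Everything here is hypothetical: `subrequest` and `capture_multi_sim` are invented for the illustration, and time is counted in abstract "ticks" instead of real latency.

```lua
-- Each simulated sub-request needs a number of "ticks" to complete.
local function subrequest(name, ticks)
    return coroutine.create(function()
        for _ = 1, ticks do
            coroutine.yield()          -- still waiting for the backend
        end
        return name
    end)
end

-- Hypothetical stand-in for capture_multi: advance all sub-requests
-- together, one tick at a time, until every one has finished.
local function capture_multi_sim(specs)
    local tasks, results, elapsed = {}, {}, 0
    for i, spec in ipairs(specs) do
        tasks[i] = subrequest(spec[1], spec[2])
    end
    local remaining = #tasks
    while remaining > 0 do
        for i, co in ipairs(tasks) do
            if coroutine.status(co) == "suspended" then
                local ok, value = coroutine.resume(co)
                assert(ok, value)
                if coroutine.status(co) == "dead" then
                    results[i] = value
                    remaining = remaining - 1
                end
            end
        end
        if remaining > 0 then
            elapsed = elapsed + 1      -- one tick passes for everyone at once
        end
    end
    return results, elapsed
end

local results, elapsed = capture_multi_sim({ { "moon", 3 }, { "earth", 5 } })
print(table.concat(results, ","), elapsed)
```

With sub-requests of 3 and 5 ticks the simulation finishes after 5 ticks, not 8, and the results come back in the order the sub-requests were listed.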

Caveats
Network IO in Lua code must go through the Nginx Lua API; using the standard Lua API blocks the Nginx event loop and makes performance drop sharply. The standard Lua io library is acceptable for disk IO on fairly small amounts of data, but not for reading or writing large files, because it blocks the whole Nginx worker process. For best performance, it is strongly recommended to delegate all network IO and disk IO to Nginx sub-requests (via ngx.location.capture).
The following test reads /html/index.html to compare delegating disk IO to Nginx with reading the file directly through Lua io.

Delegating disk IO via ngx.location.capture:

Configuration:

location / {
    internal;
    root html;
}
location /capture {
    content_by_lua '
        local res = ngx.location.capture("/")
        ngx.print(res.body)
    ';
}
Reading the file from disk via the standard Lua io library:

Configuration:
location /luaio {
    content_by_lua '
        local io = require("io")
        local chunk_size = 4096
        local f = assert(io.open("html/index.html", "r"))
        while true do
            local chunk = f:read(chunk_size)
            if not chunk then
                break
            end
            ngx.print(chunk)
            ngx.flush(true)
        end
        f:close()
    ';
}

Both variants were load-tested with ab at several concurrency levels, returning 151 bytes and 151,000 bytes of data respectively. RPS was averaged over 10 runs.

Static file: 151 bytes

Concurrency   ...     ...     ...     7000    10000
capture       11067   8880    8873    8952    9023
Lua io        11379   9724    8938    9705    9561

In this case the file is small. Serving the static file through Nginx requires additional system calls, so capture performs slightly worse than Lua io.

Static file: 151,000 bytes (at a concurrency of 10,000 memory consumption was too high, so there is no result)

Concurrency   ...     ...     ...     7000    10000
capture       3338    3435    3178    3043    /
Lua io        3174    3094    3081    2916    /

With the large file, capture is slightly faster than Lua io. No tuning was applied to Nginx's static file handling here; only sendfile was enabled. With tuning, Nginx might serve static files even faster, but I am not familiar with that area. In any case, all IO in Lua should be handed to the Nginx event model through ngx.location.capture, which ensures that the IO is non-blocking.

IV. Summary

This part briefly introduced the basic usage of ngx_lua. The next part covers in detail how to access Redis and memcached from ngx_lua and how to use connection pools.

V. Advanced

The previous part introduced the basics of ngx_lua. This part focuses on how ngx_lua communicates without blocking with backend memcached and Redis servers.

Memcached

Accessing memcached from Nginx requires module support. Here we choose HttpMemcModule, which communicates with a backend memcached without blocking. The official Nginx distribution also provides a memcached module, but it supports only the get operation, whereas memc supports most memcached commands. The memc module takes its parameters through entry variables: every variable prefixed with $memc_ is a memc entry variable. memc_pass points to the backend memcached server.

Configuration:

# Using HttpMemcModule
location = /memc {
    set $memc_cmd $arg_cmd;
    set $memc_key $arg_key;
    set $memc_value $arg_val;
    set $memc_exptime $arg_exptime;

    memc_pass '127.0.0.1:11211';
}
Output:
$ curl 'http://localhost/memc?cmd=set&key=foo&val=Hello'
STORED
$ curl 'http://localhost/memc?cmd=get&key=foo'
Hello

This gives us access to memcached. Next, let us see how to access memcached from Lua.

Configuration:
# Accessing memcached from Lua
location = /memc {
    internal;    # internal access only
    set $memc_cmd get;
    set $memc_key $arg_key;
    memc_pass '127.0.0.1:11211';
}
location = /lua_memc {
    content_by_lua '
        local res = ngx.location.capture("/memc", {
            args = { key = ngx.var.arg_key }
        })
        if res.status == 200 then
            ngx.say(res.body)
        end
    ';
}

Output:
$ curl 'http://localhost/lua_memc?key=foo'
Hello

Accessing memcached from Lua works by issuing a sub-request, much like a function call. First we define a memc location that talks to the backend memcached; it acts as a memcached storage interface. Because the memc module is entirely non-blocking and ngx.location.capture is non-blocking too, the whole operation is non-blocking.


Redis

Accessing Redis requires HttpRedis2Module, which can also talk to Redis without blocking. However, redis2 returns Redis's raw response, so the response has to be parsed when it is used from Lua. LuaRedisParser can be used both to build raw Redis requests and to parse raw Redis responses.

Configuration:

# Accessing Redis from Lua
location = /redis {
    internal;    # internal access only
    redis2_query get $arg_key;
    redis2_pass '127.0.0.1:6379';
}
location = /lua_redis {    # requires LuaRedisParser
    content_by_lua '
        local parser = require("redis.parser")
        local res = ngx.location.capture("/redis", {
            args = { key = ngx.var.arg_key }
        })
        if res.status == 200 then
            local reply = parser.parse_reply(res.body)
            ngx.say(reply)
        end
    ';
}

Output:
$ curl 'http://localhost/lua_redis?key=foo'
Hello

As with memcached, you provide a Redis storage location dedicated to querying Redis and then call it through sub-requests.

Redis Pipeline
In practice you may need to query several keys at once. One option is ngx.location.capture_multi: send several sub-requests to the Redis storage location and parse each response. There is a limit, however: the Nginx core allows no more than 50 sub-requests to be issued at one time, so this approach no longer works when there are more than 50 keys.
Fortunately, Redis provides pipelining, which executes several commands over a single connection and so avoids paying the round-trip delay for each command. After the client sends several commands through a pipeline, Redis receives and executes them in order and then writes back the results in the same order. Using a pipeline from Lua requires the redis2_raw_queries directive of the redis2 module to send raw Redis queries.

Configuration:

# Accessing Redis from Lua
location = /redis {
    internal;    # internal access only

    redis2_raw_queries $args $echo_request_body;
    redis2_pass '127.0.0.1:6379';
}

location = /pipeline {
    content_by_lua_file 'conf/pipeline.lua';
}

pipeline.lua

-- conf/pipeline.lua file
local parser = require('redis.parser')
local reqs = {
    {'get', 'one'}, {'get', 'two'}
}
-- Build the raw Redis queries: get one\r\nget two\r\n
local raw_reqs = {}
for i, req in ipairs(reqs) do
    table.insert(raw_reqs, parser.build_query(req))
end
local res = ngx.location.capture('/redis?' .. #reqs, { body = table.concat(raw_reqs, '') })

if res.status == 200 and res.body then
    -- Parse the raw Redis replies
    local replies = parser.parse_replies(res.body, #reqs)
    for i, reply in ipairs(replies) do
        ngx.say(reply[1])
    end
end

Output:
$ curl 'http://localhost/pipeline'
first
second
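
To see what the raw pipelined request body looks like on the wire, the following standard-Lua sketch encodes commands in Redis's multi-bulk request format. The functions `build_query` and `build_pipeline` here are written for this illustration and are an assumption about the kind of output a helper such as parser.build_query produces; they are not the library's code.

```lua
-- Encode one command, given as a list of strings, in the Redis request
-- protocol: "*<argc>\r\n" followed by "$<len>\r\n<arg>\r\n" per argument.
local function build_query(args)
    local parts = { "*" .. #args .. "\r\n" }
    for _, arg in ipairs(args) do
        local s = tostring(arg)
        parts[#parts + 1] = "$" .. #s .. "\r\n" .. s .. "\r\n"
    end
    return table.concat(parts)
end

-- A pipeline is simply the encoded commands concatenated back to back.
local function build_pipeline(reqs)
    local encoded = {}
    for i, req in ipairs(reqs) do
        encoded[i] = build_query(req)
    end
    return table.concat(encoded)
end

local raw = build_pipeline({ { "get", "one" }, { "get", "two" } })

-- Show the CR LF pairs visibly instead of printing real line breaks.
print((raw:gsub("\r\n", "\\r\\n")))
```

Because every argument carries its own length prefix, the server can split the byte stream back into individual commands, which is what allows many commands to share one round trip.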

Connection Pool

In the Redis and memcached examples above, every request opens a connection to the backend server and closes it when the request is done. This incurs the TCP three-way handshake, TIME_WAIT, and other overhead, which is intolerable for highly concurrent applications. A connection pool eliminates this overhead. Connection pooling requires HttpUpstreamKeepaliveModule.

Configuration:

http {
    # requires HttpUpstreamKeepaliveModule
    upstream redis_pool {
        server 127.0.0.1:6379;
        # a pool that can hold 1024 connections
        keepalive 1024;
    }

    server {
        location = /redis {
            ...
            redis2_pass redis_pool;
        }
    }
}

This module provides the keepalive directive, whose context is upstream. We know upstream from using Nginx as a reverse proxy; the word literally means "upstream", and that upstream can be a Redis, memcached, MySQL, or other server. An upstream block defines a virtual server cluster, and its backend servers benefit from load balancing. keepalive 1024 sets the size of the connection pool; once the number of connections exceeds it, further connections automatically degrade to short-lived connections. Using the pool is simple: just replace the original IP address and port with the upstream name. According to reported measurements, accessing memcached (through the memc module described earlier) reaches 20,000 RPS without a connection pool and 140,000 RPS with one. In practice such a large gain may not be achievable, but an improvement of 100-200% is realistic.
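
The pooling behaviour described above can be modelled in standard Lua. This is a deliberately simplified sketch: the `Pool` class and its `acquire` and `release` methods are hypothetical, connections are plain tables, and the real module's bookkeeping is more involved. It shows idle connections being reused and the overflow case being closed, which corresponds to degrading to a short-lived connection.

```lua
-- Hypothetical model of an idle-connection pool with a fixed capacity.
local Pool = {}
Pool.__index = Pool

function Pool.new(capacity)
    return setmetatable({ capacity = capacity, idle = {}, opened = 0 }, Pool)
end

-- Reuse an idle connection if one exists; otherwise open a new one.
function Pool:acquire()
    local conn = table.remove(self.idle)
    if conn then
        return conn
    end
    self.opened = self.opened + 1      -- this is where a handshake would be paid
    return { id = self.opened }
end

-- Keep the connection for reuse if there is room; otherwise close it,
-- which is the "degrade to a short-lived connection" case.
function Pool:release(conn)
    if #self.idle < self.capacity then
        table.insert(self.idle, conn)
        return "kept"
    end
    return "closed"
end

local pool = Pool.new(2)

-- Three requests in flight at once: one more than the pool can keep.
local a, b, c = pool:acquire(), pool:acquire(), pool:acquire()
local fates = { pool:release(a), pool:release(b), pool:release(c) }

-- Later requests reuse the two kept connections: no new connection is opened.
local d, e = pool:acquire(), pool:acquire()
print(table.concat(fates, ","), pool.opened, d.id, e.id)
```

Only three connections are ever opened for five acquisitions, and the saving grows with the number of requests served while the pool stays warm.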

Summary

To summarize accessing memcached and Redis:

1. Nginx provides a powerful programming model: a location is like a function and a sub-request is like a function call. A location can also send sub-requests to itself, forming a recursive model, so complex business logic can be built on this model.
2. IO in Nginx must be non-blocking; blocking Nginx greatly reduces its performance. In Lua, therefore, IO operations must be handed to the Nginx event model through ngx.location.capture.
3. Use a connection pool whenever you need TCP connections. This eliminates the overhead of creating and releasing large numbers of connections.
