How Nginx receives requests: the request receiving process in the Nginx core module

Source: Internet
Author: User

This article mainly describes how Nginx receives a request, including parsing the request headers and reading the request body.


First, the basic format of an HTTP request as defined in RFC 2616:

Request = Request-Line
          *(( general-header
            | request-header
            | entity-header ) CRLF)
          CRLF
          [ message-body ]

The first line is the request line, which gives the request method, the resource to access, and the HTTP version in use:
Request-Line = Method SP Request-URI SP HTTP-Version CRLF

The request methods are defined as follows; the most common are GET and POST:

Method = "OPTIONS"
       | "GET"
       | "HEAD"
       | "POST"
       | "PUT"
       | "DELETE"
       | "TRACE"
       | "CONNECT"
       | extension-method
extension-method = token

The resource to access is identified by a Uniform Resource Identifier (URI). A common general form of a URI (RFC 2396) is:

<scheme>://<authority><path>?<query> 

In practice, the form of the Request-URI depends on the request method; usually only the path and query parts are written out.

The HTTP version is defined as follows; versions 1.0 and 1.1 are the ones in common use:

HTTP/<major>.<minor>
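As a small illustration of the version format above, the following sketch parses a string of the form HTTP/<major>.<minor> into two integers. The function names parse_http_version and read_number are hypothetical helpers written for this example; they are not Nginx functions, and the limit on the number of digits is an arbitrary assumption.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of Nginx: reads one or more decimal
   digits at *pp, advances *pp past them, and returns the value,
   or -1 if there is no digit or the number is implausibly large. */
static int read_number(const char **pp)
{
    const char *p = *pp;
    int value = 0;

    if (*p < '0' || *p > '9') {
        return -1;
    }
    while (*p >= '0' && *p <= '9') {
        value = value * 10 + (*p - '0');
        if (value > 999) {        /* arbitrary sanity limit (assumption) */
            return -1;
        }
        p++;
    }
    *pp = p;
    return value;
}

/* Parses "HTTP/<major>.<minor>"; returns 0 on success, -1 otherwise. */
int parse_http_version(const char *s, int *major, int *minor)
{
    const char *p;
    int ma, mi;

    assert(s != NULL && major != NULL && minor != NULL);

    /* the protocol name is matched case-sensitively */
    if (strncmp(s, "HTTP/", 5) != 0) {
        return -1;
    }
    p = s + 5;

    ma = read_number(&p);
    if (ma < 0 || *p != '.') {
        return -1;
    }
    p++;

    mi = read_number(&p);
    if (mi < 0 || *p != '\0') {
        return -1;
    }

    *major = ma;
    *minor = mi;
    return 0;
}
```

Calling parse_http_version("HTTP/1.1", &major, &minor) stores 1 in both outputs, while a string without the dot or with a lowercase prefix is rejected.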


The lines following the request line are the request headers. RFC 2616 defines three types of headers that may appear in a request (general-header, request-header, and entity-header) and defines a set of common headers for each type. The entity-header type may also contain custom headers.


We now turn to how Nginx parses the request headers. Request processing in Nginx revolves around two very important data structures, ngx_connection_t and ngx_http_request_t, which represent a connection and a request respectively. Both were described in detail in the previous chapter of this book; readers who do not remember them may want to go back and review. The whole request processing flow corresponds to the allocation, initialization, use, reuse, and destruction of these two structures.


During initialization, specifically in the ngx_event_process_init function called in the init process phase, Nginx assigns a connection structure (ngx_connection_t) to each listening socket. The handler of the connection's read event member (read) is set to ngx_event_accept. If the accept mutex is not used, the read event is added to the Nginx event model (poll, epoll, and so on) in this function. Otherwise, after the init process phase ends, the read event is added inside the worker's event loop by whichever worker process acquires the accept mutex.

    static ngx_int_t
    ngx_event_process_init(ngx_cycle_t *cycle)
    {
        ...
        /* initialize the red-black tree used to manage all timers */
        if (ngx_event_timer_init(cycle->log) == NGX_ERROR) {
            return NGX_ERROR;
        }
        /* initialize the event model */
        for (m = 0; ngx_modules[m]; m++) {
            if (ngx_modules[m]->type != NGX_EVENT_MODULE) {
                continue;
            }
            if (ngx_modules[m]->ctx_index != ecf->use) {
                continue;
            }
            module = ngx_modules[m]->ctx;
            if (module->actions.init(cycle, ngx_timer_resolution) != NGX_OK) {
                /* fatal */
                exit(2);
            }
            break;
        }

        /* assign a connection structure to each listening socket */
        ls = cycle->listening.elts;
        for (i = 0; i < cycle->listening.nelts; i++) {
            c = ngx_get_connection(ls[i].fd, cycle->log);
            if (c == NULL) {
                return NGX_ERROR;
            }
            c->log = &ls[i].log;
            c->listening = &ls[i];
            ls[i].connection = c;
            rev = c->read;
            rev->log = c->log;
            /* mark this read event as a new-connection event */
            rev->accept = 1;

    #if (NGX_WIN32)
            /* the Windows path is not analyzed here; the principle is similar */
    #else
            /* the handler of the read event is set to ngx_event_accept */
            rev->handler = ngx_event_accept;
            /* with the accept mutex, the listening socket is added to the
               event model later, after the lock has been acquired */
            if (ngx_use_accept_mutex) {
                continue;
            }
            /* otherwise add the listening socket to the event model now */
            if (ngx_event_flags & NGX_USE_RTSIG_EVENT) {
                if (ngx_add_conn(c) == NGX_ERROR) {
                    return NGX_ERROR;
                }
            } else {
                if (ngx_add_event(rev, NGX_READ_EVENT, 0) == NGX_ERROR) {
                    return NGX_ERROR;
                }
            }
    #endif
        }
        return NGX_OK;
    }

Once a worker process has added the listening socket's read event to the event model, Nginx can accept and process client requests. When a user enters a domain name in the browser's address bar and DNS resolves it to a server on which Nginx is listening, the Nginx event model receives the read event and dispatches it to the previously registered handler, ngx_event_accept.


In ngx_event_accept, Nginx calls accept() to take a connection off the queue of established connections and obtain its socket, then allocates a connection structure (ngx_connection_t) and stores the new socket in it. Some basic connection initialization is also done here:
First, a memory pool is created for the connection; its initial size defaults to 256 bytes and can be set with the connection_pool_size directive;
A log structure is allocated and saved for later use by the logging system;
The I/O send and receive functions are initialized; which functions are used depends on the event model and the operating system;
A socket address (sockaddr) is allocated, the peer address returned by accept() is copied into it, and it is saved in the sockaddr field;
The local socket address is saved in the local_sockaddr field. This value is taken from the listening structure ngx_listening_t, which holds only the listening address set in the configuration file. Since the configured address may be the wildcard *, meaning listening on all addresses, the value saved in the connection may later be changed to the address that actually received the connection;
The connection's write event is marked ready (ready is set to 1); Nginx assumes that a new connection is writable at first;
If the deferred accept attribute (TCP_DEFER_ACCEPT) is set on the listening socket, data has already arrived on the connection, so the read event is marked ready as well;
The address saved in the sockaddr field is formatted as a readable string and saved in the addr_text field;
Finally, ngx_http_init_connection is called to initialize the rest of the connection structure.
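One of the steps above, turning the stored peer address into a readable string, can be sketched with the standard POSIX call inet_ntop(). This is only an illustration under simplifying assumptions: the function name format_peer_addr is hypothetical, only IPv4 is handled, and Nginx uses its own formatting code rather than this helper.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: writes the textual form of an IPv4 peer address
   into text (capacity len) and returns its length, or 0 on failure.
   It plays the role of filling an addr_text-like field. */
size_t format_peer_addr(const struct sockaddr *sa, char *text, size_t len)
{
    const struct sockaddr_in *sin;

    assert(sa != NULL && text != NULL);

    if (sa->sa_family != AF_INET) {
        return 0;                    /* other families are out of scope here */
    }

    sin = (const struct sockaddr_in *) sa;

    if (inet_ntop(AF_INET, &sin->sin_addr, text, (socklen_t) len) == NULL) {
        return 0;                    /* buffer too small or other failure */
    }

    return strlen(text);
}
```

A buffer of INET_ADDRSTRLEN bytes is always large enough for the IPv4 case.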

The most important task of ngx_http_init_connection is to set up the handlers for the read and write events. The write event handler of the connection is set to ngx_http_empty_handler, which does nothing. In fact, Nginx assumes that a new connection is writable and does not add the write event to the event model. When there is data to send, Nginx writes to the connection directly; only if the data cannot all be written at once does it add the write event to the event model and install a real write handler, as later chapters describe in detail. The read event handler is set to ngx_http_init_request. If data is already present on the connection (deferred accept is set), ngx_http_init_request is called directly to process the request. Otherwise, a timer is set and the read event is added to the event model to wait for data to arrive or for the timeout. Whether data has already arrived, arrives later, or the wait times out, processing always ends up in the read event handler, ngx_http_init_request.
 

ngx_http_init_request mainly initializes the request. Because it is an event handler, it takes a single parameter of type ngx_event_t *; the ngx_event_t structure represents an event in Nginx. The context of an event handler is similar to that of an interrupt handler. To make the relevant information reachable from this context, Nginx usually stores a reference to the connection structure in the data field of the event structure, and a reference to the request structure in the data field of the connection structure. The event handler can then easily obtain the corresponding connection and request.

Looking inside the function: it first checks whether the event is a timeout event. If so, it closes the connection and returns. Otherwise, the connection accepted earlier has a request to process. ngx_http_init_request first allocates an ngx_http_request_t structure for the request from the connection's memory pool; this structure holds all the information about the request. After allocation, a reference to the structure is stored in the request field of the connection's hc member so that it can be reused for keep-alive connections or pipelined requests.

In this function Nginx also finds a default virtual server configuration based on the port and address on which the request was received. (The default_server parameter of the listen directive marks a virtual server as the default; if several virtual servers listen on the same port and address and none is marked, the first one defined is the default.) The Nginx configuration file can define multiple virtual servers (each server block corresponds to one) that listen on different ports and addresses; virtual servers listening on the same port and address are further distinguished by domain name (set with the server_name directive). Each virtual server can have its own configuration, which determines how Nginx handles a request once it is received. The configuration that is found is saved in the request's ngx_http_request_t structure. Note that the default configuration found here from the port and address is only used temporarily; Nginx eventually looks up the real virtual server configuration by domain name. The remaining initialization tasks are:

The read event handler of the connection is set to ngx_http_process_request_line, which parses the request line, and the request's read_event_handler is set to ngx_http_block_reading. The latter does almost nothing (when the event model is level-triggered, all it does is remove the event from the event model so that it does not fire repeatedly); why read_event_handler is set to this function is explained later;
A buffer is allocated to hold the request header, and its address is saved in the header_in field. The default size is 1024 bytes and can be changed with the client_header_buffer_size directive. Note that this buffer is allocated from the memory pool of the connection, and its address is also saved in the connection's buffer field so that the next request on the same connection can reuse it. If the client sends a request header larger than 1024 bytes, Nginx allocates larger buffers; by default each large buffer is 8K and at most 4 are allocated, and both values can be set with the large_client_header_buffers directive. The request line, and each individual request header, must fit within a single large buffer;
Nginx likewise creates a memory pool for the request, and later allocations related to the request normally come from that pool. Its default size is 4096 bytes and can be changed with the request_pool_size directive;
The response header list of the request is allocated with an initial size of 20;
An array of context (ctx) pointers for all modules and the variable data are created;
The main field of the request is set to the request itself, marking it as a main request. Nginx also has the concept of a subrequest, described in detail later in this chapter;
The count field of the request is set to 1; count is the reference count of the request;
The current time is saved in the start_sec and start_msec fields as the start time of the request, which is later used to compute the request processing time (request time). The starting point used by Nginx differs slightly from Apache: in Nginx a request starts when the first packet from the client is received, whereas Apache starts counting after the whole request line has been received;
Other fields of the request are initialized; for example, uri_changes is set to 11, meaning the request URI can be rewritten 10 times, and subrequests is set to 201, meaning a request can issue up to 200 subrequests;
After all this initialization, ngx_http_init_request calls the read event handler to actually parse the data sent by the client; that is, processing enters ngx_http_process_request_line.
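The "set to 11 to allow 10 rewrites" convention mentioned in the list can be illustrated with a small countdown sketch. The type request_t, the constant MAX_URI_CHANGES, and both functions are hypothetical stand-ins written for this example; they model the counting scheme described above, not the actual Nginx code.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_URI_CHANGES 10   /* illustrative value taken from the text */

/* Hypothetical stand-in for the relevant part of a request structure. */
typedef struct {
    unsigned uri_changes;    /* remaining budget plus one */
} request_t;

/* The counter starts one above the number of allowed changes. */
void request_init(request_t *r)
{
    assert(r != NULL);
    r->uri_changes = MAX_URI_CHANGES + 1;
}

/* Consumes one URI change. Returns 0 if the change is allowed and
   -1 once the budget is exhausted (a likely rewrite cycle). */
int request_change_uri(request_t *r)
{
    assert(r != NULL);

    if (r->uri_changes <= 1) {
        r->uri_changes = 0;
        return -1;
    }

    r->uri_changes--;
    return 0;
}
```

Starting from 11, ten calls succeed and the eleventh fails, which is why the initial value is one larger than the limit.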


The main job of ngx_http_process_request_line is to parse the request line. Because network I/O is involved, even a very short request line may not be read in one pass, which is why ngx_http_init_request installs ngx_http_process_request_line as the read event handler. Like other event handlers it takes a single ngx_event_t * parameter, and at the start it also checks whether the event is a timeout event. If so, it closes the request and the connection; otherwise normal parsing begins, starting with a call to ngx_http_read_request_header to read data.
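Since such a handler receives nothing but the event, it has to recover its context through the data fields described earlier: the event points to the connection, and the connection points to the request. The sketch below shows that pointer chain with deliberately tiny, hypothetical types (event_t, connection_t, request_t); the real Nginx structures contain many more fields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature versions of the three structures. */
typedef struct {
    int id;                 /* some per-request state */
} request_t;

typedef struct {
    void *data;             /* points to the current request */
} connection_t;

typedef struct {
    void    *data;          /* points to the owning connection */
    unsigned timedout;      /* set when the event is a timeout */
} event_t;

/* An event handler in the style described in the text: its only
   argument is the event. Returns the request id, or -1 on timeout. */
int process_request_line_handler(event_t *ev)
{
    connection_t *c;
    request_t    *r;

    assert(ev != NULL && ev->data != NULL);

    c = ev->data;           /* event -> connection */
    r = c->data;            /* connection -> request */

    if (ev->timedout) {
        return -1;          /* a real handler would close the request here */
    }

    return r->id;
}
```

The design keeps the handler signature uniform for every kind of event while still giving each handler access to all per-connection and per-request state.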


Because ngx_http_process_request_line may be entered several times, ngx_http_read_request_header first checks whether the buffer pointed to by the request's header_in already contains data. If it does, the function returns immediately. Otherwise it reads data from the connection into header_in; as long as the buffer has free space it reads as much as it can and returns the number of bytes read. If the client has not sent any data yet, the function returns NGX_AGAIN, after doing two things. First, it sets a timer; the default duration is 60s and can be set with the client_header_timeout directive. If the timer expires before any readable event arrives, Nginx closes the request. Second, it calls ngx_handle_read_event to handle the read event: if the connection's read event has not yet been added to the event model, it is added. If the client closes the connection prematurely or another error occurs while reading, a 400 error is returned to the client (there is of course no guarantee that the client receives the response, since it may already have closed the connection) and the function returns NGX_ERROR.


If ngx_http_read_request_header reads data successfully, ngx_http_process_request_line calls ngx_http_parse_request_line to parse it. This function implements a finite state machine based on the definition of the request line in the HTTP specification. Through the state machine, Nginx records where the request method, the request URI, and the HTTP version start in the buffer, along with other information that is useful later in processing. If parsing completes without problems, the function returns NGX_OK. If the request line does not conform to the protocol, the function stops parsing immediately and returns the corresponding error code. If the buffer does not yet hold enough data, the function returns NGX_AGAIN. Throughout the state machine that parses HTTP requests, two principles are followed: minimize memory copying and minimize backtracking. Copying memory is relatively expensive, and heavy copying lowers run-time efficiency. Where a copy would otherwise be needed, Nginx tries to copy only the start and end addresses of the memory rather than the memory itself, so that only two assignments are required, which greatly reduces overhead. The consequence is that later operations must not modify the memory itself, since a modification would affect every reference to that range. Such references must therefore be managed carefully, and a real copy must be made when necessary.
The data structure that best embodies this idea in Nginx is ngx_buf_t, which represents a buffer. In many cases it is enough to store the start and end addresses of a piece of memory in its pos and last members and set its memory flag to 1, which denotes a memory range that must not be modified. When a modifiable buffer is needed, memory of the required size must be allocated and its start address saved, and the temporary flag of the ngx_buf_t is set to 1 to indicate that the memory may be modified.
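The two ideas above, a state machine and recording address ranges instead of copying, can be combined in a short sketch. It is a heavily simplified assumption-laden model, not ngx_http_parse_request_line: the names slice_t, request_line_t, and parse_request_line are hypothetical, only uppercase method names and a CRLF line ending are accepted, and the parser restarts from the beginning of the buffer on each call instead of saving its state.

```c
#include <assert.h>
#include <stddef.h>

enum { PARSE_OK = 0, PARSE_AGAIN = 1, PARSE_ERROR = -1 };

/* A range inside the caller's buffer; nothing is copied. */
typedef struct {
    const char *start;
    const char *end;        /* one past the last byte */
} slice_t;

typedef struct {
    slice_t method;
    slice_t uri;
    slice_t version;
} request_line_t;

/* Scans [pos, last) for "METHOD SP URI SP VERSION CRLF". */
int parse_request_line(const char *pos, const char *last, request_line_t *rl)
{
    enum { sw_method, sw_uri, sw_version, sw_lf } state = sw_method;
    const char *p;

    assert(pos != NULL && last != NULL && pos <= last && rl != NULL);

    rl->method.start = pos;

    for (p = pos; p < last; p++) {
        char ch = *p;

        switch (state) {

        case sw_method:
            if (ch == ' ') {
                if (p == rl->method.start) {
                    return PARSE_ERROR;          /* empty method */
                }
                rl->method.end = p;
                rl->uri.start = p + 1;
                state = sw_uri;
            } else if (ch < 'A' || ch > 'Z') {
                return PARSE_ERROR;
            }
            break;

        case sw_uri:
            if (ch == ' ') {
                if (p == rl->uri.start) {
                    return PARSE_ERROR;          /* empty URI */
                }
                rl->uri.end = p;
                rl->version.start = p + 1;
                state = sw_version;
            } else if (ch == '\r' || ch == '\n') {
                return PARSE_ERROR;              /* version is required here */
            }
            break;

        case sw_version:
            if (ch == '\r') {
                if (p == rl->version.start) {
                    return PARSE_ERROR;          /* empty version */
                }
                rl->version.end = p;
                state = sw_lf;
            } else if (ch == ' ' || ch == '\n') {
                return PARSE_ERROR;
            }
            break;

        case sw_lf:
            return (ch == '\n') ? PARSE_OK : PARSE_ERROR;
        }
    }

    return PARSE_AGAIN;      /* the line is not complete yet */
}
```

Each field costs two pointer assignments regardless of its length, at the price that the buffer must stay alive and unmodified for as long as the slices are in use.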


Back in ngx_http_process_request_line: if ngx_http_parse_request_line returns an error, a 400 error is returned to the client directly;
If it returns NGX_AGAIN, Nginx must determine whether the cause is insufficient buffer space or insufficient data. If the buffer is full, Nginx calls ngx_http_alloc_large_header_buffer to allocate a larger buffer. If even a large buffer cannot hold the whole request line, Nginx returns a 414 error to the client. Otherwise, after allocating the larger buffer and copying over the data already received, it calls ngx_http_read_request_header again to read more data and feeds it to the request line state machine, until parsing of the request line finishes;
If it returns NGX_OK, the request line was parsed correctly. The start address and length of the request line are recorded; the path and argument parts of the request URI are saved in the uri field of the request structure; the start position and length of the request method are saved in the method_name field; and the start position and length of the HTTP version are recorded in the http_protocol field. In addition, the arguments and the extension of the requested resource are extracted from the URI and saved in the args and exten fields.
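How the arguments and the extension might be located inside a request URI can be sketched as follows. The type str_t and the function split_uri are hypothetical and exist only for this illustration; Nginx determines these positions while parsing, so this standalone pass is an assumption about the result, not about the method.

```c
#include <assert.h>
#include <stddef.h>

/* A pointer-plus-length string that refers into the original URI. */
typedef struct {
    const char *data;
    size_t      len;
} str_t;

/* Splits uri[0..len) into the path, the arguments after '?', and the
   extension after the last '.' of the final path segment. Missing
   parts are reported as { NULL, 0 }. */
void split_uri(const char *uri, size_t len,
               str_t *path, str_t *args, str_t *exten)
{
    size_t i, path_len = len;

    assert(uri != NULL && path != NULL && args != NULL && exten != NULL);

    args->data = NULL;
    args->len = 0;
    exten->data = NULL;
    exten->len = 0;

    /* everything after the first '?' is the argument string */
    for (i = 0; i < len; i++) {
        if (uri[i] == '?') {
            path_len = i;
            args->data = uri + i + 1;
            args->len = len - i - 1;
            break;
        }
    }

    path->data = uri;
    path->len = path_len;

    /* walk backwards through the last path segment looking for '.' */
    for (i = path_len; i > 0; i--) {
        char ch = uri[i - 1];

        if (ch == '/') {
            break;               /* reached the previous segment */
        }
        if (ch == '.') {
            exten->data = uri + i;
            exten->len = path_len - i;
            break;
        }
    }
}
```

For the URI /img/logo.png?v=2&x=1 this yields the path /img/logo.png, the arguments v=2&x=1, and the extension png, all as ranges into the same buffer.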

Discard Request Body

A module that wants to actively discard the request body sent by the client can call ngx_http_discard_request_body(), an interface provided by the Nginx core. There can be many reasons for discarding: the module's business logic may not need the request body, or the client may have sent a request body that is too large. In addition, to remain compatible with pipelined requests in HTTP/1.1, a module is obliged to actively discard a request body it does not need. In short, to maintain good client compatibility, Nginx must actively discard unneeded request bodies. The analysis of ngx_http_discard_request_body() follows:

    ngx_int_t
    ngx_http_discard_request_body(ngx_http_request_t *r)
    {
        ssize_t       size;
        ngx_event_t  *rev;

        if (r != r->main || r->discard_body) {
            return NGX_OK;
        }

        if (ngx_http_test_expect(r) != NGX_OK) {
            return NGX_HTTP_INTERNAL_SERVER_ERROR;
        }

        rev = r->connection->read;
        ngx_log_debug0(NGX_LOG_DEBUG_HTTP, rev->log, 0, "http set discard body");

        if (rev->timer_set) {
            ngx_del_timer(rev);
        }

        if (r->headers_in.content_length_n <= 0 || r->request_body) {
            return NGX_OK;
        }

        size = r->header_in->last - r->header_in->pos;
        if (size) {
            if (r->headers_in.content_length_n > size) {
                r->header_in->pos += size;
                r->headers_in.content_length_n -= size;
            } else {
                r->header_in->pos += (size_t) r->headers_in.content_length_n;
                r->headers_in.content_length_n = 0;
                return NGX_OK;
            }
        }

        r->read_event_handler = ngx_http_discarded_request_body_handler;
        if (ngx_handle_read_event(rev, 0) != NGX_OK) {
            return NGX_HTTP_INTERNAL_SERVER_ERROR;
        }

        if (ngx_http_read_discarded_request_body(r) == NGX_OK) {
            r->lingering_close = 0;
        } else {
            r->count++;
            r->discard_body = 1;
        }

        return NGX_OK;
    }

Since the function is short, it is listed here in full. It begins by ruling out the cases that need no processing: a subrequest needs no processing, and neither does a request for which this function has already been called. It then calls ngx_http_test_expect() to handle the HTTP/1.1 Expect mechanism. According to that mechanism, if the client sends an Expect header and the server does not want to receive the request body, the server must return a 417 (Expectation Failed) error. Nginx does not do this; it simply lets the client send the request body and then discards it. Next, the function deletes the timer on the read event: since the request body is no longer needed, it does not matter whether the client sends it quickly or slowly. As discussed later, once Nginx has finished processing the request while the client has still not finished sending the unneeded request body, Nginx puts a timer on the read event again.
The function also checks the Content-Length header of the request (a client that intends to send a request body must send a Content-Length header) and checks whether the request body has already been read elsewhere. If there really is a request body waiting to be handled, the function then examines the data that was pre-read into the request header buffer; such pre-read data is discarded directly, and if the whole request body has already been pre-read, the function returns immediately.

Next, if part of the request body is still outstanding, the function calls ngx_handle_read_event() to add the read event to the event model and sets the request's read event handler to ngx_http_discarded_request_body_handler. With this in place, the function finally calls ngx_http_read_discarded_request_body() to read the client's request body and discard it. If the client has not sent the whole request body in one go, the function returns, and the remaining data is handled by ngx_http_discarded_request_body_handler() when the next read event arrives. In that case the request's discard_body flag is set to 1 to mark the situation, and the request's reference count (count) is incremented by 1. The client may still not have sent the whole request body by the time Nginx finishes processing the request, and the extra reference prevents the Nginx core from releasing the request's resources as soon as processing is done.
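The handling of pre-read data described above is simple pointer arithmetic, which the following sketch isolates. The type buf_t and the function discard_preread are hypothetical names for this example; the logic mirrors the branch on the buffered size versus the remaining content length shown in the listing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view of the unread part of the header buffer. */
typedef struct {
    const char *pos;     /* first unread byte */
    const char *last;    /* one past the last byte received */
} buf_t;

/* Discards body bytes that were already read into the header buffer.
   Returns how many body bytes are still expected from the client. */
long discard_preread(buf_t *b, long content_length)
{
    long size;

    assert(b != NULL && b->pos != NULL && b->pos <= b->last);

    if (content_length <= 0) {
        return 0;                        /* no body at all */
    }

    size = (long) (b->last - b->pos);

    if (content_length > size) {
        b->pos += size;                  /* the whole buffer was body */
        return content_length - size;
    }

    /* the body ends inside the buffer; what follows it may be the
       start of the next pipelined request and must be kept */
    b->pos += content_length;
    return 0;
}
```

In the second case only content_length bytes are skipped, so any bytes after the body remain available for the next request on the connection.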

ngx_http_read_discarded_request_body() is simple: it reads data from the connection in a loop and discards it until all the data in the receive buffer has been read. If the whole request body has been read, the function sets the read event handler to ngx_http_block_reading, which merely removes a level-triggered read event so that the same event does not fire continuously.
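A read-and-discard loop of this kind can be sketched with a scratch buffer and a receive callback. Everything here is a hypothetical model: recv_fn, fake_socket_t, fake_recv, and read_and_discard are invented for the example, the in-memory "socket" stands in for a real non-blocking connection, and treating a closed connection as an error is a simplification that does not necessarily match what Nginx does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { DISCARD_OK = 0, DISCARD_AGAIN = 1, DISCARD_ERROR = -1 };

#define DISCARD_BUFFER_SIZE 4096     /* illustrative scratch size */
#define RECV_WOULD_BLOCK    (-2)     /* "no data yet" marker (assumption) */

/* recv-like callback: bytes read (> 0), 0 if the peer closed,
   -1 on error, RECV_WOULD_BLOCK if nothing is available yet. */
typedef long (*recv_fn)(void *ctx, char *buf, size_t size);

/* In-memory stand-in for a socket with some bytes waiting. */
typedef struct {
    long available;
} fake_socket_t;

long fake_recv(void *ctx, char *buf, size_t size)
{
    fake_socket_t *s = ctx;
    long n;

    if (s->available == 0) {
        return RECV_WOULD_BLOCK;
    }
    n = (s->available < (long) size) ? s->available : (long) size;
    memset(buf, 'x', (size_t) n);
    s->available -= n;
    return n;
}

/* Reads up to *remaining body bytes and throws them away. */
int read_and_discard(recv_fn recv_cb, void *ctx, long *remaining)
{
    char   buffer[DISCARD_BUFFER_SIZE];
    size_t want;
    long   n;

    assert(recv_cb != NULL && remaining != NULL);

    while (*remaining > 0) {
        want = (*remaining < (long) sizeof(buffer))
                   ? (size_t) *remaining : sizeof(buffer);

        n = recv_cb(ctx, buffer, want);

        if (n == RECV_WOULD_BLOCK) {
            return DISCARD_AGAIN;    /* wait for the next read event */
        }
        if (n <= 0) {
            return DISCARD_ERROR;    /* closed or failed (simplified) */
        }

        *remaining -= n;             /* the data itself is never used */
    }

    return DISCARD_OK;
}
```

With 8000 bytes expected and 5000 available, the first call returns DISCARD_AGAIN with 3000 bytes outstanding; once the rest arrives, a second call returns DISCARD_OK.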
Now look at the read event handler ngx_http_discarded_request_body_handler, which is invoked on every read event. First, its source code:

    void
    ngx_http_discarded_request_body_handler(ngx_http_request_t *r)
    {
        ...

        c = r->connection;
        rev = c->read;

        if (rev->timedout) {
            c->timedout = 1;
            c->error = 1;
            ngx_http_finalize_request(r, NGX_ERROR);
            return;
        }

        if (r->lingering_time) {
            timer = (ngx_msec_t) (r->lingering_time - ngx_time());
            if (timer <= 0) {
                r->discard_body = 0;
                r->lingering_close = 0;
                ngx_http_finalize_request(r, NGX_ERROR);
                return;
            }
        } else {
            timer = 0;
        }

        rc = ngx_http_read_discarded_request_body(r);
        if (rc == NGX_OK) {
            r->discard_body = 0;
            r->lingering_close = 0;
            ngx_http_finalize_request(r, NGX_DONE);
            return;
        }

        /* rc == NGX_AGAIN */
        if (ngx_handle_read_event(rev, 0) != NGX_OK) {
            c->error = 1;
            ngx_http_finalize_request(r, NGX_ERROR);
            return;
        }

        if (timer) {
            clcf = ngx_http_get_module_loc_conf(r, ngx_http_core_module);
            timer *= 1000;
            if (timer > clcf->lingering_timeout) {
                timer = clcf->lingering_timeout;
            }
            ngx_add_timer(rev, timer);
        }
    }

The function starts by handling a read event timeout. The timer was deleted in ngx_http_discard_request_body(), so when is it set again? The answer: when Nginx has finished processing the request but has not yet fully discarded its request body (the client may not have finished sending it), ngx_http_finalize_connection() checks whether there is still request body left to discard and, if so, adds a timer to the read event. The timer duration is specified by the lingering_timeout directive and defaults to 5 seconds. This is only the timeout between two successive read events; the total time spent waiting for the request body is specified by the lingering_time directive and defaults to 30 seconds. In this situation, if the function detects a timeout event, it returns immediately and the connection is closed. The total time spent discarding the request body must also not exceed the lingering_time setting; if it does, the function likewise returns immediately and the connection is closed.
If the read event occurs before the request has been processed, no timeout needs to be handled and no timer is set; the function simply calls ngx_http_read_discarded_request_body() to read and discard the data.
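The interplay of the two limits can be reduced to one small calculation: the next timer is the time left until the overall deadline, capped by the per-event timeout. The function lingering_timer_ms and its parameters are hypothetical names for this sketch, which follows the arithmetic of the handler listed above under the assumption that the deadline is kept in seconds and the timeout in milliseconds.

```c
#include <assert.h>

/* Returns the timer to arm for the next read event, in milliseconds,
   or 0 if the overall lingering deadline has already passed.
   deadline_sec and now_sec are absolute times in seconds. */
unsigned long lingering_timer_ms(long deadline_sec, long now_sec,
                                 unsigned long lingering_timeout_ms)
{
    long          left_sec;
    unsigned long timer;

    /* a zero per-event timeout would make the cap meaningless */
    assert(lingering_timeout_ms > 0);

    left_sec = deadline_sec - now_sec;
    if (left_sec <= 0) {
        return 0;                        /* give up and close */
    }

    timer = (unsigned long) left_sec * 1000;

    if (timer > lingering_timeout_ms) {
        timer = lingering_timeout_ms;    /* cap by the per-event timeout */
    }

    return timer;
}
```

With the default values from the text (30 seconds overall, 5 seconds between events), the timer is 5000 ms for most of the lingering period and shrinks only in the last few seconds before the deadline.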
