FastDFS-Nginx extension module source code analysis

Source: Internet
Author: User

1. Background

In most business scenarios, files stored in FastDFS also need to be served for download over http. Although the FastDFS storage and tracker servers have a built-in http service, its performance is unsatisfactory.
In later versions, Yu Qing added extension modules for the current mainstream web servers (nginx and apache). The idea is to let the web server serve the data files of the local storage directly over http, which improves file download performance.

 

2. Overview

The FastDFS architecture is not covered in detail here; interested readers can refer to: http://code.google.com/p/fastdfs/wiki/Overview

2.1 reference architecture

The reference architecture for integrating Nginx with FastDFS is as follows:

 

Note: Nginx and the FastDFS extension module are deployed on every storage server host. The module provides the http download service for the files of that storage. If a file cannot be found on the current storage node, the module redirects or proxies the request to the source storage host.
Note: The tracker in the figure may be a cluster of several trackers. The FastDFS Nginx extension module currently supports multiple groups on a single machine.

 

2.2 concepts

storage_id: The storage server id. Starting from FastDFS 4.x, the tracker can define a set of ip-to-id mappings for storage servers and manage them by id. The id, rather than the storage ip address, is then written into the file name, which is a great advantage for data migration.
storage_sync_file_max_delay: The maximum delay for synchronizing a file between storage nodes, used as a threshold. If the gap between the current time and the file creation time exceeds this value, synchronization is considered complete.
anti_steal_token: The anti-leech switch for file IDs. FastDFS uses token authentication to perform the anti-leech check.

 

3. Implementation principle

3.1 source code package description

The downloaded source code package is small and only contains the following files:

    ngx_http_fastdfs_module.c // nginx module interface implementation, the entry point into the fastdfs-module core logic
    common.c // fastdfs-module core module, which implements the main logic for initialization and file download
    common.h // header file for common.c
    config // configuration used when compiling the module; defines some important constants, such as the extension configuration file path and the file download chunk size
    mod_fastdfs.conf // demo of the extension configuration file

 

3.2 Initialization

3.2.1 load the configuration file

Target file: /etc/fdfs/mod_fastdfs.conf

3.2.2 read extension module configuration

Some important parameters include:

    group_count // number of groups
    url_have_group_name // whether the url contains the group name
    store_path // storage path for the group
    connect_timeout // connection timeout
    network_timeout // receive or send timeout
    storage_server_port // storage server port, used to connect to the source storage to download a file (this approach is outdated)
    response_mode // response mode, proxy or redirect
    load_fdfs_parameters_from_tracker // whether to load server configurations from the tracker

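
As a rough illustration, a configuration fragment using the parameters above could look like the following. All values are placeholders chosen for this example, not recommended settings, and the numbered key store_path0 is an assumption about how the store path is spelled in the file; check the mod_fastdfs.conf demo shipped with the module for the exact keys.

```
# Illustrative values only; adjust for the actual deployment.
connect_timeout = 2
network_timeout = 30
load_fdfs_parameters_from_tracker = true
storage_server_port = 23000
url_have_group_name = true
response_mode = proxy
group_count = 0
# Assumed key name for the store path of the group.
store_path0 = /path/to/fastdfs/storage
```
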
3.2.3 load server configurations

Whether the server configuration is obtained from the tracker is decided by the load_fdfs_parameters_from_tracker parameter.

  • load_fdfs_parameters_from_tracker = true:
  • load_fdfs_parameters_from_tracker = false:

 

3.3 download process

3.3.1 parse the access path

Parse the request url to obtain the group and file_id_without_group parameters.
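
A minimal sketch of this parsing step, assuming the url carries the group name as its first path segment (url_have_group_name enabled) and looks like /group1/M00/00/00/abc.jpg. The function name split_group_and_file_id is hypothetical and is not taken from the module source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from the module source.
 * Splits "/group1/M00/00/00/abc.jpg" into
 *   group   = "group1"
 *   file_id = "M00/00/00/abc.jpg"  (file_id_without_group)
 * Returns 0 on success, -1 if the path is malformed or a buffer is too small. */
int split_group_and_file_id(const char *uri,
        char *group, size_t group_size,
        char *file_id, size_t file_id_size)
{
    const char *start;
    const char *slash;
    size_t group_len;
    size_t file_id_len;

    assert(uri != NULL && group != NULL && file_id != NULL);

    start = (*uri == '/') ? uri + 1 : uri;  /* skip the leading '/' */
    slash = strchr(start, '/');             /* end of the group name */
    if (slash == NULL)
    {
        return -1;
    }

    group_len = (size_t)(slash - start);
    file_id_len = strlen(slash + 1);
    if (group_len == 0 || group_len >= group_size ||
        file_id_len == 0 || file_id_len >= file_id_size)
    {
        return -1;
    }

    memcpy(group, start, group_len);
    group[group_len] = '\0';
    memcpy(file_id, slash + 1, file_id_len + 1);  /* copies the '\0' too */
    return 0;
}
```

A caller would pass the request path together with two buffers and treat a return value of -1 as a bad request.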

 

3.3.2 anti-leech check
  • Based on the g_http_params.anti_steal_token setting (see the http.conf file), determine whether to perform the anti-leech check;
  • Anti-leech protection is implemented with a token. The request must carry a token, and the token is time-limited (the time is given by the ts parameter);

Check method:

md5(fileid_without_group + privKey + ts) == token, and ts must not exceed the ttl range (see CommonProtocol in the Java client)

Function called: fdfs_http_check_token
For more information about FastDFS anti-leech, see: http://bbs.chinaunix.net/thread-1916999-1-1.html
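
The two non-hash parts of this check can be sketched as follows. This is not the body of fdfs_http_check_token: the function names are hypothetical, the MD5 step itself is left out (a real MD5 implementation would hash the buffer and the hex digest would be compared with the token), and appending ts as a decimal number is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds the string that is hashed,
 * file_id + secret_key + ts.
 * Assumption: ts is appended as a decimal number.
 * Returns the number of characters written, or -1 if buf is too small. */
int build_token_input(const char *file_id, const char *secret_key,
        long ts, char *buf, size_t buf_size)
{
    int len;

    assert(file_id != NULL && secret_key != NULL && buf != NULL);

    len = snprintf(buf, buf_size, "%s%s%ld", file_id, secret_key, ts);
    if (len < 0 || (size_t)len >= buf_size)
    {
        return -1;
    }
    return len;
}

/* Hypothetical helper: returns 1 while the token timestamp is still
 * inside the ttl window, 0 once it has expired. */
int token_ts_is_fresh(long ts, long now, long ttl)
{
    return (now - ts) <= ttl;
}
```

A request passes only if both conditions hold: the digest of the assembled string equals the token, and the timestamp is still fresh.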

 

3.3.3 getting file metadata

Obtain the metadata based on the file ID, including: source storage ip address, file path, name, and size.
Code:

    if ((result=fdfs_get_file_info_ex1(file_id, false, &file_info)) != 0)...

The implementation of fdfs_get_file_info_ex1 contains a clever piece of logic:
After obtaining the ip field of the file, it still has to determine whether that field is a storage id or an ip address.
Code:

    fdfs_shared_func.c -> fdfs_get_server_id_type(ip_addr.s_addr) == FDFS_ID_TYPE_SERVER_ID
    ...
    if (id > 0 && id <= FDFS_MAX_SERVER_ID)
    {
        return FDFS_ID_TYPE_SERVER_ID;
    }
    else
    {
        return FDFS_ID_TYPE_IP_ADDRESS;
    }

 

The check tests whether the integer value of the ip field lies in the range from 0 to FDFS_MAX_SERVER_ID (see tracker_types.h).
FDFS_MAX_SERVER_ID = (1 << 24) - 1. This method exploits a property of ipv4 addresses (4*8 binary digits): as long as the leading byte is non-zero, the integer value of an ipv4 address is at least 1 << 24 and therefore greater than the threshold.
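
The threshold argument can be checked with a small standalone sketch. The names below (SKETCH_MAX_SERVER_ID, ipv4_value, classify_id_field) are made up for this example, and the address is packed with its first octet as the most significant byte, which is an assumption about how the value is interpreted.

```c
#include <assert.h>
#include <stdint.h>

/* Same threshold as described above: (1 << 24) - 1. */
#define SKETCH_MAX_SERVER_ID ((1u << 24) - 1)

enum { SKETCH_TYPE_SERVER_ID = 1, SKETCH_TYPE_IP_ADDRESS = 2 };

/* Packs a.b.c.d into one 32-bit value, first octet most significant. */
uint32_t ipv4_value(unsigned a, unsigned b, unsigned c, unsigned d)
{
    assert(a <= 255 && b <= 255 && c <= 255 && d <= 255);
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8) | (uint32_t)d;
}

/* A value in 1..SKETCH_MAX_SERVER_ID is treated as a server id;
 * anything else is treated as an ip address. */
int classify_id_field(uint32_t value)
{
    if (value > 0 && value <= SKETCH_MAX_SERVER_ID)
    {
        return SKETCH_TYPE_SERVER_ID;
    }
    return SKETCH_TYPE_IP_ADDRESS;
}
```

For example, the smallest address with a non-zero first octet, 1.0.0.0, packs to 1 << 24, which is already one above the threshold.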

3.3.4 check whether the local file exists

Call trunk_file_stat_ex1 to obtain the local file information. It is used as follows:

Code:

    if (bSameGroup)
    {
        FDFSTrunkHeader trunkHeader;
        if ((result=trunk_file_stat_ex1(pStorePaths, store_path_index, \
            true_filename, filename_len, &file_stat, \
            &trunkInfo, &trunkHeader, &fd)) != 0)
        {
            bFileExists = false;
        }
        else
        {
            bFileExists = true;
        }
    }
    else
    {
        bFileExists = false;
        memset(&trunkInfo, 0, sizeof(trunkInfo));
    }

3.3.5 processing of nonexistent files
  • Check Validity

Check items include:

A. If the source storage is the local host, or the gap between the current time and the file creation time already exceeds the threshold, an error is returned;

Code:

    if (is_local_host_ip(file_info.source_ip_addr) || \
        (file_info.create_timestamp > 0 && (time(NULL) - \
            file_info.create_timestamp > storage_sync_file_max_delay)))

 

B. If the request has already been redirected once, the same error is returned;
A request that was redirected from another storage node carries a redirect item in its url parameters.


Once the validity check has passed, the request is proxied or redirected.
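
Checks A and B can be combined into one decision function, sketched below. The function and enum names are hypothetical, and the inputs are passed in as plain values rather than read from the module's own structures.

```c
#include <assert.h>
#include <time.h>

enum missing_file_action
{
    FORWARD_TO_SOURCE = 0,  /* proxy or redirect to the source storage */
    REJECT_REQUEST = 1      /* return an error to the client */
};

/* Hypothetical sketch of the validity check for a file that does not
 * exist locally. Parameters mirror the conditions described above. */
enum missing_file_action decide_on_missing_file(int source_is_local,
        time_t create_timestamp, time_t now,
        time_t sync_max_delay, int already_redirected)
{
    assert(sync_max_delay >= 0);

    /* A: this host is the source, so there is nowhere else to look. */
    if (source_is_local)
    {
        return REJECT_REQUEST;
    }
    /* A: the file is old enough that synchronization should be done. */
    if (create_timestamp > 0 && now - create_timestamp > sync_max_delay)
    {
        return REJECT_REQUEST;
    }
    /* B: the request was already redirected once. */
    if (already_redirected)
    {
        return REJECT_REQUEST;
    }
    return FORWARD_TO_SOURCE;
}
```

Rejecting an already redirected request is what prevents two storage nodes from bouncing a request back and forth.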

  • Redirection Mode

With the configuration item response_mode = redirect, the server returns a 302 response code with the following url:

http://{source storage address}:{current port}{current url}{parameter "redirect=1"} (the parameter marks the request as already redirected)

 

Code:

    response.redirect_url_len = snprintf( \
            response.redirect_url, \
            sizeof(response.redirect_url), \
            "http://%s%s%s%s%c%s", \
            file_info.source_ip_addr, port_part, \
            path_split_str, url, \
            param_split_char, "redirect=1");

 

Note: In this mode, the source storage must be configured with a publicly accessible webserver, the same port (usually 80), and the same path.
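
The snippet above does not show how port_part, path_split_str and param_split_char get their values. The sketch below fills them in under stated assumptions: the port is omitted when it is 80, a "/" is inserted only when the url lacks a leading one, and the parameter separator is '&' when the url already has a query string and '?' otherwise. The function build_redirect_url is hypothetical, not the module's code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch: assembles the redirect url with the same format
 * string as the module snippet. Returns the url length, or -1 if the
 * output buffer is too small. */
int build_redirect_url(const char *source_ip, int port, const char *url,
        char *out, size_t out_size)
{
    char port_part[16];
    const char *path_split_str;
    char param_split_char;
    int len;

    assert(source_ip != NULL && url != NULL && out != NULL);

    /* Assumption: the default http port is left out of the url. */
    if (port == 80)
    {
        port_part[0] = '\0';
    }
    else
    {
        snprintf(port_part, sizeof(port_part), ":%d", port);
    }

    /* Assumption: add a '/' only when the url does not start with one. */
    path_split_str = (url[0] == '/') ? "" : "/";

    /* Assumption: append with '&' if a query string is already present. */
    param_split_char = (strchr(url, '?') != NULL) ? '&' : '?';

    len = snprintf(out, out_size, "http://%s%s%s%s%c%s",
            source_ip, port_part, path_split_str, url,
            param_split_char, "redirect=1");
    if (len < 0 || (size_t)len >= out_size)
    {
        return -1;
    }
    return len;
}
```

With a token-protected url the existing token and ts parameters stay in place and redirect=1 is appended after them.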

  • Proxy Mode

With the configuration item response_mode = proxy, the module works much like a reverse proxy: only the source storage address is used as the proxy host, and everything else in the request stays unchanged.
Code:

    if (pContext->proxy_handler != NULL)
    {
        return pContext->proxy_handler(pContext->arg, \
                file_info.source_ip_addr);
    }

    // proxy_handler comes from the ngx_http_fastdfs_module.c file
    // (ngx_http_fastdfs_proxy_handler). Its implementation sets a large
    // number of callbacks and variables and finally issues the proxy request:
    rc = callback(r, ngx_http_upstream_init); // executes the proxy request and returns the result

3.3.6 output local files
When the file exists locally, it is output directly.
  • Depending on whether the file is a trunk file, obtain the full file name, the file name length, and the file offset;

Code:

    bTrunkFile = IS_TRUNK_FILE_BY_ID(trunkInfo);
    if (bTrunkFile)
    {
        trunk_get_full_filename_ex(pStorePaths, &trunkInfo, \
                full_filename, sizeof(full_filename));
        full_filename_len = strlen(full_filename);
        file_offset = TRUNK_FILE_START_OFFSET(trunkInfo) + \
                pContext->range.start;
    }
    else
    {
        full_filename_len = snprintf(full_filename, \
                sizeof(full_filename), "%s/data/%s", \
                pStorePaths->paths[store_path_index], \
                true_filename);
        file_offset = pContext->range.start;
    }

 

  • If nginx has the send_file switch enabled and the file is not a trunk file, try to use sendfile to optimize performance;

Code:

    if (pContext->send_file != NULL && !bTrunkFile)
    {
        http_status = pContext->if_range ? \
                HTTP_PARTIAL_CONTENT : HTTP_OK;
        OUTPUT_HEADERS(pContext, (&response), http_status)
        ......
        return pContext->send_file(pContext->arg, full_filename, \
                full_filename_len, file_offset, download_bytes);
    }

 

  • Otherwise, use lseek to position within the file and output the requested segment;

Approach: read the file in chunks in a loop and output each chunk...
Code:

    while (remain_bytes > 0)
    {
        read_bytes = remain_bytes <= FDFS_OUTPUT_CHUNK_SIZE ? \
                 remain_bytes : FDFS_OUTPUT_CHUNK_SIZE;
        if (read(fd, file_trunk_buff, read_bytes) != read_bytes)
        {
            close(fd);
            ......
            return HTTP_INTERNAL_SERVER_ERROR;
        }

        remain_bytes -= read_bytes;
        if (pContext->send_reply_chunk(pContext->arg, \
            (remain_bytes == 0) ? 1: 0, file_trunk_buff, \
            read_bytes) != 0)
        {
            close(fd);
            return HTTP_INTERNAL_SERVER_ERROR;
        }
    }

 

The chunk size is set in the config file: -DFDFS_OUTPUT_CHUNK_SIZE='256*1024'

4. Extended reading

Anti-leech based on Referer:
http://www.cnblogs.com/wJiang/archive/2010/04/04/1704445.html

FastDFS FAQ:
http://bbs.chinaunix.net/thread-1920470-1-1.html

FastDFS-Nginx extension configuration reference:
http://blog.csdn.net/poechant/article/details/7036594

FastDFS configuration and deployment materials (CSDN blog):
http://blog.csdn.net/poechant/article/details/6996047

Differences between open and fopen in C:
http://blog.csdn.net/hairetz/article/details/4150193
