PHP TSRM (Thread Safe Resource Manager): Source Code Analysis

Source: Internet
Author: User
Tags php source code zts zend
When reading the PHP source code or developing PHP extensions, you will see a large number of TSRMLS_ macros in function parameter lists. These macros belong to Zend Thread Safety (ZTS), the Zend mechanism for thread safety. They exist because, when the PHP interpreter is loaded and executed as a module in a multi-threaded environment, threads could otherwise read inconsistent data from shared internal resources. TSRM is the solution PHP provides for this.

When do I need to use TSRM

TSRM must be enabled whenever the server is multi-threaded and PHP runs inside it as a module. A typical case is Apache with the worker MPM (multiple processes, each running multiple threads), which requires the thread-safe build of PHP, that is, a build with TSRM enabled. On Linux, thread safety is chosen when PHP is compiled (in PHP 5.x, with the --enable-maintainer-zts configure option). On Windows, PHP is distributed as separate thread-safe and non-thread-safe builds.

How PHP implements TSRM

In a typical multi-threaded program, access to shared resources is protected with a mutex. PHP does not lock on every access, because locking costs performance. Instead, TSRM gives each thread its own copy of every public resource registered in the PHP kernel. Each thread points to its own public resource area and operates only on its own copy, so threads do not affect each other.

What are public resources?

They are the global data of the PHP kernel and of extensions, defined as structs.

TSRM data structures

tsrm_tls_entry is the per-thread structure; each thread has its own instance.

    typedef struct _tsrm_tls_entry tsrm_tls_entry;

    struct _tsrm_tls_entry {
        void **storage;
        int count;
        THREAD_T thread_id;
        tsrm_tls_entry *next;
    };

    /* Head of the thread table (an array of pointers to thread entries) */
    static tsrm_tls_entry **tsrm_tls_table = NULL;

    /* Number of slots in the thread table */
    static int tsrm_tls_table_size;

Field description

void **storage: pointer to the thread's own public resource area
int count: number of resources, that is, how many public resources the PHP kernel and the extensions have registered
THREAD_T thread_id: the thread ID
tsrm_tls_entry *next: pointer to the next thread entry. The thread table works like a hash table, so next is the chain used to resolve hash collisions.

tsrm_resource_type is the public resource type structure; there is one for every registered public resource.

    typedef struct {
        size_t size;
        ts_allocate_ctor ctor;
        ts_allocate_dtor dtor;
        int done;
    } tsrm_resource_type;

    /* Head of the public resource type table */
    static tsrm_resource_type *resource_types_table = NULL;

    /* Current size of the public resource type table */
    static int resource_types_table_size;

Field description

size_t size: size of the resource
ts_allocate_ctor ctor: constructor pointer, called when the resource is created for a thread
ts_allocate_dtor dtor: destructor pointer, called when the resource is freed
int done: whether the resource has been destroyed (0: normal, 1: destroyed)

Global resource ID

    typedef int ts_rsrc_id;

    static ts_rsrc_id id_count;

What is the global resource ID

When TSRM registers a public resource, it generates a unique ID for that resource. From then on, the resource is fetched by specifying that ID.

Why a global resource ID is required

Each thread holds its own copy of every registered public resource, reached through a malloc()'ed array of pointers. The resource ID is the index into that array, so fetching a resource means specifying its resource ID.

Put simply:
TSRM makes each thread point to its own collection of public resources (an array), and the resource ID locates the one you want within that collection. In a non-thread-safe build the resources are not gathered into such an array; each one is simply accessed directly by its variable name.
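The indexing idea can be sketched as follows. This is a minimal illustration, not the TSRM code itself: idx_area, idx_fetch, idx_shuffle and idx_unshuffle are hypothetical names. The +1/-1 offset imitates what the shuffle and unshuffle macros used in ts_allocate_id appear to do, under the assumption that public IDs start at 1 and 0 is kept as a special value (this sketch simply rejects it).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers: public IDs start at 1, array indexes at 0. */
static int idx_shuffle(int index) { return index + 1; }
static int idx_unshuffle(int id)  { return id - 1; }

/* One thread's private resource area: an array of pointers. */
typedef struct {
    void **storage;   /* storage[i] points to resource number i */
    int    count;     /* number of valid slots in storage */
} idx_area;

/* Return the resource registered under `id` for this thread,
 * or NULL when the ID is out of range. */
static void *idx_fetch(idx_area *area, int id)
{
    int index = idx_unshuffle(id);

    assert(area != NULL);
    if (index < 0 || index >= area->count) {
        return NULL;
    }
    return area->storage[index];
}
```

Two threads that hold different idx_area values can use the same ID and still reach different memory, which is the property the article describes.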

Approximate execution process

During kernel initialization, TSRM is initialized, the public resources of the kernel are registered, and then the public resources of the external extensions are registered.

When a thread reaches the entry point of the PHP interpreter, the public resource data of that thread is initialized.

Whichever public resource is needed is then fetched through its resource ID.
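The three steps above can be sketched in a few lines of plain C. This is a simplified, single-threaded model under stated assumptions, not the real implementation: all flow_* names are hypothetical, the type table has a fixed capacity, and "threads" are represented only by their storage arrays.

```c
#include <assert.h>
#include <stdlib.h>

#define FLOW_MAX_TYPES 8

typedef void (*flow_ctor)(void *resource);

/* Step 1: process-wide table describing every registered resource type. */
typedef struct { size_t size; flow_ctor ctor; } flow_type;
static flow_type flow_types[FLOW_MAX_TYPES];
static int flow_type_count = 0;

/* Register a resource type and return its ID (IDs start at 1). */
static int flow_register(size_t size, flow_ctor ctor)
{
    assert(flow_type_count < FLOW_MAX_TYPES);
    flow_types[flow_type_count].size = size;
    flow_types[flow_type_count].ctor = ctor;
    return ++flow_type_count;
}

/* Step 2: build one thread's private copy of every registered type. */
static void **flow_thread_init(void)
{
    void **storage = malloc(sizeof(void *) * FLOW_MAX_TYPES);
    int i;

    assert(storage != NULL);
    for (i = 0; i < flow_type_count; i++) {
        storage[i] = calloc(1, flow_types[i].size);
        assert(storage[i] != NULL);
        if (flow_types[i].ctor) {
            flow_types[i].ctor(storage[i]);
        }
    }
    return storage;
}

/* Step 3: fetch a resource from a thread's area by ID. */
static void *flow_get(void **storage, int id)
{
    assert(id >= 1 && id <= flow_type_count);
    return storage[id - 1];
}

/* Release one thread's area. */
static void flow_thread_free(void **storage)
{
    int i;

    for (i = 0; i < flow_type_count; i++) {
        free(storage[i]);
    }
    free(storage);
}

/* Example resource: a counter that every thread starts at 100. */
typedef struct { int counter; } flow_globals;

static void flow_globals_ctor(void *resource)
{
    ((flow_globals *)resource)->counter = 100;
}
```

Registering flow_globals once and calling flow_thread_init() twice yields two independent counters: changing one through flow_get() leaves the other at its constructed value.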

[Figure: TSRM initialization structure diagram]


TSRM source file paths

    /php-5.3.27/TSRM/TSRM.c
    /php-5.3.27/TSRM/TSRM.h

Main TSRM functions

Initialize TSRM

    tsrm_startup()

Register a public resource

    ts_allocate_id()

Fetch the current thread's resource area, creating and initializing it with all registered public resources if it does not exist yet; returns the &storage pointer

    #define TSRMLS_FETCH() void ***tsrm_ls = (void ***) ts_resource_ex(0, NULL)

Fetch a specific resource by its resource ID

    #define ts_resource(id) ts_resource_ex(id, NULL)

Initialize the current thread and create its copy of the registered public resources behind the storage pointer

    allocate_new_resource()

Some common TSRM macro definitions

    #ifdef ZTS
    #define TSRMLS_D    void ***tsrm_ls
    #define TSRMLS_DC   , TSRMLS_D
    #define TSRMLS_C    tsrm_ls
    #define TSRMLS_CC   , TSRMLS_C
    #else
    #define TSRMLS_D    void
    #define TSRMLS_DC
    #define TSRMLS_C
    #define TSRMLS_CC
    #endif

As you can see, when TSRM is enabled, ZTS is defined and this set of macros takes its thread-safe definitions. The macros that appear so often in the parameter lists of extension functions are then replaced with a void ***tsrm_ls pointer. In effect, the thread calling the function passes in the address of its own public resource area (&storage), so that code inside the function reaches the public resources of the right thread.

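Typical TSRM calling pattern

Fetch

    TSRMLS_FETCH() declares and initializes void ***tsrm_ls

As written in the source (TSRMLS_DC in declarations, TSRMLS_CC in calls)

    declarations:  test(int a TSRMLS_DC), test_1(int b TSRMLS_DC)
    calls:         test(a TSRMLS_CC), test_1(b TSRMLS_CC)

After macro replacement

    declarations:  test(int a, void ***tsrm_ls), test_1(int b, void ***tsrm_ls)
    calls:         test(a, tsrm_ls), test_1(b, tsrm_ls)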
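To see the declaration-side and call-side macros work together, here is a small self-contained sketch. It does not use the PHP headers: DEMO_ZTS, the DEMO_LS_* macros, DEMO_G and demo_globals are hypothetical stand-ins for ZTS, the TSRMLS_* macros and a module globals accessor, and slot 0 of the storage array is assumed to hold the globals.

```c
#include <assert.h>

/* Define DEMO_ZTS to mimic a thread-safe build; remove the line to
 * compile the same functions without the extra parameter. */
#define DEMO_ZTS 1

typedef struct { int calls; } demo_globals;

#ifdef DEMO_ZTS
# define DEMO_LS_D   void ***demo_ls     /* sole parameter, declaration */
# define DEMO_LS_DC  , DEMO_LS_D         /* trailing parameter, declaration */
# define DEMO_LS_C   demo_ls             /* sole argument, call */
# define DEMO_LS_CC  , DEMO_LS_C         /* trailing argument, call */
/* Globals live in slot 0 of the calling thread's storage array. */
# define DEMO_G(v)   (((demo_globals *) (*demo_ls)[0])->v)
#else
static demo_globals demo_plain_globals;
# define DEMO_LS_D   void
# define DEMO_LS_DC
# define DEMO_LS_C
# define DEMO_LS_CC
/* Globals are one ordinary variable shared by the whole process. */
# define DEMO_G(v)   (demo_plain_globals.v)
#endif

/* Declaration side: with DEMO_ZTS this is test_1(int b, void ***demo_ls). */
static int test_1(int b DEMO_LS_DC)
{
    DEMO_G(calls)++;
    return b * 2;
}

/* Call side: with DEMO_ZTS the call below is test_1(a + 1, demo_ls). */
static int test(int a DEMO_LS_DC)
{
    DEMO_G(calls)++;
    return test_1(a + 1 DEMO_LS_CC);
}
```

The function bodies are identical in both builds; only the macro definitions decide whether the per-thread pointer is threaded through every call.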

How TSRM releases resources

As mentioned above, under the Apache worker MPM one process starts multiple threads that invoke the PHP interpreter. When a thread finishes, the resource data created for it is not destroyed immediately, because the thread may soon be used again and can then reuse the data directly instead of re-initializing all of its public resources. When the process ends, TSRM traverses all the thread entries and releases each of them together with its resource data.
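The traversal at process end can be sketched like this. It is a simplified model, assuming a table of hash buckets whose entries are chained through next, as in tsrm_tls_entry; the rel_* names are hypothetical, and destructors here take only the resource pointer.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*rel_dtor)(void *resource);

typedef struct rel_entry {
    void **storage;            /* this thread's resources */
    int count;                 /* number of slots in storage */
    struct rel_entry *next;    /* next thread in the same bucket */
} rel_entry;

/* Counting destructor used only to observe the sketch. */
static int rel_dtor_calls = 0;
static void rel_count_dtor(void *resource)
{
    (void)resource;
    rel_dtor_calls++;
}

/* Free every thread entry in every bucket. dtors[j] is the destructor
 * for resource j (may be NULL). Returns the number of entries released. */
static int rel_shutdown(rel_entry **table, int table_size,
                        const rel_dtor *dtors)
{
    int i, j, released = 0;

    assert(table != NULL);
    for (i = 0; i < table_size; i++) {
        rel_entry *p = table[i];
        while (p) {
            rel_entry *next = p->next;   /* save before freeing p */
            for (j = 0; j < p->count; j++) {
                if (dtors && dtors[j]) {
                    dtors[j](p->storage[j]);
                }
                free(p->storage[j]);
            }
            free(p->storage);
            free(p);
            released++;
            p = next;
        }
        table[i] = NULL;
    }
    return released;
}

/* Helper for the sketch: create an entry holding `count` int resources. */
static rel_entry *rel_new_entry(int count, rel_entry *next)
{
    rel_entry *e = malloc(sizeof(*e));
    int j;

    assert(e != NULL);
    e->storage = malloc(sizeof(void *) * (count > 0 ? count : 1));
    assert(e->storage != NULL);
    for (j = 0; j < count; j++) {
        e->storage[j] = calloc(1, sizeof(int));
        assert(e->storage[j] != NULL);
    }
    e->count = count;
    e->next = next;
    return e;
}
```

Saving p->next before freeing p is the detail that makes walking a chain while destroying it safe.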

Annotated source code

tsrm_startup function description

    TSRM_API int tsrm_startup(int expected_threads, int expected_resources, int debug_level, char *debug_filename)
    {
        /* ... omitted ... */

        /* Default number of threads */
        tsrm_tls_table_size = expected_threads;

        /* Create the array of tsrm_tls_entry pointers */
        tsrm_tls_table = (tsrm_tls_entry **) calloc(tsrm_tls_table_size, sizeof(tsrm_tls_entry *));

        /* ... omitted ... */

        /* Initialize the global resource ID counter */
        id_count = 0;

        /* Default number of resource types */
        resource_types_table_size = expected_resources;

        /* ... omitted ... */

        /* Create the array of tsrm_resource_type structures */
        resource_types_table = (tsrm_resource_type *) calloc(resource_types_table_size, sizeof(tsrm_resource_type));

        /* ... omitted ... */

        return 1;
    }

This function is generally called while the PHP kernel initializes. To save memory, it allocates the tables for an expected number of threads and an expected number of resource types. The resource type table is enlarged later if more resources are registered than expected (see ts_allocate_id).

ts_allocate_id Function Description

    TSRM_API ts_rsrc_id ts_allocate_id(ts_rsrc_id *rsrc_id, size_t size, ts_allocate_ctor ctor, ts_allocate_dtor dtor)
    {
        int i;

        /* ... omitted ... */

        /* Generate a unique ID for the current resource */
        *rsrc_id = TSRM_SHUFFLE_RSRC_ID(id_count++);
        TSRM_ERROR((TSRM_ERROR_LEVEL_CORE, "Obtained resource id %d", *rsrc_id));

        /* If the resource type table is smaller than the current number of
         * resources, enlarge the resource type table */
        if (resource_types_table_size < id_count) {
            resource_types_table = (tsrm_resource_type *) realloc(resource_types_table, sizeof(tsrm_resource_type)*id_count);
            /* ... omitted ... */
            resource_types_table_size = id_count;
        }

        /* Store the size, constructor and destructor of the public resource */
        resource_types_table[TSRM_UNSHUFFLE_RSRC_ID(*rsrc_id)].size = size;
        resource_types_table[TSRM_UNSHUFFLE_RSRC_ID(*rsrc_id)].ctor = ctor;
        resource_types_table[TSRM_UNSHUFFLE_RSRC_ID(*rsrc_id)].dtor = dtor;
        resource_types_table[TSRM_UNSHUFFLE_RSRC_ID(*rsrc_id)].done = 0;

        /* Iterate over all thread entries and give each one the newly
         * registered resource in the memory its storage points to */
        for (i=0; i<tsrm_tls_table_size; i++) {
            tsrm_tls_entry *p = tsrm_tls_table[i];

            /* Case 1: p may be NULL, because no thread has called
             * TSRMLS_FETCH() yet. resource_types_table then only records the
             * size of the resource for now; when the thread entry is
             * initialized later, the memory for the public resource is
             * created and assigned to storage.
             *
             * Case 2: the thread entry has already been initialized.
             * p->storage is enlarged to the new number of resource IDs
             * (IDs are incremental), memory for each missing resource is
             * malloc'ed according to its size, and its ctor is called. */
            while (p) {
                if (p->count < id_count) {
                    int j;

                    p->storage = (void *) realloc(p->storage, sizeof(void *)*id_count);
                    for (j=p->count; j<id_count; j++) {
                        p->storage[j] = (void *) malloc(resource_types_table[j].size);
                        if (resource_types_table[j].ctor) {
                            resource_types_table[j].ctor(p->storage[j], &p->storage);
                        }
                    }
                    /* id_count grows by 1 per registration; it is the total
                     * number of public resources */
                    p->count = id_count;
                }
                /* Move on to the next thread entry in the chain */
                p = p->next;
            }
        }

        /* ... omitted ... */

        /* Return the ID generated above */
        return *rsrc_id;
    }

This function is called whenever a public resource needs to be registered, which in general only happens in a multi-threaded environment. As the code shows, it traverses every thread entry and calls realloc and malloc for each one, so calling it repeatedly costs performance.
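The cost described here comes from extending every live thread on each registration. A stripped-down sketch of that step, with hypothetical grow_* names, no locking, and constructors left out, might look like this:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct grow_entry {
    void **storage;
    int count;
    struct grow_entry *next;
} grow_entry;

/* Sizes of all registered resource types, indexed by (id - 1). */
static size_t *grow_sizes = NULL;
static int grow_id_count = 0;

/* Register a new resource type of `size` bytes and extend the storage of
 * every thread entry that already exists. Returns the new ID (from 1). */
static int grow_allocate_id(grow_entry **table, int table_size, size_t size)
{
    int i, j;
    size_t *sizes = realloc(grow_sizes, sizeof(size_t) * (grow_id_count + 1));

    assert(sizes != NULL);
    grow_sizes = sizes;
    grow_sizes[grow_id_count++] = size;

    for (i = 0; i < table_size; i++) {
        grow_entry *p;
        for (p = table[i]; p != NULL; p = p->next) {
            if (p->count < grow_id_count) {
                void **bigger = realloc(p->storage,
                                        sizeof(void *) * grow_id_count);
                assert(bigger != NULL);
                p->storage = bigger;
                /* Allocate only the slots this thread is missing. */
                for (j = p->count; j < grow_id_count; j++) {
                    p->storage[j] = calloc(1, grow_sizes[j]);
                    assert(p->storage[j] != NULL);
                }
                p->count = grow_id_count;
            }
        }
    }
    return grow_id_count;
}
```

Each call performs one realloc plus one allocation per existing thread, which is why registering resources before any thread has been initialized is the cheaper order.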

TSRMLS_FETCH() - ts_resource_ex function description

    TSRM_API void *ts_resource_ex(ts_rsrc_id id, THREAD_T *th_id)
    {
        THREAD_T thread_id;
        int hash_value;
        tsrm_tls_entry *thread_resources;

        /* ... omitted ... */

        if (tsrm_tls_table) {
            /* Get the current thread ID */
            if (!th_id) {
                /* ... omitted ... */
                thread_id = tsrm_thread_id();
            } else {
                thread_id = *th_id;
            }

            TSRM_ERROR((TSRM_ERROR_LEVEL_INFO, "Fetching resource id %d for thread %ld", id, (long) thread_id));
            tsrm_mutex_lock(tsmm_mutex);

            /* #define THREAD_HASH_OF(thr,ts) (unsigned long)thr%(unsigned long)ts
             *
             * The thread ID modulo the size of the thread table gives the
             * position of the current thread's entry in tsrm_tls_table.
             * If another thread's entry already occupies that position,
             * the entries are linked through next, which is the chaining
             * solution for hash collisions. */
            hash_value = THREAD_HASH_OF(thread_id, tsrm_tls_table_size);
            thread_resources = tsrm_tls_table[hash_value];

            /* If no entry exists, create one for the current thread and
             * create copies of all the public resources registered so far
             * through ts_allocate_id */
            if (!thread_resources) {
                allocate_new_resource(&tsrm_tls_table[hash_value], thread_id);
                return ts_resource_ex(id, &thread_id);
            } else {
                do {
                    /* Is this the entry for our thread ID? */
                    if (thread_resources->thread_id == thread_id) {
                        break;
                    }
                    /* If not, move on to the next entry */
                    if (thread_resources->next) {
                        thread_resources = thread_resources->next;
                    } else {
                        /* No entry for this thread yet: create one */
                        allocate_new_resource(&thread_resources->next, thread_id);
                        return ts_resource_ex(id, &thread_id);
                    }
                } while (thread_resources);
            }

            /* The current thread's entry has been found or created.
             * Return the &storage pointer of the thread's public resource
             * area or, if a resource ID was specified, the storage[id]
             * pointer */
            TSRM_SAFE_RETURN_RSRC(thread_resources->storage, id, thread_resources->count);
        } /* if (tsrm_tls_table) */
    }
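The lookup at the heart of ts_resource_ex is a hash table with chaining, keyed by thread ID. The sketch below isolates that part under simplifying assumptions: thread IDs are plain unsigned long values, there is no locking, and the hash_* names are hypothetical. It also swaps the recursive retry of the original for an iterative walk over pointer-to-link, which is a different but equivalent way to find or append an entry.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct hash_entry {
    unsigned long thread_id;
    struct hash_entry *next;   /* chain for threads sharing a bucket */
} hash_entry;

/* Bucket index: thread ID modulo table size. */
static int hash_bucket(unsigned long thread_id, int table_size)
{
    return (int)(thread_id % (unsigned long)table_size);
}

/* Find the entry for `thread_id`, appending a new one to the bucket's
 * chain when none exists. `created` (optional) reports which happened. */
static hash_entry *hash_find_or_create(hash_entry **table, int table_size,
                                       unsigned long thread_id, int *created)
{
    hash_entry **link;

    assert(table != NULL && table_size > 0);
    link = &table[hash_bucket(thread_id, table_size)];

    /* Walk the chain; stop at a match or at the terminating NULL link. */
    while (*link && (*link)->thread_id != thread_id) {
        link = &(*link)->next;
    }
    if (created) {
        *created = (*link == NULL);
    }
    if (*link == NULL) {
        *link = malloc(sizeof(hash_entry));
        assert(*link != NULL);
        (*link)->thread_id = thread_id;
        (*link)->next = NULL;
    }
    return *link;
}
```

With a table of 4 buckets, thread IDs 5 and 9 both land in bucket 1, so the second one is appended to the chain that starts with the first.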

allocate_new_resource function description

    static void allocate_new_resource(tsrm_tls_entry **thread_resources_ptr, THREAD_T thread_id)
    {
        int i;

        /* thread_resources_ptr may be &tsrm_tls_table[hash_value], or it
         * may be &tsrm_tls_table[hash_value]->next; the latter is the
         * hash collision case */
        (*thread_resources_ptr) = (tsrm_tls_entry *) malloc(sizeof(tsrm_tls_entry));
        (*thread_resources_ptr)->storage = (void *) malloc(sizeof(void *)*id_count);
        (*thread_resources_ptr)->count = id_count;
        (*thread_resources_ptr)->thread_id = thread_id;
        (*thread_resources_ptr)->next = NULL;

        /* Set thread local storage to this new thread resources structure */
        tsrm_tls_set(*thread_resources_ptr);

        if (tsrm_new_thread_begin_handler) {
            tsrm_new_thread_begin_handler(thread_id, &((*thread_resources_ptr)->storage));
        }

        /* This loop takes every resource type in resource_types_table,
         * creates the actual memory according to its size, and assigns it
         * to the current thread's storage. It is needed because, when
         * ts_allocate_id was called, the thread entry may not have been
         * initialized yet, so only the global resource type data was
         * recorded and no actual resource data was created. */
        for (i=0; i<id_count; i++) {
            if (resource_types_table[i].done) {
                (*thread_resources_ptr)->storage[i] = NULL;
            } else {
                (*thread_resources_ptr)->storage[i] = (void *) malloc(resource_types_table[i].size);
                if (resource_types_table[i].ctor) {
                    resource_types_table[i].ctor((*thread_resources_ptr)->storage[i], &(*thread_resources_ptr)->storage);
                }
            }
        }

        /* Call the end handler, if one is set. It copies the configuration
         * information and invokes the callbacks of the configuration
         * entries that have one, to populate the globals in the current
         * thread's storage. */
        if (tsrm_new_thread_end_handler) {
            tsrm_new_thread_end_handler(thread_id, &((*thread_resources_ptr)->storage));
        }
    }

Using TSRM in extensions

Extensions should also be written so that they work with the thread-safe build. The ZTS macro tells whether the current PHP build is thread-safe.

Definition of public resources in extensions:

    /* Define the public resource data; the macros expand to a struct
     * named zend_<module name>_globals */
    ZEND_BEGIN_MODULE_GLOBALS(module_name)
        int id;
        char name;
    ZEND_END_MODULE_GLOBALS(module_name)

    /* Corresponding macro definitions */
    #define ZEND_BEGIN_MODULE_GLOBALS(module_name) \
        typedef struct _zend_##module_name##_globals {
    #define ZEND_END_MODULE_GLOBALS(module_name) \
        } zend_##module_name##_globals;

    /* After macro replacement */
    typedef struct _zend_module_name_globals {
        int id;
        char name;
    } zend_module_name_globals;

Resource ID definition in the extension

    #ifdef ZTS
    #define ZEND_DECLARE_MODULE_GLOBALS(module_name) \
        ts_rsrc_id module_name##_globals_id;
    #else
    #define ZEND_DECLARE_MODULE_GLOBALS(module_name) \
        zend_##module_name##_globals module_name##_globals;
    #endif

(1) Thread-safe build: the macro declares the global resource ID, because each thread uses this ID to reach its resource data in the memory area that storage points to.
(2) Non-thread-safe build: the macro declares the struct variable itself, and the resource is accessed directly by variable name, because no other thread competes for it.

Access to public resource data in the extension

    #ifdef ZTS
    #define MODULE_G(v) TSRMG(xx_globals_id, zend_xx_globals *, v)
    #else
    #define MODULE_G(v) (xx_globals.v)
    #endif

If every access to the resource goes through a MODULE_G() macro that the extension defines itself, then in a thread-safe build the data is fetched for the current thread through TSRM using the resource ID, and in a non-thread-safe build it is read directly from the resource variable.

Initializing public resources in the extension

Public resource data is generally initialized in the extension's MINIT function. In a ZTS build, ts_allocate_id is called there.

    PHP_MINIT_FUNCTION(myextension)
    {
    #ifdef ZTS
        ts_allocate_id(&xx_globals_id, sizeof(zend_xx_globals), ctor, dtor);
    #endif
        return SUCCESS;
    }

End

The above describes how PHP's TSRM thread safety manager is implemented. Understanding TSRM is a great help both when reading the kernel source code and when developing PHP extensions, because the kernel and the extensions are full of TSRMLS_ macros.

