Thread Safety Model Analysis for PHP and the Zend Engine

Source: Internet
Author: User
Tags: data structures, sapi, unique id, zts, zend
Not knowing how this mechanism works always bothered me, so I read the source code and consulted the limited material available to get a basic understanding of it. This article summarizes what I found. It first explains the concept of thread safety and the background of thread safety in PHP, then examines PHP's thread safety mechanism ZTS (Zend Thread Safety) and its implementation, TSRM, including the relevant data structures, implementation details, and operating mechanism. Finally, it looks at how Zend is selectively compiled for single-threaded and multi-threaded environments.

Thread Safety
The thread safety problem is, in a word, how to access shared resources safely in a multi-threaded environment. Each thread has its own private stack but shares the heap of the process that owns it. In C, a variable declared outside any function is a global variable; it is allocated in the process's shared storage, and every thread refers to the same address, so a change made by one thread is visible to all of them. This may look like a convenient way for threads to share data, but PHP usually handles one request per thread, so each thread should have its own copy of a global variable rather than interfering with the others. Early PHP was mostly used in single-threaded environments, where each process starts only one thread, so thread safety was not an issue. Once PHP began to be used in multi-threaded environments, Zend introduced the Zend Thread Safety mechanism (ZTS for short) to keep threads safe.

Basic Principle and Implementation of ZTS
Basic Idea
The basic idea of ZTS is intuitive: if every thread needs its own copy of each global variable, then provide exactly that. In a multi-threaded environment, requesting a global variable is no longer a simple variable declaration. Instead, the process allocates a block of memory on the heap to serve as a "thread global variable pool" and initializes it at process startup. Whenever a thread needs a global variable, it calls TSRM (Thread Safe Resource Manager, the concrete implementation of ZTS) and passes the necessary parameters, such as the variable's size. TSRM allocates a chunk of the pool and returns an identifier that refers to it. From then on, whenever the thread needs to read or write the variable, it passes this unique identifier to TSRM, which performs the actual access. This is how thread-safe global variables are achieved. The following figure gives a schematic diagram of the ZTS principle:
Thread1 and Thread2 belong to the same process, and each needs a global variable Global Var. TSRM allocates a separate area for each of them in the thread global memory pool (the yellow part) and identifies it by a unique ID. The two threads can then access their own variables through TSRM without interfering with each other. The code excerpts below show how Zend implements this mechanism in practice. I use the PHP 5.3.8 source code; the TSRM implementation lives in the "TSRM" directory of the PHP source tree.
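Before turning to the real source, the idea can be illustrated with a toy model. The sketch below is not TSRM code: every toy_* name and the fixed table limits are hypothetical, threads are simulated by plain integer indexes, and no locking is done. It only shows the shape of the scheme, in which a resource is registered once to obtain an ID and each thread then fetches its own private copy by that ID.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_MAX_THREADS   8   /* hypothetical fixed limits, for brevity */
#define TOY_MAX_RESOURCES 8

/* Size of each registered "global", indexed by resource ID. */
static size_t toy_sizes[TOY_MAX_RESOURCES];
/* Per-thread copies, created lazily on first access. */
static void *toy_pool[TOY_MAX_THREADS][TOY_MAX_RESOURCES];
/* Next resource ID to hand out. */
static int toy_id_count = 0;

/* Register a "global" of the given size.
   Returns its ID, or -1 if the table is full. */
int toy_allocate_id(size_t size)
{
    if (toy_id_count >= TOY_MAX_RESOURCES) {
        return -1;
    }
    toy_sizes[toy_id_count] = size;
    return toy_id_count++;
}

/* Return the private copy of resource `id` for the (simulated) thread
   `thread`, or NULL if memory allocation fails. */
void *toy_fetch(int thread, int id)
{
    assert(thread >= 0 && thread < TOY_MAX_THREADS);
    assert(id >= 0 && id < toy_id_count);
    if (toy_pool[thread][id] == NULL) {
        /* First access by this thread: create a zero-initialised copy. */
        toy_pool[thread][id] = calloc(1, toy_sizes[id]);
    }
    return toy_pool[thread][id];
}
```

Calling toy_fetch(0, id) and toy_fetch(1, id) with the same ID yields two different memory areas, so a write made through one is not visible through the other.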

Data Structures
TSRM has two important data structures: tsrm_tls_entry and tsrm_resource_type. Let's look at tsrm_tls_entry first. It is defined in TSRM/TSRM.c:

typedef struct _tsrm_tls_entry tsrm_tls_entry;

struct _tsrm_tls_entry {
    void **storage;
    int count;
    THREAD_T thread_id;
    tsrm_tls_entry *next;
};

Each tsrm_tls_entry structure represents all the global variable resources of one thread: thread_id stores the thread ID, count records the number of global variables, and next points to the next node. storage can be viewed as an array of pointers in which each element points to one global variable belonging to the thread this node represents. The tsrm_tls_entry structures of the individual threads are linked into a list, and the head pointer is stored in the global static variable tsrm_tls_table. Note that tsrm_tls_table is a genuine global variable shared by all threads, which is what keeps memory management consistent across threads. A schematic diagram of the tsrm_tls_entry and tsrm_tls_table structures accompanies this description.
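The per-thread lookup described above can be sketched as a find-or-create walk over a linked list. This is an illustrative sketch rather than the TSRM implementation: toy_thread_t, toy_tls_entry, toy_tls_head, and toy_get_entry are hypothetical names, and the mutex that a real multi-threaded program would need around the shared list is omitted.

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned long toy_thread_t;   /* stand-in for a real thread ID type */

typedef struct toy_tls_entry {
    void **storage;                /* one slot per global variable */
    int count;                     /* number of slots in storage */
    toy_thread_t thread_id;        /* thread that owns this entry */
    struct toy_tls_entry *next;    /* next entry in the chain */
} toy_tls_entry;

/* Head of the chain; a true global, shared by every thread. */
static toy_tls_entry *toy_tls_head = NULL;

/* Find the entry for thread_id, creating an empty one if none exists.
   Returns NULL only if memory allocation fails. */
toy_tls_entry *toy_get_entry(toy_thread_t thread_id)
{
    toy_tls_entry *e;

    for (e = toy_tls_head; e != NULL; e = e->next) {
        if (e->thread_id == thread_id) {
            return e;
        }
    }

    e = calloc(1, sizeof(*e));     /* storage = NULL, count = 0 */
    if (e == NULL) {
        return NULL;
    }
    e->thread_id = thread_id;
    e->next = toy_tls_head;        /* push onto the front of the chain */
    toy_tls_head = e;
    return e;
}
```

Asking twice for the same thread ID returns the same node, while a new thread ID adds a node at the front of the chain.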
The internal structure of tsrm_resource_type is relatively simple:

typedef struct {
    size_t size;
    ts_allocate_ctor ctor;
    ts_allocate_dtor dtor;
    int done;
} tsrm_resource_type;

tsrm_tls_entry is organized per thread (one node per thread), whereas tsrm_resource_type is organized per resource (that is, per global variable): each time a new resource is allocated, a tsrm_resource_type is created. All tsrm_resource_type structures are stored in an array (a linear table), resource_types_table, indexed by resource ID. Each tsrm_resource_type stores the size of its resource along with pointers to its constructor and destructor. In a sense, resource_types_table can be viewed as a hash table whose keys are resource IDs and whose values are tsrm_resource_type structures.
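To make the role of the size, constructor, and destructor fields concrete, here is a minimal sketch of a resource type table. It is an assumption-laden illustration, not TSRM's code: all toy_* names are hypothetical, the ID is simply the array index, and the table doubles in size when full, which is a growth policy chosen only for this example.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*toy_ctor)(void *resource);
typedef void (*toy_dtor)(void *resource);

typedef struct {
    size_t   size;   /* bytes to allocate for each thread's copy */
    toy_ctor ctor;   /* run on a fresh copy; may be NULL */
    toy_dtor dtor;   /* run before a copy is freed; may be NULL */
} toy_resource_type;

static toy_resource_type *toy_types = NULL;  /* indexed by resource ID */
static int toy_types_size = 0;               /* allocated capacity */
static int toy_types_count = 0;              /* IDs handed out so far */

/* Example constructor: start an int-sized global at 1. */
void toy_int_ctor(void *resource)
{
    *(int *)resource = 1;
}

/* Record a new resource type; returns its ID (its index), or -1 on failure. */
int toy_register_type(size_t size, toy_ctor ctor, toy_dtor dtor)
{
    if (toy_types_count == toy_types_size) {
        int new_size = toy_types_size ? toy_types_size * 2 : 1;
        toy_resource_type *grown =
            realloc(toy_types, (size_t)new_size * sizeof(*grown));
        if (grown == NULL) {
            return -1;
        }
        toy_types = grown;
        toy_types_size = new_size;
    }
    toy_types[toy_types_count].size = size;
    toy_types[toy_types_count].ctor = ctor;
    toy_types[toy_types_count].dtor = dtor;
    return toy_types_count++;
}

/* Build one copy of resource `id`, as would be done once per thread.
   Returns NULL if memory allocation fails. */
void *toy_instantiate(int id)
{
    void *copy;

    assert(id >= 0 && id < toy_types_count);
    copy = calloc(1, toy_types[id].size);
    if (copy != NULL && toy_types[id].ctor != NULL) {
        toy_types[id].ctor(copy);
    }
    return copy;
}

/* Destroy a copy previously returned by toy_instantiate. */
void toy_release(int id, void *copy)
{
    assert(id >= 0 && id < toy_types_count);
    if (copy != NULL && toy_types[id].dtor != NULL) {
        toy_types[id].dtor(copy);
    }
    free(copy);
}
```

Registering a type once and instantiating it twice gives two independently initialized copies, which mirrors how one table entry describes a global variable that every thread gets its own instance of.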

Implementation Details
This section analyzes the implementation details of some TSRM algorithms. Because TSRM as a whole involves a lot of code, two representative functions are analyzed here. The first is tsrm_startup, which the SAPI calls at process startup to initialize the TSRM environment. Since tsrm_startup is fairly long, only the parts I consider noteworthy are excerpted:

/* Startup TSRM (call once for the entire process) */
TSRM_API int tsrm_startup(int expected_threads, int expected_resources, int debug_level, char *debug_filename)
{
    /* code ... */

    tsrm_tls_table_size = expected_threads;

    tsrm_tls_table = (tsrm_tls_entry **) calloc(tsrm_tls_table_size, sizeof(tsrm_tls_entry *));
    if (!tsrm_tls_table) {
        TSRM_ERROR((TSRM_ERROR_LEVEL_ERROR, "Unable to allocate TLS table"));
        return 0;
    }
    id_count = 0;

    resource_types_table_size = expected_resources;
    resource_types_table = (tsrm_resource_type *) calloc(resource_types_table_size, sizeof(tsrm_resource_type));
    if (!resource_types_table) {
        TSRM_ERROR((TSRM_ERROR_LEVEL_ERROR, "Unable to allocate resource types table"));
        free(tsrm_tls_table);
        tsrm_tls_table = NULL;
        return 0;
    }

    /* code ... */

    return 1;
}

In fact, the main task of tsrm_startup is to initialize the two data structures described above. The most interesting part is its first two parameters, expected_threads and expected_resources. They are passed in by the SAPI and give the expected number of threads and resources; as the code shows, tsrm_startup allocates the initial space (with calloc) according to these two values. TSRM therefore starts out with room for expected_threads threads and expected_resources resources. To find out what each SAPI passes by default, look at its source code (in the sapi directory). I took a brief look:
As you can see, commonly used SAPIs such as mod_php5, php-fpm, and CGI pre-allocate one thread and one resource, because there is no point in wasting memory and PHP still runs in a single-threaded environment most of the time. The code also shows the variable id_count, a global static variable used to generate resource IDs, which is initialized to 0 here. TSRM's way of generating resource IDs is therefore very simple: it increments an integer variable. The second function that deserves careful analysis is ts_allocate_id. Anyone who has written a PHP extension will be familiar with it; this function ...
