The trap of calling the Python API from multithreaded C code

Source: Internet
Author: User
Tags fpm new set python script

It is well known that services written in scripting languages need a server container to run in; common examples for PHP are php-fpm and lighttpd. Python services are usually deployed with uWSGI, an application server that hosts WSGI applications and also speaks its own wire protocol, uwsgi. However, developers who are unfamiliar with the architecture of uWSGI and with how C code calls the Python API can run into some unexpected problems.

Let's first look at a piece of code. It uses the Flask framework: on every request it subtracts one from count, adds one, and finally multiplies by two. After 50 requests the final result should be 2 to the 50th power.


    from flask import Flask, request

    count = 1
    app = Flask(__name__)

    @app.route('/test_uwsgi')
    def index():
        global count
        # Read-modify-write on a module-level global, with no locking
        count = count - 1
        count = count + 1
        count = count * 2
        print(count)
        return 'OK'


    17179869184
    34359738368
    68719476736
    137438953472
    274877906944
    549755813888
    1099511627776
    2199023255552
    4398046511104
    8796093022208
    17592186044416
    35184372088832
    70368744177664
    140737488355328
    281474976710656
    562949953421312
    1125899906842624


These are the last few lines of output from calling the index function directly 50 times. The final result is 2 to the 50th power.
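The expected sequence can be checked without Flask or uWSGI. The following is a minimal sketch, assuming the handler's three steps are applied 50 times in a single thread; the helper names `step` and `run_sequentially` are made up for illustration and are not part of the original script.

```python
def step(count):
    """Apply the same three operations as the index() handler."""
    count = count - 1
    count = count + 1
    count = count * 2
    return count


def run_sequentially(requests=50):
    """Simulate `requests` sequential calls, starting from count = 1."""
    count = 1
    history = []
    for _ in range(requests):
        count = step(count)
        history.append(count)
    return history


history = run_sequentially(50)
print(history[-1])  # 1125899906842624, i.e. 2 ** 50
```

With no concurrency, each call sees the value left by the previous one, so the n-th call always prints 2 to the n-th power.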

    536870912
    1073741824
    2147483648
    4294967296
    8589934592
    17179869184
    34359738368
    68719476736
    137438953472
    274877906944
    549755813888
    1099511627776
    2199023255552
    4398046511104
    8796093022208
    17592186044416
    35184372088832
    70368744177664
    140737488355328
    281474976710656
    562949953421312
    1125899906842624

These are the last few lines of output from a test with ab (ApacheBench) that sent 50 requests to the /test_uwsgi interface with several of them in flight at the same time. The result of this run was abnormal. Why does the program misbehave when it runs in uWSGI?

In fact, a careful read shows that this simple example is the classic demonstration of the shared-data synchronization problem in multithreaded programs: without a lock, the problem surfaces. In the following code we take a mutex while modifying the shared resource count, to see whether anything changes.

    from flask import Flask, request
    import threading

    mutex = threading.Lock()
    count = 1
    app = Flask(__name__)

    @app.route('/test_uwsgi')
    def index():
        global count
        global mutex
        # Only one thread at a time may run the read-modify-write below
        mutex.acquire()
        count = count - 1
        count = count + 1
        count = count * 2
        print(count)
        mutex.release()
        return 'OK'
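The effect of the lock can also be checked outside uWSGI. The sketch below is an illustration under stated assumptions, not the author's test: it replaces the HTTP requests with 50 plain Python threads, and the names `handle_request` and `run_concurrently` are hypothetical. It uses `with mutex:`, which acquires and releases the lock like the explicit calls above but also releases it if the body raises.

```python
import threading

count = 1
mutex = threading.Lock()


def handle_request():
    """Stand-in for index(): the whole read-modify-write runs under the lock."""
    global count
    with mutex:
        count = count - 1
        count = count + 1
        count = count * 2


def run_concurrently(requests=50):
    """Run `requests` handler calls on separate threads and wait for all of them."""
    threads = [threading.Thread(target=handle_request) for _ in range(requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return count


result = run_concurrently(50)
print(result == 2 ** 50)
```

Because every update happens entirely inside the lock, the order in which the threads run does not matter: each of the 50 calls doubles the value exactly once.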

This code was also run in the uWSGI container and accessed 50 times through the HTTP interface with multiple concurrent requests, and the result was correct. But why? Our original Python code contains no multithreading operations at all. uWSGI can be configured to start multiple threads that process requests concurrently, but according to my original understanding, shouldn't each thread run its own independent Python interpreter? Shouldn't the data be isolated per thread when each one runs the Python script?

To answer this question, we have to study the structure and design of uWSGI and its server architecture.

uWSGI is a server application container widely used with Python, similar to the containers commonly used for PHP, such as mod_php, PHP-FPM, and lighttpd. The uwsgi protocol is uWSGI's own wire protocol, offered in addition to the WSGI interface that the application itself implements.


By studying the uWSGI source code (core/uwsgi.c, core/loop.c, core/init.c, core/master_util.c, core/util.c), you can see how the uWSGI server is designed. It uses server programming paradigm 8 from the summary in the UNP book (Unix Network Programming): the TCP pre-threaded server, in which the threads are created in advance and each thread calls accept itself.
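To make the pre-threaded design concrete, here is a minimal sketch in Python rather than C. It is not uWSGI code: the functions `worker`, `start_server`, and `ask`, the port selection, and the uppercase-echo behaviour are all assumptions made for illustration. What it shares with the paradigm named above is that the threads are created before any client connects and that every thread calls accept on the same listening socket.

```python
import socket
import threading


def worker(listener):
    """Each pre-created thread blocks in accept() on the shared listening socket."""
    while True:
        conn, _ = listener.accept()
        with conn:
            data = conn.recv(1024)
            conn.sendall(data.upper())


def start_server(num_threads=4):
    """Bind to an ephemeral local port and pre-create the worker threads."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(16)
    for _ in range(num_threads):
        # Daemon threads end together with the process
        threading.Thread(target=worker, args=(listener,), daemon=True).start()
    return listener


def ask(port, payload):
    """Tiny client: send one message and read the reply."""
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(payload)
        return client.recv(1024)


listener = start_server()
port = listener.getsockname()[1]
replies = [ask(port, b"ping") for _ in range(8)]
print(replies[0])
```

The kernel hands each incoming connection to whichever thread is waiting in accept, so no dispatcher thread is needed. All of these threads live in one process and therefore see the same module-level data.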

    int main(int argc, char *argv[], char *envp[]) {
        uwsgi_setup(argc, argv, envp);
        return uwsgi_run();
    }

    void uwsgi_setup(int argc, char *argv[], char *envp[]) {
        int i;
        struct utsname uuts;

        // ... setup and initialization of various resources is omitted here;
        // read the source if you are interested ...

        // The important part is this line
        uwsgi_start((void *) uwsgi.argv);
    }

    int uwsgi_start(void *v_argv) {
        // ... simplified; only some of the main code is excerpted ...

        // As an aside: this creates the shared memory space that
        // uwsgi_setup_workers uses later. Because uWSGI has a master process
        // that monitors the status of each child process, it needs anonymous
        // shared memory.
        // initialize sharedareas
        uwsgi_sharedareas_init();

        // setup queue
        if (uwsgi.queue_size > 0) {
            uwsgi_init_queue();
        }

        // ... This part is important. uwsgi.p is the plugin interface. The app
        // deployed in uWSGI is initialized here (a deployed app needs the plugin
        // for its language, e.g. the python plugin for Python). This is where
        // uWSGI actually executes the Python code, and all of its module
        // imports run here.
        // initialize request plugin only if workers or master are available
        if (uwsgi.sockets || uwsgi.master_process || uwsgi.no_server || uwsgi.command_mode || uwsgi.loop) {
            for (i = 0; i < 256; i++) {
                if (uwsgi.p[i]->init) {
                    uwsgi.p[i]->init();
                }
            }
        }

        // again check for workers/sockets...
        if (uwsgi.sockets || uwsgi.master_process || uwsgi.no_server || uwsgi.command_mode || uwsgi.loop) {
            for (i = 0; i < 256; i++) {
                if (uwsgi.p[i]->post_init) {
                    uwsgi.p[i]->post_init();
                }
            }
        }

        // ... This mainly sets up the shared memory space for each worker
        // initialize workers/master shared memory segments
        uwsgi_setup_workers();

        // here we spawn the workers...
        if (!uwsgi.status.is_cheap) {
            if (uwsgi.cheaper && uwsgi.cheaper_count) {
                int nproc = uwsgi.cheaper_initial;
                if (!nproc)
                    nproc = uwsgi.cheaper_count;
                for (i = 1; i <= uwsgi.numproc; i++) {
                    if (i <= nproc) {
                        if (uwsgi_respawn_worker(i))
                            break;
                        uwsgi.respawn_delta = uwsgi_now();
                    }
                    else {
                        uwsgi.workers[i].cheaped = 1;
                    }
                }
            }
            else {
                for (i = 2 - uwsgi.master_process; i < uwsgi.numproc + 1; i++) {
                    // ... fork one child process per configured process
                    if (uwsgi_respawn_worker(i))
                        break;
                    uwsgi.respawn_delta = uwsgi_now();
                }
            }
        }

        // END OF INITIALIZATION
        return 0;
    }

    int uwsgi_respawn_worker(int wid) {
        // ... The key line: fork the child process. We will not follow what
        // happens inside it.
        pid_t pid = uwsgi_fork(uwsgi.workers[wid].name);

        if (pid == 0) {
            signal(SIGWINCH, worker_wakeup);
            signal(SIGTSTP, worker_wakeup);
            uwsgi.mywid = wid;
            uwsgi.mypid = getpid();
            // pid is updated by the master
            //uwsgi.workers[uwsgi.mywid].pid = uwsgi.mypid;
            // overengineering (just to be safe)
            uwsgi.workers[uwsgi.mywid].id = uwsgi.mywid;
            /*
               uwsgi.workers[uwsgi.mywid].harakiri = 0;
               uwsgi.workers[uwsgi.mywid].user_harakiri = 0;
               uwsgi.workers[uwsgi.mywid].rss_size = 0;
               uwsgi.workers[uwsgi.mywid].vsz_size = 0;
             */
            // do not reset worker counters on reload !!!
            //uwsgi.workers[uwsgi.mywid].requests = 0;
            // ... but maintain a delta counter (yes, racy in multithread)
            //uwsgi.workers[uwsgi.mywid].delta_requests = 0;
            //uwsgi.workers[uwsgi.mywid].failed_requests = 0;
            //uwsgi.workers[uwsgi.mywid].respawn_count++;
            //uwsgi.workers[uwsgi.mywid].last_spawn = uwsgi.current_time;
        }
        else if (pid < 1) {
            uwsgi_error("fork()");
        }
        else {
            // the pid is set only in the master, as the worker should never use it
            uwsgi.workers[wid].pid = pid;
            if (respawns > 0) {
                uwsgi_log("Respawned uWSGI worker %d (new pid: %d)\n", wid, (int) pid);
            }
            else {
                uwsgi_log("spawned uWSGI worker %d (pid: %d, cores: %d)\n", wid, pid, uwsgi.cores);
            }
        }
        return 0;
    }

    int uwsgi_run() {
        // ... again only the important parts are excerpted:
        // if this process is the master, it runs master_loop;
        // if it is a worker, it runs uwsgi_worker_run

        // !!! from now on, we could be in the master or in a worker !!!
        if (getpid() == masterpid && uwsgi.master_process == 1) {
            (void) master_loop(uwsgi.argv, uwsgi.environ);
        }

        // from now on the process is a real worker
        uwsgi_worker_run();
        // never here
        _exit(0);
    }

    void uwsgi_worker_run() {
        int i;

        if (uwsgi.lazy || uwsgi.lazy_apps) {
            uwsgi_init_all_apps();
        }

        uwsgi_ignition();
        // never here
        exit(0);
    }

    void uwsgi_ignition() {
        if (uwsgi.loop) {
            void (*u_loop) (void) = uwsgi_get_loop(uwsgi.loop);
            if (!u_loop) {
                uwsgi_log("unavailable loop engine !!!\n");
                exit(1);
            }
            if (uwsgi.mywid == 1) {
                uwsgi_log("*** running %s loop engine [addr:%p] ***\n", uwsgi.loop, u_loop);
            }
            u_loop();
            uwsgi_log("your loop engine died. R.I.P.\n");
        }
        else {
            // ... the loop body of the child process, normally simple_loop
            if (uwsgi.async < 1) {
                simple_loop();
            }
            else {
                async_loop();
            }
        }

        // end of the process...
        end_me(0);
    }

    // ... From here on we stay inside the loop of the child process, which
    // starts creating the threads that receive and handle requests.
    // simple_loop_run, the function each thread executes, is also a loop with
    // the usual steps: accept, receive, respond ... We will not chase it any
    // further. After the request data has been received, it calls the WSGI
    // function of the Python script through python_call to process the request.
    void simple_loop() {
        uwsgi_loop_cores_run(simple_loop_run);
    }

    void uwsgi_loop_cores_run(void *(*func) (void *)) {
        int i;
        for (i = 1; i < uwsgi.threads; i++) {
            long j = i;
            pthread_create(&uwsgi.workers[uwsgi.mywid].cores[i].thread_id, &uwsgi.threads_attr, func, (void *) j);
        }
        long y = 0;
        func((void *) y);
    }

In simple terms, executing a Python script inside uWSGI is different from running the script directly. uWSGI executes a Python script by calling the Python C API. It first loads the script's module through the API. As with the first example above, importing a module runs all of its top-level code, so every global variable is created and initialized in the process at that point. uWSGI then creates the threads and starts processing requests: each thread calls the Python API (python_call) to execute the request-handling function in the Python script (the WSGI interface). Because the module import ran before the threads were created, the data it created in the process is shared and can be accessed from every thread. This is the point to focus on: we need to take a lock when these threads access such data, and when writing Python scripts we should use the singleton pattern sparingly so that we do not dig unnecessary pits for ourselves.
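The order of events described above (import first, threads afterwards) can be imitated in plain Python. This is only a sketch of the idea, not of uWSGI itself: `shared_state`, `seen_ids`, and `handle_request` are hypothetical names, and ordinary `threading.Thread` objects stand in for the worker threads that uWSGI creates from C.

```python
import threading

# Module-level code runs once, when the module is imported (before any worker
# thread exists), so this dict is created exactly once per process.
shared_state = {"hits": 0}
state_lock = threading.Lock()
seen_ids = set()


def handle_request():
    """Stand-in for the WSGI callable that every worker thread invokes."""
    with state_lock:
        shared_state["hits"] += 1
        seen_ids.add(id(shared_state))  # record which object this thread saw


threads = [threading.Thread(target=handle_request) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared_state["hits"], len(seen_ids))
```

All eight threads update the same dictionary object, which is why `seen_ids` ends up with a single entry. Nothing is copied per thread, so any module-level object that is mutated in a request handler needs the same care as count in the examples above.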

In fact, the example above is only a simplified version of the problem I actually ran into, used here to help explain how uWSGI executes a Python WSGI interface with multiple threads. My real problem was that the request-handling function called a Gearman client that had been created at global scope. The client library was not thread-safe, and it was used without a lock. When request concurrency was high, the Gearman client reported connection exceptions.
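One way to avoid sharing such a client is to give every thread its own instance through `threading.local`. The sketch below assumes a hypothetical `FakeClient` class in place of the real Gearman client (its `submit` method is invented for illustration and is not the Gearman library's API); `get_client` and `handle_request` are likewise made-up names.

```python
import threading


class FakeClient:
    """Hypothetical stand-in for a client object that is not thread-safe."""

    def __init__(self):
        self.owner = threading.get_ident()

    def submit(self, job):
        # A real client would talk to a server here; this one only checks
        # that it is used by the thread that created it.
        assert self.owner == threading.get_ident(), "client shared across threads"
        return "done:" + job


_local = threading.local()


def get_client():
    """Return this thread's own client, creating it on first use."""
    client = getattr(_local, "client", None)
    if client is None:
        client = _local.client = FakeClient()
    return client


results = []
results_lock = threading.Lock()


def handle_request(job):
    reply = get_client().submit(job)
    with results_lock:
        results.append(reply)


threads = [threading.Thread(target=handle_request, args=("job%d" % i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))
```

The alternative is to keep a single global client and guard every call with a mutex, as in the count example. That is simpler, but it serializes all requests that use the client.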


Postscript: This problem is not really very complicated. What it exposed was that I was not familiar enough with the code structure of uWSGI, with the Python C API, or with how C calls into Python, which is a weak spot in my own knowledge. In those days I was developing several requirements at the same time, so I did not test the problem in detail, and I did not carefully analyze the errors in the traceback. Instead I kept suspecting a performance problem in the interface being called. On this issue my colleague really was one move ahead and identified the problem, but his understanding of the cause was also only partial and he could not explain the underlying principle, which led to a fierce dispute between the two of us. It also shows that communication within this team does have some problems: arguments are basically settled by shouting and by bystanders chiming in, and they are often personal attacks rather than discussion of the matter itself. Even when someone's plan is well founded and has been proven to work, the person opposing it would rather die than compromise ... This is probably a common trait of big companies like Sina that have grown bloated, old, and short of vitality.
