Python concurrent execution of multiple threads

Source: Internet
Author: User

Normally, when we start a program, the operating system first creates a process, and the process then starts a thread. It is this thread that actually does the work. In other words, the thread is what executes code; the process is only responsible for requesting memory and other resources from the system and does no work itself. By default, one process starts exactly one thread.

Multithreading, as the name implies, means starting several threads at the same time inside a single process. As said above, the thread is what does the real work. The relationship between a process and its threads is like the relationship between a factory and its workers: there is still only one factory, but more workers are on the job, so efficiency naturally goes up. Because there is still only one process, multithreading improves efficiency without asking the system for much more memory (the threads share the memory of the process, and each one only needs a small stack of its own), so it is quite cost-effective. However, although multithreading costs little extra memory, every thread needs CPU time.

To continue the factory analogy: a single plant can hold many workers, but the workers rely on the director to assign their work. With too many workers the director is too busy arranging things, and efficiency drops just the same. So it is best to keep the number of workers (threads) within what the director (the CPU) can handle (its number of cores).
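
As a rough illustration of keeping the worker count within the CPU's capacity, the sketch below caps a requested thread count at the number of cores reported by os.cpu_count(). The helper name pick_thread_count is made up for this example, and the cap is only a rule of thumb for CPU-heavy work, not a hard limit.

```python
import os

def pick_thread_count(requested):
    """Hypothetical helper: cap the requested thread count at the core count."""
    # os.cpu_count() can return None when the count cannot be determined.
    cores = os.cpu_count() or 1
    return max(1, min(requested, cores))

if __name__ == "__main__":
    print("cores reported:", os.cpu_count())
    print("threads to start:", pick_thread_count(100))
```

Threads that spend most of their time waiting (sleeping, network, disk) are a different case, and more of them than cores can still be useful.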

There are two ways to implement multithreading in Python. In my summary, one is in the form of a function, and the other is to write your own class that inherits from threading.Thread. There are also two modules for multithreading. One is thread, the original multithreading module, which is fairly low-level (in Python 3 it was renamed _thread). The threading module wraps the thread module and is higher-level. Hardly anyone writes programs with thread directly; everyone uses threading. Just remember that.

Let's start with the first one, the function form, which I think is the simpler of the two.

Take the following code as an example:

    import time

    def haha(max_num):
        """
        An arbitrary example function: takes an upper bound max_num
        and prints the numbers from 0 up to (but not including) it.
        """
        for i in range(max_num):
            # Sleep 1 second before each print, so printing 10 numbers takes 10 seconds.
            time.sleep(1)
            print(i)

    for x in range(3):
        haha(10)

The code above is not difficult; it just shows the function haha() being executed sequentially. Running it three times takes 30 seconds. The second call does not start until the first has finished, so the time accumulates.
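
To see the accumulation without waiting 30 seconds, the sketch below times the same kind of sequential calls with a much shorter pause. The names slow_count and run_sequentially and the pause parameter are assumptions made for this example.

```python
import time

def slow_count(max_num, pause):
    """Sleep `pause` seconds before each of max_num steps, like haha() with a shorter pause."""
    for _ in range(max_num):
        time.sleep(pause)

def run_sequentially(calls, max_num, pause):
    """Run slow_count() `calls` times in a row and return the elapsed seconds."""
    started = time.perf_counter()
    for _ in range(calls):
        slow_count(max_num, pause)
    return time.perf_counter() - started

if __name__ == "__main__":
    elapsed = run_sequentially(calls=3, max_num=10, pause=0.01)
    # Each call needs about 10 * 0.01 s, so three calls need about 0.3 s in total.
    print("elapsed: %.2f s" % elapsed)
```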

Now let's introduce multithreaded execution and see whether anything changes.

    import threading
    import time

    def haha(max_num):
        """
        An arbitrary example function: takes an upper bound max_num
        and prints the numbers from 0 up to (but not including) it.
        """
        for i in range(max_num):
            # Sleep 1 second before each print, so printing 10 numbers takes 10 seconds.
            time.sleep(1)
            print(i)

    for x in range(3):
        # range(3) starts three threads one after another, each calling haha().
        # As soon as the first thread has been started, the second is started,
        # so in the end the function is executed 3 times.

        # Instantiate a thread through threading.Thread.
        # target is the function name, without parentheses and without arguments.
        # args holds the arguments to pass to haha(); it must be a tuple
        # (note the trailing comma for a single element), otherwise you get an error.
        t = threading.Thread(target=haha, args=(10,))
        # Mark the thread as a daemon thread (modern spelling: t.daemon = True).
        t.setDaemon(True)
        # The thread is ready and waits for the CPU to schedule it.
        t.start()

The result of running this is... nothing! There is no output at all. What is going on? Is the code wrong?

In fact, the problem lies in the line t.setDaemon(True). If you leave this line out, the default is equivalent to t.setDaemon(False). What does this line mean?


setDaemon() marks a thread as a background (daemon) thread or a foreground thread (the default).

If it is a background thread, it runs while the main thread runs, and once the main thread finishes, the background thread is stopped whether or not it has completed.

If it is a foreground thread, it runs while the main thread runs, and once the main thread finishes, the program waits for the foreground thread to finish before it exits.

Foreground, background, main thread: does it all sound dizzying? It is not really that complicated. Put simply, if this parameter is True, the program exits as soon as the main thread has finished starting the threads, without caring whether those threads have finished. In the example above, each call to haha() needs at least 10 seconds, and even the first number is printed only after a 1-second pause. But the main thread's only job is to start the three threads; the for loop that starts them takes nowhere near 1 second, let alone 10. Hence the result we saw: the program starts 3 threads and the main thread ends, but at that point haha() has not yet printed anything, and the threads are forced to end together with the program.
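
The difference between the two settings can be observed from the outside by running a tiny script in a separate interpreter and capturing what it prints. This is only a sketch: the helper run_with_daemon and the embedded script are made up for the example, it assumes the current interpreter can be launched through sys.executable, and it uses the daemon attribute, which is the current spelling of setDaemon().

```python
import subprocess
import sys

# A tiny script: one thread that prints after a short sleep.
# The {daemon} placeholder is filled in with True or False.
SCRIPT = """
import threading, time

def worker():
    time.sleep(0.5)
    print("worker finished")

t = threading.Thread(target=worker)
t.daemon = {daemon}
t.start()
# The main thread ends here, right after starting the worker.
"""

def run_with_daemon(daemon):
    """Hypothetical helper: run SCRIPT in a child interpreter and return its output."""
    result = subprocess.run(
        [sys.executable, "-c", SCRIPT.format(daemon=daemon)],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout

if __name__ == "__main__":
    print("daemon=True  ->", repr(run_with_daemon(True)))
    print("daemon=False ->", repr(run_with_daemon(False)))
```

With daemon set to True the child program exits before the worker wakes up, so nothing is printed; with False the program waits for the worker and its message appears.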

Now that we have found the cause, let's revise the code: set the offending line to its default value, or simply leave it out.

    import threading
    import time

    def haha(max_num):
        for i in range(max_num):
            time.sleep(1)
            print(i)

    for x in range(3):
        t = threading.Thread(target=haha, args=(5,))
        # This line can simply be left out, since False is the default.
        t.setDaemon(False)
        t.start()

Run it now and you get seemingly chaotic output (condensed onto one line here; the exact interleaving varies from run to run):

0001 11222333444

This is the output of three threads running concurrently, which is why the results are mixed together. It is precisely this disorder that shows the three haha() calls really are running at the same time.

If you want the output to look orderly, consider the join() method:

    import threading
    import time

    def haha(max_num):
        for i in range(max_num):
            time.sleep(1)
            print(i)

    for x in range(3):
        t = threading.Thread(target=haha, args=(5,))
        t.start()
        # join() makes the threads run one after another.
        t.join()

The output of this run looks tidy:

012340123401234

As the comment says, it looks fine. However, each thread is now started and run to completion before the next one starts, so there are several threads but no parallelism at all. The effect is the same as the serial example at the top. So join() must not be used carelessly.

But since join() exists, it must be useful for something; it cannot have been designed just for show. Let's change the code to see how join() is used correctly.

    import threading
    import time

    def haha(max_num):
        for i in range(max_num):
            time.sleep(1)
            print(i)

    # A list that stores the thread instances to be started.
    threads = []

    for x in range(3):
        t = threading.Thread(target=haha, args=(5,))
        # Append the thread instance to the list; the number of instances
        # appended is the number of threads that will be started.
        threads.append(t)

    for thr in threads:
        # Walk through the list and call start() to run each thread.
        thr.start()

    for thr in threads:
        # is_alive() (spelled isAlive() in old Python versions) returns True or
        # False depending on whether the thread is still running. If it is,
        # make the main thread wait for it to finish before exiting.
        if thr.is_alive():
            thr.join()

When we looked at setDaemon() above, we learned that the main thread is the main flow of the program. The first thing to start when the program runs is the main thread, and the main thread is responsible for starting the child threads that do the work. In our example, the threads that run haha() are child threads, so multithreading really means multiple child threads. The last thing to exit when the program ends is the main thread. So the final loop over the threads list in the previous example checks whether any child thread has not yet exited. As long as a child thread is still alive, join() blocks the program flow so that the main thread cannot move on to its exit step; the main thread can exit only once every joined child thread has finished.
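
The two ways of placing join() can be compared by timing them. The sketch below uses short sleeps as a stand-in for haha(); the function names nap, join_immediately and start_all_then_join are made up for this example.

```python
import threading
import time

def nap(pause):
    """Stand-in for a slow task such as haha(): just sleeps."""
    time.sleep(pause)

def join_immediately(count, pause):
    """Start each thread and join it right away: the threads run one by one."""
    started = time.perf_counter()
    for _ in range(count):
        t = threading.Thread(target=nap, args=(pause,))
        t.start()
        t.join()
    return time.perf_counter() - started

def start_all_then_join(count, pause):
    """Start every thread first, then join them all: the threads overlap."""
    started = time.perf_counter()
    threads = [threading.Thread(target=nap, args=(pause,)) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - started

if __name__ == "__main__":
    print("join immediately:    %.2f s" % join_immediately(3, 0.2))
    print("start all then join: %.2f s" % start_all_then_join(3, 0.2))
```

The first variant should take roughly three times the pause, the second roughly one pause. Calling join() on a thread that has already finished returns immediately, which is why this sketch does not check is_alive() first.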


The second way to create threads is to define your own class.

    import threading
    import time

    class Haha(threading.Thread):
        """
        A custom class Haha. It must inherit from threading.Thread and
        override the run() method; the work to be done goes into run().
        Without your own run() the thread has nothing to do, because the
        run() method is what defines what each started child thread executes.
        """
        def __init__(self, max_num):
            threading.Thread.__init__(self)
            self.max_num = max_num

        def run(self):
            for i in range(self.max_num):
                time.sleep(1)
                print(i)

    if __name__ == '__main__':
        threads = []
        for x in range(3):
            # The only difference from the function form: because Haha inherits
            # from threading.Thread, instantiating Haha creates a thread.
            # The rest works just like the function form.
            t = Haha(5)
            threads.append(t)
        for thr in threads:
            thr.start()
        for thr in threads:
            if thr.is_alive():
                thr.join()

Those are the two ways of implementing multithreading. Choose whichever you prefer; there is no essential difference.


Next comes the thread lock. First look at the following code:

    import threading

    # Define a global variable.
    gnum = 0

    def work(max_number):
        for i in range(max_number):
            print(i)

    def mylock():
        # Declare gnum as a global variable.
        global gnum
        # This function first runs work(); only after that
        # is the global gnum incremented by 1.
        work(10)
        gnum = gnum + 1
        print('gnum is %d' % gnum)

    for x in range(5):
        # Start 5 threads at once, each running mylock().
        t = threading.Thread(target=mylock)
        t.start()

The example above does not look difficult. The goal is to run another, time-consuming function before executing gnum + 1. Since we start 5 threads at the same time, in theory the first thread should finish and leave gnum = 1, the second thread should then add 1 to get gnum = 2, and so on, so that gnum equals 5 once all 5 threads have finished. But when it actually runs, things do not necessarily go the way we imagined!

What can really happen is this. The statement gnum = gnum + 1 is not a single step: the thread first reads gnum, then adds 1, then writes the result back. Because the threads execute concurrently, the second thread may read gnum after the first thread has read it but before the first thread has written its result back. At that moment both threads hold gnum = 0. The first thread then writes gnum = 1, but the second thread still works from the old value 0 and also writes gnum = 1. So the result after the second operation may well be gnum = 1 instead of the gnum = 2 we wanted.

As you can see, if the tasks performed by the threads are unrelated to each other, nothing goes wrong. But once several threads operate on the same variable, then because the threads execute concurrently, the variable may well be modified by several of them at the same time, and the final result does not meet our expectations.
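
In real programs this unlucky timing shows up only occasionally, which makes it hard to observe. The sketch below forces it on purpose: a threading.Barrier (an artificial device added only for this demonstration, not part of the example above) makes every thread read the shared value before any thread writes it back. The names lost_update_demo and add_one are made up for the example.

```python
import threading

def lost_update_demo(thread_count=5):
    """Force every thread to read the counter before any thread writes it back."""
    state = {"gnum": 0}
    # The barrier releases the threads only once all of them have arrived.
    all_have_read = threading.Barrier(thread_count)

    def add_one():
        seen = state["gnum"]          # step 1: read the shared value
        all_have_read.wait()          # artificial pause until every thread has read
        state["gnum"] = seen + 1      # step 2: write back, based on the stale value

    threads = [threading.Thread(target=add_one) for _ in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["gnum"]

if __name__ == "__main__":
    print("gnum is", lost_update_demo())
```

Every thread sees 0 and writes 1, so five increments leave the counter at 1 instead of 5.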

In this case, one solution is the join() method mentioned earlier, letting the threads run one after another. That way only one thread modifies the variable at a time and there is no confusion. But the problem is the same as before: with many threads and no concurrency, we lose the benefit of multithreading, so this is not advisable.

The second solution is a thread lock. What is a thread lock? When several threads operate on the same resource at the same time, whichever thread gets there first locks the resource, and the other threads can operate on it only after that thread has finished and released the lock. This is how a thread lock provides thread safety. It sounds a bit like join(), but there is a difference. First look at the code with the lock added:

    import threading

    gnum = 0
    lock = threading.RLock()

    def work(max_number):
        for i in range(max_number):
            print(i)

    def mylock():
        global gnum
        work(10)
        # Acquire the lock before touching gnum.
        # acquire() blocks until the lock is free; in Python 3 it also accepts a
        # timeout, which limits how long to wait for the lock.
        lock.acquire()
        gnum = gnum + 1
        # Release the lock once the operation is finished.
        lock.release()
        print('gnum is %d' % gnum)

    for x in range(5):
        t = threading.Thread(target=mylock)
        t.start()

The code above shows the difference: join() restricts the whole thread, whereas the thread lock (lock.acquire()) restricts only one part of the thread's execution. The threads started in the example can still run the time-consuming work() function in parallel; only the handling of gnum is limited by the lock. This solves the problem of wrong data caused by several threads operating on one resource concurrently. One more thing to note: threading.RLock() is a reentrant version of threading.Lock(), meaning the thread that holds it can acquire it again without blocking, so it can be used wherever the plain lock would do.
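
A common way to write the acquire/release pair is the with statement, which releases the lock even if the guarded code raises an exception. The sketch below shows that form on a shared counter and also checks the reentrant behaviour of RLock. The function names locked_counter and rlock_is_reentrant are made up for this example.

```python
import threading

def locked_counter(thread_count=5, increments=10000):
    """Increment a shared counter from several threads, guarding it with a lock."""
    state = {"gnum": 0}
    lock = threading.Lock()

    def add_many():
        for _ in range(increments):
            # "with lock" acquires the lock and releases it on exit,
            # even if the guarded code raises an exception.
            with lock:
                state["gnum"] = state["gnum"] + 1

    threads = [threading.Thread(target=add_many) for _ in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["gnum"]

def rlock_is_reentrant():
    """Acquire an RLock twice from the same thread; a plain Lock would refuse here."""
    rlock = threading.RLock()
    with rlock:
        # Second, non-blocking acquire by the thread that already holds the lock.
        got_it_again = rlock.acquire(blocking=False)
        if got_it_again:
            rlock.release()
    return got_it_again

if __name__ == "__main__":
    print("gnum is", locked_counter())          # 5 threads * 10000 increments
    print("re-acquired:", rlock_is_reentrant())
```

Because only the single increment is inside the lock, everything else the threads do can still overlap.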


Thread events (Event)

In general, a thread starts working immediately after it is started, without any pause. But sometimes that is not what we want. Suppose we are writing a web crawler: before crawling a page, I want to ping it first to see whether it is reachable. If it is, release the thread to crawl the content; if not, test the next page. Python's thread events are used by the main thread to control the execution of other threads. An Event provides three methods: set(), wait() and clear(). An Event holds an internal flag that acts like a global signal: set() sets the flag to True and clear() sets it to False. wait() blocks while the flag is False; when the flag becomes True, all waiting threads are released. Look at the following example:

    import threading

    def do(event):
        print('start')
        # Execution pauses here until the release signal arrives.
        event.wait()
        # After the release signal, the statement below is executed.
        print('execute')

    # Instantiate threading.Event().
    event_obj = threading.Event()

    # The thread count here is arbitrary.
    for i in range(3):
        t = threading.Thread(target=do, args=(event_obj,))
        t.start()

    # First set the flag to False.
    event_obj.clear()

    inp = input('input: ')  # raw_input() in Python 2
    # If the user types 'true', send the release signal to wait().
    if inp == 'true':
        event_obj.set()

This achieves the goal of controlling thread execution through the set() and clear() methods.
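
The same mechanism can be tried without keyboard input. In the sketch below the main thread itself decides when to call set(); the function name release_workers and the bookkeeping list are made up for this example.

```python
import threading

def release_workers(worker_count=3):
    """Start workers that wait on an Event, then release them all with set()."""
    go = threading.Event()          # the internal flag starts out False
    finished = []
    finished_lock = threading.Lock()

    def worker(number):
        go.wait()                   # blocks until go.set() is called
        with finished_lock:
            finished.append(number)

    threads = [threading.Thread(target=worker, args=(n,)) for n in range(worker_count)]
    for t in threads:
        t.start()

    # wait(timeout) returns False if the flag is still unset when the timeout expires.
    flag_before = go.wait(0.1)
    done_before = len(finished)     # no worker can have finished yet

    go.set()                        # flag becomes True: every waiting worker continues
    for t in threads:
        t.join()
    return flag_before, done_before, sorted(finished)

if __name__ == "__main__":
    print(release_workers())
```

Until set() is called no worker gets past wait(); afterwards all of them finish.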


Finally, a brief introduction to the GIL.

GIL is short for Python's global interpreter lock. What is this lock for? Put bluntly, it limits how Python uses CPU cores. In theory, multiple threads can run on multiple CPU cores simultaneously, as they do in Java. But in Python (more precisely, in the standard CPython interpreter), because of the GIL, only one thread executes Python code at any given moment. We seem to see several threads running concurrently, but that is only an illusion created by the interpreter switching rapidly back and forth between threads. Threads that are waiting, for example in time.sleep() or on I/O, release the GIL, which is why the sleep-based examples above still finish faster with threads. For CPU-bound work, however, Python threads are less efficient than threads in a language like Java that can truly use multiple cores. This is the GIL that Python is so often criticized for.
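
One way to get a feeling for this is to time a CPU-bound loop run several times in a row and then run in several threads. The sketch below does that; the function names and the loop size are made up for the example, and the numbers it prints depend on the machine and the interpreter version, so treat it as an experiment rather than a benchmark.

```python
import threading
import time

def count_down(n):
    """Pure-Python, CPU-bound loop: no sleeping and no I/O."""
    while n > 0:
        n -= 1

def timed_sequential(runs, n):
    """Run count_down() `runs` times in a row and return the elapsed seconds."""
    started = time.perf_counter()
    for _ in range(runs):
        count_down(n)
    return time.perf_counter() - started

def timed_threaded(runs, n):
    """Run count_down() in `runs` threads at once and return the elapsed seconds."""
    started = time.perf_counter()
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(runs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - started

if __name__ == "__main__":
    n = 2000000
    print("sequential: %.2f s" % timed_sequential(3, n))
    print("threaded:   %.2f s" % timed_threaded(3, n))
```

On an interpreter with the GIL the two times are usually close to each other, unlike the sleep-based examples, where the threaded version is clearly faster.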

















