Detailed description of multi-threading in Python (code example)

Source: Internet
Author: User
This article walks through multithreading in Python with code examples, as a reference for anyone learning the topic.

It records problems encountered while learning Python along with some common usage patterns. Note that the development environment used here is Python 2.7.

First, Python file naming

When naming Python files, be careful not to reuse the name of a standard library module, or you will get an error.
For example, suppose that while learning about threads you name your file threading.py. The script itself looks perfectly normal, but running it fails with an error of the form: AttributeError: 'module' object has no attribute 'xxx'. The script's own directory comes first on the module search path, so import threading loads the script itself instead of the standard library module.

threading.py

# -*- coding: utf-8 -*-

"""
@author: Corwien
@file: threading_test.py
@time: 18/8/25 09:14
"""

import threading

# Get the number of active threads
print(threading.active_count())

Run it:

➜  baselearn python threading/threading.py
Traceback (most recent call last):
  File "threading/threading.py", line 9, in <module>
    import threading
  File "/users/kaiyiwang/code/python/baselearn/threading/threading.py", line <module>
    print(threading.active_count())
AttributeError: 'module' object has no attribute 'active_count'
➜  baselearn

Locating the problem:

Inspect the source file of the imported library: the source file exists and contains no errors, and a compiled .pyc file exists next to it.

Solving the problem:

    • 1. When naming a .py script, do not use a Python reserved word, a module name, or similar.

    • 2. Delete the library's .pyc file, then rerun the code. Python writes a .pyc file when a module is imported, and if a .pyc file already exists and the code has not changed, the runtime keeps using that .pyc file, so it has to be deleted. Alternatively, find an environment where the code runs correctly and copy its .pyc file over the one on the current machine.

Rename the script to threading_test.py and run it again; this time there is no error.

➜  baselearn python threading/threading_test.py
1
➜  baselearn
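A quick way to confirm this kind of name clash is to look at where Python actually loaded the module from. The following is a minimal sketch for illustration, not part of the original script; the helper name loaded_from_dir is hypothetical, and the code is written to run on both Python 2.7 and Python 3.

```python
import os
import threading


def loaded_from_dir(module, directory):
    """Return True if `module` was loaded from a file located in `directory`."""
    module_file = getattr(module, '__file__', None)
    if module_file is None:  # built-in modules have no __file__
        return False
    module_dir = os.path.dirname(os.path.abspath(module_file))
    return module_dir == os.path.abspath(directory)


if __name__ == '__main__':
    # If the printed path points into your own project, a local file is
    # shadowing the standard library module of the same name.
    print(threading.__file__)
    print(loaded_from_dir(threading, os.getcwd()))
```

If the second line printed is True when run from the script's directory, the script (or a leftover .pyc file) in that directory is being imported in place of the standard library module.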

Second, multithreading with threading

Multithreading is an effective way to speed up a program, and Python's multithreading module, threading, is quick and easy to get started with. This section shows how to use it.

1. Adding threads

threading_test.py

# -*- coding: utf-8 -*-

"""
@author: Corwien
@file: threading_test.py
@time: 18/8/25 09:14
"""

import threading

# Get the number of active threads
# print(threading.active_count())

# List all thread information
# print(threading.enumerate())

# Show the thread that is currently running
# print(threading.current_thread())

def thread_job():
    print('This is a thread of %s' % threading.current_thread())

def main():
    thread = threading.Thread(target=thread_job,)  # define the thread
    thread.start()  # start the thread

if __name__ == '__main__':
    main()

2. The join function

Result without join()

We make the T1 thread's work take longer:

threading_join.py

# -*- coding: utf-8 -*-

"""
@author: Corwien
@file: threading_join.py
@time: 18/8/25 09:14
"""

import threading
import time

def thread_job():
    print('T1 start\n')
    for i in range(10):  # iteration count lost in the source; 10 is assumed
        time.sleep(0.1)  # each step takes 0.1s
    print("T1 finish\n")

def main():
    added_thread = threading.Thread(target=thread_job, name='T1')  # define the thread
    added_thread.start()  # start the thread
    print("all done\n")

if __name__ == '__main__':
    main()

If everything ran strictly in sequence, the expected output would be:

T1 start
T1 finish
all done

But the actual result is:

➜  baselearn python threading/threading_join.py
T1 start
all done
T1 finish
➜  baselearn

Result with join()

Without join, all done is printed before the thread's task has finished. To keep things in order, call join after starting the thread:

added_thread.start()
added_thread.join()
print("all done\n")

Printed result:

➜  baselearn python threading/threading_join.py
T1 start
T1 finish
all done

Full script file:

# -*- coding: utf-8 -*-

"""
@author: Corwien
@file: threading_join.py
@time: 18/8/25 09:14
"""

import threading
import time

def thread_job():
    print('T1 start\n')
    for i in range(10):  # iteration count lost in the source; 10 is assumed
        time.sleep(0.1)  # each step takes 0.1s
    print("T1 finish\n")

def main():
    added_thread = threading.Thread(target=thread_job, name='T1')  # define the thread
    added_thread.start()  # start the thread
    added_thread.join()  # wait for the thread to finish
    print("all done\n")

if __name__ == '__main__':
    main()

A quick exercise

If you add two threads, what does the output look like?

# -*- coding: utf-8 -*-

"""
@author: Corwien
@file: threading_join.py
@time: 18/8/25 09:14
"""

import threading
import time

def T1_job():
    print('T1 start\n')
    for i in range(10):  # iteration count lost in the source; 10 is assumed
        time.sleep(0.1)  # each step takes 0.1s
    print("T1 finish\n")

def T2_job():
    print("T2 start\n")
    print("T2 finish\n")

def main():
    thread_1 = threading.Thread(target=T1_job, name='T1')  # define thread T1
    thread_2 = threading.Thread(target=T2_job, name='T2')  # define thread T2
    thread_1.start()  # start T1
    thread_2.start()  # start T2
    print("all done\n")

if __name__ == '__main__':
    main()

One possible output is:

T1 start
T2 start
T2 finish
all done
T1 finish

Here neither T1 nor T2 is joined. Note the word "possible": where all done appears depends entirely on how fast the two threads run, and T2 finish could just as well appear after all done. This unpredictable ordering is not acceptable, so we use join to control it.

Let's try adding thread_1.join() after starting T1 and before starting T2:

thread_1.start()
thread_1.join()  # notice the difference!
thread_2.start()
print("all done\n")

Printed result:

T1 start
T1 finish
T2 start
all done
T2 finish

As you can see, T2 does not start until T1 has finished.
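To let both threads run concurrently and still guarantee that all done comes last, one option is to start both threads first and join both afterwards. The following is a minimal sketch under that assumption; it records events in a list (a hypothetical stand-in for the print calls, so the order can be inspected) and uses shorter sleeps than the scripts above.

```python
import threading
import time


def t1_job(events):
    events.append('T1 start')
    for _ in range(10):
        time.sleep(0.01)  # simulate a slow task
    events.append('T1 finish')


def t2_job(events):
    events.append('T2 start')
    events.append('T2 finish')


def main():
    events = []  # records the order in which things happen
    thread_1 = threading.Thread(target=t1_job, args=(events,), name='T1')
    thread_2 = threading.Thread(target=t2_job, args=(events,), name='T2')
    thread_1.start()
    thread_2.start()
    # Wait for both threads. The order of the two join() calls does not
    # matter: main continues only after both have returned.
    thread_1.join()
    thread_2.join()
    events.append('all done')
    return events


if __name__ == '__main__':
    for line in main():
        print(line)
```

The relative order of the T1 and T2 lines can still vary between runs, but all done is always recorded last because it is appended only after both join calls return.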

3. Storing thread results in a Queue

What the code does

The code passes the data from a list to four threads, stores each result in a queue, and reads the stored results back from the queue after the threads finish.

In the multithreading function, define a Queue to hold the return values in place of return, define a list of threads, and initialize a multidimensional data list to process:

threading_queue.py

# -*- coding: utf-8 -*-

"""
@author: Corwien
@file: threading_queue.py
@time: 18/8/25 09:14
"""

import threading
import time
from queue import Queue

def job(l, q):
    for i in range(len(l)):
        l[i] = l[i] ** 2
    q.put(l)  # a function run in a thread cannot hand back a value with return

def multithreading():
    q = Queue()  # q holds the return values, in place of return
    threads = []
    data = [[1,2,3],[3,4,5],[4,4,4],[5,5,5]]
    for i in range(4):  # define four threads
        # Thread is capitalized; job is passed without parentheses, as a reference,
        # and its arguments go in args
        t = threading.Thread(target=job, args=(data[i], q))
        t.start()  # start the thread
        threads.append(t)  # append each thread to the thread list
    for thread in threads:
        thread.join()
    results = []
    for _ in range(4):
        results.append(q.get())  # q.get() takes one value from q at a time, in order
    print(results)

if __name__ == '__main__':
    multithreading()

Running the script above produces an error:

➜  baselearn python threading/threading_queue.py
Traceback (most recent call last):
  File "threading/threading_queue.py", line 11, in <module>
    from queue import Queue
ImportError: No module named queue

The cause is the Python version:
Workaround: No module named 'Queue'

On Python 2, the module is named Queue; on Python 3, it was renamed to follow PEP8 guidelines (all lowercase for module names), making it queue. The class remains Queue on all versions (following PEP8).

Typically, the way you'd write version-portable imports would be to do:

try:
    import queue
except ImportError:
    import Queue as queue

This tries the Python 3 name first and falls back to the Python 2 name. In Python 2 we can simply import:

from Queue import Queue

Output:

➜  baselearn python ./threading/threading_queue.py
[[1, 4, 9], [9, 16, 25], [16, 16, 16], [25, 25, 25]]

Full code:
threading_queue.py

# -*- coding: utf-8 -*-

"""
@author: Corwien
@file: threading_queue.py
@time: 18/8/25 09:14
"""

import threading
# import time
from Queue import Queue

def job(l, q):
    for i in range(len(l)):
        l[i] = l[i] ** 2
    q.put(l)  # a function run in a thread cannot hand back a value with return

def multithreading():
    q = Queue()  # q holds the return values, in place of return
    threads = []
    data = [[1,2,3],[3,4,5],[4,4,4],[5,5,5]]
    for i in range(4):  # define four threads
        # Thread is capitalized; job is passed without parentheses, as a reference,
        # and its arguments go in args
        t = threading.Thread(target=job, args=(data[i], q))
        t.start()  # start the thread
        threads.append(t)  # append each thread to the thread list
    for thread in threads:
        thread.join()
    results = []
    for _ in range(4):
        results.append(q.get())  # q.get() takes one value from q at a time, in order
    print(results)

if __name__ == '__main__':
    multithreading()
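The queue hands results back in the order the threads put them in, which is not guaranteed to match the order of the input data. A common variation, sketched below as an illustration and not as the author's code, tags each result with the index of its input so the results can be put back in input order. The index parameter and the tuple layout are assumptions of this sketch, and the job builds a new list instead of modifying its input in place. The import is written to work on both Python 2 and Python 3.

```python
import threading

try:
    from queue import Queue  # Python 3
except ImportError:
    from Queue import Queue  # Python 2


def job(index, values, q):
    """Square every value and put (index, result) on the queue."""
    squared = [v ** 2 for v in values]
    q.put((index, squared))


def multithreading(data):
    q = Queue()
    threads = []
    for index, values in enumerate(data):
        t = threading.Thread(target=job, args=(index, values, q))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    # Place each result in the slot that matches its input position.
    results = [None] * len(data)
    for _ in range(len(data)):
        index, squared = q.get()
        results[index] = squared
    return results


if __name__ == '__main__':
    print(multithreading([[1, 2, 3], [3, 4, 5], [4, 4, 4], [5, 5, 5]]))
```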

4. GIL efficiency problem

What is the GIL?

Now let's look at why Python's multithreading with threading sometimes falls short of expectations. The main reason is a constraint built into Python's design: the Global Interpreter Lock (GIL). It allows Python to execute only one thing at a time.

An explanation of the GIL:

Although Python fully supports multithreaded programming, the C implementation of the interpreter is not thread-safe under fully parallel execution. In fact, the interpreter is protected by a global interpreter lock, which ensures that only one Python thread executes at any given time. The biggest problem with the GIL is that multithreaded Python programs cannot take advantage of multicore CPUs (for example, a computationally intensive program that uses multiple threads still runs on only a single CPU). One thing to emphasize is that the GIL only affects programs that rely heavily on the CPU (for example, compute-bound ones). If your program mostly does I/O, such as network interaction, multithreading is a good fit, because the threads spend most of their time waiting. In fact, you can safely create thousands of Python threads; a modern operating system runs that many threads without any strain, so there is nothing to worry about.
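The point about I/O-bound work can be illustrated with a small timing sketch. This is an illustration under stated assumptions, not a benchmark: time.sleep stands in for a network or disk wait (it releases the GIL while sleeping), and the names fake_io, run_sequential, and run_threaded are hypothetical.

```python
import threading
import time


def fake_io(delay):
    time.sleep(delay)  # stands in for an I/O wait; the GIL is released meanwhile


def run_sequential(n, delay):
    """Run n waits one after another and return the elapsed time."""
    start = time.time()
    for _ in range(n):
        fake_io(delay)
    return time.time() - start


def run_threaded(n, delay):
    """Run n waits in n threads and return the elapsed time."""
    start = time.time()
    threads = [threading.Thread(target=fake_io, args=(delay,)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - start


if __name__ == '__main__':
    print('sequential: %.2fs' % run_sequential(4, 0.2))
    print('threaded:   %.2fs' % run_threaded(4, 0.2))
```

Because the threads spend their time waiting instead of computing, the threaded version should finish in roughly the time of a single wait, while the sequential version takes about four times as long.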

Testing the GIL

We create a job and run the program two ways: with threading and in the normal, single-threaded way. We also create a list to hold the data to process. In normal, we make the list four times as long; in threading, we create four threads. Then we compare the elapsed times.

threading_gil.py

# -*- coding: utf-8 -*-

"""
@author: Corwien
@file: threading_gil.py
@time: 18/8/25 09:14
"""

import threading
from Queue import Queue
import copy
import time

def job(l, q):
    res = sum(l)
    q.put(res)  # a function run in a thread cannot hand back a value with return

def multithreading(l):
    q = Queue()  # q holds the return values, in place of return
    threads = []
    for i in range(4):  # define four threads
        # Thread is capitalized; job is passed without parentheses, as a reference,
        # and its arguments go in args
        t = threading.Thread(target=job, args=(copy.copy(l), q), name="t%i" % i)
        t.start()  # start the thread
        threads.append(t)  # append each thread to the thread list
    for t in threads:
        t.join()
    total = 0
    for _ in range(4):
        total += q.get()  # q.get() takes one value from q at a time, in order
    print(total)

def normal(l):
    total = sum(l)
    print(total)

if __name__ == '__main__':
    l = list(range(1000000))
    s_t = time.time()
    normal(l*4)
    print('normal: ', time.time()-s_t)
    s_t = time.time()
    multithreading(l)
    print('multithreading: ', time.time()-s_t)

If the whole program runs successfully, the output looks something like the following. The results are correct, so the threading and normal versions performed the same amount of computation. But threading is not much faster. With four threads we might expect it to be 3 to 4 times faster, but it is not. This is the GIL at work.

1999998000000
normal:  0.10034608840942383
1999998000000
multithreading:  0.08421492576599121

5. Thread lock: Lock

Without Lock

threading_lock.py

# -*- coding: utf-8 -*-

"""
@author: Corwien
@file: threading_lock.py
@time: 18/8/25 09:14
"""

import threading

# Add 1 to the global variable A each time, loop 10 times, and print
def job1():
    global A
    for i in range(10):
        A += 1
        print('job1', A)

# Add 10 to the global variable A each time, loop 10 times, and print
def job2():
    global A
    for i in range(10):
        A += 10
        print('job2', A)

# Define two threads that run function one and function two
if __name__ == '__main__':
    A = 0
    t1 = threading.Thread(target=job1)
    t2 = threading.Thread(target=job2)
    t1.start()
    t2.start()
    t1.join()
    t2.join()

Printed output:

➜  baselearn python ./threading/threading_lock.py
('job1', ('job2' 1), (one) 'job1' ('job2', ') ' ('job2', +) ('job2', 42 ('job2', 'job2') ('job2', 'a') ('job2', 'a') ('job2', '()') ('job2', 102)
('job1', 103)
('job1', 104)
('job1', 105)
('job1', 106)
('job1', 107)
('job1', 108)
('job1', 109)
('job1', 110)

As you can see, the printed output is jumbled: the two threads interleave their writes.

With Lock

Lock is used when different threads work on shared memory, to make sure the threads do not interfere with each other. Before a thread performs an operation that modifies the shared memory, it calls lock.acquire() to take the lock, which guarantees that no other thread using the same lock touches the memory while the current thread is running. After the operation is complete, it calls lock.release() to open the lock so the shared memory is available to other threads again.

Add the lock to function one and function two:

def job1():
    global A, lock
    lock.acquire()
    for i in range(10):
        A += 1
        print('job1', A)
    lock.release()

def job2():
    global A, lock
    lock.acquire()
    for i in range(10):
        A += 10
        print('job2', A)
    lock.release()

The main block defines a Lock:

if __name__ == '__main__':
    lock = threading.Lock()
    A = 0
    t1 = threading.Thread(target=job1)
    t2 = threading.Thread(target=job2)
    t1.start()
    t2.start()
    t1.join()
    t2.join()

Full code:

# -*- coding: utf-8 -*-

"""
@author: Corwien
@file: threading_lock.py
@time: 18/8/25 09:14
"""

import threading

def job1():
    global A, lock
    lock.acquire()
    for i in range(10):
        A += 1
        print('job1', A)
    lock.release()

def job2():
    global A, lock
    lock.acquire()
    for i in range(10):
        A += 10
        print('job2', A)
    lock.release()

if __name__ == '__main__':
    lock = threading.Lock()
    A = 0
    t1 = threading.Thread(target=job1)
    t2 = threading.Thread(target=job2)
    t1.start()
    t2.start()
    t1.join()
    t2.join()

Printed output:

➜  baselearn python ./threading/threading_lock.py
('job1', 1)
('job1', 2)
('job1', 3)
('job1', 4)
('job1', 5)
('job1', 6)
('job1', 7)
('job1', 8)
('job1', 9)
('job1', 10)
('job2', 20)
('job2', 30)
('job2', 40)
('job2', 50)
('job2', 60)
('job2', 70)
('job2', 80)
('job2', 90)
('job2', 100)
('job2', 110)

From the printed result you can see that, with the lock, one thread runs to completion before the other starts. The printed output with and without the lock is different.
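A Lock can also be used as a context manager, which pairs the acquire and release calls automatically and releases the lock even if the protected code raises an exception. The following is a minimal sketch of that style, not the author's script; the names counter, add_many, and run are hypothetical, and the lock is held per increment instead of around the whole loop.

```python
import threading

counter = 0
lock = threading.Lock()


def add_many(times):
    """Increment the shared counter `times` times, one locked step at a time."""
    global counter
    for _ in range(times):
        with lock:  # same effect as lock.acquire() ... lock.release()
            counter += 1


def run(num_threads, times):
    """Start num_threads workers and return the final counter value."""
    global counter
    counter = 0
    threads = [threading.Thread(target=add_many, args=(times,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == '__main__':
    print(run(4, 10000))
```

Holding the lock only around each increment lets the threads take turns, while the final total still equals num_threads * times because no update can be lost.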
