Python Multithreading and Multiprocessing (1)

Source: Internet
Author: User
Tags: thread, class

Multithreading

Multithreading is the ability of a program to run multiple threads within the same context. These threads share the resources of the same process and can perform multiple tasks concurrently (on a single-core processor) or in parallel (on a multi-core processor).

Multithreading has several advantages:

    • Responsiveness: in a single-threaded program, a long-running task can freeze the whole program. Multithreading lets you move that task into its own thread, so the program can keep responding to user requests while the work runs concurrently.
    • Faster execution: on a multi-core processor, multithreading can speed up a program through real parallelism.
    • Lower resource consumption: using threads, a program can serve multiple requests with the resources of a single process.
    • Simpler state sharing and communication: because threads share the same resources and memory space, communication between threads is much simpler than communication between processes.
    • Parallelization: multi-processor systems allow multiple threads to run independently of one another.

But multithreading also has the following disadvantages:

    • Thread synchronization: because multiple threads operate on the same data, mechanisms must be introduced to prevent race conditions.
    • One faulty thread can crash the whole process: although threads run independently, an unhandled problem in one thread can bring down the entire process.
    • Deadlock: a common problem in threaded code. Typically, one thread locks a resource while performing a task; a deadlock occurs when that thread waits for a resource held by a second thread while the second thread is waiting for the first thread to release its resource.

In general, multithreading is fully capable of parallel computation on multiple processors. However, the official Python implementation (CPython) has a GIL (Global Interpreter Lock) limitation: the GIL prevents multiple threads from executing Python bytecode at the same time, so execution is not truly parallel. If your system has 6 processors, multithreading could in principle drive the CPU to 600%, but in practice you will only see 100%, or even slower execution, because of the GIL.
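As a rough illustration of this effect (my own sketch, not part of the original article; it uses the threading module introduced later), a CPU-bound task run in two threads on CPython is usually no faster than running it twice sequentially:

import threading
import time

def count_down(n):
    # Pure Python, CPU-bound work: under the GIL only one thread can
    # execute this bytecode at a time.
    while n > 0:
        n -= 1

N = 10_000_000

# Run the task twice sequentially.
start = time.time()
count_down(N)
count_down(N)
print("sequential: %.2f s" % (time.time() - start))

# Run the same two tasks in two threads.
start = time.time()
t1 = threading.Thread(target=count_down, args=(N,))
t2 = threading.Thread(target=count_down, args=(N,))
t1.start()
t2.start()
t1.join()
t2.join()
print("threaded:   %.2f s" % (time.time() - start))

On a typical CPython build the threaded timing is roughly the same as, or slightly worse than, the sequential one, even when extra cores are available.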

CPython's GIL is necessary because CPython's memory management is not thread-safe; the GIL ensures that each task proceeds sequentially, so memory is not corrupted during operation. It also makes single-threaded programs run faster and simplifies the use of C extension libraries, since they do not need to worry about multithreading issues.

However, the GIL can be bypassed in some cases. For example, because the GIL only prevents multiple threads from executing Python bytecode at the same time, you can write performance-critical code in C and wrap it for Python; the GIL then does not interfere with multi-threaded concurrency while that code is running.

Another example where the GIL does not hurt performance is a network server: the server spends most of its time reading packets, and the GIL is released while a thread waits on I/O. In this situation, adding threads lets the server read more packets; although this is not true parallelism, it increases the server's throughput even though the speed of any individual operation is unchanged.

Creating a thread with the _thread module

Let's quickly demonstrate the _thread module with an example. The _thread module provides the start_new_thread method, to which we can pass the following arguments:

    • Target function: contains the code we want to run; the thread stops when the function returns.
    • Arguments: the arguments required by the target function, passed as a tuple.
import _thread
import time

def print_time(thread_name, delay):
    count = 0
    while count < 5:
        time.sleep(delay)
        count += 1
        print("%s: %s" % (thread_name, time.ctime(time.time())))

try:
    _thread.start_new_thread(print_time, ("Thread-A", 1))
    _thread.start_new_thread(print_time, ("Thread-B", 2))
except:
    print("Error: unable to start thread")

while 1:
    pass

  

Operation Result:

Thread-A: Sun Jul  8 07:39:27 2018
Thread-B: Sun Jul  8 07:39:28 2018
Thread-A: Sun Jul  8 07:39:28 2018
Thread-A: Sun Jul  8 07:39:29 2018
Thread-B: Sun Jul  8 07:39:30 2018
Thread-A: Sun Jul  8 07:39:30 2018
Thread-A: Sun Jul  8 07:39:31 2018
Thread-B: Sun Jul  8 07:39:32 2018
Thread-B: Sun Jul  8 07:39:34 2018
Thread-B: Sun Jul  8 07:39:36 2018

  

The above example is simple: thread A and thread B execute concurrently.

The _thread module also provides some other easy-to-use low-level thread interfaces:

  • _thread.interrupt_main(): sends a KeyboardInterrupt exception to the main thread, just as if Ctrl+C had been typed at the keyboard. Let's modify print_time so that when count is 2 and delay is 2, it sends the interrupt to the main thread:
    def print_time(thread_name, delay):
        count = 0
        while count < 5:
            time.sleep(delay)
            count += 1
            if count == 2 and delay == 2:
                _thread.interrupt_main()
            print("%s: %s" % (thread_name, time.ctime(time.time())))

    Run Result:

    Thread-A: Sun Jul  8 09:12:57 2018
    Thread-B: Sun Jul  8 09:12:58 2018
    Thread-A: Sun Jul  8 09:12:58 2018
    Thread-A: Sun Jul  8 09:12:59 2018
    Thread-B: Sun Jul  8 09:13:00 2018
    Traceback (most recent call last):
      File "d:/pypath/hello/test3/test01.py", line ..., in <module>
        pass
    KeyboardInterrupt

  • _thread.exit(): exits the calling thread silently; the advantage is that it does not raise any other exception when the thread terminates.
    def print_time(thread_name, delay):
        count = 0
        while count < 5:
            time.sleep(delay)
            count += 1
            if count == 2 and delay == 2:
                _thread.exit()
            print("%s: %s" % (thread_name, time.ctime(time.time())))

    Operation Result:

    Thread-A: Sun Jul  8 09:15:51 2018
    Thread-B: Sun Jul  8 09:15:52 2018
    Thread-A: Sun Jul  8 09:15:52 2018
    Thread-A: Sun Jul  8 09:15:53 2018
    Thread-A: Sun Jul  8 09:15:54 2018
    Thread-A: Sun Jul  8 09:15:55 2018

      

The allocate_lock method returns a lock object that a thread can use to protect a block of code so that it is executed by only one thread at a time. The lock object has three methods (a short illustration follows the list):

    • acquire: requests the lock for the current thread. It accepts an optional integer flag: if the flag is 0, the lock is acquired only if it is available immediately, without waiting; otherwise the thread waits until the lock can be acquired.
    • release: releases the lock so that the next thread can acquire it.
    • locked: returns True if the lock has been acquired by some thread, otherwise False.
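
As a quick illustration of these three methods (a small sketch of my own, not from the article):

import _thread

lock = _thread.allocate_lock()

print(lock.locked())    # False: no thread holds the lock yet
print(lock.acquire())   # True: the (blocking) acquire succeeds immediately
print(lock.locked())    # True: the lock is now held
print(lock.acquire(0))  # False: a non-blocking attempt fails while it is held
lock.release()          # hand the lock back
print(lock.locked())    # False again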

The following code uses 10 threads, each adding 1 to a global variable, so ideally the final value of the global variable should be 10:

import _thread
import time

global_values = 0

def run(thread_name):
    global global_values
    local_copy = global_values
    print("%s with value %s" % (thread_name, local_copy))
    global_values = local_copy + 1

for i in range(10):
    _thread.start_new_thread(run, ("thread-(%s)" % str(i),))

time.sleep(3)
print("global_values: %s" % global_values)
   

  

Operation Result:

thread-(0) with value 0
thread-(1) with value 0
thread-(2) with value 0
thread-(4) with value 0
thread-(6) with value 0
thread-(8) with value 0
thread-(7) with value 0
thread-(5) with value 0
thread-(3) with value 0
thread-(9) with value 1
global_values: 2

    

Unfortunately, we did not get the result we hoped for; in fact, the result is far from what we wanted. This happens because multiple threads manipulate the same variable at the same time: some threads read a stale value and then write back a result computed from that old value, overwriting updates made by other threads.

Now, let's change the original code:

import _thread
import time

global_values = 0

def run(thread_name, lock):
    global global_values
    lock.acquire()
    local_copy = global_values
    print("%s with value %s" % (thread_name, local_copy))
    global_values = local_copy + 1
    lock.release()

lock = _thread.allocate_lock()

for i in range(10):
    _thread.start_new_thread(run, ("thread-(%s)" % str(i), lock))

time.sleep(3)
print("global_values: %s" % global_values)

  

Operation Result:

thread-(0) with value 0
thread-(2) with value 1
thread-(4) with value 2
thread-(5) with value 3
thread-(3) with value 4
thread-(6) with value 5
thread-(1) with value 6
thread-(7) with value 7
thread-(8) with value 8
thread-(9) with value 9

  

We can now see that the threads still execute in an arbitrary order, but the global variable is incremented correctly, one step at a time.

The _thread module has several other methods as well (a small illustration follows the list):

    • _thread.get_ident(): returns a non-zero integer identifying the currently active thread. The identifier may be recycled after the thread ends or exits, so it is not unique over the whole lifetime of the program.
    • _thread.stack_size(size): the optional size argument sets the stack size used when new threads are created, and the method returns the current setting. size may be 0 (the platform default) or a positive value of at least 32 KiB, subject to operating-system constraints.
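
A tiny sketch (my own, not from the article) showing both calls:

import _thread
import time

def worker():
    # Each running thread reports its own non-zero identifier.
    print("worker thread id: %s" % _thread.get_ident())

print("main thread id:   %s" % _thread.get_ident())

# With no argument, stack_size() just reports the current setting
# (0 means "use the platform default").
print("thread stack size: %s" % _thread.stack_size())

_thread.start_new_thread(worker, ())
time.sleep(1)   # crude wait so the worker gets a chance to print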

Creating a thread with the threading module

threading is the recommended module for working with threads in Python; it provides a more complete, higher-level interface. Let's convert the previous example to the threading module:

import threading
import time

global_values = 0

def run(thread_name, lock):
    global global_values
    lock.acquire()
    local_copy = global_values
    print("%s with value %s" % (thread_name, local_copy))
    global_values = local_copy + 1
    lock.release()

lock = threading.Lock()

for i in range(10):
    t = threading.Thread(target=run, args=("thread-(%s)" % str(i), lock))
    t.start()

time.sleep(3)
print("global_values: %s" % global_values)

  

For more complex situations, or to better encapsulate a thread's behaviour, we may need to create our own thread class. A few things to note:

    • The class must inherit from the threading.Thread class.
    • You need to override the run method; you can also override the __init__ method.
    • If you override the initializer __init__, you must call the parent class initializer threading.Thread.__init__ at the beginning.
    • The thread stops when its run method returns or raises an unhandled exception, so design the method with this in mind.
    • A thread can be named via the name parameter of the initializer.
import threading
import time

class MyThread(threading.Thread):
    def __init__(self, count):
        threading.Thread.__init__(self)
        self.total = count

    def run(self):
        for i in range(self.total):
            time.sleep(1)
            print("thread: %s - %s" % (self.name, i))

t = MyThread(2)
t2 = MyThread(3)
t.start()
t2.start()
print("Finish")

  

Operation Result:

Finish
thread: Thread-2 - 0
thread: Thread-1 - 0
thread: Thread-2 - 1
thread: Thread-1 - 1
thread: Thread-2 - 2

  

Note that the main thread above prints "Finish" before the other threads print anything. That is not a big problem here, but the following situation is problematic:

f = open ("Content.txt", "w+") T = MyThread (2, f) t2 = MyThread (3, F) T.start () T2.start () F.close ()

  

Assume that MyThread has been modified to write its output to content.txt instead of printing it. This code is problematic because the main thread may close the file handle before the other threads have finished (or even started) writing. To avoid this, use the join method, which makes the calling thread wait for the joined thread to finish before continuing:

f = open ("Content.txt", "w+") T = MyThread (2, f) t2 = MyThread (3, F) T.start () T2.start () T.join () T2.join () F.close () print (" Finish ")

  

The join method also accepts an optional time limit (a float or None), in seconds. However, join always returns None, so to check whether the operation timed out you have to inspect the thread's state after join returns: if the thread is still alive, the join timed out.
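
A small sketch of that check (my own illustration), reusing the original single-argument MyThread class defined earlier, which runs for roughly one second per count:

t = MyThread(5)   # runs for roughly 5 seconds
t.start()

t.join(2.0)       # wait at most 2 seconds; join() itself returns None

if t.is_alive():
    print("join timed out, the thread is still running")
else:
    print("the thread finished within the time limit")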

Another example: checking the HTTP status codes of a group of sites:

from urllib.request import urlopen

sites = [
    "https://www.baidu.com/",
    "http://www.sina.com.cn/",
    "http://www.qq.com/"
]

def check_http_status(url):
    return urlopen(url).getcode()

http_status = {}
for url in sites:
    http_status[url] = check_http_status(url)

for key, value in http_status.items():
    print("%s %s" % (key, value))

  

Operation Result:

# time python3 test01.py
https://www.baidu.com/ 200
http://www.sina.com.cn/ 200
http://www.qq.com/ 200

real    0m1.669s
user    0m0.143s
sys     0m0.026s

  

Now let's move the I/O operation into threads to optimize the code:

from urllib.request import urlopen
import threading

sites = [
    "https://www.baidu.com/",
    "http://www.sina.com.cn/",
    "http://www.qq.com/"
]

class HttpStatusChecker(threading.Thread):
    def __init__(self, url):
        threading.Thread.__init__(self)
        self.url = url
        self.status = None

    def run(self):
        self.status = urlopen(self.url).getcode()

threads = []
for url in sites:
    t = HttpStatusChecker(url)
    t.start()
    threads.append(t)

for t in threads:
    t.join()

for t in threads:
    print("%s %s" % (t.url, t.status))

  

Operation Result:

# time python3 test01.py
https://www.baidu.com/ 200
http://www.sina.com.cn/ 200
http://www.qq.com/ 200

real    0m0.237s
user    0m0.110s
sys     0m0.019s

  

Clearly the threaded version is faster: it runs about 7 times faster than the previous version (0.237 s versus 1.669 s of wall-clock time), a significant performance improvement.

Implementing inter-thread communication with the Event object

Although threads usually run as independent or parallel tasks, there are times when they need to communicate. The threading module provides the Event object for inter-thread communication: it contains an internal flag, which calling threads can manipulate with the set() and clear() methods.

The interface of the Event class is simple; it supports the following methods (a small wait() illustration follows the list):

    • is_set: returns True if the event's internal flag is set.
    • set: sets the internal flag to True. It wakes up all threads waiting for the flag to be set; threads that call wait() will no longer block.
    • clear: resets the internal flag to False. Threads that call wait() will block until set() is called again.
    • wait: blocks the calling thread until the event's internal flag is set. It supports an optional timeout argument in seconds; if a timeout is given, the thread blocks at most that long.
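
As a small sketch of my own (not part of the article's example below), wait() also returns a boolean: True if the flag was set, False if the timeout expired first:

import threading

event = threading.Event()

def waiter():
    # Block for at most 2 seconds waiting for the flag to be set.
    flag_was_set = event.wait(2.0)
    print("waiter woke up, flag set: %s" % flag_was_set)

t = threading.Thread(target=waiter)
t.start()

event.set()   # set the flag, waking the waiting thread immediately
t.join()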

Let's use the Event object to demonstrate a simple example of thread communication in which two threads take turns printing a string. The two threads share a single Event object; in a while loop, one thread sets the flag on each iteration and the other thread resets it.

import threading
import time

class ThreadA(threading.Thread):
    def __init__(self, event):
        threading.Thread.__init__(self)
        self.event = event

    def run(self):
        count = 0
        while count < 6:
            time.sleep(1)
            if self.event.is_set():
                print("A")
                self.event.clear()
            count += 1

class ThreadB(threading.Thread):
    def __init__(self, event):
        threading.Thread.__init__(self)
        self.event = event

    def run(self):
        count = 0
        while count < 6:
            time.sleep(1)
            if not self.event.is_set():
                print("B")
                self.event.set()
            count += 1

event = threading.Event()
ta = ThreadA(event)
tb = ThreadB(event)
ta.start()
tb.start()

  

Operation Result:

B
A
B
A
B
A
B
A
B
A
B

  

Here's a summary of when to use Python multithreading:

Use multithreading when:

    • The program performs frequent I/O operations
    • The tasks can be handled well enough with concurrency rather than true parallelism
    • You are developing a GUI

Do not use multithreading when:

    • The program performs a large amount of CPU-bound work
    • The program must take advantage of multiple cores

Multi-process

Because of the GIL, Python's multithreading cannot achieve true parallelism, so some problems cannot be solved with the threading module.

However, Python provides an alternative for parallelism: multiprocessing. With multiprocessing, threads are replaced by subprocesses, and each process runs with its own GIL (so Python can run multiple processes in parallel, limited only by the hardware). It is important to understand that threads are part of the same process and share its memory, storage space, and compute resources, whereas processes do not share memory with their parent process, so inter-process communication is more complex than inter-thread communication.
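
The article introduces multiprocessing here without code; as a minimal sketch (my own, not the author's), the earlier lock-protected counter example can be rewritten with multiprocessing.Process. Note that plain globals are not shared between processes, so a shared multiprocessing.Value is used instead:

import multiprocessing

def run(process_name, counter, lock):
    with lock:
        local_copy = counter.value
        print("%s with value %s" % (process_name, local_copy))
        counter.value = local_copy + 1

if __name__ == "__main__":
    counter = multiprocessing.Value("i", 0)   # an integer shared across processes
    lock = multiprocessing.Lock()

    processes = []
    for i in range(10):
        p = multiprocessing.Process(target=run,
                                    args=("process-(%s)" % i, counter, lock))
        p.start()
        processes.append(p)

    for p in processes:
        p.join()

    print("counter: %s" % counter.value)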

The advantages and disadvantages of multiprocessing compared with multithreading are as follows:

Advantages:

    • Can take advantage of multi-core systems
    • Each process uses its own memory space, which avoids race problems
    • Child processes can be interrupted easily
    • Not subject to the GIL limitation

Disadvantages:

    • Higher memory consumption
    • Sharing data between processes is more difficult
    • Inter-process communication is harder than communication between threads
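
Because processes do not share memory, data must be exchanged explicitly. Here is a minimal sketch of my own using multiprocessing.Queue, one of the standard inter-process communication tools:

import multiprocessing

def worker(q):
    # Send a result back to the parent process through the queue.
    q.put("hello from the child process")

if __name__ == "__main__":
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=worker, args=(q,))
    p.start()
    print(q.get())   # blocks until the child puts something on the queue
    p.join()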

  
