Concurrent Programming: Multithreading

Source: Internet
Author: User
Tags: mutex, processing text, semaphore

One, what is multithreading?
A process is simply a way of bundling resources together (a process is a resource unit, or a collection of resources); a thread is the unit of execution that runs on the CPU.
Multithreading means that there are multiple threads within one process, and those threads share the address space of that process.

Two, the difference between threads and processes
1. Threads share the address space of the process that created them; processes each have their own address space.
2. A thread can directly access the data segment of its process; a child process gets its own copy of the parent process's data segment.
3. Threads can communicate directly with the other threads of their process; processes must use inter-process communication to talk to sibling processes.
4. New threads are cheap to create; a new process requires duplicating the parent process.
5. A thread can exert considerable control over the other threads of the same process; a process can only control its child processes.
6. Changes to the main thread (cancellation, priority changes, and so on) may affect the behavior of the other threads of the process; changes to a parent process do not affect its child processes.

So we use multithreading in scenarios like these:
1. Multiple threads within the same process need to share the process's address space and resources.
2. Creating a thread costs far less than creating a process. (Creating a process is like building a new workshop: address space must be allocated and at least one production line set up inside it. Creating a thread just adds another production line inside an existing workshop, with no new space to allocate, so the cost is small.)

Three, a multithreading example
Open a word-processing application: that one process must do several things at once, such as listening for keyboard input, processing the text, and automatically saving the text to disk. All three tasks operate on the same piece of data, so multiple processes cannot be used; only by opening three threads inside one process can they run concurrently. With a single thread, only one task could run at a time: while taking keyboard input it could not process text or auto-save, and while auto-saving it could not take input or process text.
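To make the scenario concrete, here is a minimal sketch (not real editor code; the buffer, delays, and function names are invented for illustration) of three threads in one process working on the same shared text buffer:

import threading
import time

text_buffer = []                     # shared data: all three threads see the same list
buffer_lock = threading.Lock()

def listen_keyboard():
    for ch in "hello":               # stand-in for real keyboard input
        with buffer_lock:
            text_buffer.append(ch)
        time.sleep(0.1)

def process_text():
    time.sleep(0.3)
    with buffer_lock:
        print("processing:", "".join(text_buffer))

def auto_save():
    time.sleep(0.6)
    with buffer_lock:
        print("saving to disk:", "".join(text_buffer))

if __name__ == '__main__':
    threads = [threading.Thread(target=f) for f in (listen_keyboard, process_text, auto_save)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()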

Four, introduction to the threading module
The multiprocessing module mimics the interface of the threading module, so the two are very similar to use.

Five, two ways to start a thread
Way one:
import time
import random
from threading import Thread

def study(name):
    print("%s is learning" % name)
    time.sleep(random.randint(1, 3))
    print("%s is done" % name)

if __name__ == '__main__':
    t = Thread(target=study, args=('james',))
    t.start()
    print("Main thread starts running ....")

Way two:
import time
import random
from threading import Thread

class MyThread(Thread):
    def __init__(self, name):
        super().__init__()
        self.name = name

    def run(self):
        print("%s is learning" % self.name)
        time.sleep(random.randint(1, 3))
        print("%s is done" % self.name)

if __name__ == '__main__':
    t = MyThread('james')
    t.start()
    print("Main thread starts running ....")

Six, multithreaded implementation of concurrent socket communication
Client:

# _*_ coding:utf-8 _*_
from socket import *

ip_port = ('127.0.0.1', 9999)
client = socket(AF_INET, SOCK_STREAM)
client.connect(ip_port)

while True:
    cmd = input(">>>").strip()
    if not cmd:
        continue
    client.send(cmd.encode('utf-8'))
    data = client.recv(1024)
    print(data.decode('utf-8'))
client.close()

Server:
import threading
import socket

ip_port = ('127.0.0.1', 9999)
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(ip_port)
s.listen(5)

def action(conn):
    while True:
        data = conn.recv(1024)
        print(data)
        conn.send(data.upper())

if __name__ == '__main__':
    while True:
        conn, addr = s.accept()
        t = threading.Thread(target=action, args=(conn,))
        t.start()

Seven, differences between multithreading and multiprocessing
A process must have memory space allocated for it, so the cost of starting a process is much larger than the cost of starting a thread.

Opening a thread under the main process:
import time
import random
from threading import Thread

def study(name):
    print("%s is learning" % name)
    time.sleep(random.randint(1, 3))
    print("%s is playing" % name)

if __name__ == '__main__':
    t = Thread(target=study, args=('james',))
    t.start()
    print("The main thread starts running ....")

Execution result: the thread is started almost at the same moment as t.start().

Opening a child process under the main process:
import time
import random
from multiprocessing import Process

def study(name):
    print("%s is learning" % name)
    time.sleep(random.randint(1, 3))
    print("%s is playing" % name)

if __name__ == '__main__':
    p = Process(target=study, args=('james',))
    p.start()
    print("The main process starts running ....")

Execution result: p.start() only sends the "start a process" signal to the operating system; the OS then has to allocate memory space and copy the parent's address space into the child, so the main process's print appears well before the child's output.

Viewing the PID
Multiple threads opened under the main thread all have the same PID as the main thread (threads share the PID of their process):
from threading import Thread
import os

def work():
    print('hello', os.getpid())

if __name__ == '__main__':
    t1 = Thread(target=work)
    t2 = Thread(target=work)
    t1.start()
    t2.start()
    print('main thread/main process pid', os.getpid())

Multiple processes each have a different PID:
from multiprocessing import Process
import os

def work():
    print('hello', os.getpid())

if __name__ == '__main__':
    p1 = Process(target=work)
    p2 = Process(target=work)
    p1.start()
    p2.start()
    print('main thread/main process pid', os.getpid())

Threads in the same process share the process's data, while the address spaces of different processes are isolated:
from multiprocessing import Process

def work():
    global n
    n = 0

if __name__ == '__main__':
    n = 100
    p = Process(target=work)
    p.start()
    p.join()
    print('main', n)

Execution result: the child process p changed its global n to 0, but only its own copy; in the parent process n is still 100.

from threading import Thread

def work():
    global n
    n = 0

if __name__ == '__main__':
    n = 100
    t = Thread(target=work)
    t.start()
    t.join()
    print('main', n)

Execution result: the printed value is 0, because data inside a process is shared among the threads of that process.

Eight, other properties and methods of thread objects
Introduction
Methods of a Thread instance object:
# isAlive(): returns whether the thread is alive (spelled is_alive() in current Python versions).
# getName(): returns the thread's name.
# setName(): sets the thread's name.

Some of the functions provided by the threading module:
# threading.current_thread() (also currentThread()): returns the current Thread object.
# threading.enumerate(): returns a list of the threads currently running, i.e. threads that have been started and have not yet finished; threads not yet started or already ended are not included.
# threading.active_count() (also activeCount()): returns the number of running threads; same result as len(threading.enumerate()).
(A short sketch of these module-level helpers follows the example below.)

from threading import Thread
from threading import current_thread
import time

def task():
    print("%s is running" % current_thread().getName())
    time.sleep(1)
    print("%s is done" % current_thread().getName())

if __name__ == '__main__':
    # There is no real concept of a "child thread"; the term is used here only for ease of understanding
    t = Thread(target=task, name='child thread 1')
    t.start()
    t.setName('son thread 1')
    print("Main thread %s" % current_thread().getName())
    # Main thread MainThread
    # child thread 1 is running
    # son thread 1 is done
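The module-level helpers listed above can be seen in a minimal sketch (the worker count and sleep time are arbitrary):

import threading
import time

def task():
    time.sleep(1)

if __name__ == '__main__':
    for i in range(3):
        threading.Thread(target=task).start()
    print(threading.current_thread())   # the MainThread object
    print(threading.enumerate())        # the main thread plus the 3 workers still running
    print(threading.active_count())     # same as len(threading.enumerate()), here 4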

The main thread waits for the child thread to end:
from threading import Thread
import time

def sayhi(name):
    time.sleep(2)
    print('%s say hello' % name)

if __name__ == '__main__':
    t = Thread(target=sayhi, args=('james',))
    t.start()
    t.join()
    print('main thread')
    print(t.is_alive())
'''
james say hello
main thread
False
'''

Nine, daemon threads
1. The main process is considered finished when its code finishes (daemon child processes are reclaimed at that point), but before it truly exits it waits for all non-daemon child processes to finish so that it can reclaim their resources (otherwise zombie processes would be produced).
2. The main thread only finishes after all other non-daemon threads have finished (daemon threads are reclaimed at that point). Because the end of the main thread means the end of the process, and the process's resources are reclaimed as a whole, the process must make sure all non-daemon threads have finished before it ends.

from threading import Thread
import time

def task(name):
    time.sleep(1)
    print("%s is working" % name)

if __name__ == '__main__':
    t = Thread(target=task, args=('james',))
    # t.setDaemon(True)  # must be set before t.start(); has the same effect as t.daemon = True
    t.daemon = True
    t.start()
    print("Main thread")
    print(t.is_alive())

Mutex: similar to a process's mutex.

Mutex:

from threading import Thread, Lock
import time

n = 100

def task():
    global n
    mutex.acquire()
    temp = n
    time.sleep(0.1)
    n = temp - 1
    mutex.release()

if __name__ == '__main__':
    mutex = Lock()
    t_l = []
    for i in range(100):
        t = Thread(target=task)
        t_l.append(t)
        t.start()

    for t in t_l:
        t.join()
    print('main', n)

Ten, the GIL (Global Interpreter Lock)
The GIL is essentially a mutex, and like any mutex it turns concurrent execution into serial execution, so that at any moment the shared data can be modified by only one task, which keeps the data safe.
Every time a Python program runs, a separate interpreter process is started.

The difference between the GIL and Lock:
The purpose of a lock is to protect shared data, so that only one thread at a time can modify that data.
To protect different pieces of data, you should use different locks.
The GIL and Lock are two different locks protecting different data: the former works at the interpreter level, while the latter protects the data of the application the user writes. The GIL is obviously not responsible for that, so the user's own data can only be protected by a user-defined lock, i.e. Lock.

from threading import Thread, Lock
import time

def work():
    global n
    lock.acquire()
    temp = n
    time.sleep(0.1)
    n = temp - 1
    lock.release()

if __name__ == '__main__':
    lock = Lock()
    n = 100
    l = []
    for i in range(100):
        t = Thread(target=work)
        l.append(t)
        t.start()
    for t in l:
        t.join()

    print(n)  # The result is always 0: the concurrent execution has become serial, sacrificing speed to guarantee data safety. Without the lock, the result could be 99.

The GIL and multithreading
Because of the GIL, only one thread within a process can execute at any given moment.
1. For computation, more CPUs help (CPU-intensive code); for I/O, more CPUs are of no use (I/O-intensive code spends its time waiting, not computing).
2. Of course, for running a program as a whole, execution efficiency does improve as CPUs are added.

In Python multithreading, each thread executes like this:
1. Acquire the GIL
2. Run code until the thread sleeps or the Python virtual machine suspends it
3. Release the GIL

Every time the GIL is released, the threads compete for it again and a thread switch occurs, which consumes resources. And because the GIL exists, a Python process can only ever execute one thread at a time (the thread that holds the GIL), which is why Python's multithreading does not make good use of multicore CPUs.

Multithreading performance tests
If the concurrent tasks are computation-intensive, multiprocessing is more efficient:
from multiprocessing import Process
from threading import Thread
import os, time

def work():
    res = 0
    for i in range(100000000):
        res *= i

if __name__ == '__main__':
    l = []
    print(os.cpu_count())  # 4 cores on this machine
    start = time.time()
    for i in range(4):
        p = Process(target=work)  # takes a little over 5s
        # p = Thread(target=work)  # takes a little over 18s
        l.append(p)
        p.start()
    for p in l:
        p.join()
    stop = time.time()
    print('Run time is %s' % (stop - start))

If the concurrent tasks are I/O-intensive, multithreading is more efficient:
from multiprocessing import Process
from threading import Thread
import os, time

def work():
    time.sleep(2)
    print('===>')

if __name__ == '__main__':
    l = []
    print(os.cpu_count())  # 4 cores on this machine
    start = time.time()
    for i in range(400):
        # p = Process(target=work)  # takes a little over 12s, most of it spent creating processes
        p = Thread(target=work)  # takes a little over 2s
        l.append(p)
        p.start()
    for p in l:
        p.join()
    stop = time.time()
    print('Run time is %s' % (stop - start))

Application: use multithreading for I/O-intensive work, such as sockets, crawlers, and web services;
use multiprocessing for computation-intensive work, such as financial analysis.

Eleven, deadlock and recursive locks
Deadlock: two or more processes or threads wait on each other during execution because they are competing for resources; without outside intervention, none of them can proceed. The system is then said to be in a deadlock state, and the processes that are stuck waiting on each other forever are called deadlocked processes.

from threading import Thread, Lock
import time

mutexA = Lock()
mutexB = Lock()

class MyThread(Thread):
    def run(self):
        self.func1()
        self.func2()

    def func1(self):
        mutexA.acquire()
        print('\033[41m%s got lock A\033[0m' % self.name)
        mutexB.acquire()
        print('\033[42m%s got lock B\033[0m' % self.name)
        mutexB.release()
        mutexA.release()

    def func2(self):
        mutexB.acquire()
        print('\033[43m%s got lock B\033[0m' % self.name)
        time.sleep(2)
        mutexA.acquire()
        print('\033[44m%s got lock A\033[0m' % self.name)
        mutexA.release()
        mutexB.release()

if __name__ == '__main__':
    for i in range(10):
        t = MyThread()
        t.start()

Recursive locks
To support acquiring the same resource multiple times within the same thread, Python provides the reentrant lock RLock. Internally, RLock maintains a Lock and a counter; the counter records the number of acquire calls, so the resource can be acquired repeatedly by the same thread. Only when all of a thread's acquires have been released can other threads obtain the resource. If the example above uses RLock instead of Lock, no deadlock occurs; the difference is that a recursive lock can be acquired multiple times by its owner, while a mutex can only be acquired once.

from threading import Thread, RLock
import time

mutexA = mutexB = RLock()  # when a thread grabs the lock the counter becomes 1; if the same thread acquires it again the counter keeps increasing; all other threads must wait until that thread releases every acquire, i.e. until the counter drops back to 0

class MyThread(Thread):
    def run(self):
        self.func1()
        self.func2()

    def func1(self):
        mutexA.acquire()
        print('\033[41m%s got lock A\033[0m' % self.name)
        mutexB.acquire()
        print('\033[42m%s got lock B\033[0m' % self.name)
        mutexB.release()
        mutexA.release()

    def func2(self):
        mutexB.acquire()
        print('\033[43m%s got lock B\033[0m' % self.name)
        time.sleep(2)
        mutexA.acquire()
        print('\033[44m%s got lock A\033[0m' % self.name)
        mutexA.release()
        mutexB.release()

if __name__ == '__main__':
    for i in range(10):
        t = MyThread()
        t.start()

Twelve, Semaphore, Event, Timer
Semaphore
A semaphore is also a lock. You can specify a semaphore of, say, 5: whereas a mutex lets only one task at a time grab the lock and run, a semaphore of 5 lets 5 tasks hold the lock and run at the same time. If a mutex is one toilet in a house being fought over, a semaphore is a public toilet with several stalls: several people can use it at the same time, but the number of stalls is fixed, and that number is the size of the semaphore.

from threading import Thread, Semaphore
import threading
import time

def func():
    sm.acquire()
    print('%s got sm' % threading.current_thread().getName())
    time.sleep(3)
    sm.release()

if __name__ == '__main__':
    sm = Semaphore(5)
    for i in range(23):
        t = Thread(target=func)
        t.start()

Semaphore manages a built-in counter:
whenever acquire() is called, the counter is decremented by 1;
whenever release() is called, the counter is incremented by 1;
the counter can never go below 0, and when it is 0, acquire() blocks the thread until some other thread calls release().
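A small illustration of this counter behaviour, using the standard non-blocking form of acquire (the semaphore size of 2 is arbitrary):

from threading import Semaphore

sm = Semaphore(2)                    # internal counter starts at 2
print(sm.acquire(blocking=False))    # True, counter -> 1
print(sm.acquire(blocking=False))    # True, counter -> 0
print(sm.acquire(blocking=False))    # False: counter is 0, a blocking acquire would wait here
sm.release()                         # counter -> 1, which would wake up one waiting thread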

Event
A key characteristic of threads is that each thread runs independently and its state is unpredictable. Thread synchronization becomes tricky when other threads in a program need to know the state of some thread in order to decide what to do next. To solve this, the threading library provides the Event object. An Event contains a signal flag that threads can set, and it lets threads wait for something to happen. Initially, the flag is False. If a thread waits on an Event whose flag is False, the thread blocks until the flag becomes True. When a thread sets the flag to True, all threads waiting on that Event are woken up. A thread that waits on an Event whose flag is already True returns immediately and keeps running.

from threading import Event
event.is_set(): returns the Event's flag value;
event.wait(): blocks the calling thread while event.is_set() == False;
event.set(): sets the Event's flag to True; all blocked threads are moved to the ready state and wait to be scheduled by the operating system;
event.clear(): resets the Event's flag to False.

from threading import Thread, Event
import threading
import time, random

def conn_mysql():
    count = 1
    while not event.is_set():
        if count > 3:
            raise TimeoutError('Connection timed out')
        print('<%s> attempt %s to connect' % (threading.current_thread().getName(), count))
        event.wait(0.5)
        count += 1
    print('<%s> connected successfully' % threading.current_thread().getName())

def check_mysql():
    print('\033[45m[%s] checking mysql\033[0m' % threading.current_thread().getName())
    time.sleep(random.randint(2, 4))
    event.set()

if __name__ == '__main__':
    event = Event()
    conn1 = Thread(target=conn_mysql)
    conn2 = Thread(target=conn_mysql)
    check = Thread(target=check_mysql)

    conn1.start()
    conn2.start()
    check.start()

Timer
A Timer executes an action after a specified number of seconds:
from threading import Timer

def hello():
    print("hello, world")

t = Timer(1, hello)
t.start()  # after 1 second, "hello, world" will be printed

Thirteen, thread queues
There are three kinds of thread queue:
1. class queue.Queue(maxsize=0)  # queue: first in, first out
import queue

q = queue.Queue()
q.put('first')
q.put('second')
q.put('third')

print(q.get())
print(q.get())
print(q.get())
2. class queue.LifoQueue(maxsize=0)  # stack: last in, first out
import queue

q = queue.LifoQueue()
q.put('first')
q.put('second')
q.put('third')

print(q.get())
print(q.get())
print(q.get())

3. class queue.PriorityQueue(maxsize=0)  # priority queue: items can be given a priority when stored
import queue

q = queue.PriorityQueue()

# Items are put in as tuples whose first element is the priority (usually a number; non-numeric values work as long as they can be compared); the smaller the value, the higher the priority.

q.put((20, 'a'))   # example priority values: the item with priority 10 is retrieved first
q.put((10, 'b'))
q.put((30, 'c'))

print(q.get())
print(q.get())
print(q.get())

Fourteen, process pools and thread pools
Official documentation: https://docs.python.org/dev/library/concurrent.futures.html
The concurrent.futures module provides a highly encapsulated asynchronous-call interface:
ThreadPoolExecutor: a thread pool providing asynchronous calls
ProcessPoolExecutor: a process pool providing asynchronous calls
Both implement the same interface, which is defined by the abstract Executor class.

Basic methods
1. submit(fn, *args, **kwargs): submit a task asynchronously
2. map(func, *iterables, timeout=None, chunksize=1): replaces a for loop of submit calls
3. shutdown(wait=True): the equivalent of the process pool's pool.close() + pool.join()
   wait=True waits until every task in the pool has finished and its resources have been reclaimed before continuing;
   wait=False returns immediately and does not wait for the tasks in the pool to finish;
   but regardless of the wait value, the whole program still waits for all tasks to complete.
   submit and map must be called before shutdown.
4. result(timeout=None): obtain a task's result
5. add_done_callback(fn): attach a callback function

Process pool
Usage:
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
import os, time, random

def task(n):
    print('%s is running' % os.getpid())
    time.sleep(random.randint(1, 3))
    return n**2

if __name__ == '__main__':
    executor = ProcessPoolExecutor(max_workers=3)
    futures = []
    for i in range(11):
        future = executor.submit(task, i)
        futures.append(future)
    executor.shutdown(True)
    print('+++>')
    for future in futures:
        print(future.result())

Thread pool
Usage:
Change ProcessPoolExecutor to ThreadPoolExecutor; everything else is used in exactly the same way.
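For completeness, a minimal sketch of a task submitted to a ThreadPoolExecutor, here using the with statement, which calls shutdown(wait=True) on exit (the worker count and the task itself are illustrative):

from concurrent.futures import ThreadPoolExecutor
import time

def task(n):
    time.sleep(0.5)
    return n ** 2

if __name__ == '__main__':
    with ThreadPoolExecutor(max_workers=3) as executor:
        futures = [executor.submit(task, i) for i in range(11)]
    # leaving the with block waits for every task to finish
    for future in futures:
        print(future.result())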

The map method
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
import os, time, random

def task(n):
    print('%s is running' % os.getpid())
    time.sleep(random.randint(1, 3))
    return n**2

if __name__ == '__main__':
    executor = ThreadPoolExecutor(max_workers=3)
    # for i in range(11):
    #     future = executor.submit(task, i)
    executor.map(task, range(1, 12))  # map replaces the for loop + submit

Callback functions
A function can be bound to each task of a process pool or thread pool; it fires automatically when the task finishes and receives the task's Future object as its argument (the return value is then obtained via .result()). Such a function is called a callback function.

from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
import requests
import os

def get_page(url):
    print('<process %s> get %s' % (os.getpid(), url))
    respone = requests.get(url)
    if respone.status_code == 200:
        return {'url': url, 'text': respone.text}

def parse_page(res):
    res = res.result()
    print('<process %s> parse %s' % (os.getpid(), res['url']))
    parse_res = 'url:<%s> size:[%s]\n' % (res['url'], len(res['text']))
    with open('db.txt', 'a') as f:
        f.write(parse_res)

if __name__ == '__main__':
    urls = [
        'https://www.baidu.com',
        'https://www.python.org',
        'https://www.openstack.org',
        'https://help.github.com/',
        'http://www.sina.com.cn/'
    ]

    p = ProcessPoolExecutor(3)
    for url in urls:
        p.submit(get_page, url).add_done_callback(parse_page)

    # parse_page receives a Future object obj; call obj.result() to get the actual return value
