Multithreading in Python (1): A Preliminary Study of the thread Module
The first article of the new year, continuing with Python.
First, a brief introduction to processes and threads.
In the early days of computing, a program had exclusive use of all memory and hardware resources from start to finish, and a computer could run only one program at a time. Later, mechanisms such as pipes and multiprocessing were introduced to improve this model, and programs began to run concurrently: the processor switches between them, and because it switches at a very high frequency, the programs appear to execute at the same time. This is concurrency.
The operating system manages this concurrency. A program is loaded into memory, the operating system begins scheduling it, and its life cycle starts; every running program takes the form of a process. Each process has its own address space, memory, data stack, and so on, and from its own point of view it has exclusive use of the hardware; each process sees a virtual memory space that starts at address 0. In other words, a process is an abstraction. The operating system manages all processes and assigns each of them slices of running time. Processes exchange data only through inter-process communication; they cannot share information directly.
So what is a thread?
A thread is often called a lightweight process. It is similar to a process, but the difference is that all the threads of a program run inside the same process and share the same runtime environment: they have the same address space and the same data stack. You can think of a process as containing many threads running in parallel.
A thread has three parts: a start, a sequence of execution, and an end. It has its own instruction pointer that records where in its execution it currently is. A thread can be preempted, or it can temporarily give up the processor so that other threads may run; the latter is called yielding.
Because threads share the same data space, they can share data and communicate with each other easily. The problem is that when multiple threads access the same piece of data at the same time, the results may become inconsistent, producing the so-called race condition.
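Here is a minimal sketch of such a race; it borrows thread.start_new_thread and a crude sleep() wait, both introduced later in this article, and the numbers are only illustrative. Two threads each add 1 to a shared counter 100,000 times, and because count += 1 is not an atomic operation, some updates are usually lost.

import thread
from time import sleep

count = 0

def bump():
    global count
    for _ in xrange(100000):
        count += 1        # a load, an add and a store: another thread can slip in between

thread.start_new_thread(bump, ())
thread.start_new_thread(bump, ())
sleep(2)                  # crude wait for both threads; a better way is shown below
print 'count =', count    # usually less than 200000, and different on every run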
In general, threads are concurrent flows of execution inside a single program, which let several tasks be carried out at the same time.
Using threads in Python
Python code runs inside the Python virtual machine, and its execution is serialized by the Global Interpreter Lock (GIL), so only one thread executes Python bytecode at a time. For everyday use, though, we can simply reach for the standard modules. Here we will look at the thread and threading modules. The thread module is generally not recommended: when the main thread exits, all other threads are killed whether or not they have finished, and without any cleanup. The threading module, by contrast, makes sure that all important child threads have exited before the process ends.
Thread support is enabled in Python by default. You can check in interactive mode: if importing the thread module raises no error, it is available.
>>> import thread
>>>
If you get an import error, you will need to recompile the Python interpreter with thread support enabled.
First, let's look at an example that does not use threads:
It uses the sleep() function from the time module, which takes a floating-point argument giving a number of seconds; the program pauses for that long.
from time import sleep, ctime

def loop0():
    print 'start loop 0 at: %s' % ctime()
    sleep(4)
    print 'loop 0 done at: %s' % ctime()

def loop1():
    print 'start loop 1 at: %s' % ctime()
    sleep(2)
    print 'loop 1 done at: %s' % ctime()

def main():
    print 'starting at: %s' % ctime()
    loop0()
    loop1()
    print 'all Done at: %s' % ctime()

if __name__ == '__main__':
    main()
>>> starting at: Sat Jan 02 21:17:48 2016
start loop 0 at: Sat Jan 02 21:17:48 2016
loop 0 done at: Sat Jan 02 21:17:52 2016
start loop 1 at: Sat Jan 02 21:17:52 2016
loop 1 done at: Sat Jan 02 21:17:54 2016
all Done at: Sat Jan 02 21:17:54 2016
We can see that the program runs strictly in sequence: loop0 finishes its 4-second sleep before loop1 even starts, so the whole run takes 4 + 2 = 6 seconds. The time spent in sleep() is simply wasted; nothing else gets done while one loop is paused.
Now let's see how it looks once we use the thread module:
import thread
from time import sleep, ctime

def loop0():
    print 'start loop 0 at: %s' % ctime()
    sleep(4)
    print 'loop 0 done at: %s' % ctime()

def loop1():
    print 'start loop 1 at: %s' % ctime()
    sleep(2)
    print 'loop 1 done at: %s' % ctime()

def main():
    print 'starting at: %s' % ctime()
    thread.start_new_thread(loop0, ())
    thread.start_new_thread(loop1, ())
    sleep(6)
    print 'all Done at: %s' % ctime()

if __name__ == '__main__':
    main()
The result is as follows:
>>> starting at: Sat Jan 02 21:23:58 2016
start loop 0 at: Sat Jan 02 21:23:58 2016
start loop 1 at: Sat Jan 02 21:23:58 2016
loop 1 done at: Sat Jan 02 21:24:00 2016
loop 0 done at: Sat Jan 02 21:24:02 2016
all Done at: Sat Jan 02 21:24:04 2016
We can see that this time loop0 and loop1 both finished within 4 seconds of the program starting, because they ran concurrently. But we also added an extra sleep(6), so the whole program still took 6 seconds; the sleep keeps the main thread alive, because with the thread module the child threads are killed as soon as the main thread exits, and they might never get to run to completion. This approach is clearly clumsy: we always wait a fixed 6 seconds, no matter how quickly the work finishes. Worse, if we cannot know in advance when a child thread will finish, say one that waits for a command typed at the keyboard, how could we write such a sleep at all? I will introduce the answer next. First, let's look at what the thread module is doing here.
The method we call is thread.start_new_thread(function, args[, kwargs]). It spawns a new thread, which calls function with the given args tuple and the optional kwargs dictionary. That is the entire thread-creation mechanism.
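As a quick illustration of that signature, here is a minimal sketch; the greet function and its parameters are invented for the example, and the trailing sleep() is the same crude wait used above.

import thread
from time import sleep, ctime

def greet(name, punct='!'):
    print 'hello %s%s at %s\n' % (name, punct, ctime()),

# positional arguments are passed as a tuple
thread.start_new_thread(greet, ('world',))
# keyword arguments go in an optional dictionary as the third argument
thread.start_new_thread(greet, ('thread',), {'punct': '?'})

sleep(1)   # crude wait so the main thread does not exit before the greetings print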
Next, let's use locks instead, so that the main thread no longer needs a hard-coded sleep():
import thread
from time import ctime, sleep

loops = [4, 2]

def loop(nloop, nsec, lock):
    print 'start loop%s at: %s\n' % (nloop, ctime()),
    sleep(nsec)
    print 'loop%s done at: %s\n' % (nloop, ctime()),
    lock.release()

def main():
    print 'starting at: %s\n' % ctime(),
    locks = []
    nloops = range(len(loops))
    for i in nloops:
        lock = thread.allocate_lock()
        lock.acquire()
        locks.append(lock)
    for i in nloops:
        thread.start_new_thread(loop, (i, loops[i], locks[i]))
    for i in nloops:
        while locks[i].locked():
            pass
    print 'all DONE at: %s\n' % ctime(),

if __name__ == '__main__':
    main()
The result is as follows:
>>> starting at: Sat Jan 02 21:55:30 2016
start loop0 at: Sat Jan 02 21:55:30 2016
start loop1 at: Sat Jan 02 21:55:30 2016
loop1 done at: Sat Jan 02 21:55:32 2016
loop0 done at: Sat Jan 02 21:55:34 2016
all DONE at: Sat Jan 02 21:55:34 2016
You may have noticed the unusual way the print statements are written: each message carries an explicit '\n' inside the string and ends with a trailing comma to suppress print's own newline, so that each line goes out as a single write. If you wonder why the plain form is not used, try it and see how the output from the two threads gets interleaved.
Here we use thread.allocate_lock() to create lock objects and store them in a list. Each lock is immediately acquired with acquire(), that is, locked. Then each thread is started and handed its own lock from the list, and they all run together; when a thread finishes its work, it releases its lock. The main thread simply waits until every lock in the list has been released.
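The lock object itself is tiny. A minimal sketch of its three operations, separate from the example above:

import thread

lock = thread.allocate_lock()   # a freshly allocated lock starts out unlocked
print lock.locked()             # False: nobody holds it yet
lock.acquire()                  # take the lock; this would block if another thread held it
print lock.locked()             # True: the lock is now held
lock.release()                  # give it back so it can be acquired again
print lock.locked()             # False again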
Why not start each thread in the same loop where its lock is created? There are two reasons. First, we want the threads to start as close to simultaneously as possible, so all the locks are prepared beforehand. Second, acquiring a lock takes a little time; if a thread ran and finished too quickly, it might try to release a lock that had not actually been acquired yet.
So: we allocate the locks, acquire them, and have each thread release its own lock when it is done. That is how we synchronize the threads.
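Locks are also the tool for avoiding the race condition sketched at the beginning of the article. The following snippet is illustrative rather than part of the original example: two threads bump a shared counter, a second lock protects the counter itself, and the per-thread "done" locks play the same waiting role as in the loop example above.

import thread

count = 0
count_lock = thread.allocate_lock()                # protects the shared counter
done = [thread.allocate_lock() for _ in range(2)]  # one "finished" flag per thread
for d in done:
    d.acquire()

def bump(idx):
    global count
    for _ in xrange(100000):
        count_lock.acquire()
        count += 1            # the read-modify-write now happens while holding the lock
        count_lock.release()
    done[idx].release()       # signal the main thread, just like loop() does above

for i in range(2):
    thread.start_new_thread(bump, (i,))
while any(d.locked() for d in done):               # busy-wait until both threads are done
    pass
print 'count =', count        # always 200000; without count_lock, updates could be lost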
That is all for today. Next time I will explain how to use the threading module, which will save us from having to manage these locks ourselves.