Iron Music Python_day41_ Thread 01

Source: Internet
Author: User
Tags: posix, processing, text, thread, class, advantage

Introducing threads, starting from the concept of a process

Earlier we learned the concept of a process in the operating system: a program cannot run by itself; only when it is loaded into memory and allocated resources can it run, and this kind of execution is called a process.

The difference between a program and a process is:

A program is a set of instructions, a static description, the text from which a process runs;
A process is one execution activity of a program, a dynamic concept.
In multiprogramming, we allow multiple programs to be loaded into memory at the same time and, under the scheduling of the operating system, to execute concurrently.
It is this design that greatly improves CPU utilization.
The process makes it possible for each user to feel as if the CPU is theirs alone; the process was proposed precisely to support multiprogramming on the CPU.

Since we already have processes, why threads?

The process has many advantages: it gives us multiprogramming, letting each of us feel as if we have our own CPU and other resources, and it improves the utilization of the computer. Many people then wonder: if the process is so good, why do we need threads? In fact, careful observation reveals that the process still has defects, mainly in two respects:

1. A process can only do one thing at a time; if you want it to do two or more things at once, the process is powerless.

2. If a process blocks during execution, while waiting for input for example, the entire process hangs, and even work inside the process that does not depend on that input cannot proceed.

If these two shortcomings are hard to grasp, a real-life example may make them clear. Treat attending a class as a process: to do it well we must listen with our ears, take notes with our hands, and think with our brains, all at the same time. If the operating system only provided the process mechanism, these three things could not happen simultaneously; we could do only one of them, and while listening we could neither take notes nor think. That is the first defect. Now suppose the teacher writes a derivation on the blackboard, we start copying it down, and the teacher suddenly gets stuck on a step and stands there thinking. We are blocked too and can do nothing else, not even think over a point we just failed to understand. That is the second defect.

Now the defects of the process should be clear, and the remedy is simple: let listening, writing and thinking run as three independent activities in parallel, which obviously makes the class more efficient. Real operating systems introduce exactly this mechanism: the thread.

The emergence of threads

In the 1960s, the basic unit in the OS that could own resources and run independently was the process. As computer technology developed, however, the process revealed many drawbacks. First, since the process is the owner of resources, creating, destroying and switching processes carries a large overhead in time and space, so a lighter unit was needed. Second, with the appearance of symmetric multiprocessors (SMP), multiple execution units could be supported, but parallelism built from whole processes was too expensive.
So in the 1980s the thread appeared: a basic unit that can run independently.
Note: the process is the smallest unit of resource allocation, and the thread is the smallest unit of CPU scheduling.
Every process contains at least one thread.

Relationship of processes and threads

The differences between threads and processes can be summed up in the following 4 points:
1) Address space and other resources (such as open files): processes are independent of one another, while these are shared among the threads of one process; the threads inside a process are not visible to other processes.
2) Communication: processes communicate through inter-process communication (IPC); threads can communicate by directly reading and writing the process's data segment (for example, global variables), but this requires synchronization and mutual-exclusion mechanisms to guarantee data consistency.
3) Scheduling and switching: thread context switches are much faster than process context switches.
4) In a multithreaded operating system, the process is not an executable entity; execution is carried by threads.

Features of the thread

In a multithreaded operating system, a process usually contains several threads, each serving as the basic unit of CPU utilization and as the entity with the smallest overhead. A thread has the following properties.
1) Lightweight entity.
A thread owns almost no system resources of its own, only the few resources essential for running independently.
A thread's entity includes its program, its data, and its TCB. The thread is a dynamic concept, and its dynamic aspects are described by the thread control block (TCB).
2) The basic unit of independent scheduling and dispatching.
In a multithreaded OS, the thread is the basic unit that can run independently, and therefore the basic unit of independent scheduling and dispatching. Because threads are "light", switching between threads of the same process is very fast and cheap.
3) Shares the resources of its process.
Threads in the same process can share the resources owned by the process. First of all, all of them have the same process ID, which means a thread can access every memory resource of the process; it can also access the open files, timers, semaphores and so on owned by the process. Because threads in the same process share memory and files, they do not need to call into the kernel to communicate with one another.
4) Can execute concurrently.
Multiple threads within one process can execute concurrently; the OS may even allow all threads of a process to run concurrently. Threads in different processes can also execute concurrently, making full use of the ability of processors and peripheral devices to work in parallel.

The TCB includes the following information:
(1) Thread state.
(2) The saved processor context for when the thread is not running.
(3) A set of execution stacks.
(4) A memory area storing each thread's local variables.
(5) Access to the main memory and other resources of its process.
The saved context is a set of registers and stacks recording the program counter, preserved local variables, a few state parameters, and the return addresses of the instruction sequence being executed.

Practical scenarios for threads

Open a word-processing program: that process must do several things at once, such as listening for keyboard input, processing the text, and automatically saving the text to disk. All three tasks operate on the same piece of data, so multiple processes cannot be used.
Only by opening three threads inside one process can the three run concurrently. With a single thread, only one task can run at a time: while accepting keyboard input it can neither process text nor auto-save, and while auto-saving it can neither accept input nor process text.
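The scenario above can be sketched with two cooperating threads that share one document buffer; the queue name and the sentinel value below are illustrative choices, not part of the original example:

```python
import queue
import threading

doc = []                       # the shared document text
keystrokes = queue.Queue()     # input waiting to be processed

def listen():                  # simulated keyboard input
    for ch in "hi":
        keystrokes.put(ch)
    keystrokes.put(None)       # sentinel: no more input

def process():                 # move input into the shared document
    while True:
        ch = keystrokes.get()
        if ch is None:
            break
        doc.append(ch)

t1 = threading.Thread(target=listen)
t2 = threading.Thread(target=process)
t1.start(); t2.start()
t1.join(); t2.join()
print("".join(doc))            # both threads worked on the same data: hi
```

With processes instead of threads, `doc` and `keystrokes` would be copied, not shared, and the result would never reach the parent.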

In-Memory threads


Multiple threads share the resources in the address space of one process; this mirrors, inside a single process, the way multiple processes share one computer, which is why a thread is sometimes called a lightweight process.
Multiple processes on one computer share physical memory, disks, printers and other physical resources. Multithreaded operation works like multiprocess operation: the CPU switches rapidly among the threads.
Different processes are openly hostile to one another; they preempt and compete for the CPU, like Xunlei (Thunder) and QQ grabbing resources from each other. The threads of one process, by contrast, are created by the same programmer's program, so they cooperate: one thread can access another thread's memory addresses, everything is shared, and if one thread tramples another thread's memory, that is purely the programmer's problem.
Like a process, each thread has its own stack. Unlike the kernel with processes, the thread library cannot use a clock interrupt to force a thread off the CPU; instead, a running thread can call thread_yield to voluntarily give up the CPU and let another thread run.
Threads are often useful, but they make programs harder to design:
1. If the parent process has multiple threads, should a forked child process have the same number of threads?
2. Within one process, what happens if one thread closes a file while another thread is about to write to it?
Therefore multithreaded code demands extra care in designing the program's logic and protecting its data.

User-level threads and kernel-level threads

Thread implementations fall into two categories: user-level threads and kernel-level threads, the latter also called kernel-supported threads or lightweight processes. Different multithreaded operating systems implement them differently: some systems implement user-level threads, others kernel-level threads.

User-level threads

Switching is controlled by the user-space program itself, without kernel intervention, so less time is spent entering and leaving kernel mode; the drawback is that multi-core CPUs cannot be exploited well.
User space simulates the operating system's scheduling to pick a thread within the process: each process has its own runtime system that dispatches its threads. When the process is given the CPU, it then dispatches one of its threads to execute, and only one thread executes at any moment.
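Such a user-space "runtime system" can be imitated in pure Python with generators: each "thread" runs until it voluntarily yields, and a scheduler in user space round-robins between them with no kernel involvement. A toy sketch (not how any real thread library is implemented, just the scheduling idea):

```python
from collections import deque

order = []  # records the interleaving the scheduler produces

def task(name, steps):
    for i in range(steps):
        order.append(f"{name}{i}")
        yield            # voluntarily hand control back to the scheduler

# the "runtime system": round-robin over the ready user-level threads
ready = deque([task("A", 2), task("B", 2)])
while ready:
    t = ready.popleft()
    try:
        next(t)          # run the thread until its next yield
        ready.append(t)  # still alive: back of the ready queue
    except StopIteration:
        pass             # thread finished, drop it

print(order)  # ['A0', 'B0', 'A1', 'B1']
```

Note the limitation described above: if one of these "threads" made a blocking system call, the whole scheduler, and thus every thread, would stall.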

Kernel-level threading

Kernel-level threads: switching is controlled by the kernel. When a thread switch occurs, execution moves from user mode to kernel mode, and returns from kernel mode to user mode once the switch completes. With SMP, this can exploit multi-core CPUs. Windows threads work this way.

User-level vs. kernel-level threading

1. Kernel-supported threads are known to the OS kernel, while user-level threads are invisible to it.
2. Creating, destroying and dispatching user-level threads needs no support from the OS kernel; it is handled at the language level (in Java, for example). Creating, destroying and dispatching kernel-supported threads requires kernel support, in largely the same way as for processes.
3. When a user-level thread executes a system-call instruction, its whole owning process is suspended; when a kernel-supported thread does so, only that thread is suspended.
4. In a system with only user-level threads, the CPU is scheduled per process: the threads inside a running process take turns under the control of the user program. In a system with kernel-supported threads, the CPU is scheduled per thread, by the OS's thread scheduler.
5. The program entity of a user-level thread is a program running in user mode, while that of a kernel-supported thread is one that can run in either mode.

Kernel-level threads:
Advantage: when there are multiple processors, multiple threads of one process can execute in parallel.
Disadvantage: every switch is dispatched by the kernel.

User-level threads:
Advantages:
Thread scheduling does not involve the kernel, so the control logic is simple.
They can be implemented on an operating system that does not support threads.
Thread management (creating, destroying and switching threads) costs far less than with kernel threads.
Each process can use its own scheduling algorithm, making thread management more flexible.
Threads can use more table space and stack space than kernel-level threads.
Disadvantages:
Only one thread of a process can run at a time; if one thread blocks in a system call, the whole process is suspended. A page fault causes the same problem.
Resources are scheduled per process, so even on a multiprocessor the threads of one process can only take turns on a single processor.

Hybrid implementations
User-level threads are multiplexed onto kernel-level threads: the kernel still schedules kernel threads as before, and each kernel thread corresponds to n user threads.

NPTL History of the Linux operating system

Before kernel 2.6, the scheduling entity was the process, and the kernel had no true support for threads.
Threads could be implemented through a system call named clone(), which creates a copy of the calling process;
unlike fork(), this copy completely shares the address space of the calling process.
LinuxThreads used this system call to provide kernel-level thread support
(many earlier thread implementations lived entirely in user space, with the kernel unaware that threads existed at all).
Unfortunately, this approach departed from the POSIX standard in quite a few places, especially in signal handling, scheduling, and inter-process communication primitives.

Clearly, improving on LinuxThreads required both kernel support and a rewritten thread library.
Two competing projects arose to meet this need: NGPT (Next Generation POSIX Threads), initiated by IBM,
and NPTL from Red Hat. In 2003 IBM abandoned NGPT; at roughly the same time, Red Hat released the first NPTL.

NPTL first shipped in Red Hat Linux 9; since RHEL3 and kernel 2.6 it has been supported throughout, and it is now fully part of the GNU C library.

Design

NPTL takes the same approach as LinuxThreads: a thread is still treated as a process, and the clone() system call is still used (called from within the NPTL library). However, NPTL requires special kernel-level support, such as futex, the thread-synchronization primitive used to put a thread to sleep and later wake it.

NPTL is also a 1:1 thread library: when you create a thread with pthread_create(), a corresponding scheduling entity is created in the kernel (in Linux, a new process), which simplifies the thread implementation as much as possible. Besides NPTL's 1:1 model there is also an m:n model, in which there are usually more user threads than kernel scheduling entities. In such an implementation, the thread library itself must handle whatever scheduling is needed, so context switches inside the thread library are usually very fast, since they avoid a system call into kernel mode. However, this model increases the complexity of the thread implementation and can suffer from problems such as priority inversion; moreover, it is hard to coordinate user-mode scheduling with kernel-mode scheduling satisfactorily.
Threads and Python's global interpreter lock (GIL)

Python code execution is controlled by the Python virtual machine (also called the interpreter main loop). Python was designed around this main loop with only one thread executing at a time: although the interpreter can "run" multiple threads, only one of them runs in the interpreter at any moment.
Access to the Python virtual machine is controlled by the global interpreter lock (GIL), which guarantees that only one thread runs at a time.
In a multithreaded environment, the Python virtual machine executes as follows:
a. acquire the GIL;
b. switch to a thread and run it;
c. run a specified number of bytecode instructions, or let the thread voluntarily give up control (for example by calling time.sleep(0));
d. put the thread back into the sleeping state;
e. release the GIL;
f. repeat all of the steps above.
When calling external code (such as a C/C++ extension function), the GIL stays locked until that function returns (since no Python bytecode is running, no thread switch happens); programmers writing extensions can release the GIL explicitly.
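Step c, a thread voluntarily giving up control with time.sleep(0), can be observed directly: two threads take turns because only one holds the GIL at a time. A small sketch (the thread names and repetition counts are arbitrary):

```python
import threading
import time

results = []

def worker(name, reps):
    for _ in range(reps):
        results.append(name)
        time.sleep(0)   # yield: release the GIL so another thread may run

t1 = threading.Thread(target=worker, args=("a", 3))
t2 = threading.Thread(target=worker, args=("b", 3))
t1.start(); t2.start()
t1.join(); t2.join()

# the exact interleaving varies run to run, but all six appends happen
print(sorted(results))  # ['a', 'a', 'a', 'b', 'b', 'b']
```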

Selection of the Python threading module

Python provides several modules for multithreaded programming, including thread, threading and Queue. The thread and threading modules let the programmer create and manage threads: thread provides basic thread and lock support, while threading provides higher-level, more fully featured thread management. The Queue module lets the user create a queue data structure that can be shared among multiple threads.
Avoid the thread module. First, the higher-level threading module is more capable, its thread support is better, and mixing in the thread module's attributes may conflict with threading. Second, the thread module has a very low-level set of synchronization primitives (actually only one, the lock), while threading has many. Third, with the thread module, when the main thread ends all other threads are forcibly killed, with no warning and no proper cleanup; the threading module at least ensures that important child threads finish before the process exits.
The thread module does not support daemon threads: when the main thread exits, all child threads are killed regardless of whether they are still working. The threading module does support daemon threads. A daemon thread is typically a server waiting for client requests; if there is no request, it just waits. Marking a thread as a daemon declares that the thread is not important, and the process need not wait for it before exiting.

Threading Module

The multiprocessing module deliberately imitates the interface of the threading module, so the two are very similar at the usage level; the shared parts are therefore not described again in detail.

Creating threads: the threading.Thread class

Example 1:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from threading import Thread
import time

def sayhi(name):
    time.sleep(2)
    print('%s say hello' % name)

if __name__ == '__main__':
    t = Thread(target=sayhi, args=('Iron Music',))
    t.start()
    print('main thread')

Example 2:

from threading import Thread
import time

class Sayhi(Thread):
    def __init__(self, name):
        super().__init__()
        self.name = name

    def run(self):
        time.sleep(2)
        print('%s say hello' % self.name)

if __name__ == '__main__':
    t = Sayhi('Iron Music')
    t.start()
    print('main thread')

Example 3: comparison of PIDs:

from threading import Thread
from multiprocessing import Process
import os

def work():
    print('hello', os.getpid())

if __name__ == '__main__':
    # part 1: threads opened under the main process share its PID
    t1 = Thread(target=work)
    t2 = Thread(target=work)
    t1.start()
    t2.start()
    print('main thread/main process pid', os.getpid())

    # part 2: child processes each get a different PID
    p1 = Process(target=work)
    p2 = Process(target=work)
    p1.start()
    p2.start()
    print('main thread/main process pid', os.getpid())

Example: a multithreaded socket chat server:

import socket
import threading

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(('127.0.0.1', 9527))
s.listen(5)

def action(conn):
    while True:
        data = conn.recv(1024)
        print(data.decode('utf-8'))
        conn.send(data.upper())

if __name__ == '__main__':
    while True:
        conn, addr = s.accept()
        p = threading.Thread(target=action, args=(conn,))
        p.start()

Client side:

import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('127.0.0.1', 9527))
while True:
    msg = input('>>: ').strip()
    if not msg:
        continue
    s.send(msg.encode('utf-8'))
    data = s.recv(1024)
    print(data.decode('utf-8'))

Attached: some short summaries:

GIL: the global interpreter lock
It locks threads: during computation, only one thread can use the CPU at a time.
The thread lock limits your use of the CPU, but it does not hurt the efficiency of network code or crawler code.
We can make up for the limitation by turning to multiple processes.

Efficiency: threads are fast, processes are slow.
Multiprocessing: starting and destroying processes requires slow operating-system switching.
Starting a child process carries a large overhead, and switching between processes costs the operating system a lot of time.
A thread is a lightweight process: the time needed to create and destroy one is very small.
Threads use the memory of their process directly; a thread cannot exist on its own, it depends on a process.

Data sharing:
Data is isolated between processes and shared between threads.
If multiple child tasks need to share a lot of data, you should not isolate that data in separate processes.

# A single process cannot achieve concurrency by itself; if you want to avoid data isolation while still getting concurrency, use multiple threads.
# Multiple threads in the same process share the process ID but have different thread IDs.
# Process: the smallest unit of resource allocation
# Thread: the smallest unit of CPU scheduling
#   A lightweight process: creation, destruction and switching cost less than for a process
#   No data isolation
#   Can run concurrently
#   Depends on a process
# Every process contains at least one thread
# The process manages resources; the thread executes the code
# if __name__ == '__main__': is required to start a process, but not to start a thread. (This requirement only appears on Windows.)

Other methods of the Thread class
Methods on a Thread instance:
isAlive(): returns whether the thread is alive.
getName(): returns the thread's name.
setName(): sets the thread's name.

Some functions provided by the threading module:
threading.current_thread(): returns the current Thread object.
threading.enumerate(): returns a list of the threads that are currently running.
"Running" means after the thread has started and before it terminates; threads not yet started and threads already finished are excluded.
threading.active_count(): returns the number of running threads, the same result as len(threading.enumerate()).

Example:

from threading import Thread
import threading

def work():
    import time
    time.sleep(3)
    print(threading.current_thread().getName())

if __name__ == '__main__':
    # start a thread under the main process
    t = Thread(target=work)
    t.start()

    print(threading.current_thread().getName())
    print(threading.current_thread())   # the main thread
    print(threading.enumerate())        # two threads running, including the main thread
    print(threading.active_count())
    print('main thread/main process')

'''
Output:
MainThread
<_MainThread(MainThread, started 1452)>
[<Thread(Thread-1, started 7404)>, <_MainThread(MainThread, started 1452)>]
2
main thread/main process
Thread-1
'''
Daemon Threads

Whether it is a process or a thread, the rule is:
the daemon X waits for the main X to finish running, and is then destroyed. To emphasize: finish running, not terminate.
1. For the main process, "finished running" means its code has finished executing.
2. For the main thread, "finished running" means all the non-daemon threads in its process have finished; only then is the main thread considered finished.

In detail:
1. The main process is finished once its code finishes (and the daemon process is reclaimed at that point), but the process itself then waits for its non-daemon child processes to finish so it can reclaim their resources (otherwise zombie processes would result) before it ends.
2. The main thread finishes only after the other non-daemon threads have finished (and the daemon threads are reclaimed at that point). Because the end of the main thread means the end of the process, and the process's resources are reclaimed as a whole, the process must ensure all non-daemon threads are done before it ends.

import time
from threading import Thread

def func1():            # will become a daemon thread
    while True:
        time.sleep(1)
        print('child thread')

def func2():            # an ordinary (non-daemon) thread
    time.sleep(5)
    print()

t = Thread(target=func1)
t2 = Thread(target=func2)
t.setDaemon(True)
t.start()
t2.start()
print('main thread')
# A daemon of the main process ends when the main process's code ends.
# A daemon thread of the main thread ends only after all non-daemon threads have finished executing.

Reference:
Http://www.cnblogs.com/Eva-J/articles/8306047.html

2018-5-15
End

