Examples of concurrent programming in Python
I. Introduction
We call a running program a process. Each process has its own system state, including its memory, its list of open files, a program counter that tracks instruction execution, and a call stack that holds local variables. A process normally executes in a single sequence of control flow, which is called the main thread of the process. At any given moment, the program is doing only one thing.
A program can create a new process through library functions in the os or subprocess modules (such as os.fork() or subprocess.Popen()). These new processes, called child processes, run independently: they have their own system state and their own main thread. Because processes are independent of each other, they execute concurrently with the original process, meaning the parent can carry on with other work after creating the child.
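As a minimal sketch of the idea above, the subprocess module can launch an independent child process while the parent continues its own work (the child command here is an arbitrary illustration):

```python
import subprocess
import sys

# Start a child process that runs independently of the parent.
# sys.executable is the current Python interpreter.
child = subprocess.Popen(
    [sys.executable, "-c", "print('hello from the child')"],
    stdout=subprocess.PIPE,
)

# ... the parent process could do other work here ...

# Wait for the child to finish and collect its output.
output, _ = child.communicate()
print(output.decode().strip())  # hello from the child
```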
Although processes are independent of each other, they can communicate through a mechanism called inter-process communication (IPC). A typical model is based on message passing, where a message can be understood simply as a buffer of raw bytes: send() and recv() primitives transmit or receive messages over I/O channels such as pipes or network sockets. Some IPC models instead rely on memory mapping (for example, the mmap module). With memory mapping, processes can create shared regions of memory, and modifications to those regions are visible to all participating processes.
Multiple processes can be used when several tasks need to run simultaneously, with different processes responsible for different parts of the job. Another way to divide a job into tasks, however, is to use threads. Like a process, a thread has its own control flow and execution stack, but a thread runs inside the process that created it and shares all of that process's data and system resources. Threads are useful when an application needs to perform concurrent tasks, with the caveat that those tasks must share a large amount of system state.
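A small sketch of the point above, that threads share their parent process's data: two threads append into the same list, which lives once in the process (names and values are illustrative):

```python
import threading

shared_results = []  # lives in the process; visible to every thread

def append_squares(values):
    for v in values:
        shared_results.append(v * v)

t1 = threading.Thread(target=append_squares, args=([1, 2],))
t2 = threading.Thread(target=append_squares, args=([3, 4],))
t1.start()
t2.start()
t1.join()
t2.join()
print(sorted(shared_results))  # [1, 4, 9, 16]
```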
When multiple processes or threads are in use, the operating system is responsible for scheduling. It does this by giving each process (or thread) a small time slice and switching rapidly among all active tasks, dividing CPU time into small fragments distributed across the tasks. For example, if 10 active processes are running on your system, the operating system allocates a slice of CPU time to each one and cycles among all 10. When the system has more than one CPU core, the operating system can schedule processes onto different cores, keeping the system load balanced and achieving parallel execution.
Programs written with concurrent execution mechanisms must deal with some complex issues, the main source of which is synchronizing and sharing data. In general, multiple tasks attempting to update the same data structure at the same time can leave it in a dirty, inconsistent state (formally, this is a race condition). To solve this, you need mutexes or similar synchronization primitives to identify and protect the critical sections of the program. For example, if several threads try to write to the same file at the same time, you need a mutex lock to serialize those writes: while one thread is writing, the others must wait until it releases the resource.
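The serialization scenario described above can be sketched with threading.Lock, here protecting a shared counter rather than a file for brevity (without the lock, the unsynchronized read-modify-write could lose updates):

```python
import threading

counter = 0
counter_lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        with counter_lock:  # only one thread may update at a time
            counter += 1

threads = [threading.Thread(target=increment, args=(100000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 400000
```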
Concurrent Programming in Python
Python has long supported different styles of concurrent programming, including threads, sub-processes, and other concurrent implementations built on generator functions.
On most systems, Python supports both message passing and thread-based concurrent programming. Although most programmers are more familiar with the thread interface, Python's threading mechanism has significant restrictions. Python uses an internal global interpreter lock (GIL) to ensure thread safety; the GIL allows only one thread to execute at a time, so a Python program effectively runs on a single processor even on a multi-core system. Despite much debate in the Python community about the GIL, there is no prospect of it being removed in the foreseeable future.
Python provides several sophisticated tools for managing concurrency based on both threads and processes. Even simple programs can use these tools to run tasks concurrently and speed up execution.

* The subprocess module provides an API for creating child processes and communicating with them. It is especially suited to running text-oriented programs, because the API supports passing data through the standard input/output channels of the new process.
* The signal module exposes the Unix signal mechanism, which can be used to send event notifications between processes. Signals are handled asynchronously: when a signal arrives, the program's current work is interrupted. The signal mechanism can implement a coarse-grained message-passing system, but more reliable inter-process communication techniques are needed to convey complex messages.
* The threading module provides a series of high-level, object-oriented APIs for thread-based concurrency. Thread objects run concurrently within a single process and share memory. Threads scale best for I/O-intensive tasks.
* The multiprocessing module is similar to the threading module but provides operations on processes. Each Process object is a real operating-system process with no shared memory, but the module supplies mechanisms for sharing data and passing messages between processes. Converting a thread-based program to a process-based one is often as simple as changing a few import declarations.
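To make the signal mechanism above concrete, here is a minimal sketch that installs a handler and delivers a signal to the current process (SIGUSR1 is Unix-only; the handler and list name are illustrative):

```python
import os
import signal

received = []

def handler(signum, frame):
    # Called asynchronously when the signal arrives, interrupting
    # whatever the main thread was doing at that moment.
    received.append(signum)

# Install a handler for SIGUSR1, then send that signal to this
# very process to demonstrate delivery.
signal.signal(signal.SIGUSR1, handler)
os.kill(os.getpid(), signal.SIGUSR1)

print(received)
```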
Threading module example
Taking the threading module as an example, consider a simple problem: how to sum a large range of integers in parallel by splitting the range into parts.
```python
import threading

class SummingThread(threading.Thread):
    def __init__(self, low, high):
        super(SummingThread, self).__init__()
        self.low = low
        self.high = high
        self.total = 0

    def run(self):
        for i in range(self.low, self.high):
            self.total += i

thread1 = SummingThread(0, 500000)
thread2 = SummingThread(500000, 1000000)
thread1.start()  # This actually causes the thread to run
thread2.start()
thread1.join()   # This waits until the thread has completed
thread2.join()
# At this point, both threads have completed
result = thread1.total + thread2.total
print(result)
```
Custom threading class library
I wrote a small Python class library that makes working with threads easier; it includes some useful classes and functions.
Key components:
* do_threaded_work: distributes a series of given work items to the processing function (the allocation order is not deterministic).
* ThreadedWorker: this class creates a thread that pulls work items from a synchronized work queue and writes each result to a synchronized result queue.
* start_logging_with_thread_info: writes the thread id into all log messages (depends on the logging environment).
* stop_logging_with_thread_info: removes the thread id from all log messages (depends on the logging environment).
```python
import threading
import logging
import queue


def do_threaded_work(work_items, work_func, num_threads=None, per_sync_timeout=1,
                     preserve_result_ordering=True):
    """ Executes work_func on each work_item.

    Note: Execution order is not preserved, but output ordering is (optionally).

    Parameters:
    - num_threads               Default: len(work_items)  --- Number of threads used to process work_items.
    - per_sync_timeout          Default: 1                --- Each synchronized operation can optionally time out.
    - preserve_result_ordering  Default: True             --- Reorders result_items to match the original work_items ordering.

    Return:
    --- list of results from applying work_func to each work_item. Order is optionally preserved.

    Example:

    def process_url(url):
        # TODO: Do some work with the url
        return url

    urls_to_process = ["http://url1.com", "http://url2.com", "http://site1.com", "http://site2.com"]

    # process urls in parallel
    result_items = do_threaded_work(urls_to_process, process_url)

    print(repr(result_items))
    """
    global wrapped_work_func
    if not num_threads:
        num_threads = len(work_items)

    work_queue = queue.Queue()
    result_queue = queue.Queue()

    index = 0
    for work_item in work_items:
        if preserve_result_ordering:
            work_queue.put((index, work_item))
        else:
            work_queue.put(work_item)
        index += 1

    if preserve_result_ordering:
        wrapped_work_func = lambda work_item: (work_item[0], work_func(work_item[1]))

    start_logging_with_thread_info()

    # spawn a pool of threads, and pass them the queue instances
    for _ in range(num_threads):
        if preserve_result_ordering:
            t = ThreadedWorker(work_queue, result_queue, work_func=wrapped_work_func,
                               queue_timeout=per_sync_timeout)
        else:
            t = ThreadedWorker(work_queue, result_queue, work_func=work_func,
                               queue_timeout=per_sync_timeout)
        t.daemon = True
        t.start()

    work_queue.join()
    stop_logging_with_thread_info()
    logging.info('work_queue joined')

    result_items = []
    while not result_queue.empty():
        result = result_queue.get(timeout=per_sync_timeout)
        logging.info('found result[:500]: ' + repr(result)[:500])
        if result:
            result_items.append(result)

    if preserve_result_ordering:
        # sort by the index attached above so results match the input ordering
        result_items.sort(key=lambda pair: pair[0])
        result_items = [work_item for index, work_item in result_items]

    return result_items


class ThreadedWorker(threading.Thread):
    """ Generic Threaded Worker
    Input to work_func: item from work_queue

    Example usage:

    import queue

    urls_to_process = ["http://url1.com", "http://url2.com", "http://site1.com", "http://site2.com"]

    work_queue = queue.Queue()
    result_queue = queue.Queue()

    def process_url(url):
        # TODO: Do some work with the url
        return url

    def main():
        # populate queue with data
        for url in urls_to_process:
            work_queue.put(url)

        # spawn a pool of threads, and pass them the queue instances
        for i in range(3):
            t = ThreadedWorker(work_queue, result_queue, work_func=process_url)
            t.daemon = True
            t.start()

        # wait on the queue until everything has been processed
        work_queue.join()

        # print results
        print(repr(result_queue))

    main()
    """

    def __init__(self, work_queue, result_queue, work_func,
                 stop_when_work_queue_empty=True, queue_timeout=1):
        threading.Thread.__init__(self)
        self.work_queue = work_queue
        self.result_queue = result_queue
        self.work_func = work_func
        self.stop_when_work_queue_empty = stop_when_work_queue_empty
        self.queue_timeout = queue_timeout

    def should_continue_running(self):
        if self.stop_when_work_queue_empty:
            return not self.work_queue.empty()
        else:
            return True

    def run(self):
        while self.should_continue_running():
            try:
                # grabs item from work_queue
                work_item = self.work_queue.get(timeout=self.queue_timeout)
            except queue.Empty:
                logging.warning('ThreadedWorker Queue was empty or Queue.get() timed out')
                continue
            try:
                # works on item
                work_result = self.work_func(work_item)

                # place work_result into result_queue
                self.result_queue.put(work_result, timeout=self.queue_timeout)
            except queue.Full:
                logging.warning('ThreadedWorker Queue was full or Queue.put() timed out')
            except Exception:
                logging.exception('Error in ThreadedWorker')
            finally:
                # signals to work_queue that the fetched item is done
                self.work_queue.task_done()


def start_logging_with_thread_info():
    try:
        formatter = logging.Formatter('[thread %(thread)-3s] %(message)s')
        logging.getLogger().handlers[0].setFormatter(formatter)
    except Exception:
        logging.exception('Failed to start logging with thread info')


def stop_logging_with_thread_info():
    try:
        formatter = logging.Formatter('%(message)s')
        logging.getLogger().handlers[0].setFormatter(formatter)
    except Exception:
        logging.exception('Failed to stop logging with thread info')
```
Example
```python
from test import ThreadedWorker  # the module containing the ThreadedWorker class above
from queue import Queue

urls_to_process = ["http://facebook.com", "http://pypix.com"]

work_queue = Queue()
result_queue = Queue()

def process_url(url):
    # TODO: Do some work with the url
    return url

def main():
    # populate queue with data first, so the workers find items waiting
    for url in urls_to_process:
        work_queue.put(url)

    # spawn a pool of threads, and pass them the queue instances
    for i in range(5):
        t = ThreadedWorker(work_queue, result_queue, work_func=process_url)
        t.daemon = True
        t.start()

    # wait on the queue until everything has been processed
    work_queue.join()

    # print results
    while not result_queue.empty():
        print(result_queue.get())

main()
```
How to understand classes, objects, and instances in Python programming?
A class is a generalization of a category of things, such as "person".
A data type can be a built-in type such as a string, a number, or a complex number, or a user-defined class.
Objects and instances are concrete things within a class: for example, men and women are instances of "person". Men can in turn form a class of their own, with instances such as older men and young men.
Remember: a class is a collective term for a kind of thing, while an instance (or object) is one specific thing of that kind.
For reference only.
Example:
```python
class Person:
    '''Basic attributes of a person: name, age, and sex'''
    def __init__(self, name, age, sex):
        self.name = name
        self.age = age
        self.sex = sex

class Man(Person):
    def __init__(self, name, age):
        super(Man, self).__init__(name, age, 'male')

class Woman(Person):
    def __init__(self, name, age):
        super(Woman, self).__init__(name, age, 'female')
```
A permutation and combination problem in Python programming
In a group of 100 people, each person can be paired with any of the other 99 people.
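Assuming the fragment above describes a group of 100 people where each person can be paired with the other 99, the number of distinct unordered pairs is 100 × 99 / 2 = 4950 (each pairing is counted twice when multiplying, hence the division). A sketch with itertools confirms the count:

```python
from itertools import combinations

people = range(100)
pairs = list(combinations(people, 2))  # all unordered pairs of distinct people

print(len(pairs))      # 4950
print(100 * 99 // 2)   # 4950, the closed-form count
```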