Examples of concurrent programming in Python

Source: Internet
Author: User


I. Introduction

We call a running program a process. Each process has its own system state, including its memory, a list of open files, a program counter that tracks instruction execution, and a call stack that holds local variables. Normally a process executes in a single sequence of control flow; this flow is called the process's main thread. At any given moment, the program is doing only one thing.

A program can create new processes through library functions such as those in the os or subprocess modules (for example, os.fork() or subprocess.Popen()). These new processes, called child processes, run independently: they have their own system state and their own main thread. Because the processes are independent of each other, they execute concurrently with the original process, which means the original process can go on to do other work after creating a child.
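As a minimal sketch of creating a child process with subprocess.Popen() (the child here just prints a line; sys.executable is used so the example runs with whatever Python interpreter launched it):

```python
import subprocess
import sys

# Launch a child process; the parent keeps running independently of it.
proc = subprocess.Popen(
    [sys.executable, "-c", "print('hello from the child process')"],
    stdout=subprocess.PIPE,
    text=True,
)

# The parent could do other work here, then collect the child's output.
out, _ = proc.communicate()
print(out.strip())  # -> hello from the child process
```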

Although processes are independent of one another, they can communicate through mechanisms collectively known as inter-process communication (IPC). A typical model is based on message passing, where a message can be understood simply as a buffer of raw bytes; primitives such as send() and recv() transmit and receive messages over an I/O channel such as a pipe or a network socket. Other IPC models rely on memory mapping (for example, via the mmap module): a process creates a shared region of memory, and modifications to that region are visible to every process that maps it.

Multiple processes can be used in scenarios where several tasks need to run simultaneously, with different processes responsible for different parts of the job. Another way to divide a job into tasks, however, is to use threads. Like a process, a thread has its own control flow and execution stack, but a thread runs inside the process that created it and shares all of that process's data and system resources. Threads are useful when an application needs to perform concurrent tasks, but the potential problem is that the tasks must share a large amount of system state.

When multiple processes or threads are used, the operating system is responsible for scheduling them. It does this by giving each process (or thread) a small time slice and rapidly cycling among all active tasks, dividing CPU time into small fragments distributed across the tasks. For example, if 10 active processes are running on your system, the operating system allocates a slice of CPU time to each process and cycles through all 10. When the system has more than one CPU core, the operating system can schedule processes onto different cores so that they execute in parallel, keeping the overall system load balanced.

Programs written with concurrency mechanisms must contend with additional complexity, whose main source is sharing and synchronizing data. When multiple tasks attempt to update the same data structure at the same time, dirty data and inconsistent program state can result (formally, this is a race condition). To solve this problem, mutexes or other similar synchronization primitives are used to identify and protect critical sections of the program. For example, if several threads are trying to write to the same file at once, a mutex lock is needed to serialize those writes: while one thread is writing, the other threads must wait until the current thread releases the lock.
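A minimal sketch of that situation with threading.Lock; a shared list stands in for the shared file, and the `with` statement acquires and releases the mutex around the critical section:

```python
import threading

write_lock = threading.Lock()
lines = []

def write_line(n):
    # Critical section: only one thread at a time may append.
    with write_lock:
        lines.append(f"line from thread {n}")

threads = [threading.Thread(target=write_line, args=(i,)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(lines))  # -> 5
```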

Concurrent Programming in Python

Python has long supported concurrent programming in several forms, including threads, subprocesses, and other concurrency constructs built on generator functions.

Python supports both message passing and thread-based concurrent programming on most systems. Although most programmers are more familiar with the thread interface, Python's threading mechanism has significant restrictions. Python uses an internal global interpreter lock (GIL) to ensure thread safety: the GIL allows only one thread to execute Python bytecode at a time, which means a Python program effectively runs on a single processor even on a multi-core system. Despite much debate in the Python community about the GIL, there is no prospect of it being removed in the foreseeable future.
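One practical consequence: threads cannot speed up CPU-bound Python code, but they still overlap I/O-bound work, where each thread spends most of its time blocked. A minimal sketch, with time.sleep() standing in for a blocking I/O call:

```python
import threading
import time

def blocking_io():
    # Stand-in for a blocking call (network request, disk read, ...),
    # during which the GIL is released.
    time.sleep(0.2)

start = time.time()
threads = [threading.Thread(target=blocking_io) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start

# The four 0.2 s waits overlap, so the total is close to 0.2 s, not 0.8 s.
print(f"elapsed: {elapsed:.2f} s")
```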

Python provides several refined tools for managing concurrency based on both threads and processes, and even simple programs can use them to speed up their tasks.

The subprocess module provides an API for creating child processes and communicating with them. It is especially well suited to running text-oriented programs, because its API supports passing data through the new process's standard input and output channels.

The signal module exposes the UNIX signal mechanism, which can be used to send event notifications between processes. Signals are handled asynchronously: when a signal arrives, the program's current work is interrupted. The signal mechanism can implement a coarse-grained message-passing system, but more complex messages are better carried by other, more reliable inter-process communication techniques.

The threading module provides a series of high-level, object-oriented APIs for concurrent operations. Thread objects run concurrently within a single process and share memory resources. I/O-intensive tasks tend to scale well with threads.

The multiprocessing module is similar to the threading module, but it operates on processes instead. Each Process object corresponds to a real operating-system process, and processes do not share memory; the multiprocessing module does, however, provide mechanisms for sharing data and passing messages between processes. Converting a thread-based program to a process-based one is often easy: typically only a few import declarations need to change.

Threading module example

Taking the threading module as an example, consider a simple problem: how to sum a large range of numbers by splitting the work into parts that run in parallel.

import threading

class SummingThread(threading.Thread):
    def __init__(self, low, high):
        super(SummingThread, self).__init__()
        self.low = low
        self.high = high
        self.total = 0

    def run(self):
        for i in range(self.low, self.high):
            self.total += i

thread1 = SummingThread(0, 500000)
thread2 = SummingThread(500000, 1000000)
thread1.start()  # This actually causes the thread to run
thread2.start()
thread1.join()   # This waits until the thread has completed
thread2.join()
# At this point, both threads have completed
result = thread1.total + thread2.total
print(result)

Custom Threading class library

I wrote a small Python class library that makes working with threads easier; it includes some useful classes and functions.

Key components:

* do_threaded_work - this function distributes a series of given work items to the processing function across multiple threads (the order of execution is not guaranteed).

* ThreadedWorker - this class creates a thread that pulls work items from a synchronized work queue and writes each result to a synchronized result queue.

* start_logging_with_thread_info - writes the thread id into all log messages (depends on the logging environment).

* stop_logging_with_thread_info - removes the thread id from all log messages (depends on the logging environment).

import threading
import logging
import queue

def do_threaded_work(work_items, work_func, num_threads=None, per_sync_timeout=1, preserve_result_ordering=True):
    """ Executes work_func on each work_item. Note: Execution order is not preserved, but output ordering is (optionally).

        Parameters:
        - num_threads               Default: len(work_items)  --- Number of threads used to process items in work_items.
        - per_sync_timeout          Default: 1                --- Each synchronized operation can optionally time out.
        - preserve_result_ordering  Default: True             --- Reorders result_items to match the original work_items ordering.

        Return:
        --- list of results from applying work_func to each work_item. Order is optionally preserved.

        Example:

        def process_url(url):
            # TODO: Do some work with the url
            return url

        urls_to_process = ["http://url1.com", "http://url2.com", "http://site1.com", "http://site2.com"]

        # process urls in parallel
        result_items = do_threaded_work(urls_to_process, process_url)

        # print results
        print(repr(result_items))
    """
    global wrapped_work_func
    if not num_threads:
        num_threads = len(work_items)

    work_queue = queue.Queue()
    result_queue = queue.Queue()

    index = 0
    for work_item in work_items:
        if preserve_result_ordering:
            work_queue.put((index, work_item))
        else:
            work_queue.put(work_item)
        index += 1

    if preserve_result_ordering:
        wrapped_work_func = lambda work_item: (work_item[0], work_func(work_item[1]))

    start_logging_with_thread_info()

    # spawn a pool of threads, and pass them the queue instances
    for _ in range(num_threads):
        if preserve_result_ordering:
            t = ThreadedWorker(work_queue, result_queue, work_func=wrapped_work_func, queue_timeout=per_sync_timeout)
        else:
            t = ThreadedWorker(work_queue, result_queue, work_func=work_func, queue_timeout=per_sync_timeout)
        t.daemon = True
        t.start()

    work_queue.join()
    stop_logging_with_thread_info()

    logging.info('work_queue joined')

    result_items = []
    while not result_queue.empty():
        result = result_queue.get(timeout=per_sync_timeout)
        logging.info('found result[:500]: ' + repr(result)[:500])
        if result:
            result_items.append(result)

    if preserve_result_ordering:
        # sort by the original index before stripping it off
        result_items = [work_item for index, work_item in sorted(result_items)]

    return result_items

class ThreadedWorker(threading.Thread):
    """ Generic Threaded Worker
        Input to work_func: item from work_queue

    Example usage:

    import queue

    urls_to_process = ["http://url1.com", "http://url2.com", "http://site1.com", "http://site2.com"]

    work_queue = queue.Queue()
    result_queue = queue.Queue()

    def process_url(url):
        # TODO: Do some work with the url
        return url

    def main():
        # spawn a pool of threads, and pass them the queue instances
        for i in range(3):
            t = ThreadedWorker(work_queue, result_queue, work_func=process_url)
            t.daemon = True
            t.start()

        # populate queue with data
        for url in urls_to_process:
            work_queue.put(url)

        # wait on the queue until everything has been processed
        work_queue.join()

        # print results
        print(repr(result_queue))

    main()
    """

    def __init__(self, work_queue, result_queue, work_func, stop_when_work_queue_empty=True, queue_timeout=1):
        threading.Thread.__init__(self)
        self.work_queue = work_queue
        self.result_queue = result_queue
        self.work_func = work_func
        self.stop_when_work_queue_empty = stop_when_work_queue_empty
        self.queue_timeout = queue_timeout

    def should_continue_running(self):
        if self.stop_when_work_queue_empty:
            return not self.work_queue.empty()
        else:
            return True

    def run(self):
        while self.should_continue_running():
            try:
                # grabs item from work_queue
                work_item = self.work_queue.get(timeout=self.queue_timeout)
            except queue.Empty:
                logging.warning('ThreadedWorker queue was empty or Queue.get() timed out')
                continue

            try:
                # works on item
                work_result = self.work_func(work_item)

                # place work_result into result_queue
                self.result_queue.put(work_result, timeout=self.queue_timeout)
            except queue.Full:
                logging.warning('ThreadedWorker queue was full or Queue.put() timed out')
            except Exception:
                logging.exception('Error in ThreadedWorker')
            finally:
                # signals to work_queue that the item is done; only called
                # after a successful get(), so the counts stay balanced
                self.work_queue.task_done()

def start_logging_with_thread_info():
    try:
        formatter = logging.Formatter('[thread %(thread)-3s] %(message)s')
        logging.getLogger().handlers[0].setFormatter(formatter)
    except Exception:
        logging.exception('Failed to start logging with thread info')

def stop_logging_with_thread_info():
    try:
        formatter = logging.Formatter('%(message)s')
        logging.getLogger().handlers[0].setFormatter(formatter)
    except Exception:
        logging.exception('Failed to stop logging with thread info')

Example

from test import ThreadedWorker
from queue import Queue

urls_to_process = ["http://facebook.com", "http://pypix.com"]

work_queue = Queue()
result_queue = Queue()

def process_url(url):
    # TODO: Do some work with the url
    return url

def main():
    # spawn a pool of threads, and pass them the queue instances
    for i in range(5):
        t = ThreadedWorker(work_queue, result_queue, work_func=process_url)
        t.daemon = True
        t.start()

    # populate queue with data
    for url in urls_to_process:
        work_queue.put(url)

    # wait on the queue until everything has been processed
    work_queue.join()

    # print results
    print(repr(result_queue))

main()


How should we understand classes, objects, and instances in Python programming?

A class is a generalization of a kind of thing, such as "person".
Data types include built-in ones such as strings, numbers, and complex numbers, as well as user-defined classes.
Objects and instances are concrete things belonging to a class, such as a particular man or woman. "Man" can in turn also be a class of its own, with instances such as an old man or a young man.
Remember: a class is a collective name for a kind of thing, while an instance (or object) is one specific thing of that kind.
Example:

class Person:
    '''Basic attributes of a person: name, age, and sex'''
    def __init__(self, name, age, sex):
        self.name = name
        self.age = age
        self.sex = sex

class Man(Person):
    def __init__(self, name, age):
        super(Man, self).__init__(name, age, 'male')

class Woman(Person):
    def __init__(self, name, age):
        super(Woman, self).__init__(name, age, 'female')

A permutation and combination problem in Python programming

With 100 people, each person can be paired with any of the other 99. Since each pair is counted twice that way, there are 100 × 99 / 2 = 4950 distinct pairs.
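A quick check of that count with the standard library, assuming a group of 100 people:

```python
import math
from itertools import combinations

# Closed-form count of distinct pairs among 100 people.
by_formula = math.comb(100, 2)

# The same count by enumerating every pair explicitly.
by_enumeration = len(list(combinations(range(100), 2)))

print(by_formula, by_enumeration)  # -> 4950 4950
```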
