How to write concurrent programs in Python-Python tutorial

Source: Internet
Author: User
Concurrent running of computer programs is a frequently discussed topic. today I want to discuss various concurrency methods in Python. GIL

In Python, due to historical reasons (GIL), the effect of multithreading in Python is very unsatisfactory. GIL allows Python to use only one CPU core at any time, and its scheduling algorithm is simple and crude: in multithreading, every thread can run for a period of time t, and then force the thread to be suspended, then run other threads until all threads are finished.

This makes it impossible to effectively use the "locality" in the computer system. frequent thread switching is not very friendly to the cache, resulting in a waste of resources.

It is said that Python has implemented a Python interpreter to remove GIL, but it is not as effective as a GIL interpreter. later, Python officially launched the "using multi-process to replace multithreading" solution, which also included concurrent in Python3. the future packages allow us to write programs to achieve "simplicity and performance ".

Multi-process/multi-thread + Queue

In general, the experience of writing concurrent programs in Python is: computing-intensive tasks use multiple processes, IO-intensive tasks use multiple processes or multithreading. in addition, because resource sharing is involved, synchronization locks and other troublesome steps are required, and the code writing is not intuitive. another good idea is to use the multi-process/multi-thread + Queue method to avoid the trouble and inefficiency of locking.

In Python2, Queue + multi-process is used to process an I/O-intensive task.

If you need to download and parse multiple web pages, the efficiency of a single process is very low, so it is imperative to use multi-process/multi-thread.
We can initialize a tasks queue, which will store a series of dest_urls, enable four processes to fetch tasks from tasks and then execute them. the processing results are stored in a results queue, finally, parse the results in results. close two queues.

Below are some major logic code.

#-*-Coding: UTF-8-*-# IO-intensive tasks # Multiple processes download multiple webpages at the same time # Use Queue + multi-process # because it is IO-intensive, so we can also use the threading module to import multiprocessingdef main (): tasks = multiprocessing. joinableQueue () results = multiprocessing. queue () cpu_count = multiprocessing. cpu_count () # Number of processes = number of CPU cores create_process (tasks, results, cpu_count) # The main process immediately creates a series of processes, but the task started to be empty because of the blocking queue, all the sub-processes are blocked add_tasks (tasks) # Start to add the task parse (tasks, results) to tasks # Finally, the main process waits for other threads to finish processing. Def create_process (tasks, results, cpu_count): for _ in range (cpu_count): p = multiprocessing. process (target = _ worker, args = (tasks, results) # create the corresponding Process p according to _ worker. daemon = True # enable all processes to end with the completion of the main process p. start () # start def _ worker (tasks, results): while True: # because daemon = True is set for all the preceding threads, there is no infinite loop. try: task = tasks. get () # if there is no task in tasks, the result = _ download (task) results will be blocked. put (result) # some exceptions do not handled f Inally: tasks. task_done () def add_tasks (tasks): for url in get_urls (): # get_urls () return a urls_list tasks. put (url) def parse (tasks, results): try: tasks. join () counter t KeyboardInterrupt as err: print "Tasks has been stopped! "Print err while not results. empty (): _ parse (results) if _ name _ = '_ main _': main ()

Use the concurrent. ures package in Python3

In Python3, you can use the concurrent. ures package to write more simple and easy-to-use multi-threaded/multi-process code. it feels similar to the concurrent framework of Java (for reference ?)
For example, the following simple code example

def handler():  futures = set()  with concurrent.futures.ProcessPoolExecutor(max_workers=cpu_count) as executor:    for task in get_task(tasks):      future = executor.submit(task)      futures.add(future)def wait_for(futures):  try:    for future in concurrent.futures.as_completed(futures):      err = futures.exception()      if not err:        result = future.result()      else:        raise err  except KeyboardInterrupt as e:    for future in futures:      future.cancel()    print "Task has been canceled!"    print e  return result

Summary

If some large Python projects are also written in this way, the efficiency is too low. There are many existing frameworks in Python to use them more efficiently.
However, it is good to write some of your own "cool" programs .:)

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.