How to write concurrent programs in Python

Source: Internet
Author: User
GIL

For historical reasons (the GIL, or Global Interpreter Lock), multithreading in Python performs poorly. The GIL means that Python can use only one CPU core at any given time, and its scheduling is simple and crude: when several threads are running, each thread runs for a period of time t, is then forcibly suspended so that another thread can run, and this repeats until all threads have finished.
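The effect can be observed with a rough sketch like the one below (Python 3, assuming a standard CPython build with the GIL). The `count_down` workload and the loop size are illustrative assumptions, not part of the original article. `sys.getswitchinterval()` reports the interval after which the interpreter asks the running thread to give up the GIL, which roughly corresponds to the time slice t described above. Timings depend on the machine, so no expected numbers are shown.

```python
import sys
import threading
import time


def count_down(n):
    """Pure-Python CPU-bound loop (illustrative workload)."""
    while n > 0:
        n -= 1
    return n


def run_sequential(n, repeats=2):
    """Run the workload `repeats` times one after another; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start


def run_threaded(n, repeats=2):
    """Run the workload in `repeats` threads at once; return elapsed seconds."""
    start = time.perf_counter()
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(repeats)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print("switch interval (s):", sys.getswitchinterval())
    n = 2000000
    print("sequential:", run_sequential(n))
    print("threaded:  ", run_threaded(n))
```

With the GIL in place, the threaded run is usually no faster than the sequential one for this kind of CPU-bound workload.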

This prevents the program from making effective use of locality in the computer system: frequent thread switching is unfriendly to the cache and wastes resources.

It is said that the Python developers once implemented an interpreter without the GIL, but it performed worse than the interpreter with the GIL, so the attempt was abandoned. Later, Python officially introduced the approach of using multiple processes in place of multiple threads (the multiprocessing module), and Python 3 also provides packages such as concurrent.futures, so that our programs can be both simple to write and perform well.

Multiprocessing/multithreading + queue

In general, the rule of thumb for writing concurrent programs in Python is: use multiple processes for CPU-bound tasks, and use either multiple processes or multiple threads for I/O-bound tasks. In addition, sharing resources requires synchronization locks and a series of other troublesome steps, which makes the code less intuitive. A better idea is to combine multiple processes or threads with a queue, which avoids the tedious and inefficient locking.
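The contrast between locking and queues can be sketched as follows (Python 3, threads only to keep the example short). The function names and the sum-of-squares workload are hypothetical and chosen only for illustration. In the first version every worker updates one shared total and has to take a lock; in the second, workers only put partial results on a queue and the main thread combines them.

```python
import queue
import threading


def _split(numbers, n_threads):
    """Split the input into n_threads interleaved chunks."""
    return [numbers[i::n_threads] for i in range(n_threads)]


def sum_squares_with_lock(numbers, n_threads=4):
    """Workers update a shared total, so each update is guarded by a lock."""
    total = [0]
    lock = threading.Lock()

    def worker(chunk):
        for x in chunk:
            with lock:
                total[0] += x * x

    threads = [threading.Thread(target=worker, args=(c,))
               for c in _split(numbers, n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]


def sum_squares_with_queue(numbers, n_threads=4):
    """Workers share nothing; each puts one partial sum on a queue."""
    results = queue.Queue()

    def worker(chunk):
        results.put(sum(x * x for x in chunk))

    threads = [threading.Thread(target=worker, args=(c,))
               for c in _split(numbers, n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Exactly one partial result per worker is on the queue.
    return sum(results.get() for _ in range(n_threads))


if __name__ == "__main__":
    data = list(range(100))
    print(sum_squares_with_lock(data), sum_squares_with_queue(data))
```

The queue itself handles the synchronization internally, so the worker code contains no explicit lock.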

The following uses the queue + multiprocessing approach in Python 2 to handle an I/O-bound task.

Suppose you need to download a number of web pages and parse them. A single process is inefficient, so using multiple processes or threads is the natural choice.
We can initialize a tasks queue that stores a series of dest_url values, and start several worker processes (4 in this description; the code below starts one per CPU core) that take tasks from the queue and execute them. The results are put into a results queue, and finally the contents of results are parsed. At the end, both queues are closed.

Here is the main logic of the code.

# -*- coding: utf-8 -*-
# I/O-bound task: several processes download multiple pages at the same time
# Uses queue + multiprocessing
# Because the task is I/O-bound, the threading module could be used as well
# _download(), get_urls() and _parse() are defined elsewhere (not shown here)
import multiprocessing


def main():
    tasks = multiprocessing.JoinableQueue()
    results = multiprocessing.Queue()
    cpu_count = multiprocessing.cpu_count()  # number of processes == number of CPU cores

    # The main process creates the workers right away, but because the
    # blocking queue tasks starts out empty, all workers block at first
    create_process(tasks, results, cpu_count)
    add_tasks(tasks)       # start adding tasks to tasks
    parse(tasks, results)  # finally, the main process waits for the workers to finish


def create_process(tasks, results, cpu_count):
    for _ in range(cpu_count):
        # create a process that runs _worker
        p = multiprocessing.Process(target=_worker, args=(tasks, results))
        p.daemon = True  # let every worker end when the main process ends
        p.start()        # start the worker


def _worker(tasks, results):
    # Every worker is a daemon, so this loop ends when the main process exits
    while True:
        task = tasks.get()  # blocks while tasks is empty
        try:
            result = _download(task)
            results.put(result)  # some exceptions are not handled here
        finally:
            tasks.task_done()


def add_tasks(tasks):
    for url in get_urls():  # get_urls() returns a list of URLs
        tasks.put(url)


def parse(tasks, results):
    try:
        tasks.join()
    except KeyboardInterrupt as err:
        print("Tasks has been stopped!")
        print(err)
    while not results.empty():
        _parse(results)


if __name__ == '__main__':
    main()
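As the comments in the code note, an I/O-bound task like this can also use threads. A minimal Python 3 sketch of the same tasks/results pattern with the threading and queue modules might look like the following. `fake_download` is a hypothetical stand-in for the article's `_download` and performs no network access, and `run` and its parameters are likewise assumptions made for the example.

```python
import queue
import threading


def fake_download(url):
    """Hypothetical stand-in for _download(); does no real network I/O."""
    return "content of " + url


def _worker(tasks, results):
    while True:
        url = tasks.get()      # blocks while the queue is empty
        try:
            results.put((url, fake_download(url)))
        finally:
            tasks.task_done()  # always balance the get()


def run(urls, n_workers=4):
    tasks = queue.Queue()
    results = queue.Queue()
    for _ in range(n_workers):
        # Daemon threads end together with the main thread.
        t = threading.Thread(target=_worker, args=(tasks, results), daemon=True)
        t.start()
    for url in urls:
        tasks.put(url)
    tasks.join()               # wait until every task has been marked done
    pages = {}
    while not results.empty():
        url, body = results.get()
        pages[url] = body
    return pages


if __name__ == "__main__":
    print(run(["http://example.com/a", "http://example.com/b"]))
```

`queue.Queue` offers the same `join()` and `task_done()` methods as `multiprocessing.JoinableQueue`, so the structure carries over almost unchanged.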

Using the concurrent.futures package in Python 3

In Python 3, you can use the concurrent.futures package to write multithreaded or multi-process code that is easier to use. It feels similar to Java's concurrency framework (perhaps it was modeled on it?).
For example, consider the following simple code:

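import concurrent.futures
import multiprocessing

cpu_count = multiprocessing.cpu_count()


def handler():
    futures = set()
    with concurrent.futures.ProcessPoolExecutor(max_workers=cpu_count) as executor:
        # get_task() and tasks are defined elsewhere; each task is a callable
        for task in get_task(tasks):
            future = executor.submit(task)
            futures.add(future)
        # collect the results while the pool is still open
        return wait_for(futures)


def wait_for(futures):
    results = []
    try:
        for future in concurrent.futures.as_completed(futures):
            err = future.exception()
            if err is None:
                results.append(future.result())
            else:
                raise err
    except KeyboardInterrupt as e:
        for future in futures:
            future.cancel()  # only futures that have not started yet can be cancelled
        print("Task has been canceled!")
        print(e)
    return results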
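For I/O-bound work the same pattern can be written with a thread pool. The sketch below is a self-contained illustration rather than the article's code: `square` and `run_all` are hypothetical names, and instead of re-raising the first error it records each failure next to the input that caused it.

```python
import concurrent.futures


def square(x):
    """Illustrative task; negative input is treated as an error."""
    if x < 0:
        raise ValueError("negative input: %d" % x)
    return x * x


def run_all(values, max_workers=4):
    """Submit one task per value; return (results, errors) keyed by input value."""
    results, errors = {}, {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to the value it was created for.
        future_to_value = {executor.submit(square, v): v for v in values}
        for future in concurrent.futures.as_completed(future_to_value):
            value = future_to_value[future]
            err = future.exception()
            if err is None:
                results[value] = future.result()
            else:
                errors[value] = err
    return results, errors


if __name__ == "__main__":
    ok, failed = run_all([1, 2, -3])
    print(ok, failed)
```

Replacing `ThreadPoolExecutor` with `ProcessPoolExecutor` keeps the rest of the code the same, provided the submitted function and its arguments can be pickled.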

Summary

Writing a large Python project this way would be too inefficient; there are many existing frameworks for Python that are more efficient to use.
But for your own small programs, writing them this way works well. :)
