Using Futures to Handle Concurrency in Python
A First Look at Futures
Use the following scripts to get a preliminary understanding of futures:
Example 1: A plain sequential loop
import os
import sys
import time

import requests

POP20_CC = ("CN IN US ID BR PK NG BD RU JP "
            "MX PH VN ET EG DE IR TR CD FR").split()
BASE_URL = 'http://flupy.org/data/flags'
DEST_DIR = 'downloads/'

def save_flag(img, filename):
    path = os.path.join(DEST_DIR, filename)
    with open(path, 'wb') as fp:
        fp.write(img)

def get_flag(cc):
    url = "{}/{cc}/{cc}.gif".format(BASE_URL, cc=cc.lower())
    resp = requests.get(url)
    return resp.content

def show(text):
    print(text, end=" ")
    sys.stdout.flush()

def download_many(cc_list):
    for cc in sorted(cc_list):
        image = get_flag(cc)
        show(cc)
        save_flag(image, cc.lower() + ".gif")
    return len(cc_list)

def main(download_many):
    t0 = time.time()
    count = download_many(POP20_CC)
    elapsed = time.time() - t0
    msg = "\n{} flags downloaded in {:.2f}s"
    print(msg.format(count, elapsed))

if __name__ == '__main__':
    main(download_many)
Example 2: The same task implemented with concurrent.futures, reusing the code above (saved as flags.py).
from concurrent import futures

from flags import save_flag, get_flag, show, main

MAX_WORKERS = 20

def download_one(cc):
    image = get_flag(cc)
    show(cc)
    save_flag(image, cc.lower() + ".gif")
    return cc

def download_many(cc_list):
    workers = min(MAX_WORKERS, len(cc_list))
    with futures.ThreadPoolExecutor(workers) as executor:
        res = executor.map(download_one, sorted(cc_list))
    return len(list(res))

if __name__ == '__main__':
    main(download_many)
Running each version three times, the average elapsed times were 13.67 s and 1.59 s respectively: a very large difference.
Futures
Futures are an important component of both the concurrent.futures module and the asyncio module.
Starting with Python 3.4, there are two classes named Future in the standard library: concurrent.futures.Future and asyncio.Future.
The two classes serve the same purpose: an instance of either Future class represents a deferred computation that may or may not have completed yet. They are similar to the Deferred class in Twisted and the Future class in the Tornado framework.
Note: in general you should not create futures yourself; they are meant to be instantiated exclusively by the concurrent.futures or asyncio framework.
The reason: a future represents something that will eventually happen, and the only way to be sure it will happen is that its execution has been scheduled. Therefore, a concurrent.futures.Future instance is created only when you hand a callable over to a concurrent.futures.Executor subclass for processing.
For example, the Executor.submit() method takes a callable object; calling it schedules the callable for execution and returns a future.
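As a minimal, self-contained sketch of that behavior (double is an illustrative function, not part of the flags example):

```python
from concurrent import futures

def double(n):
    # stand-in for any callable we want scheduled
    return n * 2

with futures.ThreadPoolExecutor(max_workers=1) as executor:
    fut = executor.submit(double, 21)  # schedules the callable, returns a Future
    print(fut)           # e.g. <Future at 0x... state=... >
    print(fut.result())  # 42
```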
Client code should not change a future's state: the concurrency framework changes a future's state when the deferred computation it represents ends, and we cannot control when that computation ends.
Both Future classes have a .done() method. It does not block, and it returns a Boolean telling you whether the callable linked to the future has finished executing. Client code usually waits to be notified rather than repeatedly asking a future whether it is done, which is why both Future classes also have an .add_done_callback() method. It takes a single argument, a callable, which is invoked with the future as its argument once the future finishes.
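A small sketch of both methods (slow_task and on_done are illustrative names):

```python
import time
from concurrent import futures

def slow_task():
    time.sleep(0.1)
    return "finished"

def on_done(fut):
    # invoked with the future itself once it is done
    print("callback got:", fut.result())

with futures.ThreadPoolExecutor(max_workers=1) as executor:
    fut = executor.submit(slow_task)
    print(fut.done())               # usually False: the task is still sleeping
    fut.add_done_callback(on_done)
# leaving the with block waits for the task, so the callback has fired by now
print(fut.done())                   # True
```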
The .result() method works the same way in both Future classes: it returns the result of the callable, or re-raises whatever exception was thrown while the callable executed. However, when the future is not done yet, the behavior of result() differs greatly between the two Future classes.
For a concurrent.futures.Future instance, calling .result() blocks the caller's thread until a result is available. The result method accepts an optional timeout argument; if the future is not done within the specified time, a TimeoutError exception is raised.
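A short sketch of the timeout behavior (slow and the sleep times are illustrative):

```python
import time
from concurrent import futures

def slow():
    time.sleep(1)
    return "done"

with futures.ThreadPoolExecutor(max_workers=1) as executor:
    fut = executor.submit(slow)
    try:
        fut.result(timeout=0.1)     # the task needs ~1 s, so this times out
    except futures.TimeoutError:
        print("timed out")
    print(fut.result())             # without a timeout, blocks until 'done'
```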
The asyncio.Future.result method does not support a timeout. In asyncio, the best way to obtain the result of a future is the yield from structure, which cannot be done with concurrent.futures.Future.
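In modern asyncio code the yield from structure has been superseded by await. A minimal sketch using a low-level asyncio.Future (the "flag data" payload is illustrative):

```python
import asyncio

async def main():
    loop = asyncio.get_running_loop()
    fut = loop.create_future()                      # an asyncio.Future bound to this loop
    loop.call_later(0.1, fut.set_result, "flag data")
    result = await fut                              # suspends until the result is set
    print(result)
    return result

asyncio.run(main())
```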
In both asyncio and concurrent.futures, several functions return futures, while other functions use futures internally. Executor.map in the earlier example is in the latter camp: it returns an iterator whose __next__ method calls the result method of each future. Therefore, what we get are the results of the futures, not the futures themselves.
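A compact sketch of this point: map hands back results, not Future instances (square is an illustrative function):

```python
from concurrent import futures

def square(n):
    return n * n

with futures.ThreadPoolExecutor(max_workers=3) as executor:
    res = executor.map(square, [1, 2, 3])  # iterator over results, in input order
results = list(res)                        # __next__ calls .result() on each future
print(results)                             # [1, 4, 9]
```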
To use the futures.as_completed function, we need two loops: one to create and schedule the futures, and another to retrieve their results.
from concurrent import futures

from flags import save_flag, get_flag, show, main

MAX_WORKERS = 20

def download_one(cc):
    image = get_flag(cc)
    show(cc)
    save_flag(image, cc.lower() + ".gif")
    return cc

def download_many(cc_list):
    cc_list = cc_list[:5]
    with futures.ThreadPoolExecutor(max_workers=3) as executor:
        to_do = []
        for cc in sorted(cc_list):
            future = executor.submit(download_one, cc)
            to_do.append(future)
            msg = "Scheduled for {}: {}"
            print(msg.format(cc, future))
        results = []
        for future in futures.as_completed(to_do):
            res = future.result()
            msg = "{} result: {!r}"
            print(msg.format(future, res))
            results.append(res)
    return len(results)

if __name__ == '__main__':
    main(download_many)
The result is as follows:
Note: Python code cannot control the GIL. However, all functions in the standard library that perform blocking I/O release the GIL while waiting for the operating system to return a result. This is precisely why Python threads can help in I/O-bound applications.
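This can be demonstrated with time.sleep, which, like blocking I/O calls, releases the GIL while it waits (the 0.2 s figure is an arbitrary stand-in for network latency):

```python
import time
from concurrent import futures

def fake_io(n):
    time.sleep(0.2)   # releases the GIL, like a blocking socket read
    return n

t0 = time.time()
with futures.ThreadPoolExecutor(max_workers=5) as executor:
    results = list(executor.map(fake_io, range(5)))
elapsed = time.time() - t0
print(f"{len(results)} tasks in {elapsed:.2f}s")  # near 0.2 s, not 5 * 0.2 s
```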
Everything above used concurrent.futures to start threads; the next section uses it to start processes.
Starting Processes with concurrent.futures
The ProcessPoolExecutor class in concurrent.futures distributes work across multiple Python processes. Therefore, for CPU-intensive processing, use this class to bypass the GIL and make use of all available CPU cores.
The principle: ProcessPoolExecutor creates N independent Python interpreter processes, where N is the number of CPU cores available on the system.
It is used in the same way as ThreadPoolExecutor.
Summary