It has been said that the multithreading of Python can only run on a single core, that is, each thread executes asynchronously in a concurrent manner.
In this article, let's talk about the Python multi-process approach.
Multi-process depends on the number of processors on the machine, multi-core machine on multi-process programming, the core running between the processes are executed in parallel, you can take advantage of the process pool, is running a process on each core, when the number of processes in the fin is greater than the total number of cores, the process to run will wait, Until the other process has finished running out of the kernel
Multi-process is the equivalent of the following ticket sales behavior
Note here that when there is only one single-core CPU in the system, a multi-process does not occur, at which point each process takes up CPU to complete
We can use the Python statement to get the CPU available for the number of cores, such as
In order to form a comparison, we still use the previous example, when the book, search keyword product information crawl
First write the multi-process master method
# coding=utf-8__author__ = "Susmote" from multi_threading import mining_funcimport multiprocessingimport timedef Multiple_process_test (): start_time = Time.time () page_range_list = [ (1, ten), (one, 21, 32) , ] pool = multiprocessing. Pool (processes=3) for Page_range in page_range_list: Pool.apply_async (mining_func.get_urls_in_pages, (page _range[0], page_range[1]) pool.close () pool.join () end_time = Time.time () print ("Crawl time:", End_ Time-start_time) return end_time-start_time
In this case, let me briefly explain the operation of the multi-process
Pool is defined as a process pool that can concurrently parallel 3 processes, and then through loops, using the Apply_async method to run the process that enters the process pool in parallel in an asynchronous manner
Here is the main function
# coding=utf-8__author__ = "Susmote" from process_func import multiple_process_testif __name__ = "__main__": pt = mu Ltiple_process_test () Print ("PT:", PT)
Run the code to get the following results.
5.908
Run one More time
3.954
Last time
4.163
Take average Time
4.341 seconds
In this case, we will review the multi-threading scenario (same network conditions) for the previous article:
Multithreading
Single Thread
It can be seen that the gap is very clear, multi-process is the biggest advantage
Multi-process is these, you can also find a larger data pool, to test these methods
Data Mining _ Multi-process crawl