In-depth analysis of python timed killing process and analysis of python Process
Previously, I wrote a python script to use selenium + phantomjs to crawl a New post. During the page loop pulling process, phantomjs is always blocked and the maximum wait time setting using WebDriverWait is invalid. No improvement in replacing phantomjs with firefox
Because this script will not be used for a long time, a temporary method is adopted to create a new sub-thread to kill the phantomjs process at a fixed cycle, so that selenium will return after the block does not exceed this cycle at most. Of course, make some fine-tuning in the crawler script to prevent some URLs from being skipped.
The sched module is used for scheduled task execution. Many people compare it with crontab.
Command to kill a specific process
Copy codeThe Code is as follows:
The kill-9 pid command can terminate the corresponding pid process unconditionally.
Obtain the process pid named phantomjs
Ps command to list process information
Grep filters process information with the specified name.
Awk '{print $2}' extracts the pid information of the second column
The final command is: kill-9 'ps-aux | grep phantomjs | awk '{print $2 }''
Python can use OS. system () to execute shell commands.
Use sched module to execute tasks cyclically
The sched module uses heapq to save the event queue. The event type is namedtuple.
Sched requires two functions, one for obtaining time changes, and the other for waiting for a period of time, which can be customized.
Basic API
The sched. scheduler (time_func, sleep_func) function returns a scheduler object. timefunc is a timing function that returns numbers, and sleepfunc can accept this parameter and delay the corresponding time.
Sched。. enter (delay, priority, action, argument) after the delay period, call the action with the argument parameter. argument must be a tuple. To run at a fixed time, call scheduler. enterabs
Schedcel. cancel (event) cancels a scheduled task. Event is the return value of the enter function.
Scheduler. run ()
Task time overlaps
A block may take some time to execute a task. After the task is returned, it may have exceeded the scheduled time of the next task. In this case, the next task will be executed immediately without skipping.
Periodical execution
Similar to recursive calls, write a wrapper function and schedule the next task again in the task.
def wrapper(func, delay):scheduler.enter(delay, 0, wrapper, (func, delay))func()
Final code
import os, time, schedschedule = sched.scheduler(time.time, time.sleep)cmd = '''kill -9 `ps -aux|grep phantomjs|awk '{print $2}'`'''def recycle_eval(c, inc):schedule.enter(inc, 0, recycle_eval, (c, inc))os.system(c)print time.ctime(),'phantomjs killed'if __name__ == '__main__':inc = 180schedule.enter(inc, 0, recycle_eval, (cmd, inc))schedule.run()
The above is a small series of knowledge about the python timed kill process, hoping to help you!