Performance of Python's logging module, and logging from multiple processes

When writing background tasks in Python, you often need to log the running state of the program and save the details of any errors, so that you can debug and analyze them afterwards. Python's logging module is a great helper here. This article looks at the performance of Python's logging module and at logging from multiple processes.

Objective

In Java, the most common logging library is log4j. Python ships with its own logging module, and its usage is actually quite similar to log4j's. Logging is a good way to record what a program did, but logs are basically file-based, that is, written to disk, and at that point the disk can become the performance bottleneck. So what throughput can Python's logging achieve on an ordinary server hard disk (mechanical, not SSD)? Let's take a look today with the detailed test below.

The test code is as follows:


#!/usr/bin/env python
# coding=utf-8
# ============================
# Describe: logs provided to the platform
# Author by: Common Success
# Create date: 2016/08/01
# Modify date: 2016/08/01
# ============================
import time
import os
import logging

print "Start test ..."

s_tm = time.time()
test_time = 10.0              # test duration: 10 seconds
e_tm = s_tm + test_time
j = 0
pid = str(os.getpid())

while 1:
    now_time = time.time()
    j += 1
    if now_time > e_tm:
        break
    # create the log folder if it does not exist
    lujing = "D:\\test_log"
    if not os.path.exists(lujing):
        os.mkdir(lujing)
    # one log file per day, e.g. recharge_160801.log
    fm2 = '%y%m%d'
    ymd = time.strftime(fm2, time.localtime(now_time))
    filename = 'recharge_' + ymd + '.log'
    log_file = os.path.join(lujing, filename)
    t = "\t"
    log_msg = str(j) + t + str(now_time) + t + pid
    the_logger = logging.getLogger('recharge_log')
    f_handler = logging.FileHandler(log_file)
    the_logger.addHandler(f_handler)
    the_logger.setLevel(logging.INFO)
    # to pass exception information, use the keyword argument
    # exc_info with a true value
    the_logger.info(log_msg, exc_info=False)
    the_logger.removeHandler(f_handler)

rps = j / test_time
print rps, "rows per second"

The result is:

Start test ...

2973.0 rows per second


Python's logging performance:

On a 7,200 RPM mechanical disk, measured over several runs, the throughput is roughly 2,800-3,000 log lines per second (one record per line). At that point disk I/O is essentially saturated (on a 3.3 GHz CPU, CPU usage is about 40%).
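Note that the benchmark attaches and removes a FileHandler for every single record, so each write also pays for opening and closing the file. In ordinary use the handler is configured once at start-up and then reused; a minimal sketch of that usual pattern (the file path and format string here are placeholders, not from the original test):

import logging

# configure the logger once, at program start-up
logger = logging.getLogger('recharge_log')
handler = logging.FileHandler('D:\\test_log\\recharge.log')  # placeholder path
handler.setFormatter(logging.Formatter('%(asctime)s\t%(process)d\t%(message)s'))
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# afterwards every write reuses the same open file handle
for i in range(1000):
    logger.info('row %d', i)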

Python's logging with multiple processes:

Python's logging module is thread-safe. But how should a multi-process program write its log files? My solution is to have each process write to its own log file, named after its PID, and then to merge the per-process logs afterwards into a single new log.
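The article does not show the code for this scheme, so here is a minimal sketch of the idea, assuming workers started with the standard multiprocessing module; the file names, record format, and merge step are illustrative:

import glob
import logging
import multiprocessing
import os
import time

LOG_DIR = 'D:\\test_log'  # placeholder directory, as in the benchmark above

def worker(rows):
    # each process logs to its own file, named after its PID, so no
    # two processes ever write to the same file handle
    pid = os.getpid()
    logger = logging.getLogger('recharge_log_%d' % pid)
    handler = logging.FileHandler(os.path.join(LOG_DIR, 'recharge_%d.log' % pid))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    for i in range(rows):
        # timestamp first, so the merged log can be sorted by time
        logger.info('%.6f\t%s\t%s', time.time(), pid, i)

def merge_logs(pattern, out_path):
    # concatenate the per-PID files and order the records by timestamp
    lines = []
    for path in glob.glob(pattern):
        with open(path) as f:
            lines.extend(f.readlines())
    lines.sort(key=lambda line: float(line.split('\t')[0]))
    with open(out_path, 'w') as out:
        out.writelines(lines)

if __name__ == '__main__':
    if not os.path.exists(LOG_DIR):
        os.mkdir(LOG_DIR)
    procs = [multiprocessing.Process(target=worker, args=(1000,)) for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    merge_logs(os.path.join(LOG_DIR, 'recharge_*.log'),
               os.path.join(LOG_DIR, 'merged.log'))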

Tip: since disk I/O is already the bottleneck, using multiple processes will not improve logging throughput. High-performance logging needs buffered (cached) writes, or a distributed logging system.
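For the buffering approach, the standard library's logging.handlers.MemoryHandler is one option: it holds records in memory and flushes them to a target handler in batches. A minimal sketch (capacity, flush level, and file path chosen arbitrarily):

import logging
import logging.handlers

logger = logging.getLogger('recharge_log')
file_handler = logging.FileHandler('D:\\test_log\\recharge.log')  # placeholder path
# buffer up to 1000 records in memory; flush to the file handler when the
# buffer is full or when a record of ERROR level or above arrives
memory_handler = logging.handlers.MemoryHandler(
    capacity=1000,
    flushLevel=logging.ERROR,
    target=file_handler,
)
logger.addHandler(memory_handler)
logger.setLevel(logging.INFO)

logger.info('buffered row')   # stays in memory for now
memory_handler.flush()        # force the buffered records to disk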

Summary

On an ordinary 7,200 RPM mechanical disk, Python's logging module can write roughly 2,800-3,000 log lines per second, at which point disk I/O is the bottleneck. For multi-process programs, one workable approach is a separate log file per PID, merged afterwards; genuinely high-performance logging needs buffered writes or a distributed log. That is all for this article; I hope it is helpful to you.
