Efficient use of Top

Source: Internet
Author: User
Efficient use of Toplinuxc in Linux is an important task for desktop users to monitor system resource usage. Through this work, we can find the bottleneck of the system, perform targeted system optimization, and identify memory leaks. The problem is what software should we use and how to monitor system resource usage for desktop users if Top linuxc is used efficiently in Linux. Through this work, we can find the bottleneck of the system, perform targeted system optimization, and identify memory leaks. The problem is what software we should use and if we want to use it. Most of the many alternative monitoring tools use "top" (part of the procps package ). Top provides almost all the system resource usage monitoring that we need, just in this software. All the information in this article is based on the procps package running on Linux 2.6.x kernel version 3.2.5. Here, we assume that procps has been installed and runs on your Linux system. There is no need for your experience with top, but it will be more advantageous if you try to use it. The following are some challenges: A. interaction or batch processing mode? By default, the interactive mode is used when Top is called. In this mode, Top runs indefinitely, and you can press the key to redefine the running mode of Top. However, sometimes you need to process the Top output, but this is difficult to implement in this mode. Solution? Use the batch processing mode. 1 $ top-B you will get output similar to the following: top-15:22:45 up, 5 users, load average: 0.00, 0.03, 0.00 Tasks: 60 total, 1 running, 59 sleeping, 0 stopped, 0 zombieCpu (s): 3.8% us, 2.9% sy, 0.0% ni, 89.6% id, 3.3% wa, 0.4% hi, 0.0% siMem: 515896 k total, 495572 k used, 20324 k free, 13936 k buffersSwap: 909676 k total, 4 k used, 909672 k free, 377608 k cachedPID user pr ni virt res shr s % CPU % mem time + COMMAND 1 root 16 0 1544 476 404 S 0.0 0.1. 35 init 2 root 34 19 0 0 0 S 0.0 0: 00. 02 ksoftirqd/0 3 root 10-5 0 0 0 S 0.0 0.0. 11 events/0 ha, and so on. it runs repeatedly, the same as the interaction mode. Don't worry, you can use-n to limit the number of duplicates. Therefore, if you want to get a one-time result, the real advantage of typing: 1 $ top-B-n 1 is that you can easily combine it with at or cron commands. Their combination allows Top to take snapshots of the resource usage status at a specific time. For example, with at, we can set top to run in one minute. 123 $ cat. /test. atTERM = linux top-B-n 1>/tmp/top-report.txt $ at-f. /test. at now + 1minutes readers may ask, "Why do I need to set the environmental variable TERM before calling Top when creating a new task? ". The answer is: this variable is required for Top running, but "at" is not retained during scheduled calls. The simple settings above can ensure the normal operation of Top. B. how to monitor the development process? Sometimes, we are only interested in several processes. it may be four or five of all processes. For example, if you want to monitor processes whose PID is 4360 and 4358, you need to type: 1 $ top-p 4360,4358 or 1 $ top-p 4360-p 4358 looks simple, just use-p to list all the required PIDs, use commas (,) to separate the values or simply use-p multiple times. Another possibility is to monitor processes with specific user IDs. To meet this requirement, you can use the-u or-U option. Assume that the UID of the user "johndoe" is 500. type: 1 $ top-u johndoeOR 1 $ top-u 500 or 1 $ top-U johndoe. The conclusion is, you can use either a pure user name or a digital UID. "-U,-U? Are the two different ?" Yes. Like most other GNU tools, the options are case-sensitive. -U indicates that Top searches for valid, real, saved, and UID of the file system, while-u matches only valid user IDs. You must know that every * nix process uses a valid user ID at runtime, and some of them are not the same as real user IDs. In most cases, people who are interested in a valid user identity similar to file system permissions or operating system features will check it, not UID. Unlike-p, which is only used for command line options,-U and-u can be used in interactive mode. As you guessed, you can type 'U' or 'u' to filter processes by user name. The same rules still apply. 'U' is a valid user ID, and 'U' is the real/valid/save/file system username. You will be asked to enter the user name or number UID. C. quick or slow update? Before answering this question, let's briefly introduce how Top runs. Here, Strace can help you: 1 $ strace-o/tmp/trace.txt top-B-n 1 open/tmp/trace.txt in your preferred text editor. What do you think? There are too many jobs to do in one call. I think so. One of the tasks that Top must do in each traversal is to open many files and parse their content. you can see the number of times: 1 $ grep open (/tmp/hasil.txt | wc-l for example, in my Linux, the number is 304. after careful observation, you will find that the Top directory traverses the/proc folder to collect process information. /Proc itself is a virtual file system, which means it is not stored in a real hard disk, but created by the Linux kernel and saved in memory. In the folder, such as/proc/2097 (2097 is PID), the Linux kernel prints the information associated with it to this file, and this is the Top message source. Try it at the same time: 1 $ time top-B-n 1 so that you can understand how fast the Top single-round operation is. In my system, it is about 0.5-0.6 seconds. Check the "real" field, not the "user" or "system" field, because the "real" field reflects the total time required for Top jobs. Therefore, with this cognition, it is wise to use the appropriate update interval. It also takes time to access the memory based on the file system. The empirical rule is that for most users, the interval between 1 and 3 seconds is sufficient. Use-d in the command line, or press "s" in interactive mode to set. You can use small trees like 2.5 and 4.1. When do we need updates faster than 1 second? More samples are required in the time range. To address this requirement, it is best to use the batch processing mode and redirect the standard output to the file for better analysis. You don't care about the extra CPU load consumed by Top. Yes, although it is small, it still requires load. If your Linux system is relatively idle, use a short interval at will. if not, it is best to keep your CPU time for important tasks. One way to reduce the Top workload is to monitor only a few specific PIDs. In this way, Top does not need to traverse all subfolders under/proc. What about username filtering? It will not be better. User name filtering brings additional workload to Top, so associating it with a short interval will increase the CPU load. Of course, when you need to force update, press the Space key and the Top will refresh the statistics., D. when the required field is default, the following task attributes are displayed after Top is started: field Description PID: process IDUSER: valid user IDPR: Dynamic priority value NI: good value, also known as basic priority VIRT: virtual size of a task. Including the size of executable binary files of processes, the size of data areas, and the size of all loaded shared libraries. RES: current task memory consumption. The part stored in the swap partition does not include. SHR: some memory areas may be shared by two or more tasks. This field reflects these shared areas. For example, shared library and Sysv shared memory. S: task status % CPU: Percentage of CPU time used to run the task when the Top screen is updated. % MEM: percentage TIME consumed by the current memory of the task +: total cpu time consumed after the task is started ." + "Sign means it is displayed with hundreth of a second granularity. by default, TIME/TIME + is not counted into the subprocess of a disabled task. COMMAND: displays the program name. More than that. Next I will introduce some columns that you may use: the number of major page errors (page fault) since the nFLT ('u' key) process was started. To be precise, a page error is caused by a process accessing a page that does not exist in its address space. A "major" page error means that the kernel needs to access the disk to make the page valid. On the contrary, a small page error means that the kernel only needs to allocate pages in memory instead of reading disks. For example, assume that the ABC size of the program is 8 kB and the page size is 4 kB. When the program reads data into the memory, two major page errors (2*4 kB) occur ). The program itself allocates 8 KB space as temporary data. Therefore, there will be two small page errors. If the nFLT is too high, it may mean that the process reads a large amount of resources from the disk. The task is aggressively load some portions of its executable or library from the disk. The process has accessed a page that has been switched to the disk. When the process runs for the first time, it is normal to see a large number of major page errors. During the next running, because the cache has been allocated, you may see "0" times or a small nFLT. However, if a program frequently triggers major page errors, it is likely that the program you have installed is not enough memory for use. The number of dirty pages since the last page was written to the disk. What is a dirty page? First look at the background knowledge. As we all know, Linux uses a cache system, so the data read from the disk is also cached into the memory. The advantage of doing so is that subsequent read operations on the disk block can directly retrieve data from the memory, so the speed is faster. But there is a price. If the content of the buffer is modified, synchronization is required. Therefore, the modified buffer (dirty page) must be written back to the disk. If synchronization fails, data on the disk may be inconsistent. In a system with no heavy load, the value of the heat map is usually less than 10 (approximately estimated) or 0. If your system is usually larger than 10, it is possible that the process is writing a large amount of data to the disk. Disk I/O often cannot keep up with the buffer speed. Disk I/O congestion, so even if the process modifies a small part of the file, it must wait for a while to complete synchronization. Congestion occurs when many processes access disks at the same time and the cache hit rate is low (note: a typical case of FTP service ). Now, (1) is unlikely because I/O speed is getting faster and requires less CPU (the emergence of DMA technology ). Therefore, (2) the probability of appearance is higher. Note: In the 2.6.x kernel, the value of this column is always 0. P ('J' key) the CPU used last time. This column only makes sense in the SMP environment. SMP here refers to hyper-threading, multi-core or multi-CPU architecture. If you only have one CPU (not multi-core, no hyper-threading), this column is always 0. In the SMP system, do not be surprised even if this column is changed several times. This means that the Linux kernel tries to move your process to another CPU with less load. CODE ('R' key) and DATA ('s' key) CODE only reflect the size of your program CODE. DATA reflects your DATA segments (stacks, stacks, variables, the size of the shared library. The unit is KB. DATA shows how much memory your program has allocated. It can also be used to analyze memory leaks. Of course, you need better tools, such as using valgrind to view each memory allocation. If DATA continues to grow, memory leakage is likely to occur. Note: DATA, CODE, SHR, SWAP, VIRT, and RES are measured by page size (4 KB in Intel architecture. The read-only data segment is also included in the CODE size, so sometimes the CODE is larger than the implemented data segment. Size of the memory image of the SWAP ('P' key) process that has been switched. This column is sometimes confusing: logically, you may expect this column to show whether your program is fully exchanged or partially exchanged. But actually not. Even if the "Swap used" column is 0, you can still be surprised to find that the SWAP columns of all processes are greater than 0. Why? This is because the top command uses the following formula: 12 VIRT = SWAP + RES or listen SWAP = VIRT-RES as mentioned earlier, VIRT contains everything in the address space of the process: in memory, those that have been exchanged and have not been read from the disk. RES indicates the total memory size occupied by the process. Therefore, SWAP represents all the data that has been exchanged and the data that has not been read from the disk. Do not be confused by the SWAP name. it represents not only the data that has been exchanged. To display the above columns, press the 'F' key in interactive mode, and then press the corresponding key. Click to display the specified column, and then click to hide the column. To determine which columns are currently displayed, you only need to check the first row of letters (on the right of "Current Fields ). Uppercase letters indicate that the column is displayed, while lowercase letters indicate that the column is hidden. After you select it, press enter. A similar method is used for sorting. Press 'O' (uppercase) and then press the corresponding key. Even if you cannot remember the buttons, the top button is displayed. The new sort key is marked with an asterisk, and the corresponding letters are converted to uppercase, which is intuitive. After the selection, remember to press Enter. E. is multi-view better than single view? In different situations, sometimes we want to monitor different system attributes. For example, you want to monitor the percentage of CPU and the time consumed by all tasks at the same time. In another period of time, you want to monitor the total page faults of resident memory and all tasks. Quickly press the 'F' key and then switch to the interface? I guess that's too unwise. Why not try the multi-view window mode? Press 'A' (uppercase) to switch to the multi-window interface. By default, you will see four different series of field groups. Each field Group has a default label/Name: first field Group: Def second field Group: Job third field Group: Mem fourth field Group: the first field group of Usr is the group you commonly use in a single view window, and the rest of the group will be hidden. Built-in multi-view window mode. all available windows are cycled by 'A' or 'W. Note that the active window (also called the current window) is changed when you switch to another window ). If you are not sure which is an activity window, you only need to look at the first line of top display (on the left of the current time field ). Another way to change the activity window is by pressing 'G' followed by a number (1 to 4 ). The activity window is intended for user input. Therefore, you must select your preferred interface before you start to work. Then, you can do what you like in single window mode. In this case, you generally want to customize the field display, then you only need to press 'F' and then start custom. If you think there are too many fields in the fourth field group, you only need to switch to the field Group and Press '-' to hide it. Note that even if you hide the current field group, it does not mean that you have changed the activity group at the same time. If you press '-' again, the current group will be visible. If you want to operate the multi-view window mode, Press 'A' again. This will also make the activity form a new field Group for the single View window mode. F. "How can I only have a small amount of idle memory on my Linux host ?" Is there the same problem? No matter how much memory you add on the motherboard, you will soon find that the idle memory is greatly reduced. is the idle memory incorrect? No! Before answering this question, check the memory summary displayed on the top of the top command (you may need to press 'M' to display it ). here, you can see two areas: buffers and cached ). "Buffers" indicates the amount of memory used to cache disk blocks. "Cached" is similar to "Buffers", but it only reads cache pages from files. to thoroughly understand this part, we recommend that you read a Linux kernel book, such as Robert M. linux Kernel Development written by Love. This is enough to understand buffers and cached representing the system cache. they will dynamically increase or decrease based on the Linux kernel mechanism. In addition to the consumption of the cache, the program and code also occupy RAM. therefore, the idle memory is displayed as part of RAM that is not cached and occupied by programs/code. In general, you can also consider that the cache area is another part of "idle" RAM, if the program needs more memory, it will reduce the number of processes. you may want to know which region represents the actual memory consumption, VIRT (virtual memory usage) region? Of course not! This area represents everything in the process address space, including the relevant libraries. Read the source code and proc.txt of the topic command (in the Documentation/filesystem folder in the kernel code tree). my conclusion is that the RSS field is the best description of the process memory consumption. I said "the best" is because you can consider that it is an approximation rather than 100% accuracy at all times. G. use several saved configurations to save multiple different configuration files, so that you can easily switch the pre-configuration view? You only need to create a soft connection to the Top binary file to your favorite name: 1 # ln-s/usr/bin/top-a and then run the new "top-a" command ". After the adjustment, type 'W' to save the configuration. it will be saved ~ /. Top-arc (format: Your Top alias + rc ). In this way, you can use the previous view to run the original Top, while top-a uses the second view, and so on. H. There are many tips to use top to make it more efficient. The key is to know what you really need and a general understanding of Linux's low-level principles. Statistics are not always correct, but at least contribute to overall measurement. All these numbers are collected from/proc, so first make sure they are mounted!

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.