For Linux performance monitoring, the following are the commonly used tools (a list via Vpsee):
A Brief Introduction to the Tools
top: view process activity and overall system status
vmstat: view system status, hardware and system information, etc.
iostat: view CPU load and disk status
sar: an integrated tool for viewing system status
mpstat: view multiprocessor status
netstat: view network status
iptraf: real-time network monitoring
tcpdump: capture network packets for detailed analysis
tcptrace: packet analysis tool
netperf: network bandwidth measurement tool
dstat: an integrated tool that combines the information of vmstat, iostat, ifstat, netstat, and more
Linux Performance Optimization (Part 1): CPU
Objective
What is performance optimization? Personally, I think performance optimization means improving the capability of an application or system. So how do you tune the performance of your application? A lot is involved here, including the Linux kernel, the CPU architecture, how the kernel allocates and manages resources, how processes are created, and so on. Because that material would take far too much space, these articles do not cover it in depth. In the next few articles the goal is to explain how to find the source of an application failure, which is a skill every systems engineer needs. Without further ado, let's get straight to the subject.
Common Terminology
Latency: the time spent waiting for a result to be returned after an operation is issued. In some cases it can refer to the entire operation time, equivalent to the response time.
IOPS: the number of input/output operations per second, a measure of data transfer. For disk reads and writes, IOPS is the number of read and write operations per second.
Response time: the time for an operation to complete, including the time spent waiting and being serviced, and the time to return the result.
Usage: for resources that service requests, usage describes how busy the resource is within a given time interval. For capacity resources such as storage, usage refers to the amount of capacity consumed.
Saturation: the degree to which a resource cannot service its workload, so that work queues up.
Throughput: the rate at which work is performed. In data transfer in particular, the term is used for the transfer rate (bytes/sec or bits/sec). In some cases throughput refers to operations per second.
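To make the terms above concrete, here is a minimal sketch (my own illustration, not from the original article) that measures average latency and throughput of a repeated operation with Python's `time.perf_counter`:

```python
import time

def measure(op, n):
    """Run op() n times; return (average latency in seconds, throughput in ops/sec)."""
    start = time.perf_counter()
    for _ in range(n):
        op()
    elapsed = time.perf_counter() - start
    return elapsed / n, n / elapsed

# Example operation: a small computation standing in for real work.
latency, throughput = measure(lambda: sum(range(1000)), 10_000)
print(f"avg latency: {latency * 1e6:.1f} us, throughput: {throughput:.0f} ops/s")
```

Note that latency and throughput are reciprocal only for strictly serial work; with concurrency, throughput can exceed 1/latency.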
Linux Kernel Features
CPU scheduling: various advanced CPU scheduling algorithms, and support for non-uniform memory access (NUMA);
I/O scheduling: I/O scheduling algorithms, including deadline, anticipatory, and the Completely Fair Queuing (CFQ) scheduler;
TCP congestion control: TCP congestion-control algorithms, selectable on demand;
Questions
What is the difference between a process, a thread, and a task?
A process is usually defined as the execution of a program: the environment used to run a user-level program. It includes a memory address space, file descriptors, thread stacks, and registers.
A thread is an independently running unit of execution within a process; in other words, threads live inside processes.
A task is a unit of work that a program performs; it may be a process or a thread.
Reference link: http://blog.chinaunix.net/uid-25100840-id-271078.html
What is a context switch?
When a section of program code executes to accomplish some function, the CPU needs the related resources to be in place as well, such as the graphics card, memory, and other devices; only then does the CPU begin executing. Everything apart from the CPU itself constitutes the program's execution environment, which we define as the program context. When the program finishes, or the CPU time allocated to it runs out, it is switched out to wait for its next turn on the CPU. The last step of being switched out is saving the program context, because this is the CPU environment for its next run and must be preserved.
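On Unix-like systems you can observe context switches from inside a process. The sketch below (my own illustration; it relies on the Unix-only `resource` module) reads the voluntary and involuntary context-switch counters before and after some blocking and CPU work:

```python
import time
import resource

# ru_nvcsw: voluntary context switches (the process gave up the CPU, e.g. by blocking);
# ru_nivcsw: involuntary switches (the kernel preempted the process).
before = resource.getrusage(resource.RUSAGE_SELF)

time.sleep(0.1)        # blocking sleep: forces at least one voluntary switch
sum(range(1_000_000))  # CPU work: may trigger involuntary switches

after = resource.getrusage(resource.RUSAGE_SELF)
print("voluntary switches:  ", after.ru_nvcsw - before.ru_nvcsw)
print("involuntary switches:", after.ru_nivcsw - before.ru_nivcsw)
```

The same counters are visible system-wide in `vmstat`'s `cs` column.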
What is the difference between I/O-intensive and CPU-intensive workloads?
I/O-intensive means that the system's CPU is much faster than its disk/memory. While such a workload runs, the CPU spends most of its time waiting for I/O (disk/memory) reads and writes to complete, so the CPU load is not high. CPU-intensive means the opposite: the disk/memory easily keeps up with the CPU's demands, so while such a workload runs the CPU load sits near 100%. When the CPU does issue I/O (disk/memory) reads and writes, they complete in a very short time, but the CPU still has many computations to perform, so the CPU load stays high. In general, a CPU-intensive program has a high CPU occupancy rate and spends most of its time on computation, logic, and other CPU work.
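The distinction shows up clearly if you compare CPU time against wall-clock time. The sketch below (my own illustration) uses `time.process_time` (CPU time consumed by this process) versus `time.perf_counter` (wall clock): a CPU-bound task has CPU time close to wall time, while an I/O-bound task (simulated here with a sleep) has CPU time far below wall time:

```python
import time

def cpu_bound():
    # Busy computation: the CPU stays near 100% for the duration.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total

def io_bound():
    # Simulated I/O wait: the thread sleeps, so CPU usage stays low.
    time.sleep(0.2)

for name, fn in [("cpu-bound", cpu_bound), ("io-bound", io_bound)]:
    c0 = time.process_time()
    w0 = time.perf_counter()
    fn()
    print(f"{name}: cpu={time.process_time() - c0:.3f}s "
          f"wall={time.perf_counter() - w0:.3f}s")
```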
Application Performance Techniques
1. Choosing an I/O Size
The overhead of performing I/O includes initializing buffers, making system calls, context switching, allocating kernel metadata, checking process permissions and limits, mapping addresses to devices, executing kernel and driver code to carry out the I/O, and finally freeing metadata and buffers. Increasing the I/O size is a common strategy for increasing application throughput.
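A minimal sketch of the idea (my own illustration): reading the same file with a small versus a large I/O size. With an unbuffered file object, each `read` is roughly one system call, so the larger I/O size amortizes the per-call overhead across far fewer calls:

```python
import os
import tempfile

# Create a 4 MB test file.
data = os.urandom(4 * 1024 * 1024)
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(data)
tmp.close()

def read_all(chunk_size):
    """Read the whole file in chunk_size pieces; return the number of read calls."""
    calls = 0
    with open(tmp.name, "rb", buffering=0) as f:  # unbuffered: ~1 syscall per read
        while f.read(chunk_size):
            calls += 1
    return calls

small = read_all(512)          # many small reads -> high per-call overhead
large = read_all(256 * 1024)   # few large reads -> overhead amortized
print(f"512 B reads: {small} calls; 256 KB reads: {large} calls")
os.remove(tmp.name)
```

The flip side, not shown here, is that oversized I/O wastes time and memory when the application only needs a small part of the data.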
2. Caching
The operating system uses caching to improve file-system read performance and memory-allocation performance, and applications use caching for similar reasons: keep the results of frequently performed operations in a local cache for later use, rather than repeating the expensive operation every time.
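A minimal application-level caching sketch (my own illustration) using Python's built-in `functools.lru_cache`: the first call pays the full cost of the "expensive" operation, while repeated calls with the same argument are served from the cache:

```python
import functools
import time

@functools.lru_cache(maxsize=None)
def expensive(n):
    time.sleep(0.05)  # stand-in for a slow computation or I/O operation
    return n * n

t0 = time.perf_counter()
expensive(10)                       # cache miss: pays the full cost
miss = time.perf_counter() - t0

t0 = time.perf_counter()
expensive(10)                       # cache hit: served from memory
hit = time.perf_counter() - t0
print(f"miss: {miss:.3f}s, hit: {hit:.6f}s")
```

In a real application the hard part is cache invalidation: deciding when a cached result is stale.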
3. Buffering
To improve write performance, data is coalesced in a buffer before being sent to the next level. This increases write latency: the first write into the buffer has to wait for subsequent writes before anything is sent.
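A minimal sketch of write buffering (my own illustration) with Python's `io` module: 100 small writes land in a `BufferedWriter` and reach the underlying "device" (a `BytesIO` here) only when the buffer is flushed, i.e. as one large coalesced write:

```python
import io

raw = io.BytesIO()                          # stands in for a slow device
buffered = io.BufferedWriter(raw, buffer_size=4096)

for _ in range(100):
    buffered.write(b"x" * 10)               # 100 small 10-byte writes

before_flush = len(raw.getvalue())          # nothing sent yet: still in the buffer
buffered.flush()                            # coalesced data sent in one large write
after_flush = len(raw.getvalue())
print(f"bytes at device before flush: {before_flush}, after flush: {after_flush}")
```

This is exactly the latency/throughput trade-off described above: the early writes sit in the buffer until the flush.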
4. Concurrency and parallelism
Parallelism: the ability to load and execute multiple runnable programs at the same time (for example, answering the phone and eating simultaneously). To take advantage of multiprocessor systems, an application needs to run on multiple CPUs at the same time; this is called parallelism. Applications achieve it through multiple processes or multiple threads.
Concurrency: the ability to handle more than one task, not necessarily at the same instant (for example, eating after answering the phone); tasks contend for resources.
Synchronization primitives: synchronization primitives regulate access to memory and can introduce waiting (latency) when access is not allowed. Common types include:
Mutex lock: only the lock holder may operate; other threads block, giving up the CPU while they wait;
Spin lock: the lock holder may operate, while other threads that need the lock spin in a loop on the CPU, checking whether the lock has been released. This can provide low-latency access, because a waiting thread never leaves the CPU and is ready to run the instant the lock becomes available, but the spinning and waiting also waste CPU resources.
Read/write lock: guarantees data integrity by allowing multiple readers, or a single writer with no readers.
Adaptive spin lock: low-latency access without wasting CPU resources; a hybrid of the mutex lock and the spin lock.
5. CPU Binding
Binding a process or thread to a single CPU (or a set of CPUs) can improve performance by keeping its data warm in that CPU's caches and improving memory locality.
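A minimal sketch of CPU binding (my own illustration; `os.sched_getaffinity`/`os.sched_setaffinity` are Linux-only) that pins the current process to one CPU and then restores the original mask:

```python
import os

# Linux-only: query and set this process's CPU affinity mask (0 = calling process).
allowed = os.sched_getaffinity(0)
print("allowed CPUs:", sorted(allowed))

cpu = min(allowed)
os.sched_setaffinity(0, {cpu})             # pin the process to a single CPU
print("now pinned to:", sorted(os.sched_getaffinity(0)))

os.sched_setaffinity(0, allowed)           # restore the original affinity mask
```

From the shell, the equivalent is `taskset -c 0 command` to launch a command pinned to CPU 0.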
CPU Performance Analysis
uptime
System load is calculated by summing the number of running threads and the number of threads waiting to run; the three values reflect the average load over the last 1, 5, and 15 minutes respectively. On Linux the load average is no longer a pure indicator of CPU headroom or saturation, so you cannot infer whether the load is CPU or disk from this value alone.
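The numbers `uptime` reports come from the kernel. The sketch below (my own illustration; `/proc/loadavg` is Linux-specific) reads them directly: the first three fields are the 1/5/15-minute load averages, and the fourth is runnable tasks over total tasks:

```python
# Linux-only: /proc/loadavg holds the 1/5/15-minute load averages,
# the runnable/total task counts, and the most recent PID.
with open("/proc/loadavg") as f:
    fields = f.read().split()

load1, load5, load15 = (float(x) for x in fields[:3])
runnable, total = fields[3].split("/")
print(f"load: {load1} (1m) {load5} (5m) {load15} (15m); "
      f"runnable {runnable} of {total} tasks")
```

The portable equivalent in Python is `os.getloadavg()`, which returns the same three averages.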
vmstat
The virtual memory statistics command. Its last few columns print the system-wide CPU usage, and the first column shows the number of runnable processes. For example:
[root@zbredis-30104 ~]# vmstat
procs -----------memory---------- ---swap-- -----io---- --system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
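The `us`/`sy`/`id`/`wa` columns that vmstat prints are derived from per-state CPU tick counters in `/proc/stat`. The sketch below (my own illustration; Linux-specific) reads the aggregate `cpu` line and prints each state as a percentage of the total ticks since boot:

```python
# Linux-only: the first "cpu" line of /proc/stat holds cumulative tick counts
# (in units of USER_HZ) per CPU state; vmstat's us/sy/id/wa come from deltas
# of these counters sampled over an interval.
with open("/proc/stat") as f:
    parts = f.readline().split()

names = ["user", "nice", "system", "idle", "iowait", "irq", "softirq"]
ticks = dict(zip(names, (int(x) for x in parts[1:8])))
total = sum(ticks.values())
for name, value in ticks.items():
    print(f"{name:8s} {100 * value / total:5.1f}%")
```

Note these are since-boot totals; tools like vmstat sample twice and report the difference, which is why the first line of `vmstat 1` output is often misleading.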