Chapter 20: System Performance Tuning

Source: Internet
Author: User

I. topas

1. Subcommands:

a----> Return to the main screen

c----> Switch to the CPU area (cpu)

d----> Switch to the disk area (disk)

n----> Switch to the network area (net)

p----> Switch to the process area (process)

P----> Full-screen process display (uppercase P)

q----> Quit (quit)

2. Monitoring screen analysis

a) CPU area parameters

Kernel: Percentage of CPU time spent executing in kernel mode

User: Percentage of CPU time spent executing in user mode

Wait: Percentage of CPU time spent waiting for I/O

Idle: Percentage of CPU time spent idle

b) Event/Queue area parameters (each field shows a value for the current monitoring interval; after a refresh, the value for the next interval is displayed)

Cswitch: Number of context switches per second

Syscall: Number of system calls executed per second

Reads: Number of read calls executed per second

Writes: Number of write calls executed per second

Forks: Number of fork calls executed per second

Execs: Number of exec calls executed per second

Runqueue: Average number of threads ready to run

Waitqueue: Average number of threads waiting for paging to complete

c) File/Terminal area parameters (each field shows a value for the current monitoring interval; after a refresh, the value for the next interval is displayed)

Readch: Number of bytes read per second through read calls

Writech: Number of bytes written per second through write calls

Rawin: Number of raw bytes read from TTYs per second

Ttyout: Number of bytes written to TTYs per second

Igets: Number of calls per second to the i-node lookup routine

Namei: Number of calls per second to the path lookup routine

Dirblk: Number of directory blocks scanned per second by the directory search routine

d) Paging area parameters (each field shows a value for the current monitoring interval; after a refresh, the value for the next interval is displayed)

Faults: Total number of page faults taken per second

Steals: Number of physical memory frames stolen per second by the virtual memory manager

Pgspin: Number of 4KB pages read in from paging space per second

Pgspout: Number of 4KB pages written to paging space per second

Pagein: Number of pages read in per second, including pages read in from the file system

Pageout: Number of pages written out per second, including pages written to the file system

Sios: Number of I/O requests issued per second by the virtual memory manager

e) Memory area parameters

Real,MB: Size of real memory

Comp: Percentage of real memory allocated to computational page frames

Noncomp: Percentage of real memory allocated to non-computational page frames

Client: Percentage of real memory allocated to caching remotely mounted files

f) Paging space area parameters

Size,MB: Total paging space size

Used: Percentage of all paging space in use

Free: Percentage of all paging space that is free

g) Network interface area parameters (each field shows a value for the current monitoring interval; after a refresh, the value for the next interval is displayed)

Interf: Network interface name

KBPS: Throughput in kilobytes per second

I-pack: Packets received per second

O-pack: Packets sent per second

Kb-in: Kilobytes received per second

Kb-out: Kilobytes sent per second

h) Physical disk area parameters

Disk: Physical disk name

Busy%: Percentage of time the physical disk is busy with read/write operations

KBPS: Total kilobytes read and written per second

TPS: Number of transfers per second to the physical disk; one transfer is one disk I/O, and multiple logical requests can be combined into a single disk I/O

Kb-read: Kilobytes read from the physical disk per second

Kb-write: Kilobytes written to the physical disk per second

i) Process area parameters

Name: Name of the program the process is running

PID: Process ID

CPU%: Percentage of CPU consumed

PGSP: Paging space allocated to the process

Class: Workload Manager (WLM) class the process belongs to

II. sar

sar collects, displays, and saves system activity information, including CPU utilization, memory usage, system calls, file reads and writes, process activity, IPC-related activity, and so on. The sar command essentially calls sadc: when sar runs, a /usr/lib/sa/sadc process runs in the background, and sar converts the data generated by sadc into text format, which is then displayed or saved to a file.

There are two ways to run sar. The first is to pass interval and count arguments to get statistics in real time. The second is to run sar with no arguments, in which case sar analyzes the data in the daily file /var/adm/sa/sadd, where dd is the day of the month; on the 19th, for example, the file is /var/adm/sa/sa19. This file does not exist automatically: the line that enables its creation in /etc/rc is commented out with # by default and must be uncommented.

Once enabled, the daily file is created at system startup, since the /etc/rc script is executed when the system boots (it is invoked from /etc/inittab).

When sar is run without arguments, it first looks for /var/adm/sa/sadd, which is written by /usr/lib/sa/sa1; that script is run by cron. If cron does not run /usr/lib/sa/sa1 each day, sar will report an error when run without arguments.
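As a sketch of how the daily-file name is derived (the day number 19 is a fixed example here; on a live system it would come from `date +%d`):

```shell
# Build the sar daily-file path for a given day of month (illustrative).
dd=19
echo "/var/adm/sa/sa${dd}"
```

This prints `/var/adm/sa/sa19`, matching the example in the text.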

1. Analyze CPU Activity

The four parameters have the same meaning as in topas.

%usr + %sys + %wio + %idle = 100

When %usr + %sys approaches 100%, the CPU is at its limit, and adding CPUs can be considered. If %usr is significantly larger than %sys, user applications are consuming most of the CPU, and application optimization can be considered. A high %wio means the CPU spends too long waiting for disk I/O; insufficient memory can cause paging to become too frequent, which also generates a lot of disk I/O.
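These thresholds can be sketched as a simple check on one sar -u sample; the figures below are hypothetical, not from a real system:

```shell
# Hypothetical sar -u sample values (percentages)
usr=65; sys=30; wio=3; idle=2

# The four fields always sum to 100
echo "total=$((usr + sys + wio + idle))"

# %usr + %sys near 100% suggests the CPU is at its limit
if [ $((usr + sys)) -ge 90 ]; then
  echo "CPU near saturation"
fi
```

With these sample values the script prints `total=100` and flags the CPU as near saturation.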

2. File access statistics (file-access system routines)

iget/s: Number of calls per second to the i-node lookup routine

lookuppn/s: Number of calls per second to the directory (pathname) lookup routine

dirblk/s: Number of directory blocks read per second by the directory search routine

3. System call statistics (system calls)

scall/s: Total system calls per second

sread/s: Total read calls per second

swrit/s: Total write calls per second

fork/s: Total fork calls per second

exec/s: Total exec calls per second

rchar/s: Number of bytes transferred per second by read() calls

wchar/s: Number of bytes transferred per second by write() calls

4. Block device activity (block devices)

device: Block device name

%busy: Percentage of time the device was busy servicing transfer requests

avque: Average number of requests not yet completed while the device was servicing transfer requests

r+w/s: Number of read/write transfers per second to the device

blks/s: Number of blocks transferred per second

avwait: Average time transfer requests spend waiting idly in the queue

avserv: Average time required to complete a transfer request

If %busy > 50% or avwait > avserv, there may be a disk I/O bottleneck.
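This rule of thumb can be expressed as a quick check; the sample values are hypothetical, not from a real sar -d report:

```shell
# Hypothetical sar -d sample values for one device
busy=62        # %busy
avwait=12      # average wait time in the queue
avserv=8       # average time to service a transfer

# Rule of thumb from the text: %busy > 50 or avwait > avserv
if [ "$busy" -gt 50 ] || [ "$avwait" -gt "$avserv" ]; then
  echo "possible disk I/O bottleneck"
fi
```

Here both conditions hold, so the check prints the bottleneck warning.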

5. Queue activity statistics (queue statistics)

runq-sz: Average number of kernel threads in the run queue

%runocc: Percentage of time the run queue was occupied

swpq-sz: Average number of kernel threads waiting to be paged in, i.e. the swap queue size

%swpocc: Percentage of time the swap queue was occupied

runq-sz < 4 and swpq-sz < 5 indicate a healthy situation.
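The guideline can be sketched as a check against one pair of sample averages (values hypothetical):

```shell
# Hypothetical sar -q averages
runq_sz=3
swpq_sz=2

# Guideline from the text: runq-sz < 4 and swpq-sz < 5 is healthy
if [ "$runq_sz" -lt 4 ] && [ "$swpq_sz" -lt 5 ]; then
  echo "queue lengths healthy"
fi
```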

6. Paging statistics (paging statistics)

slots: Number of free pages in paging space

cycle/s: Number of page-replacement cycles per second

fault/s: Number of page faults per second

odio/s: Number of non-paging disk I/Os per second

7. System table status statistics

# sar -v 2 6

8. TTY device activity (tty devices)

9. Buffer usage statistics (buffers)

10. Kernel process activity

# sar -k 2 6

11. Message and semaphore activity

# sar -m 2 6

12. Swapping activity

# sar -w 2 6

III. vmstat

vmstat primarily reports on virtual memory activity, but it also reports statistics on kernel threads, physical disks, traps, and CPU activity.

1. Memory

avm: Number of active virtual memory pages, i.e. the total number of virtual memory pages allocated in paging space. A high value does not necessarily mean the system is performing poorly.

fre: Number of free memory pages in RAM. The system maintains a buffer of memory pages called the free list; when the virtual memory manager (VMM) needs space, it allocates it from the free list.

Description

1) Dividing avm by 256 gives the system-wide allocated paging space in megabytes (pages are 4KB, so 256 pages = 1MB)

2) # lsps -a displays information for each paging space

3) It is recommended that the system allocate enough paging space that its utilization never reaches 100%

4) When fewer than 128 unallocated pages remain in paging space, some processes are killed to free paging space.

5) The minimum number of pages the VMM keeps on the free list is determined by the minfree parameter, which can be modified with the vmtune command, provided the bos.adt.samples fileset is installed (vmtune is an older command; vmo is now used instead)
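Point 1 above is quick shell arithmetic; the avm value below is hypothetical:

```shell
# avm is reported in 4KB pages, so 256 pages = 1MB
avm=262144               # hypothetical value from vmstat
echo "allocated paging space: $((avm / 256)) MB"
```

With this sample value the script reports 1024 MB of allocated paging space.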

2. Page

re: (deprecated parameter)

pi: Number of pages paged in from paging space per second

po: Number of pages paged out to paging space per second

fr: Number of pages freed per second

sr: Number of pages scanned per second by the page-replacement algorithm

cy: Number of complete scans per second of the page frame table by the clock algorithm

Description

1) If pi and po are not consistently 0, paging activity is too frequent, which greatly reduces system performance; this usually points to a memory bottleneck.

2) If the pi:po ratio is greater than or equal to 1, every page-out is matched by at least one page-in, so paging activity is very frequent and the paging rate is high

3) If the fr:sr ratio is too high, memory is overcommitted; an fr:sr of 1:4 means that for every page freed, 4 pages had to be scanned.
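The fr:sr relationship in point 3 can be computed directly; the sample rates are hypothetical:

```shell
# Hypothetical vmstat sample: pages freed (fr) and scanned (sr) per second
fr=120
sr=480

# An fr:sr of 1:4 means 4 pages are scanned for every page freed
echo "pages scanned per page freed: $((sr / fr))"
```

With these values the ratio is 4, i.e. the 1:4 case described above.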

3. Faults

in: Number of device interrupts

sy: Number of system calls

cs: Number of kernel thread context switches

4. CPU

r: Number of kernel threads on the run queue

b: Number of kernel threads on the wait queue, blocked waiting for resources or I/O

us: Percentage of time the CPU spends in user mode

sy: Percentage of time the CPU spends in system mode

id: Percentage of time the CPU is idle

wa: Percentage of time the CPU is idle while waiting for disk I/O

Description

1) If id is always 0, the CPU is continuously busy

2) As the run queue grows, user response time increases; if r is consistently non-zero, the CPU has more work queued than it can handle

3) A combined user plus system CPU percentage close to 100% may indicate a CPU bottleneck

4) If wa exceeds 40%, the disk subsystem is unbalanced, possibly caused by a disk-intensive workload.

IV. iostat

iostat reports statistics on CPU, terminal I/O, and disk I/O that help determine the I/O load on an individual component, such as a hard disk.

1. TTY and CPU statistics (with the -t flag, only TTY and CPU statistics are reported)

tin: Number of characters received from terminals per second

tout: Number of characters sent to terminals per second

%user: Percentage of CPU consumed executing user programs

%sys: Percentage of CPU consumed executing kernel code

%idle: Percentage of time the CPU is idle

%iowait: Percentage of time the CPU is idle while waiting for disk I/O

A relatively high %iowait indicates a disk I/O bottleneck.

There are several common ways to relieve such bottlenecks:

1) Do not place multiple active logical volumes and file systems on the same physical disk; spread the I/O load evenly across multiple physical disks

2) Spread a logical volume across multiple physical disks to allow concurrent access

3) Create multiple JFS logs in a volume group and assign them to specific file systems; this helps when creating, deleting, or modifying large numbers of files, especially temporary files. (When creating additional JFS or JFS2 logs, remember to format them with logform)

4) Reduce fragmentation by backing up and restoring file systems; fragmentation increases response time because the drive must reposition its heads frequently

5) Add drives to balance the load on the existing I/O subsystem

2. Disk I/O statistics

Disks: Name of the physical volume

%tm_act: Percentage of time the physical disk was active. A drive is active while transferring data or processing commands.

Kbps: Kilobytes of data transferred by the drive per second

tps: Number of transfers per second issued to the physical disk

Kb_read: Kilobytes read from the physical volume during the reporting interval

Kb_wrtn: Kilobytes written to the physical volume during the reporting interval

Description

1) Disk utilization (%tm_act) is proportional to resource contention and inversely related to I/O performance: as disk utilization rises, I/O performance falls and response time increases.

2) The more drives in the system, the better the I/O performance

3) Find the busy drives (relative to idle ones) and move data from busy disks to less busy disks; this can relieve disk bottlenecks

4) Check paging activity, because page-ins and page-outs also add I/O load; move paging space from busy disks to less busy disks.
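The relationship between the interval counters above and the Kbps rate can be sketched as follows; the interval totals are hypothetical:

```shell
# Hypothetical iostat interval totals for one disk
kb_read=2048    # Kb_read over the interval
kb_wrtn=1024    # Kb_wrtn over the interval
interval=2      # reporting interval in seconds

# Kbps is the total kilobytes transferred per second
echo "Kbps=$(((kb_read + kb_wrtn) / interval))"
```

With these values the disk moved 3072 KB in 2 seconds, i.e. Kbps=1536.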

V. vmo

See the separate notes on vmo.

VI. Network performance analysis

Description

When an application sends data to an application on a remote host, it first writes the data to a local socket; that is, the data is copied from the application's buffer into the local socket send buffer.

Once the data is in the socket buffer, it forms an mbuf or cluster chain, and the socket hands it to TCP or UDP to send. The data may be larger than the MTU allows: for data sent over TCP, it is divided into segments at the transport layer (the segment is TCP's unit of data), while for UDP, dividing the data is left to the network (IP) layer.

Similarly, the receiving end has a receive queue that works in the reverse direction. Note that the send and receive buffer sizes are limited by udp_sendspace and tcp_sendspace (and udp_recvspace and tcp_recvspace, respectively).

After data arrives at a host, it passes through the device driver layer and is routed through the interface layer to the IP input queue at the IP layer. The size of the IP input queue is controlled by the ipqmaxlen parameter, which can be changed with the no command. If a packet is lost or corrupted in transit, the time the IP layer waits for the missing fragment is controlled by the ipfragttl parameter.

AIX allocates virtual memory for various TCP/IP networking tasks. The memory management facility used by the network subsystem is called the mbuf; mbufs are primarily used to store data received from and sent to the network. The mbuf pool can be configured while AIX is running. The following system parameters can be tuned:

1) thewall: Kernel variable setting the maximum amount of real memory that can be allocated to the mbuf pool

2) tcp_sendspace: Sets the default socket send buffer size. Default 16384

3) udp_sendspace: Sets the maximum buffer space a single UDP socket can use for outgoing data. Default 9216

4) udp_recvspace: Sets the maximum receive buffer space for any single UDP socket. Default 41920

5) rfc1323: When non-zero, allows the TCP window size to use 32 bits instead of 16, so both tcp_sendspace and tcp_recvspace can be set above 64KB

6) sb_max: Controls the upper limit on all socket buffer sizes

7) ipqmaxlen: Kernel variable controlling the length of the IP input queue; the default is 100 packets. This is sufficient for a single network device; if the IP input queue is too short, packets are discarded

—————— All of these parameters can be modified with the no command
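A sketch of how these tunables might be adjusted with no (AIX only; the values are illustrative examples, not recommendations, and the snippet guards against systems where no is absent):

```shell
# AIX-only sketch: query and set network tunables with the `no` command.
if command -v no >/dev/null 2>&1; then
  no -o tcp_sendspace          # query the current value
  no -o tcp_sendspace=65536    # example: raise the send buffer
  no -o rfc1323=1              # example: enable TCP window scaling
else
  echo "'no' command not available (not AIX)"
fi
```

On a non-AIX system the guard branch simply reports that no is unavailable.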
