(1) free
A common question is:
How much memory is being used by my applications, servers, users, and system processes? Or:
How much memory is available right now? If running processes need more memory than the available RAM, the kernel moves some of their pages to swap space.
So a follow-up question is:
How much swap space is being used?
The free command answers all of these questions. A very useful option, -m, displays the values in MB:
[root@Think ~]# free -m
             total       used       free     shared    buffers     cached
Mem:          1011        991         19          0         59        661
-/+ buffers/cache:        270        740
Swap:            0          0          0
The output above shows that the system has 1011 MB of RAM, of which 991 MB is in use and 19 MB is free.
The second line (-/+ buffers/cache) shows the used and free figures adjusted for the buffers and cache held in physical memory.
The third line shows swap utilization.
To display the same information in KB or GB, replace the -m option with -k or -g. Use the -b option for bytes.
[root@Think ~]# free -b
             total       used       free     shared    buffers     cached
Mem:    1060110336 1039556608   20553728          0   62877696  692731904
-/+ buffers/cache:  283947008  776163328
Swap:            0          0          0
The -t option displays totals at the bottom of the output (the sum of physical memory and swap):
[root@Think ~]# free -m -t
             total       used       free     shared    buffers     cached
Mem:          1011        991         19          0         60        660
-/+ buffers/cache:        270        740
Swap:            0          0          0
Total:        1011        991         19
Although free does not show percentages, we can extract and format specific parts of the output.
Example: Percentage of memory in use
[root@Think ~]# free -m | grep Mem | awk '{print ($3 / $2)*100}'
97.9228
This value is important: you may want to trigger an alert when the percentage of available memory falls below a specific threshold.
Similarly, you can compute the percentage of swap in use:
[root@Think ~]# free -m | grep -i Swap | awk '{print ($3 / $2)*100}'
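The two one-liners above can be folded into a small helper. The following is a minimal sketch, not a definitive implementation: the function names pct_used and over_threshold are made up for this illustration, and it assumes the column layout shown in the free output above (total in the second field, used in the third). It also guards against a swap total of 0, as on the sample system, where the plain awk expression would divide by zero.

```shell
# pct_used LABEL: read `free -m` style output on stdin and print the
# percentage used for the row whose first field is LABEL ("Mem:" or "Swap:").
# Prints "n/a" when the total is 0, so nothing is divided by zero.
pct_used() {
    awk -v label="$1" '
        $1 == label {
            if ($2 == 0) { print "n/a" }
            else         { printf "%.1f\n", ($3 / $2) * 100 }
        }'
}

# over_threshold PCT LIMIT: succeed (exit status 0) only when PCT is a
# number greater than LIMIT; "n/a" never triggers an alert.
over_threshold() {
    awk -v pct="$1" -v limit="$2" '
        BEGIN {
            if (pct ~ /^[0-9.]+$/ && pct + 0 > limit + 0) exit 0
            exit 1
        }'
}
```

For example, over_threshold "$(free -m | pct_used Mem:)" 90 && echo "memory use above 90%" would print a warning once usage passes an assumed limit of 90%.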
You can also use free to gauge the memory load imposed by an application.
For example, check the available memory before starting the backup application, and check it again immediately after startup.
The difference between the two readings is the memory consumed by the backup application.
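A minimal sketch of that before-and-after comparison follows. It assumes the older free layout shown earlier, which includes a "-/+ buffers/cache" row; newer versions of free omit that row, so the parsing would need to change there. The function names avail_mb and mem_delta are hypothetical.

```shell
# avail_mb: read `free -m` output on stdin and print the "free" figure from
# the "-/+ buffers/cache" row (memory available once cache is discounted).
avail_mb() {
    awk '$1 == "-/+" { print $4 }'
}

# mem_delta BEFORE AFTER: print how many MB of available memory went away
# between two readings.
mem_delta() {
    echo $(( $1 - $2 ))
}
```

In practice you would capture before=$(free -m | avail_mb), start the application, capture after=$(free -m | avail_mb), and then run mem_delta "$before" "$after". The result is only a rough estimate, since other processes may allocate or release memory between the two readings.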
Oracle user usage
So how do you use this command to manage a Linux server running an Oracle environment?
One of the most common causes of performance problems is insufficient memory, which forces the system to temporarily swap areas of memory out to disk.
A certain amount of swapping may be unavoidable, but excessive swapping indicates that there is not enough available memory.
You can use free to get the current memory picture, and then use the sar command (described later) to check the historical trend of memory and swap consumption.
If swap usage is temporary, it may just be a one-time spike, but if it persists for some time, it needs attention.
Persistent memory overload has several obvious possible causes:
● The SGA is larger than the available memory.
● A large amount of memory is being allocated in the PGA.
● Some process has a memory leak.
In the first case, make sure that the SGA is smaller than the available memory. As a rule of thumb, the SGA should use about 40% of physical memory. Of course, this figure should be set according to your actual conditions.
In the second case, try to reduce large buffer allocations in queries.
In the third case, use the ps command to identify the specific process that may be leaking memory.
(2) ipcs
When a process runs, it acquires shared memory.
A process may have one or more shared memory segments.
Processes also send messages to each other (message queues) and use semaphores.
To display information about shared memory segments, IPC message queues, and semaphores, run the following command:
ipcs
The -m option is very popular. It displays shared memory segments.
[root@Think ~]# ipcs -m
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x7402f3d8 4620288    root       600        4          0
0x00000000 4980737    root       644        52         2
0x7402f3d7 4587522    root       600        4          0
0x00000000 5013507    root       644        16384      2
0x00000000 5046276    root       644        268        2
0x00000000 5111813    root       600        393216     2          dest
0x00000000 5144582    root       600        393216     2          dest
0x00000000 5177351    root       600        393216     2          dest
0x00000000 5439503    root       600        393216     2          dest
0x00000000 5472272    root       600        393216     2          dest
0xbe3bb918 5505041    oracle     640        419438592  20
The output shows that the server is running Oracle software, and lists the various shared memory segments.
Each shared memory segment is uniquely identified by the shared memory ID shown in the "shmid" column (you will see how to use this value later).
The "owner" column shows the owner of the segment, the "perms" column shows its permissions, and "bytes" shows its size in bytes.
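To see at a glance how much shared memory each owner holds, the "owner" and "bytes" columns can be summed. This is a minimal sketch under the assumption that the ipcs -m output has the seven-column layout shown above; the function name shm_bytes_by_owner is hypothetical.

```shell
# shm_bytes_by_owner: read `ipcs -m` output on stdin and print one line per
# owner with the total size in bytes of that owner's segments.
# Data rows are recognised by their hexadecimal key (0x...) in field 1;
# field 3 is the owner and field 5 is the segment size in bytes.
shm_bytes_by_owner() {
    awk '
        $1 ~ /^0x/ { total[$3] += $5 }
        END { for (o in total) printf "%s %.0f\n", o, total[o] }
    ' | sort
}
```

Running ipcs -m | shm_bytes_by_owner prints lines of the form "owner bytes", sorted by owner name.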
The -u option displays a quick summary:
[root@Think ~]# ipcs -mu
------ Shared Memory Status --------
segments allocated 18
pages allocated 103562
pages resident  36482
pages swapped   0
Swap performance: 0 attempts     0 successes
The -l option displays the limits (as opposed to the current values):
[root@Think ~]# ipcs -ml
------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 524288
max total shared memory (kbytes) = 8388608
min seg size (bytes) = 1
If you see that the current values are at or near the limits, you should consider raising the limits.
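One way to check how close the system is to a limit is to compare the largest existing segment with the maximum segment size. The sketch below assumes the ipcs -m and ipcs -ml output formats shown above (other versions of ipcs may word the limit lines differently); the function names max_seg_kb and largest_seg_kb are hypothetical.

```shell
# max_seg_kb: read `ipcs -ml` output on stdin and print the maximum
# segment size in KB (the value after " = " on the "max seg size" line).
max_seg_kb() {
    awk -F' = ' '/^max seg size/ { print $2 }'
}

# largest_seg_kb: read `ipcs -m` output on stdin and print the size of the
# largest segment, converted from bytes to KB (rounded to the nearest KB).
largest_seg_kb() {
    awk '
        $1 ~ /^0x/ && $5 + 0 > max { max = $5 + 0 }
        END { printf "%.0f\n", max / 1024 }'
}
```

With the sample output above, the largest segment (419438592 bytes, owned by oracle) is 409608 KB against a limit of 524288 KB.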
You can use the shmid value to get a detailed snapshot of a specific shared memory segment. The -i option does this.
The following describes how to view details of shmid 5505041:
[root@Think ~]# ipcs -m -i 5505041
Shared memory Segment shmid=5505041
uid=501 gid=502 cuid=501 cgid=502
mode=0640 access_perms=0640
bytes=419438592 lpid=10881 cpid=5300 nattch=20
att_time=Sun Feb  3 20:58:28 2013
det_time=Sun Feb  3 20:58:28 2013
change_time=Sun Feb  3 09:08:06 2013
Later, this article uses an example to show how to interpret the output above.
The -s option displays the semaphores in the system:
[root@Think ~]# ipcs -s
------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x000000a7 0          root       600        1
0xf5d4b884 131073     oracle     640        154
This output contains some valuable data. It shows that the semaphore array with ID 0 has one semaphore, and the other array has 154 semaphores.
If you add them up, the total must stay below the upper limit defined by the kernel parameter (semmax; on Linux this system-wide limit is SEMMNS, the second value of kernel.sem).
When Oracle software is installed, the pre-installation checker verifies this setting.
Later, once the system reaches a steady state, you can check the actual utilization and adjust the kernel value accordingly.
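Adding up the semaphores can be scripted. This is a minimal sketch that assumes the five-column ipcs -s layout shown above; the function name total_sems is hypothetical.

```shell
# total_sems: read `ipcs -s` output on stdin and print the total number of
# semaphores across all arrays. Data rows start with a hexadecimal key
# (0x...); field 5 (nsems) is the number of semaphores in that array.
total_sems() {
    awk '$1 ~ /^0x/ { n += $5 } END { print n + 0 }'
}
```

The result of ipcs -s | total_sems can then be compared with the system-wide limit. On Linux that limit (SEMMNS) is the second field of /proc/sys/kernel/sem, which awk '{ print $2 }' /proc/sys/kernel/sem prints.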
Oracle user usage
How can I view the shared memory segments used by Oracle database instances?
To do so, use the oradebug command:
sys@ORCL> oradebug setmypid
Statement processed.
sys@ORCL> oradebug ipc
Information written to trace file.
sys@ORCL> oradebug TRACEFILE_NAME
/u01/app/oracle/admin/orcl/udump/orcl_ora_7525.trc
Now open the trace file and you will see the shared memory ID (5505041).
The following is an excerpt from this file.
Area #0 `Fixed Size' containing Subareas 0-0
  Total size 0000000000129cb0 Minimum Subarea size 00000000
   Area  Subarea    Shmid      Stable Addr      Actual Addr
      0        0  5505041 0x00000020000000 0x00000020000000
                              Subarea size     Segment size
                          000000000012a000 0000000019002000
Area #1 `Variable Size' containing Subareas 2-2
  Total size 0000000018c00000 Minimum Subarea size 00400000
   Area  Subarea    Shmid      Stable Addr      Actual Addr
      1        2  5505041 0x00000020400000 0x00000020400000
                              Subarea size     Segment size
                          0000000018c00000 0000000019002000
Area #2 `Redo Buffers' containing Subareas 1-1
  Total size 00000000002d6000 Minimum Subarea size 00000000
   Area  Subarea    Shmid      Stable Addr      Actual Addr
      2        1  5505041 0x0000002012a000 0x0000002012a000
                              Subarea size     Segment size
                          00000000002d6000 0000000019002000
You can use the shared memory ID to get detailed information about the shared memory segment.
To do so, use the ipcs -m -i 5505041 command shown earlier.
Another useful observation is the value of lpid, the PID of the last process that touched the shared memory segment.
To demonstrate the value of this attribute, use SQL*Plus to connect to the instance from another session.
[oracle@Think ~]$ sqlplus / as sysdba
sys@ORCL> select spid from v$process where addr = (select paddr from v$session where sid = (select sid from v$mystat where rownum<2));
SPID
------------
11439
Now, run the ipcs command again for the same shared memory segment.
[root@Think ~]# ipcs -m -i 5505041
Shared memory Segment shmid=5505041
uid=501 gid=502 cuid=501 cgid=502
mode=0640 access_perms=0640
bytes=419438592 lpid=11476 cpid=5300 nattch=20
att_time=Sun Feb  3 21:25:31 2013
det_time=Sun Feb  3 21:25:31 2013
change_time=Sun Feb  3 09:08:06 2013
Note that the value of lpid has changed from 10881 to 11476.
lpid shows the PID of the last process that touched the shared memory segment.
(3) ipcrm
Now that you have identified shared memory and other IPC metrics, what can you do with them?
You have already seen some uses, such as identifying the shared memory used by Oracle and making sure the kernel parameters for shared memory are set correctly.
Another common use is to remove shared memory segments, IPC message queues, or semaphore arrays.
To remove a shared memory segment, note its shmid in the ipcs output, then use the -m option to remove it. To remove the segment with ID 3735562, use:
[root@Think ~]# ipcrm -m 3735562
ipcrm: already removed id (3735562)
This removes the shared memory segment. You can also use this command to remove semaphore arrays and IPC message queues (with the -s and -q options).
Oracle user usage
Sometimes, when you shut down a database instance, the Linux kernel may not completely clean up its shared memory segments.
The leftover shared memory is of no use, but it still occupies system resources, leaving less memory available for other processes.
In this case, check for any lingering shared memory segments owned by the oracle user and, if any exist, remove them with ipcrm.
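The check can be sketched as a filter over ipcs -m output. This is only an illustration, with assumptions stated plainly: the function name stale_shm_cmds is hypothetical, a segment is treated as lingering when its nattch column is 0, and the column layout is the one shown earlier. The function only prints candidate ipcrm commands and does not run them, so you can confirm that the instance really is down before removing anything.

```shell
# stale_shm_cmds [OWNER]: read `ipcs -m` output on stdin and print an
# "ipcrm -m <shmid>" line for every segment owned by OWNER (default:
# oracle) that has no attached processes (nattch, field 6, equal to 0).
# Nothing is removed; the commands are printed for review only.
stale_shm_cmds() {
    awk -v owner="${1:-oracle}" '
        $1 ~ /^0x/ && $3 == owner && $6 == 0 { print "ipcrm -m " $2 }'
}
```

A typical use would be ipcs -m | stale_shm_cmds oracle, followed by running the printed commands by hand once you are satisfied they are safe.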
(4) vmstat
vmstat is one of the oldest commands for displaying memory- and process-related information.
When called, it runs continuously and prints its information.
It takes two arguments:
vmstat <interval> <count>
<interval> is the interval between two samples, in seconds.
<count> is the number of times vmstat repeats.
The following example runs vmstat every 5 seconds and stops after the 10th sample.
A row showing the statistics at that moment is printed every five seconds.
[root@Think ~]# vmstat 5 10
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0  38576  76284 639688    0    0    25    24   98  186  1  1 97  1  0
 0  0      0  38532  76292 639716    0    0     0    22  102  194  0  0 99  0  0
 0  0      0  38516  76292 639720    0    0     0    13   99  187  0  0 99  0  0
 0  0      0  38524  76300 639720    0    0     0    68  121  232  1  0 99  0  0
 0  0      0  38524  76304 639720    0    0     0    16  165  298  1  1 99  0  0
 0  0      0  38540  76308 639720    0    0     0    16   84  176  0  0 99  0  0
 0  0      0  38524  76316 639840    0    0     0    81   94  187  0  0 99  0  0
 1  0      0  38404  76324 639848    0    0     0    17   89  181  0  0 100  0  0
 0  0      0  38404  76324 639848    0    0     0    13   93  180  0  0 99  0  0
 2  0      0  38420  76328 639848    0    0     0    11  220  364  1  1 99  0  0
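The output shows a great deal of information about system resources. Let's go through the columns in detail.
The columns under "procs":
r       Number of processes waiting to run (runnable)
b       Number of processes in uninterruptible sleep (blocked)
Sometimes another column, "w", appears under this heading. It shows the number of processes that can run but have been swapped out.
The value under "b" should be close to 0. If the value under "w" is high, more memory may be needed.
The columns under "memory" (in KB by default):
swpd    Amount of swap space in use
free    Amount of idle memory
buff    Amount of memory used as buffers
cache   Amount of memory used as cache
The buffer memory (buff) stores file metadata (such as i-nodes) and data from raw block devices.
The cache memory (cache) is used for the file data itself.
The columns under "swap" show swapping activity:
si      Amount of memory swapped in from disk per second
so      Amount of memory swapped out to disk per second
The columns under "io" show I/O activity:
bi      Blocks received from block devices per second
bo      Blocks sent to block devices per second
The columns under "system" show system-related activity:
in      Interrupts per second
cs      Context switches per second
The last group is probably the most used: information about CPU load, as percentages of CPU time.
us      Time spent running user (non-kernel) code
sy      Time spent running kernel code
id      Idle time
wa      Time spent waiting for I/O
st      Time stolen from a virtual machine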
Let's see how to interpret these values.
The first line of output is the average of all metrics since the system was last restarted.
So ignore that row, because it does not show the current state; the other rows show real-time metrics.
Ideally, the number of waiting or blocked processes (under the "procs" heading) should be 0 or close to 0.
If the numbers are high, the system does not have enough resources (such as CPU, memory, or I/O).
This information is very useful when diagnosing performance problems.
The data under "swap" indicates whether excessive swapping is taking place. If it is, physical memory may be insufficient.
You should either reduce memory demand or add physical RAM.
The data under "io" indicates the flow of data to and from the disks, which shows how much disk activity is going on. This does not necessarily indicate a problem.
If you see large values here together with a high value under "b" (blocked processes) in "procs", there may be serious I/O contention.
The most useful information is under the "cpu" heading. The "id" column shows the idle CPU percentage. Subtract this value from 100 to get the percentage of time the CPU is busy.
Compare this with top, which also reports an idle CPU percentage: top can show it for each CPU, while vmstat shows a consolidated figure for all CPUs.
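The "100 minus id" calculation, together with the advice to ignore the first row, can be sketched as a filter over vmstat output. This is an illustration under stated assumptions: the function name cpu_busy_avg is hypothetical, and the "id" column is located by name from the header row rather than by a fixed position.

```shell
# cpu_busy_avg: read `vmstat <interval> <count>` output on stdin and print
# the average busy CPU percentage (100 - id) over all samples except the
# first, which reports averages since boot.
cpu_busy_avg() {
    awk '
        $1 == "r" {                      # column-name header row
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        $1 ~ /^[0-9]+$/ && ("id" in col) {
            if (++rows == 1) next        # skip the since-boot sample
            busy += 100 - $(col["id"])
            n++
        }
        END {
            if (n > 0) { printf "%.1f\n", busy / n }
            else       { print "n/a" }
        }'
}
```

For example, vmstat 5 10 | cpu_busy_avg summarizes the nine real-time samples of a ten-sample run as a single busy percentage.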
The vmstat command also breaks down CPU usage: how much is used by the Linux system (sy), how much by user processes (us), and how much is spent waiting for I/O (wa).
From this breakdown you can determine what is contributing to CPU consumption. If the system CPU load is high, could some root process be running?
The system load should be consistent over a period of time. If the system shows a high value, use the top command to identify the system process consuming the CPU.
Oracle user usage
Oracle processes (background processes and server processes) and user processes (sqlplus, Apache, etc.) are counted under "us".
If this value is high, use top to identify the processes. If the "wa" column shows a high value, the I/O system cannot keep up with the volume of reads or writes.
Occasionally this is caused by a burst of updates in the database, which leads to log switches and a subsequent burst of archiving activity.
However, if it shows a large value consistently, there may be an I/O bottleneck.
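To tell a short burst from a consistently high value, you can count how many samples exceed a chosen "wa" level. The sketch below is illustrative only: the function name high_wa_count is hypothetical, and the default limit of 20 percent is an arbitrary example value, not a recommendation from this article.

```shell
# high_wa_count [LIMIT]: read vmstat output on stdin and print two numbers:
# how many samples had "wa" above LIMIT (default 20, an arbitrary example)
# and how many samples were examined. The first sample is skipped because
# it reports averages since boot.
high_wa_count() {
    awk -v limit="${1:-20}" '
        $1 == "r" { for (i = 1; i <= NF; i++) col[$i] = i; next }
        $1 ~ /^[0-9]+$/ && ("wa" in col) {
            if (++rows == 1) next
            n++
            if ($(col["wa"]) + 0 > limit + 0) high++
        }
        END { print high + 0, n + 0 }'
}
```

If most of the samples from vmstat 5 10 | high_wa_count are above the limit, the high I/O wait is probably persistent rather than a one-off spike.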
I/O bottlenecks in Oracle databases can cause serious problems. Beyond the performance impact, slow I/O can make control file writes slow.
This makes processes that are waiting to acquire the control file queue up. If the wait exceeds 900 seconds and the waiting process is a critical one (such as LGWR), the database instance is brought down.