Basic machine inspection knowledge
Inspections are generally carried out by the manufacturer or its agent.
First, check the room temperature and humidity. These are normally within range.
Reference values: temperature 10 ℃ to 40 ℃
Humidity: 8% to 80%
Second: check the power supply. Unless the machine is newly installed, there is generally no problem.
Reference values: neutral-to-ground voltage less than 1 V
Live-to-ground voltage 200-240 V
Note: 59 series machines come in 380 V and 240 V versions.
Third: check the error reports.
Check for permanent hardware errors: # errpt -d H | pg
Check for permanent software errors: # errpt -d S | pg
Also check whether any alarm lights are lit on the system (I will cover these in later posts).
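As a rough sketch of how the error report check could be automated, the function below counts permanent entries in an errpt summary listing read from standard input. It assumes the usual summary layout (a header line starting with IDENTIFIER, with the error type in the third column and P meaning permanent); the function name count_permanent_errors is made up for this example.

```shell
# Count permanent errors in an errpt summary listing read from stdin.
# Assumption: column 3 is the error type and "P" marks a permanent error.
count_permanent_errors() {
    awk '$1 != "IDENTIFIER" && $3 == "P" { n++ } END { print n + 0 }'
}
```

It could be used as: errpt -d H | count_permanent_errors. A result of 0 means no permanent hardware errors are logged.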
Fourth: the serial number of the machine: # uname -mu
Fifth: operating system version: # oslevel -r
Note: check this in case the system version is too low.
Sixth: other checks
Run sysdumpdev -l to check whether "always allow dump" is set.
Run sysdumpdev -e to check whether the estimated dump size is less than 80% of the primary dump device size.
Run lsvg -l rootvg to check whether any logical volumes are in the "stale" state.
Run lsps -s to check paging space usage.
Run df -k to check file system usage. Generally, usage should not exceed 80%.
Run lsdev -Cc disk to check that the hard disk status is Available.
Run lsdev -Cc adapter to check that the PCI card status is Available.
Run lsdev -Cc tape to check that the tape drive status is Available.
Run lsdev -Cc processor to check that the CPU status is Available.
Run lsattr -El sys0 | grep autorestart to check whether the system restarts automatically after a crash.
Run lsattr -El sys0 | grep cpuguard to check whether CPU guard is enabled.
Run lsattr -El mem0 to check that memory is normal (size = goodsize).
Run vmstat 2, iostat, and topas to observe us, sy, pi, po, memory usage, disk read/write speed, and so on, and check for performance bottlenecks.
Run netstat -in and netstat -rn to observe the network status.
Run entstat -d enX to check whether the NIC speed matches the switch speed. (When the NIC speed is changed from 10 Mbps to auto-negotiation, the default gateway is lost. After changing the NIC speed, you must run smitty route to reactivate the default route. Be careful when adjusting the NIC speed.)
Run ping to check network connectivity.
Run lsdev -C | grep aio to check whether asynchronous I/O is available.
Run lssrc -g cluster to check whether three processes are active (these are the HA daemons; sometimes there are only one or two).
Run /usr/sbin/cluster/clstat to check whether the cluster status is normal.
Check /etc/hosts to make sure that no IP alias in the dual-machine configuration contains another one (for example, pai_ip1 contains pai_ip).
Run
more /usr/es/adm/cluster.log
more /usr/es/sbin/cluster/history/*
cat /tmp/hacmp.out
and check whether the three logs contain "error" or "fail".
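Paging through the logs by hand is slow, so the search for "error" or "fail" could be sketched as a small function. This is only a minimal example: the name count_log_problems is invented here, the match is a plain case-insensitive text search, and unreadable files are silently skipped.

```shell
# Print the total number of lines containing "error" or "fail"
# (case-insensitive) across the given log files.
count_log_problems() {
    total=0
    for f in "$@"; do
        [ -r "$f" ] || continue                  # skip missing or unreadable files
        n=$(grep -i -c -E 'error|fail' "$f" || true)
        total=$((total + n))
    done
    echo "$total"
}
```

For example: count_log_problems /usr/es/adm/cluster.log /usr/es/sbin/cluster/history/* /tmp/hacmp.out. Any non-zero result means the logs still need to be read by hand to see what failed.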
Check the indicator on the 7133 panel. If the yellow indicator is on, diagnose the problem.
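The 80% rule for file systems from the df -k check above can also be sketched as a filter. The function below reads df -k output from standard input and prints every mount point whose usage is above a limit. It assumes the AIX column layout (%Used in the fourth column, mount point in the seventh); the name list_full_filesystems and the default limit of 80 are choices made for this example.

```shell
# Print "mountpoint usage" for file systems above the limit (default 80).
# Assumption: AIX "df -k" layout, %Used in column 4, mount point in column 7.
list_full_filesystems() {
    limit=${1:-80}
    awk -v limit="$limit" '
        NR > 1 && $4 ~ /%$/ {
            used = $4
            sub(/%$/, "", used)                  # strip the trailing percent sign
            if (used + 0 > limit + 0) print $7, $4
        }'
}
```

For example: df -k | list_full_filesystems 80. Lines without a percentage, such as /proc, are ignored.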
Hot spare disk check
1. Check for RAID protection: # smitty ssaraid ---> List All Defined SSA RAID Arrays
2. Check whether there is a hot spare: # smitty ssaraid ---> List Components in a Hot Spare Pool
For the 7133, run smitty ssaraid ---> List All Defined SSA RAID Arrays to check the RAID status. The normal status is good.
For the 7133, run smitty ssaraid ---> Change/Show Use of an SSA Physical Disk to check the hard disk status. Normally, each disk is in the member or spare status.
For FAStT, log on to each of the two controllers (described in detail later) and check for error logs.
Record the check results.
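One possible way to record the results is to append the output of each check to a single report file. The sketch below is an assumption, not part of the original procedure: the function name record_check and the report path /tmp/inspection_report.txt are both hypothetical.

```shell
# Hypothetical report location; override by setting REPORT beforehand.
REPORT=${REPORT:-/tmp/inspection_report.txt}

# Run a command and append a labelled copy of its output and exit
# status to the report file.
record_check() {
    label=$1
    shift
    {
        echo "=== $label ==="
        if "$@" 2>&1; then rc=0; else rc=$?; fi
        echo "exit status: $rc"
        echo
    } >> "$REPORT"
}
```

For example: record_check "OS level" oslevel -r, or record_check "Paging space" lsps -s. A pipeline such as lsattr -El sys0 | grep autorestart cannot be passed directly; it would have to be wrapped, for instance as record_check "autorestart" sh -c 'lsattr -El sys0 | grep autorestart'.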
These are the basic commands. If I have missed anything, I will add it later.
By the way, for joint inspections by Huawei and IBM, there are a few more items:
CPU (clock speed * quantity): # lsattr -El proc0
Number of built-in disks: # lsdev -Cc disk
NIC information: # lsdev -Cc adapter
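Several of the lsdev checks in this post come down to confirming that every device is in the Available state. A minimal sketch of that test follows, assuming the usual lsdev listing where the device name is the first column and the state is the second; the function name list_unavailable_devices is invented for this example.

```shell
# Print the names of devices whose state is not "Available".
# Assumption: lsdev output has the device name in column 1 and the
# state (Available or Defined) in column 2.
list_unavailable_devices() {
    awk 'NF > 0 && $2 != "Available" { print $1 }'
}
```

For example: lsdev -Cc disk | list_unavailable_devices. Empty output means every listed disk is Available; the same filter works for the adapter, tape, and processor classes.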