Analyze the causes of computer crashes in Linux

Source: Internet
Author: User
Linux system crashes and hangs are generally divided into two types: hardware problems and software problems.
 
I. Hardware problems
 
Consider the following points:
 
1. Do not overclock the CPU. If it has already been overclocked, restore it to its original frequency first.

An overclocked system may run normally under ordinary use yet fail unexpectedly under high load. In particular, some Linux workloads push the hardware to its limits, so hardware that runs Windows without trouble may still fail under Linux.
 
2. Confirm that the power supply is sufficient.

Make sure the power supply can still meet the system's demand under high load.

3. Use memtest86 to check the memory.

4. Restore the BIOS to its default settings.
 
On servers, you can also run tests with the built-in hardware monitoring tools, which is another good troubleshooting method.
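Alongside the checks above, a saved kernel log can be searched for messages that often accompany failing hardware. The following is a minimal sketch, not an exhaustive detector: the function name scan_hw_errors is hypothetical, and the three patterns are only illustrative examples of strings that commonly appear in kernel logs.

```shell
# Print lines from a saved kernel log that hint at hardware trouble.
# $1: path to the log file (for example, the output of dmesg saved to a file).
# The patterns are illustrative assumptions, not a complete list.
scan_hw_errors() {
    grep -iE 'machine check|hardware error|i/o error' "$1"
}

# Example usage (the file name is an assumption):
#   dmesg > /tmp/dmesg.txt
#   scan_hw_errors /tmp/dmesg.txt
```

No output means none of the sample patterns matched; it does not prove the hardware is healthy.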
 
II. Software problems

Once hardware problems have largely been ruled out, the next step is to capture information about the hung system through software.

1. If we are lucky, the system has not died completely (the keyboard may still respond), and we can use the magic SysRq key.
 
First, the SysRq function must be enabled:

# echo "1" > /proc/sys/kernel/sysrq

# setterm -blank 0
 
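Then, when the system hangs, we can use the following key combinations:

Alt + SysRq + T: get the stack information of the processes

Alt + SysRq + M: get memory allocation information

Alt + SysRq + W: get the current register information (on current kernels, W lists blocked tasks instead, and Alt + SysRq + P dumps the registers)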
 
For more hotkeys, refer to /usr/src/linux/Documentation/sysrq.txt on the system.
 
The setterm -blank 0 command disables the periodic screen blanking of the text console, which makes it easier to record what is on the screen.
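On kernels where /proc/sys/kernel/sysrq accepts a bitmask, a value other than 1 enables only some SysRq functions, and the debugging dumps (T, M, W) need bit 8. The sketch below checks this; the function name sysrq_dumps_enabled is hypothetical, and the bit value is taken from the kernel's SysRq documentation, so confirm it against the documentation shipped with your kernel.

```shell
# Decide whether a kernel.sysrq value permits the debugging dumps (T, M, W)
# from the keyboard. A value of 1 enables every SysRq function; otherwise
# bit 8 ("debugging dumps of processes") must be set.
sysrq_dumps_enabled() {
    value=$1
    [ "$value" -eq 1 ] || [ $(( value & 8 )) -ne 0 ]
}

# Report on the running system when the setting is readable.
if [ -r /proc/sys/kernel/sysrq ]; then
    current=$(cat /proc/sys/kernel/sysrq)
    if sysrq_dumps_enabled "$current"; then
        echo "kernel.sysrq=$current: keyboard debugging dumps are enabled"
    else
        echo "kernel.sysrq=$current: keyboard debugging dumps are disabled"
    fi
fi
```

This setting only restricts key combinations typed on the keyboard; writing a command letter to /proc/sysrq-trigger as root is not filtered by it.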
 
2. To display more kernel debugging information on the screen, change the console display mode from the default 80x25. In /boot/grub/menu.lst, add vga=0x305 at the end of the kernel line, for example:

kernel /boot/vmlinuz-2.4.21-9.30AXsmp ro root=LABEL=/1 vga=0x305
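Editing the boot configuration by hand is easy to get wrong, so a small helper can do the append. This is only a sketch: the function name append_kernel_param is hypothetical, it assumes a GRUB legacy menu.lst in which kernel lines begin with the word kernel, and it prints the result to standard output rather than modifying the file.

```shell
# Print a copy of a GRUB legacy menu.lst with a parameter appended to every
# "kernel" line that does not already carry it.
# $1: path to menu.lst, $2: parameter to append (for example vga=0x305)
append_kernel_param() {
    awk -v p="$2" '
        # Match "kernel" lines, allowing leading spaces or tabs.
        /^[ \t]*kernel[ \t]/ && index(" " $0 " ", " " p " ") == 0 {
            $0 = $0 " " p
        }
        { print }
    ' "$1"
}

# Example usage (the output path is an assumption); review the result
# before copying it over the real file:
#   append_kernel_param /boot/grub/menu.lst vga=0x305 > /tmp/menu.lst.new
```

Writing to a separate file keeps the original configuration intact, so a mistake cannot leave the machine unbootable.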
 
3. If the keyboard is unresponsive as well, the only option is to send the system information to another machine over a serial port. The method is as follows:

Edit the /boot/grub/menu.lst file and add the kernel parameters "console=ttyS0 console=tty1" at the end of the kernel line, for example:

kernel /boot/vmlinuz-2.4.21-9.30AXsmp ro root=LABEL=/1 console=ttyS0 console=tty1
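After the next reboot, it is worth confirming that the kernel really received both console parameters. A minimal sketch, assuming the kernel command line is exposed in /proc/cmdline; the function name list_consoles is hypothetical.

```shell
# Print the value of every console= parameter on a kernel command line.
# $1: file holding the command line (defaults to /proc/cmdline)
list_consoles() {
    tr ' ' '\n' < "${1:-/proc/cmdline}" | sed -n 's/^console=//p'
}

# Example usage on the rebooted server:
#   list_consoles
```

With the kernel line shown above, both ttyS0 and tty1 should be listed; if ttyS0 is missing, the serial capture on the other machine will stay empty.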
 
Then, edit /etc/sysconfig/syslog and add the klogd option "-c 7", for example:

KLOGD_OPTIONS="-x -c 7"
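The "-c 7" option sets the console log level, and the kernel prints to the console only messages whose priority number is lower than that level. The sketch below reads the level back and applies the same comparison; it assumes the first field of /proc/sys/kernel/printk is the current console log level, and both function names are hypothetical.

```shell
# Print the current console log level.
# $1: file to read (defaults to /proc/sys/kernel/printk)
console_loglevel() {
    awk '{ print $1; exit }' "${1:-/proc/sys/kernel/printk}"
}

# Succeed if a message of the given priority is shown on the console.
# $1: message priority (0 = emergency ... 7 = debug), $2: console log level
reaches_console() {
    [ "$1" -lt "$2" ]
}

# Example usage on the server after klogd has started:
#   level=$(console_loglevel)
#   reaches_console 6 "$level" && echo "informational messages reach the console"
```

With a level of 7, priorities 0 through 6 reach the console, while debug messages (priority 7) do not.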
 
Restart the server and perform the test:
 
1) Connect the client and the server with a serial cable, then run the following command on the client:

cat /dev/ttyS0

Run on the server:

echo hi > /dev/ttyS0

If "hi" appears on the client, the serial link is working.
 
2) Run on the server:

echo w > /proc/sysrq-trigger
 
Check whether the kernel information is output on the client.
 
3) Run on the server:

modprobe loop
 
Check whether the kernel information is output on the client.
 
If all tests pass, run the following command on the client:
 
cat /dev/ttyS0 | tee /tmp/result

When the crash occurs, the required kernel information can be read on the client (see /tmp/result).
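Because a hang may happen hours after the capture starts, it can help to record when each line arrived. The following sketch prefixes every captured line with the client's local time; the function name stamp_lines is hypothetical, and the device and output path are the same ones assumed above.

```shell
# Prefix each line read from standard input with the current date and time.
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}

# Example usage on the client:
#   stamp_lines < /dev/ttyS0 | tee /tmp/result
```

The timestamps come from the client's clock, so they show when a message was received, not when the server's kernel produced it.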
 
III. Summary
 
Generally, Linux crashes due to the following reasons:
 
System hardware problems (SCSI card, motherboard, RAID card, HBA card, NIC, hard disk, etc.)

Peripheral hardware problems (network, etc.)

Software problems (system and application software)

Driver bugs (look for a newer driver)

Kernel bugs (check LKML, or switch to a different kernel and try again)

System settings (restore the defaults, disable the firewall, etc.)
 