I work with Dell PowerEdge servers a lot and have run into many failures. Here are the ones I have encountered, along with their solutions:
1. The server will not start. The Internet offers all kinds of solutions, but the most practical is the process of elimination: work out which part is at fault. First check that the power supply is working, then swap the memory, the CPU, and the motherboard in turn. (I once had a bad memory module that kept the server from coming up.)
2. The storage will not start. The most common cause is a cable that is not connected properly. (In my case it was a moment of carelessness.)
3. The system restarts frequently. There are several possible causes. The official documents I downloaded list these:
   1. Power supply failure (confirm and fix by swapping in a known-good unit);
   2. Memory failure (can be detected from the BIOS error report);
   3. Too much traffic on the network port (the workload is too heavy);
   4. Software failure (fix by updating or reinstalling the operating system).
   Most of the cases I saw were software failures. A colleague handled them and I was not on site, so I do not know the details of the fix. To keep server data safe, though, backups must be made on a regular basis, so that a software failure or any other failure does not cause data loss.
4. The server hangs. A hang is harder to diagnose. The causes generally fall into two groups, software failures and hardware failures:
   1. Software failure. First check the operating system log, which can often reveal the cause of the hang. If the hang is caused by a virus, or by a bug or vulnerability in the system software, that conclusion should only be reached after hardware failure has been ruled out, and fixing it needs help from the software vendor. If it is caused by improper use of software or by too heavy a workload, reduce the load on the server and see whether that solves the problem.
   2. Hardware failure or hardware conflict:
      1. Power supply failure or insufficient power (add up the power draw of every load in the server and compare the total with what the power supply can deliver);
      2. Hard disk failure (scan the disk surface to check for bad sectors);
      3. Memory failure (judge from the error report in the motherboard BIOS and from operating system errors);
      4. Motherboard failure (judge by the replacement method);
      5. CPU failure (judge by the replacement method);
      6. Add-in card failure (usually a SCSI/RAID card, though other PCI devices can also hang the system; judge by the replacement method).
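The power check in the hardware list above comes down to simple arithmetic: add up the component loads and compare the total with the power supply rating. Below is a minimal sketch of that comparison. The function name `power_headroom`, the 20% safety margin, and every wattage figure are assumptions made for illustration, not Dell specifications; use the figures from your own hardware documentation.

```python
def power_headroom(psu_rated_watts, component_watts, safety_margin=0.2):
    """Compare total component load with the usable power supply capacity.

    safety_margin is an assumed fraction of the rating kept in reserve.
    Returns (total_load, usable_capacity, ok).
    """
    total_load = sum(component_watts.values())
    usable_capacity = psu_rated_watts * (1 - safety_margin)
    return total_load, usable_capacity, total_load <= usable_capacity


# Hypothetical loads in watts; replace with the values for your configuration.
loads = {
    "cpus": 160,
    "memory": 40,
    "hard_disks": 60,
    "raid_card": 25,
    "fans_and_board": 65,
}

total, usable, ok = power_headroom(750, loads)
print(f"load: {total} W, usable capacity: {usable:.0f} W, sufficient: {ok}")
```

If the result is not sufficient, or only barely so, an undersized or failing power supply becomes a more likely suspect and is worth swapping first.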
5. Other common faults: the chassis LED turns yellow or red. This usually means the machine is running too hot or some hardware is damaged. (The most common causes are a power supply alarm, a fan alarm, or a failed hard drive.)
6. Most recently, a Dell PE2950 server reported a PCI error at boot. The LCD panel showed "E1216 3.3V regulator failure" and the server would not come up. Solution: according to the official documentation, E1216 means the 3.3V voltage regulator has failed, so I replaced a voltage regulator. After that the error changed to "Reseat PCIe cards", which asks you to reseat the PCI-E cards. I then replaced a PCI-E expansion board (MH180), but the "Reseat PCIe cards" error was still reported. At that point I was thinking of replacing the motherboard, but after phoning the manufacturer I learned that clearing this error requires replacing three parts: the PCI expansion board, the side control board, and the PCI-E expansion board. After those parts were replaced, the server finally started.
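For the log check recommended under software failures in item 4, a small script can pull out the lines worth reading first. This is only a sketch: the keyword list is an assumption and should be adjusted to the messages your operating system actually writes, and the sample lines are made up for the demonstration, not real server output.

```python
import re

# Assumed crash-related keywords; extend or change them for your own system.
SUSPECT = re.compile(
    r"kernel panic|machine check|hardware error|out of memory|i/o error",
    re.IGNORECASE,
)


def find_suspect_lines(lines):
    """Return (line_number, text) for every line that matches a keyword."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if SUSPECT.search(line)
    ]


# Made-up sample lines, used only to show how the function behaves.
sample = [
    "service started normally\n",
    "disk controller reported I/O error on a device\n",
    "Out of memory: process killed\n",
    "user logged in\n",
]

for number, text in find_suspect_lines(sample):
    print(f"line {number}: {text}")
```

To scan a real log, pass an open file object instead of the sample list, for example `find_suspect_lines(open(path, errors="replace"))`, where `path` is wherever your system keeps its log.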
This article is from the "Hello_ Small Strong" blog; please be sure to retain this source: http://xiaozhuang.blog.51cto.com/4396589/1120391