Troubleshooting and remediation techniques for Linux systems

Source: Internet
Author: User


Linux systems sometimes fail during the boot process and cannot start normally. This article presents several repair cases that use single-user mode, GRUB command-line operations, and Linux rescue mode, to help you understand how to resolve such problems.

(i) Single-user mode

Linux systems provide single-user mode (similar to Windows Safe Mode) for system maintenance in a minimal environment. In single-user mode (runlevel 1), Linux boots into a root shell, the network is disabled, and only a few processes run. Single-user mode can be used to repair file system corruption, restore configuration files, move user data, and so on.

A few typical cases of repairing system failures in single-user mode are listed below:

Case one: root password forgotten

In single-user mode, Linux does not ask for the root password, which makes it easy to change the root password. (Red Hat family systems do not ask for it, whereas SuSE does; Linux distributions differ slightly, and this article uses Fedora Core 6 as the example.) The key is knowing how to enter single-user mode when the system fails to boot into multi-user mode.

1. During system startup, a boot prompt screen appears; press any key to enter the GRUB menu.

If you want to skip this prompt and go straight to the GRUB menu, delete the "hiddenmenu" entry from the configuration file grub.conf.

2. In the GRUB menu, press the "e" key to edit the boot entry. Use the arrow keys to move down to the kernel line and press the "e" key again to edit that line.

3. At the end of the line, add a space followed by the word single, press Enter to return to the previous screen, then press the "b" key to boot. The system automatically enters single-user mode. To change the root password, run:

sh-3.1# passwd root

After the password has been changed, run the exit command to leave single-user mode and restart.

In single-user mode, you can correct many problems that prevent the system from starting normally, such as:

1. Disable a service that may be stalling the boot. For example, to disable the Samba service, run:

sh-3.1# chkconfig smb off

The Samba service will then not start at the next system boot.

2. Change the default runlevel. If the X Window System fails to start, edit the /etc/inittab file and set the initdefault runlevel to 3 so that the system boots to a text login:

id:3:initdefault:
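The runlevel change in item 2 can also be scripted. The following is a minimal sketch, assuming a SysV-style inittab that contains a single "id:N:initdefault:" line. The function name set_default_runlevel is hypothetical, and the demonstration works on a scratch file rather than the real /etc/inittab.

```shell
#!/bin/sh
# Hypothetical helper: set the initdefault runlevel in an inittab-style file.
# Usage: set_default_runlevel FILE LEVEL
set_default_runlevel() {
    file=$1
    level=$2
    # Keep a copy of the original before touching it.
    cp "$file" "$file.bak" || return 1
    # Rewrite only the "id:N:initdefault:" line; all other lines pass through.
    sed "s/^id:[0-6]:initdefault:/id:${level}:initdefault:/" "$file.bak" > "$file"
}

# Demonstration on a scratch file (not the real /etc/inittab).
demo=$(mktemp)
printf '%s\n' '# sample inittab' 'id:5:initdefault:' \
    'si::sysinit:/etc/rc.d/rc.sysinit' > "$demo"
set_default_runlevel "$demo" 3
grep initdefault "$demo"
```

In single-user mode the same function could be pointed at /etc/inittab; the .bak copy it leaves behind makes it easy to undo the change.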

Case two: disk sector corruption

One of the most common startup problems is that the hard disk has bad or corrupted sectors (data corruption), usually caused by an unexpected power loss or an improper shutdown. When this happens, the system stops during startup and prompts for the root password (or Ctrl+D).

Enter the root password and the system automatically drops into single-user mode. Then enter "fsck -y /dev/hda6" (fsck is the file system check and repair command, "-y" makes it repair detected errors automatically, and /dev/hda6 is the hard disk partition where the error occurred; change this argument to match your system). After the repair is complete, restart with the "reboot" command.

Case three: GRUB option set incorrectly

The "Error 15" message shown at boot means that GRUB could not find the kernel specified in grub.conf. Looking at the GRUB error message, we can see that because of a typo the kernel file name "vmlinuz" was entered as "vmlinux", so the system could not find the kernel file. Press any key to return to the GRUB editing interface, correct the error, press Enter to save it, and then press the "b" key to boot normally. Do not forget to correct the same error in the grub.conf file once the system is up. Many Linux beginners make this mistake when changing GRUB settings; reading the error message on the black screen lets you make a targeted repair.
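A typo like this can be caught before rebooting by checking that every file named in grub.conf really exists. The sketch below assumes GRUB legacy syntax, where the path is the second field of each kernel and initrd line. The function name check_grub_paths and its PREFIX argument (the directory where the GRUB root device is mounted) are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report kernel/initrd paths in a GRUB legacy config
# that do not exist under PREFIX (the mount point of the GRUB root device).
# Usage: check_grub_paths CONF PREFIX
check_grub_paths() {
    conf=$1
    prefix=$2
    rc=0
    # Field 1 is the directive and field 2 is the path; lines starting
    # with "#" never match because their first field is not a directive.
    for entry in $(awk '$1 == "kernel" || $1 == "initrd" { print $2 }' "$conf"); do
        if [ ! -f "$prefix$entry" ]; then
            echo "missing: $entry"
            rc=1
        fi
    done
    return $rc
}

# Demonstration in a scratch directory that reproduces the vmlinuz/vmlinux typo.
work=$(mktemp -d)
mkdir -p "$work/boot/grub"
: > "$work/boot/vmlinuz-demo"
printf '%s\n' 'title Demo' '    root (hd0,0)' \
    '    kernel /boot/vmlinux-demo ro root=LABEL=/' > "$work/boot/grub/grub.conf"
check_grub_paths "$work/boot/grub/grub.conf" "$work" \
    || echo "fix grub.conf before rebooting"
```

On a running system where the GRUB root device is the root file system, PREFIX would be /; if /boot is a separate partition, the paths in grub.conf are relative to that partition instead.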

(ii) Troubleshooting GRUB Boot

Sometimes Linux boots straight into the GRUB command-line interface (showing only the "grub>" prompt), and many users respond by reinstalling GRUB or even the whole system. In fact, this failure usually has one of two causes: either an option in the GRUB configuration file is set incorrectly, or the GRUB configuration file is missing (there are a few other causes as well, such as a corrupted or missing kernel or image file, or an accidentally deleted /boot directory). In the first case, the system can be booted with GRUB commands and then repaired; in the second case, it has to be repaired using Linux rescue mode (described later in this article).

First, we need to understand how GRUB boots the system. The main configuration options in the grub.conf file are as follows (note that the GRUB configuration file is /boot/grub/grub.conf; /etc/grub.conf is just a symbolic link to this file):

title Fedora Core (2.6.18-1.2798.fc6)
        root (hd0,0)
        kernel /boot/vmlinuz-2.6.18-1.2798.fc6 ro root=LABEL=/ rhgb quiet
        initrd /boot/initrd-2.6.18-1.2798.fc6.img

Here the "title" line names the system that GRUB boots; the "root" line specifies the location of the /boot partition; the "kernel" line specifies the location of the kernel file, tells the kernel to mount the root file system read-only at load time ("ro"), and specifies the root partition (root=LABEL=/); the "initrd" line specifies the location of the initial RAM disk image. GRUB therefore boots in this order: it first accesses the /boot partition, then loads the kernel and the image file in turn.
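To see at a glance which entries a grub.conf defines, the title and kernel lines can be paired up with awk. This is a rough sketch that assumes the GRUB legacy layout described above; the function name list_grub_entries is hypothetical. A kernel line with no preceding title is reported as "(no title)".

```shell
#!/bin/sh
# Hypothetical helper: print "TITLE -> KERNEL PATH" for each boot entry
# in a GRUB legacy configuration file.
# Usage: list_grub_entries CONF
list_grub_entries() {
    awk '
        # Remember the text after the "title" keyword, then skip to the next line.
        $1 == "title"  { sub(/^[ \t]*title[ \t]+/, ""); title = $0; next }
        # Pair each kernel line with the most recent title, if there was one.
        $1 == "kernel" {
            name = (title == "") ? "(no title)" : title
            print name " -> " $2
            title = ""
        }
    ' "$1"
}

# Demonstration on a scratch copy of the entry shown above.
conf=$(mktemp)
printf '%s\n' \
    'title Fedora Core (2.6.18-1.2798.fc6)' \
    '        root (hd0,0)' \
    '        kernel /boot/vmlinuz-2.6.18-1.2798.fc6 ro root=LABEL=/ rhgb quiet' \
    '        initrd /boot/initrd-2.6.18-1.2798.fc6.img' > "$conf"
list_grub_entries "$conf"
```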

Case: the "title Fedora Core (2.6.18-1.2798.fc6)" line was mistakenly deleted

In this situation, the system automatically enters the "grub>" command line after startup. To troubleshoot, perform the following steps in order:

1. Find the partition where the /boot/grub/grub.conf file is located:

grub> find /boot/grub/grub.conf
(hd0,0)

2. View the grub.conf file to find the error:

grub> cat (hd0,0)/boot/grub/grub.conf

It is recommended to back up the grub.conf file once the system has been installed and configured. If a backup such as grub.conf.bak exists, you can view it and compare it with the current file to find the error:

grub> cat (hd0,0)/boot/grub/grub.conf.bak

3. Once the error is confirmed, first complete the boot from the GRUB command line, then repair the grub.conf file after entering the system:

1) Specify the /boot partition:

grub> root (hd0,0)

2) Specify the kernel to load:

grub> kernel /boot/vmlinuz-2.6.18-1.2798.fc6 ro root=LABEL=/ rhgb quiet

3) Specify the location of the image file:

grub> initrd /boot/initrd-2.6.18-1.2798.fc6.img

Tip: GRUB supports Tab completion for commands.

4. Boot from the /boot partition (hd0,0):

grub> boot

Command-line mode can also be entered from the GRUB menu by pressing the "c" key, and it can be used to test a newly compiled kernel (by setting kernel and initrd to boot the new kernel and image file). A better understanding of GRUB and of the Linux boot process is helpful for this type of troubleshooting.
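Once the system has booted, comparing grub.conf against its backup is easier with diff than by reading two cat listings side by side. The following is a small sketch, assuming a backup file such as grub.conf.bak exists; the function name show_config_drift is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: show how a configuration file differs from its backup.
# Usage: show_config_drift BACKUP CURRENT
show_config_drift() {
    backup=$1
    current=$2
    # diff exits with 0 when the files are identical and prints nothing,
    # so say so explicitly; otherwise its unified diff is the output.
    if diff -u "$backup" "$current"; then
        echo "no differences"
    fi
}

# Demonstration: the current file has lost its title line.
bak=$(mktemp)
cur=$(mktemp)
printf '%s\n' 'title Demo' '        root (hd0,0)' > "$bak"
printf '%s\n' '        root (hd0,0)' > "$cur"
show_config_drift "$bak" "$cur"
```

Lines prefixed with "-" in the output exist only in the backup, which is how a deleted title line would show up.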

(iii) Using Linux rescue mode

Linux rescue mode is needed for boot problems where the system cannot enter single-user mode and the GRUB command line does not help. The steps are as follows:

1. Insert the Linux installation disc (if you use CDs, insert the first boot disc) into the optical drive and set the CMOS/BIOS firmware to boot from the disc. When the Linux installation screen appears, type "linux rescue" after the "boot:" prompt and press Enter to enter rescue mode. (For more information on rescue mode, you can also press F5.)

2. The system detects the hardware and boots the Linux environment on the disc, then prompts you to select the language used in rescue mode (the default, English, is recommended; in the author's tests, choosing Chinese produced garbled text on some Linux systems). Keep the default "us" keyboard layout. The network can be configured if needed, but most repairs do not require a network connection, so you can skip it by selecting "No".

3. The system then tries to find the root partition and displays a mount prompt. By default in rescue mode, the root partition of the hard disk is mounted on the /mnt/sysimage directory of the disc's Linux environment. The default option "Continue" mounts it read-write, "Read-only" mounts it read-only, and "Skip" can be selected if detection fails. Because we want to repair the system, we need read-write access, so generally choose the default option "Continue".

In the next step, you are told that you can run the "chroot /mnt/sysimage" command to switch the root directory to the root of the hard disk system.

Case one: dual-boot startup repair

In a dual-boot environment, if Linux is installed first and Windows afterwards, or if Windows in an existing dual-boot environment is damaged and then reinstalled, the MBR (Master Boot Record) that holds GRUB is overwritten by NTLDR, the Windows boot loader, and the Linux system can no longer boot.

1. To restore dual-boot, first enter rescue mode using the method above, then run the chroot command as follows:

sh-3.1# chroot /mnt/sysimage

2. With the root directory switched to the root of the hard disk system, run the grub-install command to reinstall GRUB:

sh-3.1# grub-install /dev/hda

"/dev/hda" is the hard disk device name; if you use a SCSI hard disk, or Linux is installed on the second IDE hard disk, adjust this argument accordingly.

3. Then run the exit command twice, first to leave the chroot environment and then to leave rescue mode:

sh-3.1# exit

After the system restarts, dual-boot through GRUB is restored.

Case two: repairing a missing system configuration file

During boot, a key step is that the init process reads its configuration file /etc/inittab and starts the basic system services and the services of the default runlevel to complete the boot. If /etc/inittab is accidentally deleted or edited incorrectly, Linux will not boot properly, as shown in Figure 7. Such problems can only be solved through rescue mode.

Figure 7: Example of the boot error when the /etc/inittab file is missing

1. Recovery when a backup file exists. Enter rescue mode and run the chroot command. If there is a backup of the file (backing up important system directories such as /etc and /boot is strongly recommended), simply copy the backup back, then exit and restart. If the problem is an incorrect edit to a configuration file, typical examples being /boot/grub/grub.conf and /etc/passwd, you can also fix the file directly. If you have a backup file /etc/inittab.bak, run the following in rescue mode:

sh-3.1# chroot /mnt/sysimage
sh-3.1# cp /etc/inittab.bak /etc/inittab

2. Recovery when no backup file exists. If a configuration file is missing or software has been accidentally deleted and there is no backup, you can restore it by reinstalling the package. First find out which RPM package /etc/inittab belongs to (even though the file is missing, the RPM database still exists, so the query returns a result):

sh-3.1# chroot /mnt/sysimage
sh-3.1# rpm -qf /etc/inittab
initscripts-8.45.3-1

Exit the chroot environment:

sh-3.1# exit

Mount the installation disc that holds the RPM packages (in rescue mode, the disc is usually mounted on the /mnt/source directory):

sh-3.1# mount /dev/hdc /mnt/source

On Fedora, the RPM packages are stored in the Fedora/RPMS directory of the disc; other Linux distributions use similar locations, which are not listed here. Also, because the root of the hard disk system being repaired is under /mnt/sysimage, you need to specify its location with the --root option. Reinstall the RPM package that contains the /etc/inittab file, overwriting the existing installation:

sh-3.1# rpm -ivh --replacepkgs --root /mnt/sysimage /mnt/source/Fedora/RPMS/initscripts-8.45.3-1.i386.rpm

The rpm option "--replacepkgs" reinstalls the package over the existing installation. Once the command completes, the file has been restored.

If you want to extract only the /etc/inittab file from the RPM package, you can run the following after entering rescue mode:

sh-3.1# rpm2cpio /mnt/source/Fedora/RPMS/initscripts-8.45.3-1.i386.rpm | cpio -idv ./etc/inittab
sh-3.1# cp etc/inittab /mnt/sysimage/etc

Note that this command cannot restore the file directly into the /etc directory; it extracts only into the current directory, and the path of the file to be restored must be written out in full (./etc/inittab). After the file has been extracted successfully, copy it to the corresponding location under /mnt/sysimage, where the root partition of the hard disk is mounted.
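The backup-restore step from method 1 can be wrapped in a small function that works from the rescue environment without chroot, by prefixing paths with the mount point. This is a sketch under stated assumptions: the function name restore_from_backup is hypothetical, the backup is assumed to carry a .bak suffix as in the example above, and the .broken suffix used to keep the damaged file is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: restore SYSROOT/RELPATH from SYSROOT/RELPATH.bak.
# Usage: restore_from_backup SYSROOT RELPATH
#   SYSROOT  where the damaged system is mounted (e.g. /mnt/sysimage)
#   RELPATH  file to restore, relative to SYSROOT (e.g. etc/inittab)
restore_from_backup() {
    target="$1/$2"
    backup="$target.bak"
    if [ ! -f "$backup" ]; then
        echo "no backup found: $backup" >&2
        return 1
    fi
    # Keep the damaged file, if any, in case the backup is worse.
    if [ -f "$target" ]; then
        cp -p "$target" "$target.broken" || return 1
    fi
    cp -p "$backup" "$target"
}

# Demonstration in a scratch directory standing in for /mnt/sysimage.
sysroot=$(mktemp -d)
mkdir -p "$sysroot/etc"
echo 'id:3:initdefault:' > "$sysroot/etc/inittab.bak"
echo 'garbage' > "$sysroot/etc/inittab"
restore_from_backup "$sysroot" etc/inittab
cat "$sysroot/etc/inittab"
```

In rescue mode the call would be restore_from_backup /mnt/sysimage etc/inittab, which has the same effect as the chroot and cp commands shown in method 1.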

Rescue mode is a powerful tool for maintaining Linux. This article has illustrated its use with the two examples above, in the hope of giving readers some inspiration. To resolve Linux startup failures, you need a thorough understanding of the Linux boot process so that you can diagnose and handle failures effectively.

This article is from the "Du Haiqiang" blog; reprinting is declined.
