Linux HDD Detection (original)

Last Update:2015-07-20 Source: Internet

Author: User

http://czmmiao.iteye.com/blog/1058215

Overview

As hard disk capacity and speed grow rapidly, hard disk reliability matters more and more. A single drive can now easily hold 1 TB, so the impact of a drive failure is huge.
Different file systems (XFS, ReiserFS, ext3) have their own checking and repair tools. Before testing a drive, use the dmesg command to see whether the log contains hardware I/O errors. If it does, first run fsck to find out whether the file system has a problem; if it does not, use the hard disk detection and optimization methods described below to repair the drive. You can also search the system log: grep "Error" /var/log/messages*
Detecting bad sectors on hard drives in Linux
Using SMART to check hard drives
SMART is a disk self-analysis and self-monitoring technology. Since the late 1990s it has been built into practically every hard disk (both IDE and SCSI). While running, the drive records a number of its own parameters, including model, capacity, temperature, density, sectors, seek time, transfer rate, and error rate. After thousands of hours of operation, many of the drive's intrinsic physical parameters change. When a parameter crosses its alarm threshold, the drive is close to failure. It may still be working at that point, but if the user ignores the alarm and keeps using it, the drive becomes very unreliable and may fail at any time.
Enabling SMART
SMART works together with the corresponding function in the motherboard BIOS, so to use SMART you must first enter the BIOS setup and enable the relevant setting. Motherboards from roughly the Pentium II generation onward all support SMART. Once it is enabled in the BIOS, the rest is handled at the operating system level. Windows has no built-in SMART tools, so third-party software must be installed; fortunately, Linux has supported SMART for a long time. If you install Linux in a virtual machine such as VMware, you may see a service fail at startup: smartd. This service is the SMART daemon, and it reports an error because the VMware virtual disk does not support SMART. smartd is a daemon (a helper program) that monitors hard disks equipped with Self-Monitoring, Analysis and Reporting Technology (SMART). SMART lets a hard drive monitor and report on its own health. One of its most important features is the ability to predict failure, which gives system administrators a chance to avoid data loss.

Basic smartctl usage

smartctl -a <device>             Display all SMART information for the device, including whether SMART is turned on.
smartctl -s on <device>          Turn SMART on if it is not already enabled.
smartctl -t short <device>       Run a short self-test in the background.
smartctl -t long <device>        Run a long self-test in the background.
smartctl -C -t short <device>    Run a short self-test in the foreground.
smartctl -C -t long <device>     Run a long self-test in the foreground. All of these tests use the drive's own SMART self-test routines.
smartctl -X <device>             Abort a background self-test.
smartctl -l selftest <device>    Display the drive's self-test log.
smartctl -l error <device>       Display a summary of the drive's error log.
First, use dmesg to confirm the device name of the hard disk. For example, an IDE hard disk connected to the slave position on the primary IDE bus has the device name /dev/hdb. The h in hdb stands for IDE; a name such as sdb indicates SATA or SCSI. The final letter b means the second hard drive on the primary bus, that is, the slave position. Then confirm that SMART support is turned on for the drive:

# smartctl -i /dev/sda
smartctl 5.40 2010-10-16 r3189 [i386-redhat-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     Hitachi HTS543225L9SA00
Serial Number:    090131FB2F32YLG28JEA
Firmware Version: FBEZC48C
User Capacity:    250,059,350,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 3f
Local Time is:    Wed 10:10:39 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
The last line, "SMART support is: Enabled", shows that SMART is turned on. If you see "SMART support is: Disabled" instead, SMART is not enabled; run the following command to turn it on:

# smartctl --smart=on --offlineauto=on --saveauto=on /dev/sda
smartctl 5.40 2010-10-16 r3189 [i386-redhat-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF ENABLE/DISABLE COMMANDS SECTION ===
SMART Enabled.
SMART Attribute Autosave Enabled.
SMART Automatic Offline Testing Enabled every four hours.
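
The check-then-enable decision can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it parses text that has already been captured from smartctl -i, the sample string is a shortened stand-in for real output, and the function names smart_support and enable_command are hypothetical helpers, not part of smartmontools.

```python
import re

def smart_support(info_text):
    """Return (available, enabled) from text printed by `smartctl -i`."""
    available = enabled = False
    for line in info_text.splitlines():
        match = re.match(r"\s*SMART support is:\s*(\w+)", line, re.IGNORECASE)
        if not match:
            continue
        state = match.group(1).lower()
        if state == "available":
            available = True
        elif state == "enabled":
            enabled = True
    return available, enabled

def enable_command(device):
    """Build the argument list for the enable command used in this article."""
    return ["smartctl", "--smart=on", "--offlineauto=on", "--saveauto=on", device]

# Shortened, illustrative sample of the two status lines.
sample = """\
SMART support is: Available - device has SMART capability.
SMART support is: Disabled
"""

available, enabled = smart_support(sample)
if available and not enabled:
    # Only print the command here; running it requires root and a real drive.
    print(" ".join(enable_command("/dev/sda")))
```

Returning an argument list rather than a single string makes it straightforward to hand the command to subprocess.run later without shell quoting concerns.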

Now that SMART is enabled on the drive, run the following command to check the drive's health:

# smartctl -H /dev/sda
smartctl 5.40 2010-10-16 r3189 [i386-redhat-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Note the value after "result": PASSED means the drive is healthy. If a failure is shown here instead, it is best to replace the drive in the server immediately. SMART can only report that the disk is no longer healthy; how long the disk keeps running after the alarm is uncertain. SMART alarm thresholds usually leave a margin, so a disk does not die the moment it raises an alarm and can generally hold on for a while. Some drives have kept running for years after a SMART alarm, while others failed within days. Once an alarm appears, though, never count on luck.
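
For scripted checks, the verdict can be pulled out of the health report with a small parser. The sketch below assumes the ATA-style result line shown in this article (other drive types may word the line differently) and uses hypothetical function names; treating a missing verdict as a warning is a cautious choice made here, not smartctl behavior.

```python
import re

def overall_health(health_text):
    """Return the verdict from `smartctl -H` output, or None if no result line is found."""
    match = re.search(r"self-assessment test result:\s*(\S+)",
                      health_text, re.IGNORECASE)
    return match.group(1).upper() if match else None

def needs_attention(health_text):
    """Treat anything other than PASSED, including a missing verdict, as a warning."""
    return overall_health(health_text) != "PASSED"

report = "SMART overall-health self-assessment test result: PASSED\n"
print(overall_health(report), needs_attention(report))
```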
# smartctl -A /dev/sda             Show the drive's detailed SMART attributes.
# smartctl -s on /dev/sda          Turn SMART on if it is not already enabled.
# smartctl -t short /dev/sda       Run a short self-test in the background.
# smartctl -t long /dev/sda        Run a long self-test in the background.
# smartctl -C -t short /dev/sda    Run a short self-test in the foreground.
# smartctl -C -t long /dev/sda     Run a long self-test in the foreground. All of these tests use the drive's own SMART self-test routines.
# smartctl -X /dev/sda             Abort a background self-test.
# smartctl -l selftest /dev/sda    Display the drive's self-test log.
# smartctl -l error /dev/sda       Display a summary of the drive's error log.
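
The attribute table printed by smartctl -A is where the alarm thresholds described earlier become visible: each attribute has a normalized VALUE and a THRESH column. The following sketch flags attributes whose value has dropped to or below the threshold. The two sample rows are invented for illustration and do not come from the drive shown in this article; the function name failing_attributes is hypothetical.

```python
def failing_attributes(attr_text):
    """List attribute names whose normalized VALUE is at or below a non-zero THRESH."""
    failing = []
    for line in attr_text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least ten columns:
        # ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        if not (fields[3].isdigit() and fields[5].isdigit()):
            continue
        name, value, thresh = fields[1], int(fields[3]), int(fields[5])
        if thresh > 0 and value <= thresh:
            failing.append(name)
    return failing

# Invented sample rows, for illustration only.
sample_attrs = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
 10 Spin_Retry_Count        0x0013   050   050   060    Pre-fail  Always   FAILING_NOW 12
"""

print(failing_attributes(sample_attrs))
```

A threshold of zero is skipped because such attributes are informational and never trigger an alarm.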
If you would rather not log in to the server regularly to run smartctl, Linux also provides the smartd system daemon. Edit its configuration file:
# vi /etc/smartd.conf
Most of this configuration file may be commented out; just add the configuration for the drives you want to monitor:

/dev/sda -H -m [email protected]23123.com	Monitor the disk's health status. Nothing happens while SMART reports PASSED; as soon as a failure appears, the specified mailbox is notified by email.
/dev/sda -a -m [email protected],[email protected]	Monitor all attributes of the disk. Nothing happens while SMART reports PASSED; as soon as a failure appears, the specified mailboxes are notified by email.
/dev/twa0 -d 3ware,0 -a -s L/../../7/00	Monitor all attributes of the first ATA disk on the 3ware 9000 controller, and run a long self-test every Sunday between 00:00 and 01:00.
/dev/sg2 -d areca,1 -a -s L/../(01|15)/./22	Monitor all attributes of the first SATA disk on the Areca RAID controller, and run a long self-test on the 1st and 15th of each month between 22:00 and 23:00.
-s (O/../.././(00|06|12|18)|S/../.././01|L/../../6/03)	Run an offline self-test daily at 00:00, 06:00, 12:00, and 18:00, a short self-test daily between 01:00 and 02:00, and a long self-test every Saturday between 03:00 and 04:00.
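
The -s schedules above are regular expressions matched against a string of the form T/MM/DD/d/HH, where T is the test type (L, S, C, or O), MM the month, DD the day of the month, d the day of the week (1 for Monday through 7 for Sunday), and HH the hour. The sketch below imitates that matching so a schedule can be tried against a date before it goes into smartd.conf. It rests on two assumptions: that the whole string must match, and that Python's re module is a close enough stand-in for the POSIX extended regular expressions smartd uses. The function names are hypothetical.

```python
import re
from datetime import datetime

def schedule_key(test_type, when):
    """Format a test type and a time as T/MM/DD/d/HH."""
    # isoweekday() numbers Monday as 1 and Sunday as 7.
    return f"{test_type}/{when:%m}/{when:%d}/{when.isoweekday()}/{when:%H}"

def due_tests(schedule, when):
    """Return the test types (L, S, C, O) that the schedule selects at the given hour."""
    return [t for t in "LSCO" if re.fullmatch(schedule, schedule_key(t, when))]

# The combined schedule from the last configuration line above.
DAILY_SCHEDULE = r"(O/../.././(00|06|12|18)|S/../.././01|L/../../6/03)"

# 2015-07-18 was a Saturday, so the long test is selected at 03:00.
print(due_tests(DAILY_SCHEDULE, datetime(2015, 7, 18, 3)))
```

Checking a few dates this way is a cheap guard against the common mistake of writing shell glob patterns where a regular expression is expected.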

After editing smartd.conf, restart smartd for the changes to take effect:

# /etc/init.d/smartd restart

Other smartd.conf options are described at:

http://smartmontools.sourceforge.net/man/smartd.conf.5.html
Using badblocks to detect bad blocks
The badblocks command checks a disk device for damaged blocks. You must specify the disk device to check; the number of blocks can be given as well, and the examples in this section omit it so that the whole device is scanned.
Syntax: badblocks [-svw] [-b <block size>] [-o <output file>] [disk device] [number of disk blocks] [start block]
Parameters:
-b <block size>           Specifies the block size of the disk in bytes.
-o <output file>          Writes the result of the check to the specified output file.
-s                        Shows progress while checking.
-v                        Displays detailed information during execution.
-w                        Performs a write test when checking.
[disk device]             Specifies the disk device to check.
[number of disk blocks]   Specifies the total number of blocks on the disk device.
[start block]             Specifies the block from which to start the check.
Checking a disk for bad blocks with badblocks:

Read-only test: badblocks -s (show progress) -v (show execution details) /dev/sda1
# badblocks -s -v /dev/sda
Checking blocks 0 to 244198583
Checking for bad blocks (read-only test): ^C0.10% done, 0:04 elapsed
Interrupted at block 272896
Write test: badblocks -s (show progress) -w (test by writing) -v (show execution details) /dev/sda2
# badblocks -w -s -v /dev/sda1
Checking for bad blocks in read-write mode
From block 0 to 25607577
Testing with pattern 0xaa: ^C0.73% done, 0:03 elapsed

Note that a mounted drive cannot be tested in write mode, and that the write test overwrites the data on the device.
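
A script that wraps the write test can check for mounted partitions first. The following sketch assumes text in the format of /proc/mounts (device, mount point, file system type, options, and two numeric fields per line) and uses made-up sample entries; the function names are hypothetical. It is deliberately conservative: any mounted name that begins with the device name counts as a conflict.

```python
def mounted_devices(mounts_text):
    """Collect /dev/... device names from text in /proc/mounts format."""
    devices = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("/dev/"):
            devices.add(fields[0])
    return devices

def safe_for_write_test(device, mounts_text):
    """True when neither the device nor any partition whose name starts with it is mounted."""
    return not any(m.startswith(device) for m in mounted_devices(mounts_text))

# Made-up sample entries, for illustration only.
sample_mounts = """\
/dev/sda1 / ext3 rw 0 0
proc /proc proc rw 0 0
/dev/sdb1 /data ext3 rw 0 0
"""

print(safe_for_write_test("/dev/sda", sample_mounts))
print(safe_for_write_test("/dev/sdc", sample_mounts))
```

On a real system the text would come from reading /proc/mounts. Devices mounted through other names, such as device-mapper volumes, are not detected by this simple prefix check.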
Testing with hdparm
Test the drive's read speed
# hdparm -tT /dev/sda
/dev/sda:
Timing cached reads: 1918 MB in 2.00 seconds = 959.62 MB/sec
Timing buffered disk reads: 184 MB in 3.00 seconds = 61.26 MB/sec
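
To track these figures over time, the two throughput values can be extracted from the report. This is a rough sketch that assumes the two "Timing ... reads" lines shown above; the function name read_speeds is hypothetical.

```python
import re

def read_speeds(hdparm_text):
    """Return a dict of MB/sec figures keyed by 'cached' and 'buffered disk'."""
    pattern = re.compile(
        r"Timing (cached|buffered disk) reads:.*?=\s*([\d.]+)\s*MB/sec",
        re.IGNORECASE | re.DOTALL,
    )
    return {kind.lower(): float(rate) for kind, rate in pattern.findall(hdparm_text)}

sample_report = """\
/dev/sda:
 Timing cached reads:   1918 MB in  2.00 seconds = 959.62 MB/sec
 Timing buffered disk reads:  184 MB in  3.00 seconds =  61.26 MB/sec
"""

print(read_speeds(sample_report))
```

The cached figure reflects memory and bus throughput, while the buffered figure reflects sustained reads from the disk itself, so the second number is the one to compare between drives.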

hdparm can detect, display, and set the parameters of IDE or SCSI hard disks.

Syntax:

hdparm [-CfghiIqtTvyYZ] [-a <read-ahead sectors>] [-A <0 or 1>] [-c <I/O mode>] [-d <0 or 1>] [-k <0 or 1>] [-K <0 or 1>] [-m <sectors>] [-n <0 or 1>] [-p <PIO mode>] [-P <sectors>] [-r <0 or 1>] [-S <time>] [-u <0 or 1>] [-W <0 or 1>] [-X <transfer mode>] [device]
-a <read-ahead sectors>   Sets the number of sectors read ahead when a file is read; without a value, displays the current setting.
-A <0 or 1>               Enables or disables the drive's read-ahead cache.
-c <I/O mode>             Sets the IDE 32-bit I/O mode.
-C                        Checks the power management mode of the IDE drive.
-d <0 or 1>               Sets the DMA mode of the disk.
-f                        Writes the data in the memory buffer to the disk and clears the buffer.
-g                        Displays the drive geometry: cylinders, heads, and sectors.
-h                        Displays help.
-i                        Displays the drive's hardware identification, as provided by the drive at boot time.
-I                        Reads the hardware identification directly from the drive.
-k <0 or 1>               Keeps the -dmu settings when the drive is reset.
-K <0 or 1>               Keeps the -APSWXZ settings when the drive is reset.
-m <sectors>              Sets the number of sectors transferred per multiple-sector access.
-n <0 or 1>               Ignores errors that occur while writing to the disk.
-p <PIO mode>             Sets the PIO mode of the drive.
-P <sectors>              Sets the number of sectors for the drive's internal prefetch cache.
-q                        Suppresses screen output for the option that follows.
-r <0 or 1>               Sets the drive's read-only flag.
-S <time>                 Sets the wait time before the drive enters power-saving mode.
-t                        Measures the drive's read performance.
-T                        Measures the drive's cache read performance.
-u <0 or 1>               Allows other interrupt requests to be serviced while the disk is being accessed.
-v                        Displays the drive's current settings.
-W <0 or 1>               Sets the drive's write cache.
-X <transfer mode>        Sets the transfer mode of the drive.
-y                        Puts the IDE drive into power-saving (standby) mode.
-Y                        Puts the IDE drive into sleep mode.
-Z                        Turns off the automatic power-saving function of some Seagate drives.

References:

http://hi.baidu.com/dmkj2008/blog/item/df3b031bb514abc1ac6e757f.html

http://smartmontools.sourceforge.net/man/smartd.conf.5.html

http://www.bsdlover.cn/html/32/n-5332.html

This article is original; if you reproduce it, please credit the author and source.

If you find any mistakes, please let me know.

Email: [Email protected]

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
