Checking hard disk health with smartctl

To check a hard disk for errors, run

smartctl -q errorsonly -H -l selftest -l error /dev/sda
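
With -q errorsonly the command prints nothing for a healthy disk, so a script usually looks at the exit status instead. smartctl returns a bit mask, described under EXIT STATUS in its man page. Below is a minimal sketch for decoding it; the function name decode_smartctl_status is hypothetical and not part of smartmontools.

```shell
#!/bin/sh
# decode_smartctl_status is a hypothetical helper, not part of smartmontools.
# It prints one line for each bit set in a smartctl exit status
# (bit meanings as listed under EXIT STATUS in the smartctl man page).
decode_smartctl_status() {
    status="$1"
    [ $((status & 1)) -ne 0 ]   && echo "command line did not parse"
    [ $((status & 2)) -ne 0 ]   && echo "device open failed"
    [ $((status & 4)) -ne 0 ]   && echo "a SMART or ATA command to the disk failed"
    [ $((status & 8)) -ne 0 ]   && echo "SMART status check returned DISK FAILING"
    [ $((status & 16)) -ne 0 ]  && echo "prefail attributes at or below threshold"
    [ $((status & 32)) -ne 0 ]  && echo "attributes were at or below threshold in the past"
    [ $((status & 64)) -ne 0 ]  && echo "device error log contains records of errors"
    [ $((status & 128)) -ne 0 ] && echo "device self-test log contains records of errors"
    return 0
}

# Usage (as root):
#   smartctl -q errorsonly -H -l selftest -l error /dev/sda
#   decode_smartctl_status $?
```

A status of 0 prints nothing, which means smartctl found no problems.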

To list all SMART data, run

smartctl -a -d ata /dev/sda
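
The full listing is long. If you only care about bad sectors, the attribute table can be filtered. This is a rough sketch, assuming the usual ten-column attribute table with the raw value in the last column. Attribute names vary between vendors, so the three names used here are an assumption, and report_bad_sectors is a hypothetical helper name.

```shell
#!/bin/sh
# report_bad_sectors is a hypothetical helper. It reads a smartctl attribute
# table on stdin and prints the name and raw value of sector-related
# attributes whose raw value is greater than zero.
# Assumes column 2 is the attribute name and column 10 is the raw value.
report_bad_sectors() {
    awk '$2 == "Reallocated_Sector_Ct" ||
         $2 == "Current_Pending_Sector" ||
         $2 == "Offline_Uncorrectable" {
             if ($10 + 0 > 0) print $2, $10
         }'
}

# Usage (as root):
#   smartctl -A -d ata /dev/sda | report_bad_sectors
```

No output means none of the three attributes has a nonzero raw value, or the disk reports them under different names.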

Here is the output for a hard disk that is failing.

[root@3blogger ~]# smartctl -q errorsonly -H -l selftest -l error /dev/sda
ATA Error Count: 1056 (device log contains only the most recent five errors)
Error 1056 occurred at disk power-on lifetime: 49173 hours (2048 days + 21 hours)
Error 1055 occurred at disk power-on lifetime: 49173 hours (2048 days + 21 hours)
Error 1054 occurred at disk power-on lifetime: 49173 hours (2048 days + 21 hours)
Error 1053 occurred at disk power-on lifetime: 49173 hours (2048 days + 21 hours)
Error 1052 occurred at disk power-on lifetime: 49173 hours (2048 days + 21 hours)

[root@3blogger ~]# 

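This server had the following messages in /var/log/messages. The failed READ FPDMA QUEUED command with a UNC (uncorrectable) media error means the drive could not read a sector.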

Mar 10 00:09:20 3blogger kernel: ata1.00: exception Emask 0x0 SAct 0x4008 SErr 0x0 action 0x0
Mar 10 00:09:20 3blogger kernel: ata1.00: irq_stat 0x40000008
Mar 10 00:09:20 3blogger kernel: ata1.00: failed command: READ FPDMA QUEUED
Mar 10 00:09:20 3blogger kernel: ata1.00: cmd 60/08:18:c0:f6:d4/00:00:54:00:00/40 tag 3 ncq 4096 in
Mar 10 00:09:20 3blogger kernel:         res 51/40:03:c5:f6:d4/00:00:54:00:00/40 Emask 0x409 (media error) <F>
Mar 10 00:09:20 3blogger kernel: ata1.00: status: { DRDY ERR }
Mar 10 00:09:20 3blogger kernel: ata1.00: error: { UNC }
Mar 10 00:09:20 3blogger kernel: ata1.00: configured for UDMA/133
Mar 10 00:09:20 3blogger kernel: ata1: EH complete
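
To see how often this happens, you can count the UNC lines per ATA port. This is a minimal sketch, assuming the syslog line format shown above; count_unc_errors is a hypothetical helper name, and the log path differs between distributions.

```shell
#!/bin/sh
# count_unc_errors is a hypothetical helper. It counts kernel
# "error: { UNC }" lines per ATA port in a syslog-style file
# and prints one "port count" pair per line.
count_unc_errors() {
    log_file="${1:-/var/log/messages}"
    awk '/kernel: ata[0-9]+\.[0-9]+: error: [{].*UNC/ {
             # Find the field holding the port name, e.g. "ata1.00:".
             for (i = 1; i <= NF; i++)
                 if ($i ~ /^ata[0-9]+\.[0-9]+:$/) {
                     sub(/:$/, "", $i)
                     count[$i]++
                 }
         }
         END { for (port in count) print port, count[port] }' "$log_file"
}

# Usage (as root):
#   count_unc_errors /var/log/messages
```

A count that keeps growing over days is a good reason to back up the data and replace the disk.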
