To check a hard disk for errors, run
smartctl -q errorsonly -H -l selftest -l error /dev/sda
To list all SMART data, run
smartctl -a -d ata /dev/sda
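The full listing is long, so it can help to pull out a single attribute. The following is a minimal sketch, assuming the usual smartctl attribute table layout in which the attribute name is the second column and the raw value is the tenth. The function name smart_raw_value is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the raw value of one SMART attribute.
# Reads smartctl attribute output on stdin.
# Assumes the name is column 2 and the raw value is column 10.
smart_raw_value() {
    # $1 = attribute name, for example Reallocated_Sector_Ct
    awk -v name="$1" '$2 == name { print $10 }'
}

# Example use (device name is an assumption):
#   smartctl -A /dev/sda | smart_raw_value Reallocated_Sector_Ct
```

A non-zero raw value for attributes such as Reallocated_Sector_Ct or Current_Pending_Sector is commonly treated as a warning sign, although how vendors report raw values varies.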
Here is the output for a hard disk that is failing:
[root@3blogger ~]# smartctl -q errorsonly -H -l selftest -l error /dev/sda
ATA Error Count: 1056 (device log contains only the most recent five errors)
Error 1056 occurred at disk power-on lifetime: 49173 hours (2048 days + 21 hours)
Error 1055 occurred at disk power-on lifetime: 49173 hours (2048 days + 21 hours)
Error 1054 occurred at disk power-on lifetime: 49173 hours (2048 days + 21 hours)
Error 1053 occurred at disk power-on lifetime: 49173 hours (2048 days + 21 hours)
Error 1052 occurred at disk power-on lifetime: 49173 hours (2048 days + 21 hours)
[root@3blogger ~]#
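To run this check from cron or a monitoring script, the error count can be extracted from the output. This is only a sketch: it assumes the "ATA Error Count:" line looks like the one above, and the function name ata_error_count is invented for the example.

```shell
#!/bin/sh
# Hypothetical helper: print the number that follows "ATA Error Count:".
# Reads smartctl output on stdin and prints 0 when no such line is present.
ata_error_count() {
    awk '/^ATA Error Count:/ { n = $4 } END { print n + 0 }'
}

# Example use (device name is an assumption):
#   count=$(smartctl -q errorsonly -l error /dev/sda | ata_error_count)
#   if [ "$count" -gt 0 ]; then
#       echo "WARNING: /dev/sda has logged $count ATA errors"
#   fi
```

For the failing disk shown here, the function would print 1056.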
This server had the following messages in /var/log/messages:
Mar 10 00:09:20 3blogger kernel: ata1.00: exception Emask 0x0 SAct 0x4008 SErr 0x0 action 0x0
Mar 10 00:09:20 3blogger kernel: ata1.00: irq_stat 0x40000008
Mar 10 00:09:20 3blogger kernel: ata1.00: failed command: READ FPDMA QUEUED
Mar 10 00:09:20 3blogger kernel: ata1.00: cmd 60/08:18:c0:f6:d4/00:00:54:00:00/40 tag 3 ncq 4096 in
Mar 10 00:09:20 3blogger kernel: res 51/40:03:c5:f6:d4/00:00:54:00:00/40 Emask 0x409 (media error) <F>
Mar 10 00:09:20 3blogger kernel: ata1.00: status: { DRDY ERR }
Mar 10 00:09:20 3blogger kernel: ata1.00: error: { UNC }
Mar 10 00:09:20 3blogger kernel: ata1.00: configured for UDMA/133
Mar 10 00:09:20 3blogger kernel: ata1: EH complete
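UNC means the drive reported an uncorrectable read error, and the "res" line carries the sector address where it happened. The sketch below decodes that address, assuming the libata taskfile layout status/error:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device. That layout is my reading of the kernel log format, and the function name res_to_lba is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: decode the 48-bit LBA from a libata "res" or "cmd" string.
# Assumed field order after splitting on "/" and ":":
#   $4=lbal $5=lbam $6=lbah $9=hob_lbal $10=hob_lbam $11=hob_lbah
# The body runs in a subshell so the IFS change does not leak to the caller.
res_to_lba() (
    IFS='/:'
    set -- $1
    # Most significant byte first: hob_lbah hob_lbam hob_lbal lbah lbam lbal
    printf '%d\n' "0x${11}${10}${9}${6}${5}${4}"
)

# Example use with the "res" value from the log above:
#   res_to_lba '51/40:03:c5:f6:d4/00:00:54:00:00/40'
```

With the values from this log, the "res" string decodes to LBA 1423242949 (0x54d4f6c5) and the "cmd" string to LBA 1423242944 (0x54d4f6c0). Under the same assumption, the failing sector lies inside the 8-sector (4096-byte) read that the command requested, which is consistent with "ncq 4096 in".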