This guide outlines the steps to diagnose hard drive issues, ranging from system freezes (I/O Wait) to physical hardware failure.
Before inspecting the disk itself, check whether it is the bottleneck causing system lag or “freezes”.
Run this command to watch system activity in real time (one sample per second):
vmstat 1
What to look for:
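Since the guide ties freezes to I/O wait, one way to scan the `vmstat` output is to flag samples where the `wa` (I/O wait) column is high. The following is a minimal sketch, not a definitive check: the function name `flag_iowait` and the default 20% threshold are assumptions made for illustration, not values from this guide.

```shell
# Reads vmstat output on stdin and prints samples whose I/O wait exceeds a limit.
# Assumption: the limit (default 20 percent) is illustrative; tune it for your system.
flag_iowait() {
  awk -v limit="${1:-20}" '
    # Column header row: find the "wa" and "b" columns by name
    $1 == "r" {
      for (i = 1; i <= NF; i++) {
        if ($i == "wa") wa = i
        if ($i == "b") b = i
      }
      next
    }
    # Data rows start with a number; report those above the limit
    wa && $1 ~ /^[0-9]+$/ && $wa + 0 > limit {
      printf "high iowait: wa=%s%% blocked=%s\n", $wa, $b
    }
  '
}
```

For example, `vmstat 1 10 | flag_iowait 20` checks ten one-second samples. Keep in mind that the first sample `vmstat` prints is an average since boot, so it says little about the current moment.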
If the disk is disconnecting or timing out, the kernel will log it.
dmesg | grep -i "error\|fail\|ata\|scsi"
Red Flags:
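The pattern `ata` in the filter above also matches unrelated words such as "data", so the result can be noisy. A narrower filter might look like the sketch below. The function name `filter_disk_errors` and the exact keyword list (including `timeout` and `reset`) are assumptions chosen for illustration; adjust them to what your kernel actually logs.

```shell
# Reads kernel log text on stdin and keeps lines that mention a disk-like
# device (ataN, scsi, sdX) followed by a failure keyword, or an I/O error.
# Assumption: the keyword list is illustrative, not exhaustive.
filter_disk_errors() {
  grep -iE '((^|[^a-z])ata[0-9]+|scsi|sd[a-z]).*(error|fail|timeout|reset)|I/O error'
}
```

It can be used as `dmesg | filter_disk_errors`. On some systems reading the kernel log requires root privileges.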
Use ``smartctl`` to query the drive's internal health logs.
This command filters out the noise and shows only the critical health indicators:
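smartctl -a /dev/sdX | grep -iE "(Health|Error|Reallocated|Pending|Uncorrectable|CRC|Load_Cycle|Power_On)"
Replace /dev/sdX with your drive (e.g., /dev/sda). The -i flag makes the match case-insensitive so that the “overall-health” result line on ATA drives is included.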
Here is how to read the attributes based on our diagnosis:
If the raw value of any of these is greater than 0, the drive is dying and must be replaced immediately.
These indicate why a drive might be slow or unreliable, even if it is reported as “Healthy”.
Ignore high raw values for:
On Seagate drives, these are internal counters, not error counts. Only worry if the normalized “VALUE” drops to or below “THRESH”.
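The VALUE versus THRESH comparison can be automated against the attribute table that `smartctl -A` prints. The sketch below assumes the usual ATA column layout (VALUE in column 4, THRESH in column 6) and uses "at or below" as the failure condition; the function name `below_threshold` is hypothetical.

```shell
# Reads `smartctl -A` output on stdin and prints attributes whose normalized
# VALUE is at or below THRESH.
# Assumption: standard ATA attribute table layout with at least 10 columns.
below_threshold() {
  awk '
    # Attribute rows start with a numeric ID
    $1 ~ /^[0-9]+$/ && NF >= 10 && $4 + 0 <= $6 + 0 {
      print $2 ": VALUE " $4 " <= THRESH " $6
    }
  '
}
```

A typical invocation would be `smartctl -A /dev/sdX | below_threshold`; no output means no attribute has reached its threshold.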
If a drive is healthy but causes RAID arrays to freeze during sync (speed drops to ~700 KB/s), it is likely an SMR (Shingled Magnetic Recording) drive.
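For Linux software RAID (md), the current resync speed appears in `/proc/mdstat` as `speed=...K/sec`. The sketch below flags speeds under roughly 1 MB/s. It assumes an md array; the function name `slow_resync` and the 1024 KB/s default are illustrative choices, and a slow resync alone does not prove a drive is SMR.

```shell
# Reads /proc/mdstat-style text on stdin and reports resync speeds below a limit.
# Assumption: the limit defaults to 1024 KB/s (about 1 MB/s).
slow_resync() {
  awk -v limit="${1:-1024}" '
    match($0, /speed=[0-9]+K\/sec/) {
      # Strip the leading "speed=" (6 chars) and trailing "K/sec" (5 chars)
      kb = substr($0, RSTART + 6, RLENGTH - 11) + 0
      if (kb < limit) print "slow resync: " kb " KB/s"
    }
  '
}
```

It can be run as `slow_resync < /proc/mdstat` while a sync is in progress.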
Symptoms:
Solution:
| Attribute | Value | Verdict |
|---|---|---|
| Reallocated / Pending | > 0 | REPLACE IMMEDIATELY (Dead) |
| Load Cycle Count | > 600,000 | WARNING (Mechanical Wear) |
| CRC Errors | > 0 | CHECK CABLE |
| Resync Speed | < 1 MB/s | SMR DRIVE (Unsuitable for RAID) |
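The SMART rows of the table above can be applied mechanically to `smartctl -A` output. This is a rough sketch under stated assumptions: it expects the standard ATA attribute table with the raw value in column 10, matches attribute names by substring, and uses the hypothetical function name `smart_verdict`. The thresholds are the ones from the table.

```shell
# Reads `smartctl -A` output on stdin and prints a verdict per the summary table.
# Assumption: raw value is the 10th column of each attribute row.
smart_verdict() {
  awk '
    # Skip anything that is not an attribute row
    $1 !~ /^[0-9]+$/ || NF < 10 { next }
    { name = $2; raw = $10 + 0 }
    name ~ /Reallocated|Pending/ && raw > 0 { print name "=" raw ": REPLACE IMMEDIATELY" }
    name ~ /Load_Cycle/ && raw > 600000     { print name "=" raw ": WARNING (mechanical wear)" }
    name ~ /CRC/ && raw > 0                 { print name "=" raw ": CHECK CABLE" }
  '
}
```

For example, `smartctl -A /dev/sdX | smart_verdict` prints one line per attribute that crosses a threshold and nothing for a drive that passes all three checks.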