Example of dangerous smartctl status

From vpsget wiki

Below are two examples from our experience: a disk whose SMART health status is still OK but whose output already contains critical warnings, and a disk whose SMART status is definitely bad.


1. SMART health is OK, but this disk should be replaced because the SMART Self-test log reports "Failed in segment -->":

smartctl -a -d megaraid,6 /dev/sda 
Vendor:               SEAGATE 
Product:              ST3450857SS     
Revision:             0008
User Capacity:        450,098,159,616 bytes [450 GB]
Logical block size:   512 bytes
Logical Unit id:      0x5000c50067af9277
Serial number:        6SK24K2K0000B3380QYL
Device type:          disk
Transport protocol:   SAS
Local Time is:        Sat Dec  5 18:03:17 2020 CET
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK 

Current Drive Temperature:     31 C
Drive Trip Temperature:        68 C
Elements in grown defect list: 260
Vendor (Seagate) cache information
 Blocks sent to initiator = 831226979
 Blocks received from initiator = 3341429567
 Blocks read from cache and sent to initiator = 719449590
 Number of read and write commands whose size <= segment size = 2274586818
 Number of read and write commands whose size > segment size = 0
Vendor (Seagate/Hitachi) factory information
 number of hours powered up = 65461.05
 number of minutes until next internal SMART test = 3

Error counter log:
          Errors Corrected by           Total   Correction     Gigabytes    Total
              ECC          rereads/    errors   algorithm      processed    uncorrected
          fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   3841834232      430         0  3841834662   3841834700     253224.329          37
write:         0        0        22        22         39     105260.642           4
verify: 51475275       90         0  51475365   51475451       1351.656          19

Non-medium error count:      698

[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on'] 

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
    Description                              number   (hours)
# 1  Background short  Failed in segment -->       -   65416         874194141 [0x3 0x11 0x0]
# 2  Background short  Completed                   -    8594                 - [-   -    -]
Long (extended) Self Test duration: 4800 seconds [80.0 minutes]
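
Because the health line says OK here, a check that reads only "SMART Health Status" would miss this disk. The following is a minimal Python sketch of how the self-test log and the grown defect list could be checked as well. The function names and the SAMPLE text are hypothetical, and the parsing assumes the SCSI/SAS output layout shown above; ATA disks print a different format.

```python
import re
from typing import List, Optional


def failed_self_tests(smartctl_output: str) -> List[str]:
    """Return self-test log rows whose text contains 'Failed' (hypothetical helper)."""
    failed = []
    for line in smartctl_output.splitlines():
        # Self-test rows start with "# <number>", e.g. "# 1  Background short  Failed in segment -->"
        m = re.match(r"#\s*(\d+)\s+(.*)", line)
        if m and "Failed" in m.group(2):
            failed.append(line.strip())
    return failed


def grown_defects(smartctl_output: str) -> Optional[int]:
    """Return the 'Elements in grown defect list' value, or None if the line is absent."""
    m = re.search(r"Elements in grown defect list:\s*(\d+)", smartctl_output)
    return int(m.group(1)) if m else None


# Lines copied from the output above; in practice this would be the full smartctl -a text.
SAMPLE = """\
SMART Health Status: OK
Elements in grown defect list: 260
SMART Self-test log
# 1  Background short  Failed in segment -->       -   65416         874194141 [0x3 0x11 0x0]
# 2  Background short  Completed                   -    8594                 - [-   -    -]
"""

if failed_self_tests(SAMPLE):
    print("self-test failure found, defects:", grown_defects(SAMPLE))
```

What counts as too many grown defects is a policy decision that this sketch does not make; it only extracts the number.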


2. This disk has a BAD SMART status (FAILURE PREDICTION THRESHOLD EXCEEDED), so replace it immediately; it is dying.

smartctl -a -d megaraid,7 /dev/sda 
Vendor:               SEAGATE 
Product:              ST3450857SS     
Revision:             0008
User Capacity:        450,098,159,616 bytes [450 GB]
Logical block size:   512 bytes
Logical Unit id:      0x5000c50067afa77b
Serial number:        6SK24JVZ0000B3391F19
Device type:          disk
Transport protocol:   SAS
Local Time is:        Sat Dec  5 18:04:01 2020 CET
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: FAILURE PREDICTION THRESHOLD EXCEEDED [asc=5d, ascq=0] 

Current Drive Temperature:     30 C
Drive Trip Temperature:        68 C
Elements in grown defect list: 186
Vendor (Seagate) cache information
  Blocks sent to initiator = 3152506554
  Blocks received from initiator = 585382373
  Blocks read from cache and sent to initiator = 2997086586
  Number of read and write commands whose size <= segment size = 1228523291
  Number of read and write commands whose size > segment size = 0
Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 65460.53
  number of minutes until next internal SMART test = 17

Error counter log:
          Errors Corrected by           Total   Correction     Gigabytes    Total
              ECC          rereads/    errors   algorithm      processed    uncorrected
          fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   1576586841       37         0  1576586878   1576587292     124750.677         414
write:         0        0         1         1          2      59790.074           0
verify:     3079        7         0      3086       3144          0.121           6 

Non-medium error count:        1

[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Aborted (device reset ?)    -   65431                 - [-   -    -]
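
For this second case the health line alone is enough to raise an alarm. The following is a minimal Python sketch that extracts the health status and the "Total uncorrected errors" column from the error counter log. The function names and SAMPLE are hypothetical, and the code assumes the SCSI/SAS layout shown above, where the uncorrected count is the last field of each read/write/verify row.

```python
from typing import Dict, Optional


def health_status(smartctl_output: str) -> Optional[str]:
    """Return the text after 'SMART Health Status:', or None if the line is absent."""
    for line in smartctl_output.splitlines():
        if line.startswith("SMART Health Status:"):
            return line.split(":", 1)[1].strip()
    return None


def uncorrected_errors(smartctl_output: str) -> Dict[str, int]:
    """Map 'read'/'write'/'verify' to the last column of the error counter log."""
    totals = {}
    for line in smartctl_output.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name.strip() in ("read", "write", "verify") and rest.split():
            # Assumption: the last field of the row is "Total uncorrected errors".
            totals[name.strip()] = int(rest.split()[-1])
    return totals


# Lines copied from the output above; in practice this would be the full smartctl -a text.
SAMPLE = """\
SMART Health Status: FAILURE PREDICTION THRESHOLD EXCEEDED [asc=5d, ascq=0]
read:   1576586841       37         0  1576586878   1576587292     124750.677         414
write:         0        0         1         1          2      59790.074           0
verify:     3079        7         0      3086       3144          0.121           6
"""

status = health_status(SAMPLE)
if status != "OK":
    print("replace disk:", status, uncorrected_errors(SAMPLE))
```

As the first example shows, treating any status other than "OK" as a failure is necessary but not sufficient: a disk can report OK and still need replacement.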