Checking whether a hard drive supports TLER/ERC/CCTL

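First of all, this post is partly a reference for my own future lookups. I have spent the past few days studying hard drives, and TLER/ERC/CCTL turned out to be a real headache. The feature matters a great deal for hardware RAID because it bears directly on whether a drive is judged to have failed: once a drive exceeds the configured time limit, the hardware RAID controller automatically marks it as failed and may even drop it from the array. With a drive that does not support TLER/ERC/CCTL, the tolerated failure time is effectively only 1 to 2 seconds, so the system marks the drive offline almost immediately.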

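Moreover, the drives in a RAID array are usually bought at the same time, so their components wear out over roughly the same period. If, around a drive replacement or during a rebuild, more drives are marked as failed at once than the array can tolerate, the data on the entire RAID is lost.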

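As for Linux software RAID (mdadm), the timeout for the Read Command Timer and Write Command Timer defaults to 30 seconds (this is the command timeout of the SCSI disk layer), well above the 7 to 13 seconds that drives with TLER/ERC/CCTL typically default to. A drive that exceeds that time is still judged as failed and taken offline. A drive that does not support TLER/ERC/CCTL is likewise marked as failed as soon as it hits a command timeout.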
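
The relationship between the two timers can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name `erc_fits_host_timeout` is made up for this example, the drive-side limit is expressed in units of 100 ms (the unit smartctl uses for scterc), and 0 stands for Disabled, that is, no limit.

```python
def erc_fits_host_timeout(erc_deciseconds: int, host_timeout_s: float = 30.0) -> bool:
    """Return True if the drive gives up on an error before the host does.

    erc_deciseconds: drive-side ERC limit in units of 100 ms; 0 means
                     Disabled (no upper limit on recovery time).
    host_timeout_s:  host-side command timeout in seconds (30 is the
                     Linux default mentioned in the text).
    """
    if erc_deciseconds <= 0:
        # No drive-side limit: recovery may outlast any host timeout.
        return False
    return erc_deciseconds / 10.0 < host_timeout_s


if __name__ == "__main__":
    for erc in (0, 70, 100):
        print(erc, erc_fits_host_timeout(erc))
```

A 7-second limit (70) fits comfortably under the 30-second host timeout, while a Disabled drive never does, which is exactly the case where the host-side timer ends up making the decision.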

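In NAS vendors' drive compatibility lists, any drive that does not support TLER/ERC/CCTL is invariably listed as not recommended.
For drives that do support TLER/ERC/CCTL but ship with it set to Disable or 0 sec, the recovery timer has no upper limit, so whether the drive is judged as failed depends on the 30-second default or on whatever timeout the vendor has configured. (Ordinary HGST and Toshiba drives all default to Disable or 0 sec, so their recovery timer is unlimited.)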

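So TLER/ERC/CCTL is a very important criterion when choosing drives for RAID. Shock and resonance, by contrast, can be dealt with in hardware (a vibration-damping case plus anti-vibration pads).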

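WD and Seagate have both removed TLER/ERC/CCTL support from their desktop-class drives, so it cannot be enabled or configured with S.M.A.R.T. software at all. Even if you buy nine outrageously expensive WD Black drives, building a RAID out of them is far riskier than using WD Red. On the enterprise drives of these two brands, the ERC time is 10 sec for Seagate and 7 sec for WD.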

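Toshiba and HGST, on the other hand, still support ERC/CCTL even on desktop-class drives. The default is Disable or 0 sec, and S.M.A.R.T. software can be used to turn it on and set the number of seconds. That is why NAS vendors list both makers' desktop-class drives as recommended. The enterprise-class drives from Toshiba and HGST also support ERC/CCTL, again with a default of Disable or 0 sec. (HGST's reply: this way the recovery time has no upper limit; generic drives should all default to Disable, though certain special part numbers ship with a preset value.)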

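I roughly knew that Toshiba and HGST had not removed CCTL from their desktop-class drives, but is it actually active once the drive is connected to a computer? Is it really supported? That had long been a big question mark for me. Today I finally worked out an answer and a method. I also asked HGST, through its Taiwan distributor Weikeng (威健), about the default value of the ERC timer.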

Here is a summary of what I learned, so that everyone can run the same checks themselves.

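First, a few prerequisites:
1. The motherboard BIOS supports the latest AHCI or includes the relevant fixes (my GA-990FXA-UD5 had to be updated to F12 to fix an AHCI problem), and S.M.A.R.T. is enabled in the BIOS.
2. Your SATA or IDE driver supports S.M.A.R.T. (On my GA-990FXA-UD5, S.M.A.R.T. only became available with AMD SATA Controller driver 1.2.1.331; the Marvell 88SE9172 needs driver 1.2.0.1020 to work properly.)
3. smartmontools is installed on the computer. Download it from http://sourceforge.net/projects/smartmontools/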

When running the smartmontools installer on 64-bit Windows, select the 64-bit version.

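There is also a tool called HDAT2, but it has to run in pure DOS, and if you are not familiar with it you can wipe a drive's data by mistake, so I will not cover it here.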

4. A drive must be connected to a SATA or IDE port that supports S.M.A.R.T.

5. TLER/ERC/CCTL is the same feature under each vendor's own name:
TLER: WD
ERC: Seagate
CCTL: Hitachi, HGST, Toshiba, Samsung


Now for the actual check.

  1. Run smartctl from smartmontools in a Command Prompt opened as administrator.

  2. Type smartctl --scan to list the device names of the drives in the computer. (That is two hyphens before scan; WordPress tends to turn them into one long dash.)
    C:\Program Files\smartmontools\bin>smartctl --scan

This computer has four drives:
/dev/sda (Hitachi HDS721010CLA332)
/dev/sdb (ST3640323AS)
/dev/sdc (Toshiba 3.5" HDD DT01ACA)
/dev/sdd (WDC WD10EARS-00Y5B1)
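
If you want to loop over the drives in a script rather than read the list by eye, the scan output can be parsed. The sketch below is hedged: the sample lines are illustrative and follow the usual `smartctl --scan` layout (device name, `-d` type, then a `#` comment) rather than being copied from my machine, and `parse_scan` is a name invented for this example.

```python
from typing import List


def parse_scan(output: str) -> List[str]:
    """Extract device names such as /dev/sda from `smartctl --scan` text."""
    devices = []
    for line in output.splitlines():
        # Drop the trailing "# ..." comment, then take the first field.
        fields = line.split("#", 1)[0].split()
        if fields and fields[0].startswith("/dev/"):
            devices.append(fields[0])
    return devices


if __name__ == "__main__":
    sample = (
        "/dev/sda -d ata # /dev/sda, ATA device\n"
        "/dev/sdb -d ata # /dev/sdb, ATA device\n"
    )
    print(parse_scan(sample))
```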

Enter the commands as shown in the transcripts below.

  3. First, show all S.M.A.R.T. information for /dev/sdc (Toshiba 3.5" HDD DT01ACA) by typing smartctl -a /dev/sdc

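If the command prints no S.M.A.R.T. information, or reports that S.M.A.R.T. is not supported, check that the RAID/IDE/AHCI controller driver is up to date and that S.M.A.R.T. is enabled in the BIOS.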

C:\Program Files\smartmontools\bin>smartctl -a /dev/sdc
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Toshiba 3.5″ HDD DT01ACA…
Device Model: TOSHIBA DT01ACA200
Serial Number: 34VU2UVKS
LU WWN Device Id: 5 000039 ff3f5ae6d
Firmware Version: MX4OABB0
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Sun Jun 22 22:08:42 2014
SMART support is: Available – device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x80) Offline data collection activity
was never started.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (14344) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 239) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported. (This line means TLER/ERC/CCTL is supported.)
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   100   100   054    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0007   136   136   024    Pre-fail  Always       -       268 (Average 287)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       18
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   020    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       3
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       17
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       18
193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       18
194 Temperature_Celsius     0x0002   176   176   000    Old_age   Always       -       34 (Min/Max 25/37)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
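
Scrolling through that much output just to find one line gets tedious, so the check can be automated. A minimal sketch, assuming you have already captured the text of `smartctl -a` into a string; `supports_sct_erc` is a hypothetical helper name, and the only thing it relies on is the "SCT Error Recovery Control supported" line seen in the output above.

```python
def supports_sct_erc(smartctl_a_output: str) -> bool:
    """Return True if `smartctl -a` text lists SCT Error Recovery Control.

    Only lines that begin with the capability name count, so a remark that
    merely mentions the phrase in parentheses is not treated as support.
    """
    return any(
        line.strip().startswith("SCT Error Recovery Control supported")
        for line in smartctl_a_output.splitlines()
    )


if __name__ == "__main__":
    excerpt = (
        "SCT capabilities: (0x003d) SCT Status supported.\n"
        "SCT Error Recovery Control supported.\n"
        "SCT Feature Control supported.\n"
        "SCT Data Table supported.\n"
    )
    print(supports_sct_erc(excerpt))
```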

  4. Next, check the current CCTL state of /dev/sdc (Toshiba 3.5" HDD DT01ACA) by typing smartctl -l scterc /dev/sdc
C:\Program Files\smartmontools\bin>smartctl -l scterc /dev/sdc
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org

SCT Error Recovery Control:
Read: Disabled
Write: Disabled

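The result: SCT Error Recovery Control is Disabled.
(Key point: Disabled here does not mean the drive's ERC function is switched off. It means the drive's ERC time limit is unlimited, so the ERC setting of the RAID controller or software RAID decides whether to take the drive offline. More on this below.)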
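
The possible answers from `smartctl -l scterc` in this post are "Disabled", a number in units of 100 ms, or "command not supported". Below is a small parsing sketch covering those three cases. It is my own illustration, not part of smartmontools: `parse_scterc` is an invented name and it only understands the message formats quoted in this post.

```python
import re
from typing import Dict, Optional


def parse_scterc(output: str) -> Optional[Dict[str, Optional[int]]]:
    """Parse `smartctl -l scterc` text.

    Returns None when the drive rejects the command. Otherwise returns a
    dict with "read" and "write" limits in units of 100 ms, where None
    means Disabled (no time limit). The dict is empty if no Read/Write
    lines are found.
    """
    if "SCT Error Recovery Control command not supported" in output:
        return None
    limits: Dict[str, Optional[int]] = {}
    pattern = r"^\s*(Read|Write):\s*(Disabled|\d+)"
    for key, value in re.findall(pattern, output, re.M):
        limits[key.lower()] = None if value == "Disabled" else int(value)
    return limits


if __name__ == "__main__":
    text = "SCT Error Recovery Control:\nRead: Disabled\nWrite: Disabled\n"
    print(parse_scterc(text))
```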

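  5. Now set the upper time limit for TLER on /dev/sdc (Toshiba 3.5" HDD DT01ACA) by typing smartctl -l scterc,70,70 /dev/sdc
    (Note: strictly speaking this does not turn ERC/TLER on; it only sets the allowed upper time limit for ERC. On HGST and Toshiba drives with ERC, making this change is of little practical use: the setting reverts to the default after a reboot, and the ERC setting of the RAID controller or software RAID still determines how long a drive may take before it is taken offline.)
C:\Program Files\smartmontools\bin>smartctl -l scterc,70,70 /dev/sdc
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org

SCT Error Recovery Control set to: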
Read: 70 (7.0 seconds)
Write: 70 (7.0 seconds)

The limit is now 7 seconds. (WD/Seagate enterprise drives mostly set their TLER/ERC/CCTL response time to around 7 to 10 seconds; Toshiba/HGST default to Disabled.)
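
Note the unit: the numbers passed to scterc are tenths of a second, which is why 70 is reported as 7.0 seconds. A tiny helper makes the conversion explicit. This is just a sketch; `scterc_argument` is a name I made up, and it only builds the option string without running anything.

```python
def scterc_argument(read_seconds: float, write_seconds: float) -> str:
    """Build the value for smartctl's -l option from limits in seconds.

    smartctl expects the limits in units of 100 ms, so 7 seconds is 70.
    """
    read_ds = round(read_seconds * 10)
    write_ds = round(write_seconds * 10)
    if read_ds < 0 or write_ds < 0:
        raise ValueError("time limits must not be negative")
    return f"scterc,{read_ds},{write_ds}"


if __name__ == "__main__":
    # Prints the option used in the step above.
    print("smartctl -l", scterc_argument(7, 7), "/dev/sdc")
```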

  6. What does the S.M.A.R.T. information look like for a drive without support?
    My computer's /dev/sdd, a WDC WD10EARS-00Y5B1 (WD Caviar Green 1TB), happens to lack support entirely.
C:\Program Files\smartmontools\bin>smartctl -a /dev/sdd
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Caviar Green (AF)
Device Model: WDC WD10EARS-00Y5B1
Serial Number: WD-WCAV5A820423
LU WWN Device Id: 5 0014ee 2aeff8abf
Firmware Version: 80.00A80
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS (minor revision not indicated)
SATA Version is: SATA 2.6, 3.0 Gb/s
Local Time is: Sun Jun 22 22:13:02 2014
SMART support is: Available – device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x84) Offline data collection activity
was suspended by an interrupting command
from host.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (20460) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 236) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x3035) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
(No "SCT Error Recovery Control supported" line appears here.)

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   148   128   021    Pre-fail  Always       -       5600
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       45
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   066   066   000    Old_age   Always       -       25090
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       40
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       22
193 Load_Cycle_Count        0x0032   003   003   000    Old_age   Always       -       591010
194 Temperature_Celsius     0x0022   114   098   000    Old_age   Always       -       33
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     25029         -
# 2  Extended offline    Completed without error       00%     24963         -
# 3  Conveyance offline  Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

  7. Running smartctl -l scterc /dev/sdd to check does not help either: the drive simply does not support the feature, and it cannot be enabled through firmware or software. Never use this kind of drive in a RAID or NAS system.
C:\Program Files\smartmontools\bin>smartctl -l scterc /dev/sdd
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org

SCT Error Recovery Control command not supported

  8. Even my ancient Hitachi HDS721010CLA332 supports ERC/CCTL. The default is Disabled, and it can be changed with smartctl -l scterc,100,100 /dev/sda. (The manufacturer's suggested upper limit is 10 sec, but its final recommendation is Disable.)
C:\Program Files\smartmontools\bin>smartctl -a /dev/sda
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Hitachi Deskstar 7K1000.C
Device Model: Hitachi HDS721010CLA332

SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

C:\Program Files\smartmontools\bin>smartctl -l scterc /dev/sda
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org

SCT Error Recovery Control:
Read: Disabled
Write: Disabled

C:\Program Files\smartmontools\bin>smartctl -l scterc,100,100 /dev/sda
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org

SCT Error Recovery Control set to:
Read: 100 (10.0 seconds)
Write: 100 (10.0 seconds)

  9. /dev/sdb, an ST3640323AS, also supports ERC. Its default is also Disabled, and it can likewise be set with smartctl -l scterc,100,100 /dev/sdb

C:\Program Files\smartmontools\bin>smartctl -a /dev/sdb
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.11
Device Model: ST3640323AS
Serial Number: 9VK05ZCK
LU WWN Device Id: 5 000c50 00d8a191c
Firmware Version: SD1B
User Capacity: 640,133,946,880 bytes [640 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 7200 rpm

SCT capabilities: (0x103b) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

C:\Program Files\smartmontools\bin>smartctl -l scterc /dev/sdb
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org

SCT Error Recovery Control:
Read: Disabled
Write: Disabled

C:\Program Files\smartmontools\bin>smartctl -l scterc,100,100 /dev/sdb
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org

SCT Error Recovery Control set to:
Read: 100 (10.0 seconds)
Write: 100 (10.0 seconds)

But the setting does not survive: once the machine is shut down and powered on again, it is gone.
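
Since the value is lost on every power cycle, the only way to keep it is to send the command again after each boot. Below is a rough sketch of a script that could be run at startup. Everything around the smartctl command itself is my assumption: the helper names are invented, the device list is the one from this post, smartctl must be on the PATH, and the script needs administrator rights. By default it only prints the commands (dry run).

```python
import subprocess
from typing import Iterable, List


def build_scterc_commands(devices: Iterable[str], deciseconds: int = 70) -> List[List[str]]:
    """Return one smartctl command (as an argument list) per device."""
    option = f"scterc,{deciseconds},{deciseconds}"
    return [["smartctl", "-l", option, dev] for dev in devices]


def apply_scterc(devices: Iterable[str], deciseconds: int = 70, dry_run: bool = True) -> None:
    """Print the commands, or run them when dry_run is False."""
    for cmd in build_scterc_commands(devices, deciseconds):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=False: a drive without ERC support should not abort the loop.
            subprocess.run(cmd, check=False)


if __name__ == "__main__":
    apply_scterc(["/dev/sda", "/dev/sdb", "/dev/sdc"])
```

Whether this is worth doing depends on the setup; as noted earlier, the RAID controller or software RAID has its own timeout as well.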

  10. What about the HGST enterprise-class Ultrastar 7K4000? The model I own is the HGST HUS724020ALA640.
C:\Program Files\smartmontools\bin>smartctl -a /dev/sde
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model: HGST HUS724020ALA640
Serial Number: PN2131P6GMEKTP
LU WWN Device Id: 5 000cca 22dc8d601
Firmware Version: MF6OAA70
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 7200 rpm
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Mon Jun 23 23:21:35 2014
SMART support is: Available – device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x84) Offline data collection activity
was suspended by an interrupting command
from host.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 28) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 322) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   136   136   054    Pre-fail  Offline      -       80
  3 Spin_Up_Time            0x0007   128   128   024    Pre-fail  Always       -       486 (Average 488)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       18
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   145   145   020    Pre-fail  Offline      -       24
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       763
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       16
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       22
193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       22
194 Temperature_Celsius     0x0002   153   153   000    Old_age   Always       -       39 (Min/Max 25/48)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

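But the SCT Error Recovery Control of the HGST HUS724020ALA640 turned out to be Disabled! At the time this came as a bolt from the blue, because the Taiwanese media kept stressing how important the ERC time is.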

C:\Program Files\smartmontools\bin>smartctl -l scterc /dev/sde
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org

SCT Error Recovery Control:
Read: Disabled
Write: Disabled

Checking with HDAT2 5.0 gives the same result: the Read Command Timer and Write Command Timer are both 0.

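So does that mean the HGST enterprise-class Ultrastar 7K4000 should not be used for hardware RAID storage? (The RAID built into Intel/AMD chipsets is hardware RAID, to say nothing of enterprise LSI RAID controllers.)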

So I asked Mr. Chen at the Taiwan distributor Weikeng (威健) to put the relevant questions to HGST on my behalf.

HGST's answers to the questions relayed through Weikeng (威健)

1. Among enterprise drives, the Seagate Constellation ES.3 has an ERC of 10 sec and WD has 7 sec. Why is HGST's Disabled? Is this some proud tradition unique to HGST?

=> If not default disable, what value should be set? Disable means no recovery time limit.

Of course, we have some unique p/n who have unique default value. But generic drive should be default disable.

2. According to page 86, the specified upper limit for the ERC command is 10 sec (the same as the Seagate Constellation ES.3 default value). Please confirm whether users can set this value themselves.

=> No, only unique p/n support.

3. "These command timers are volatile. The default value is 0 (i.e. disable command time-out)." This sentence makes it clear that even if I set the command timers myself with S.M.A.R.T. software, a cold boot returns them to the default value. Is HGST willing to provide firmware that lets users enable this feature?

(Otherwise why would I buy enterprise drives for RAID? Without enabling Error Recovery Control and setting up the timer, hardware RAID and high-end enterprise NAS units will, the moment they detect a disk error (even a false alarm), declare the disk failed and take it offline the very next second. No amount of reliability helps then.)

=> Setting are volatile, so return to default (disable) by power cycle. Disable means no recovery time limit.

So, according to HGST:

Generic drives should default to Disable,

so that there is no upper limit on recovery time.

All I can do now is hope that QNAP's engineers wrote their software to set the Read Command Timer and Write Command Timer of every CCTL-capable drive to 7 seconds or longer.

For Linux software RAID, the command timeout of the SCSI disk layer defaults to 30 seconds.
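
On a Linux machine that value can be read per drive from sysfs, in the file /sys/block/<device>/device/timeout. A small sketch for reading it, with `read_command_timeout` as a made-up helper name; the `sysfs_root` parameter exists only so the function can be tried against a test directory.

```python
from pathlib import Path


def read_command_timeout(device: str, sysfs_root: str = "/sys/block") -> int:
    """Return the kernel's command timeout in seconds for a disk such as "sda"."""
    timeout_file = Path(sysfs_root) / device / "device" / "timeout"
    return int(timeout_file.read_text().strip())


if __name__ == "__main__":
    # Linux only; "sda" is just an example device name.
    print(read_command_timeout("sda"))
```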


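TOSHIBA MG03ACA200: this drive is made in the Philippines. It looks almost identical to the Japan-only version (MD03ACA200), but the underside is clearly different.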

C:\Program Files\smartmontools\bin>smartctl -a /dev/sdc
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-win8] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model:     TOSHIBA MG03ACA200
Serial Number:    53OHKA0MF
LU WWN Device Id: 5 000039 4cbf02379
Firmware Version: FL1A
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    7200 rpm
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Thu Jul 03 22:17:59 2014
SMART support is: Available – device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status:  (0x80) Offline data collection activity
was never started.
Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection:                (  120) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 337) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   050    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   100   100   050    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0027   100   100   001    Pre-fail  Always       -       6206
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       3
  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   050    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   050    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       0
 10 Spin_Retry_Count        0x0033   100   100   030    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       3
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       1
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       4
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       32 (Min/Max 23/32)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   253   000    Old_age   Always       -       0
220 Disk_Shift              0x0002   100   100   000    Old_age   Always       -       0
222 Loaded_Hours            0x0032   100   100   000    Old_age   Always       -       0
223 Load_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
224 Load_Friction           0x0022   100   100   000    Old_age   Always       -       0
226 Load-in_Time            0x0026   100   100   000    Old_age   Always       -       103
240 Head_Flying_Hours       0x0001   100   100   001    Pre-fail  Offline      -       0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]
SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
1        0        0  Not_testing
2        0        0  Not_testing
3        0        0  Not_testing
4        0        0  Not_testing
5        0        0  Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

C:\Program Files\smartmontools\bin>smartctl -l scterc /dev/sdc
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-win8] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org
SCT Error Recovery Control:
Read: Disabled
Write: Disabled

Wrong Deskstar 7K4000 4TB model numbers found only in Taiwan: H31K40003272SA, H31K4003254SA, H31KNAS4003272SA

Around May 29, HGST began selling its Deskstar 7K4000 4TB (H3IK40003272SA) in Taiwan, yet every online store in Taiwan lists the model number as H31K40003272SA.

HGST 昱科 (Hitachi) 4TB 3.5-inch 7200 rpm 64MB cache SATA3 hard drive (H31K40003272SA)

http://24h.pchome.com.tw/prod/DRAB37-A84289607?q=/S/DRAB37

But I could not find that model number anywhere on HGST's website or on sites outside Taiwan.

http://www.hgst.com/tech/techlib.nsf/techdocs/A9AF74F697524DFD882577F9000CF8BD/$file/Desktop_IDK_ds.pdf

The correct model number of the Deskstar 7K4000 4TB sold in Taiwan is H3IK40003272SA.

I do not know which idiot typed it as H31K40003272SA (a 1 in place of the I), creating a wrong model number that can only be found by searching in Taiwan.

I passed this information on to Autobuy, and they replied as follows:

It looks like the vendor's original product copy was wrong from the start…

Autobuy later corrected its listing, but unfortunately some errors remain uncorrected:
H31K40003272SA

PCHOME online shopping has had it wrong from day one and still does:
H31K40003272SA
http://24h.pchome.com.tw/prod/DRAB37-A84289607?q=/S/DRAB37

And this is not the only HGST model number that PCHOME online shopping has wrong:
H31KNAS4003272SA

http://24h.pchome.com.tw/prod/DRAB37-A84290092?q=/S/DRAB37

The correct model number for H31KNAS4003272SA is H3IKNAS4003272SA (again, the I was typed as 1).
H3IKNAS4003272SA
See http://www.hgst.com/tech/techlib.nsf/techdocs/E24F75A14027DBAC88257C74007DB2AD/$file/DS_NAS_ds.pdf

H31K4003254SA

http://24h.pchome.com.tw/prod/DRAB0G-A90052CCZ?q=/S/DRAB37

The correct model number for H31K4003254SA is H3IK4003254SA (again, the I was typed as 1).
See http://www.hgst.com/tech/techlib.nsf/techdocs/A9AF74F697524DFD882577F9000CF8BD/$file/Desktop_IDK_ds.pdf

H3IK4003254SA