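First of all, this post is also a note for my own reference. I have been researching hard drives for the past few days, and TLER/ERC/CCTL has been a real headache. It matters a great deal for hardware RAID, because it is the main basis on which a drive is judged to have failed: once a drive exceeds the time limit configured through its S.M.A.R.T. interface, the hardware RAID controller automatically marks the drive as failed and may even drop it from the array. For a drive that does not support TLER/ERC/CCTL, the allowed time is effectively only 1 to 2 seconds, so the system marks the drive offline almost immediately.
Moreover, the drives in a RAID array are usually bought at the same time, so their components wear out over roughly the same period. If, around a drive replacement or during a rebuild, more drives are marked as failed at once than the array can tolerate, the data on the entire RAID is lost.
As for Linux software RAID (mdadm), the timeout for its Read Command Timer and Write Command Timer defaults to 30 seconds, far longer than the 7 to 13 seconds that drives supporting TLER/ERC/CCTL typically use by default. A drive that exceeds the timeout is still judged as failed and taken offline. A drive whose S.M.A.R.T. does not support TLER/ERC/CCTL is likewise marked as failed as soon as it hits a command timeout.
Accordingly, in NAS vendors' drive compatibility lists, any drive whose S.M.A.R.T. does not support TLER/ERC/CCTL is always listed as not recommended.
A drive may also support TLER/ERC/CCTL but ship with it set to Disable or 0 sec. In that case the recovery timer has no upper limit, so whether the drive is judged as failed depends on the 30-second default or on whatever timeout the vendor has configured. (Regular HGST and Toshiba drives default to Disable or 0 sec, so their recovery timer is unlimited.)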
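The timing relationship between the drive's own recovery limit and the host's command timeout can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the 30-second host timeout is the Linux default quoted in this post, and a real RAID controller may apply different rules.

```python
from typing import Optional


def drive_recovery_limit_s(erc_deciseconds: Optional[int]) -> Optional[float]:
    """Convert an ERC value (units of 100 ms) to seconds.

    None or 0 means the limit is disabled, i.e. recovery time is unbounded.
    """
    if not erc_deciseconds:
        return None
    return erc_deciseconds / 10.0


def drive_answers_in_time(erc_deciseconds: Optional[int],
                          host_timeout_s: float = 30.0) -> bool:
    """True if the drive gives up on recovery before the host timeout expires.

    Simplified model (an assumption): with a finite ERC limit below the host
    timeout, the drive reports the error itself; otherwise the host may time
    out first and take the drive offline.
    """
    limit = drive_recovery_limit_s(erc_deciseconds)
    return limit is not None and limit < host_timeout_s
```

For example, `drive_answers_in_time(70)` models a drive set to 7 seconds under the 30-second host timeout, while `drive_answers_in_time(0)` models a drive whose ERC is Disabled.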
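So TLER/ERC/CCTL is a very important criterion when choosing drives for RAID. Shock and resonance, by contrast, can be handled in hardware (a vibration-damping case plus anti-vibration pads).
WD and Seagate have both removed S.M.A.R.T. support for TLER/ERC/CCTL from their desktop-class drives, so it cannot be enabled or configured through S.M.A.R.T. software at all. Even if you buy nine painfully expensive WD Black drives, building a RAID from them is far riskier than using WD Red. In these two vendors' enterprise drives, the ERC time is 10 sec for Seagate and 7 sec for WD.
Toshiba and HGST, on the other hand, still support ERC/CCTL even in their desktop-class drives. The default is Disable or 0 sec, and S.M.A.R.T. software can be used to enable it and set the number of seconds. That is why NAS vendors list these two vendors' desktop-class drives as recommended. Toshiba's and HGST's enterprise drives also support ERC/CCTL, again with a default of Disable or 0 sec. (HGST's reply: this way there is no upper limit on recovery time; generic drives should all default to Disable, although certain special part numbers ship with a preset value.)
I roughly knew that Toshiba and HGST had not removed CCTL from their desktop-class drives, but is it actually active once the drive is connected to a computer? Is it really supported? That had long been a big question mark for me. Today I finally found a method and got a result! I also asked HGST, through the Taiwanese distributor 威健, about the default value of the ERC timer.
Here is a summary of what I learned, so that everyone can run the same checks themselves.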
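First, a few prerequisites:
1. The motherboard BIOS supports current AHCI or includes the relevant fixes, and S.M.A.R.T. is enabled in it. (My GA-990FXA-UD5 needed the F12 update to get the AHCI fix.)
2. Your SATA or IDE driver supports enabling S.M.A.R.T. (On my GA-990FXA-UD5, S.M.A.R.T. only became available with AMD SATA Controller driver version 1.2.1.331. The Marvell 88SE9172 needs driver 1.2.0.1020 to work properly.)
3. smartmontools is installed on the computer. Download it from http://sourceforge.net/projects/smartmontools/
When running the smartmontools installer on 64-bit Windows, select the 64-bit version.

There is another tool, HDAT2, but it has to run under pure DOS, and if you are not familiar with it you may accidentally erase the data on a drive, so I will not cover it here.
4. You need a drive connected to a SATA or IDE port that supports S.M.A.R.T.!
5. Each vendor has its own name for TLER/ERC/CCTL:
TLER: WD
ERC: Seagate
CCTL: Hitachi, HGST, Toshiba, Samsung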
Now for the actual checks:
- Run smartctl from smartmontools (in an administrator CMD window).


- Enter smartctl --scan to find the identifiers of the drives in the computer. (The option starts with two ASCII hyphens.)
C:\Program Files\smartmontools\bin>smartctl --scan
This computer has four drives:
/dev/sda (Hitachi HDS721010CLA332)
/dev/sdb (ST3640323AS)
/dev/sdc (Toshiba 3.5" HDD DT01ACA)
/dev/sdd (WDC WD10EARS-00Y5B1)
Enter the commands as shown in the screenshots.
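The scan step can also be scripted. The following is a minimal Python sketch, assuming smartctl is on the PATH and that each line of `smartctl --scan` output begins with the device path followed by options and a `#` comment (for example `/dev/sda -d ata # /dev/sda, ATA device`). The function names `parse_scan` and `scan_devices` are hypothetical helpers for this post.

```python
import subprocess
from typing import List


def parse_scan(output: str) -> List[str]:
    """Extract device paths from `smartctl --scan` output.

    Assumes the first whitespace-separated token of each non-empty,
    non-comment line is the device path.
    """
    devices = []
    for line in output.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        devices.append(line.split()[0])
    return devices


def scan_devices() -> List[str]:
    """Run `smartctl --scan` (needs administrator rights) and list devices."""
    result = subprocess.run(["smartctl", "--scan"],
                            capture_output=True, text=True, check=False)
    return parse_scan(result.stdout)
```

Keeping the parsing separate from the subprocess call makes it possible to check the parser against saved output without touching any hardware.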

- First, check all S.M.A.R.T. information for /dev/sdc (Toshiba 3.5" HDD DT01ACA): enter smartctl -a /dev/sdc
If no S.M.A.R.T. information comes back, or the output says S.M.A.R.T. is not supported, check that the RAID/IDE/AHCI controller driver is up to date and that S.M.A.R.T. is enabled in the BIOS.
| C:\Program Files\smartmontools\bin>smartctl -a /dev/sdc
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Toshiba 3.5" HDD DT01ACA…
Device Model: TOSHIBA DT01ACA200
Serial Number: 34VU2UVKS
LU WWN Device Id: 5 000039 ff3f5ae6d
Firmware Version: MX4OABB0
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Sun Jun 22 22:08:42 2014
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
General SMART Values: without error or no self-test has ever
SMART Attributes Data Structure revision number: 16
SMART Error Log Version: 1
SMART Self-test log structure revision number 1
SMART Selective self-test log data structure revision number 1 |
- Next, check the current CCTL state of /dev/sdc (Toshiba 3.5" HDD DT01ACA): enter smartctl -l scterc /dev/sdc
| C:\Program Files\smartmontools\bin>smartctl -l scterc /dev/sdc
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org
SCT Error Recovery Control:
Read: Disabled
Write: Disabled |
The result: SCT Error Recovery Control is Disabled.
(Key point: Disabled here does not mean the drive's ERC function is switched off. It means the drive's ERC time limit is unlimited, so the ERC setting of the RAID controller or software RAID decides whether to take the drive offline. More on this below.)
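The three possible results of `smartctl -l scterc` seen in this post (Disabled, a value in units of 100 ms, or "command not supported") can be told apart with a small parser. This is a hedged Python sketch that assumes the output wording matches the smartctl 6.2 transcripts in this post; `parse_scterc` is a hypothetical name.

```python
import re
from typing import Dict, Optional


def parse_scterc(output: str) -> Optional[Dict[str, int]]:
    """Parse `smartctl -l scterc` output.

    Returns None if the drive does not support the command; otherwise a
    dict with "read" and "write" in units of 100 ms, where 0 stands for
    Disabled (no upper limit on recovery time).
    """
    if "command not supported" in output:
        return None
    values = {}
    for name in ("Read", "Write"):
        match = re.search(rf"\b{name}:\s+(Disabled|\d+)", output)
        if match is None:
            return None
        text = match.group(1)
        values[name.lower()] = 0 if text == "Disabled" else int(text)
    return values
```

A result of `{"read": 0, "write": 0}` therefore corresponds to the Disabled state described above, not to a missing feature.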
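- Now set the upper time limit of the TLER function on /dev/sdc (Toshiba 3.5" HDD DT01ACA): enter smartctl -l scterc,70,70 /dev/sdc
(Note: this does not really switch ERC/TLER on; it only sets the upper time limit for ERC. On HGST and Toshiba drives with ERC, changing this setting is actually pointless, because it reverts to the default after a reboot, and the ERC setting of the RAID controller or software RAID can still decide the time limit after which the drive is taken offline.)
| C:\Program Files\smartmontools\bin>smartctl -l scterc,70,70 /dev/sdc
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org
SCT Error Recovery Control set to:
Read: 70 (7.0 seconds)
Write: 70 (7.0 seconds) |
The limit is now 7 seconds. (WD/Seagate enterprise drives mostly set the TLER/ERC/CCTL response time to around 7 to 10 seconds; Toshiba/HGST default to Disable.)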
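The two numbers after `scterc,` are in units of 100 ms, which is why 70 means 7.0 seconds. A minimal Python sketch of that conversion follows; `scterc_command` is a hypothetical helper that only builds the argument list and does not run anything.

```python
from typing import List


def scterc_command(device: str, read_seconds: float,
                   write_seconds: float) -> List[str]:
    """Build the smartctl argument list that sets the ERC limits.

    smartctl expects the limits in deciseconds (units of 100 ms),
    so 7 seconds becomes 70.
    """
    read_ds = round(read_seconds * 10)
    write_ds = round(write_seconds * 10)
    if read_ds < 0 or write_ds < 0:
        raise ValueError("ERC limits must not be negative")
    return ["smartctl", "-l", f"scterc,{read_ds},{write_ds}", device]
```

The returned list can be passed to `subprocess.run` from an administrator prompt; `scterc_command("/dev/sdc", 7, 7)` corresponds to the command typed above.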
- What does the S.M.A.R.T. information look like for a drive without support?
My /dev/sdd, a WDC WD10EARS-00Y5B1 (a 1 TB WD Caviar Green, according to the Model Family line in the output), happens to be completely unsupported.
| C:\Program Files\smartmontools\bin>smartctl -a /dev/sdd
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Caviar Green (AF)
Device Model: WDC WD10EARS-00Y5B1
Serial Number: WD-WCAV5A820423
LU WWN Device Id: 5 0014ee 2aeff8abf
Firmware Version: 80.00A80
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS (minor revision not indicated)
SATA Version is: SATA 2.6, 3.0 Gb/s
Local Time is: Sun Jun 22 22:13:02 2014
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
General SMART Values: without error or no self-test has ever
SMART Attributes Data Structure revision number: 16
SMART Error Log Version: 1
SMART Self-test log structure revision number 1
SMART Selective self-test log data structure revision number 1 |
- Checking with smartctl -l scterc /dev/sdd does not help either: the hardware simply does not support it, and it cannot be enabled through firmware or software. So never use this kind of drive in a RAID or NAS system.
| C:\Program Files\smartmontools\bin>smartctl -l scterc /dev/sdd
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org
SCT Error Recovery Control command not supported |
- Even my ancient Hitachi HDS721010CLA332 supports ERC/CCTL. The default is Disable, and it can be changed with smartctl -l scterc,100,100 /dev/sda. (The vendor's suggested upper limit is 10 sec, but its final recommendation is Disable.)
| C:\Program Files\smartmontools\bin>smartctl -a /dev/sda
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Hitachi Deskstar 7K1000.C
Device Model: Hitachi HDS721010CLA332
SCT capabilities: (0x003d) SCT Status supported. |
| C:\Program Files\smartmontools\bin>smartctl -l scterc /dev/sda
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org
SCT Error Recovery Control:
Read: Disabled
Write: Disabled |
| C:\Program Files\smartmontools\bin>smartctl -l scterc,100,100 /dev/sda
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org
SCT Error Recovery Control set to:
Read: 100 (10.0 seconds)
Write: 100 (10.0 seconds) |
- /dev/sdb (ST3640323AS) also supports ERC. Its default is also Disable, and it can likewise be set with smartctl -l scterc,100,100 /dev/sdb
| C:\Program Files\smartmontools\bin>smartctl -a /dev/sdb
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.11
Device Model: ST3640323AS
Serial Number: 9VK05ZCK
LU WWN Device Id: 5 000c50 00d8a191c
Firmware Version: SD1B
User Capacity: 640,133,946,880 bytes [640 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 7200 rpm
SCT capabilities: (0x103b) SCT Status supported. |
| C:\Program Files\smartmontools\bin>smartctl -l scterc /dev/sdb
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org
SCT Error Recovery Control:
Read: Disabled
Write: Disabled |
| C:\Program Files\smartmontools\bin>smartctl -l scterc,100,100 /dev/sdb
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org
SCT Error Recovery Control set to:
Read: 100 (10.0 seconds)
Write: 100 (10.0 seconds) |
But even after setting it, the value is gone as soon as you shut down and power the machine back on...
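Since the value does not survive a power cycle, one possible workaround (my suggestion, not something confirmed by any vendor in this post) is to reapply it from a startup script for the drives that accept the command. The Python sketch below is a rough illustration: `reapply_erc` is a hypothetical name, the device list is up to you, and the `run` parameter exists only so the logic can be exercised without real hardware.

```python
import subprocess
from typing import Callable, List


def reapply_erc(devices: List[str], deciseconds: int = 70,
                run: Callable = subprocess.run) -> List[str]:
    """Set read/write ERC on each device and return those that refused.

    A device counts as refused if smartctl exits with a non-zero status
    or prints a "not supported" message.
    """
    refused = []
    for device in devices:
        command = ["smartctl", "-l",
                   f"scterc,{deciseconds},{deciseconds}", device]
        result = run(command, capture_output=True, text=True)
        if result.returncode != 0 or "not supported" in result.stdout:
            refused.append(device)
    return refused
```

Called as `reapply_erc(["/dev/sda", "/dev/sdb"], deciseconds=100)` from a task that runs with administrator rights at boot, it would repeat the manual commands shown above.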
- What about the enterprise-class HGST Ultrastar 7K4000? The model I own is the HGST HUS724020ALA640.
| C:\Program Files\smartmontools\bin>smartctl -a /dev/sde
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: HGST HUS724020ALA640
Serial Number: PN2131P6GMEKTP
LU WWN Device Id: 5 000cca 22dc8d601
Firmware Version: MF6OAA70
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 7200 rpm
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Mon Jun 23 23:21:35 2014
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
General SMART Values: without error or no self-test has ever
SMART Attributes Data Structure revision number: 16
SMART Error Log Version: 1
SMART Self-test log structure revision number 1
SMART Selective self-test log data structure revision number 1 |
But the SCT Error Recovery Control of the HGST HUS724020ALA640 turned out to be Disabled! That was a bolt from the blue for me at the time, because the Taiwanese media kept stressing the importance of ERC time.
| C:\Program Files\smartmontools\bin>smartctl -l scterc /dev/sde
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-vista-sp2] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org
SCT Error Recovery Control:
Read: Disabled
Write: Disabled |
Checking with HDAT2 5.0 gives the same result: Read Command Timer and Write Command Timer are both 0...



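So should the enterprise-class HGST Ultrastar 7K4000 really not be used for hardware RAID storage? (The RAID built into Intel/AMD chipsets is hardware RAID, to say nothing of enterprise LSI RAID controllers.)
So I asked Mr. Chen at the Taiwanese distributor 威健 to put the relevant questions to HGST.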
|
HGST's answers to the questions relayed by 威健
|
| 1. Both are enterprise drives, yet the ERC of the Seagate Constellation ES.3 is 10 sec and WD's is 7 sec. Why is HGST set to Disable? Is this a fine tradition of its own?
=> If not default disable, what value should be set? Disable means no recovery time limit. Of course, we have some unique p/n who have unique default value. But generic drive should be default disable. |
| 2. According to the description on page 86, the specified upper limit of the ERC command is 10 sec (the same as the Seagate Constellation ES.3 default value). Please confirm whether the user can set this value themselves.
=> No, only unique p/n support. |
| 3. "These command timers are volatile. The default value is 0 (i.e. disable command time-out)." This sentence makes it clear that even if I set the command timers myself with S.M.A.R.T. software, they return to the default value after a cold boot. Is HGST willing to provide firmware that lets the user enable this function?
(Otherwise why would I buy enterprise drives for RAID? Without enabling Error Recovery Control and setting up the timer, a hardware RAID or high-end enterprise NAS that detects a disk error, even a false positive, will declare the disk failed and take it offline the very next second. No amount of reliability helps then.) => Setting are volatile, so return to default (disable) by power cycle. Disable means no recovery time limit. |
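So, according to HGST:
the default for generic drives should be Disable,
so that there is no upper limit on recovery time.
For now, all I can do is hope that QNAP's engineers wrote their software so that every drive supporting CCTL gets its Read Command Timer and Write Command Timer set to 7 seconds or longer.
In Linux software RAID, the command timeout of the SCSI disk layer defaults to 30 seconds.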
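On Linux the per-disk SCSI command timeout can be read from sysfs at /sys/block/<disk>/device/timeout, which holds the value in seconds. The Python sketch below shows one way to read it; the function names are hypothetical, and the `sysfs_root` parameter is only there so the path can be redirected for testing.

```python
from pathlib import Path


def parse_timeout(text: str) -> int:
    """Parse the contents of a sysfs timeout file (seconds as decimal text)."""
    return int(text.strip())


def scsi_timeout_seconds(disk: str, sysfs_root: str = "/sys") -> int:
    """Read the SCSI command timeout for a disk such as "sda" on Linux."""
    path = Path(sysfs_root) / "block" / disk / "device" / "timeout"
    return parse_timeout(path.read_text())
```

On a system with default settings, `scsi_timeout_seconds("sda")` would be expected to return the 30 seconds mentioned above.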
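TOSHIBA MG03ACA200: this drive is made in the Philippines. It looks almost identical to the Japan-only version (MD03ACA200), but the underside is clearly different.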
| C:\Program Files\smartmontools\bin>smartctl -a /dev/sdc
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-win8] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: TOSHIBA MG03ACA200
Serial Number: 53OHKA0MF
LU WWN Device Id: 5 000039 4cbf02379
Firmware Version: FL1A
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 7200 rpm
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ATA8-ACS (minor revision not indicated)
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Thu Jul 03 22:17:59 2014
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x80) Offline data collection activity was never started.
    Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run.
Total time to complete Offline data collection: ( 120) seconds.
Offline data collection capabilities: (0x5b) SMART execute Offline immediate.
    Auto Offline data collection on/off support.
    Suspend Offline collection upon new command.
    Offline surface scan supported.
    Self-test supported.
    No Conveyance Self-test supported.
    Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode.
    Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
    General Purpose Logging supported.
Short self-test routine recommended polling time: ( 2) minutes.
Extended self-test routine recommended polling time: ( 337) minutes.
SCT capabilities: (0x003d) SCT Status supported.
    SCT Error Recovery Control supported.
    SCT Feature Control supported.
    SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b 100   100   050    Pre-fail Always  -           0
  2 Throughput_Performance  0x0005 100   100   050    Pre-fail Offline -           0
  3 Spin_Up_Time            0x0027 100   100   001    Pre-fail Always  -           6206
  4 Start_Stop_Count        0x0032 100   100   000    Old_age  Always  -           3
  5 Reallocated_Sector_Ct   0x0033 100   100   050    Pre-fail Always  -           0
  7 Seek_Error_Rate         0x000b 100   100   050    Pre-fail Always  -           0
  8 Seek_Time_Performance   0x0005 100   100   050    Pre-fail Offline -           0
  9 Power_On_Hours          0x0032 100   100   000    Old_age  Always  -           0
 10 Spin_Retry_Count        0x0033 100   100   030    Pre-fail Always  -           0
 12 Power_Cycle_Count       0x0032 100   100   000    Old_age  Always  -           3
191 G-Sense_Error_Rate      0x0032 100   100   000    Old_age  Always  -           0
192 Power-Off_Retract_Count 0x0032 100   100   000    Old_age  Always  -           1
193 Load_Cycle_Count        0x0032 100   100   000    Old_age  Always  -           4
194 Temperature_Celsius     0x0022 100   100   000    Old_age  Always  -           32 (Min/Max 23/32)
196 Reallocated_Event_Count 0x0032 100   100   000    Old_age  Always  -           0
197 Current_Pending_Sector  0x0032 100   100   000    Old_age  Always  -           0
198 Offline_Uncorrectable   0x0030 100   100   000    Old_age  Offline -           0
199 UDMA_CRC_Error_Count    0x0032 200   253   000    Old_age  Always  -           0
220 Disk_Shift              0x0002 100   100   000    Old_age  Always  -           0
222 Loaded_Hours            0x0032 100   100   000    Old_age  Always  -           0
223 Load_Retry_Count        0x0032 100   100   000    Old_age  Always  -           0
224 Load_Friction           0x0022 100   100   000    Old_age  Always  -           0
226 Load-in_Time            0x0026 100   100   000    Old_age  Always  -           103
240 Head_Flying_Hours       0x0001 100   100   001    Pre-fail Offline -           0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
   1       0       0 Not_testing
   2       0       0 Not_testing
   3       0       0 Not_testing
   4       0       0 Not_testing
   5       0       0 Not_testing
Selective self-test flags (0x0):
    After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay. |
| C:\Program Files\smartmontools\bin>smartctl -l scterc /dev/sdc
smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-win8] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, http://www.smartmontools.org
SCT Error Recovery Control:
Read: Disabled
Write: Disabled |
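Whether a drive advertises ERC at all can be read from the "SCT capabilities" part of the smartctl -a output, as in the MG03ACA200 listing. The Python sketch below pulls out the model name and that capability flag. It assumes the normal multi-line smartctl output (one field per line) and the wording shown in this post; `erc_capability` is a hypothetical name.

```python
import re
from typing import Optional, Tuple


def erc_capability(smartctl_a_output: str) -> Tuple[Optional[str], bool]:
    """Return (device model, whether SCT Error Recovery Control is listed).

    The model is None when no "Device Model:" line is present.
    """
    model = re.search(r"Device Model:\s*(.+)", smartctl_a_output)
    supported = "SCT Error Recovery Control supported" in smartctl_a_output
    return (model.group(1).strip() if model else None, supported)
```

A drive that lists the capability may still report Read/Write as Disabled, as the output above shows, so this check only tells you whether setting a limit is possible.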



