Mac OS X is beating your hard drives to death. Here’s the fix.

You read that right.

Quick link to the fix before I get to my usual rambling: hdapm. Install it and it will automatically set itself to auto-start on each boot and disable the auto park feature for all your drives.

Under Linux you can also use the hdparm command. Please note that you still need to fix your Mac OS X system with hdapm, though, as OS X will by default reset the drive’s power management on each boot!!
hdparm -B 255 [device]
or, if that throws an error
hdparm -B 254 [device]

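[device] is usually /dev/sda. A value of 255 turns APM off entirely; 254 is the highest-performance level that still leaves APM on, for drives that refuse 255.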
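The try-255-then-254 sequence can be wrapped in a small function. This is only a sketch: the name disable_apm is made up here, it assumes hdparm is on the PATH and that you run it as root, and it assumes hdparm exits non-zero when the drive rejects a value, which may not hold for every hdparm version.

```shell
#!/bin/sh
# Sketch: try to switch APM off outright (255); if the drive refuses,
# fall back to the least aggressive APM level (254).
# Assumption: hdparm returns a non-zero exit status on failure.
disable_apm() {
    dev=$1
    if hdparm -B 255 "$dev"; then
        echo "$dev: APM disabled (255)"
    elif hdparm -B 254 "$dev"; then
        echo "$dev: APM set to 254"
    else
        echo "$dev: could not change APM level" >&2
        return 1
    fi
}
```

Call it as root with something like `disable_apm /dev/sda`, then confirm the result with `hdparm -B /dev/sda`.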

My usual rambling as to the background on this problem follows…. 🙂

Inside each of your modern hard disk drives, there is a head “lifter” ramp that the heads are parked on when the disks are not spinning. On older drives, they parked on the media, but times were different back then… the bit densities were lower, the heads floated on a thicker cushion of air, and more importantly, there was room for a layer of lubricant material to be baked onto the disks, kind of like the nonstick coating on cookware. This prevented a condition known as ‘stiction’, in which the heads stick to a disk once they settle onto it. On a modern drive, if you somehow get the heads onto a platter while it is spun down, they will stick instantly and tenaciously. The drive usually has firmware routines that induce vibration and make all sorts of silly noises to shake them loose if it happens…

Anyway, the lifter ramp is not a bad idea in itself. Mobile hard drives used it for years to keep the heads safely locked away and prevent scratching / head “crashes” when the drive experiences shock and vibe in handling while powered off. Later drives would also retract the heads if they detected vibration nearing limits using a small accelerometer on the drive. Another variant also used the accelerometer to detect if the system was entering a free fall and would park the heads before impact.

Unfortunately, some goofbag, probably at Western Digital, did some testing and figured out that a drive left spinning with the heads unloaded used less power due to reduced aerodynamic drag but was still reasonably fast to return to service on user interaction. They based the Caviar Green series drives on this “feature” and it seemed okay….

Until the hundreds of thousands of load/unload cycles destroyed the drives in very very short order. OOPSIE NOODLES!!

Many other hard drives also support this same method of operation, under the Advanced Power Management (APM) feature set, but do not enable it by default. The Western Decrepit drives enforce it by default unless you hit them with the wdidle3 utility and disable it.

Welp, guess what Apple decided to enable, by default, to be all “helpful”?

Here are SMART readouts from a potpourri of Mac systems and drives stewing in my pot.

Note that many hard drives are specified for 300,000 lifetime load/unload cycles. Under aggressive power management settings in average use, the drive may reach this in only a couple of months!!
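The arithmetic behind that kind of estimate is easy to sketch. The helper name days_to_rated_cycles is made up for illustration, and the 200-cycles-per-hour rate in the example is an assumed figure, not a measurement; plug in whatever rate your own drive shows.

```shell
#!/bin/sh
# days_to_rated_cycles CYCLES_PER_HOUR [RATED_CYCLES]
# Prints how many days of continuous power-on time it takes to reach the
# rated load/unload count at a steady parking rate.
# RATED_CYCLES defaults to the 300,000 figure quoted above.
days_to_rated_cycles() {
    awk -v rate="$1" -v rated="${2:-300000}" 'BEGIN {
        if (rate + 0 <= 0) { print "rate must be positive" > "/dev/stderr"; exit 1 }
        printf "%.1f\n", rated / rate / 24
    }'
}

# Example (assumed rate of 200 parks per hour, one every 18 seconds):
#   days_to_rated_cycles 200    prints 62.5
```

At that assumed rate the rated count is used up in about two months of uptime.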

When the load mechanism wears out, the drive usually exhibits a rapidly increasing number of read errors. You can usually get your data back, but no guarantees here – I did see one just show up stone dead suddenly when the ramps wore through and the heads BROKE RIGHT OFF.

Model Family:     Seagate Momentus 7200.4
Device Model:     ST9500420AS
Serial Number:    fnord!
LU WWN Device Id: 5 000c50 029ecdd91
Firmware Version: 0002SDM1
User Capacity:    500,107,862,016 bytes [500 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Tue Nov 11 15:14:41 2014 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 105 099 006 Pre-fail Always - 7939035
3 Spin_Up_Time 0x0003 100 100 085 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 098 098 020 Old_age Always - 2705
5 Reallocated_Sector_Ct 0x0033 098 098 036 Pre-fail Always - 58
7 Seek_Error_Rate 0x000f 070 060 030 Pre-fail Always - 73182408209
9 Power_On_Hours 0x0032 093 093 000 Old_age Always - 6174
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 098 098 020 Old_age Always - 2683
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
188 Command_Timeout 0x0032 100 099 000 Old_age Always - 4295032833
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 063 052 045 Old_age Always - 37 (Min/Max 28/41)
191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 160
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 53
193 Load_Cycle_Count 0x0032 001 001 000 Old_age Always - 403063
194 Temperature_Celsius 0x0022 037 048 000 Old_age Always - 37 (0 16 0 0 0)
195 Hardware_ECC_Recovered 0x001a 043 033 000 Old_age Always - 7939035
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 5677946770670
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 1580450194
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 3181821075
254 Free_Fall_Sensor 0x0032 100 100 000 Old_age Always - 0

=== START OF INFORMATION SECTION ===
Device Model: APPLE HDD HTS541010A9E662
Serial Number: Weebles wobble but they don't fall down
LU WWN Device Id: 5 000cca 6c6cf5b84
Firmware Version: JA0AB560
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ATA8-ACS T13/1699-D revision 6
SATA Version is: SATA 2.6, 3.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Tue Nov 11 15:45:25 2014 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 062 Pre-fail Always - 0
2 Throughput_Performance 0x0005 100 100 040 Pre-fail Offline - 0
3 Spin_Up_Time 0x0007 174 174 033 Pre-fail Always - 1
4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 125
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 100 100 040 Pre-fail Offline - 0
9 Power_On_Hours 0x0012 063 063 000 Old_age Always - 16321
10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 125
160 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
191 G-Sense_Error_Rate 0x000a 100 100 000 Old_age Always - 0
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 11
193 Load_Cycle_Count 0x0012 065 065 000 Old_age Always - 353303
194 Temperature_Celsius 0x0002 222 222 000 Old_age Always - 27 (Min/Max 15/54)
195 Hardware_ECC_Recovered 0x000a 100 100 000 Old_age Always - 0
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0
223 Load_Retry_Count 0x000a 100 100 000 Old_age Always - 0
254 Free_Fall_Sensor 0x0032 100 100 000 Old_age Always - 5

Where it gets worrisome and silly: the second set of SMART results (the APPLE HDD HTS541010A9E662) is from a Mac Mini *SERVER*, running Mac OS X *SERVER*. Why, Apple? Why did you feel the need to make a SERVER OS aggressively try to save power at the expense of turning the whole thing prematurely into e-waste? WAS THIS REALLY AN IMPROVEMENT?!

Update: this little freakshow. OVER ONE MILLION. SUUUUUPER JAAAACKPOT!!! YOU RULE THE UNIVERSE! TROLL, DAMSEL, PEASANT, CATAPULT, JOUST MULTIBALL MADNESS! THE STORM IS COMING, RETURN TO YOUR HOME! DO NOT PANIC! DOHO!!

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Momentus 5400.6
Device Model:     ST9500325AS
Serial Number:    I AM A BANANA.
LU WWN Device Id: 5 000c50 0376f5c6f
Firmware Version: 0002BSM1
User Capacity:    500,107,862,016 bytes [500 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 2.6, 1.5 Gb/s
Local Time is:    Tue Nov 11 16:08:09 2014 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 113 099 006 Pre-fail Always - 54515315
3 Spin_Up_Time 0x0003 098 097 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 057 057 020 Old_age Always - 44108
5 Reallocated_Sector_Ct 0x0033 093 093 036 Pre-fail Always - 145
7 Seek_Error_Rate 0x000f 072 060 030 Pre-fail Always - 125055254112
9 Power_On_Hours 0x0032 072 072 000 Old_age Always - 24555
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 037 020 Old_age Always - 171
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 099 099 000 Old_age Always - 1
188 Command_Timeout 0x0032 100 099 000 Old_age Always - 60130459662
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 065 054 045 Old_age Always - 35 (Min/Max 22/41)
191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 122
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 0
193 Load_Cycle_Count 0x0032 001 001 000 Old_age Always - 1025477
194 Temperature_Celsius 0x0022 035 046 000 Old_age Always - 35 (0 22 0 0 0)
195 Hardware_ECC_Recovered 0x001a 052 051 000 Old_age Always - 54515315
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0

SMART Error Log Version: 1
No Errors Logged
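To put dumps like these on a common footing, you can divide Load_Cycle_Count (attribute 193) by Power_On_Hours (attribute 9). The sketch below assumes smartctl -A style output with the raw value in the tenth column, as in the readouts above; the function name is made up for illustration.

```shell
#!/bin/sh
# load_cycles_per_hour: read `smartctl -A` output on stdin and print the
# average number of load/unload cycles per power-on hour.
load_cycles_per_hour() {
    awk '
        $1 == 9   { hours  = $10 + 0 }   # Power_On_Hours raw value
        $1 == 193 { cycles = $10 + 0 }   # Load_Cycle_Count raw value
        END {
            if (hours <= 0) { print "no usable Power_On_Hours" > "/dev/stderr"; exit 1 }
            printf "%.1f\n", cycles / hours
        }'
}

# Usage on a live system (device path is an example):
#   smartctl -A /dev/disk0 | load_cycles_per_hour
```

Fed the three readouts above, this works out to roughly 65.3, 21.6 and 41.8 parks per power-on hour. These are lifetime averages, so idle stretches pull the numbers down and busy periods will run much higher.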

12 comments

  1. I have used hdapm successfully in the past to fix this issue
    on stock 750G disks that came with MacBookPro 8,3 (2011, 17in).

    Any advice on getting this to work on a 2.5″ 2T Samsung
    ST2000LM003 HD0M201RAD? (installed in a MacBookPro 8,3)

    Using hdparm -B 254 seems to work from inside
    a live Ubuntu 14.04.1 Desktop DVD boot, but rebooted
    into Mac OS X, hdapm returns:

    disk0: ST2000LM003 HD0M201RAD
    Set APM Level to 0xfe: FAILED: APM not supported

      1. I am having the same issue with the 2TB Seagate. For weeks I got “success” with hdapm, then noticed a system slowdown and saw that the log now showed “failed” and “APM not supported”. I booted Ubuntu (from CD; I could not get USB to work on my early-2011 MBP) and played with hdparm. -B 254 worked, but -i showed “Disabled (255)” no matter what, even though -B reported 254. Back in OS X 10.9.4, hdapm still failed. I booted Ubuntu again and just ran -B, which said the level was still 254, despite -i still saying disabled. So I figured I was good even if hdapm failed. However, on the next OS X boot, hdapm reported success. I don’t know why, or why it didn’t after the first boot. Maybe I did too much with hdparm that first time. Maybe a full power-off instead of a restart between boots kills the setting. I hope not, but my laptop usually runs all day anyway. Maybe I’ll have to endure the slow Ubuntu CD boot a few times a year.
        I still don’t know what made the drive stop having success with hdapm after it initially worked for the first few boots. If only someone would port hdparm to OS X! Why hasn’t this been done?

        TLDR: if hdapm still fails after setting -B 254 in Ubuntu, try starting Ubuntu again and just run “sudo hdparm -B /dev/sda”. Reboot into OS X and hdapm might work. If -B does not report 254 on the second boot, I don’t know; maybe try -B 254 a second time. Also, try to leave Linux with a restart rather than a shutdown, so the Mac never fully powers down.

  2. After not getting hdapm to work (still on
    Mac OS X 10.6.8), I finally gave up and did
    something like this, which stops the
    constant parking by overwriting a tiny file
    every 7 sec. No problems for a few years
    on a few machines.

    cheers,
    marty

    #! /bin/sh

    ### /usr/local/sbin/stoppark
    # stoppark: stop Samsung 2T head parks (hdapm fails)

    ### setup
    prog=$0
    logfile=/Users/$USER/Library/Logs/StopPark.log
    mypid=$$
    # collect the PIDs of every running copy of this script
    ps auxwww | grep "$prog" | grep "/bin/sh" | grep -v grep | awk '{print $2}' > "$logfile"
    pids=$(cat "$logfile")

    ### at most one copy
    running=""
    for pid in $pids; do
        if [ "$pid" != "$mypid" ]; then running="$running $pid"; fi
    done
    if [ "$running" != "" ]; then
        logger "$prog: ### already running (pid=${running})"
        rm -f "$logfile"
        exit 0
    fi

    ### stop head park by writing a tiny file every 7 sec (drive attempts park every 8 sec)
    logger "$prog: running (pid=$mypid)"
    i=0
    while true; do
        echo "$i" > "$logfile"
        sleep 7
        i=$((i + 1))
    done

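    ### HOWTO install auto-start for user "sereno"
    #vi ~/Library/LaunchAgents/org.sereno.stoppark.plist
    #------------------------------------------------------------
    #<?xml version="1.0" encoding="UTF-8"?>
    #<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    #<plist version="1.0">
    #<dict>
    # <key>Label</key>
    # <string>org.sereno.stoppark</string>
    # <key>ProgramArguments</key>
    # <array>
    # <string>/usr/local/sbin/stoppark</string>
    # </array>
    # <key>KeepAlive</key>
    # <true/>
    # <key>StandardOutPath</key>
    # <string>/Users/sereno/Library/Logs/StopPark.log</string>
    # <key>StandardErrorPath</key>
    # <string>/Users/sereno/Library/Logs/StopPark.log</string>
    #</dict>
    #</plist>
    #------------------------------------------------------------
    ### HOWTO interactively start
    #launchctl load ~/Library/LaunchAgents/org.sereno.stoppark.plist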
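One way to check whether a workaround like this is really keeping the heads loaded is to read Load_Cycle_Count twice, some minutes apart, and compare. A rough sketch, assuming smartmontools is installed; the function names and the /dev/disk0 path in the commented usage lines are placeholders.

```shell
#!/bin/sh
# cycles_between BEFORE AFTER SECONDS
# Given two Load_Cycle_Count readings taken SECONDS apart, print the
# number of new load cycles and the equivalent hourly rate.
cycles_between() {
    awk -v a="$1" -v b="$2" -v s="$3" 'BEGIN {
        if (s + 0 <= 0 || b + 0 < a + 0) exit 1
        printf "%d new cycles, %.1f per hour\n", b - a, (b - a) * 3600 / s
    }'
}

# Usage sketch (run as root; device path is an example):
#   read_lcc() { smartctl -A "$1" | awk '$1 == 193 { print $10 }'; }
#   before=$(read_lcc /dev/disk0); sleep 600; after=$(read_lcc /dev/disk0)
#   cycles_between "$before" "$after" 600
```

If the count barely moves over ten minutes of light use, the drive is no longer parking every few seconds.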
