Hard Drive Testing

From FreekiWiki
Revision as of 15:38, 21 June 2012 by Psullivan (talk | contribs)

This document is currently being rewritten to reflect updates to the disktest program - Patrick 6/21/12

Ideally, hard drive testing only needs to happen twice a day: once in the morning for a batch of smaller drives (100GB and less), and once for a batch of larger drives (larger than 100GB) that runs through the afternoon and overnight if necessary.

Setting Up

If there are finished hard drives already on the racks then proceed to Finishing a Batch.

  1. Grab the red tray and head into TARDIS (the locked storage room) then open up the big brown lockbox.
  2. Hopefully the drives in the box are fairly well organized by size and interface. If this is the first batch of the day you'll want to grab smaller drives; if it's the second batch you can go for the larger ones. Load up the tray with an equal number of IDE- and SATA-connected hard drives. Don't forget about the 2.5" (laptop-sized) drives on the smaller top shelf! Be sure to lock up the big brown box again when you're done.
  3. Take the tray of drives over to one of the wiping racks and start connecting them to the boards. Most boards have 4 connections with various mixtures of IDE and SATA cables.
    1. Try to keep all the drives connected to a single board around the same size so we don't have 3 smaller drives finished and waiting around sucking electricity while the 1 larger drive is still finishing.
    2. Look the drives over for identifying information while you're connecting them; be sure you can clearly identify the model and serial number of the drive. You may need this information for sorting the failed drives from the passed drives later on. If you can't find the serial number on a drive then make sure you're attaching it to a board with other drives you can positively identify so you can use process of elimination when identifying the finished drives.
    3. Check the jumpers on IDE drives and make sure they're set to Master. For most drives this is set by a single jumper positioned vertically between the two pins closest to the IDE pins; check the labels on the drives as they will generally indicate if they require a different arrangement (or default to Master with no jumpers at all).
  4. As soon as you've connected all the drives to a single board, turn it on. You can move on to hooking up another board while the first one boots up and does the initial check of the drives.
  5. Once a board has finished booting and Disktest has started (you will see a list of the detected hard drives and a prompt asking to start the test), proceed to Starting Disktest.
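
The advice in step 3.1 (keep drives of similar size on the same board) can be sketched in a few lines of Python. This is only an illustration, not part of Disktest: the function name group_for_boards, the (label, capacity) tuples, and the default of 4 ports per board are assumptions made for the example, and the sketch ignores each board's particular mix of IDE and SATA cables.

```python
from typing import List, Tuple

# Hypothetical representation of a drive: (label you wrote down, capacity in GB).
Drive = Tuple[str, float]


def group_for_boards(drives: List[Drive], ports_per_board: int = 4) -> List[List[Drive]]:
    """Sort drives by capacity, then split them into board-sized groups.

    Neighbouring drives in the sorted order have similar capacities, so each
    group should finish at roughly the same time.
    """
    if ports_per_board < 1:
        raise ValueError("ports_per_board must be at least 1")
    ordered = sorted(drives, key=lambda drive: drive[1])
    return [
        ordered[start:start + ports_per_board]
        for start in range(0, len(ordered), ports_per_board)
    ]


if __name__ == "__main__":
    tray = [("A", 80), ("B", 20), ("C", 40), ("D", 80), ("E", 60)]
    for number, board in enumerate(group_for_boards(tray), start=1):
        print(f"board {number}: {board}")
```

Sorting first and slicing afterwards keeps the logic simple; in practice you would still adjust by hand for the cables each board actually has.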

Starting Disktest

Disktest is our nifty little home-brewed hard drive testing and wiping program.

When Disktest first starts you will be presented with a list of drives that should look something like this:

PASSED sda: IDE 80.0GB <<Seagate SD380830A (5GQ1DB70)>>
FAILED sdb: IDE 80.0GB <<Samsung SV400AH (173G27Q37282S)>>
PASSED sdc: SATA 80.0GB <<Seagate SD380830A (5GQ1HG92)>>
PASSED sdd: SATA 80.0GB <<Western Digital WD800JBB (WMAMF92810)>>

Are these the expected drives and do you want to test them? [yes]:

The information on these lines indicates the following:

  • PASSED/FAILED: The SMART status of the drive after the initial test.
  • sda/sdb/sdc/sdd: The identifier the system has assigned to the drive.
  • IDE/SATA: The drive's connection type.
  • 80.0GB: The capacity of the attached drive.
  • << ... >> : The drive manufacturer, model number, and the device serial number in parentheses.
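
As a rough illustration of how the fields above fit together, the following Python sketch splits one line of the drive list into its parts. The regular expression is inferred only from the sample lines shown on this page, so treat the pattern, the DriveLine type, and the parse_drive_line name as assumptions rather than a description of how Disktest itself works.

```python
import re
from typing import NamedTuple, Optional


class DriveLine(NamedTuple):
    smart: str          # "PASSED" or "FAILED"
    device: str         # e.g. "sda"
    interface: str      # "IDE" or "SATA"
    capacity_gb: float  # capacity converted to GB
    model: str          # manufacturer and model number
    serial: str         # serial number from the parentheses


# Pattern inferred from the sample output; real output may vary.
LINE_RE = re.compile(
    r"^(?P<smart>PASSED|FAILED)\s+"
    r"(?P<device>sd[a-z]+):\s+"
    r"(?P<interface>IDE|SATA)\s+"
    r"(?P<size>\d+(?:\.\d+)?)(?P<unit>[KMGT]B)\s+"
    r"<<(?P<model>.*)\s+\((?P<serial>[^()]*)\)>>$"
)

UNIT_TO_GB = {"KB": 1e-6, "MB": 1e-3, "GB": 1.0, "TB": 1e3}


def parse_drive_line(line: str) -> Optional[DriveLine]:
    """Return the parsed fields, or None if the line does not fit the pattern."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    capacity_gb = float(match["size"]) * UNIT_TO_GB[match["unit"]]
    return DriveLine(
        match["smart"], match["device"], match["interface"],
        capacity_gb, match["model"], match["serial"],
    )


if __name__ == "__main__":
    print(parse_drive_line("PASSED sda: IDE 80.0GB <<Seagate SD380830A (5GQ1DB70)>>"))
```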

Look this screen over carefully before proceeding!

  1. Are any drives marked as FAILED? If so, abort the test, power off the board, replace the failed drive(s), and start the board again.
  2. Are all the attached drives on the list? If not, double-check that the power and IDE/SATA cables are firmly connected and try to determine whether the disc is actually spinning. If any connections were loose you will need to abort the test and restart the board to redetect the devices. If the connections seem solid and the disc is spinning you may need to try the drive on another board or with another combination of drives; incompatibilities happen.
  3. Are the drives indicating the capacity they're labeled with? Some variation is normal: a drive labeled 80GB reporting as 83.0GB is common, but a drive labeled 200GB indicating 3.4MB is a fail.
  4. Are the manufacturer and model information accurate? Drives from some manufacturers will report some information as "Unknown" and this is fine, but a string of total gibberish is a good indicator of failure.
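
Checks 3 and 4 are judgment calls, but they can be approximated in code. The sketch below is a hedged illustration only: the 10% tolerance is an assumed figure (this page gives examples, not a threshold), and both function names are made up for the example. The gibberish test simply looks for characters outside printable ASCII, which is one possible heuristic among many.

```python
def capacity_plausible(labeled_gb: float, reported_gb: float, tolerance: float = 0.10) -> bool:
    """True if the reported capacity is within `tolerance` of the label.

    The default tolerance of 10% is an assumption chosen for illustration.
    """
    if labeled_gb <= 0:
        raise ValueError("labeled_gb must be positive")
    return abs(reported_gb - labeled_gb) / labeled_gb <= tolerance


def looks_garbled(identity: str) -> bool:
    """True if the identity string contains non-printable or non-ASCII characters.

    A plain "Unknown" passes this check, matching the note that some drives
    legitimately report missing information.
    """
    return any(not (ch.isascii() and ch.isprintable()) for ch in identity)


if __name__ == "__main__":
    print(capacity_plausible(80, 83.0))       # the "common" case from the text
    print(capacity_plausible(200, 0.0034))    # 3.4MB reported on a 200GB drive
    print(looks_garbled("Seagate SD380830A"))
```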

If anything seems out of order you can abort the test by responding to the "Are these the expected drives..." prompt with an "n", taking you to a final status screen indicating the aborted or failed state of the attached drives. Press enter at the final status screen to power off the board.

If everything is in order you can simply strike Enter at the prompt and the test will begin.

During Testing

While Disktest is running, it outputs status updates on the drives every few seconds.

sda: IDE 80.0GB <<Seagate SD380830A (5GQ1DB70)>>: 13% of wipe complete - 1:58:02
sdb: IDE 80.0GB <<Samsung SV400AH (173G27Q37282S)>>: 80% of badblocks read (part 1) complete - 1:58:02
sdc: SATA 80.0GB <<Seagate SD380830A (5GQ1HG92)>>: all tests passed:
---
-smart test passed
-initiating smart self-test
-badblocks test started
-100% of badblocks read (part two) complete
-smart test passed
-disk wipe started
-100% of wipe complete
-disk wipe finished
-smart test passed
-2:08:32
sdd: SATA 80.0GB <<Western Digital WD800JBB (WMAMF92810)>> 22% of badblocks write (part 2) complete - 1:58:02

Any variation on the outputs above is normal. Drive sdc in the example above has successfully finished testing and wiping and is giving its final status summary. If the progress percentage or running time of an unfinished drive stops advancing, the system may have frozen and need to be restarted. If you suspect a freeze, note which board it is and what percentages and times it shows; check it again in 10 minutes and abort testing if there has been no progress.
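
The freeze check described above amounts to comparing two readings of the same screen taken about 10 minutes apart. The following Python sketch shows one way to do that comparison, assuming the status lines look like the sample output on this page. The names Progress, parse_snapshot, and stalled_devices are invented for the example; nothing here is part of Disktest.

```python
import re
from typing import Dict, Iterable, List, NamedTuple


class Progress(NamedTuple):
    phase: str      # e.g. "wipe" or "badblocks read (part 1)"
    percent: int    # percentage shown for that phase
    elapsed_s: int  # running time converted to seconds


# Pattern inferred from the sample status lines; real output may differ.
STATUS_RE = re.compile(
    r"^(?P<device>sd[a-z]+):.*?(?P<percent>\d+)% of (?P<phase>.+) complete"
    r" - (?P<h>\d+):(?P<m>\d{2}):(?P<s>\d{2})$"
)


def parse_snapshot(lines: Iterable[str]) -> Dict[str, Progress]:
    """Collect the progress of every drive that is still being tested."""
    snapshot: Dict[str, Progress] = {}
    for line in lines:
        match = STATUS_RE.match(line.strip())
        if match:
            elapsed = int(match["h"]) * 3600 + int(match["m"]) * 60 + int(match["s"])
            snapshot[match["device"]] = Progress(match["phase"], int(match["percent"]), elapsed)
    return snapshot


def stalled_devices(earlier: Dict[str, Progress], later: Dict[str, Progress]) -> List[str]:
    """Return drives whose running time or progress did not move between readings."""
    stalled = []
    for device, before in earlier.items():
        after = later.get(device)
        if after is None:
            continue  # no longer reporting progress (for example, finished)
        clock_stuck = after.elapsed_s <= before.elapsed_s
        work_stuck = (after.phase, after.percent) == (before.phase, before.percent)
        if clock_stuck or work_stuck:
            stalled.append(device)
    return sorted(stalled)
```

Comparing the phase together with the percentage avoids a false alarm when a drive moves from one stage to the next and its percentage drops back toward zero.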

The video output from the board will automatically switch off after a while; if you do not see any output from a running board you can press any key to "wake" the display. However, avoid the Enter key as this may accidentally power down a finished test before you can view the results.

Testing can be aborted at any time using the keyboard command Ctrl-C. A final status screen will be presented and any drives that have already finished testing will be indicated as such and can be considered passed.

Finishing a Batch

Once Disktest has finished you will be presented with a results screen like this:

sda: IDE 80.0GB <<Seagate SD380830A (5GQ1DB70)>>
sda passed! Label and store it.
sdb: IDE 80.0GB <<Samsung SV400AH (173G27Q37282S)>>
sdb failed! Recycle it!
sdc: SATA 80.0GB <<Seagate SD380830A (5GQ1HG92)>>
sdc passed! Label and store it.
sdd: SATA 80.0GB <<Western Digital WD800JBB (WMAMF92810)>>
sdd passed! Label and store it.
Hit enter to shut down.
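
To show how the results screen maps onto the two piles (label and store, or recycle), here is a small Python sketch that pairs each verdict line with the identity line printed for the same device. The line formats are assumed from the sample above, and sort_results is a hypothetical helper, not something Disktest provides.

```python
import re
from typing import Dict, Iterable, List

# Formats inferred from the sample results screen.
HEADER_RE = re.compile(r"^(?P<device>sd[a-z]+):\s+(?P<identity>.+)$")
VERDICT_RE = re.compile(r"^(?P<device>sd[a-z]+) (?P<verdict>passed|failed)!")


def sort_results(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Split drives into a "label" pile (passed) and a "recycle" pile (failed)."""
    identities: Dict[str, str] = {}
    piles: Dict[str, List[str]] = {"label": [], "recycle": []}
    for raw in lines:
        line = raw.strip()
        header = HEADER_RE.match(line)
        if header:
            # Remember the size/model/serial text for this device.
            identities[header["device"]] = header["identity"]
            continue
        verdict = VERDICT_RE.match(line)
        if verdict:
            device = verdict["device"]
            pile = "label" if verdict["verdict"] == "passed" else "recycle"
            piles[pile].append(f"{device}: {identities.get(device, 'unidentified')}")
    return piles
```

The identity text carries the model and serial number, which is what you need to match a verdict to the physical drive on the rack.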

Fill out a hard drive label (Avery 5167) for each hard drive that has passed, then physically identify the passed drives. You can either attach the labels to the passed drives while they are still on the rack, or press enter to power down