Hard Drive Testing

From FreekiWiki

This document is written from the perspective of use in Hardware Testing, but the sections on Starting Disktest, During Testing, and Finishing a Batch should be relevant to use in other areas.

The process of hard drive testing incorporates verification of a drive's SMART status, verification of the drive's ability to write and read all available surface area of the drive, and repeated overwriting of any previous data on the drive to ensure donors' data security.

Ideally, hard drive testing needs hands-on work only twice a day: once in the morning for a batch of smaller drives (≤100GB), and once for a batch of larger drives (>100GB) that runs through the afternoon and overnight if necessary.

Important note on Solid State Drives (SSD): The testing methodology we employ is intended for magnetic disc media. Used on SSDs, it may not completely remove user data and may shorten the useful life of the device. As the number of donated SSDs is beginning to increase, we are developing an approach for testing and wiping these devices properly; in the meantime, any SSDs received are to be securely stored and should NOT be reused until we can verify secure data destruction.

Setting Up

If there are finished hard drives already on the racks then proceed to Finishing a Batch.

  1. Grab the red tray and head into TARDIS (the locked storage room) then open up the big brown lockbox.
  2. The drives in the box should be fairly well organized by size and interface. If this is the first batch of the day you'll want to grab smaller drives. If it's the second batch you can go for the larger ones. Load up the tray with an equal number of IDE- and SATA-connected hard drives. Don't forget about the 2.5" (laptop-sized) drives on the smaller top shelf! If they are present and you have the time you should grab a few SCSI drives as well. Be sure to lock up the big brown box again when you're done.
  3. Take the tray of drives over to one of the wiping racks and start connecting them to the boards. Most boards have 4 connections with various mixtures of IDE and SATA cables.
    1. Try to keep all the drives connected to a single board around the same size so we don't have 3 smaller drives finished and waiting around sucking electricity while the 1 larger drive is still finishing.
    2. Look the drives over for identifying information while you're connecting them; be sure you can clearly identify the model and serial number of the drive. You may need this information for sorting the failed drives from the passed drives later on. If you can't find the serial number on a drive then make sure you're attaching it to a board with other drives you can positively identify so you can use process of elimination when identifying the finished drives.
    3. Check the jumpers on IDE drives and make sure they're set to Master. For most drives this is set by a single jumper positioned vertically between the two pins closest to the IDE pins; check the labels on the drives as they will generally indicate if they require a different arrangement (or default to Master with no jumpers at all).
    4. SCSI drives can be tested in the bays attached to board 8 on wipers 0 and 2.
  4. As soon as you've connected all the drives to a single board, turn it on. You can move on to hooking up another board while the first one boots up and does the initial check of the drives.
  5. Once a board has finished booting and Disktest has started (you will see a list of the detected hard drives and a prompt asking about drive details) proceed to Starting Disktest.
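
The advice in step 3 about keeping similarly sized drives together on one board can be sketched in a few lines. This is only an illustration, not part of Disktest: the function name `plan_boards` is hypothetical, it assumes 4 connections per board, and it ignores the IDE/SATA cable mix that real boards impose.

```python
from typing import List


def plan_boards(capacities_gb: List[float], slots_per_board: int = 4) -> List[List[float]]:
    """Group drives so each board holds drives of similar capacity.

    Sorting by capacity and cutting the sorted list into board-sized
    chunks keeps neighbours in size together, so no board waits long
    on a single much larger drive.
    """
    if slots_per_board < 1:
        raise ValueError("a board needs at least one connection")
    ordered = sorted(capacities_gb)
    return [
        ordered[i:i + slots_per_board]
        for i in range(0, len(ordered), slots_per_board)
    ]


if __name__ == "__main__":
    for number, board in enumerate(plan_boards([80, 20, 40, 80, 60, 40, 20, 100])):
        print(f"board {number}: {board}")
```

In practice you would still adjust the groups by hand to match the cables available on each board.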

Starting Disktest

Disktest is our nifty little in-house hard drive testing and wiping program.

When Disktest first starts you will be presented with a list of drives that should look something like this:

PASSED sda: -?-  IDE 80.0GB <<Seagate SD380830A (5GQ1DB70)>>
FAILED sdb: 2.5" IDE 80.0GB <<Samsung SV400AH (173G27Q37282S)>>
PASSED sdc: -?-  SATA 80.0GB <<Seagate SD380830A (5GQ1HG92)>>
PASSED sdd: 3.5" SATA 80.0GB <<Western Digital WD800JBB (WMAMF92810)>>

The information on these lines indicates the following:

  • PASSED/FAILED: The SMART status of the drive after the initial test.
  • sda/sdb/sdc/sdd: The identifier the system has assigned to the drive.
  • -?-/2.5"/3.5": The form factor of the hard drive; -?- means the form factor is unknown.
  • IDE/SATA: The drive's connection type.
  • 80.0GB: The capacity of the attached drive.
  • << ... >> : The drive manufacturer, model number, and the device serial number in parentheses.
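
If you ever need to pull these fields out of a saved listing, the line layout described above can be matched with a regular expression. The sketch below is inferred only from the sample lines on this page; the real Disktest output may vary, and the names `LINE_RE`, `DriveLine`, and `parse_drive_line` are hypothetical. Accepting SCSI as an interface label is also an assumption.

```python
import re
from typing import NamedTuple, Optional

# Layout inferred from the sample listing; an assumption, not a Disktest spec.
LINE_RE = re.compile(
    r'^(?P<smart>PASSED|FAILED)\s+'
    r'(?P<device>sd[a-z]+):\s+'
    r'(?P<form_factor>\S+)\s+'
    r'(?P<interface>IDE|SATA|SCSI)\s+'
    r'(?P<capacity>[\d.]+[KMGT]B)\s+'
    r'<<(?P<make_model>.+?)\s+\((?P<serial>[^)]*)\)>>$'
)


class DriveLine(NamedTuple):
    smart: str                  # PASSED or FAILED
    device: str                 # e.g. sda
    form_factor: Optional[str]  # None when the listing shows -?-
    interface: str
    capacity: str               # kept as text, e.g. "80.0GB"
    make_model: str
    serial: str


def parse_drive_line(line: str) -> Optional[DriveLine]:
    """Return the parsed fields, or None if the line does not fit the layout."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    if fields["form_factor"] == "-?-":
        fields["form_factor"] = None
    return DriveLine(**fields)


if __name__ == "__main__":
    print(parse_drive_line('FAILED sdb: 2.5" IDE 80.0GB <<Samsung SV400AH (173G27Q37282S)>>'))
```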

Form Factor Logging

If any of the detected drives have an unknown form factor (indicated with -?-) you will receive the following prompt:

Correcting drive details:
        1: Correct Serial Numbers
        2: Correct Form Factors
        3: Finished Making Corrections
What would you like to do?:

Select option 2 and use the following menus to specify the correct form factor for the attached drives. Laptop-sized drives are 2.5", desktop-sized drives are 3.5", and anything else can be recorded as Other. While making changes to form factors be sure to also check that serial numbers are accurate and correct them as necessary; these are required for proper tracking of secure data destruction batches.

Note on form factor logging: Hard drives have no way of reporting their form factor, so we use a system that looks up the form factor recorded for the last-tested hard drive of the same model. If you misidentify a drive's form factor, subsequently tested drives with the same model number will be misidentified too, until someone corrects the value.
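
To see why one wrong answer spreads, here is a minimal sketch of a "last-tested drive of the same model" lookup. It is an illustration under assumptions, not the actual Disktest code: the class name `FormFactorLog` and its methods are hypothetical, and the real system keeps this information in a database rather than an in-memory dictionary.

```python
from typing import Dict, Optional


class FormFactorLog:
    """Remembers the form factor recorded for the most recent drive of each model."""

    def __init__(self) -> None:
        self._by_model: Dict[str, str] = {}

    def lookup(self, model: str) -> Optional[str]:
        # None corresponds to the -?- shown for an unknown form factor.
        return self._by_model.get(model)

    def record(self, model: str, form_factor: str) -> None:
        # The newest entry replaces whatever was stored before.
        self._by_model[model] = form_factor


if __name__ == "__main__":
    log = FormFactorLog()
    log.record("SV400AH", '3.5"')   # a tester picks the wrong size
    print(log.lookup("SV400AH"))    # the next SV400AH inherits the mistake
    log.record("SV400AH", '2.5"')   # a later tester corrects it
    print(log.lookup("SV400AH"))    # later drives now get the corrected value
```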

Verifying Drive Details

If you did not need to specify any form factors then the first prompt you will receive is:

Are the hard drive serial numbers and form factors displayed correctly above? [yes]:

If any corrections are necessary then an answer of 'no' will take you to the same data correction submenus you would have been presented with for Form Factor Logging above.

Once all drive details have been confirmed the next prompt will be:

Begin testing the listed drives? [yes]:

Before starting the test, go through the following checks:

  1. Are any drives marked as FAILED? If so, abort the test, power off the board, replace the failed drive(s), and start the board again.
  2. Are all the attached drives on the list? Double check that the power and IDE/SATA cables are firmly connected and try to determine if the disc is actually spinning. If any connections were loose you will need to abort the test and restart the board to redetect the devices. If the connections seem solid and the disc is spinning you may need to try the drive on another board or with another combination of drives; incompatibilities happen.
  3. Are the drives indicating the capacity they're labeled with? Some variation is normal: a drive labeled 80GB reporting as 83.0GB is common, but a drive labeled 200GB indicating 3.4MB is a fail.
  4. Is the manufacturer and model information accurate? Drives from some manufacturers will report some information as "Unknown" and this is fine, but a string of total gibberish is a good indicator of failure.
  5. Are the listed form factors and serial numbers correct? At this point there is no further option to make corrections without aborting entirely and starting over, but you should do so if necessary.
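
The capacity check in step 3 is a judgment call, but the idea can be written down as a simple relative-difference test. The sketch below is not something Disktest does for you; the function name `capacity_plausible` is hypothetical, and the 10% tolerance is an assumption chosen only so that the two examples above land on the expected sides.

```python
def capacity_plausible(labeled_gb: float, reported_gb: float, tolerance: float = 0.10) -> bool:
    """Return True when the reported capacity is close to the labeled one.

    `tolerance` is the allowed relative difference (0.10 = 10%). The value
    is an assumption for illustration, not a documented threshold.
    """
    if labeled_gb <= 0:
        raise ValueError("labeled capacity must be positive")
    return abs(reported_gb - labeled_gb) / labeled_gb <= tolerance


if __name__ == "__main__":
    print(capacity_plausible(80, 83.0))     # labeled 80GB, reports 83.0GB
    print(capacity_plausible(200, 0.0034))  # labeled 200GB, reports 3.4MB
```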

If you notice any of the above problems then answer 'no' at the prompt. The test will be aborted and you will be given a final status screen that will indicate the drives did not complete testing or were recorded as failed due to a SMART failure.

If it looks like everything is in order then just strike Enter to give a positive response and the test will begin!

During Testing

While Disktest is running it will output status updates on the drives every few seconds.

sda: 3.5" IDE 80.0GB <<Seagate SD380830A (5GQ1DB70)>>: 13% of wipe complete - 2:58:02
sdb: 2.5" IDE 80.0GB <<Samsung SV400AH (173G27Q37282S)>>: 80% of badblocks read (part 1) complete - 2:58:02
sdc: 3.5" SATA 80.0GB <<Seagate SD380830A (5GQ1HG92)>>: all tests passed:
---
-smart test passed
-initiating smart self-test
-badblocks test started
-100% of badblocks read (part two) complete
-smart test passed
-disk wipe started
-100% of wipe complete
-disk wipe finished
-smart test passed
-2:08:32
sdd: 3.5" SATA 80.0GB <<Western Digital WD800JBB (WMAMF92810)>> 22% of badblocks write (part 2) complete - 2:58:02

Any variation on the outputs above is normal. Drive sdc in the example above has successfully finished testing and wiping and is giving its final status summary. If you notice drives' progress percentages or running time failing to advance this may indicate the system has frozen and needs to be restarted. If you suspect a freeze then make a note of which board it is and what the percentages/times indicated are; check it again in 10 minutes and abort testing if there has been no progress.
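
The "note it down, check again in 10 minutes" routine amounts to comparing two snapshots of the status lines. The sketch below shows that comparison under assumptions: the line layout is inferred from the sample output on this page, and the names `STATUS_RE`, `snapshot`, and `stalled_devices` are hypothetical rather than part of Disktest.

```python
import re
from typing import Dict, Iterable, List

# Captures the device name and the "NN% of ... - H:MM:SS" tail of a status line.
# Finished drives ("all tests passed:") do not match and are skipped.
STATUS_RE = re.compile(
    r'^(?P<device>sd[a-z]+):.*>>:?\s*(?P<status>\d+% of .+ - \d+:\d{2}:\d{2})$'
)


def snapshot(lines: Iterable[str]) -> Dict[str, str]:
    """Map each still-running device to its progress text and elapsed time."""
    result: Dict[str, str] = {}
    for line in lines:
        match = STATUS_RE.match(line.strip())
        if match:
            result[match.group("device")] = match.group("status")
    return result


def stalled_devices(earlier: Dict[str, str], later: Dict[str, str]) -> List[str]:
    """Devices whose percentage and running time are both unchanged."""
    return sorted(dev for dev, status in earlier.items() if later.get(dev) == status)


if __name__ == "__main__":
    first = snapshot([
        'sda: 3.5" IDE 80.0GB <<Seagate SD380830A (5GQ1DB70)>>: 13% of wipe complete - 2:58:02',
    ])
    second = snapshot([
        'sda: 3.5" IDE 80.0GB <<Seagate SD380830A (5GQ1DB70)>>: 13% of wipe complete - 2:58:02',
    ])
    print(stalled_devices(first, second))
```

If every running drive on a board shows up as stalled after 10 minutes, that matches the freeze described above.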

The video output from the board will automatically switch off after a while; if you do not see any output from a running board you can press any key to "wake" the display. However, avoid the Enter key as this may accidentally power down a finished test before you can view the results.

Testing can be aborted at any time using the keyboard command Ctrl-C. A final status screen will be presented and any drives that have already finished testing will be indicated as such and can be considered passed.

Finishing a Batch

Once Disktest has finished you will be presented with a results screen like this:

sda: IDE 80.0GB <<Seagate SD380830A (5GQ1DB70)>>
sda passed! Label and store it.
sdb: IDE 80.0GB <<Samsung SV400AH (173G27Q37282S)>>
sdb failed! Recycle it!
sdc: SATA 80.0GB <<Seagate SD380830A (5GQ1HG92)>>
sdc passed! Label and store it.
sdd: SATA 80.0GB <<Western Digital WD800JBB (WMAMF92810)>>
sdd passed! Label and store it.
Hit enter to shut down.

  1. Fill out a hard drive label (Avery 5167) for each hard drive that has passed, then physically identify the passed drives.
  2. Attach the labels to the passed drives. You can either do this while they are still on the rack, or press enter to power down the board and disconnect the drives before labeling them.
    1. Be certain you have positively identified all drives before powering down the system as there is no easy way to recall these results once the board is off.
    2. Do not place labels over model numbers, serial numbers, or other identifying information that may be necessary for future tests.
  3. Place the passed and labeled drives in the green tray to await storage in TARDIS.
  4. Destroy failed drives promptly.
  5. Start a new batch!
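
Because the results are gone once the board powers down, it can help to copy the results screen somewhere first. As a rough sketch, the "passed!"/"failed!" lines shown above could then be split into two lists like this. The wording matched here comes only from the sample screen, and the names `RESULT_RE` and `split_results` are hypothetical.

```python
import re
from typing import Iterable, List, Tuple

# Matches lines such as "sda passed! Label and store it."
RESULT_RE = re.compile(r'^(?P<device>sd[a-z]+) (?P<result>passed|failed)!')


def split_results(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (passed_devices, failed_devices) from a copied results screen."""
    passed: List[str] = []
    failed: List[str] = []
    for line in lines:
        match = RESULT_RE.match(line.strip())
        if match is None:
            continue  # drive description lines and the shutdown prompt
        target = passed if match.group("result") == "passed" else failed
        target.append(match.group("device"))
    return passed, failed


if __name__ == "__main__":
    screen = [
        "sda: IDE 80.0GB <<Seagate SD380830A (5GQ1DB70)>>",
        "sda passed! Label and store it.",
        "sdb: IDE 80.0GB <<Samsung SV400AH (173G27Q37282S)>>",
        "sdb failed! Recycle it!",
        "Hit enter to shut down.",
    ]
    print(split_results(screen))
```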

Related Tasks

Destroying Failed Hard Drives

  • 3.5" drives (desktop sized) should be destroyed in the Wheel of Death (hand-cranked drill). Position the drive so that the drill will penetrate the magnetic platters; avoid drilling straight through the spindle as this damages the drive enclosure more than the data-bearing platters and makes the drive difficult to disassemble in recycling.
  • 2.5" drives (laptop sized) can be placed on the floor and struck (not too hard!) with a hammer so that the enclosure bends inward and crushes the platters.

Sorting Incoming Hard Drives

Incoming hard drives will be placed in the tan lockbox just inside the door to TARDIS; check this box regularly and sort its contents into the labeled stacks in the big brown lockbox. Currently they are grouped by size (≤100GB, >100GB to <300GB, >300GB/all SCSI) and interface (IDE/SATA).
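
The grouping rule can be written as a small function, shown here only as a sketch. The function name `sorting_stack` and the label strings are hypothetical. The groups above do not say where a drive of exactly 300GB goes; this sketch assumes it joins the largest group, which is a guess rather than documented policy.

```python
from typing import Tuple

SMALL = "<=100GB"
MEDIUM = ">100GB to <300GB"
LARGE = ">300GB/all SCSI"


def sorting_stack(capacity_gb: float, interface: str) -> Tuple[str, str]:
    """Return the (size group, interface) stack for an incoming drive."""
    interface = interface.upper()
    if interface == "SCSI":
        # All SCSI drives share the largest group regardless of capacity.
        return LARGE, interface
    if capacity_gb <= 100:
        return SMALL, interface
    if capacity_gb < 300:
        return MEDIUM, interface
    # Assumption: exactly 300GB is treated like the >300GB drives.
    return LARGE, interface


if __name__ == "__main__":
    print(sorting_stack(80, "IDE"))
    print(sorting_stack(250, "sata"))
    print(sorting_stack(36, "SCSI"))
```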

Storing Passed Hard Drives

  • Passed 3.5" (desktop) drives should be stored on the labeled shelves on the west wall inside TARDIS. The top shelf is for SCSI drives, the second shelf is for IDE drives, and the third shelf is for SATA drives. The stacks are grouped by size and the shelves themselves are labeled with what sizes go where. There are also special purpose bins for size/interface combinations needed for special systems and projects.
  • Passed 2.5" (laptop) drives are stored in the blue bins on the top shelf at the southwest corner of TARDIS; the bins are organized by size and interface.

Future: Certificate of Destruction Tracking

Work In Progress Notes

  • When we receive a box of drives for which we need to generate a report or certificate, all of the serial numbers will be added to a new "disktest batch" in the database.
    • If the donor requires us to track removing them from systems, the optional system serial numbers need to be entered too.
  • The "disktest batch" report will then collect information as matching drives are tested until it has a "PASSED" or "destroyed" status for all drives, which will allow the report to be finalized.
    • While testing on machines designated for testing "certificate of destruction" drives, undetected or incorrect hard drive serial numbers will need to be entered twice to confirm they are correct. The tester should check they are reported correctly when asked "Are the hard drive serial numbers displayed correctly above?".
    • If a drive is not detected in disktest, it can just be marked destroyed.
  • Drives that complete testing without a "PASSED" status (FAIL, STOPPED, ABORTED, etc.) will need to be marked as "destroyed" in the "disktest batch" record to confirm physical destruction before they can be considered resolved.
    • Note: They should be marked as they're being destroyed, or a printed version of the list should be used to track as this happens.
  • After all hard drives have either "PASSED" or been marked as "destroyed", the report can be marked as finalized and used.
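
Since this feature is still a work in progress, the following is only a sketch of the finalization rule described in these notes: a batch can be finalized once every serial number has a "PASSED" or "destroyed" status. The data layout (a dictionary from serial number to status) and the names `RESOLVED`, `unresolved_serials`, and `can_finalize` are assumptions for illustration, not the actual database schema.

```python
from typing import Dict, List, Optional

# Statuses that count as resolved, per the notes above.
RESOLVED = {"PASSED", "destroyed"}


def unresolved_serials(batch: Dict[str, Optional[str]]) -> List[str]:
    """Serial numbers that are untested or ended without PASSED/destroyed."""
    return sorted(serial for serial, status in batch.items() if status not in RESOLVED)


def can_finalize(batch: Dict[str, Optional[str]]) -> bool:
    """True once the batch is non-empty and every drive is resolved."""
    return bool(batch) and not unresolved_serials(batch)


if __name__ == "__main__":
    batch = {
        "5GQ1DB70": "PASSED",
        "173G27Q37282S": "FAILED",  # still needs to be marked destroyed
        "5GQ1HG92": None,           # not tested yet
    }
    print(unresolved_serials(batch), can_finalize(batch))
```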