Hard Drive Testing/Disktest

From FreekiWiki

Revision as of 09:50, 25 October 2013

Disktest is used to test, wipe and track data destruction on IDE and SATA drives.

It is automatically configured using settings in lts.conf, deployed via LTSP.

TESTING PROCESS

Disktest tests each drive in parallel, as follows:

  • Starts a timer
    • The time limit is the specified DISKTEST_TIME_LIMIT_PER_GB, in seconds, multiplied by the drive size
    • The time limit is ignored if the model name matches DISKTEST_TIME_LIMIT_IGNORED_MODELS
    • If the drive is very small, the time limit may fall back to DISKTEST_TIME_LIMIT_MINIMUM
    • If testing the drive does not finish within the time limit, the drive is marked ABORTED
  • Initiates a short SMART test
    • with flags: '-q', 'silent', '-t', 'short'
  • Checks SMART status
    • with flags: '-q', 'silent', '--all'
      • If the bus is not SCSI (meaning SATA/IDE), it also passes '-d', 'ata'
    • If SMART returns 2, it stops with RETRY state (which is treated like a failure)
    • If SMART returns >2, it stops with FAILED state
  • Runs badblocks for testing
    • /sbin/badblocks -e 1 -c 1024 -swt 0xffffffff DEV
    • If it exits nonzero, the drive is marked FAILED
  • Checks SMART status
    • As described above
  • Does its own wipe
    • writes 1's
    • writes 1024 bytes of urandom repeatedly
    • checks that the number of bytes successfully written (from the system call perspective) is twice the drive length
  • Checks SMART status
    • As described above
  • Concludes the drive has been successfully wiped with PASSED state
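
The time limit rules and the command lines listed above can be sketched in Python as follows. This is an illustration only, not disktest's actual code. The function names are hypothetical, and the sketch assumes that DISKTEST_TIME_LIMIT_IGNORED_MODELS is a regular expression, that DISKTEST_TIME_LIMIT_MINIMUM acts as a floor on the computed limit, that the drive size is measured in GB, and that the SMART flags are passed to smartctl.

```python
import re
from typing import List, Optional


def time_limit_seconds(size_gb: float, model: str, per_gb: float,
                       minimum: float, ignored_models: str = "") -> Optional[float]:
    """Return the time limit in seconds, or None when no limit applies."""
    # Assumption: DISKTEST_TIME_LIMIT_IGNORED_MODELS is a regular expression.
    if ignored_models and re.search(ignored_models, model):
        return None
    # Assumption: DISKTEST_TIME_LIMIT_MINIMUM is a floor for very small drives.
    return max(per_gb * size_gb, minimum)


def smart_short_test_command(dev: str) -> List[str]:
    """Command line that starts a short SMART self-test."""
    # Assumption: the documented flags are passed to smartctl.
    return ["smartctl", "-q", "silent", "-t", "short", dev]


def smart_status_command(dev: str, is_scsi: bool) -> List[str]:
    """Command line that checks SMART status."""
    cmd = ["smartctl", "-q", "silent", "--all"]
    if not is_scsi:
        # SATA/IDE drives additionally get '-d', 'ata'.
        cmd += ["-d", "ata"]
    return cmd + [dev]


def badblocks_command(dev: str) -> List[str]:
    """Command line for the badblocks pass, as given in the list above."""
    return ["/sbin/badblocks", "-e", "1", "-c", "1024", "-swt", "0xffffffff", dev]


if __name__ == "__main__":
    # Hypothetical values: 60 seconds per GB, 600 second minimum, 80 GB drive.
    print(time_limit_seconds(80, "EXAMPLE MODEL", per_gb=60, minimum=600))
    print(smart_status_command("/dev/sdX", is_scsi=False))
```

The functions only build values and argument lists, so the sketch can be run without touching any drive.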

EXPLANATION OF DISKTEST STATES

  • UNTESTED - used while the drive is being tested, or if it failed for an unknown reason
  • PASSED - used if the testing process completed without error
  • ABORTED - a fail state, used when the timeout is reached
  • RETRY - used if SMART returns 2, which indicates that RAID controller issues or odd drive problems are preventing any form of successful SMART testing
  • STOPPED - used if the user stops the testing program with a Ctrl-C interrupt
  • FAILED - used if the system determines that part of the test actually failed
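
As a rough illustration of how exit statuses map onto these states, the following Python sketch applies the rules from the testing process. The function names are hypothetical and this is not disktest's actual code. The source does not say what happens when SMART returns 1, so the sketch assumes that any status below 2 lets testing continue.

```python
from typing import Iterable, Optional


def state_for_smart_exit(exit_status: int) -> Optional[str]:
    """Map a SMART check exit status to a stopping state, or None to continue."""
    if exit_status == 2:
        return "RETRY"
    if exit_status > 2:
        return "FAILED"
    # Assumption: statuses 0 and 1 do not stop the test.
    return None


def state_for_badblocks_exit(exit_status: int) -> Optional[str]:
    """badblocks exiting nonzero fails the drive."""
    return "FAILED" if exit_status != 0 else None


def final_state(step_states: Iterable[Optional[str]]) -> str:
    """Return the first stopping state, or PASSED if every step continued."""
    for state in step_states:
        if state is not None:
            return state
    return "PASSED"
```

For example, a run whose three SMART checks return 0 and whose badblocks pass exits 0 ends in PASSED, while a first SMART check returning 2 ends the run in RETRY.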

DATABASE LOGGING

When DISKTEST_LOGTO_FGDB is configured with a server to log to, each run is logged in the database from beginning to end, including its finishing state. The finishing state is recorded only if the run ends in one of the above states, not if it is cut short by something like a power failure.

The saved data can be queried from the data sec sidebar link, or here: http://data/disktest_runs

Also, a batch containing a drive's serial number can be created on this page: http://data/disktest_batches (which shows batches not yet finalized)

Then, as testing completes (or if it already has), the following status report is updated based on the run data above: http://data/disktest_batches/show/4 (where 4 is the id of the relevant disktest_batch that was created)

Batch details, such as which drives have been destroyed and when the batch is finalized (which provides the report to the user), can also be managed from the report page by using the edit link.