Linux Scripts:
DiskBurnInTest.sh (the main script)
DiskBurnInTest_HeadMove.sh (random head movement during the tests)
When my personal 500g main dump disk died an early death a second time in 2016, I wasn't too impressed with what I found for sale. Research showed that most disks were now trash being shoveled onto the consumer. Many failed in the first week (high infant mortality). While there are disk test programs out there, I needed something a little more advanced that was SMART and error aware and could output speed graphs. That's why I wrote this script. Read past the usage section (about halfway down) for what turned into a long lecture.
I find it disappointing that I can no longer recommend any brand or series of disk either personally or professionally. The only things I can recommend are to buy cheap and to test often because quality is long gone.
This is an example graph from my first new Western Digital Blue 320g disk in my RAID1. This is what a normal head and rotational platter disk should look like.
This is an example graph from my second new Western Digital Blue 320g disk in my RAID1. Note the dips just after 100g. This disk should be nearly identical to the one above (they were bought together), but it will probably be the first to show significant problems in just under a year. This is really disappointing for a brand new disk. The Blue series are "supposed" to be some of WD's better disks. I guess not. Note that only the badblocks write with verify tests show this dip. Read only tests are mediocre for showing drive health.
This is the test graph from my DVR's old Hitachi 120g disk (just over 10 years run time at the time of this test). Note how the graph rises at the start, peaks, and only then falls off normally. A healthy platter disk starts at its peak, so this rise indicates some kind of problem at the beginning of the disk. I ended up repartitioning the disk just after the 25g peak point to avoid video corruption problems and to not stress the beginning of the disk. This disk is one of the RMA problems I complain about below. When it came back, Hitachi DFT kept finding small problems and fixing them. I could never get a solid error to send back the disk again, even with several days of continuous testing. Considering how long this disk has survived, it kept going in spite of the manufacturer's stupidity.
This is a test graph from the 1u backup server's 750g Seagate Barracuda disk (3.6 years run time at the time of this test). The 1u was in a coloc for most of its life backing up other servers in the rack. This is another example of the RMA problems I complain about below. This disk was bought shortly after the 750's came out and was expensive. It died an early death and got RMA'd. What returned was this monstrosity. This one always ran slow for some reason. I ran a few passes with SeaTools and everything came back fine, so it got put back into production since the backups were way behind schedule and I had other things I needed to be working on. I don't even know how to interpret this graph. The test ran in 1g chunks, so there are just over 700 data points. Disk internals should interleave the platters, so there's probably something wrong with one of the platters or heads. The drive has never lost data, though. If I had had this script to generate this graph way back then, I would have called up support and gotten someone fired in the repair division. There's no excuse for something like this. This disk should have never left the repair center. For people who don't believe in quotas and sloppiness, this disk is a prime example.
This is one of the backup disks from my personal ancient P3 Xeon offline backup system (the case has a large fan in front of the drive bays, so I kept it around). The speed line is mostly flat because the controller and/or CPU cannot keep up with the disk's speed. Unfortunately, these types of graphs aren't very useful.
Disk Burn In Test usage and manual. Version: 2016-06-08
Usage: ./DiskBurnInTest.sh <options> <disk block device>
Options:
-?: Show this help text.
-b dangerous: Enables batch mode. Run this script in normal mode first to make
sure the options you choose work like you expect. If you enable an
overwrite mode, this option will happily blank your disk without any
confirmation. You have been warned.
-c #: Change the Block Count. Do not mess with this unless you really know
what you're doing. Default is 64 blocks.
-d: Run the disk's built in self tests after all these tests. The SMART
"standard" is poorly implemented in many disks and this may not work as
expected or at all. If so, it will error out at the end.
-D: Skip all these tests and run the disk's built in self tests as "-d" would.
-e #: Maximum bad block count before the loop is aborted. Default is 0,
meaning do not abort and complete the loop. For "dd", the number of "-p"
passes will be stopped for that loop if the max count is hit.
-f dangerous: enable "badblocks" force option. If this is used against a live
filesystem, it will definitely corrupt it. You have been warned.
-h #: Enable the Head Move script every # seconds. This will move the disk
head to random positions during the tests to help show any mechanical
problems. This option will slow down the tests slightly. This option is
useless on SSD's and the various flash drives. The Head Move Script will
be recreated each time the script is run if it isn't set to an absolute
path at the top of this script.
-k #: Disk Chunk Count. By default the disk is broken up into 100 sections
(plus a left over if it doesn't divide evenly). On a small and fast disk
this may make each section fly by too fast for useful statistics. In that
case, choose a number less than 100 to reduce the number of chunks.
-K #: Disk Chunks in Megabytes. This is similar to "-k" above except that you
can specify the chunk size in megabytes.
-l opt: Test Length. The tests run by the "badblocks" program (-t "normal"
and "erase") have the option for different lengths. For "dd" tests, this
value is ignored. Valid choices for "opt" are "quick", "short", "normal"
(default), "long", or "extended". Write patterns break out into:
quick: zeros
short: ones, zeros
normal: even bits, odd bits, ones, zeros
long: random, even bits, odd bits, random, ones, zeros
extended: random, even bits, random, odd bits, random, ones, random, zeros
-p #: Number of passes inside each loop to run.
-r #: Resume a test # megabytes into the disk. Given the other parameters, the
loop block where the megabyte number is contained will be restarted. If
the "-R" parameter is also used, this number must be less than or equal
to it.
-R #: Stop the tests # megabytes into the disk. Given the other parameters,
the loop block where the megabyte is contained will be the last block
tested. If the "-r" parameter is also used, this number must be greater
than or equal to it.
-s #: Change the Block Size. Do not mess with this unless you really know
what you're doing. Default is 1,048,576 (1 megabyte).
-S #: Enables SMART Error Loop that will repeat the current loop block until
there is no SMART Error reported or the # count is hit. This is for
annoying disks that have intermittent bad block problems that need to be
RMA'd but can't until there is a solid error reported by the
manufacturer's disk test. Be sure to have your data backed up with this
option as a continual loop on a disk that is going bad will likely kill
it. If the disk does not support SMART, this option is ignored.
-t opt: The Loop Type test to run. This controls what program and what test is
executed on the disk. Valid choices for "opt" are "normal" (default),
"readonly", "writeinplace", "erase", and "zero". The only option safe to
run on a live file system is "readonly". All others require the disk to be unmounted.
For "normal" and "writeinplace" tests, your data will be preserved unless
there is a system crash or power outage. In that case, "Block Size" times
"Block Count" bytes could be lost. Don't run questionable programs or
run these tests without a good UPS or full laptop battery charge to be
safe. In general for any disk test programs, read only tests and quick
writes without verification are mediocre at identifying disk problems.
Use "normal" and "erase" modes in "badblocks" for more accurate tests. Test
types break out into:
normal: runs "badblocks" in non-overwrite mode verifying each pass. Since
data is preserved, "badblocks" has to first read the data off the
disk, write the test patterns and verify, and then write the data back
to the disk. The extra read and write will be a little slower compared
to "erase" mode.
readonly: "dd" reads the raw disk and dumps it into /dev/null. Since this
is a read only test, it is very safe and extremely fast. This test
depends on the disk internals to verify each block read.
writeinplace: "write in place" uses "dd" to read a block and then write
it back in the same place. "dd" does no verification and depends on
the drive internals for that.
erase: runs "badblocks" in data overwrite mode verifying each pass. All
data on the disk will be lost. This test requires the "-w dangerous"
safety flag to verify you really want to lose everything.
zero: "dd" writes zeros to the disk wiping out all data. There is no
verification pass. This test requires the "-w dangerous" safety flag
to verify you really want to lose everything.
-w dangerous: Safety flag to enable the "erase" and "zero" loop test types
that will ERASE and DESTROY any and ALL data on your disk. Undelete and
data recovery programs will NOT restore your lost data after this. All of
your data will be TOTALLY GONE! You have been warned. The over write
modes should only be used on disks that are new and unused, on disks that
are about to be reformatted, on disks that need to be wiped for security
reasons, or on disks that need M$ windoze fully uninstalled. The "erase"
and "zero" modes will not run without this option.
Disk Block Device: This is the path to the disk block device usually in /dev. A
link can be used for safety or to create a disk label. Example: If I'm
testing on a live system and /dev/sda is my operating system disk and
/dev/sdb is my test disk, I'll often create a link by:
ln -s /dev/sdb /dev/disk/test0
so I don't accidentally screw up and wipe my operating system drive. If
/dev/sdb is one of my backup disks and I want a meaningful label, use:
ln -s /dev/sdb /dev/disk/backup
All of these tests can also be run on individual partitions instead of the
full disk.
When this script is first run, it will print out some diagnostic information
about the selected disk. It's a bit messy, but read it carefully to make sure
the disk is correct.
Under that information will be some warnings if problems are found. Pay close
attention to them. Some disks do not support SMART very well, and SMART logging
may get disabled (flash sticks don't support SMART at all). The tests will
still run but without the SMART verification at the end of each loop.
Usage Examples:
To test the second hard disk in a system using the default options of
"-t normal -l normal":
./DiskBurnInTest.sh /dev/sdb
To do a regular maintenance test on the main operating system disk and run
the disk's built in self tests after (do this from a Linux Live CD):
./DiskBurnInTest.sh -t normal -l short -d /dev/sda
To test the first hard disk in a system only using the disk's built in self
tests (safe to do while the disk is running, "readonly" gets skipped):
./DiskBurnInTest.sh -t readonly -D /dev/sda
To burn in a new disk in the 3rd disk position with no data on it yet, do 3
passes each loop, and move the head away to a random position every second:
./DiskBurnInTest.sh -t erase -l extended -w dangerous -p 3 -h 1 /dev/sdc
To quickly check a live file system on the first disk and do built in self
tests after (no unmounting needed):
./DiskBurnInTest.sh -t readonly -d /dev/sda
To test a smaller SSD where the loops go by too quickly in 10 chunks instead
of 100:
./DiskBurnInTest.sh -t writeinplace -k 10 /dev/sda
or in 10g chunks:
./DiskBurnInTest.sh -t writeinplace -K 10240 /dev/sda
Note that SSD's and flash drives have a limited write life. Testing them too
hard will lead them to an early death. "Solid State" doesn't have the moving
mechanical parts of "disk platter" based drives, so they shouldn't need
heavy testing unless they are showing abnormal behavior. Generic flash
drives (like USB sticks) do not have SMART capabilities, so their reporting
will be limited.
To quickly uninstall M$ windoze (that virus infected, spyware magnet, built
in privacy violator, and all around lousy excuse for a "professional"
operating system):
./DiskBurnInTest.sh -t zero -w dangerous /dev/sda
And a slightly slower way with verification:
./DiskBurnInTest.sh -t erase -l quick -w dangerous /dev/sda
This script will output a gnuplot file with the runtime stats. If gnuplot is
installed (on this system or another), to convert the gnuplot file to a PNG
image, simply run in the directory:
gnuplot *.gnuplot
-------------------------------------------------------------------------------
Rants, Reasonings, Theory Of Operation, and The Manual
(This part came out long, but it is necessary for understanding.)
Standard Disclaimer. No warranties implied or given. Don't blame me for
you losing your data. Keep backups. Refresh your backups before deep tests
and burn ins. Also use a good UPS (see the write test warning above).
Cheap disks and expensive disks that are really cheap inside can go cranky
very quickly. I've made every effort to make this script stable and
reliable since I use it myself and do not want to lose my own data. That
doesn't mean this script has been written perfectly. That doesn't mean it
will work correctly with every disk and Linux variant under the sun. I
wrote this script for Debian based Linux Live CD variants since that's
what I use, and I don't see many RPM based ones. I've tried to keep the
script general enough that it should work under either, though.
Learning And Testing. This script requires basic Linux experience. This
isn't intended for beginners. This isn't hard, but a careless screw up
could wipe your disk and lose your data. You need to be root to run these
tests. You're going to need to know basic shell operations and how to
identify your disk. Teaching that is beyond the scope of this script. You can
easily test this script in a VM with a small virtual disk and booting a
Linux Live CD to see how it handles. An old stand alone computer with a
single test disk works even better. This script is open source and can be
easily checked. I'm not hiding anything. If you find something
questionable, check the source code. It is commented and easy to read.
I chose to write this script not to reinvent the wheel, but to simplify
things and to get a SMART reading for each loop block so I could get an
idea where the errors were happening on the disk. I needed something more
in depth than what the other programs offered. I wrote this script to help
automate disk testing and make it a bit simpler with a standardized
interface presented by the script. I wrote this script because some
manufacturers' test programs cannot see SCSI disks, SCSI cards, RAID
cards, USB storage, and other PCI controllers.
I wrote this script to help find new disks that were bad (infant
mortality) and weed out disks made with poor manufacturing practices. Any
new and capable disk should easily handle all these tests. These burn in
tests are designed to find a disk failure within the easy return period
after buying a new disk. During this period it is easier to take it back,
swap it out, get another size, or get another brand. The intent is to not
be stuck with a bad design and have to constantly fight with the
manufacturer's warranty RMA's, endless return shipping costs (they count
on you not wanting to pay this), and problems for the life of the
warranty.
I wrote this script to push semi-failing disks that the manufacturers'
drive tests keep fixing and refuse to mark as bad even though they're
clearly dying and need to be RMA'd. I've been a victim of a few of
these. I've also been a victim of return RMA's that still had problems and
kept trying to fix themselves even though they were clearly defective.
It's really sad how disk quality keeps getting worse and worse over the
years. The manufacturers only care about cutting corners and increasing
profits. It's pathetic how manufacturers use their warranty as a profit
center this way as many people won't take the time or effort to RMA a bad
disk. Many people / operations / data centers don't want or can't receive
a used disk in return (receiving a used disk for a new one is criminal).
This just encourages bad behavior and more sloppiness that I consider to
be fraudulent. It's awful how data center / enterprise level disks use the
same lousy designs with a new label slapped on them... like a special
sticker on the disk will make it superior in some way.
These tests are designed to push a disk for a burn in, not burn it up to
destruction. If a disk passes the tests, I want it to keep my data, not
break it and cause an early death. It's like taking someone out for a run,
not dragging them behind the car on the highway. If you modify this script
for "highway" use, don't mention it and don't mention me. Say you came up
with the destruction all on your own. Needless destruction violates the
spirit and intention of this script.
"1000 vs 1024" or "Base10 vs Base2". While on the subject of bashing
manufacturers, back in the old days, all disk drives used to have their
size listed in true megabytes... as in a base2 numbering system. Then one
day, some idiots in sales&marketing decided to change the disk numbering
world to compensate for their "overly small products" (insult intended).
This took a complicated topic and made it even more complicated to satisfy
their tiny egos. Disk controllers and interfaces are small computers and
all computers use binary (base2) numbers. Any rounded decimal numbers
(base10) are semi-emulated (to keep this description simple for
non-admins) by computers for humans and not native for their calculations.
All the calculations done in this script are done for binary boundaries.
Choosing base10 numbers for these calculations will cause you problems and
headaches. People who choose to use those damned "mebi" and "kibi"
notations can go shove them where the sun don't shine (insult intended).
One megabyte isn't redefined as 1 million bytes, it's 1,048,576 bytes...
because that's what the computer, the controller, and the disk need. If
this is used in an index, that number will be one less because counting
starts with 0 instead of 1. Unfortunately, the "dd" program used in some
of these tests counts in millions (base10) instead of true megabytes
(base2) when giving the rate per second. This makes the numbers look
faster than they really are in base2. Keep that in mind before
complaining about the CSV rate logs.
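To put numbers on the difference, here's a quick sketch in shell arithmetic. The 500g size is just an example figure, not something read from a real disk:

```shell
# A "500 GB" disk as sold (base10) versus what the computer sees (base2).
# The 500g figure is only an example for the math.
BYTES=$((500 * 1000 * 1000 * 1000))   # 500,000,000,000 bytes as marketed
TRUE_MB=$((BYTES / 1048576))          # true (base2) megabytes
TRUE_GB=$((BYTES / 1073741824))       # true (base2) gigabytes
echo "Sold as 500g, really ${TRUE_GB}g (${TRUE_MB} true megabytes)"
```

That missing 35g is the sales&marketing rounding at work.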
Test your computer. While I was researching for this script, I came across
a few accounts of bad computer hardware causing data corruption during the
disk tests. Many people falsely blamed it on something else. If you have
bad RAM and the disk test program writes the data you want to keep to that
RAM buffer so it can be written back at the end of the pass, that data
just became corrupted. Since the disk test program will use the same
buffer for all the blocks, everything you have just became corrupted. This
applies to ALL disk test programs that operate as I just described. None
are immune. This is why I recommend testing hardware BEFORE doing any in
depth disk tests. I also recommend testing all hardware (especially brand
new) on a regular basis as computer manufacturers have been getting just
as sloppy as disk manufacturers.
UltimateBootCD.com (UBCD). This is a nice consolidation CD of many
computer diagnostic tools and various utilities. You can fetch the various
independent tools if you want, but my instructions will reference UBCD.
RAM Test. UBCD: Memory / memtest86. Let this run for 24 hours. If there is
a bad RAM stick, it will usually show up almost immediately. It usually
won't catch anything intermittent until after 12 hours. Any bad RAM sticks
either get RMA'd if still under warranty or go in the trash. No arguments.
Memtest will also catch bad and poorly designed motherboards, so some
extra testing should be done if it shows problems.
CPU, RAM, and Motherboard Test. UBCD: CPU / Mersenne Prime Test (highest
version that will boot). This is often referred to as MPrime. This does
complex math calculations that will show if something is broken. While
UBCD lists this as a CPU test (which it mostly is), it will also show
problems with RAM and the motherboard like memtest. This program found a
RAM problem in my first quad that memtest passed. If MPrime finds errors,
trace down the problem. If hardware is bad, it gets warrantied or goes in
the trash. No arguments.
Manufacturer's Disk Test. UBCD: HDD / Diagnosis. In the menu, pick the
brand of disk that you have. I usually recommend running the
manufacturer's test after running the burn in tests and the disk self
tests. This will do custom diagnostics on the disk that cannot be done in
Linux. If there are problems in the disk queue, it will flush them out. If
you're reading these instructions and this script sounds like too much for
you, run HDAT2 and then this test and ignore this script.
HDAT2 Disk Test. UBCD: HDD / Diagnosis / HDAT2. If this script is too much
for you or you just want a second opinion, HDAT2 is a good choice. I
periodically recommend this to friends and clients because of its
simplicity compared to my script. Select your hard disk from the list,
select "Device Test Menu", select "Detect and fix bad sectors menu", and
select "Fix with Read/Write/Read". This test can take several hours
depending on the disk size. Since this test writes to the disk but keeps
your data, the same power outage warning applies as above.
Linux Live CD's. In writing this script, I kept most of the functions in
mind for use with a Linux Live CD. This is the recommended way of disk
testing for most systems as the operating system disk cannot have an in
depth test while that operating system is running. This also allows
windoze users to check their hard disks since these tests only access the
disks on a low level and couldn't care less about the file system and
operating system. If you're running a Linux system and boot to a Linux
Live CD, be warned that the hard disk order will tend to move around. Be
careful and make sure you're testing the disk you really want to test.
That's why the diag print out is first printed along with so many other
warnings.
Protecting Other Disks. If you are new to all this and unsure which disk
you really want to test, shut your computer down and UNPLUG EVERY OTHER
DISK in your system. Only leave the disk you want to test plugged in. If
your important disks are unplugged, this program (or any other program)
cannot wipe them and destroy their data. Having a separate computer with
no disks except the disk you want to test plugged in is also a good idea.
This can be left in the corner of the room for days as the disk gets a
deep test. I find it very scary to do irreversible disk operations in a
live system that could instantly destroy it and lose all my data. You
should, too. That's why the diag print out is first printed along with so
many other warnings. That's also why I mention unplugging or using a
separate computer.
Test Run Times. Depending on the test type, disk size, and disk speed, the
test can run for several hours to several days. A new disk should be run
for at least a few days with a few passes. If you have a test you like but
it ends before then, increase the number of passes per loop to stretch it
out. I usually run the "erase" test on my new disks. I also prefer to run
"zero" and "readonly" to get some benchmark times to compare the disk
later in its life. A regular maintenance test shouldn't take more than a few
hours. The idea is to make sure it still works correctly but not to run it
too hard. If a disk is acting up, back it up and run a deeper test.
After this script's tests, I'll run the disk's built in self tests using
"smartctl" with "-t short", "-t long", and "-t offline". I've combined
these tests with the "-d" option. It may or may not work very well
depending on what the drive supports and reports. The status of the first
2 tests can be seen with the "smartctl -c /dev/disk" option under
"Self-test execution status" (some disks only show this with "smartctl -a
/dev/disk"). The "offline" test does its own thing and can take about the
same time as the "long" test. There is no status update on it besides the
original estimated end time when it first runs. To finish the testing
round, I'll use the manufacturer's test disk (see UBCD above) as a final
pass to flush out any queued problems that got stuck in the disk's
controller. Yes, a full battery of tests will take a very long time. If
you want to make sure your programs and data stay safe, there is no fast
"cheat" path.
Warning from a real world problem I've seen: Excessively pushing a disk
that's near failure over and over may kill the entire disk and shut it
totally off. Some disks don't handle end of life errors very well. Use the
deep tests with caution. Always back up a disk like this before pushing it
that far.
Estimated Time Function. The estimated time to finish function cannot take
into account the disk slowing at the end. This is natural for all platter
based hard disks. They can cut their transfer rate by as much as half.
Expect the final time to run about 25% longer than the first estimate at
Loop 0. SSD's and flash drives shouldn't have this problem.
The "badblocks" and "dd" programs are mixed because "badblocks" is much
faster than "dd" at writing and verifying non-zero patterns. "dd" is good
for generic tests and very fast wipes. Note that if "dd" hits a range of
bad blocks that cannot be fixed, it may abort early and not check the rest
of the loop block.
RAM Resources. To run this script using the "dd" tests, you'll only need a
couple of extra megabytes of RAM. "badblocks" will need much more. For the
"erase" test, you'll need 2 * "Block Size" * "Block Count". Since 64megs
is the default value, you'll need 128megs of RAM. For the "normal" test,
you'll need 3 * "Block Size" * "Block Count". Since 64megs is the default
value, you'll need 192megs of RAM (the extra 64megs is for the save data
buffer). For most modern systems, this won't be a problem. On a memory
limited system, change "Block Count" to 32, 16, or 8 ("-c 32" or "-c 16" or
"-c 8"). These numbers will keep the proper binary boundaries and not slow
down the tests too much.
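The RAM math above can be restated as shell arithmetic (these are just the script's default values plugged in, nothing measured):

```shell
# RAM needed by the "badblocks" based tests, using this script's
# default Block Size (1,048,576 bytes) and Block Count (64).
BLOCK_SIZE=1048576
BLOCK_COUNT=64
ERASE_MEGS=$((2 * BLOCK_SIZE * BLOCK_COUNT / 1048576))    # "erase" test
NORMAL_MEGS=$((3 * BLOCK_SIZE * BLOCK_COUNT / 1048576))   # "normal" test
echo "erase needs ${ERASE_MEGS}megs, normal needs ${NORMAL_MEGS}megs"
```

Halving "Block Count" with "-c 32" halves both numbers, which is why that's the knob to turn on memory limited systems.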
Test Buffer Sizes. I've heard some people argue that the test buffer sizes
mentioned above should be larger than the onboard disk cache. This is not
true with direct write tests. All the tests in this script are direct
write. None use any RAM caches. While doing the tests, I've watched the
drive light. It flashed constantly, as expected.
The Disk Log. Its name format is "Hostname_DiskDevice_Date_Time.log".
Under normal circumstances, a small subset of what's displayed on the
screen will be entered into the log with the INFO header. If a SMART
problem is detected, "badblocks" or "dd" exits with a non-zero return
code, or the kern.log shows a searchable entry, an entry will be made with
the ERROR header. Note that hitting CTRL-C to abort out of a test will
make "badblocks" and "dd" return a non-zero return code on some systems.
This will create a false log entry that should be deleted unless you
killed the script because it was hanging badly and there really were bad
blocks. Also note that not every entry for a particular disk can be
grep'd out of kern.log. If you're tracking down intermittent problems,
use the date stamp and go back to kern.log to see the full entry.
For long term reference, this script will output up to 3 "smartctl --xall"
logs. The first is before the loops start showing the initial drive
condition. The second is after all the loops have run. The optional third
is after the "long" built in self test. These last two will show any
changes and built in error logs if the drive supports it.
This script will watch 4 block related SMART parameters for errors. Each
one of these will indicate a new problem or show the count of past
problems. Most of the SMART parameters are dumb and not very useful for
showing drive health. These 4 parameters are important for the tests in
this script. Not all disks have these 4 parameters. A missing parameter
will be ignored. If the disk doesn't have any of these 4 parameters, SMART
monitoring will be automatically disabled with a warning printed.
5 Reallocated_Sector_Ct
196 Reallocated_Event_Count
197 Current_Pending_Sector
198 Offline_Uncorrectable
On a side note, "10 Spin_Retry_Count" is useful to see if a disk's
mechanics have started to fail. It doesn't have anything to do with bad
blocks, so it isn't monitored by this script.
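To eyeball those 4 attributes by hand, a filter like this over "smartctl -A" output works. The sample lines below are made up for illustration; pipe in real smartctl output from your own disk instead:

```shell
# Pull just the 4 monitored attributes out of "smartctl -A" output.
smart_filter() {
    grep -E 'Reallocated_Sector_Ct|Reallocated_Event_Count|Current_Pending_Sector|Offline_Uncorrectable'
}

# Made up sample lines standing in for real "smartctl -A /dev/disk" output:
printf '%s\n' \
  '  5 Reallocated_Sector_Ct   0x0033 100 100 036 Pre-fail Always - 0' \
  '  9 Power_On_Hours          0x0032 095 095 000 Old_age  Always - 4242' \
  '197 Current_Pending_Sector  0x0012 100 100 000 Old_age  Always - 0' \
  | smart_filter
```

Any non-zero raw value on those lines deserves a closer look.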
The CSV Log. This log contains statistics on how fast the test performed. The
first number is the loop count. The second number is megabytes/second. The
third number is the end of the loop block in megabytes. To find the
beginning of a loop block, add 1 to the previous entry's third number. The file name is
in the format of
"HostName_DiskDevice_Date_Time_TestType_TestLength_Passes_HeadMoveTime_
TotalRunTime.csv". I chose that long name so I'd know where, when, and what
type of test was run on the disk. I also added a gnuplot file derived from
this data. If gnuplot is installed, it will output a PNG file. If not, the
gnuplot file is self contained and can be copied to another computer with
gnuplot. Run it with "gnuplot FILE.gnuplot" or see below for batch
processing. I recommend keeping the CSV and PNG files and comparing them
to earlier and later tests as a simple indicator of disk health. If this
script is being run from a Linux Live CD, save them to a USB stick or a
network share. If one of the loop blocks has a noticeably slower rate than
the adjacent blocks, that's a good indicator that there is a problem
within that loop block.
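A small awk filter can flag the slow loop blocks automatically. The CSV column layout (loop, rate, end megabyte) follows the description above; the 80% of average cutoff is my own arbitrary pick, so tune it to taste:

```shell
# Print any loop block running under 80% of the average rate. Reads the
# CSV twice: first pass computes the average, second pass flags slow blocks.
csv_slow_blocks() {
    awk -F, '
        NR == FNR { sum += $2; n++; next }
        FNR == 1  { avg = sum / n }
        $2 < 0.8 * avg { printf "loop %s: %.1f mb/s (avg %.1f)\n", $1, $2, avg }
    ' "$1" "$1"
}
# Usage: csv_slow_blocks HostName_DiskDevice_Date_Time_..._TotalRunTime.csv
```

Remember the natural platter slowdown toward the end of the disk before panicking over flagged blocks in the last stretch of the graph.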
If the gnuplot files are properly formed (like mine), it is easy to
execute them all at once:
gnuplot *.gnuplot
For reference, executing gnuplot files one at a time in batch mode:
ls *.gnuplot | xargs -L1 gnuplot
or
for sFile in *.gnuplot ; do gnuplot "${sFile}" ; done
Note that the rates in the CSV log won't be very accurate if each loop
pass only takes several seconds or less to run. Standard shell scripts can
only measure time in whole seconds, and this limits the rate resolution
calculations. Use the "-k" or "-K" options to decrease the number of
chunks (making each loop longer) to help solve this. Ideally each loop
should take 2-5 minutes.
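A back of the envelope way to pick a "-K" value, assuming you've already measured the disk's rough speed (the 100 megabytes/second and 3 minute figures here are just example numbers):

```shell
# Pick a "-K" chunk size (in megabytes) so each loop takes a few minutes.
RATE=100      # rough disk speed in megabytes/second (example number)
TARGET=180    # seconds you want each loop to take (example number)
CHUNK=$((RATE * TARGET))
echo "use: -K ${CHUNK}"
```

A "readonly" pass is a cheap way to get the RATE number first.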
Disk speeds reported at the end of each loop are for how fast the disk got
through that loop, not how fast the individual reads/writes were inside
the loop. It is not an actual speed unless using the "readonly" or "zero"
tests in one pass. The "badblocks" program will have multiple reads and
writes and will show up as much slower. If the disk isn't having any
problems, each individual read and write operation is still just as fast
as in a single pass. Just keep in mind that multiple operations will slow
down the apparent performance. The number of passes option will do the
same thing.
For long term reference, take pictures or scan the disk and its labels when
you first get it. It is much easier to dig out a picture than to pull out
a disk for the model, serial number, manufacturing batch, etc.
Needed Programs Installed. Most Debian based Linux Live CD's will have
most of the programs needed to run this script already installed. The
smartmontools package (and gsmartcontrol for a visualized SMART log) is
usually left out, though. This script will check for the needed programs
and print the missing ones if there are problems. Debian based installs
can use the command below (run as root or through sudo) to install the
missing pieces. RPM based installs will have something similar.
# apt-get update ; apt-get -y install bash bc coreutils e2fsprogs \
    gnu-fdisk grep hostname mount procps sed time util-linux \
    smartmontools gsmartcontrol
A warning about SSD and flash partition boundary alignment. This is a way
more complicated topic than will be covered in this script, but you need
to be aware of it. If this script is given a raw disk device and not a
partition on that disk, starting at the very beginning, it should be very
fast using the default values. If you give this script a partition on the
same disk and it is much slower, then the partition alignment is off. Be
careful changing the "Block Size" and "Block Count" options: the default
values and the optional ones I've given above should work 99% of the
time, while non-standard values could really screw up performance. The
access is slow because the disk is writing across 2 flash blocks instead
of 1. This
causes a lot of needless wear and tear on the flash internals. If you run
into this problem, do a web search on the terms "flash partition boundary
alignment". Your device will need to be repartitioned with the correct
numbers. Sometimes these calculations are easy, sometimes not. Once
repartitioned, this script should run at the same rate on the raw disk
device and on a partition of that disk. Note that some file systems can also have
block alignment issues, but this script does not operate on the file system
level. Technically head and platter based hard disks can also have partition
alignment issues along with the file system chosen, but it's much less
pronounced.
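As a quick sanity check, a partition start sector that divides evenly by 2048 (a 1 megabyte boundary at 512 byte sectors) is safe on nearly all flash devices. The sector numbers below are examples; read your real ones from "fdisk -l":

```shell
# Check whether a partition start sector sits on a 1 megabyte boundary
# (2048 sectors of 512 bytes each). Example sectors only; get the real
# numbers from "fdisk -l" on your disk.
is_aligned() {
    if [ $(( $1 % 2048 )) -eq 0 ]; then
        echo "sector $1: aligned"
    else
        echo "sector $1: NOT aligned"
    fi
}
is_aligned 2048   # typical modern first partition start
is_aligned 63     # old DOS style start, misaligned on flash
```

If the start sector fails this check, that's your cue to do the "flash partition boundary alignment" research mentioned above.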
To wrap this up, let me make some comments about the arrogant stupidity of
disbelievers in testing. If this doesn't prove it for you right in front
of your own face, then there's no hope for you. Please stop polluting the
world with your arrogant stupidity. Real admins know better. Ignorant
users just shoot their mouths off trying to sound intelligent and
important. I've waded through a lot of that stupidity researching various
disk testing methods for this script. If you're one of those users
offended by this paragraph, do me and everyone else a favor by not using
this script and going far away. You don't want to hear me and the rest of
us don't want to hear you. Admins and hardware guys with a lot more
experience than me believe in testing because they've learned the hard
way. It's not difficult to do a web search to see what they say. If you're
one of those detestable people known as a "fake reviewer", please go jump
off a tall building. Lots of people have lost important and sometimes
irreplaceable data from your lies, not to mention the wasted salvage and
replacement time from a lousy product.