The MTBF of a RAID-0 system (or of a dual CPU/memory setup where one unit
CANNOT continue without the other) will always be lower than that of a single
drive unless the standard deviation of the drive lifetimes (they never quote
an SD) is zero, i.e. every drive fails simultaneously at exactly the MTBF and
none before - pretty unlikely, I think. Nor will the MTBF simply be halved
unless half the devices fail immediately and the other half last exactly 2x
MTBF. The reality for the MTBF of a RAID-0 will lie somewhere in between.
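If you want to see that, here is a rough back-of-envelope simulation in
Python. It assumes each drive's lifetime follows a normal "wear-out" curve
with a mean of 250,000 hours and a made-up spread - my assumption purely for
illustration, not any manufacturer's model:

import random

MTBF = 250_000.0   # hours, mean lifetime of one drive
SD = 50_000.0      # hours, assumed spread - set to 0 for the degenerate case
TRIALS = 100_000

def drive_life():
    # one sampled drive lifetime, clamped so a wild sample can't go negative
    return max(0.0, random.gauss(MTBF, SD))

single = sum(drive_life() for _ in range(TRIALS)) / TRIALS
stripe = sum(min(drive_life(), drive_life()) for _ in range(TRIALS)) / TRIALS

print("mean single drive life  : %.0f hours" % single)
print("mean 2-drive stripe life: %.0f hours" % stripe)

The stripe dies when its first drive dies, so the second figure comes out
below the first but well above half of it. Set SD to 0 and the two figures
coincide; widen the spread and the gap between them grows.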
MTBF is not really of great use for our purposes anyway. Disk drive MTBFs
these days are quoted at about 250,000+ hours (more than 28 years of
continuous use)! I certainly have my doubts about the accelerated testing
methods behind such figures. But whatever happens to the MTBF of multiple
inter-dependent drives will be pretty irrelevant over the lifetime of our
usage of the device.
Cumulative failure rate is a much more useful figure for us, and for a small
number of fairly reliable inter-dependent devices it is nearly additive - but
not quite.
Seagate reckon about 3.41% of drives (on their flat-line model) will fail
during the first 5 years of use, assuming you only use the drive for 2400
hours a year (about 6 1/2 hours a day):
http://www.seagate.com/docs/pdf/newsinfo/disc/drive_reliability.pdf
To calculate the failure rate for multiple inter-dependent devices you need
to take the product of the individual survival rates and subtract it from 1.
eg. the survival rate over 5 years is 100% - 3.41% = 96.59% = 0.9659
So :
2 drives in raid-0 configuration running for 5 years: 1 - (0.9659 * 0.9659)
= 0.0670 = 6.7% - note this is not quite double the single-drive figure.
3 drives in raid-0 configuration running for 5 years:
1 - (0.9659 * 0.9659 * 0.9659) = 0.0989 = 9.9% - even further from triple.
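Here is the same sum in a few lines of Python, in case anyone wants to plug
in their own drive count or a different per-drive figure (the 3.41% being
Seagate's 5-year flat-line number from the PDF above):

per_drive_failure = 0.0341          # Seagate flat-line figure, 5 yrs at 2400 hrs/yr
survival = 1.0 - per_drive_failure  # 0.9659

for drives in (1, 2, 3, 4):
    # any one drive failing loses the whole stripe, so multiply survival rates
    stripe_failure = 1.0 - survival ** drives
    print("%d drive(s): %5.2f%% chance of losing the array in 5 years"
          % (drives, 100.0 * stripe_failure))

which prints 3.41%, 6.70%, 9.89% and 12.96% - each step a bit less than a
straight multiple of the single-drive figure.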
Long before 5 years (AA excluded of course - has he got bored and gone to
irritate other mailing lists?) you will probably want bigger and better
storage anyway. You can calculate figures for different expected usage
yourself.
Other manufacturers, and probably other Seagate product ranges, will vary.
Steve
PS. Call me picky, but I notice that Seagate's actual warranty failure rate
exceeds or equals their so-called "conservative" flat-line model. The author
should seriously consider becoming a politician :-)
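PPS. The MTBF estimate described in the quoted exchange below - run a big
batch of drives for a fixed period and divide the total drive-hours by the
number of failures - is just a point estimate, and it quietly assumes a
constant failure rate. A toy version in Python, using the
10,000-drives-for-1,000-hours example from the message and an invented
failure count purely to make the arithmetic concrete:

drives_on_test = 10000
hours_on_test = 1000
failures_seen = 40     # invented number, just for illustration

mtbf_estimate = (drives_on_test * hours_on_test) / failures_seen
print("estimated MTBF: %.0f hours" % mtbf_estimate)   # 250,000 hours

Note that a 250,000-hour result like that says nothing about how long any
individual drive in the batch actually lasted.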
----- Original Message -----
From: "Austin Franklin" <darkroom@ix.netcom.com>
To: <filmscanners@halftone.co.uk>
Sent: Monday, November 12, 2001 11:05 PM
Subject: RE: filmscanners: Best solution for HD and images
> > Seems like you have done everything and also know everything.
>
> Not everything, but having been an engineer for 25 years, I have done many
> projects including digital imaging systems, and SCSI systems... What I do
> know, I know, and what I don't know, I know I don't know. I don't just
> make things up.
>
> > I don't
> > know how your company (or you) determined MTBF of a RAID0 system but
> > most companies as Compaq, IBM, Sun, Adaptec, etc. say that MTBF will
> > decrease.
>
> There is only one article I have seen that says this, and I have had
> discussions with the authors about this. Do you have any reference to
> articles/spec sheets that make this claim?
>
> Interestingly enough, MTBF does not derate for adding a second CPU or for
> adding more memory to the system...
>
> > Exactly because of the reduced MTBF of a system with multiple
> > HDs Berkeley has suggested the RAID system.
>
> Is this "study" published anywhere? If so, I'd like to see it.
>
> > The RAID system is supposed
> > to relax the impact of the reduced MTBF. That doesn't mean the MTBF
> > becomes higher when a RAID system is deployed but it just makes it more
> > likely that the failure can be repaired.
>
> Failure recovery is entirely different from MTBF.
>
> > I see though where your (company's) calculation might come from.
>
> The company was Digital, BTW. We had an entire department devoted to MTBF
> testing...and specifically to storage MTBF assessment.
>
> > You
> > can determine MTBF for a certain device by testing for example 10000
> > drives for 1000 hours and then divide the total of 10000*1000 hours by
> > the number of failures.
>
> That's not really how you determine MTBF. MTBF is an average. You are
> right, you need a large sample to test though.
>
> > Nevertheless, this calculation doesn't apply to RAID as a RAID system
> > has to be considered as a single identity.
>
> Exactly, and that is why you don't get any decrease in MTBF by adding
> drives. It's really simple.
>
> > So you cannot claim that
> > because you have 10 HDs your RAID system is working 10*1=10 hours in
> > each single hour. Your RAID system is ONE identity and therefore is
> > working only 1 hours each hour it is up. Therefore the MTBF decreases.
>
> Why does the MTBF decrease? You have a magical "therefore" that doesn't
> follow.
>
> If you tested 1000 drives by themselves, and you got an MTBF of 1,000,000
> hours, let's say...take those 1000 drives, and make 500 RAID 0 systems,
> and your MTBF will NOT decrease notably, if at all, from drive failure.
> It may from other factors like power supply or thermal, but not from
> drive failure.
>
>