Failure Rate Estimation
M.Lampton UCB SSL
• An upper limit on failure rate, or a lower limit on MTTF, is required to establish system reliability
• The limit is to be obtained by measurement
• How many failure-free operations, or hours of operation, will establish a given limit?
• What is meant by the confidence level of a limit?
• General reference: I.Bazovsky “Reliability Theory & Practice” Prentice-Hall 1961, 1982.

Simplest case: independent random failures
ref: I.Bazovsky Chapters 3,4
• Failures are discrete, complete, and unambiguous: a failure happens, or it doesn’t.
• Failures are independent and they happen at random. They are not correlated. They obey Poisson statistics.
• Failure rate is statistically constant: no infant mortality, no wear-out, no limited life effects.

$$P(N, m) = \frac{m^{N} e^{-m}}{N!}, \qquad P(0, m) = e^{-m}$$

How to test for failure rate?
• Any test will involve measuring some random events
• The test, when repeated, can/will give various answers
• Yikes
• How to quantify a test result when the events being observed are random?
• Answer is Confidence Level

What is Confidence Level?
• Confidence = probability of making a correct conclusion, given your test data.
• Pick a confidence level, say 90%
• Then adjust the wording of your conclusion so that it is correct at least 90% of the time, for every possible true parameter value.

What would we expect?
• Let the true failure rate = R failures/hour
• Let a test have duration T hours.
• Mean number of expected failures is RT.
• Probability of zero failures is exp(-RT).
• The confidence region for R given zero observed failures and given any desired confidence C is...

$$0 \le R \le \frac{\log_e\left(1/(1-C)\right)}{T}$$

or

$$0 \le R \le 2.3/T \quad \text{for 90\% confidence.}$$

Why does this work?
• There are just two possibilities:
R ≤ 2.3/T: then the claim is, in fact, correct.
or
R > 2.3/T: then the claim is wrong, but the observed result (zero failures!) is unusual:
only a 10% chance, if R = 2.3/T
only a 9% chance, if R = 2.4/T
only an 8% chance, if R = 2.5/T
only a 1% chance, if R = 4.6/T
only a 0.1% chance, if R = 6.9/T, etc.

Example
• Have a 40000 hour mission (5 years).
• Want MTTF > 100000 hours with 90% confidence.
• Therefore want R < 1E-5/hour, 90% confidence.
• Therefore need T > 2.3E5 unit hours with zero failures.
• This is a lot! One unit = 29 years; 10 units = 2.9 years.
• Probably need “accelerated testing”, i.e. faster cycling, or more frequent thermal stresses, or higher RPMs, or whatever makes for accelerated failure rates. But then you need to estimate how much acceleration you are actually obtaining, a problem in itself.
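
The zero-failure bound and the worked example above can be checked numerically. What follows is a minimal Python sketch, not part of the original slides. The function names are hypothetical, and the 8000 hours/year conversion is an assumption inferred from the slide's "40000 hour mission (5 years)". The bound comes from requiring exp(-RT) ≤ 1 - C, which gives R ≤ log_e(1/(1-C))/T.

```python
import math


def zero_failure_rate_limit(test_hours, confidence=0.90):
    """Upper limit on failure rate R (per hour) after a failure-free test.

    Implements R <= ln(1 / (1 - C)) / T for zero observed failures.
    """
    return math.log(1.0 / (1.0 - confidence)) / test_hours


def required_test_hours(rate_limit, confidence=0.90):
    """Failure-free unit-hours T needed to claim R < rate_limit."""
    return math.log(1.0 / (1.0 - confidence)) / rate_limit


def prob_zero_failures(rate, test_hours):
    """Poisson probability of zero failures, P(0, m) = exp(-m) with m = R*T."""
    return math.exp(-rate * test_hours)


# Worked example from the slides: MTTF > 100000 hours at 90% confidence.
mttf_goal_hours = 100000.0
rate_goal = 1.0 / mttf_goal_hours            # 1E-5 failures/hour
unit_hours = required_test_hours(rate_goal, 0.90)

# Assumption: 8000 operating hours per year, implied by 40000 h = 5 years.
HOURS_PER_YEAR = 8000.0

print(f"unit-hours needed: {unit_hours:.3g}")
print(f"one unit:  {unit_hours / HOURS_PER_YEAR:.1f} years")
print(f"ten units: {unit_hours / HOURS_PER_YEAR / 10:.2f} years")

# Chance of seeing zero failures if the true rate sits above the limit.
for multiple in (2.3, 2.4, 2.5, 4.6, 6.9):
    print(f"R = {multiple}/T: {prob_zero_failures(multiple, 1.0):.1%}")
```

The loop sets T = 1 so that R equals the multiple directly, since only the product RT matters. The sketch relies on the same constant-failure-rate (Poisson) assumption as the slides and does not model accelerated testing.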