Confidence limits on percentiles based on test results with a few failures – non-parametric versus exponential

Mark P. Kaminskiy


ABSTRACT

The non-parametric lower confidence limits on percentiles in the classes of continuous and increasing failure rate distributions are compared to their parametric exponential counterpart for Type II censored data. Contrary to the common belief that non-parametric estimation procedures are always less effective than analogous parametric procedures, in the considered case the non-parametric procedures provide either better or the same confidence estimates. In the particular case when the data include only one uncensored observation (failure) and all three estimates (in the classes of continuous distributions, increasing failure rate distributions and the exponential distribution) exist, the respective three lower confidence limits on percentiles coincide.

Index Terms -- Confidence limits on percentiles, parametric estimation, non-parametric estimation, Type II censoring, reliability test planning

Acronyms¹

CDF cumulative distribution function

IFR increasing failure rate

Notation

$F(t)$ time to failure CDF

$\gamma$ confidence probability

$p$ quantile level, $100p$ percentile level

$t_p$ 100pth percentile of time to failure

$T_p(\gamma)$ lower $\gamma$ confidence limit on $t_p$

$I_x(a, b)$ incomplete beta function

$n_{\min}(p, \gamma)$ minimal sample size needed to estimate $T_p(\gamma)$

$En(x)$ function rounding $x$ up to the closest integer

$F(x, k)$ CDF of the $\chi^2$ distribution with $k$ degrees of freedom

$\chi^2_\gamma(k)$ $\gamma$th quantile of the Chi-square distribution with $k$ degrees of freedom

$\Gamma(m)$ Gamma function

$\operatorname{gamma}(l, z)$ lower incomplete Gamma function

1. INTRODUCTION

The lower confidence limit on the 100pth percentile (or $p$th quantile, or quantile of level $p$) is one of the most popular reliability measures. The random variable $T_p(\gamma)$ is the lower $\gamma$ confidence limit ($\gamma = 1 - \alpha$) on the 100pth percentile $t_p$ if these quantities satisfy the following relationship:

$$P\left\{ \int_{T_p(\gamma)}^{\infty} dF(t) \ge 1 - p \right\} \ge \gamma \tag{1}$$

where $F(t)$ is the time to failure cumulative distribution function (CDF), and $F(t_p) = p$.

¹ The singular and plural of an acronym are always spelled the same.

Statistical procedures for constructing the lower confidence limits on percentiles are now available for the most popular lifetime distributions, e.g., the exponential, Weibull, and lognormal distributions. These procedures can be found in the popular books on statistical reliability engineering and lifetime data analysis; see, for instance, Nelson [1], Lawless [2], and Kapur and Lamberson [3].

Many modern hardware products are so reliable that reliability engineers often deal with test data containing only a few distinct (uncensored) failure times, which makes reliability estimation under strong censoring an important practical problem.

Another closely related problem is reliability demonstration test planning. Based on the corresponding estimation procedures, demonstration test planning related to lifetime percentiles can be performed using the non-parametric as well as the parametric approach. The respective tools are implemented in commercially available software, e.g., Weibull++ developed by ReliaSoft.

There is a common belief among reliability statisticians and engineers that non-parametric estimation procedures are always less effective than analogous parametric procedures. In this paper, we show that in the case of strong Type II censoring (failure terminated testing), the non-parametric procedures for percentile estimation provide either better or the same results as their exponential counterparts. Note that the exponential distribution is still the most popular lifetime distribution in reliability engineering.

2. NONPARAMETRIC LIMITS IN CLASS OF CONTINUOUS DISTRIBUTIONS

The Type II censoring (failure terminated testing) case is considered. Let $t_{(r)}$ be the time to the $r$th failure observed during a test of a sample of $n$ identical items. The $r$th failure time (order statistic) $t_{(r)}$ is the lower $\gamma$ confidence limit on the 100pth percentile $t_p$ in the class of continuous distributions if its order number, $r$, satisfies the following inequality (Wilks [4]):

$$I_p(r, n - r + 1) \ge \gamma \tag{2}$$

where $\gamma = 1 - \alpha$, and $I_x(a, b)$ is the incomplete beta function, given by

$$I_x(a, b) = \frac{\int_0^x t^{a-1}(1 - t)^{b-1}\,dt}{\int_0^1 t^{a-1}(1 - t)^{b-1}\,dt}$$

and $0 < x < 1$.

In other words, if $p$, $r$, $n$, and $\gamma$ satisfy inequality (2), then the lower $\gamma$ confidence limit $T_p(n, r, \gamma)$ on the 100pth percentile $t_p$ is equal to the $r$th order statistic, i.e.

$$T_p(n, r, \gamma) = t_{(r)} \tag{2-1}$$

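It should be noted that, for practical applications, the left side of (2) should be as close to the right side as possible. Since the left side of (2) decreases as $r$ increases, this means choosing the largest order number $r$ that still satisfies (2).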
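For integer $r$ and $n$, the left side of (2) equals the upper tail probability of a binomial distribution, $I_p(r, n - r + 1) = P\{\mathrm{Bin}(n, p) \ge r\}$, so inequality (2) can be checked without a special-function library. The following Python sketch is illustrative only; the function names `incomplete_beta_int` and `largest_order_number` are assumptions introduced here and are not part of the original procedure.

```python
from math import comb


def incomplete_beta_int(p: float, r: int, n: int) -> float:
    """Left side of (2), I_p(r, n - r + 1), for integer r and n.

    Uses the identity I_p(r, n - r + 1) = P(Bin(n, p) >= r).
    """
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(r, n + 1))


def largest_order_number(n: int, p: float, gamma: float) -> int:
    """Largest r satisfying inequality (2); returns 0 if no r qualifies."""
    best = 0
    for r in range(1, n + 1):
        if incomplete_beta_int(p, r, n) >= gamma:
            best = r
        else:
            break  # the binomial tail only decreases as r grows
    return best
```

For example, `largest_order_number(22, 0.1, 0.9)` returns 1, while `largest_order_number(21, 0.1, 0.9)` returns 0, meaning that with 21 items no order statistic qualifies as a lower 0.9 confidence limit on the 10th percentile.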

Note also that, for given $\gamma$ and $p$, there exists a minimal necessary sample size, $n_{\min}(p, \gamma)$, for which the time to the first failure $t_{(1)}$ is the lower $\gamma$ confidence limit on the 100pth percentile $t_p$, i.e.,

$$T_p(n_{\min}, 1, \gamma) = t_{(1)}. \tag{2-2}$$

Using relationship (2), this minimal sample size can be evaluated as

$$n_{\min}(p, \gamma) = En\left[ \frac{\ln(1 - \gamma)}{\ln(1 - p)} \right] \tag{3}$$

where $En(x)$ is the function rounding $x$ up to the closest integer. The $En(x)$ function usually makes the confidence probability $\gamma$ a little higher than one needs, which is illustrated by the following table.

Table 1. Actual Confidence Probabilities for $\gamma = 0.9$ in Equation (3)

| Quantile level $p$ | $n_{\min}$ | Actual $\gamma$ (left side of (2)) |
|---|---|---|
| 0.1 | 22 | 0.902 |
| 0.05 | 45 | 0.901 |
| 0.01 | 230 | 0.901 |

Note that Equation (3) is often used in reliability demonstration test planning.
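For $r = 1$ the left side of (2) reduces to $I_p(1, n) = 1 - (1 - p)^n$, which is where Equation (3) comes from. The short Python sketch below evaluates Equation (3) and the resulting actual confidence probability at the quantile levels of Table 1. The names `n_min` and `actual_confidence` are illustrative assumptions, and the floating-point ceiling may need extra care when the ratio in (3) is an exact integer.

```python
from math import ceil, log


def n_min(p: float, gamma: float) -> int:
    """Equation (3): minimal sample size for which t_(1) is a lower limit."""
    return ceil(log(1 - gamma) / log(1 - p))


def actual_confidence(p: float, n: int) -> float:
    """Left side of (2) for r = 1: I_p(1, n) = 1 - (1 - p)**n."""
    return 1 - (1 - p) ** n


# Evaluate both quantities for gamma = 0.9 at the quantile levels of Table 1.
for p in (0.1, 0.05, 0.01):
    n = n_min(p, 0.9)
    print(p, n, round(actual_confidence(p, n), 3))
```

Because of the rounding up, the actual confidence probability is never below the requested one.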

3. PARAMETRIC LIMITS FOR EXPONENTIAL DISTRIBUTION

Now let us assume that the time to failure (TTF) distribution is exponential. Under this assumption, for the same Type II censored data, one can also estimate the lower confidence limit on a percentile, using the same sample of $n_{\min}(p, \gamma)$ identical items with the first and only failure at time $t_{(1)}$.

Consider the well-known lower confidence limit on the percentile $T_p(\gamma)$ of the exponential distribution for a Type II censored sample of size $n$ with $r$ uncensored failure times $t_{(1)} < t_{(2)} < \ldots < t_{(r)}$. This lower confidence limit is given by:

$$T_p(n, r, \gamma) = \frac{2 T_{nr} \left( -\ln(1 - p) \right)}{\chi^2_\gamma(2r)} \tag{4}$$

where $T_{nr} = \sum_{i=1}^{r} t_{(i)} + (n - r)\,t_{(r)}$ is the total failure-free operation time accumulated by all items of the sample (total time on test), and $\chi^2_\gamma(2r)$ is the $\gamma$th quantile of the Chi-square distribution with $2r$ degrees of freedom.

In the particular case of $r = 1$, the lower $\gamma$ confidence limit (4) takes on the following form:

$$T_p(n, 1, \gamma) = \frac{-2 n\, t_{(1)} \ln(1 - p)}{\chi^2_\gamma(2)} \tag{4-1}$$

Now we are going to show that, if the sample size $n$ in the confidence estimate (4-1) is given by Equation (3), the confidence estimate (4-1) reduces to $t_{(1)}$, i.e., $T_p(n, 1, \gamma) = t_{(1)}$. In other words, in this case, the non-parametric lower $\gamma$ confidence limit on the percentile coincides with its parametric (exponential) counterpart.

Let $F(x, k)$ be the CDF of the $\chi^2$ distribution with $k$ degrees of freedom, which is given by

$$F(x, k) = \frac{\operatorname{gamma}(k/2,\, x/2)}{\Gamma(k/2)} \tag{5}$$

where $\Gamma(m)$ denotes the Gamma function, and $\operatorname{gamma}(l, z)$ is the lower incomplete Gamma function, which is defined as

$$\operatorname{gamma}(l, z) = \int_0^z t^{\,l-1} e^{-t}\,dt.$$

In our case of $k = 2$ (see Equation (4-1)), the CDF (5) is reduced to:

$$F(x, 2) = \operatorname{gamma}(1, x/2) = 1 - e^{-x/2} \tag{5-1}$$

Using (5-1), the $\gamma$th quantile of the Chi-square distribution with 2 degrees of freedom, $\chi^2_\gamma(2)$, can be written as

$$\chi^2_\gamma(2) = -2\ln(1 - \gamma) \tag{6}$$

Replacing $n$ and $\chi^2_\gamma(2)$ in (4-1) by the right sides of (3) and (6) respectively (with the rounding in (3) neglected), one gets

$$T_p(n, 1, \gamma) = t_{(1)}, \tag{4-2}$$

which proves that in the considered case, the exponential lower $\gamma$ confidence limit on the percentile (4-1) coincides with its non-parametric counterpart in the class of continuous distributions.
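Written out, the substitution of (3) and (6) into (4-1) is a single cancellation. The step below treats $n$ as the unrounded ratio $\ln(1 - \gamma)/\ln(1 - p)$, which is an assumption of this illustration; with the rounding of $En(\cdot)$ the factor multiplying $t_{(1)}$ is slightly above one.

$$T_p(n, 1, \gamma) = \frac{-2 n\, t_{(1)} \ln(1 - p)}{-2\ln(1 - \gamma)} = n\,\frac{\ln(1 - p)}{\ln(1 - \gamma)}\, t_{(1)} = \frac{\ln(1 - \gamma)}{\ln(1 - p)} \cdot \frac{\ln(1 - p)}{\ln(1 - \gamma)}\, t_{(1)} = t_{(1)}.$$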

4. NONPARAMETRIC ESTIMATION IN CLASS OF IFR DISTRIBUTIONS

The minimal required sample size $n_{\min}(p, \gamma)$ mentioned above can be a serious limitation to applying the non-parametric lower confidence limit in the class of continuous distributions (2). This limitation stimulated Barlow and Proschan [5, 6] to obtain the lower confidence limit in the narrower class of increasing failure rate (IFR) distributions.

For the Type II censored sample of size $n$ with $r$ uncensored failure times $t_{(1)} < t_{(2)} < \ldots < t_{(r)}$, the Barlow-Proschan lower confidence limit on the 100pth percentile of an IFR distribution is given as

$$T_p(n, r, \gamma) = T_{nr} \min\left\{ \frac{-2\ln(1 - p)}{\chi^2_\gamma(2r)},\ \frac{1}{n} \right\} \tag{7}$$

It is important to note that, if

$$\min\left\{ \frac{-2\ln(1 - p)}{\chi^2_\gamma(2r)},\ \frac{1}{n} \right\} = \frac{-2\ln(1 - p)}{\chi^2_\gamma(2r)}, \tag{8}$$

the lower confidence limit for IFR distributions (7) coincides with the lower confidence limit for the exponential distribution (4). On the other hand, it can be shown that, if

$$\min\left\{ \frac{-2\ln(1 - p)}{\chi^2_\gamma(2r)},\ \frac{1}{n} \right\} = \frac{1}{n}, \tag{9}$$

then the order number $r$ of the statistic $t_{(r)}$ satisfies Inequality (2) with the same parameters $n$, $p$ and $\gamma$ (Kaminsky [7]). Thus, if condition (9) is satisfied, one has two competing non-parametric confidence estimates: one in the class of continuous distributions and another in the class of IFR distributions.
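The Barlow-Proschan limit (7) and the check of which of the conditions (8) and (9) applies can be sketched in Python as follows. The function names and the returned condition label are assumptions made for this illustration; the Chi-square quantile is obtained by bisection on the closed-form CDF for even degrees of freedom, which is a convenience of the sketch and not part of the original procedure.

```python
from math import exp, log


def chi2_quantile_even(gamma: float, r: int) -> float:
    """gamma-th quantile of Chi-square with 2r degrees of freedom (bisection)."""

    def cdf(x: float) -> float:
        half, term, total = x / 2.0, 1.0, 1.0
        for j in range(1, r):
            term *= half / j  # (x/2)^j / j!
            total += term
        return 1.0 - exp(-half) * total

    lo, hi = 0.0, 1.0
    while cdf(hi) < gamma:  # bracket the quantile
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < gamma:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ifr_lower_limit(failure_times, n: int, p: float, gamma: float):
    """Equation (7); also reports whether condition (8) or (9) is active."""
    t = sorted(failure_times)
    r = len(t)
    t_nr = sum(t) + (n - r) * t[-1]  # total time on test
    exp_factor = -2.0 * log(1.0 - p) / chi2_quantile_even(gamma, r)
    # If both terms are equal, (8) and (9) hold together; "(8)" is reported.
    condition = "(8)" if exp_factor <= 1.0 / n else "(9)"
    return t_nr * min(exp_factor, 1.0 / n), condition
```

With a single hypothetical failure at 100 hours, $p = 0.1$ and $\gamma = 0.9$, a sample of 10 items falls under condition (8), whereas a sample of 50 items falls under condition (9).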

At this point, an expected question is "which estimate is better, if Equation (9) holds?" Below, we are going to show that the better lower confidence limit is the one given by Equation (2-1), which is applicable to any continuous time to failure distribution.

If condition (9) is satisfied, according to Equation (2-1), the lower $\gamma$ confidence limit on the percentile $t_p$ in the class of continuous distributions exists as the $r$th order statistic $t_{(r)}$. Under the same condition, according to Equation (7), the respective lower $\gamma$ confidence limit on the percentile $t_p$ in the class of IFR distributions is given by

$$T_p(n, r, \gamma) = \frac{T_{nr}}{n} = \frac{\sum_{i=1}^{r} t_{(i)} + (n - r)\,t_{(r)}}{n} \tag{10}$$

Comparing the right side of (2-1) with the right side of (10), one comes to the conclusion that under condition (9) the lower $\gamma$ confidence limit on the percentile $t_p$ in the class of continuous distributions is better, since it takes on a greater value than its IFR counterpart. Finally, it is easy to show that under condition (9), the exponential confidence estimate (given by Equation (4)) is worse compared even to the respective lower $\gamma$ confidence limit on the percentile of IFR distributions (Equation (7)).
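The comparison of (2-1) with (10) rests on a one-line bound, which can be written out as follows. It uses only the ordering $t_{(i)} \le t_{(r)}$ for $i = 1, \ldots, r$:

$$\frac{T_{nr}}{n} = \frac{\sum_{i=1}^{r} t_{(i)} + (n - r)\,t_{(r)}}{n} \le \frac{r\,t_{(r)} + (n - r)\,t_{(r)}}{n} = t_{(r)}.$$

Equality holds when $r = 1$, and the inequality is strict when $r > 1$ and the failure times are distinct.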

If condition (8) holds, the lower $\gamma$ confidence limit on the percentile $t_p$ in the class of continuous distributions does not exist, i.e. the order number $r$ of $t_{(r)}$ does not satisfy Inequality (2). Nevertheless, the respective IFR confidence limit on $t_p$ does exist and coincides with its exponential counterpart (4).

We still have one case unconsidered. It is the case when the sample size $n$ is too small to construct the lower $\gamma$ confidence limit in the class of continuous distributions, and there is one failure in the sample, i.e. $r = 1$ and $n < n_{\min}(p, \gamma)$. In this case, condition (8) holds, and the lower confidence limit for IFR distributions (7) coincides with the lower confidence limit for the exponential distribution (4).
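Why condition (8) holds in this case can be seen from a short chain of inequalities, sketched here using (3) and (6). Since $n$ is an integer and $En(x) - 1 < x$, the assumption $n < n_{\min}(p, \gamma)$ gives

$$n < \frac{\ln(1 - \gamma)}{\ln(1 - p)} \quad\Longrightarrow\quad \frac{1}{n} > \frac{\ln(1 - p)}{\ln(1 - \gamma)} = \frac{-2\ln(1 - p)}{\chi^2_\gamma(2)},$$

so the minimum in (7) is attained at the first term, which is exactly condition (8) with $r = 1$.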

5. CONCLUSIONS

We have compared the procedure for constructing lower confidence limits on percentiles of the exponential distribution with its non-parametric alternatives in the classes of continuous distributions and IFR distributions.

Table 2 displays a summary of the above discussion. Analyzing the table leads to the following conclusions.

1. In all the cases when both non-parametric estimates are available, the estimates in the class of continuous distributions provide either better or the same results as the IFR estimates and the parametric estimates for the exponential distribution.

2. In some cases, the non-parametric estimates for IFR distributions coincide with the respective exponential estimates, which comes as no surprise if one recalls that the exponential distribution belongs to the IFR class (Barlow and Proschan [5]).

3. In a case that is important from a practical standpoint, when the sample size is the minimum needed to get the non-parametric estimate in the class of continuous distributions based on the first and only failure, it is shown that all three considered estimation procedures provide the same result.

4. For the given sample size $n$, number of uncensored failure times $r$, percentage $100p$, and confidence probability $\gamma$, Table 2 helps to choose the non-parametric estimation procedure yielding the same or better result than the one based on the assumption of exponentially distributed failure times. This can be especially helpful in situations when the samples are strongly censored and applying goodness-of-fit tests is not very useful.

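Table 2. Non-parametric and Exponential Lower $\gamma$ Confidence Limits $T_p$ for the 100pth Percentile for a Type II Censored Sample of Size $n$ with $r$ Uncensored Failure Times

| Sample size, $n$ | Number of uncensored failure times, $r$ | Class of continuous distributions, $T_p^{cont}$ | Class of IFR distributions, $T_p^{IFR}$ | Exponential distribution, $T_p^{Exp}$ | $\min\left\{ \frac{-2\ln(1 - p)}{\chi^2_\gamma(2r)},\ \frac{1}{n} \right\}$ |
|---|---|---|---|---|---|
| $n < n_{\min}$ ² | 1 | Does not exist | Eq. (7): $T_p^{IFR} = T_p^{Exp}$ | Eq. (4): $T_p^{Exp} = T_p^{IFR}$ | $\frac{-2\ln(1 - p)}{\chi^2_\gamma(2r)}$ |
| $n = n_{\min}$ | 1 | Eq. (2-2): $T_p^{cont} = t_{(1)}$ | Eq. (7): $T_p^{IFR} = t_{(1)}$ | Eq. (4-2): $T_p^{Exp} = t_{(1)}$ | $\frac{1}{n}$ |
| $n < n_{\min}$ | $> 1$ | Does not exist as $t_{(r)}$ | Eq. (7): $T_p^{IFR} = T_p^{Exp}$ | Eq. (4): $T_p^{IFR} = T_p^{Exp}$ | $\frac{-2\ln(1 - p)}{\chi^2_\gamma(2r)}$ |
| $n > n_{\min}$ | $> 1$ | Eq. (2-1): $T_p^{cont} = t_{(r)}$, $T_p^{cont} > T_p^{IFR}$ | Eq. (7): $T_p^{cont} > T_p^{IFR} > T_p^{Exp}$ | Eq. (4): $T_p^{Exp} < T_p^{IFR}$ | $\frac{1}{n}$ |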

REFERENCES

1. Nelson, W. Applied life data analysis. New York, Wiley, 1982.

2. Lawless, J. Statistical models and methods for lifetime data. 2nd edition. New York, Wiley-Interscience, 2003.

3. Kapur, K. and Lamberson, L. Reliability in Engineering Design. New York, Wiley, 1997.

4. Wilks, S. Mathematical statistics. New York, Wiley, 1962.

5. Barlow, R. and Proschan, F. Mathematical theory of reliability. Philadelphia, SIAM, 1996.

6. Barlow, R. and Proschan, F. "Tolerance and Confidence Limits for Classes of Distributions Based on Failure Rates," Annals of Mathematical Statistics, 37, 6, pp. 1593-1601, 1966.

7. Kaminsky, M. "Non-parametric Confidence Estimation of Quantiles of Duration of Failure-Free Operation in Classes of Continuous and Aging Distributions," Engineering Cybernetics, 22, 1, pp. 10-14, 1984.

Mark P. Kaminskiy is the chief statistician at the Center of Technology and Systems Management of the University of Maryland (College Park), USA. Dr. Kaminskiy is a researcher and consultant in reliability engineering, risk analysis and life data analysis. He has conducted numerous research and consulting projects funded by the government and industrial companies, such as the Department of Transportation, Coast Guard, Army Corps of Engineers, Navy, Nuclear Regulatory Commission, American Society of Mechanical Engineers, Ford Motor Company, General Dynamics, Qualcomm Inc, and several other engineering companies. He has taught several graduate courses on Reliability Engineering at the University of Maryland. He is a coauthor of the book Reliability Engineering and Risk Analysis, the second edition of which was published in 2009.

Dr. Kaminskiy is the author or coauthor of over 50 publications in journals, conference proceedings, and reports.

² $n_{\min}$ is given by Equation (3).
