SELECTED APPROACHES FOR RELIABILITY COMPARISON OF HIGHLY
RELIABLE ITEMS
D. Valis, Z. Vintr
University of Defence, Brno, Czech Republic e-mail: [email protected], [email protected]
M. Koucky
Technical University of Liberec, Liberec, Czech Republic e-mail: [email protected]
ABSTRACT
The application of electronic elements introduces a number of advantages as well as disadvantages. The paper deals with an advanced dependability method - a reliability analysis procedure for a highly reliable item. Data on the manufacturing and operation of a few hundred thousand pieces of the highly reliable devices are available and, from the statistical point of view, they form a very important collection/set. However, for some pieces of the items the manufacturing procedure was not carried out, controlled and checked accurately. The procedure described in the paper is based on a thorough analysis of the data on the operation and manufacturing of these electronic elements. As the collected data sets are statistically non-coherent, the objective of the paper is to make a statistical assessment and evaluation of the results. Failure rate calculations and the comparison of their relation for both sets are presented in the paper.
1 INTRODUCTION
As we know from previous publications, the item is initialised by start power. We have also discovered from the previous publications that the reliability assessment of the items may highlight some mathematical non-coherence. The available data sets differ in order of magnitude, therefore their comparability might be problematic. That is why the calculated measures - the failure rates - must be tested before claiming their comparability in terms of the functional description - characteristic of the item. All the terms used are in accordance with IEC 60050/191. The whole calculation has been made because, unfortunately, non-intentional causes resulted in non-compliance with the manufacturing process during development and manufacturing of a new item. While manufacturing the item, a relatively minor shortening of the program protocol took place, thereby shortening the initialisation time. This situation resulted in the production of many incorrectly manufactured items whose initialisation time was shortened by the program. The non-compliance with the manufacturing process was detected only by accident, and only after some time. However, most of the items manufactured this way have been mounted in systems and have been in operation. In this paper we address the reliability assessment of such a highly reliable electronic item.
In this paper, the application of reliability data analysis techniques - the procedure for comparison of two constant failure rates - is evaluated for an item produced for a system's specific use. The item is implemented in a system in order to control one of the step functions of the system.
Based on the assumptions and the calculations made before, the reliability measure values for correctly and incorrectly programmed items were found. These values were calculated at the required confidence level. By comparing these values we were able to determine whether the error made during the manufacturing process affects the item reliability.
However, concerning the field data we face a theoretical problem. The data sets apparently differ by an order of magnitude in terms of the accumulated operation time of the item sets. It means that correctly manufactured items have obviously operated for a shorter time than the ones manufactured incorrectly. This situation can affect the calculation procedure as well as the comparison of the results. Taking this into account, it is necessary to test the field data using a statistical test which is supposed to prove their comparability. The results of the test are presented in the second paper named "Statistical comparing of reliability of two sets of highly reliable items". For more details see e.g. Holub (1992) or Finn (1998).
2 APPLICATION OF A COMPARISON TECHNIQUE
In this case, when taking into account two sets of objects, we have to consider reliability measures under the presumption that the sets can be different. The times to failure of both sets are independent and fulfil the presumption of exponential distribution. For more details see IEC 60605-4, IEC 61650 or Lipson & Sheth (1973).
It is necessary to introduce other important relations which are essential for the next steps. Because it is a case of non-repaired items, we can assume that:
- the accumulated operation/test time is calculated as a sum of times to failure;
- all the objects belong to the same original set.
In order to use the comparison procedures the following data are required:
- the observed numbers of valid failures r1 and r2 in the two observation periods - this is fulfilled;
- the accumulated valid test times T1* and T2* in these two periods - this is fulfilled;
- the confidence level, stated/chosen if required.
All the information is available and it is possible to continue working with it.
Following IEC 61650 we choose the accurate calculation for the comparison of two constant failure rates using the F-distribution. We calculate f using equation (1):

f = \frac{r_2}{r_1 + 1} \cdot \frac{T_1^*}{T_2^*}    (1)

For the chosen confidence level we get fc (either for 1 - α0 = 0,90 or for 1 - α0 = 0,95) from the tables of the F-distribution stated in appendix A of the document EN 60812:2006:

f_c = F_{1-\alpha_0}(\nu_1, \nu_2)    (2)

where ν1 = 2(r1 + 1); ν2 = 2r2.

Next we use the decision criteria given in table 2 of IEC 61650, stating that if f > fc then λ1 < λ2, and if f ≤ fc then λ1 = λ2. Generally the recommended value for the calculation is α0 = 5 % or 10 %, which corresponds to the (1 - α0)-fractiles, that is the 0,95-fractiles or 0,90-fractiles of the F-distribution.
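To make the procedure concrete, a minimal Python sketch of formulas (1) and (2) and the decision rule is given below. It assumes SciPy is available; the function name, argument names and the example values are ours (illustrative only) and are not taken from IEC 61650.

```python
# Sketch of the comparison of two constant failure rates along the lines of
# formulas (1) and (2) above; the example data are made up for illustration.
from scipy.stats import f as f_dist

def compare_failure_rates(r1, T1, r2, T2, conf=0.95):
    """Compare two constant failure rates.

    r1, T1 -- observed failures and accumulated test time of set 1
    r2, T2 -- observed failures and accumulated test time of set 2
    conf   -- confidence level 1 - alpha_0 (e.g. 0.90 or 0.95)
    """
    f_stat = (r2 / (r1 + 1.0)) * (T1 / T2)        # formula (1)
    v1, v2 = 2 * (r1 + 1), 2 * r2                 # degrees of freedom
    fc = f_dist.ppf(conf, v1, v2)                 # formula (2)
    verdict = "lambda1 < lambda2" if f_stat > fc else "no significant difference"
    return f_stat, fc, verdict

# illustrative (made-up) data: 20 failures in 1.0e6 h vs. 2 failures in 2.0e5 h
print(compare_failure_rates(r1=20, T1=1.0e6, r2=2, T2=2.0e5))
```

Applied to the data in the calculation below, the same function reproduces, up to rounding, the values of f and fc reported there.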
The calculation:
The calculation has been intentionally modified to protect an industrial secret and to avoid providing sensitive data about the product. The confidence level was set at 95 %.
The mean time to failure is calculated as m = 2T*/χ²_α(ν):

a) for incorrectly manufactured items:

m_{F/IC} = \frac{2\,T^{*}_{IC}}{\chi^{2}_{0{,}95}(51)} = \frac{2 \cdot 230\,995\,532\ \mathrm{h}}{68{,}648} = \frac{461\,991\,064\ \mathrm{h}}{68{,}648} = 6{,}73 \cdot 10^{6}\ \mathrm{h}

where the accumulated operation time of all incorrectly manufactured items, according to the assumption given in IEC 61650, chapter 4, article 4, is

T^{*}_{IC} = \sum_{i=1}^{n} t_{i} = 230\,995\,532\ \mathrm{h};

the number of degrees of freedom is ν = 2r1 + 1 = 2·25 + 1 = 51; and the chi-square value for 51 degrees of freedom and the confidence level of 95 % is 68,648.
b) for correctly manufactured items:

m_{F/C} = \frac{2\,T^{*}_{C}}{\chi^{2}_{0{,}95}(3)} = \frac{2 \cdot 56\,864\,717\ \mathrm{h}}{7{,}8} = \frac{113\,729\,434\ \mathrm{h}}{7{,}8} = 1{,}46 \cdot 10^{7}\ \mathrm{h}

where the accumulated operation time of all correctly manufactured items, according to the assumption given in IEC 61650, chapter 4, article 4, is

T^{*}_{C} = \sum_{i=1}^{n} t_{i} = 56\,864\,717\ \mathrm{h};

the number of degrees of freedom is ν = 2r2 + 1 = 2·1 + 1 = 3; and the chi-square value for 3 degrees of freedom and the confidence level of 95 % is 7,8.
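The two mean time to failure values above can be re-checked with a short Python sketch using scipy.stats.chi2. This is only a verification of the arithmetic, with the degrees of freedom ν = 2r + 1 chosen exactly as in the text; the helper name is ours.

```python
# Sketch: reproduce the mean time to failure values m = 2*T / chi2_conf(2r + 1)
# computed above with scipy.stats.chi2.
from scipy.stats import chi2

def mttf_estimate(T_acc, r, conf=0.95):
    """m = 2*T_acc / chi2_conf(2r + 1), degrees of freedom chosen as in the text."""
    v = 2 * r + 1
    return 2.0 * T_acc / chi2.ppf(conf, v)

# a) incorrectly manufactured items: T* = 230 995 532 h, r = 25 failures
print(mttf_estimate(230_995_532, 25))   # approx. 6.73e6 h
# b) correctly manufactured items:   T* = 56 864 717 h,  r = 1 failure
print(mttf_estimate(56_864_717, 1))     # approx. 1.46e7 h
```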
The calculation of f according to formula (1), with T1* = T*_{IC}, r1 = 25 for the incorrectly manufactured items and T2* = T*_{C}, r2 = 1 for the correctly manufactured items:

f = \frac{r_2}{r_1 + 1} \cdot \frac{T_1^*}{T_2^*} = \frac{1}{25 + 1} \cdot \frac{230\,995\,532}{56\,864\,717} = 0{,}156

Next, the calculation of fc according to formula (2):

f_c = F_{0{,}95}(\nu_1, \nu_2) = 19{,}476

where ν1 = 2(r1 + 1) = 2(25 + 1) = 52; ν2 = 2r2 = 2.
The calculation introduced above shows that f < fc. Based on the F-distribution approach we can therefore state that the failure rates of the basic sets are equal, λ1 = λ2, i.e. the difference between the two constant failure rates is not significant.
3 WEIBULL REGRESSION ANALYSIS APPROACH UTILISATION
The following approach is based on a Weibull regression model in which the scale parameter is modelled as a two-parameter function of a covariate.
We have to consider a random sample (X_i, d_i, z_i), i = 1, ..., n, where X_i is a time to failure or a time of censoring; d_i is a censoring indicator (d_i = 1 if X_i is a time to failure, d_i = 0 if X_i is a time of censoring); z_i is a variable (so-called covariate) having the values:

z_i = 0, if X_i is the time for an item of the F type;
z_i = 1, if X_i is the time for an item of the C type.
The objective of the analysis is to determine whether the difference in the technical life of the two types of items is statistically significant.
The answer might be based on the Weibull regression model in which the scale parameter is modelled by the two-parameter function of the covariate z, λ(z, β) = exp(β0 + β1·z). Let us assume that f is the Weibull probability density function

f(t) = \begin{cases} \lambda \alpha t^{\alpha-1} \exp(-\lambda t^{\alpha}) & \text{for } t > 0 \\ 0 & \text{everywhere else} \end{cases}    (3)

where λ > 0, α > 0 are the Weibull distribution parameters (λ the scale parameter, α the shape parameter).
The Weibull reliability function is defined as follows:

R(t) = \begin{cases} \exp(-\lambda t^{\alpha}) & \text{for } t \geq 0 \\ 1 & \text{everywhere else} \end{cases}    (4)
We assume that in our application the parameter λ is a function of the covariate, λ(z, β) = exp(β0 + β1·z), where β = (β0, β1) is a vector of unknown parameters and z is a variable (so-called covariate) taking two values in our case:

z = 0 for the first sample (items of the F type);
z = 1 for the second sample (items of the C type).
The probability density function can then be stated in the following form:

f(t; \beta, z) = \begin{cases} \exp(\beta_0 + \beta_1 z)\, \alpha\, t^{\alpha-1} \exp\!\big(-\exp(\beta_0 + \beta_1 z)\, t^{\alpha}\big) & \text{for } t > 0 \\ 0 & \text{everywhere else} \end{cases}    (5)
The reliability function can be expressed in the following way:

R(t; \beta, z) = \begin{cases} \exp\!\big(-\exp(\beta_0 + \beta_1 z)\, t^{\alpha}\big) & \text{for } t \geq 0 \\ 1 & \text{everywhere else} \end{cases}    (6)
We use the maximum likelihood method for the estimation of the unknown parameters α, β in this regression model. The likelihood function is defined in the following form:

L(y, \alpha, \beta) = \prod_{i=1}^{n} \big(f(X_i; \beta, z_i)\big)^{d_i} \big(R(X_i; \beta, z_i)\big)^{1-d_i}
After taking the logarithm and using expressions (5) and (6), the function can be written as:

l(y, \alpha, \beta) = d \ln\alpha + \sum_{i=1}^{n} d_i \big((\alpha - 1) y_i + \beta_0 + \beta_1 z_i\big) - \sum_{i=1}^{n} \exp(\alpha y_i + \beta_0 + \beta_1 z_i)

where y_i = \ln X_i and d = \sum_{i=1}^{n} d_i.
To find the maximum likelihood estimates of the parameters α, β we have to solve the system of equations obtained by setting the partial derivatives of the log-likelihood to zero. The system has the following form:

\frac{\partial l(y,\alpha,\beta)}{\partial \alpha} = \frac{d}{\alpha} + \sum_{i=1}^{n} d_i y_i - \sum_{i=1}^{n} y_i \exp(\alpha y_i + \beta_0 + \beta_1 z_i) = 0

\frac{\partial l(y,\alpha,\beta)}{\partial \beta_0} = d - \sum_{i=1}^{n} \exp(\alpha y_i + \beta_0 + \beta_1 z_i) = 0

\frac{\partial l(y,\alpha,\beta)}{\partial \beta_1} = \sum_{i=1}^{n} d_i z_i - \sum_{i=1}^{n} z_i \exp(\alpha y_i + \beta_0 + \beta_1 z_i) = 0
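As an illustration of the numerical solution, the following Python sketch maximizes the log-likelihood l(y, α, β) given above with a general-purpose optimizer instead of solving the score equations directly. The data are synthetic (the field data are not public), so the resulting estimates will not reproduce (7); all names and generating parameters are illustrative assumptions.

```python
# Sketch: maximum likelihood fit of the Weibull regression model with scale
# lambda(z, beta) = exp(beta0 + beta1*z), maximizing l(y, alpha, beta) numerically.
# Synthetic data only; the estimates will not reproduce (7).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

# synthetic censored sample: times X_i, censoring indicators d_i, covariate z_i
n = 200
z = rng.integers(0, 2, size=n)                    # 0 = F-type item, 1 = C-type item
alpha_true, b0_true, b1_true = 1.3, -12.0, -1.5   # illustrative generating values
lam = np.exp(b0_true + b1_true * z)
X = (rng.exponential(1.0, size=n) / lam) ** (1.0 / alpha_true)  # Weibull failure times
C = rng.exponential(X.mean(), size=n)             # random censoring times
d = (X <= C).astype(float)                        # 1 = failure observed, 0 = censored
y = np.log(np.minimum(X, C))                      # y_i = ln of the observed time

def neg_log_lik(theta):
    alpha, b0, b1 = theta
    if alpha <= 0:
        return np.inf
    eta = alpha * y + b0 + b1 * z
    # l = d*ln(alpha) + sum d_i*((alpha-1)*y_i + b0 + b1*z_i) - sum exp(eta_i)
    return -(d.sum() * np.log(alpha)
             + np.sum(d * ((alpha - 1.0) * y + b0 + b1 * z))
             - np.sum(np.exp(eta)))

res = minimize(neg_log_lik, x0=np.array([1.0, -10.0, 0.0]),
               method="Nelder-Mead", options={"maxiter": 5000})
alpha_hat, b0_hat, b1_hat = res.x
print(alpha_hat, b0_hat, b1_hat)
```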
Applying a numerical solution of the above equations, we get the following estimates of the parameters:

\hat\alpha = 1{,}2801; \quad \hat\beta_0 = -17{,}8800; \quad \hat\beta_1 = -1{,}9785    (7)
Consequently, by taking the second partial derivatives of the logarithm of the likelihood function, we get the so-called information matrix. The second derivatives are:

\frac{\partial^2 l(y,\alpha,\beta)}{\partial \alpha^2} = -\frac{d}{\alpha^2} - \sum_{i=1}^{n} y_i^2 \exp(\alpha y_i + \beta_0 + \beta_1 z_i)

\frac{\partial^2 l(y,\alpha,\beta)}{\partial \beta_0^2} = -\sum_{i=1}^{n} \exp(\alpha y_i + \beta_0 + \beta_1 z_i)

\frac{\partial^2 l(y,\alpha,\beta)}{\partial \beta_1^2} = -\sum_{i=1}^{n} z_i^2 \exp(\alpha y_i + \beta_0 + \beta_1 z_i)

\frac{\partial^2 l(y,\alpha,\beta)}{\partial \alpha\, \partial \beta_0} = -\sum_{i=1}^{n} y_i \exp(\alpha y_i + \beta_0 + \beta_1 z_i)

\frac{\partial^2 l(y,\alpha,\beta)}{\partial \alpha\, \partial \beta_1} = -\sum_{i=1}^{n} y_i z_i \exp(\alpha y_i + \beta_0 + \beta_1 z_i)

\frac{\partial^2 l(y,\alpha,\beta)}{\partial \beta_0\, \partial \beta_1} = -\sum_{i=1}^{n} z_i \exp(\alpha y_i + \beta_0 + \beta_1 z_i)
And finally we can also determine the standard deviations of the estimates (7). We get:

\sigma(\hat\alpha) = 0{,}2369; \quad \sigma(\hat\beta_0) = 1{,}5710; \quad \sigma(\hat\beta_1) = 1{,}0075    (8)
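The standard deviations in (8) follow from the inverse of the observed information matrix, i.e. the negative of the matrix of second derivatives above evaluated at the estimates. A sketch of this step is given below; it is meant to be used with the arrays and estimates from the previous sketch (synthetic data), so it will not reproduce the values in (8).

```python
# Sketch: standard deviations of the estimates from the observed information
# matrix (the negative of the Hessian of the log-likelihood), assembled from
# the analytic second derivatives listed above.
import numpy as np

def std_errors(y, d, z, alpha, b0, b1):
    w = np.exp(alpha * y + b0 + b1 * z)   # exp(alpha*y_i + beta0 + beta1*z_i)
    hess = np.array([
        [-d.sum() / alpha**2 - np.sum(y**2 * w), -np.sum(y * w), -np.sum(y * z * w)],
        [-np.sum(y * w),                         -np.sum(w),     -np.sum(z * w)],
        [-np.sum(y * z * w),                     -np.sum(z * w), -np.sum(z**2 * w)],
    ])
    info = -hess                          # observed information matrix
    cov = np.linalg.inv(info)             # asymptotic covariance matrix
    return np.sqrt(np.diag(cov))          # sigma(alpha), sigma(beta0), sigma(beta1)

# example call, continuing the previous (synthetic-data) sketch:
# print(std_errors(y, d, z, alpha_hat, b0_hat, b1_hat))
```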
It should now be noted that the claim of non-existence of a lifetime difference between the two item types (item F, item C) can be converted into the hypothesis test:

H_0: \beta_1 = 0 \quad \text{vs.} \quad H_1: \beta_1 \neq 0

(the null hypothesis corresponds to there being no difference between the item types).
If we use a statistical test based on the Neyman-Pearson lemma, the test statistic has the value 3,856 (Wald's test might also be used as an alternative; it leads to the same conclusion). Such a test is a classical statistical test based on the likelihood ratio - the likelihood ratio test. Its test statistic has the form

LR(\beta_1) = 2\big[l(y; \hat\beta_1) - l(y; 0)\big]

and is asymptotically χ²₁-distributed. The value l(y; ...) is the value of the logarithm of the likelihood function. Wald's test is in our case based on the test statistic (\hat\beta_1)^2 / \operatorname{var}(\hat\beta_1), which also has an asymptotic χ²₁ distribution.
The result obtained via the Neyman-Pearson (likelihood ratio) approach, evaluated at the significance level 0,05, leads to the rejection of the H0 hypothesis. Therefore we can claim that the statistical difference between the two item types is significant.
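The Wald variant of the test can be recomputed directly from the reported values (7) and (8); the short sketch below does only that and compares the statistic with the χ²₁ critical value at the 0,05 level.

```python
# Sketch: Wald test of H0: beta1 = 0 using only the reported values (7) and (8).
from scipy.stats import chi2

beta1_hat, sigma_beta1 = -1.9785, 1.0075      # estimate and its standard deviation
wald = (beta1_hat / sigma_beta1) ** 2         # (beta1_hat)^2 / var(beta1_hat)
critical = chi2.ppf(0.95, df=1)               # approx. 3.841
print(wald, critical, wald > critical)        # approx. 3.86 > 3.841 -> reject H0
```

With these values the statistic is about 3,86, which exceeds the critical value of about 3,84, in agreement with the rejection of H0 stated above.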
4 RISK ANALYSIS RESULTING FROM THE FAILURE OCCURRENCE - FUZZY APPROACH
The description of the item behaviour presented above indicates some possible situations. Such item behaviour might cause a failure occurrence with all possible consequences. We need to assess both the potential of such a situation occurring and its consequences. Risk assessment is one of the suitable tools which might be used for this purpose.
In this phase of observing and assessing the objects we are talking about the possibilities of risk characteristics assessment. The item failure probability can be stated using the approaches above. Then we need to assess the consequences of the failure occurrence, which is the next component of risk (as stated in its usual form). The detection possibility should also be stated, but for this it is recommended to use the approaches mentioned in standards (e.g. IEC 60605-4). The total risk number might be calculated either by common approaches or by another, non-traditional - soft - method. One such method is fuzzy logic.
Let us assume that any technical object at any instant of time can occur in any operational state (operational condition, failure state or partial failure state - functionality is limited, but not lost). A transfer between these states is subject to stochastic laws. A suitable means to depict transfers between individual operational states is the theory of Markov processes. However, we shall not deal with a description of transfers between individual operational states here. Greater attention will be paid to the mathematical modelling of effects related to a transfer between individual states. As transfers between states are connected with a number of effects, it is very important to deal with them in more detail. The most important, and from the viewpoint of the function of the object also the most critical, is a transfer from an operational state into a fault state (using the hardware approach). This transfer can result in the worst effects. However, it will depend on the mechanism of the transfer. If a transfer is caused by a scheduled downtime of the object because of preventive maintenance, it is an unpleasant matter, but better than if, for example, the transfer is caused by an unexpected failure with devastating results.
To evaluate the severity of effects of failures of technical objects, we decided to use fuzzy set theory. This theory works with the vague terms that already appear in the classification of the severity of failure effects, in the decision on the acceptability of a failure and in the determination of the importance of the object on which the failure appeared. At the same time, this theory makes it possible to assign a numerical value to the studied circumstance, and thus we consider it suitable. Through this theory it is also possible to include the severities of failure effects D of single objects into a fuzzy set. Here, we shall assume that the single fuzzy sub-sets consist of coefficients of failure effect severity. Based on the seriousness of these effects it will later be determined to what level the given groups are indispensable. To classify the failure effect criticality in relation to the inherent availability of the technical object we have selected the following three criteria of influence on:
Function - D1, Safety - D2, Recovery-related costs - D3.
For each of these criteria we created an ascending scale of coefficients to enable the assessment of the seriousness of the possible effects of a failure related to the individual criteria. The scale is determined by a set I with four elements, I ∈ {1; 2; 3; 4}, while the value of the coefficient of an individual effect of a failure in relation to the selected criteria is denoted D_i, where i ∈ {1, 2, 3}. The principle is that with an increasing value of the coefficient the severity of the effect also increases. These values serve as the basis to express the severity of the failure effect D. The scales of the severity criteria are in tables 1-3. For more details see for example Valis & Vintr (2006), Novak (1999) or Driankov & Hellendoorn & Reinfrank (1993).
Tab. 1 - Categorization of failures from the viewpoint of effects on the system functionality

Definition | Coefficient of significance D1
Even after a failure, the system is capable of fulfilling all required functions. | 1
A failure partially limits the ability of the system to perform a required function, but the crew can cope with the impacts. | 2
A failure significantly limits the ability of the system to perform some of the required functions and the crew is not capable of coping with the impacts of the failure on its own. | 3
A failure prevents the system from fulfilling the required functions. | 4
Tab. 2 - Categorization of failures from the viewpoint of safety of the system

Definition | Coefficient of significance D2
A failure has no effect on the safety of the system, crew and environment. | 1
A failure results in a lowering of the safety of the system, crew and environment. | 2
A failure causes a situation in which the system is dangerous for the system, crew and environment. | 3
A failure results in a direct threat to the health and lives of people or great losses of property. | 4
Tab. 3 - Categorization of failures from the viewpoint of repair costs

Definition | Coefficient of significance D3
Removal of the failure effects does not require costs higher than 0.1 % of the system purchase costs. | 1
Removal of the failure effects does not require costs higher than 1 % of the system purchase costs. | 2
Removal of the failure effects does not require costs higher than 10 % of the system purchase costs. | 3
Removal of the failure effects requires costs higher than 10 % of the system purchase costs. | 4
The resulting coefficient D is at the same time the coefficient of seriousness of a given object and is expressed by the relation:

D = D_1 \cdot D_2 \cdot D_3; \quad D_{min} = 1, \ D_{max} = 64.    (9)
To construct a fuzzy sub-set, a "fuzzification of values" is used. Actual observed values of physical quantities are bounded and are expressed by means of real numbers. Therefore, as the universe of the fuzzy numbers that represent the vague concepts related to the classification of failure effects, a suitable closed interval is sufficient for each of them. We reach the single classes of failure effects (seriousness) by dividing the resulting coefficient D into suitable sub-intervals (see above). For practical use and graphical representation a trapezoidal fuzzy number is suitable, see Figure 1, where μ expresses the membership (applicability) function and x the obtained fuzzy number.
To determine the actual membership function for a fuzzified value, it is enough to identify in what interval this value usually occurs. This interval is then the core of the resulting fuzzy number and we denote it (b; c). In the demonstrated example, this core is always expressed by the limit values of the individual coefficients of significance of failures. Further, it is determined what values the variable certainly does not assume. We assume this set of values to be expressed as (-∞; a) ∪ (d; ∞), while a < b < c < d. Then the interval (a; d) is the support set "A" of the resulting fuzzy number. The membership function of the resulting fuzzy number with support "A" is expressed as follows:

\mu_A(x) = \max\left(\min\left(\frac{x-a}{b-a},\ 1,\ \frac{x-d}{c-d}\right),\ 0\right)    (10)
For the further procedure, it is necessary to determine the individual fuzzy sets and, based on them, perform the final categorization of the failure effects. For this purpose, the four-level categorization of failure effects recommended in many international standards is used:

Minor: assigned fuzzy set (1; 4);
Major: assigned fuzzy set (6; 16);
Critical: assigned fuzzy set (18; 36);
Catastrophic: assigned fuzzy set (48; 64).
Figure 2 graphically represents the membership of the severities of effects of individual failures in the fuzzy sets.
Figure 2. Graphical model of fuzzy sets for evaluation of severity of failure effects
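To illustrate how relation (9), formula (10) and the four severity classes work together, the Python sketch below evaluates the trapezoidal membership of a computed coefficient D in each class. The intervals listed above are taken as the cores (b; c); the support bounds a and d are not given in the text, so the values used here are purely illustrative assumptions.

```python
# Sketch: trapezoidal membership function per formula (10) and classification of
# the severity coefficient D = D1*D2*D3 from relation (9) into the four classes.
# Cores (b, c) follow the fuzzy sets given above; supports (a, d) are assumed.
def mu_trapezoid(x, a, b, c, d):
    """Membership per (10): max(min((x-a)/(b-a), 1, (x-d)/(c-d)), 0)."""
    return max(min((x - a) / (b - a), 1.0, (x - d) / (c - d)), 0.0)

CLASSES = {                     # (a, b, c, d): cores from the text, supports assumed
    "Minor":        (0, 1, 4, 6),
    "Major":        (4, 6, 16, 18),
    "Critical":     (16, 18, 36, 48),
    "Catastrophic": (36, 48, 64, 65),
}

def classify(D1, D2, D3):
    D = D1 * D2 * D3            # resulting severity coefficient per (9)
    return D, {name: mu_trapezoid(D, *abcd) for name, abcd in CLASSES.items()}

# example: a failure scored D1 = 2, D2 = 3, D3 = 2 gives D = 12 (fully "Major")
print(classify(2, 3, 2))
```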
A failure occurrence might have various consequences; in our case we are speaking about the impacts of the consequences for a highly reliable electronic item implemented inside a complex system. Therefore a precise and adequate failure profile has to be determined in the risk analysis. The procedures described above might serve to express the total risk number in the well-known form:

R = P \times C    (11)

where P is the probability value and C is the value of the consequences.

An additional index for detection might also be applied, but we would recommend following standards such as IEC 60605-4 to handle this characteristic in risk assessment procedures.
5 CONCLUSION
The procedure described above was used to calculate and compare the reliability measures - in this case the failure rates of the single sets of correctly and incorrectly manufactured electronic items. Following the obtained results, the possible effect of a manufacturing error upon the items' reliability was estimated. As we can see from the results, although the data sets are different - they contain different amounts of information - we need to compare them.
Consequently we need to state whether the results, in the form of the failure rates, are comparable and statistically the same. Such claims can prove the dependability of the product and, finally, save the good name of the company producing valuable goods. This fact should be referred to when carrying out statistical data evaluation using the introduced tools.
The above-mentioned ad-hoc procedure was designed as a tool to assess the effects of failure occurrence of vetronics elements on the total system's dependability. This method assumes that an assessment of the effects of failures of individual subsystems will be done in the described way, at first without the vetronics components and then with the vetronics components. Based on a comparison of the results of both analyses it can be assessed whether, and to what extent, vetronics can influence the dependability of the system. Finally, this method also makes it possible to assess and state the importance level of individual components and subsystems from the viewpoint of the capability of the system to perform the required functions. It also makes it possible to involve other criteria of evaluation such as, for example, security robustness or corrective maintenance costs.
6 ACKNOWLEDGEMENTS
This paper was supported by the GA Czech Republic project number 101/08/P020 „Contribution to Risk Analysis of Technical Sets and Equipment", and by the Ministry of Education, Czech Republic project number 1M06047 „The Centre for Production Quality and Dependability".
REFERENCES
IEC 60050-191:1990. International Electrotechnical Vocabulary - Part 191: Dependability and quality of service.
IEC 60605-4. Equipment reliability testing - Part 4: Statistical procedures for exponential distribution - Point estimates, confidence intervals, prediction intervals and tolerance intervals.
EN 60812:2006. Analysis techniques for system reliability - Procedure for failure mode and effects analysis (FMEA).
IEC 61650. Reliability data analysis techniques - Procedures for comparison of two constant failure rates and two constant failure (event) intensities.
Lipson, Ch. & Sheth, N. J. 1973. Statistical Design and Analysis of Engineering Experiments. New York: McGraw-Hill.
Holub, R. 1992. Dependability tests (stochastic methods). Brno: Military Academy.
Novak, V. 1999. Mathematical principles of fuzzy logic. Boston: Kluwer.
Finn, J. 1998. Electronic component reliability. Chichester: John Wiley & Sons Ltd.
Driankov, D., Hellendoorn, H. & Reinfrank, M. 1993. An introduction to fuzzy control. Berlin: Springer-Verlag.
Valis, D. & Vintr, Z. 2006. Dependability of Mechatronics Systems in Military Vehicle Design. In: Proceedings of the European Safety and Reliability Conference "ESREL 2006", London/Leiden/New York/Philadelphia/Singapore: Taylor & Francis Group, pp. 1703-1707.