Reliability analysis of a multi state system with common cause failures using Markov Regenerative Process
Vidhya G Nair, M. Manoharan
Department of Statistics, University of Calicut, Kerala - 673 635 (India) [email protected], [email protected]
Abstract
In this paper, the dynamic reliability behaviour of a multi state system subject to common cause failures is studied, and a state space model is formulated for evaluating the performance measures of the system. The concept of renewal is employed, and the Markov Regenerative Process is used to assess the availability of the system. Using the proposed technique, we obtain the transition kernels and formulas for the steady state probabilities of the system. A numerical example is presented to demonstrate the practical applicability of the proposed technique.
Keywords: Multi state system, Common cause failures, Markov Regenerative Process, Availability
I Introduction
The failure of multiple components of a system due to a shared cause is called a common cause failure (CCF). CCF is one of the most important issues in the evaluation of system reliability. Compared with random failures, which affect individual components, CCFs are expected to occur relatively infrequently. According to Rausand and Hoyland [11], a common cause failure is a dependent failure in which two or more component fault states exist simultaneously, or within a short time interval, and are a direct result of a shared cause. The beta ($\beta$) factor model is the most commonly used model for common cause failures of multi state systems [3]. The $\beta$ factor model describes the correlation between independent random component failures and common cause failures in a redundant multi state system. A powerful set of techniques for the solution of non-Markovian models is based on ideas grouped under Markov renewal theory. The application of Markov renewal theory to finding the reliability and availability of stochastic systems is discussed in [6]. The semi-Markov process (SMP) is the most widely adopted non-Markovian model for evaluating the reliability and availability of multi state systems. A good reference on the SMP is [8], which presents the theory clearly and gives examples that help in understanding the theory and in applying the model to many real life situations. The stationary character of the Markov regenerative process (MRGP) has been studied in [10]. Most of the theoretical foundations of the MRGP were discussed in [2], in which it is named the semi-regenerative process. One of the first papers to consider semi-regenerative processes is in Russian (see [13]). For a concise review of semi-regenerative and decomposable semi-regenerative processes and their applications, one may refer to [12].
The transient and steady state analysis of stochastic Petri nets is discussed analytically and numerically in [1]. MRGPs have been used to evaluate the reliability and availability of systems. Some examples concerning the reliability and availability of power plants and fault tree systems can be found in [4,5,9,16]. Many other examples and applications of MRGPs in the dependability context have been solved using the SHARPE software [14], as demonstrated in [17]. Semi-Markov models, Markov regenerative models and phase type expansions, with a number of solved examples, are discussed in [15]. The system-level reliability of a heterogeneous double redundant renewable system under the Marshall-Olkin failure model, in the case where the repair times of its components have a general continuous distribution, is studied in [7]. The mathematical model proposed therein yields an explicit expression, in terms of the Laplace transform, for the system reliability function.
II Markov Regenerative Process
Consider a stochastic process $\{Z(t), t \ge 0\}$ with state space $H$. Suppose that every time a certain phenomenon occurs, the future of the process $Z$ after that time becomes a probabilistic replica of the future after time zero. Such a time, which is usually random, is called a regeneration time of $Z$, and a process with this property is called a regenerative process. In a Markov Regenerative Process (MRGP), the stochastic evolution between two successive regeneration points depends only on the state at regeneration, not on the evolution before regeneration.
Following [1], a stochastic process $\{Z(t), t \ge 0\}$ on $H$ is called an MRGP if there exists a Markov renewal sequence $\{(Y_n, S_n), n \ge 0\}$ of random variables such that all conditional finite dimensional distributions of $\{Z(S_n + t), t \ge 0\}$ given $\{Z(u), 0 \le u \le S_n, Y_n = i\}$, $i \in H$, are the same as those of $\{Z(t), t \ge 0\}$ given $Y_0 = i$.
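From the above definition we obtain an embedded Markov chain (EMC) in $\{Z(t), t \ge 0\}$. The global kernel $K(t)$ describes the evolution of the process from one Markov regenerative moment to the next, without describing what happens between regenerative moments:

$$K(t) = [K_{ij}(t)], \qquad K_{ij}(t) = \Pr\{Y_1 = j, S_1 \le t \mid Y_0 = i\}, \quad \forall i, j \in H.$$

An MRGP can change states between two consecutive Markov renewal moments. The local kernel $E(t)$ gives the state probabilities of the process during the interval between successive Markov regenerative moments:

$$E(t) = [E_{ij}(t)], \qquad E_{ij}(t) = \Pr\{Z(t) = j, S_1 > t \mid Y_0 = i\}, \quad \forall i, j \in H.$$

The matrix of conditional transition probabilities is given by

$$V_{ij}(t) = \Pr\{Z(t) = j \mid Z_0 = i\}, \quad \forall i, j \in H.$$

In many real life problems involving Markov renewal processes, our primary aim is to compute $V_{ij}(t)$ effectively, and hence several performance measures of interest, such as availability and reliability, that are based on $V_{ij}(t)$.
The conditional transition probabilities $V_{ij}(t)$ at any instant $t$ can be computed as

$$V_{ij}(t) = \Pr\{Z(t) = j, S_1 > t \mid Z_0 = i\} + \sum_{k \in \Omega'} \int_0^t dK_{ik}(u)\, V_{kj}(t - u), \quad \forall i, j \in H.$$

This set of integral equations defines a Markov renewal equation, which can be expressed in matrix form as

$$V(t) = E(t) + K(t) * V(t),$$

where $*$ denotes the convolution integral appearing in the previous equation.
The Laplace-Stieltjes transforms $\tilde K(s)$ and $\tilde E(s)$ of $K(t)$ and $E(t)$, respectively, can be obtained as

$$\tilde K(s) = \int_0^\infty e^{-st}\, dK(t), \qquad \tilde E(s) = \int_0^\infty e^{-st}\, dE(t).$$

Then

$$\tilde V(s) = \tilde E(s) + \tilde K(s)\tilde V(s), \quad \text{so that} \quad \tilde V(s) = [I - \tilde K(s)]^{-1}\tilde E(s).$$

$V(t)$ can be obtained by taking the inverse Laplace transform of $\tilde V(s)$, and the state probabilities at time $t$ follow as

$$P(t)_{1 \times \Omega} = P(0)_{1 \times \Omega} \times V(t)_{\Omega \times \Omega},$$

where the subscripts indicate the dimensions of the row vectors $P(t)$, $P(0)$ and of the matrix $V(t)$.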
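The transform-domain relation $\tilde V(s) = [I - \tilde K(s)]^{-1}\tilde E(s)$ can be checked on a case where the answer is known in closed form. The following is a minimal sketch, not part of the model studied in this paper: it assumes a hypothetical two-state continuous time Markov chain (rates `a` and `b` are illustrative), treats every jump as a regeneration moment, and compares the computed entry with the Laplace-Stieltjes transform of the well-known two-state transition probability.

```python
"""Markov renewal equation in the transform domain for a toy 2-state CTMC."""


def lst_kernels(s, a, b):
    """Return (K, E), the Laplace-Stieltjes transforms of the kernels at s > 0.

    Hypothetical toy model: state 0 -> 1 at rate a, state 1 -> 0 at rate b,
    with every jump treated as a regeneration moment.
    """
    K = [[0.0, a / (s + a)],
         [b / (s + b), 0.0]]
    E = [[s / (s + a), 0.0],
         [0.0, s / (s + b)]]
    return K, E


def solve_v(s, a, b):
    """Compute V(s) = [I - K(s)]^{-1} E(s) for the 2x2 case."""
    K, E = lst_kernels(s, a, b)
    # M = I - K(s), inverted explicitly because the matrix is only 2x2.
    m00, m01 = 1.0 - K[0][0], -K[0][1]
    m10, m11 = -K[1][0], 1.0 - K[1][1]
    det = m00 * m11 - m01 * m10
    inv = [[m11 / det, -m01 / det],
           [-m10 / det, m00 / det]]
    return [[sum(inv[i][k] * E[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def v00_exact(s, a, b):
    """LST of the known result P00(t) = b/(a+b) + a/(a+b) * exp(-(a+b) t)."""
    return (s + b) / (s + a + b)


if __name__ == "__main__":
    s, a, b = 0.7, 0.2, 1.5
    print(solve_v(s, a, b)[0][0], v00_exact(s, a, b))
```

Each row of the computed matrix sums to one for every `s`, because the Laplace-Stieltjes transform of a probability distribution over the states is the transform of the constant 1.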
Vidhya G Nair, M. Manoharan RT&A, No 3 (50) RELIABILITY ANALYSIS OF A MULTI STATE SYSTEM_Volume 13, September 2018
For the steady state analysis of an MRGP, the two matrices $\alpha = [\alpha_{ij}]$ and $\Phi = [\varphi_{ij}]$ should be calculated. Here $\alpha_{ij}$ is the mean time the process spends in state $j$ between two successive regeneration moments, given that it started from state $i$, and $\Phi = [\varphi_{ij}]$ is the one step transition probability matrix of the embedded Markov chain. The two matrices are defined as

$$\alpha = \int_0^\infty E(t)\, dt = \lim_{s \to 0} \frac{1}{s}\tilde E(s) \qquad (1)$$

$$\Phi = \lim_{t \to \infty} K(t) = \lim_{s \to 0} \tilde K(s) \qquad (2)$$
To obtain the steady state probabilities of the MRGP, we first obtain the steady state probabilities of the embedded discrete time Markov chain by solving

$$v = v\Phi, \qquad ve = 1,$$

where $e$ is a column vector with all elements equal to 1 and $v = [v_1, v_2, \ldots, v_k]$, $k \in \Omega$, is the row vector of steady state probabilities of the EMC. The steady state probability vector $\pi = [\pi_1, \pi_2, \ldots, \pi_k]$ of the MRGP is then given by

$$\pi = \frac{v\alpha}{v\alpha e} \qquad (3)$$
Steady state Availability of system
Let $\Omega = \{0, 1, \ldots, k\}$ be the set of all possible states of a system. Let $\Omega'$ denote the subset of states in which the system is functioning and let $F = \Omega - \Omega'$ denote the states in which the system has failed. The long term availability of the system is the mean proportion of time during which the system is functioning. The steady state system availability is obtained as

$$A_\infty = \sum_{j \in \Omega'} \pi_j \qquad (4)$$
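The steady state procedure of equations (3) and (4) can be sketched in a few lines. The code below is an illustrative sketch only: the function names are ours, the linear system for the EMC is solved by plain Gaussian elimination, and the matrices `phi` and `alpha` in the usage lines are hypothetical toy values (two regeneration states, three states in total), not the matrices of the system analysed later.

```python
"""Steady state of an MRGP from Phi and alpha, following equations (3)-(4)."""


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def emc_stationary(phi):
    """Solve v = v Phi together with v e = 1 for the embedded Markov chain."""
    n = len(phi)
    # Rows of (Phi^T - I) v = 0; the last equation is replaced by sum(v) = 1.
    a = [[phi[j][i] - (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    a[-1] = [1.0] * n
    b = [0.0] * (n - 1) + [1.0]
    return solve_linear(a, b)


def mrgp_steady_state(phi, alpha):
    """Equation (3): pi = v alpha / (v alpha e)."""
    v = emc_stationary(phi)
    va = [sum(v[i] * alpha[i][j] for i in range(len(v)))
          for j in range(len(alpha[0]))]
    total = sum(va)
    return [x / total for x in va]


def availability(pi, up_states):
    """Equation (4): sum of pi_j over the functioning states (0-based)."""
    return sum(pi[j] for j in up_states)


if __name__ == "__main__":
    phi = [[0.0, 1.0], [1.0, 0.0]]                # hypothetical EMC
    alpha = [[10.0, 0.0, 0.0], [0.0, 1.5, 0.5]]   # hypothetical mean sojourn times
    pi = mrgp_steady_state(phi, alpha)
    print(pi, availability(pi, [0, 1]))
```

Note that `alpha` may have more columns than rows, since the process can visit states that are not regeneration states between two renewal moments.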
III Parallel System with Single Repair Facility and CCF
Consider a system consisting of two components, A and B. A single repairman is assigned to the system, with a First Come First Served (FCFS) scheduling policy for repair. When component A or B fails, the repairman begins the repair if he is not busy. When one component is already under repair and the other component fails, the second component has to wait for repair until the repairman is free. The lifetimes of components A and B are exponentially distributed with rates $\lambda_A$ and $\lambda_B$, respectively. The distribution functions of the repair times of components A and B are $G_A(t)$ and $G_B(t)$, respectively. Let $\mu_A(t)$ and $\mu_B(t)$ be the respective repair rates of components A and B. In addition, a common cause failure involving both components A and B can occur with probability $\beta$. Define the stochastic process $Z = \{Z(t); t \ge 0\}$ to represent the system state at any instant $t$, with $Z(t) \in \{1, 2, 3, 4, 5\}$.
The system is in state
1, if both components are working at time t
2, if component A is under repair while component B is working at time t
3, if component B is under repair while component A is working at time t
4, if component A is under repair while component B is waiting for repair at time t, including the case of a common cause failure after which the repairman randomly selects component A to be repaired first
5, if component B is under repair while component A is waiting for repair at time t, including the case of a common cause failure after which the repairman randomly selects component B to be repaired first
We define the Markov renewal moments $S = \{S_n; n \in N\}$ at state transition epochs and the embedded Markov chain $\{Y_n; n \in N\}$ such that $Y_n$ is the state of the system at time $S_n^+$ (i.e., $Y_n = Z(S_n^+)$).
Figure 1: State transition diagram
Analysis of the above state transition diagram shows that $Z$ is an MRGP with an embedded Markov chain (EMC) defined on the states 1, 2 and 3. The transitions to states 4 and 5 do not belong to the EMC, since they do not occur at renewal moments. The system is in state 1 if both A and B are up and the repairman is free. Component A can fail at rate $\lambda_A$, taking the system to state 2. Component A is then repaired, with repair time cdf $G_A(t)$, to bring the system back to state 1. If component B fails during the repair of component A, the system jumps to state 4. Similarly, when component B fails in state 1 the system reaches state 3, and the repair of B, with repair time cdf $G_B(t)$, brings the system back to state 1. If component A fails during the repair of B, the system jumps from state 3 to state 5. To find the distribution of $Z$ for the MRGP we have to construct the kernel matrices, namely the global kernel matrix $K(t)$ and the local kernel matrix $E(t)$. Let $R_A$ and $R_B$ be the times to repair, and $L_A$ and $L_B$ the times to failure, of A and B respectively.
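$$K(t) = \begin{pmatrix} 0 & K_{12}(t) & K_{13}(t) \\ K_{21}(t) & 0 & K_{23}(t) \\ K_{31}(t) & K_{32}(t) & 0 \end{pmatrix}$$

$K_{12}(t)$ = Pr{A fails before B, or a common cause failure occurs and the repairman chooses to repair A first and completes the repair}

$$K_{12}(t) = \Pr\{Z(S_1) = 2, S_1 \le t \mid Z_0 = 1\} = \Pr\{(L_A \le t \cap L_B > L_A) \cup (R_A \le t \cap (L_A = L_B) < R_A)\}$$

$$= (1-\beta)\lambda_A \int_0^t e^{-(\lambda_A+\lambda_B)u}\, du + \frac{\beta}{2}(\lambda_A+\lambda_B) \int_0^t e^{-(\lambda_A+\lambda_B)u}\, G_A(t-u)\, du$$

$$K_{13}(t) = (1-\beta)\lambda_B \int_0^t e^{-(\lambda_A+\lambda_B)u}\, du + \frac{\beta}{2}(\lambda_A+\lambda_B) \int_0^t e^{-(\lambda_A+\lambda_B)u}\, G_B(t-u)\, du$$

$K_{21}(t)$ = Pr{the repair of A is finished by time t and B has not failed during the repair of A}

$$K_{21}(t) = \Pr\{Z(S_1) = 1, S_1 \le t \mid Z_0 = 2\} = \Pr\{R_A \le t \cap L_B > R_A\} = \int_0^t e^{-\lambda_B u}\, dG_A(u)$$

$K_{23}(t)$ = Pr{the repair of A is finished by time t and B has failed during the repair of A}

$$K_{23}(t) = \Pr\{Z(S_1) = 3, S_1 \le t \mid Z_0 = 2\} = \int_0^t (1 - e^{-\lambda_B u})\, dG_A(u)$$

$$K_{31}(t) = \Pr\{Z(S_1) = 1, S_1 \le t \mid Z_0 = 3\} = \int_0^t e^{-\lambda_A u}\, dG_B(u)$$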
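$$K_{32}(t) = \Pr\{Z(S_1) = 2, S_1 \le t \mid Z_0 = 3\} = \int_0^t (1 - e^{-\lambda_A u})\, dG_B(u)$$

$$E(t) = \begin{pmatrix} E_{11}(t) & 0 & 0 & E_{14}(t) & E_{15}(t) \\ 0 & E_{22}(t) & 0 & E_{24}(t) & 0 \\ 0 & 0 & E_{33}(t) & 0 & E_{35}(t) \end{pmatrix}$$

$E_{11}(t)$ = Pr{remaining in state 1 until time t}

$$E_{11}(t) = \Pr\{Z(t) = 1, S_1 > t \mid Z_0 = 1\} = (1-\beta)e^{-(\lambda_A+\lambda_B)t}$$

$E_{22}(t)$ = Pr{the repair of A is not finished up to time t and B has not failed}

$$E_{22}(t) = \Pr\{Z(t) = 2, S_1 > t \mid Z_0 = 2\} = (1 - G_A(t))e^{-\lambda_B t}$$

$$E_{33}(t) = (1 - G_B(t))e^{-\lambda_A t}$$

$$E_{14}(t) = \frac{\beta}{2}e^{-(\lambda_A+\lambda_B)t}, \qquad E_{15}(t) = \frac{\beta}{2}e^{-(\lambda_A+\lambda_B)t}$$

$E_{24}(t)$ = Pr{the repair of A is not finished up to time t and B has failed}

$$E_{24}(t) = (1 - G_A(t))(1 - e^{-\lambda_B t})$$

$$E_{35}(t) = (1 - G_B(t))(1 - e^{-\lambda_A t})$$

The Laplace-Stieltjes transform of the global kernel matrix is

$$\tilde K(s) = \begin{pmatrix} 0 & \dfrac{(1-\beta)\lambda_A}{s+\lambda_A+\lambda_B} + \dfrac{\beta(\lambda_A+\lambda_B)\tilde G_A(s)}{2(s+\lambda_A+\lambda_B)} & \dfrac{(1-\beta)\lambda_B}{s+\lambda_A+\lambda_B} + \dfrac{\beta(\lambda_A+\lambda_B)\tilde G_B(s)}{2(s+\lambda_A+\lambda_B)} \\ \tilde G_A(s+\lambda_B) & 0 & \tilde G_A(s) - \tilde G_A(s+\lambda_B) \\ \tilde G_B(s+\lambda_A) & \tilde G_B(s) - \tilde G_B(s+\lambda_A) & 0 \end{pmatrix}$$

The Laplace-Stieltjes transform of the local kernel matrix is

$$\tilde E(s) = \begin{pmatrix} \dfrac{(1-\beta)s}{s+\lambda_A+\lambda_B} & 0 & 0 & \dfrac{\beta s}{2(s+\lambda_A+\lambda_B)} & \dfrac{\beta s}{2(s+\lambda_A+\lambda_B)} \\ 0 & \dfrac{s}{s+\lambda_B}\bigl(1 - \tilde G_A(s+\lambda_B)\bigr) & 0 & \dfrac{\lambda_B}{s+\lambda_B} - \tilde G_A(s) + \dfrac{s}{s+\lambda_B}\tilde G_A(s+\lambda_B) & 0 \\ 0 & 0 & \dfrac{s}{s+\lambda_A}\bigl(1 - \tilde G_B(s+\lambda_A)\bigr) & 0 & \dfrac{\lambda_A}{s+\lambda_A} - \tilde G_B(s) + \dfrac{s}{s+\lambda_A}\tilde G_B(s+\lambda_A) \end{pmatrix}$$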
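The transformed global kernel can be evaluated numerically for any repair time distributions whose Laplace-Stieltjes transforms are available. The sketch below is illustrative: the function names, the 0-based state indexing and the parameter values in the usage lines are our assumptions, and the entries follow the global kernel of this section with $\tilde G_A$ in the entry for state 2 and $\tilde G_B$ in the entry for state 3. Evaluating the matrix at $s = 0$ gives the one step transition matrix $\Phi$ of equation (2), whose rows must sum to one.

```python
"""Transformed global kernel of the two-component system (illustrative)."""
import math


def global_kernel_lst(s, lam_a, lam_b, beta, lst_ga, lst_gb):
    """3x3 matrix on the EMC states 1, 2, 3 (stored at indices 0, 1, 2).

    lst_ga and lst_gb are callables returning the Laplace-Stieltjes
    transforms of the repair time distributions of A and B.
    """
    lam = lam_a + lam_b
    ccf_a = beta * lam * lst_ga(s) / (2.0 * (s + lam))
    ccf_b = beta * lam * lst_gb(s) / (2.0 * (s + lam))
    return [
        [0.0,
         (1.0 - beta) * lam_a / (s + lam) + ccf_a,
         (1.0 - beta) * lam_b / (s + lam) + ccf_b],
        [lst_ga(s + lam_b), 0.0, lst_ga(s) - lst_ga(s + lam_b)],
        [lst_gb(s + lam_a), lst_gb(s) - lst_gb(s + lam_a), 0.0],
    ]


def deterministic_lst(delay):
    """LST of a unit step at `delay`, i.e. exp(-s * delay)."""
    return lambda s: math.exp(-s * delay)


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    phi = global_kernel_lst(0.0, 0.02, 0.05, 0.1,
                            deterministic_lst(3.0), deterministic_lst(4.0))
    for row in phi:
        print(row, sum(row))
```

Passing the transforms as callables keeps the kernel independent of the repair time model, so an exponential or phase type repair distribution can be substituted without changing the matrix code.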
IV Numerical Illustration
Consider a numerical example in which the components have deterministic repair times with distribution functions

$$G_A(t) = u(t - \beta_A),\ \beta_A > 0, \qquad G_B(t) = u(t - \beta_B),\ \beta_B > 0,$$

where $u(t)$ is the unit step function. The units are hours for the repair times (parameters $\beta_A$ and $\beta_B$) and hour$^{-1}$ for the failure rates (parameters $\lambda_A$ and $\lambda_B$). The values of the parameters of the system are given below.
Component   λ      β
A           0.01   5
B           0.01   5
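$$\tilde K(s) = \begin{pmatrix} 0 & \dfrac{(1-\beta)\,0.01}{s+0.02} + \dfrac{\beta\,0.02\,e^{-5s}}{2(s+0.02)} & \dfrac{(1-\beta)\,0.01}{s+0.02} + \dfrac{\beta\,0.02\,e^{-5s}}{2(s+0.02)} \\ e^{-5(s+0.01)} & 0 & e^{-5s} - e^{-5(s+0.01)} \\ e^{-5(s+0.01)} & e^{-5s} - e^{-5(s+0.01)} & 0 \end{pmatrix}$$

$$\tilde E(s) = \begin{pmatrix} \dfrac{(1-\beta)s}{s+0.02} & 0 & 0 & \dfrac{\beta s}{2(s+0.02)} & \dfrac{\beta s}{2(s+0.02)} \\ 0 & \dfrac{s}{s+0.01}\bigl(1 - e^{-5(s+0.01)}\bigr) & 0 & \dfrac{0.01}{s+0.01} - e^{-5s} + \dfrac{s}{s+0.01}e^{-5(s+0.01)} & 0 \\ 0 & 0 & \dfrac{s}{s+0.01}\bigl(1 - e^{-5(s+0.01)}\bigr) & 0 & \dfrac{0.01}{s+0.01} - e^{-5s} + \dfrac{s}{s+0.01}e^{-5(s+0.01)} \end{pmatrix}$$

The steady state probability vector is

$$\pi = [\,0.892074(1-\beta),\ 0.045738,\ 0.045738,\ 0.446037\beta + 0.008225,\ 0.446037\beta + 0.008225\,]$$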
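The embedded chain step for these parameter values can be sketched as follows. This is an illustrative calculation under the assumption that $\Phi$ is obtained by setting $s = 0$ in the transformed global kernel of this example, in which case the first row reduces to $(0, 1/2, 1/2)$ for every value of $\beta$. The stationary vector is approximated here by repeated multiplication (power iteration), a simple substitute for solving the linear system directly; the constant and function names are ours.

```python
"""Embedded Markov chain of the numerical example (illustrative sketch)."""
import math

LAM = 0.01      # failure rate of each component, per hour
REPAIR = 5.0    # deterministic repair time of each component, in hours


def phi_matrix():
    """One step transition matrix of the EMC on states 1, 2, 3."""
    p = math.exp(-LAM * REPAIR)  # repair ends before the other unit fails
    return [[0.0, 0.5, 0.5],
            [p, 0.0, 1.0 - p],
            [p, 1.0 - p, 0.0]]


def stationary(phi, iterations=2000):
    """Approximate the solution of v = v Phi by repeated multiplication."""
    n = len(phi)
    v = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(v[i] * phi[i][j] for i in range(n)) for j in range(n)]
    return v


if __name__ == "__main__":
    print(stationary(phi_matrix()))
```

By symmetry of the two components the second and third entries of the result coincide, and the first entry equals $p/(1+p)$ with $p = e^{-0.05}$.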
Steady state availability

$$A_\infty = \pi_1 + \pi_2 + \pi_3 = 0.98355(1-\beta) \qquad (5)$$

The impact of common cause failures on the system is evaluated for this model. The MRGP steady state availability can be calculated for varying values of the common cause failure probability $\beta$. Analyzing the MRGP for the above numerical values gives the graph depicted in Fig. 2.
[Plot: Steady state Availability Graph. Vertical axis: steady state availability, from 0.5 to 1. Horizontal axis: $\beta$, from 0 to 0.5.]
Figure 2: Steady state availability of the system for varying $\beta$
The graph shows how the steady state availability ($A_\infty$) of the system varies as the common cause failure probability $\beta$ changes from 0 to 0.5. The graph shows a clear linear trend of $A_\infty$ with respect to $\beta$.
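The data behind such a plot can be tabulated directly. The sketch below takes equation (5) at face value as reported, with the coefficient 0.98355 as the only input; the function names and the grid step of 0.05 are our choices for illustration.

```python
"""Tabulate the steady state availability of equation (5) against beta."""


def availability(beta, a0=0.98355):
    """Equation (5) as reported: A = a0 * (1 - beta)."""
    return a0 * (1.0 - beta)


def availability_table(start=0.0, stop=0.5, step=0.05):
    """Return (beta, availability) pairs on an evenly spaced grid."""
    n = round((stop - start) / step)
    return [(round(start + i * step, 2), availability(start + i * step))
            for i in range(n + 1)]


if __name__ == "__main__":
    for beta, a in availability_table():
        print(f"{beta:4.2f}  {a:.5f}")
```

Successive rows of the table differ by a constant amount, which is the linear trend referred to in the discussion of Figure 2.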
V Conclusion
In this paper, analytical techniques based on the MRGP are explored for modeling and evaluating the availability of a multi state system. A parallel system of two components with common cause failures was elaborated, with a numerical example, to show the applicability of the MRGP in the evaluation of performance measures. Since the MRGP can overcome the limitations of the SMP to some extent, one can solve a wide range of problems in system reliability along similar lines.
VI Acknowledgments
The authors are grateful to the anonymous reviewer for his/her comments and short
remarks on the earlier version of this paper which improved its presentation.
References
[1] Choi, H., Kulkarni, V. G. and Trivedi, K. S. (1994). Markov Regenerative Stochastic Petri Nets. Performance Evaluation, 20: 337-357.
[2] Cinlar, E. (1975). Introduction to Stochastic Processes. Prentice-Hall, Englewood Cliffs, NJ.
[3] Fleming, K. N. (1974). A reliability model for common mode failures in redundant safety systems. Technical Report GA 13284, General Atomic Co., Pittsburg, PA.
[4] Fricks, R., Telek, M., Puliafito, A. and Trivedi, K. (1997). Markov renewal theory applied to performability evaluation, in: K. Bagchi, G. Zobrist (Eds.), State-of-the-Art in Performance Modeling and Simulation. Modeling and Simulation of Advanced Computer Systems: Applications and Systems, Gordon and Breach Publishers, Newark, NJ, USA, pp. 193-236.
[5] Fricks, R., Yin, L. and Trivedi, K. (2002). Application of semi-Markov process and CTMC to evaluation of UPS system availability, in: RAMS2002, pp. 584-591.
[6] Kulkarni, V. G. (1995). Modeling and Analysis of Stochastic Systems, Chapman and Hall, London, UK.
[7] Kozyrev, D., Rykov, V. and Kolev, N. (2018). Reliability Function of Renewable System under Marshall-Olkin Failure Model. Reliability: Theory and Applications, Vol. 13, No 1 (48), March, pp. 39-46.
[8] Limnios, N. and Oprisan, G. (2001). Semi-Markov Processes and Reliability, Statistics for Industry and Technology, Birkhauser, Boston, MA, USA.
[9] Perman, M., Senegacnik, A. and Tuma, M. (1997). Semi-Markov models with an application to power-plant reliability analysis, IEEE Transactions on Reliability, 46 (4): 526-532.
[10] Pyke, R. and Schaufele, R. (1966). The Existence and Uniqueness of Stationary Measures for Markov Renewal Sequences, Ann. Math. Statist., 37: 1439-1462.
[11] Rausand, M. and Hoyland, A. (2004). System Reliability Theory: Models, Statistical Methods and Applications, Wiley Int, Canada.
[12] Rykov, V. (2011). Decomposable Semi-regenerative Processes and their Applications, monograph, LAMBERT Academic Publishing, 75 pp.
[13] Rykov, V. and Ystrebenetsky, M. (1971). On regenerative processes with several types of regeneration states. Cybernetics, N 3, pp. 82-86, Kiev. (In Russian).
[14] Sahner, R., Trivedi, K. S. and Puliafito, A. (1995). Performance and Reliability Analysis of Computer Systems: An Example-Based Approach Using the SHARPE Software Package, Kluwer Academic Publishers, Dordrecht, The Netherlands.
[15] Trivedi, K. S. and Bobbio, A. (2017). Reliability and Availability Engineering: Modeling, Analysis and Applications, Cambridge University Press, UK.
[16] Wereley, N. and Walker, B. (1988). Approximate semi-Markov chain reliability models, in: 27th IEEE Conference on Decision and Control, vol. 3: pp. 2322-2329.
[17] Xie, W. (1999). Markov Regenerative Process in SHARPE, Master's Thesis, Duke University, Department of Electrical and Computer Engineering, Durham, NC, USA.