STOCHASTIC ANALYSIS OF A COMPLEX REPAIRABLE SYSTEM WITH A CONSTRAIN ON THE NUMBER OF REPAIRS

Late Dr. K. Shankar Bhat

Miriam Kalpana Simon

Madras Christian College

miriamkalpana@mcc.edu.in

Research article in the field of Computer and Information Sciences. License: CC BY.

Abstract

Reliability characteristics of repairable systems have been studied in the past in great detail by numerous researchers. Their findings are based mainly on the significant assumption that the repairs are carried out by one or more repair facilities, and that the process of repair renews the functional behavior of the components or units in the system. In other words, the statistical properties of the components or units can be restored by carrying out the repair upon failure. This means that failed units may be treated as "as good as new" after each repair. In many practical situations we observe that, in the process of making a unit as good as new, considerable damage will be done to the operational ability of the repair facility, which may reflect upon the repair rates of the units in subsequent repairs. Intuitively, we expect the average repair time of a unit to increase after each repair. This paper makes an attempt to incorporate these concepts in a two-unit warm standby redundant system in which the efficiency, or equivalently the repair capacity, of the repair facility decreases upon each repair. Subsequently, the process of repair may not contribute significantly to improving the system reliability. To increase the system reliability and keep the system available in the long run, an optimum replacement of the repair facility in terms of the mean time to system failure (MTSF) is suggested.

Keywords: Reliability, repair facility, warm standby redundant system, optimum replacement, Mean time to system failure.

I. Introduction

A great majority of real systems are repaired after they fail rather than replaced in toto. Jensen and Petersen [8] identified the Printed Circuit Board (PCB) as a good example of a repairable system. This is not strictly accurate, since failed PCBs are often discarded and replaced by new ones. Nevertheless, it does emphasize the point that systems such as sonar, radar or communication systems, of which PCBs form a small proportion, are certainly repaired rather than discarded. Repair maintenance is sought to increase the Mean Time To System Failure (MTSF) vis-à-vis system reliability. In addition to standby redundancy, repair maintenance is often resorted to in order to improve the system reliability. System components or units are repaired upon their failure. During the repair of a failed online unit, an operable standby is switched online so that the system continues to function.

One of the important objectives of a system engineer is to resort to repair maintenance that increases the mean time to system failure by removing the bottlenecks or constraints that hinder the improvement of the system reliability. System performance and reliability characteristics of such systems have been studied by a great many researchers. Bhat, Gururajan and Nayak [1] provided the availability and reliability measures and the MTBF of a two-unit cold standby system supported by a single repair facility. Cao and Wu [3] obtained reliability quantities of the system and the repair facility of a two-dissimilar-unit cold standby system where the repair facility is subject to failure and can be replaced by a new one after it fails.

Chaudhary, Sharma and Gupta [4] deal with a system composed of two non-identical units and a single repairman when the joint distribution of failure and repair times for each unit is a bivariate exponential distribution. The stochastic analysis of a two-identical-unit cold standby system in which a single repair facility appears in and disappears from the system randomly is considered by Gupta and Bhardwaj [5].

Gupta and Tyagi [6] discuss the stochastic analysis of a two-identical-unit cold standby system model with a single repairman whose efficiency depends on a perfect or imperfect environment. Reliability, availability and interval reliability measures of a two-unit warm standby system with a single repair facility, in which the lifetime of the functioning unit has a general distribution while the standby unit has a phase-type distribution, are derived by Gururajan and Srinivasan [7].

Kumar, Malik and Nandal [9] described the stochastic analysis of a repairable system consisting of two non-identical units with a single repairman, in which the failure times of the units are taken as negative exponential while arbitrary distributions are taken for the repair and treatment times. A warm standby repairable system including two dissimilar units, one repairman and an imperfect switching mechanism is studied by Sadeghi and Roghanian [11].

A comprehensive bibliography in this direction is also found in Osaki and Nakagawa [10], Srinivasan and Subramanian [12] and Bhat and Gururajan [2]. Their findings are based mainly on the noteworthy assumption that the repairs are carried out by one or more repair facilities, and that the process of repair renews the functional behavior of the components or units in the system. In other words, the statistical properties of the components or units can be restored by carrying out the repair upon failure. This means that failed units may be treated as "as good as new" after each repair. In many practical situations we observe that, in the process of making a unit as good as new, considerable damage may be done to the operational ability of the repair facility, which may reflect upon the repair rates of the units in subsequent repairs. Intuitively, we expect the average repair time of a unit to increase after each repair. At some stage the repair facility will contribute little to our aim of increasing the system reliability. At this stage it is worthwhile to replace the repair facility by a new one. This paper incorporates the above-mentioned ideas in a two-unit warm standby repairable system and arrives at an optimum replacement policy, clearly indicating the stage of replacement as a function of MTSF.

II. System Description

Let us characterize the complex two-unit standby redundant repairable system under study.

[01] The system consists of two dissimilar units having the same statistical properties. The units are labeled $U_1$ and $U_2$. Initially, $U_1$ is put online and $U_2$ is kept as a warm standby. Whenever a unit fails while functioning online, the standby unit is switched online, and the failed online unit is sent for repair.

[02] The unit that is kept in standby is vulnerable to failure. This unit is sent for repair upon failure in the standby state, and is restored to standby immediately after the completion of its repair.

[03] There are two repair facilities, RF1 and RF2. Online failed units are repaired using RF1 and standby failed units are repaired using RF2.

[04] The repair time distribution of an online failed unit (in RF1) is different on each failure. Furthermore, it is assumed that the repair rate of each unit decreases as the number of repairs increases. In other words, the efficiency of the repair facility decreases after each repair completion. On the other hand, a unit that fails in the standby state is as good as new after repair.

[05] The operational ability of the repair facility is considered unsatisfactory once it completes $2k$ repairs.

[06] A unit that has completed its $2k$-th repair may not help us in our objective of improving the system reliability. At this stage, a replacement policy for the repair facility may be considered feasible.

[07] All switchover times involved in the system operation are negligible, and the switch that performs the switchover operation is immune to failure.
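Assumption [04] can be illustrated with a small sketch. The geometric decay used here is purely hypothetical (the text only states that the repair rate falls with each repair), and the names `mu0`, `ratio` and the two helper functions are illustrative, not taken from the paper.

```python
def repair_rates(mu0, ratio, k):
    """Repair rate applied at the j-th repair, j = 1..2k.

    Hypothetical model: the rate shrinks by a constant factor `ratio`
    (0 < ratio < 1) after every completed repair.
    """
    return [mu0 * ratio ** (j - 1) for j in range(1, 2 * k + 1)]


def mean_repair_times(mu0, ratio, k):
    """Mean repair time at each repair, assuming exponential repair times."""
    return [1.0 / rate for rate in repair_rates(mu0, ratio, k)]


# Illustrative values only.
rates = repair_rates(mu0=10.0, ratio=0.9, k=3)
times = mean_repair_times(mu0=10.0, ratio=0.9, k=3)
for j, (r, m) in enumerate(zip(rates, times), start=1):
    print(f"repair {j}: rate = {r:.4f}, mean repair time = {m:.4f}")
```

Under this assumed model the mean repair time grows with every repair, which is the behaviour that motivates limiting the facility to $2k$ repairs.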

III. Notation

$f_i(\cdot),\ F_i(\cdot),\ \bar{F}_i(\cdot)$: p.d.f., c.d.f. and s.f. of the failure time of unit $i$, $i = 1,2$.

$g_{ij}(\cdot),\ G_{ij}(\cdot)$: p.d.f. and c.d.f. of the repair time of unit $i$ when it is repaired for the $j$-th time, $j = 1,2,\dots$; $i = 1,2$.

IV. Stochastic Behaviour of the Standby Unit

During the failure-free operation of the unit that is online, the behavior of the standby unit can be completely described by a two-state stochastic process with transition functions $p_{ij}(t)$, $t > 0$:

$$p_{ij}(t) = \Pr\{\text{the standby unit is in state } j \text{ at } t \mid \text{it was in state } i \text{ at } t = 0\},$$

$$i, j = o, r;\quad o = \text{operable},\ r = \text{under repair}. \tag{1}$$

We observe that the times spent in the states $o$ and $r$ by the standby unit form an alternating renewal process whose transition probabilities may be described through the functions $p_{ij}(t)$.

Thus, with $\nu$ denoting the failure rate of the standby unit and $\mu$ its repair rate, the transition probabilities $p_{ij}(\cdot)$ are evaluated as

$$p_{oo}(t) = [\mu + \nu e^{-(\nu+\mu)t}][\nu+\mu]^{-1};\quad p_{or}(t) = 1 - p_{oo}(t) \tag{2}$$

$$p_{ro}(t) = [\mu - \mu e^{-(\nu+\mu)t}][\nu+\mu]^{-1};\quad p_{rr}(t) = 1 - p_{ro}(t) \tag{3}$$

We observe that the function $p_{oo}(t)$ is the p-function of Kingman's regenerative phenomenon. Since these functions find repeated usage in our discussion, we provide their Laplace transforms.

$$p^*_{oo}(s) = \frac{\mu}{(\nu+\mu)s} + \frac{\nu}{(\nu+\mu)(s+\nu+\mu)} \tag{4}$$

$$p^*_{or}(s) = \frac{\nu}{(\nu+\mu)s} - \frac{\nu}{(\nu+\mu)(s+\nu+\mu)} \tag{5}$$

$$p^*_{ro}(s) = \frac{\mu}{(\nu+\mu)s} - \frac{\mu}{(\nu+\mu)(s+\nu+\mu)} \tag{6}$$

$$p^*_{rr}(s) = \frac{\nu}{(\nu+\mu)s} + \frac{\mu}{(\nu+\mu)(s+\nu+\mu)} \tag{7}$$
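The relation between the time-domain p-function in (2) and its transform in (4) can be checked numerically. The sketch below assumes, as the form of (2) suggests, exponential failure (rate `nu`) and repair (rate `mu`) for the standby unit; the parameter values and the helper `laplace_numeric` are illustrative and not part of the paper.

```python
import math


def p_oo(t, nu, mu):
    """Probability that the standby unit is operable at t, given operable at 0."""
    return (mu + nu * math.exp(-(nu + mu) * t)) / (nu + mu)


def p_oo_laplace(s, nu, mu):
    """Closed-form Laplace transform of p_oo, in the form of equation (4)."""
    return mu / ((nu + mu) * s) + nu / ((nu + mu) * (s + nu + mu))


def laplace_numeric(f, s, upper=60.0, steps=200_000):
    """Trapezoidal approximation of the integral of exp(-s t) f(t) over [0, upper]."""
    h = upper / steps
    total = 0.5 * (f(0.0) + math.exp(-s * upper) * f(upper))
    for i in range(1, steps):
        t = i * h
        total += math.exp(-s * t) * f(t)
    return total * h


nu, mu, s = 0.5, 2.0, 1.0  # illustrative values only
numeric = laplace_numeric(lambda t: p_oo(t, nu, mu), s)
closed = p_oo_laplace(s, nu, mu)
print(f"numeric = {numeric:.6f}, closed form = {closed:.6f}")
```

The same check applies to the other three transforms by swapping in the corresponding time-domain function.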

V. Reliability Analysis

In our effort to characterize the system we define the following events:

$E_{i,0}$: the event that unit $i$, which has not gone through any repair till then, just begins to operate online, $i = 1,2$.

$E_{i,j}$: the event that the $j$-th repair of unit $i$ just begins; at this instant an operable standby unit is put online, $i = 1,2$; $j = 1,2,\dots,k$.

These events, constituting a regenerative process, enable us to trace the behavior of the system completely over time. For the system to be continuously operable in $(0,t]$, it is necessary that at the instant of failure of unit $i$, unit $(3-i)$ should be in operable condition. To this end, the following auxiliary system-down-forbidding functions are defined.

$$P_r(j,t)\,\Delta t = \Pr\{E_{1,j+1} \text{ occurs between } t \text{ and } t+\Delta t \text{ and the system is operable in } (0,t] \mid E_{2,j} \text{ at } t = 0\},\quad j = 1,2,\dots,k-1 \tag{8}$$

We observe that the function $P_r(j,t)$ represents the pdf of the time interval between the $E_{2,j}$ and $E_{1,j+1}$ events. Further,

$$Q_r(j,t)\,\Delta t = \Pr\{E_{2,j+1} \text{ occurs between } t \text{ and } t+\Delta t \text{ and the system is operable in } (0,t] \mid E_{1,j+1} \text{ at } t = 0\},\quad j = 1,2,\dots,k-1 \tag{9}$$

$$H_r(j,t)\,\Delta t = \Pr\{E_{2,j} \text{ occurs between } t \text{ and } t+\Delta t \text{ and the system is operable in } (0,t] \mid E_{2,j-1} \text{ at } t = 0\},\quad j = 1,2,\dots,k \tag{10}$$

We notice that $H_r(j,t)$ represents the pdf of the time interval between two successive $E_2$ events. Similarly,

$$\Phi_r(j,t)\,\Delta t = \Pr\{E_{2,j} \text{ occurs between } t \text{ and } t+\Delta t \text{ and the system is operable in } (0,t] \mid E_{2,0} \text{ at } t = 0\},\quad j = 1,2,\dots,k \tag{11}$$

We observe that the functions (9) and (10) are system-down-forbidding functions in the sense that a system down is not acceptable between the occurrences of two events. These functions are evaluated with the help of the regenerative events $E_{i,j}$, observing that, at the instant of failure of $U_1$, $U_2$ has completed its $j$-th repair and is found in operable condition in its standby state.

Thus, with $\ast$ denoting convolution, the pdfs between two successive events are

$$P_r(j,t) = f_1(t)\{g_{2,j}(t) \ast [\mu + \nu e^{-(\nu+\mu)t}][\nu+\mu]^{-1}\},\quad j = 1,2,\dots,k \tag{12}$$

$$Q_r(j,t) = f_2(t)\{g_{1,j+1}(t) \ast [\mu + \nu e^{-(\nu+\mu)t}][\nu+\mu]^{-1}\},\quad j = 1,2,\dots,k-1 \tag{13}$$

The pdfs between successive regenerative events are obtained using the forward recurrence relations between the $P$ and $Q$ functions and between the $H$ and $\Phi$ functions. Thus

$$H_r(j,t) = P_r(j-1,t) \ast Q_r(j-1,t),\quad j = 2,3,\dots,k \tag{14}$$

$$\Phi_r(j,t) = \Phi_r(j-1,t) \ast H_r(j,t),\quad j = 2,3,\dots,k \tag{15}$$

$$\Phi_r(1,t) = H_r(1,t) \tag{16}$$

where $H_r(1,t) = f_2(t)\{g_{1,1}(t) \ast [\mu + \nu e^{-(\nu+\mu)t}][\nu+\mu]^{-1}\}$.
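The recurrences (14) to (16) chain convolutions of densities, which is easy to mimic on a time grid. The sketch below is a minimal numerical illustration, not the authors' program: the helper names are hypothetical, and plain exponential densities are used as stand-ins for the $P$, $Q$ and $H(1,\cdot)$ functions so that the result can be compared with a known closed form (an Erlang density).

```python
import math


def convolve(f, g, h):
    """Trapezoidal approximation of (f * g)(t) = integral_0^t f(u) g(t-u) du.

    f and g are lists of samples on the grid t_i = i * h.
    """
    n = len(f)
    out = [0.0] * n  # the convolution is zero at t = 0
    for i in range(1, n):
        acc = 0.0
        for j in range(i + 1):
            w = 0.5 if j in (0, i) else 1.0  # trapezoidal end-point weights
            acc += w * f[j] * g[i - j]
        out[i] = acc * h
    return out


def phi_recursion(P, Q, H1, h):
    """Return [Phi(1), ..., Phi(k)] from samples of P(j), Q(j), j = 1..k-1, and H(1)."""
    phis = [H1]                                  # Phi(1) = H(1), as in (16)
    for j in range(2, len(P) + 2):
        Hj = convolve(P[j - 2], Q[j - 2], h)     # H(j) = P(j-1) * Q(j-1), as in (14)
        phis.append(convolve(phis[-1], Hj, h))   # Phi(j) = Phi(j-1) * H(j), as in (15)
    return phis


# Stand-in densities (an assumption for illustration): all exponential with rate lam.
lam, h = 2.0, 0.01
grid = [i * h for i in range(301)]
expo = [lam * math.exp(-lam * t) for t in grid]

phis = phi_recursion([expo], [expo], expo, h)
erlang3_at_1 = lam ** 3 * math.exp(-lam) / 2.0   # Erlang(3, lam) density at t = 1
print(f"Phi(2) at t = 1: {phis[1][100]:.6f}, Erlang-3 density: {erlang3_at_1:.6f}")
```

With the actual densities of (12) and (13) sampled on the same grid, the same two functions would give $\Phi_r(j,t)$ for each $j$; those densities are defective (they integrate to less than one), which the recursion handles without change.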

Observing that the epoch at which a unit is switched online after repair, and at which the repair of the other unit commences, is a point of regeneration, we are in a position to write an expression for the reliability function of the system. The reliability function $R(t,k)$ of the system is given by

$$R(t,k) = \bar{F}_1(t) + f_1(t)\,p_{oo}(t) \ast \bar{F}_2(t) + f_1(t)\,p_{oo}(t) \ast \sum_{j=1}^{k} \Phi_r(j,t) \ast \{\bar{F}_1(t) + P_r(j,t) \ast \bar{F}_2(t)\} \tag{17}$$

The expression (17) is obtained by considering the following mutually exclusive and exhaustive cases:

(a) $U_1$, which is fresh and has not gone through any repairs, does not fail before $t$.

(b) $U_2$, which is instantaneously switched over from standby and has not gone through any repair till then, does not fail before $t$.

(c) $U_i$, while operating online after its $j$-th repair ($j = 1,2,\dots,k$), does not fail before $t$.

VI. Availability Analysis

The following auxiliary system-down-allowing functions are defined to obtain the p.d.f. of the time intervals between $E_{i,j}$ events.

$$P_a(j,t)\,\Delta t = \Pr\{E_{1,j+1} \text{ occurs between } t \text{ and } t+\Delta t \mid E_{2,j} \text{ at } t = 0\},\quad j = 1,2,\dots,k-1 \tag{18}$$

$$Q_a(j,t)\,\Delta t = \Pr\{E_{2,j+1} \text{ occurs between } t \text{ and } t+\Delta t \mid E_{1,j+1} \text{ at } t = 0\},\quad j = 1,2,\dots,k-1 \tag{19}$$

$$H_a(j,t)\,\Delta t = \Pr\{E_{2,j} \text{ occurs between } t \text{ and } t+\Delta t \mid E_{2,j-1} \text{ at } t = 0\},\quad j = 1,2,\dots,k \tag{20}$$

$$\Phi_a(j,t)\,\Delta t = \Pr\{E_{2,j} \text{ occurs between } t \text{ and } t+\Delta t \mid E_{2,0} \text{ at } t = 0\},\quad j = 1,2,\dots,k \tag{21}$$

A schematic representation of the system behavior between the $E_{2,j}$ and $E_{1,j+1}$ events is shown in Figure 1.

Figure 1: A Schematic Representation of Evaluation of $P_a(j,t)$

We observe that the functions (18) and (19) are system-down-allowing functions. We first examine the possibility that, at the instant of failure of $U_1$, $U_2$ has completed its $j$-th repair and is found in operable condition in its standby state. The term that corresponds to this possibility is:

..1(..)[..2,..(..) � {[. +....(.+.)..][.+.].1}] ..=1,2,� ..

The second possibility corresponds to the situation in which the online $U_1$ fails at $u$ and, at this time point, $U_2$ is found not in operable condition and is undergoing repair. The unit is switched online at the epoch of its repair completion. Thus, the term that corresponds to this possibility is:

..1(..)[..2,..(..)�{[......(.+.)..][.+.].1}� ......] ..=1,2,� ..

Thirdly, when the online $U_1$ fails at $u$, the $j$-th repair of $U_2$ is not over before $u$ and will be over at $v$, $v > u$. This probability is given by

..1(..)..2,..(..) ..=1,2,� ..

Thus, in its totality, the pdf of the time interval between an $E_{2,j}$ event and an $E_{1,j+1}$ event is given by

....(..,..)=..1(..)[..2,..(..)� {[.+....(.+.)..][.+.].1}]+..1(..){..2,..(..)�{[......(.+.)..][.+.].1}� ......]

+..1(..)[..2,..(..)] ..=1,2,� .. (22)

Similarly,

....(..,..)=..2(..){..1,..+1(..)�{[. +....(.+.)..][.+.].1}]+..2(..){..1,..+1(..)�{[......(.+.)..][.+.].1}

� ......]+..2(..)[..1,..+1(..)] ..=1,2,� ...1 (23)


By means of (22) and (23) we obtain

$$H_a(j,t) = P_a(j-1,t) \ast Q_a(j-1,t),\quad j = 2,3,\dots,k \tag{24}$$

$$\Phi_a(j,t) = \Phi_a(j-1,t) \ast H_a(j,t),\quad j = 2,3,\dots,k \tag{25}$$

$$\Phi_a(1,t) = H_a(1,t) \tag{26}$$

where ....(1,..)=..2(..)[..1,1(..)�{[.+....(.+.)..][.+.].1}]+..2(..)[..1,1(..)�{[......(.+.)..][.+.].1 }�......]

+..2(..)[..1,1(..)]

The availability function $A(t,k)$ of the system is derived by taking into consideration the following mutually exclusive and exhaustive cases:

(a) $U_1$, which is fresh and has not gone through any repairs, does not fail before $t$.

(b) $U_2$, which is instantaneously switched over from standby and has not gone through any repair till then, does not fail before $t$.

(c) $U_i$, while operating online after its $j$-th repair ($j = 1,2,\dots,k$), does not fail before $t$.

..(..,..)=...1(..)+..1(..)[......(..)+......(..)�......}� ...2(..)+..1(..)[......(..)+......(..)�......]�....(..,..)

� {...1(..)+....(..,..)�...2(..) } ..=1,2,� .. (27)

VII. Mean Time To System Failure

In the analysis of the system we have assumed arbitrary failure time and repair time distributions for the units while working online. For the purpose of illustration we consider a model in which both the units are identical by virtue of their statistical properties and their failure time distributions are exponential. In addition to the assumptions made for the standby unit, we formalize the failure time and repair time distributions of the online units.

$$f_i(t) = \lambda e^{-\lambda t},\quad \lambda > 0,\ i = 1,2$$

......(..)=...............>0,..= 1,2 ; ..=1,2,� ..

The integral equations given in (17) are solved using the Laplace transform technique, and $R^*(s,k)$, the Laplace transform of $R(t,k)$, is:

...(..,.. )=1..+..+...(..)..+..+...(..)...1..(..).......(..)....=2....=1[1..+..+.......(..)(..+..)3(..+..+....)] (28)

where ..1..(..)= ....1.(..)(..+..)(..+..+....),......(..)=..2.........1[.(..)]2(..+..)2(..+..+....)(..+..+.....1) and .(..)=1(..+..)[..(..+..)+..(..+..+..+..)]

We observe that $R^*(s,k)$ is a rational function of its arguments and can be easily inverted for small values of $k$. Thus, the reliability can be explicitly computed for small values of $k$.

The Mean Time To System Failure (MTSF) is given by

...(0,.. )=1..+.(0)+.(0)...1..(0).......(0)....=2....=1[1+.(0)1+....)] (29)

where ..1..(0)= .(0)1+..1. ......(0)=[.(0)]2(1+....)(1+.....1), .(..)=1(..+..)[....+....+..+..] and ....=......

A program was written for the precise evaluation of $R^*(0,k)$. It evaluates the MTSF for specified values of the parameters. The MTSF, as a function of $k$, is evaluated for specific values of the parameters and tabulated in Table 1.

Table 1: MTSF of the system for the parameter values (0.95, 50, 10, 40)

| k | MTSF = R*(0) | k | MTSF = R*(0) |
|---|---|---|---|
| 1 | 3.1756 | 21 | 6.2046 |
| 2 | 4.0626 | 22 | 6.2056 |
| 3 | 4.6862 | 24 | 6.2028 |
| 4 | 5.1269 | 26 | 6.2075 |
| 6 | 5.6607 | 27 | 6.2077 |
| 8 | 5.9303 | 28 | 6.2078 |
| 10 | 6.0669 | 30 | 6.2079 |
| 12 | 6.1362 | 32 | 6.2080 |
| 13 | 6.1568 | 33 | 6.2081 |
| 15 | 6.1820 | 34 | 6.2081 |
| 16 | 6.1895 | 35 | 6.2081 |
| 18 | 6.1986 | 36 | 6.2081 |
| 19 | 6.2013 | 37 | 6.2081 |
| 20 | 6.2033 | 38 | 6.2081 |

The graphical representation of $R^*(0,k)$ for specific values of the parameters is depicted in Figure 2.

Figure 2: Graphical Representation of the MTSF (vertical axis) against k (horizontal axis) for the parameter values (0.95, 50, 10, 40)

The graph clearly indicates that there is no improvement in the MTSF once the repair facility completes 2k = 34 repairs. It is therefore not worthwhile to retain the repair facility beyond this point. Consequently, we suggest that at this stage the repair facility be replaced by a new one in order to increase the system performance and keep the system available in the long run.

VIII. A Provision for Replacement of Repair Facility

One strategy is to scrap a repair facility when it is unable to perform its operation. If one follows this strategy alone, the system becomes unavailable in the long run. A more prudent policy is to replace the repair facility by a new one so that the system remains available in the long run. When a repair facility completes $2k$ repairs, it is replaced by a similar new repair facility. The variable $k$ is fixed at the value at which the MTSF stabilizes, in the sense that

$$R^*(0,k) = R^*(0,k+n),\quad n = 1,2,3,\dots \tag{30}$$
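The stabilization criterion (30) can be checked against the tabulated MTSF values. The sketch below is only an illustration: the dictionary is transcribed from Table 1, the helper name `first_stable_k` is hypothetical, and the check is limited to the tabulated values of k and to the four-decimal precision of the table.

```python
# Values transcribed from Table 1 (k -> MTSF); only the tabulated k are available.
TABLE_1 = {
    1: 3.1756, 2: 4.0626, 3: 4.6862, 4: 5.1269, 6: 5.6607, 8: 5.9303,
    10: 6.0669, 12: 6.1362, 13: 6.1568, 15: 6.1820, 16: 6.1895, 18: 6.1986,
    19: 6.2013, 20: 6.2033, 21: 6.2046, 22: 6.2056, 24: 6.2028, 26: 6.2075,
    27: 6.2077, 28: 6.2078, 30: 6.2079, 32: 6.2080, 33: 6.2081, 34: 6.2081,
    35: 6.2081, 36: 6.2081, 37: 6.2081, 38: 6.2081,
}


def first_stable_k(table, tol=0.0):
    """Smallest tabulated k whose MTSF matches (within tol) the MTSF at every larger tabulated k."""
    ks = sorted(table)
    for idx, k in enumerate(ks):
        if all(abs(table[m] - table[k]) <= tol for m in ks[idx + 1:]):
            return k
    return ks[-1]


stable_k = first_stable_k(TABLE_1)
print("first tabulated k from which the MTSF no longer changes:", stable_k)
```

A positive `tol` relaxes the criterion, which may be preferable when the gain from a further repair is judged too small to matter rather than exactly zero.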

The policy of replacement is as follows:

"After the $nk$-th repair completion of unit 1, the old repair facility is scrapped and a new repair facility is introduced. Here $n$ denotes the number of such replacements, $n \ge 1$. We suggest replacement of the repair facility only, and not of operable units. When a unit operating online after the $nk$-th repair fails, it is sent to the new repair facility; at this epoch an operable standby is instantaneously switched online."

I. Reliability Analysis of the Modified System

Let us define

$$\Psi_r(0,t)\,\Delta t = \Pr\{E_{1,1} \text{ occurs between } t \text{ and } t+\Delta t \text{ and the system is operable in } (0,t] \mid E_{1,1} \text{ at } t = 0\} \tag{31}$$

The function $\Psi_r(0,t)$ is the pdf of the time interval between two successive $E_{1,1}$ events, during which the system remains operable. Thus

$$\Psi_r(0,t) = \Phi_r(k,t) \ast [g_{2,k}(t) \ast \{[\mu + \nu e^{-(\nu+\mu)t}][\nu+\mu]^{-1}\}]\,f_1(t) \tag{32}$$

The reliability function of the modified system is given by

..1(..,..)=1..1(..,..)+[.{...(0,..)}.....=1]�{...2(..)+....(..,..)....=1{...1(..)+{..2,..(..)�......(..)}..1(..)�...2(..)}}

(33)

where ${}_1R_1(t,k)$ is the expression given on the right-hand side of (17). Equation (33) is derived by considering the following mutually exclusive and exhaustive possibilities:

(a) the interval $(0,t]$ is not intercepted by an $E_{1,1}$ event.

(b) the interval $(0,t]$ is intercepted by at least one $E_{1,1}$ event.

II. Availability Analysis of the Modified System

The pdf of the time interval between system-down-allowing regenerative events is evaluated through

$$\Psi_a(0,t)\,\Delta t = \Pr\{E_{1,1} \text{ occurs in } (t, t+\Delta t) \mid E_{1,1} \text{ at } t = 0\} \tag{34}$$

and is given by

...(0,..)=...(..,..)� [{..2,..(..)�......(..)}..1(..)+{..2,..(..)� ......(..)�......}..1(..)] (35)

Arguments that lead to the derivation of (19) will give us the availability function of the system with a provision for a replacement of repair facility. Thus,

..1(..,..)=1..1(.....)+.[{...(0,..)}..]...=1�{...2(..)+....(..,..)....=1{...1(..)+[{..2,..(..)� ......(..)}..1(..)

+{..2,..(..)� ......(..)�......}..1(..)]� ...2(..)}} (36)

where ${}_1A_1(t,k)$ is the expression given on the right-hand side of (27).

The steady state availability of the system is given by

$$A_\infty = \lim_{t \to \infty} A(t) = \lim_{s \to 0} s\,A^*(s)$$

where $A^*(s)$ denotes the Laplace transform of $A(t)$.
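The equality of the two limits (the final value theorem for Laplace transforms) can be illustrated on a case with a known closed form. The sketch below does not use the paper's availability function; it assumes a single repairable unit with exponential failure rate `lam` and repair rate `mu` as a stand-in, and all names are illustrative.

```python
import math


def availability(t, lam, mu):
    """Point availability of a single repairable unit (illustrative stand-in)."""
    return (mu + lam * math.exp(-(lam + mu) * t)) / (lam + mu)


def availability_laplace(s, lam, mu):
    """Laplace transform of the stand-in availability function."""
    return mu / ((lam + mu) * s) + lam / ((lam + mu) * (s + lam + mu))


lam, mu = 0.5, 2.0                     # illustrative values only
time_limit = availability(1e3, lam, mu)            # A(t) for large t
s = 1e-8
transform_limit = s * availability_laplace(s, lam, mu)  # s A*(s) for small s
print(f"large-t value = {time_limit:.6f}, s*A*(s) near 0 = {transform_limit:.6f}")
```

Both quantities approach mu / (lam + mu), the long-run fraction of time the stand-in unit is up.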

References

[1] Bhat, K. S., Gururajan, M. and Nayak, P. (1988). A study of a 2-unit system with random breakdown of the repair facility. Microelectronics Reliability, 28(3):369-371.

[2] Bhat, K. S. and Gururajan, M. (1993). A two-unit cold standby system with imperfect repair and excessive availability period. Microelectronics Reliability, 33(4):509-512.

[3] Cao, J. and Wu, Y. (1989). Reliability analysis of a two-unit cold standby system with a replaceable repair facility. Microelectronics Reliability, 29(2):145-150.

[4] Chaudhary, P., Sharma, A. and Gupta, R. (2022). A Discrete Parametric Markov-Chain Model of a Two Non-Identical Units Warm Standby Repairable System with Two Types of Failure. Reliability: Theory & Applications, 17(2 (68)):21-30.

[5] Gupta, R. and Bhardwaj, P. (2019). A discrete parametric Markov-chain model of a two unit cold standby system with appearance and disappearance of repairman. Reliability: Theory & Applications, 14(1):13-22.

[6] Gupta, R. and Tyagi, A. (2019). A Discrete Parametric Markov-Chain Model of a Two-Unit Cold Standby System with Repair Efficiency Depending on Environment. Reliability: Theory & Applications, 14(1):23-33.

[7] Gururajan, M. and Srinivasan, B. (1995). A complex two-unit system with random breakdown of repair facility. Microelectronics Reliability, 35(2):299-302.

[8] Jensen, F. and Petersen, N. E. (1982). Burn-in. Wiley-Interscience.

[9] Kumar, N., Malik, S. C. and Nandal, N. (2022). Stochastic analysis of a repairable system of non-identical units with priority and conditional failure of repairman. Reliability: Theory & Applications, 17(1 (67)):123-133.

[10] Osaki, S. and Nakagawa, T. (1976). Bibliography for reliability and availability of stochastic systems. IEEE Transactions on Reliability, 25(4):284-287.

[11] Sadeghi, M. and Roghanian, E. (2017). Reliability analysis of a warm standby repairable system with two cases of imperfect switching mechanism. Scientia Iranica, 24(2):808-822.

[12] Srinivasan, S. K. and Subramanian, R. (1980). Probabilistic analysis of redundant systems. Springer-Verlag.
