A Bayes Analysis and Comparison of Weibull and Lognormal Based Accelerated Test Models with Actual
Lifetimes Unknown
Prof. S.K. Upadhyay •
Department of Statistics, Banaras Hindu University, Varanasi- 221005, India; DST- Centre for Interdisciplinary Mathematical Sciences, Banaras Hindu University,
Varanasi- 221005, India Email id: skupadhyay@gmail.com
Miss. Reema Sharma •
Department of Statistics, Banaras Hindu University, Varanasi- 221005, India Email id: reema.stat@gmail.com
Abstract
The paper considers an accelerated test situation where the actual lifetimes of the items are not directly observable; rather, their status is known in the form of binary outcomes. By assuming two widely entertained models for the actual lifetimes, namely the Weibull and the lognormal distributions, the paper provides a full Bayesian analysis of the entertained models when both scale and shape parameters of the models are allowed to vary over the covariates involved in the study, thus giving rise to the corresponding accelerated test models. The Bayes implementation is based on sample based approaches, namely the Metropolis algorithm and the Gibbs sampler, using proper priors of the parameters where the prior elicitation is based on expert testimonies. The situation involving missing items, where even the actual status is unknown, is additionally entertained using the same modelling assumptions. A comparison between the two entertained models is carried out using some standard Bayesian model comparison tools. Finally, a numerical illustration is provided based on a given set of current status data and some relevant findings are reported.
Keywords: Binary outcomes, Missing items, Accelerated testing, Weibull distribution, Lognormal distribution, Log-linear link function, Metropolis algorithm, Gibbs sampler, Model comparison.
1 Introduction
Generally, in life testing experiments, the items or equipment are put on test to observe their exact failure times and, based on these observations, various reliability characteristics of the items under consideration are studied. There are situations, however, where exact lifetimes are not observable and the experimenter only happens to know the status of the items with regard to their failure or survival. That is, each item is either surviving at the time of observation or found in the failed state. Thus the resulting data, often in the form of binary outcomes, may represent one of the two states of the items at the time of observation and are generally referred to as current status data. An
important example includes the time to occurrence of a tumour in animal carcinogenicity experiments, where one might not be able to observe the exact time to appearance of the tumour in the subject but only observes the status of the tumour at a particular time, that is, whether the tumour is present or not. Other examples include the testing of electro explosive devices, missiles, rocket motors, air bags in cars, etc. (see Balakrishnan and Ling (2012, 2013)) where the items are found either in the working state or in the failed state at the time of observation. Thus in all these situations the actual lifetimes are unknown and the experimenter only observes the status of the items or subjects at the time of observation. Also, while dealing with such life testing experiments, there is a possibility of some of the experimental units going missing during experimentation due to some known or unknown reasons and, as such, the experimenter is not even in a position to know whether such items were surviving or had already failed at the time of observation. Say, for example, in animal carcinogenicity experiments involving mice, some of the experimental units (mice) might not be available at their expected places at a specified point of time and, as such, it is not possible to know even their current status. A similar kind of situation was also studied by Sharma and Upadhyay (2018a) with regard to engineering experiments where the actual lifetimes and the status of some of the items are both unknown.
Fan et al. (2009), Balakrishnan and Ling (2012) and Sharma and Upadhyay (2018a) are some of the important references on the analysis of current status data. Whereas Fan et al. (2009) and Sharma and Upadhyay (2018a) provide Bayesian analyses of such datasets, Balakrishnan and Ling (2012) deals mainly with classical inference. Other important references on the analysis of current status data include Balakrishnan and Ling (2013) and Balakrishnan and Ling (2014), where the authors used different lifetime models in their work on classical maximum likelihood (ML) estimation and observed that the Weibull distribution performs better than the other considered models.
Before introducing some other relevant concepts, we need to consider appropriate lifetime distributions that are capable of representing the actual lifetimes, which are not exactly known in the present scenario. It may be noted here that we do not have the actual lifetimes of the items in the present situation; we only have information on the status of the units at the time of observation, that is, failed or surviving. Thus, if the time of observation is T, the actual lifetime either falls below T or goes beyond T. Since the lifetimes are continuous variates, the probability that a failure occurs exactly at T is zero.
Among the various lifetime models, the two-parameter Weibull distribution and the two-parameter lognormal distribution, specified by their scale and shape parameters, are widely used in the literature. The two-parameter Weibull distribution is a quite flexible and rich family that has the capability of accommodating all three hazard rate shapes, that is, increasing, decreasing and constant. This is perhaps the reason that the model is highly explored and used in a wide variety of situations (see, for example, Lawless (2002), Upadhyay (2010)). Similarly, the two-parameter lognormal distribution is known for its non-monotone hazard rate that initially increases and attains its maximum. It then decreases and finally approaches zero for large lifetimes, being near zero at the initial lifetimes as well (see Lawless (2002)). It is often proclaimed that this decreasing nature of the lognormal hazard rate for large lifetimes makes the lognormal a less popular lifetime distribution regardless of its versatile hazard rate. In spite of this discouraging fact, however, the distribution receives the attention of a number of reliability practitioners, especially in situations where very large lifetimes are not of interest. We do not go into the details of various inferential developments related to these models due to paucity of space. The interested readers may, however, refer to Mann et al. (1974), Lawless (2002) and Singpurwalla (2006), among others, where the last reference primarily concerns Bayesian developments.
The Weibull and lognormal distributions differ in their tails and both may fit a dataset equally well in its middle range. In fact, when both distributions are fitted to lifetime data, the Weibull distribution has an earlier lower tail than the corresponding lognormal distribution. In other words, a low Weibull percentile is always below the corresponding lognormal percentile, making the Weibull distribution more pessimistic (see, for example, Nelson (1990)). In spite of several such comparative remarks, the two models are used simultaneously by a number of authors for a variety of lifetime data sets (see, for example, Dumonceaux et al. (1973), Wang (1999) and Upadhyay and Peshwani (2003)). These authors have concluded that the two models appear to be good contenders to each other and, therefore, each one can be used as an alternative to the other in a variety of situations. Important classical references on the model comparison include Dumonceaux et al. (1973), Meeker (1984), Kim and Yum (2008), etc. Meeker (1984), however, extended the task of model comparison by focussing on accelerated test plans involving censored data for the two models. Bayesian contributions on model comparison between the Weibull and lognormal models include Kim et al. (2000), Upadhyay and Peshwani (2003) and Araujo and Pereira (2007). It may be noted that some of these references provide extensive treatment of model comparison and conclude their findings based on various model comparison tools of the Bayesian paradigm.
Generally, the experimental units used in the considered situations are highly reliable and, therefore, laboratory based experimentation may result in very few failures or even no failures in the normal operating environment. As a matter of fact, the outcomes of such experimentation may provide one-sided information, that is, all the experimental units are surviving at the time of observation and none have failed. The problem can be resolved to a large extent if the experiment is conducted in an accelerated environment where the items under test are allowed to operate at accelerated levels of the stress(es) or covariates to induce early failures. This might be, for instance, an accelerated level of the dose of the chemical responsible for inducing the tumour in the animal carcinogenicity experiment, or accelerated levels of stresses such as temperature, humidity and voltage, etc., in the testing of electro explosive devices, missiles, rocket motors, airbags in cars, etc. Moreover, although such experiments are performed in an accelerated environment, the primary objective remains to draw inferences, based on the observed data, for the normal operating environment. This can, of course, be achieved by means of a suitable life-stress relationship used to relate the lifetimes with the applied covariate(s)/stress(es). Such relationships are generally decided based on biological considerations in animal carcinogenicity experiments or physical considerations in engineering experiments (see, for example, Nelson (1990) and Lawless (2002)).
A number of life-stress relationships have been suggested in the literature on accelerated testing. Important among these are the Arrhenius, Eyring, inverse power, exponential, exponential-power, quadratic and polynomial relationships, etc. These relationships are generally used when the characteristic life is assumed to be influenced by only one covariate or stress variable involved in the study. When multiple stresses or covariates are involved in the process, the most commonly used relationship is the log-linear relationship, which is formed under the assumption that the characteristic life has a log-linear relation with the stress(es). This relationship offers a generalized form, although it can equally be used when only one or two stresses affect the process. Another apparent advantage associated with the log-linear relationship is the mathematical convenience in its use. These are some of the reasons that have led to the widespread use of the log-linear relationship in a variety of situations. A detailed discussion on life-stress relationships and the related issues can be found in Meeker and Escobar (1998), Wang (1999), Nelson (1990) and Sen (2016), etc.
Most accelerated tests work under the assumption that only the scale parameter is influenced by the covariate(s)/stress(es) involved in the study whereas the shape parameter remains constant over the covariate(s)/stress(es) (see, for example, Nelson (1990)). However, this is not true in practice for all kinds of datasets and, therefore, the assumption of a constant shape parameter with respect to the covariates may hide some important features of the units involved in the study. Meeter and Meeker (1994) is a good reference where the applicability of a non-constant shape parameter is demonstrated by means of examples (see also Balakrishnan and Ling (2013)).
The present paper provides Bayes analysis of both Weibull and lognormal based accelerated test models with an assumption that both the parameters of the two models are likely to be affected by the considered stress variates. The paper also considers missing data situation by assuming a hypothetical scenario in the assumed current status data although the hypothetical missing data scenario appears to be quite realistic in practice. Finally, the two models are compared using some standard Bayesian tools such as the deviance information criterion (DIC) and the expected posterior predictive loss (EPPL) criterion.
The plan of the paper is as under. The next section provides the formulation of the likelihood functions corresponding to the two considered models for a general form of current status data when the experiment is subject to accelerated testing and both scale and shape parameters of the model are affected by the stress variables. Section 3 details the Bayesian model formulation and also comments on the implementation of the Metropolis and the Gibbs sampler algorithms to get the desired posterior based inferences in both non-missing and missing data situations. In Section 4, the model selection criteria, namely the DIC and the EPPL, are discussed in brief. This section is given for completeness only. Section 5 provides a numerical illustration based on a real dataset. Finally, the paper ends with a brief conclusion given in the last section.
2 The Models and the Likelihood Functions
Let us consider a life testing experiment that involves I experimental groups where the ith group consists of K_i experimental units, i = 1,2,...,I. Thus, in all, the experiment involves testing of Σ_{i=1}^{I} K_i experimental units. Besides, we also assume that in the ith experimental group, the K_i units are observed for their status in terms of either failure or survival at time T_i, where the lifetime of each unit is affected by J kinds of covariates, say x_ij; i = 1,2,...,I, j = 1,2,...,J. Accordingly, the observed number of failures or survivors is recorded. Obviously, the resulting outcomes are available in the form of binary data where binary zero is used to represent the failed state and binary one the state of survival. Let n_i and r_i = (K_i − n_i) denote the number of failures (count of binary zeros) and the number of survivals (count of binary ones), respectively, observed at time T_i in the ith experimental group when each unit in the group is subject to the J covariates or stresses, i = 1,2,...,I. The complete structure of the data is shown in Table 1. We have also considered the missing data case but the same will be discussed separately.
Table 1: Current status data observed at different points of time under different stresses or covariates (the values in parentheses correspond to missing data situation)
Experimental group | Number of experimental units | Observation time | Number of failures | Number of survivals | Number of missing units | Covariates
1 | K_1 | T_1 | n_1 (n′_1) | r_1 (r′_1) | (m_1) | x_11, ..., x_1J
2 | K_2 | T_2 | n_2 (n′_2) | r_2 (r′_2) | (m_2) | x_21, ..., x_2J
... | ... | ... | ... | ... | ... | ...
I | K_I | T_I | n_I (n′_I) | r_I (r′_I) | (m_I) | x_I1, ..., x_IJ
Now suppose t_ik denotes the lifetime of the kth experimental unit in the ith experimental group, where i = 1,2,...,I and k = 1,2,...,K_i. If the lifetimes t_ik; i = 1,2,...,I, k = 1,2,...,K_i, are assumed to be independent and identically distributed (iid) Weibull variates then the associated probability density function (pdf) can be written as
f_W(t_ik) = (β_i / θ_i) (t_ik / θ_i)^(β_i − 1) exp{ −(t_ik / θ_i)^(β_i) };  t_ik > 0, θ_i, β_i > 0, ∀ i, k,  (1)
where θ_i and β_i denote the scale and the shape parameters, respectively, associated with the Weibull model corresponding to the ith experimental group. Let us assume that both θ_i and β_i are related to the covariates x_ij; i = 1,2,...,I, j = 1,2,...,J, by means of the log-linear relationship given as
θ_i = exp(a_0 + Σ_{j=1}^{J} a_j x_ij)  and  β_i = exp(b_0 + Σ_{j=1}^{J} b_j x_ij);  i = 1,2,...,I.  (2)
The parameters a = (a_0, a_1, ..., a_J) and b = (b_0, b_1, ..., b_J) are the parameters associated with the log-linear relationships corresponding to the Weibull scale parameter θ_i and the shape parameter β_i, respectively, i = 1,2,...,I. Obviously, these parameters contribute to the model due to the involvement of covariates or stress variables in the study and the resulting Weibull model can be referred to as the accelerated Weibull model. More specifically, a_0 corresponds to the constant effect of the covariates on the scale parameter whereas the parameter a_j gives the effect of the covariate x_ij on the same, i = 1,2,...,I, j = 1,2,...,J. A similar interpretation can be given for the components of b associated with β_i. Moreover, the components of a and b are assumed to be real on their support, an assumption that appears justified as well.
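The log-linear links in (2) are straightforward to evaluate numerically. The following sketch, written by us for illustration with hypothetical coefficient values (none of the names or numbers are from the paper), maps a covariate vector to the Weibull scale and shape parameters:

```python
import numpy as np

def weibull_link(a, b, x):
    """Log-linear link of equation (2): given coefficient vectors
    a = (a_0, ..., a_J) and b = (b_0, ..., b_J) (intercept first) and a
    covariate vector x = (x_1, ..., x_J), return (theta, beta)."""
    theta = np.exp(a[0] + np.dot(a[1:], x))   # scale parameter
    beta = np.exp(b[0] + np.dot(b[1:], x))    # shape parameter
    return theta, beta

# Hypothetical single-covariate example: a = (2.0, -0.5), b = (0.1, 0.2).
theta, beta = weibull_link(np.array([2.0, -0.5]),
                           np.array([0.1, 0.2]),
                           np.array([1.5]))
# theta = exp(2.0 - 0.5*1.5) = exp(1.25), beta = exp(0.1 + 0.2*1.5) = exp(0.4)
```

The exponential form guarantees that both parameters stay positive for any real coefficients, which is why the components of a and b can be left unrestricted.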
To proceed further, it may be noted that in the experiment considered here, we do not observe the actual lifetime data; we only get the information on whether the actual lifetimes fall at or below the observation time (that is, t_ik ≤ T_i) or exceed it (that is, t_ik > T_i), i = 1,2,...,I and k = 1,2,...,K_i. The probabilities corresponding to these events under the assumption of the Weibull lifetime distribution can be given as
P_W(t_ik ≤ T_i) = 1 − exp{ −( T_i / exp(a_0 + Σ_{j=1}^{J} a_j x_ij) )^(exp(b_0 + Σ_{j=1}^{J} b_j x_ij)) },  (3)
and P_W(t_ik > T_i) = 1 − P_W(t_ik ≤ T_i). It may be further noted that the equality sign in (3) is used to avoid the discontinuity at time T_i. Of course, this does not make any difference as the t_ik; i = 1,2,...,I, k = 1,2,...,K_i, are continuous variates. The expression given in (3) is nothing but the cumulative distribution function (cdf) of the corresponding Weibull distribution evaluated at T_i, and its complementary probability P_W(t_ik > T_i) is the corresponding reliability or survival probability.
On the other hand, if we assume the lognormal distribution for the iid random variates t_ik; i = 1,2,...,I, k = 1,2,...,K_i, the corresponding pdf can be written as

f_LN(t_ik) = (1 / (√(2π) σ_i t_ik)) exp{ −(1 / (2σ_i²)) (log t_ik − μ_i)² };  t_ik > 0, −∞ < μ_i < ∞, σ_i > 0, ∀ i, k,  (4)
where exp(μ_i) and σ_i denote the scale and shape parameters, respectively, of the lognormal model for the ith experimental group, i = 1,2,...,I, and the script LN is used to distinguish the lognormal density from the Weibull density. Analogous to (2), the parameters of the lognormal distribution can be written as
μ_i = a′_0 + Σ_{j=1}^{J} a′_j x_ij  and  σ_i = exp(b′_0 + Σ_{j=1}^{J} b′_j x_ij);  i = 1,2,...,I,  (5)
where a′ = (a′_0, a′_1, ..., a′_J) and b′ = (b′_0, b′_1, ..., b′_J) are the parameters associated with the log-linear relationships corresponding to the lognormal parameters μ_i and σ_i, respectively, i = 1,2,...,I. A detailed interpretation of such parameters has already been given while discussing the Weibull distribution and, therefore, we presume that the components of a′ and b′ can be dealt with similarly. Obviously, substituting μ_i and σ_i from (5) into (4) results in the accelerated lognormal lifetime model.
The probability associated with the event t_ik ≤ T_i; i = 1,2,...,I, k = 1,2,...,K_i, under the assumption of the lognormal distribution can be written analogous to (3) as

P_LN(t_ik ≤ T_i) = Φ( (log T_i − a′_0 − Σ_{j=1}^{J} a′_j x_ij) / exp(b′_0 + Σ_{j=1}^{J} b′_j x_ij) ),  (6)

where Φ is the standard normal cdf, Φ(z) = ∫_{−∞}^{z} (1/√(2π)) exp(−u²/2) du. The other probability P_LN(t_ik > T_i) is simply 1 − P_LN(t_ik ≤ T_i).
Once the model formulation is done as given above, the likelihood function for the model parameters a and b based on the counts of binary outcomes n_i and r_i under the assumption of Weibull lifetimes can be written as
L_W(a, b | n, r, T, x) = ∏_{i=1}^{I} [P_W(t_ik ≤ T_i)]^(n_i) [P_W(t_ik > T_i)]^(r_i)
 = ∏_{i=1}^{I} [ 1 − exp{ −( T_i / exp(a_0 + Σ_{j=1}^{J} a_j x_ij) )^(exp(b_0 + Σ_{j=1}^{J} b_j x_ij)) } ]^(n_i) [ exp{ −( T_i / exp(a_0 + Σ_{j=1}^{J} a_j x_ij) )^(exp(b_0 + Σ_{j=1}^{J} b_j x_ij)) } ]^(r_i).  (7)
It may be noted that the binary outcomes n_i and r_i are observed at time T_i with corresponding covariates x_ij, j = 1,2,...,J, for the experimental groups i = 1,2,...,I. In (7), we have used the notations n for (n_1, n_2, ..., n_I), r for (r_1, r_2, ..., r_I), T for (T_1, T_2, ..., T_I) and x for (x_{.1}, x_{.2}, ..., x_{.J}) where x_{.j} = (x_1j, x_2j, ..., x_Ij)′ for j = 1,2,...,J.
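For numerical work it is usually the logarithm of (7) that is evaluated. A minimal sketch under the notation above (counts n_i and r_i, times T_i, and an I × J covariate matrix); the function and variable names are ours, not the authors':

```python
import numpy as np

def weibull_cs_loglik(a, b, n, r, T, x):
    """Log of likelihood (7) for current status data under the
    accelerated Weibull model.  a, b: coefficient vectors of length
    J+1 (intercept first); n, r: failure/survival counts per group;
    T: observation times; x: I x J covariate matrix."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    theta = np.exp(a[0] + x @ a[1:])     # group scale parameters, eq. (2)
    beta = np.exp(b[0] + x @ b[1:])      # group shape parameters, eq. (2)
    log_surv = -(T / theta) ** beta      # log P(t_ik > T_i)
    cdf = -np.expm1(log_surv)            # P(t_ik <= T_i), eq. (3)
    return np.sum(n * np.log(cdf) + r * log_surv)

# Toy check: one group, no covariate effect, theta = beta = 1, T = 1.
ll = weibull_cs_loglik([0.0, 0.0], [0.0, 0.0],
                       np.array([2]), np.array([3]),
                       np.array([1.0]), np.zeros((1, 1)))
```

Working on the log scale with `expm1` avoids the loss of precision that direct evaluation of (7) would suffer when the survival probabilities are close to zero or one.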
Analogous to (7), the likelihood function for the model parameters a′ and b′ under the assumption of lognormal lifetimes can be written as
L_LN(a′, b′ | n, r, T, x) = ∏_{i=1}^{I} [P_LN(t_ik ≤ T_i)]^(n_i) [P_LN(t_ik > T_i)]^(r_i)
 = ∏_{i=1}^{I} [ Φ( (log T_i − a′_0 − Σ_{j=1}^{J} a′_j x_ij) / exp(b′_0 + Σ_{j=1}^{J} b′_j x_ij) ) ]^(n_i) [ 1 − Φ( (log T_i − a′_0 − Σ_{j=1}^{J} a′_j x_ij) / exp(b′_0 + Σ_{j=1}^{J} b′_j x_ij) ) ]^(r_i),  (8)
where the various notations used in (8) are already defined while discussing Weibull based likelihood.
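The lognormal counterpart of the log-likelihood computation is similar, with the standard normal cdf Φ of (6) in place of the Weibull cdf. A sketch assuming SciPy is available; all names are illustrative:

```python
import numpy as np
from scipy.stats import norm

def lognormal_cs_loglik(a_p, b_p, n, r, T, x):
    """Log of likelihood (8) for current status data under the
    accelerated lognormal model; a_p, b_p play the roles of a', b'."""
    a_p, b_p = np.asarray(a_p, float), np.asarray(b_p, float)
    mu = a_p[0] + x @ a_p[1:]                 # group locations, eq. (5)
    sigma = np.exp(b_p[0] + x @ b_p[1:])      # group scales, eq. (5)
    z = (np.log(T) - mu) / sigma
    cdf = norm.cdf(z)                         # P(t_ik <= T_i), eq. (6)
    return np.sum(n * np.log(cdf) + r * np.log1p(-cdf))

# Toy check: one group, mu = 0, sigma = 1, T = 1, so z = 0 and each
# observed unit contributes log(0.5).
ll = lognormal_cs_loglik([0.0, 0.0], [0.0, 0.0],
                         np.array([1]), np.array([1]),
                         np.array([1.0]), np.zeros((1, 1)))
```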
2.1 Missing data case
To formalise the missing data situation informally introduced in Section 1, let us assume that m_i is the observed number of missing units out of the K_i experimental units tested in the ith experimental group. As mentioned earlier, these K_i units are scheduled to be observed at time T_i for their status (failed or surviving) when operated under J different types of covariates or stresses x_ij, i = 1,2,...,I, j = 1,2,...,J (see Table 1), but the m_i missing units are, of course, not observable. Obviously, the m_i missing units would consist of two different kinds of units, failed or surviving, had they continued on experimentation and not been found missing. Suppose m′_i ((m_i − m′_i)) corresponds to the number of failed (surviving) units out of the m_i missing units, where obviously m′_i is unknown and so is (m_i − m′_i), i = 1,2,...,I. Let n′_i and r′_i = (K_i − n′_i − m_i) denote the observed number of failures (count of binary zeros) and survivals (count of binary ones), respectively, out of the K_i experimental units when tested in the ith experimental group and observed at the time T_i. As usual, the covariates or stresses x_ij, i = 1,2,...,I, j = 1,2,...,J, are assumed to have their effects in the missing data case as well.
Now the likelihood function for the model parameters a and b based on the observed counts n′_i, r′_i and m_i under the assumption of Weibull lifetimes can be written as

L_Wm(a, b | n′, r′, m, T, x) = ∏_{i=1}^{I} [P_W(t_ik ≤ T_i)]^(n′_i) [P_W(t_ik > T_i)]^(r′_i) × [P_W(t_ik ≤ T_i)]^(m′_i) [P_W(t_ik > T_i)]^(m_i − m′_i),  (9)

where the counts n′_i, r′_i and m_i are observed at the time T_i when the items under test are exposed to covariates or stresses x_ij; j = 1,2,...,J, for the experimental groups i = 1,2,...,I. In (9), we have used the notations n′ for (n′_1, n′_2, ..., n′_I), r′ for (r′_1, r′_2, ..., r′_I), m for (m_1, m_2, ..., m_I), and the script m with W stands for the Weibull likelihood corresponding to the missing data case.

Similarly, if we use the same notations as described for the Weibull case, the likelihood function for the model parameters a′ and b′ under the assumption of lognormal lifetimes and the missing data situation can be written as

L_LNm(a′, b′ | n′, r′, m, T, x) = ∏_{i=1}^{I} [P_LN(t_ik ≤ T_i)]^(n′_i) [P_LN(t_ik > T_i)]^(r′_i) × [P_LN(t_ik ≤ T_i)]^(m′_i) [P_LN(t_ik > T_i)]^(m_i − m′_i),  (10)

where as usual the script m with LN stands for the lognormal likelihood corresponding to the missing data case.

The likelihoods given in (9) and (10) are incompletely specified in the sense that they involve the unknown components m′ = (m′_1, m′_2, ..., m′_I), but this is not a deterrent issue with regard to Bayesian implementation if attempted using sample based approaches (see, for example, Sharma and Upadhyay (2018a)). For this, we only need to generate the binary response data corresponding to the observed missing units and this can be easily done by generating iid Bernoulli variates with success probabilities P_W(t_ik > T_i) and P_LN(t_ik > T_i) for the accelerated Weibull and lognormal lifetimes, respectively. Once the binary response data corresponding to the missing units are made available, the availability of the unknown component m′ is obvious (see Sharma and Upadhyay (2018a)). It may be noted that the situation may not be straightforward if tried using the tools of the classical paradigm. The details of the Bayesian implementation for the missing data case will be discussed in the next section.

3 Bayesian Model Formulation

As mentioned, the parameters a_j and b_j; j = 0,1,...,J, associated with the accelerated Weibull model are assumed to be real on their support and, therefore, we consider normal priors for these model parameters given as

g(a_j | μ_aj, σ_aj) = (1 / (√(2π) σ_aj)) exp{ −(1/2) ((a_j − μ_aj)/σ_aj)² };  j = 0,1,...,J,  (11)

g(b_j | μ_bj, σ_bj) = (1 / (√(2π) σ_bj)) exp{ −(1/2) ((b_j − μ_bj)/σ_bj)² };  j = 0,1,...,J,  (12)
where (μ_aj, σ_aj) and (μ_bj, σ_bj) are the hyperparameters associated with the prior distributions of a_j and b_j; j = 0,1,...,J, respectively. Now assuming that the parameters a_j and b_j; j = 0,1,...,J, are a priori independent, the joint prior distribution can be written as
g(a, b | μ_a, σ_a, μ_b, σ_b) = ∏_{j=0}^{J} g(a_j | μ_aj, σ_aj) × g(b_j | μ_bj, σ_bj),  (13)

where μ_a = (μ_a0, μ_a1, ..., μ_aJ), σ_a = (σ_a0, σ_a1, ..., σ_aJ), μ_b = (μ_b0, μ_b1, ..., μ_bJ) and σ_b = (σ_b0, σ_b1, ..., σ_bJ).
For successful implementation of Bayesian techniques, the choice of hyperparameters becomes a significant issue. In case the experimenter fails to have enough information to define an appropriate prior distribution, it is often recommended to go for choices that make the prior distributions essentially vague. Alternatively, one can attempt eliciting the prior hyperparameters based on the subjective opinion of experts if the same is made available. To clarify, suppose the expert suggests that the parameters a_j and b_j are bounded within the finite intervals [l_aj, u_aj] and [l_bj, u_bj], respectively, j = 0,1,...,J, and, due to the non-availability of any other significant information, it is assumed that each value within the intervals is equally likely (see Bousquet et al. (2006)). Thus, presuming a_j and b_j to be uniformly distributed in the intervals [l_aj, u_aj] and [l_bj, u_bj], respectively, j = 0,1,...,J, one can equate the means and variances of the assumed normal priors with those of the assumed uniform distributions over the corresponding intervals. As a result, the hyperparameters (μ_aj, σ_aj) and (μ_bj, σ_bj) associated with the prior distributions of a_j and b_j; j = 0,1,...,J, can be obtained as
μ_aj = (l_aj + u_aj) / 2,  σ²_aj = (u_aj − l_aj)² / 12;  j = 0,1,...,J,  (14)

and

μ_bj = (l_bj + u_bj) / 2,  σ²_bj = (u_bj − l_bj)² / 12;  j = 0,1,...,J.  (15)
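Numerically the moment matching in (14)-(15) is trivial; the snippet below, with hypothetical expert bounds of our own choosing, returns the normal hyperparameters for a given interval:

```python
def normal_hyperparams(lower, upper):
    """Equate the normal prior's mean and variance to those of a
    Uniform(lower, upper) distribution, as in equations (14)-(15)."""
    mean = (lower + upper) / 2.0
    variance = (upper - lower) ** 2 / 12.0
    return mean, variance

# Hypothetical expert interval [-2, 4] for some coefficient a_j:
mu_aj, var_aj = normal_hyperparams(-2.0, 4.0)   # mean 1.0, variance 3.0
```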
The prior information formalized in (13) gets updated through the likelihood (7) to give the joint posterior, which, up to proportionality under the accelerated Weibull distribution, is given as
p_W(a, b | μ_a, σ_a, μ_b, σ_b, n, r, T, x)
 ∝ ∏_{i=1}^{I} [ 1 − exp{ −( T_i / exp(a_0 + Σ_{j=1}^{J} a_j x_ij) )^(exp(b_0 + Σ_{j=1}^{J} b_j x_ij)) } ]^(n_i) [ exp{ −( T_i / exp(a_0 + Σ_{j=1}^{J} a_j x_ij) )^(exp(b_0 + Σ_{j=1}^{J} b_j x_ij)) } ]^(r_i)
 × ∏_{j=0}^{J} (1/σ_aj) exp{ −(1/2) ((a_j − μ_aj)/σ_aj)² } (1/σ_bj) exp{ −(1/2) ((b_j − μ_bj)/σ_bj)² }.  (16)
One can similarly obtain the joint posterior distribution, up to proportionality, under the assumption of accelerated lognormal distribution and the same can be given as
p_LN(a′, b′ | μ_a′, σ_a′, μ_b′, σ_b′, n, r, T, x)
 ∝ ∏_{i=1}^{I} [ Φ( (log T_i − a′_0 − Σ_{j=1}^{J} a′_j x_ij) / exp(b′_0 + Σ_{j=1}^{J} b′_j x_ij) ) ]^(n_i) [ 1 − Φ( (log T_i − a′_0 − Σ_{j=1}^{J} a′_j x_ij) / exp(b′_0 + Σ_{j=1}^{J} b′_j x_ij) ) ]^(r_i)
 × ∏_{j=0}^{J} (1/σ_a′j) exp{ −(1/2) ((a′_j − μ_a′j)/σ_a′j)² } (1/σ_b′j) exp{ −(1/2) ((b′_j − μ_b′j)/σ_b′j)² },  (17)

where the prior distributions for the parameters a′_j and b′_j; j = 0,1,...,J, are formed in a manner similar to those for the accelerated Weibull parameters but with hyperparameters (μ_a′j, σ_a′j) and (μ_b′j, σ_b′j), respectively. Likewise, we define μ_a′ = (μ_a′0, μ_a′1, ..., μ_a′J), σ_a′ = (σ_a′0, σ_a′1, ..., σ_a′J), μ_b′ = (μ_b′0, μ_b′1, ..., μ_b′J) and σ_b′ = (σ_b′0, σ_b′1, ..., σ_b′J).
Neither (16) nor (17) appears to offer a closed form solution. Therefore, we propose to consider sample based approaches to get the desired posterior based inferences (see, for example, Upadhyay et al. (2001) and the references cited therein). No doubt, the sample based approaches to Bayesian computation are beneficial in the sense that they are automatic and simultaneously capable of providing a variety of inferential aspects in a routine manner. The commonly used sample based approaches are the Metropolis and the Gibbs sampler algorithms, where the former is a more generalised version in the sense that it also includes the latter as a special case. The implementation of the Gibbs sampler algorithm requires the specification of full conditional distributions for all the generating variates and simultaneously necessitates that these full conditionals permit easy generation of samples. On the other hand, the Metropolis algorithm does not require any such specification of full conditionals; rather it seeks an alternative randomization mechanism for sample generation, often in a multidimensional framework.
The joint posteriors given in (16) and (17) are ((J + 1) + (J + 1)) dimensional distributions. If we look carefully at the various associated full conditionals, it is obvious that they are not easy from the viewpoint of sample generation and, therefore, the Metropolis algorithm appears to be an obvious choice for simulating samples from the joint posteriors (16) and (17). The Metropolis algorithm is a Markovian updating scheme that uses a suitably chosen candidate generating density, say q(w, v), for sample generation, where w is the current state of the chain at the sth stage and v is the candidate for the (s + 1)th stage. If the chosen kernel is symmetric, each value generated from q(w, v) is subjected to a randomization step based on the probability of acceptance α(w, v) = min{1, p(v)/p(w)}, where p(·) is the corresponding posterior distribution of the possibly vector valued parameter. If v is accepted with probability α(w, v), it is retained as the next state of the chain; otherwise w is retained. The commonly used candidate generating densities are the multivariate normal, rectangular and t distributions, etc. (see also Upadhyay et al. (2001)). One can also use a non-symmetric candidate generating density and define a generalized version, known as the Metropolis-Hastings algorithm. We shall not discuss this version as it is beyond the scope of the present work.
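The scheme just described can be sketched generically. The code below is a plain random-walk Metropolis sampler with a symmetric multivariate normal candidate density, written by us for illustration (it is not the authors' implementation); `log_post` stands for the log of a posterior kernel such as (16) or (17):

```python
import numpy as np

def metropolis(log_post, init, cov, n_iter, seed=0):
    """Random-walk Metropolis: propose v ~ N(w, cov) and accept with
    probability min{1, p(v)/p(w)}, computed on the log scale."""
    rng = np.random.default_rng(seed)
    w = np.asarray(init, float)
    lp_w = log_post(w)
    chain = np.empty((n_iter, w.size))
    for s in range(n_iter):
        v = rng.multivariate_normal(w, cov)      # symmetric candidate
        lp_v = log_post(v)
        if np.log(rng.uniform()) < lp_v - lp_w:  # randomization step
            w, lp_w = v, lp_v                    # accept v
        chain[s] = w                             # else retain w
    return chain

# Toy target: a standard normal posterior kernel.
chain = metropolis(lambda t: -0.5 * float(np.sum(t ** 2)),
                   init=[0.0], cov=np.eye(1), n_iter=2000)
```

Comparing log p(v) − log p(w) with the log of a uniform draw is the numerically stable way of applying the acceptance probability min{1, p(v)/p(w)}.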
For the implementation of the algorithm on the posteriors under consideration, we consider a ((J + 1) + (J + 1)) dimensional multivariate normal distribution as the candidate generating density. The multivariate normal distribution is chosen because it can be easily specified by its mean vector and covariance matrix, approximate values of which can be obtained from the ML estimates and the corresponding Hessian based approximation evaluated at those estimates. The ML estimates and the corresponding Hessian based approximations for the models under consideration cannot be obtained directly; rather one requires an efficient optimisation technique. Balakrishnan and Ling (2013) is an important reference that considers the expectation maximization (EM) algorithm for current status data under the assumed lifetime models. We, however, consider using the simulated annealing (SA) optimisation technique (see Robert and Casella (2010)) to obtain the ML estimates of the model parameters for both the accelerated Weibull and the accelerated lognormal distributions and accordingly obtain the Hessian based approximations at the corresponding ML estimates. Thus, using the initial values in the form of ML estimates and the corresponding Hessian based approximations, the implementation of the Metropolis algorithm can proceed routinely through successive generations from (16) and (17) to get an iterative sequence of states. This sequence, after a long run, converges in distribution to a random sample from the true posterior distribution. The process is to be implemented separately on the two posteriors (see also Robert and Casella (2013) and Smith and Roberts (1993)).
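SciPy's `dual_annealing` routine is one readily available simulated-annealing style optimiser that could stand in for the SA step described above (we do not know which implementation the authors used). For brevity, a simple surrogate objective replaces the negative log-likelihood here; in practice one would minimise the negative log of (7) or (8):

```python
import numpy as np
from scipy.optimize import dual_annealing

# Surrogate for the negative log-likelihood, minimised at (1.0, -0.5);
# the bounds play the role of the parameter search region.
def neg_loglik(params):
    return float(np.sum((params - np.array([1.0, -0.5])) ** 2))

res = dual_annealing(neg_loglik, bounds=[(-5.0, 5.0), (-5.0, 5.0)], seed=0)
# res.x approximates the "ML estimate"; a Hessian approximation at res.x
# would then calibrate the Metropolis candidate covariance matrix.
```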
Next, considering the aforesaid missing data situation, the joint posterior can be obtained by updating the prior distribution (13) with the likelihood (9) under the accelerated Weibull distribution. Let p_Wm(a, b | μ_a, σ_a, μ_b, σ_b, n′, r′, m, T, x) denote the posterior in this case. Obviously, this posterior does not behave similarly to that for the non-missing case given in (16) due to the involvement of the unknown components m′. The situation can, however, be easily dealt with by constructing a cycle of the Gibbs sampler algorithm for these additional unknowns. It may be noted here that the Gibbs sampler algorithm is also a Markovian updating scheme that requires successive generation from each of its full conditional distributions using the most recent values of the rest of the conditioning variates, the algorithm ultimately proceeding in a cyclic manner (see also Sharma and Upadhyay (2018a)). To implement the algorithm in the present case, we need to form two full conditional distributions. One of these full conditionals focuses on the generation of the counts m′ given all other variates (a, b, μ_a, σ_a, μ_b, σ_b, n′, r′, m, T, x). The other full conditional is the joint full conditional of the model parameters (a, b) given all other variates including m′. The generation of the counts m′ (of binary zeros) may be done by Bernoulli variate generation with success probability P_W(t_ik > T_i), conditioned on the recently generated values of the model parameters. The generation from the other full conditional can be achieved by implementation of the Metropolis algorithm that has already been discussed for the non-missing case. As a result, the algorithm can routinely proceed to get the desired posterior samples for the missing data case as well. The situation can be similarly detailed for the posterior arising from the accelerated lognormal distribution involving missing data.
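To make the cycle concrete, the following sketch implements it for a single experimental group under the Weibull model with no covariates (parameters a_0 and b_0 only, with diffuse normal priors); the full scheme simply repeats this over groups and coefficients. Everything here is an illustrative simplification written by us, not the authors' code:

```python
import numpy as np

def gibbs_missing(n1, r1, m, T, n_iter, prior_sd=10.0, step=0.3, seed=0):
    """Gibbs cycle for the missing data case, single group, no covariates:
    (i) impute the failure/survival split of the m missing units via a
        Binomial draw with survival probability P(t > T);
    (ii) update (a0, b0) by one random-walk Metropolis step given the
        completed counts."""
    rng = np.random.default_rng(seed)

    def log_surv(a0, b0):
        return -(T / np.exp(a0)) ** np.exp(b0)     # log P(t > T)

    def log_post(a0, b0, n_fail, n_surv):
        ls = log_surv(a0, b0)
        lf = np.log(-np.expm1(ls))                 # log P(t <= T)
        log_prior = -(a0 ** 2 + b0 ** 2) / (2.0 * prior_sd ** 2)
        return n_fail * lf + n_surv * ls + log_prior

    a0, b0 = 0.0, 0.0
    draws = np.empty((n_iter, 2))
    for s in range(n_iter):
        # step (i): number of surviving units among the m missing ones
        m_surv = rng.binomial(m, np.exp(log_surv(a0, b0)))
        n_fail, n_surv = n1 + (m - m_surv), r1 + m_surv
        # step (ii): one Metropolis update of (a0, b0)
        cand = np.array([a0, b0]) + step * rng.standard_normal(2)
        if (np.log(rng.uniform()) <
                log_post(cand[0], cand[1], n_fail, n_surv)
                - log_post(a0, b0, n_fail, n_surv)):
            a0, b0 = cand
        draws[s] = a0, b0
    return draws

# Hypothetical counts: 12 failures, 8 survivals, 5 missing, T = 1.
draws = gibbs_missing(n1=12, r1=8, m=5, T=1.0, n_iter=500)
```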
4 Bayesian Model Selection
A number of model comparison tools are available in the literature. The logically appealing tools among these are those which take into account both goodness of fit and model complexity while choosing a model. The goodness of fit can obviously be summarized on the basis of a deviance statistic, whereas the model complexity can be based on the number of free parameters in the model. A few such tools are the Akaike information criterion (AIC), the Bayesian information criterion (BIC), the DIC and the posterior predictive loss (PPL) criterion. We, however, restrict ourselves to two of the most exploited and well justified criteria, namely the DIC and the EPPL, where the latter is nothing but the PPL with a slight change in definition and nomenclature.
4.1 The deviance information criterion
Spiegelhalter et al. (2002) proposed a measure based on the posterior distribution of the deviance that accounts for both a Bayesian measure of fit and model complexity. This measure, known as the DIC, consists of two terms: the first gives the measure of fit and the second measures the complexity involved in the entertained model. The DIC can be defined as

DIC = E[−2 log L(η | ·)] + p_D,   (18)

where p_D = E[−2 log L(η | ·)] + 2 log L(η̂ | ·), η is the vector of parameters and η̂ is the corresponding posterior estimate, usually taken as the posterior mean although the posterior mode has also been recommended (see, for example, Upadhyay et al. (2013)). The term −2 log L(η | ·), considered as a function of η, is the classical deviance and serves as the measure of fit in the classical modelling framework. In this light, the first term of (18), which is nothing but the posterior mean of the classical deviance, gives the measure of fit. The second term p_D, known as the effective number of parameters, measures the complexity involved in the modelling. The model with the least DIC value is preferred over all other models.
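From simulated posterior samples, (18) can be evaluated directly; the sketch below is illustrative, with `log_lik` a placeholder for the model's log-likelihood, and uses the posterior mean for η̂:

```python
import numpy as np

def dic(log_lik, samples):
    """DIC from posterior samples: Dbar + pD, with pD = Dbar - D(eta_hat).

    log_lik : function eta -> log L(eta | data)
    samples : posterior draws, one row per draw
    """
    dev = np.array([-2.0 * log_lik(eta) for eta in samples])
    dbar = dev.mean()                            # posterior mean of the deviance
    dhat = -2.0 * log_lik(samples.mean(axis=0))  # deviance at the posterior mean
    pd = dbar - dhat                             # effective number of parameters
    return dbar + pd, pd
```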
4.2 The expected posterior predictive loss
The EPPL criterion was initially developed by Buck and Sahu (2000) for multinomial cell frequencies and later derived by Sharma and Upadhyay (2018b) for binomial counts. We shall use the criterion as discussed by Sharma and Upadhyay (2018b), keeping in view the dataset given in Table 1. To describe the criterion briefly, let r_i and r_i^p denote the number of successes corresponding to the observed and the predicted future observations, respectively, out of K_i experimental units in the ith experimental group, where i = 1, 2, ..., I. The EPPL criterion derived in Sharma and Upadhyay (2018b) can be rewritten as
E[L(r, r^p)] = 2 Σ_{i=1}^{I} [ r_i {log(r_i/K_i) − log(p̂_i)} + (K_i − r_i) {log((K_i − r_i)/K_i) − log(1 − p̂_i)} ]

+ 2 Σ_{i=1}^{I} [ r_i {log(p̂_i) − E[log(r_i^p/K_i)]} + (K_i − r_i) {log(1 − p̂_i) − E[log((K_i − r_i^p)/K_i)]} ],   (19)

where p̂_i = E(r_i^p/K_i) and L(r, r^p) is the aggregated loss function over the components of r and r^p. For the ith experimental group, it is given by

L(r_i, r_i^p) = 2 [ r_i log(r_i/r_i^p) + (K_i − r_i) log((K_i − r_i)/(K_i − r_i^p)) ];  i = 1, 2, ..., I.   (20)
It is to be noted here that the loss function (20) is derived under the binomial modelling assumption in each of the ith experimental groups for r_i (r_i^p) successes corresponding to the observed (predictive) dataset in K_i Bernoulli trials (see Sharma and Upadhyay (2018b)), where i = 1, 2, ..., I. A careful look at (19) reveals a problem when we observe exactly zero or K_i counts, i = 1, 2, ..., I. This problem can be resolved by adding (subtracting) 1/2 when we observe exactly zero (K_i) counts in the ith experimental group, i = 1, 2, ..., I. Moreover, in (19), the expectation is taken with respect to the posterior predictive distribution of the future observation r_i^p, i = 1, 2, ..., I, given the observed data. One may note that, for the situation considered here, the posterior predictive distribution is not available in analytically closed form, although it is manageable with the help of simulated posterior samples. For instance, we can easily generate predictive samples from the considered distributions once the corresponding posterior samples are available, and hence the expectation in (19) can be obtained accordingly.
The EPPL is a decision-theoretic measure for model comparison that comprises two types of losses. The first is the loss due to fitting (LDF), also known as the goodness of fit term, given in terms of the likelihood ratio statistic (see the first term on the RHS of (19)). The second is the loss incurred due to the complexity (LDC) of the model, which may also be used to approximate the predictive variance (see the second term on the RHS of (19)). Finally, the criterion recommends the model that has the least value of EPPL among all the entertained models.
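As an illustrative sketch (not the authors' code), (19) can be evaluated from the observed counts and the simulated predictive samples, with the 1/2 adjustment applied to counts of exactly zero or K_i:

```python
import numpy as np

def adjust(x, K):
    # shift counts of exactly 0 (K) by +1/2 (-1/2) to keep the logs finite
    x = np.asarray(x, float).copy()
    x[x == 0] += 0.5
    return np.where(x == K, x - 0.5, x)

def eppl(r, K, r_pred):
    """EPPL of (19): r observed successes (I,), K trials (I,),
    r_pred posterior predictive draws (S, I)."""
    r, rp = adjust(r, K), adjust(r_pred, K)
    p_hat = (rp / K).mean(axis=0)              # p_hat_i = E(r_i^p / K_i)
    ldf = 2 * np.sum(r * (np.log(r / K) - np.log(p_hat))
                     + (K - r) * (np.log((K - r) / K) - np.log(1 - p_hat)))
    ldc = 2 * np.sum(r * (np.log(p_hat) - np.log(rp / K).mean(axis=0))
                     + (K - r) * (np.log(1 - p_hat) - np.log((K - rp) / K).mean(axis=0)))
    return ldf + ldc, ldf, ldc
```

The two sums returned separately correspond to the LDF and LDC terms discussed above.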
5 Numerical Illustration
The numerical illustration is based on a real dataset obtained from a survival/sacrifice experiment conducted at the National Center for Toxicological Research (NCTR). The subjects involved in the experiment were male and female mice from two strains of offspring. The first strain, F1, consisted of offspring from the mating of BALB/c males to C57BL/6 females. The second strain, F2, consisted of offspring from non brother-sister mating of the F1 progeny. The considered dataset is taken from Balakrishnan and Ling (2013), which is a compact form of the original dataset reported in Kodell and Nelson (1980).
The dataset consists of 823 male and female mice from the two strains of offspring, classified into I = 51 experimental groups with a varying number of mice in each group. The mice in the different groups were subjected to four different doses, 60 ppm, 120 ppm, 200 ppm and 400 ppm, of the chemical benzidine dihydrochloride, which is responsible for developing tumours in mice (ppm stands for parts per million). The number of mice with tumours, say n_1, n_2, ..., n_51, was then recorded in the different groups at times T_1, T_2, ..., T_51, respectively. Besides, we also recorded the number of mice without tumours, say r_1, r_2, ..., r_51, in each group. Obviously, the mice with (without) tumours correspond to the observed number of failures (survivals) at the time of observation. One may also note that the experiment involves three covariates, namely the two strains of offspring (say, F1=0 and F2=1), the sex of the mice (say, F=0 for females and M=1 for males) and the dose of the chemical (60 ppm, 120 ppm, 200 ppm and 400 ppm). The distribution of mice according to these three covariates is reported in Table 2.
Let us now consider the missing data situation in the considered dataset and assume that some of the mice are found missing when observed at the different times T_1, T_2, ..., T_51. In this case, we therefore observe three different categories, that is, the number of mice with tumours (n'_i), the number of mice without tumours (r'_i) and the number of mice found missing (m_i) at time T_i in the ith experimental group, where i = 1, 2, ..., I. It may be noted that the number of mice in the ith experimental group, K_i, is the same as that for the non-missing case. Also, the mice in each group are subjected to the same covariates as mentioned for the non-missing case. The corresponding dataset is shown in Table 2, where the numbers of missing observations are taken hypothetically for illustration.
Table 2: Observed number of failures n_i (n'_i), survivals r_i (r'_i) and missing units (m_i) at time T_i in the presence of covariates x_i1 (F1=0, F2=1), x_i2 (F=0, M=1) and x_i3 (dose of chemical in ppm), for the ith experimental group, i = 1, 2, ..., I (the values in parentheses correspond to missing data cases)
Experimental group Number of experimental units Observation time (in months) Number of failures Number of survivals Number of missing units Covariates
i K_i T_i n_i(n'_i) r_i(r'_i) (m_i) x_i1 x_i2 x_i3
1 48 9.27 1 (1) 47 (41) (6) 0 0 60
2 24 9.37 0 (0) 24 (20) (4) 0 0 60
3 24 13.97 1 (1) 23 (21) (2) 0 0 60
4 24 9.37 0 (0) 24 (21) (3) 0 0 120
5 24 13.97 5 (3) 19 (16) (5) 0 0 120
6 23 14.03 9 (7) 14 (14) (2) 0 0 120
7 26 18.67 25 (20) 1 (3) (3) 0 0 120
8 48 9.27 0 (0) 48 (40) (8) 0 1 120
9 44 14.00 7 (6) 37 (33) (5) 0 1 120
10 22 18.73 7 (5) 15 (14) (3) 0 1 120
11 20 19.30 4 (4) 16 (14) (2) 0 1 120
12 24 9.27 0 (0) 24 (22) (2) 1 1 60
13 23 9.30 0 (0) 23 (20) (3) 1 1 60
14 21 9.37 0 (0) 21 (19) (2) 1 1 60
15 44 14.00 3 (3) 41 (34) (7) 1 1 60
16 18 18.67 2 (2) 16 (15) (1) 1 1 60
17 20 18.70 2 (2) 18 (16) (2) 1 1 60
18 1 16.53 1 (1) 0 (0) (0) 0 0 120
19 1 16.57 1 (1) 0 (0) (0) 0 0 120
20 1 16.90 1 (1) 0 (0) (0) 0 0 120
21 1 15.13 1 (1) 0 (0) (0) 0 0 120
22 1 15.40 1 (1) 0 (0) (0) 0 0 120
23 47 9.33 4 (4) 43 (38) (5) 0 0 200
24 45 14.00 38 (32) 7 (10) (3) 0 0 200
25 22 14.00 11 (8) 11 (13) (1) 0 1 400
26 15 18.70 11 (7) 4 (6) (2) 0 1 400
27 1 7.87 1 (1) 0 (0) (0) 0 1 400
28 1 14.73 1 (1) 0 (0) (0) 0 1 400
29 18 18.70 5 (5) 13 (12) (1) 1 1 120
30 1 9.57 1 (1) 0 (0) (0) 1 1 120
31 1 14.43 1 (1) 0 (0) (0) 1 1 120
32 1 17.87 1 (1) 0 (0) (0) 1 1 120
33 1 18.03 1 (1) 0 (0) (0) 1 1 120
34 1 5.13 0 (0) 1 (1) (0) 1 1 120
35 1 13.53 1 (1) 0 (0) (0) 0 0 200
36 1 14.03 1 (1) 0 (0) (0) 0 0 200
37 1 14.23 1 (1) 0 (0) (0) 0 0 200
38 1 18.67 1 (1) 0 (0) (0) 0 0 200
39 24 9.33 16 (11) 8 (10) (3) 0 0 400
40 10 14.00 9 (7) 1 (2) (1) 0 0 400
41 1 9.87 1 (1) 0 (0) (0) 0 0 400
42 1 17.13 0 (0) 1 (1) (0) 1 0 60
43 24 9.27 2 (2) 22 (19) (3) 1 0 120
44 22 9.37 0 (0) 22 (20) (2) 1 0 120
45 41 14.00 15 (12) 26 (25) (4) 1 0 120
46 1 15.43 1 (1) 0 (0) (0) 1 1 200
47 24 9.30 1 (1) 23 (20) (3) 1 1 400
48 21 14.00 4 (4) 17 (16) (1) 1 1 400
49 12 18.67 6 (5) 6 (6) (1) 1 1 400
50 1 11.90 1 (1) 0 (0) (0) 1 1 400
51 1 14.77 1 (1) 0 (0) (0) 1 1 400
Now, going back to Section 2, we can easily see that the present dataset involves J = 3 covariates and, therefore, the total number of associated parameters is ((J + 1) + (J + 1)) = 8. Out of these eight parameters, the first four, that is, a0, a1, a2, a3, are associated with the Weibull scale parameter and the remaining four, that is, b0, b1, b2, b3, are associated with the Weibull shape parameter. Similarly, when one considers the analysis of the lognormal distribution, the associated parameters are a'0, a'1, a'2, a'3 and b'0, b'1, b'2, b'3, respectively, for its scale and shape parameters. One may also note that the parameters a1, a2, a3 and a'1, a'2, a'3 affecting the scale parameters of the corresponding distributions are due to the covariates strain of offspring, sex of mice and dose of the chemical, respectively. Similarly, the parameters b1, b2, b3 and b'1, b'2, b'3 affecting the shape parameters of the corresponding distributions enter into the modelling formulation due to the three covariates entertained in the analysis.
To implement the Bayesian modelling formulation given in Section 3, we first begin with the exact specification of the considered prior distributions for the model parameters of the two entertained models. The prior distribution given in (13) was evaluated based on the interval [-25, 25] specified by the experts for a_j and b_j, j = 0, 1, 2, 3, under the assumption of the accelerated Weibull distribution. The hyperparameters (μ_aj, σ_aj) and (μ_bj, σ_bj) associated with a_j and b_j then follow from (14) and (15) for all j. The prior distributions associated with the accelerated lognormal parameters were managed in an identical manner.
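Purely as an illustration of how such an elicitation might be coded (the actual forms are those given in (14) and (15)), one could map the expert interval to normal hyperparameters by taking the midpoint as the prior mean and assuming the half-width covers k prior standard deviations; the choice k = 3 below is an assumption, not the paper's rule:

```python
def normal_prior_from_interval(lo, hi, k=3.0):
    """Illustrative mapping of an expert interval [lo, hi] to (mu, sigma).

    Assumes (hypothetically) that the midpoint is the prior mean and that
    the half-width equals k prior standard deviations.
    """
    mu = 0.5 * (lo + hi)
    sigma = (hi - lo) / (2.0 * k)
    return mu, sigma
```

For the interval [-25, 25], this mapping gives mu = 0 and sigma = 25/3, a rather diffuse prior.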
We next considered simulation of posterior samples from (16) and (17) using the Metropolis algorithm as per the details given in Section 3, considering a single long run of the chain in each case. For the implementation of the Metropolis algorithm, we considered an eight-dimensional normal distribution with mean vector η and covariance matrix Σ as the candidate generating density. We, however, used a scaling constant c_s = 0.5 with Σ to minimize the number of rejections of values generated from the candidate generating density (see also Upadhyay et al. (2001)). As initial values for starting the chain, we used the vector η̂ of ML estimates of η, with η = (a, b) for the posterior based on the accelerated Weibull distribution and η = (a', b') for the posterior based on the accelerated lognormal distribution. The ML estimates were obtained using the likelihood functions given in (7) and (8) corresponding to the Weibull and lognormal based accelerated models, respectively. Similarly, the initial value for Σ was taken as Σ̂ in each case, where Σ̂ is the Hessian based approximation corresponding to (7) and (8) evaluated at the corresponding ML estimates η̂. The ML estimates of the model parameters are also reported in Tables 3 and 4 for the accelerated Weibull and the accelerated lognormal distributions, respectively.
Once the simulated chain started producing values, we monitored convergence based on ergodic averages. Convergence was achieved at about 50K iterations for both models. Once convergence was achieved, we took a random sample of size 4K separately from the two posteriors by choosing every 10th observation. This gap among the generated variates made the serial correlations almost negligible. We thus finally obtained samples from the joint posteriors (16) and (17), each component of which forms a sample from the corresponding marginal posterior. Once these samples are obtained, the desired sample based posterior inferences can be easily drawn (see also Upadhyay et al. (2001)).
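The burn-in, convergence monitoring via ergodic averages and thinning steps can be sketched as follows (illustrative helper functions, not the authors' code):

```python
import numpy as np

def ergodic_averages(chain):
    # running means of the chain; convergence is judged by their stabilization
    return np.cumsum(chain, axis=0) / np.arange(1, len(chain) + 1)[:, None]

def thin(chain, burn=50_000, gap=10, size=4_000):
    # discard the burn-in, then keep every gap-th draw to damp serial correlation
    return chain[burn::gap][:size]
```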
The simulation of posterior samples corresponding to the missing data situation is a routine extension of the non-missing case described above. As mentioned in Section 3 (see also subsection 2.1), we need to create a hybrid scheme using the idea inherent in the Gibbs sampler algorithm, where the corresponding full conditional distributions can be managed as discussed in Section 3 (see also Sharma and Upadhyay (2018a)).
The estimated posterior modes for the various parameters in the two models and the corresponding highest posterior density (HPD) intervals with coverage probability 0.95 are reported in Tables 3-4, where the values in parentheses correspond to the missing data cases. It can be seen that the estimated posterior modes are, in general, close to the corresponding ML estimates, a conclusion that is normally appreciated by classical statisticians. Besides, it conveys the fact that the subjective opinion provided by the experts does not lead to strong prior distributions and the inferences are mostly data dependent. We also observe that the estimates obtained under the missing data cases are, in general, close to those corresponding to the non-missing cases for both the models. This, in turn, ensures that we do not lose much if some of the observations go missing during the experimentation. Among other important conclusions, it can be seen that the estimated 0.95 HPD intervals are mostly narrow, reflecting small variability among the different variates. We do not go into the details of other conclusions, although these can be easily drawn once the posterior samples are made available.
Table 3: ML estimates and estimated posterior characteristics for the parameters of accelerated Weibull distribution (the values in parentheses correspond to missing data case)
Parameter ML estimate Posterior mode 0.95 HPD interval
a0 2.944 2.950 2.878, 3.033
(2.945) (2.877, 3.035)
a1 0.049 0.041 -0.065, 0.256
(0.039) (-0.072, 0.262)
a2 0.622 0.594 0.398, 1.017
(0.608) (0.412, 1.046)
a3 -0.002 -0.002 -2.210e-03, -0.001
(-0.002) (-2.221e-03, -0.001)
b0 2.205 2.214 1.880, 2.572
(2.207) (1.889, 2.573)
b1 -0.088 -0.128 -0.620, 0.285
(-0.116) (-0.622, 0.279)
b2 -0.816 -0.755 -1.422, -0.320
(-0.836) (-1.481, -0.318)
b3 -0.002 -0.002 -0.004, -0.001
(-0.002) (-0.004, -0.001)
Table 4: ML estimates and estimated posterior characteristics for the parameters of accelerated lognormal distribution (the values in parenthesis correspond to missing data case)
Parameter ML estimate Posterior mode 0.95 HPD interval
a'0 2.900 2.906 2.829, 3.005
(2.932) (2.824, 3.035)
a'1 0.096 0.095 -0.037, 0.285
(0.087) (-0.052, 0.344)
a'2 0.552 0.540 0.390, 0.781
(0.567) (0.368, 0.957)
a'3 -0.002 -0.002 -2.534e-03, -1.589e-03
(-0.002) (-2.485e-03, -0.001)
b'0 -1.720 -1.751 -2.113, -1.406
(-1.649) (-2.088, -1.325)
b'1 0.283 0.307 -0.065, 0.771
(0.245) (-0.128, 0.803)
b'2 0.787 0.768 0.407, 1.235
(0.812) (0.362, 1.376)
b'3 0.001 0.001 -0.001, 0.003
(0.002) (-8.421e-05, 0.004)
The simulated marginal posterior samples corresponding to the model parameters can be further utilized to obtain various survival characteristics, such as the survival probability at a specified time, the hazard rate and the mean time to appearance of tumours (MTAT), associated with the mice involved in testing. We have, however, worked out the MTAT only, based on the final simulated posterior samples. The MTAT under the accelerated Weibull distribution is θ Γ(1 + 1/β), where the scale parameter θ = exp(a0 + Σ_{j=1}^{3} a_j x_j) and the shape parameter β = exp(b0 + Σ_{j=1}^{3} b_j x_j). Similarly, the MTAT under the accelerated lognormal distribution is exp(μ + σ²/2), where μ = a'0 + Σ_{j=1}^{3} a'_j x_j and σ = exp(b'0 + Σ_{j=1}^{3} b'_j x_j). It may be noted here that, while writing the expressions for MTAT, we have simplified the notation considerably. As a result, the term x_j associated with the model parameters corresponds to the jth covariate, j = 1, 2, 3. Thus x_1 corresponds to the two levels of strain (F1=0 and F2=1), x_2 to the sex of mice (0 for females and 1 for males) and x_3 to the four doses of chemical (60 ppm, 120 ppm, 200 ppm and 400 ppm).
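Given the retained posterior samples, MTAT draws at a chosen covariate setting follow directly from the two expressions above. The sketch below is illustrative and assumes the parameter draws are stored as arrays with one row per draw and columns (a0, a1, a2, a3), etc.:

```python
import numpy as np
from math import gamma

def mtat_weibull(a, b, x):
    """MTAT draws theta * Gamma(1 + 1/beta) at covariates x = (x1, x2, x3).

    a, b : posterior draws of (a0, ..., a3) and (b0, ..., b3), shape (S, 4)
    """
    z = np.concatenate(([1.0], np.asarray(x, float)))
    theta = np.exp(a @ z)   # scale: exp(a0 + sum_j aj xj)
    beta = np.exp(b @ z)    # shape: exp(b0 + sum_j bj xj)
    return theta * np.array([gamma(1.0 + 1.0 / bk) for bk in beta])

def mtat_lognormal(ap, bp, x):
    """MTAT draws exp(mu + sigma^2 / 2) for the accelerated lognormal model."""
    z = np.concatenate(([1.0], np.asarray(x, float)))
    mu = ap @ z             # location: a'0 + sum_j a'j xj
    sigma = np.exp(bp @ z)  # scale: exp(b'0 + sum_j b'j xj)
    return np.exp(mu + 0.5 * sigma ** 2)
```

Posterior modes and HPD intervals for MTAT, as in Table 5, can then be estimated from these draws.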
The estimated posterior characteristics for MTAT at different levels of the three covariates under the assumption of the accelerated Weibull distribution are reported in Table 5, where the values in parentheses represent the results for the missing data case. We have also reported the corresponding ML estimates of MTAT in order to compare our results with the classical ones (see Table 5). It can be seen that there is an appreciable difference between the MTAT estimates corresponding to the two sexes, and the female mice are more susceptible to the chemically developed tumours than the male mice for both levels of strain and all four doses of the chemical. The results in Table 5, however, do not indicate an appreciable difference between the two strains of offspring for either sex at any of the four doses. There is yet another important finding that can be seen from Table 5: the mice receiving a higher dose of the chemical are more susceptible to the chemically developed tumour, a conclusion that is absolutely in accordance with the dose-response relationship. Besides, we also obtained the estimates of MTAT under the assumption of the accelerated lognormal distribution. More or less similar results were observed in this case as well, except for a few marginal differences in the estimates for male mice at the two levels of strain of offspring. We do not report these results, presuming that they would not offer any additional benefit but would unnecessarily increase the length of the paper.
We next focus on the estimates of MTAT in the missing data case. We can see that both the point and the interval estimates are close to the corresponding estimates in the non-missing data case when the considered distribution is the accelerated Weibull (see Table 5). More or less similar observations were made when the considered distribution was the accelerated lognormal, except for some of the estimated HPD intervals for male mice. These HPD intervals were found to be wider, in general, than the corresponding HPD intervals for the non-missing data case. Obviously, this finding is important in the sense that it indicates larger variability among the MTAT estimates associated with the male mice in the missing data case. In this very sense, the accelerated Weibull distribution can be visualized as a better candidate than the accelerated lognormal distribution, simply because the former offers more or less stable estimates of MTAT in almost every considered situation.
Before we end the section, we compare the two accelerated distributions formally using DIC and EPPL measures discussed in Section 4. The DIC is evaluated on the basis of 4K posterior samples from each of the two posteriors (16) and (17) associated with the accelerated Weibull and the accelerated lognormal distributions, respectively. For the evaluation of EPPL, we generated 4K predictive data sets exactly in the same form and size as the observed data. It may be noted that these 4K predictive samples were obtained with the help of simulated posterior samples corresponding to each of the two considered distributions. DIC and EPPL were similarly evaluated for missing data case as well. The evaluated values of the two measures are reported in Table 6.
Table 5: ML estimates and estimated posterior characteristics for MTAT under the accelerated Weibull distribution (the values in parentheses correspond to missing data case)
Strain Sex Dose of chemical (in ppm) ML estimate Posterior mode 0.95 HPD interval
F1 F 60 16.122 16.189 (16.237) 15.387, 17.198 (15.305, 17.146)
120 14.448 14.437 (14.495) 13.968, 15.085 (13.898, 15.031)
200 12.476 12.443 (12.429) 11.939, 12.919 (11.941, 12.915)
400 8.632 8.474 (8.545) 7.539, 9.435 (7.490, 9.360)
F1 M 60 28.722 27.742 (27.851) 23.241, 42.981 (23.206, 44.484)
120 25.752 24.723 (25.005) 20.978, 38.108 (20.415, 38.900)
200 22.286 21.256 (21.681) 17.824, 32.919 (17.747, 33.388)
400 15.686 17.036 (15.692) 12.063, 27.602 (12.074, 28.903)
F2 F 60 16.885 16.622 (16.665) 15.499, 20.234 (15.387, 20.385)
120 15.139 14.997 (14.932) 13.787, 17.981 (13.763, 18.152)
200 13.085 12.985 (12.854) 11.813, 15.706 (11.655,15.672)
400 9.077 8.901 (8.953) 7.535, 11.343 (7.365, 11.208)
F2 M 60 30.052 30.757 (31.216) 25.030, 44.570 (24.330, 45.438)
120 26.958 27.639 (27.597) 22.464, 39.396 (22.039, 40.562)
200 23.354 23.829 (23.699) 19.539, 34.449 (19.515, 35.736)
400 16.531 17.563 (16.883) 13.652, 31.258 (13.692, 33.687)
Table 6: DIC and EPPL values under the accelerated Weibull and the accelerated lognormal distributions (the values in parentheses correspond to missing data case)
Distribution DIC Under EPPL criterion
LDF LDC EPPL
Accelerated Weibull 599.943 38.066 28.991 67.057
(600.012) (38.009) (29.110) (67.119)
Accelerated lognormal 600.276 38.315 28.306 66.621
(611.988) (49.368) (29.987) (79.355)
As regards the results, it can be seen that the values of DIC and EPPL under the two distributions are quite close to each other and, therefore, one may consider either of the two distributions for the considered dataset. It may, however, be seen that the accelerated Weibull distribution performs better than the accelerated lognormal distribution when some of the observations are missing. So we prefer to conclude in favour of the accelerated Weibull distribution, although it offers a slightly poorer loss due to complexity, and hence a slightly poorer overall loss, for the non-missing data case. The difference is, however, only marginal when compared to the corresponding values obtained under the accelerated lognormal distribution.
6 Conclusion
The paper is a successful attempt at analyzing current status data when the exact lifetimes are not observable and the information is available only in the form of failure or survival status. The other novel feature of the paper is the use of accelerated lifetime models, namely the two-parameter Weibull and the two-parameter lognormal distributions, when both parameters are allowed to be affected by the covariates or stress variables occurring in the experimentation. Normally, in such situations only the scale parameter of the model is allowed to vary in accordance with the covariates while the shape parameter is kept constant. Of course, the resulting distributions are complex, but this is not a deterrent when they are dealt with by sample based approaches to Bayesian computation. The results on model comparison considered in the paper finally recommend the accelerated Weibull model when both missing and non-missing datasets are entertained. If, however, there is no missing data in the experimentation, one can consider either of the two models for drawing the necessary inferences.
References
[1] M. I. Araujo and B. d. B Pereira. A comparison of Bayes factors for separated models: some simulation results. Communications in Statistics- Simulation and Computation, 36(2):297-309, 2007.
[2] N. Balakrishnan and M. H. Ling. EM algorithm for one-shot device testing under the exponential distribution. Computational Statistics and Data Analysis, 56(3):502-509, 2012.
[3] N. Balakrishnan and M. H. Ling. Expectation maximization algorithm for one shot device accelerated life testing with Weibull lifetimes, and variable parameters over stress. IEEE Transactions on Reliability, 62(2):537-551, 2013.
[4] N. Balakrishnan and M. H. Ling. Gamma lifetimes and one-shot device testing analysis. Reliability Engineering & System Safety, 126:54-64, 2014.
[5] N. Bousquet, H. Bertholon, and G. Celeux. An alternative competing risk model to the Weibull distribution for modelling aging in lifetime data analysis. Lifetime Data Analysis, 12(4):481, 2006.
[6] C. E. Buck and S. K. Sahu. Bayesian models for relative archaeological chronology building. Journal of the Royal Statistical Society: Series C (Applied Statistics), 49(4):423-440, 2000.
[7] R. Dumonceaux, C. E. Antle, and G. Haas. Likelihood ratio test for discrimination between two models with unknown location and scale parameters. Technometrics, 15(1):19-27, 1973.
[8] T.-H. Fan, N. Balakrishnan, and C.-C. Chang. The Bayesian approach for highly reliable electro-explosive devices using one-shot device testing. Journal of Statistical Computation and Simulation, 79(9):1143-1154, 2009.
[9] J. S. Kim and B.-J. Yum. Selection between Weibull and lognormal distributions: a comparative simulation study. Computational Statistics & Data Analysis, 53(2):477-485, 2008.
[10] D. H. Kim, W. D. Lee, and S. G. Kang. Bayesian model selection for life time data under type II censoring. Communications in Statistics- Theory and Methods, 29(12):2865-2878, 2000.
[11] R. L. Kodell and C. J. Nelson. An illness-death model for the study of the carcinogenic process using survival/sacrifice data. Biometrics, 36(2):267-277, 1980.
[12] J. F. Lawless. Statistical Models and Methods for Lifetime Data. John Wiley & Sons, Inc., Hoboken, New Jersey., 2002.
[13] N. R. Mann, N. D. Singpurwalla, and R. E. Schafer. Methods for Statistical Analysis of Reliability and Life Data. John Wiley & Sons, Inc., New York, 1974.
[14] W. Q. Meeker and L. A. Escobar. Statistical Methods for Reliability Data. John Wiley & Sons, 1998.
[15] W. Q. Meeker. A comparison of accelerated life test plans for Weibull and lognormal distributions and type I censoring. Technometrics, 26(2):157-171, 1984.
[16] C. A. Meeter and W. Q. Meeker. Optimum accelerated life tests with a nonconstant scale parameter. Technometrics, 36(1):71-83, 1994.
[17] W. Nelson. Accelerated Testing: Statistical Models, Test Plans, and Data Analysis. John Wiley & Sons, Inc., Hoboken, New Jersey, 1990.
[18] C. Robert and G. Casella. Introducing Monte Carlo Methods with R. Springer Science & Business Media, 2010.
[19] C. Robert and G. Casella. Monte Carlo Statistical Methods. Springer Science & Business Media, 2013.
[20] R. Sen. Bayesian Analysis of Some Non-Regular Families in Accelerated Life Testing. PhD thesis, Department of Statistics, Banaras Hindu University, Varanasi, India, 2016.
[21] R. Sharma and S. K. Upadhyay. A hierarchical Bayes analysis for one-shot device testing experiment under the assumption of exponentiality. Communications in Statistics- Simulation and Computation, 47(5):1297-1314, 2018a.
[22] R. Sharma and S. K. Upadhyay. A hierarchical Bayes analysis for one-shot device accelerated life test under the assumption of Weibull and exponential models. Submitted for Publication, 2018b.
[23] N. D. Singpurwalla. Reliability and Risk: a Bayesian Perspective. John Wiley & Sons, 2006.
[24] A. F. M. Smith and G. O. Roberts. Bayesian computation via the Gibbs sampler and related Markov chain Monte Carlo methods. Journal of the Royal Statistical Society. Series B (Methodological), 55(1):3-23, 1993.
[25] D. J. Spiegelhalter, N. G. Best, B. P. Carlin, and A. van der Linde. Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64(4):583-639, 2002.
[26] S. K. Upadhyay. Common failure distributions. Wiley Encyclopedia of Operations Research and Management Science, 2010.
[27] S. K. Upadhyay, A. Gupta, B. Mukherjee, and A. Khokhar. A Bayes comparison of Weibull extension and modified Weibull models for data showing bathtub hazard rate. Journal of Statistical Computation and Simulation, 83(1):68-81, 2013.
[28] S. K. Upadhyay, N. Vasishta, and A. F. M. Smith. Bayes inference in life testing and reliability via Markov chain Monte Carlo simulation. Sankhya: The Indian Journal of Statistics, Series A, 63(1):15-40, 2001.
[29] S. K. Upadhyay and M. Peshwani. Choice between Weibull and lognormal models: a simulation based Bayesian study. Communications in Statistics-Theory and Methods, 32(2):381-405, 2003.
[30] W. Wang. Fitting the Weibull and Lognormal Log-linear Models to Accelerated Life Test Data. PhD thesis, The University of Arizona, Tucson, Arizona, 1999.