Bayesian Survival Modeling of the Marshall-Olkin Generalized-G family with random effects using R and STAN
Shazia Farhin, Firdoos Yousuf and Athar Ali Khan
Department of Statistics and Operations Research, Aligarh Muslim University, Aligarh-202002, India [email protected] [email protected] [email protected]
Abstract
The purpose of this paper is to fit the Marshall-Olkin generalized-G (MOG-G) family to censored survival data with a random effect in the Bayesian environment. Three special distributions based on the MOG-G family are obtained, namely the Marshall-Olkin generalized-exponential, Marshall-Olkin generalized-Weibull, and Marshall-Olkin generalized-Lomax. The probabilistic programming language STAN is used to fit these three distributions to the survival data. STAN offers full Bayesian inference, implemented via the Hamiltonian Monte Carlo algorithm and its No-U-Turn Sampler (NUTS) extension. We compared the models with the help of the leave-one-out cross-validation information criterion and the Watanabe-Akaike information criterion. Stan codes for the analysis are provided.
Keywords: Bayesian modeling, Marshall-Olkin generalized-G family, censored survival data, random effect, leave-one-out information criterion, STAN
1. Introduction
In survival analysis, researchers use extended versions of standard distributions to analyze lifetime data and to model the aging or failure process. In this paper, we have used the Marshall-Olkin generalized-G (MOG-G) family to fit censored survival data, including a random effect. [1] proposed the MOG-G family and studied its mathematical properties along with applications to the fitting of lifetime data. The Marshall-Olkin distribution has been extended by using the genesis of other distributions to create wider families of distributions; see, for example, the Marshall-Olkin-G family [2], the Kumaraswamy Marshall-Olkin family [3], the beta Marshall-Olkin family [4], the beta generalized Marshall-Olkin-G family [5], the exponentiated Marshall-Olkin family [6], the generalized Marshall-Olkin-Kumaraswamy-G family [7], the beta generalized Marshall-Olkin Kumaraswamy-G family [8], the exponentiated generalized Marshall-Olkin family [9], and the Weibull Marshall-Olkin family [10].
We have considered three models based on the MOG-G family and fitted them to the survival data. The first model is the Marshall-Olkin generalized-exponential (MOG-E) model, the second is the Marshall-Olkin generalized-Weibull (MOG-W) model, and the third is the Marshall-Olkin generalized-Lomax (MOG-L) model. The random effect significantly affects the distribution of the patients' survival times and accounts for heterogeneity among the patients. Fitting a large number of random effects in a non-Bayesian setting requires a large amount of data, and often the data are too few to estimate the random-effects parameters reliably. However, Bayesian modeling can be used even when there are not enough data for classical inferential statistics. So, the above three models have been fitted to the censored survival data under the Bayesian setup in R [11] using the probabilistic programming language STAN [12], which offers full Bayesian inference. STAN uses Hamiltonian Monte Carlo (HMC) sampling [13],[14] and its extension, the No-U-Turn Sampler (NUTS) [15] algorithm of MCMC, for the simulation and computation of posterior estimates. HMC is an efficient and sophisticated MCMC algorithm that combines MCMC with deterministic simulation methods. To find the region of the posterior distribution with high mass, HMC uses the gradient of the log posterior density and then jumps around the posterior distribution [16]. Whether or not the priors are conjugate, these algorithms converge quickly to high-dimensional target distributions compared with other MCMC algorithms [15].
The purpose of this paper is to fit the three models, namely MOG-E, MOG-W, and MOG-L, to censored survival data containing a random effect under the Bayesian environment using R and STAN, and to select the best model for the real survival data.
2. Marshall-Olkin Generalized-G family
Let G(t, \xi) and g(t, \xi) be the baseline cdf and pdf of a continuous random variable T with parameter vector \xi. The cdf, pdf, survival function, and hazard function of the MOG-G family are respectively given by

F(t; \alpha, a, \xi) = \frac{1 - [1 - G(t,\xi)]^{a}}{1 - (1-\alpha)[1 - G(t,\xi)]^{a}}, \quad t \in \mathbb{R} \qquad (1)

f(t; \alpha, a, \xi) = \frac{\alpha\, a\, g(t,\xi)\,[1 - G(t,\xi)]^{a-1}}{\{1 - (1-\alpha)[1 - G(t,\xi)]^{a}\}^{2}}, \quad t \in \mathbb{R} \qquad (2)

S(t; \alpha, a, \xi) = \frac{\alpha\,[1 - G(t,\xi)]^{a}}{1 - (1-\alpha)[1 - G(t,\xi)]^{a}}, \quad t \in \mathbb{R} \qquad (3)

h(t; \alpha, a, \xi) = \frac{a\, g(t,\xi)}{[1 - G(t,\xi)]\,\{1 - (1-\alpha)[1 - G(t,\xi)]^{a}\}}, \quad t \in \mathbb{R} \qquad (4)

Henceforth a random variable T with pdf (2) is denoted by T ~ MOG-G(\alpha, a, \xi), where \alpha and a are two positive shape parameters.
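For readers who wish to experiment with the family numerically, a minimal R sketch of Equations (1)-(4) is given below; it is not part of the original analysis, the function names (mogg_pdf, etc.) are ours, and the baseline pdf g and cdf G are supplied by the user.

## Illustrative R sketch of the MOG-G family (Equations (1)-(4)); function names are ours.
mogg_pdf <- function(t, alpha, a, g, G) {
  Gbar <- 1 - G(t)                            # baseline survival
  D <- 1 - (1 - alpha) * Gbar^a               # common denominator
  alpha * a * g(t) * Gbar^(a - 1) / D^2       # Equation (2)
}
mogg_cdf <- function(t, alpha, a, G) {
  Gbar <- 1 - G(t)
  (1 - Gbar^a) / (1 - (1 - alpha) * Gbar^a)   # Equation (1)
}
mogg_surv <- function(t, alpha, a, G) {
  Gbar <- 1 - G(t)
  alpha * Gbar^a / (1 - (1 - alpha) * Gbar^a) # Equation (3)
}
mogg_haz <- function(t, alpha, a, g, G) {
  mogg_pdf(t, alpha, a, g, G) / mogg_surv(t, alpha, a, G)  # Equation (4)
}
## Example: exponential baseline with scale 2
g <- function(t) dexp(t, rate = 1/2); G <- function(t) pexp(t, rate = 1/2)
mogg_surv(1, alpha = 2, a = 1.5, G = G)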
2.1. Marshall-Olkin Generalized Exponential model
Consider T to be a continuous random variable following an exponential distribution with scale parameter \lambda > 0, whose pdf and cdf are given by g(t) = \frac{1}{\lambda} e^{-t/\lambda} and G(t) = 1 - e^{-t/\lambda}, t > 0. Then the pdf and cdf of the MOG-E model are respectively given by
f(t) = \frac{\frac{\alpha a}{\lambda}\exp(-a t/\lambda)}{[1 - (1-\alpha)\exp(-a t/\lambda)]^{2}} \qquad (5)

F(t) = \frac{1 - \exp(-a t/\lambda)}{1 - (1-\alpha)\exp(-a t/\lambda)} \qquad (6)
The survival function corresponding to Equation (6) is given as
S(t) = \frac{\alpha\exp(-a t/\lambda)}{1 - (1-\alpha)\exp(-a t/\lambda)} \qquad (7)
Hazard function of the MOG-E model is written as
h(t) = \frac{a/\lambda}{1 - (1-\alpha)\exp(-a t/\lambda)} \qquad (8)
In survival analysis, random generation of the time variable from a survival model is done by setting u = S(t), where U is a random variable following Uniform(0, 1). Solving for t, the generation of the time variable from the MOG-E model is obtained by

t = \frac{\lambda}{a}\log\left(\frac{\alpha}{u} + (1-\alpha)\right) \qquad (9)
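As a hedged illustration of Equation (9) (the helper name rmoge is ours, not from the paper), the sampler below mirrors the yrepmoeg generated quantity used in the Stan program of Section 4.3.1.

rmoge <- function(n, alpha, a, lambda) {
  u <- runif(n)                                # U ~ Uniform(0, 1)
  (lambda / a) * log(alpha / u + (1 - alpha))  # Equation (9)
}
set.seed(1)
summary(rmoge(1000, alpha = 2, a = 1.5, lambda = 50))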
Following [17], the joint likelihood function for right-censored data is given as

L = \prod_{i=1}^{n} \Pr(t_i, \delta_i) = \prod_{i=1}^{n} \{h(t_i)\}^{\delta_i} S(t_i) \qquad (10)

where \delta_i is an indicator variable,

\delta_i = \begin{cases} 0, & \text{censored} \\ 1, & \text{observed} \end{cases}
The likelihood function for the MOG-E survival model is given by

L = \prod_{i=1}^{n}\left\{\frac{a/\lambda_i}{1 - (1-\alpha)\exp(-a t_i/\lambda_i)}\right\}^{\delta_i}\frac{\alpha\exp(-a t_i/\lambda_i)}{1 - (1-\alpha)\exp(-a t_i/\lambda_i)} \qquad (11)
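For intuition, a minimal R version of the log of the likelihood (11) could look as follows (our own helper; delta is the censoring indicator and lambda may vary across subjects); it parallels the log_moegh and log_moegs functions used later in the Stan program.

loglik_moge <- function(t, delta, alpha, a, lambda) {
  D <- 1 - (1 - alpha) * exp(-a * t / lambda)   # common denominator
  logh <- log(a) - log(lambda) - log(D)         # log hazard, Equation (8)
  logS <- log(alpha) - a * t / lambda - log(D)  # log survival, Equation (7)
  sum(delta * logh + logS)                      # log of Equation (11)
}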
2.2. Marshall-Olkin Generalized Weibull model
Let g(t) and G(t) be the pdf and cdf of the Weibull distribution with shape parameter \gamma > 0 and scale parameter \lambda > 0, where g(t) = \gamma\lambda^{-\gamma} t^{\gamma-1} e^{-(t/\lambda)^{\gamma}} and G(t) = 1 - e^{-(t/\lambda)^{\gamma}}, t > 0. Then the pdf of the MOG-W model is given by

f(t) = \frac{\alpha a \gamma \lambda^{-\gamma} t^{\gamma-1}\exp(-a(t/\lambda)^{\gamma})}{[1 - (1-\alpha)\exp(-a(t/\lambda)^{\gamma})]^{2}} \qquad (12)
Therefore, the random variable T is denoted by T ~ MOG-W(\alpha, a, \gamma, \lambda). The cdf of the MOG-W model is written as

F(t) = \frac{1 - \exp(-a(t/\lambda)^{\gamma})}{1 - (1-\alpha)\exp(-a(t/\lambda)^{\gamma})} \qquad (13)
Survival function and hazard function of the MOG-W model are given respectively by

S(t) = \frac{\alpha\exp(-a(t/\lambda)^{\gamma})}{1 - (1-\alpha)\exp(-a(t/\lambda)^{\gamma})} \qquad (14)

h(t) = \frac{a\gamma\lambda^{-\gamma} t^{\gamma-1}}{1 - (1-\alpha)\exp(-a(t/\lambda)^{\gamma})} \qquad (15)
Random generation from the MOG-W model is done by the expression given below:

t = \lambda\left[\frac{1}{a}\log\left(\frac{\alpha}{u} + (1-\alpha)\right)\right]^{1/\gamma} \qquad (16)
Using Equation (10), the joint likelihood function for the MOG-W model based on right-censored data is written as

L = \prod_{i=1}^{n}\left\{\frac{a\gamma\lambda_i^{-\gamma} t_i^{\gamma-1}}{1 - (1-\alpha)\exp(-a(t_i/\lambda_i)^{\gamma})}\right\}^{\delta_i}\frac{\alpha\exp(-a(t_i/\lambda_i)^{\gamma})}{1 - (1-\alpha)\exp(-a(t_i/\lambda_i)^{\gamma})} \qquad (17)
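A corresponding R sketch of Equation (16) (the name rmogw is ours) is:

rmogw <- function(n, alpha, a, gamma, lambda) {
  u <- runif(n)                                                  # U ~ Uniform(0, 1)
  lambda * ((1 / a) * log(alpha / u + (1 - alpha)))^(1 / gamma)  # Equation (16)
}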
2.3. Marshall-Olkin Generalized Lomax model
Consider the Lomax distribution with shape parameter \gamma > 0 and scale parameter \lambda > 0, having pdf g(t) = \frac{\gamma}{\lambda}(1 + t/\lambda)^{-(\gamma+1)} and cdf G(t) = 1 - (1 + t/\lambda)^{-\gamma}, t > 0. Then the pdf and cdf of a random variable T ~ MOG-L(\alpha, a, \gamma, \lambda) are given respectively by

f(t) = \frac{\alpha a \gamma \lambda^{-1}(1 + t/\lambda)^{-1}(1 + t/\lambda)^{-a\gamma}}{[1 - (1-\alpha)(1 + t/\lambda)^{-a\gamma}]^{2}} \qquad (18)

F(t) = \frac{1 - (1 + t/\lambda)^{-a\gamma}}{1 - (1-\alpha)(1 + t/\lambda)^{-a\gamma}} \qquad (19)
Survival function of the MOG-L is given by

S(t) = \frac{\alpha(1 + t/\lambda)^{-a\gamma}}{1 - (1-\alpha)(1 + t/\lambda)^{-a\gamma}} \qquad (20)

Hazard function of the MOG-L is written as

h(t) = \frac{a\gamma\lambda^{-1}(1 + t/\lambda)^{-1}}{1 - (1-\alpha)(1 + t/\lambda)^{-a\gamma}} \qquad (21)
Generation of survival time from the MOG-L model is given by

t = \lambda\left[\left(\frac{\alpha}{u} + (1-\alpha)\right)^{1/(a\gamma)} - 1\right] \qquad (22)

The joint likelihood function for the MOG-L model is written as

L = \prod_{i=1}^{n}\left\{\frac{a\gamma\lambda_i^{-1}(1 + t_i/\lambda_i)^{-1}}{1 - (1-\alpha)(1 + t_i/\lambda_i)^{-a\gamma}}\right\}^{\delta_i}\frac{\alpha(1 + t_i/\lambda_i)^{-a\gamma}}{1 - (1-\alpha)(1 + t_i/\lambda_i)^{-a\gamma}} \qquad (23)
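Similarly, Equation (22) can be sketched in R and checked against the survival function (20); the helper name rmogl and the parameter values below are purely illustrative.

rmogl <- function(n, alpha, a, gamma, lambda) {
  u <- runif(n)                                               # U ~ Uniform(0, 1)
  lambda * ((alpha / u + (1 - alpha))^(1 / (a * gamma)) - 1)  # Equation (22)
}
## quick check: empirical vs. theoretical survival at t = 10, Equation (20)
set.seed(2)
x <- rmogl(1e5, alpha = 2, a = 1.5, gamma = 1.2, lambda = 20)
S10 <- 2 * (1 + 10/20)^(-1.5 * 1.2) / (1 - (1 - 2) * (1 + 10/20)^(-1.5 * 1.2))
c(empirical = mean(x > 10), theoretical = S10)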
3. Kidney catheter data
This dataset was originally discussed in [18]. The study concerns the recurrence times to infection, at the point where the catheter is inserted, for kidney patients using portable dialysis equipment. The data consist of times until the first and second recurrence of kidney infection in 38 patients; each patient has exactly two observations. Each survival time is the time from insertion of the catheter until infection. A catheter may be removed for reasons other than infection, in which case the observation is censored. About 24% of the observations in the dataset are censored. The dataset has an unmeasured or 'random' effect, the identification code of the patients, which accounts for heterogeneity among them. The dataset is available in the package survival [19] of R [11].
Description of the kidney catheter data variables is given below:
time: time to infection, in days
status: event status, 1 = infection occurs, 0 = censored
age: age in years
sex: 1 = male, 2 = female
disease: disease type (0 = GN, 1 = AN, 2 = PKD, 3 = Other)
id: identification code of the patients
3.1. Construction of data frame in R
Fitting the Bayesian models to the kidney catheter data with the stan function requires the data in list form, which we create as below:
require(survival)
data(cancer, package="survival")
head(kidney)
y=kidney$time
x1=kidney$age
x2=kidney$sex
kidney$disease1=as.numeric(kidney$disease)
x3=kidney$GN=as.numeric(kidney$disease1==2)
x4=kidney$AN=as.numeric(kidney$disease1==3)
x5=kidney$PKD=as.numeric(kidney$disease1==4)
x=cbind(1,x1,x2,x3,x4,x5)
N=nrow(x)
M=ncol(x)
J=38
event=kidney$status
Id=as.integer(kidney$id) ## identity of subject
datk=list(y=y,x=x,N=N,M=M,event=event,J=J,Id=Id)
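A quick sanity check of the list before passing it to stan (purely illustrative) is:

str(datk)  # should show y, x, N, M, event, J, Id
stopifnot(length(datk$y) == datk$N, ncol(datk$x) == datk$M, max(datk$Id) == datk$J)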
4. Bayesian Analysis of MOG-G family
4.1. Prior Specification
For the construction of the Bayesian regression model, we need to specify prior distributions for the parameters of the model. We have chosen half-Cauchy priors for the shape and scale parameters and a regularizing prior for the regression coefficients.
4.1.1 Half-Cauchy prior distribution
The probability density function of the half-Cauchy distribution with scale \gamma is given by

f(x) = \frac{2\gamma}{\pi(x^{2} + \gamma^{2})}, \quad x > 0, \; \gamma > 0
The mean and variance of the half-Cauchy distribution do not exist, but its mode is equal to zero. The half-Cauchy distribution with scale \gamma = 25 is a nearly flat prior, but not completely flat; prior distributions that are not completely flat provide enough information for the numerical approximation algorithm to continue to explore the target density, the posterior distribution [20],[21]. [22] supports the use of the half-Cauchy prior for scale parameters because of its excellent frequentist risk properties and its sensible behaviour in the presence of sparsity compared with the usual conjugate alternative. [20] has also discussed points in support of the half-Cauchy prior.
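In R, the half-Cauchy(0, 25) density can be evaluated as twice the Cauchy density restricted to the positive axis, which makes its near-flatness easy to inspect (illustrative sketch):

dhalfcauchy <- function(x, scale = 25) ifelse(x >= 0, 2 * dcauchy(x, location = 0, scale = scale), 0)
curve(dhalfcauchy(x), from = 0, to = 100, xlab = "x", ylab = "density")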
4.1.2 Gaussian prior distribution
The probability density function of the Gaussian distribution with mean \mu and variance \sigma^{2} is given by

f(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}}\exp\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right), \quad -\infty < x < \infty, \; \sigma > 0, \; -\infty < \mu < \infty
In this paper, we have chosen a Gaussian prior with mean 0 and standard deviation 5 for the \beta coefficients as a regularizing prior, because such a prior prevents the model from getting too excited by the data, slowing the rate of over-excitement and reducing overfitting [23].
4.2. Model Specification
Following [24], to build a regression model we introduce the covariates, including a random intercept, through the log link function, i.e.

\log(\lambda_i) = \beta_1 + w_{[subj_i]} + \beta_2 x_{i1} + \beta_3 x_{i2} + \beta_4 x_{i3} + \beta_5 x_{i4} + \beta_6 x_{i5}

\lambda_i = \exp(\beta_1 + w_{[subj_i]} + \beta_2 x_{i1} + \beta_3 x_{i2} + \beta_4 x_{i3} + \beta_5 x_{i4} + \beta_6 x_{i5})

or, more compactly,

\lambda_i = \exp(w_{[subj_i]} + x_i\beta)

where w_{[subj_i]} is the variability accounted for by the subject (patient), called the random intercept, with

w \sim N(0, \sigma_w), \quad \beta \sim N(0, \sigma_\beta = 5)
4.2.1 Posterior density of MOG-E
By using Bayes' theorem, the joint posterior distribution is given as

P(a, \alpha, \beta, w \mid X, t) \propto L(t \mid a, \alpha, \beta, w, X) \times P(a) \times P(\alpha) \times P(\beta) \times P(w) \qquad (24)

Writing \lambda_i = \exp(w_{[subj_i]} + x_i\beta) and substituting the likelihood (11) together with the priors of Section 4.1 gives

P(a, \alpha, \beta, w \mid X, t) \propto \prod_{i=1}^{n}\left\{\frac{a/\lambda_i}{1 - (1-\alpha)\exp(-a t_i/\lambda_i)}\right\}^{\delta_i}\frac{\alpha\exp(-a t_i/\lambda_i)}{1 - (1-\alpha)\exp(-a t_i/\lambda_i)}
\times \frac{2 \times 25}{\pi(a^{2} + 25^{2})} \times \frac{2 \times 25}{\pi(\alpha^{2} + 25^{2})} \times \prod_{j=1}^{J}\frac{1}{\sigma_w\sqrt{2\pi}}\exp\left(-\frac{w_j^{2}}{2\sigma_w^{2}}\right) \times \prod_{k=1}^{6}\frac{1}{5\sqrt{2\pi}}\exp\left(-\frac{\beta_k^{2}}{2 \times 25}\right) \qquad (25)
4.2.2 Posterior density of MOG-W
By using Bayes' theorem, the joint posterior distribution is given as

P(a, \alpha, \gamma, \beta, w \mid X, t) \propto L(t \mid a, \alpha, \gamma, \beta, w, X) \times P(a) \times P(\alpha) \times P(\gamma) \times P(\beta) \times P(w) \qquad (26)

Writing \lambda_i = \exp(w_{[subj_i]} + x_i\beta) and substituting the likelihood (17) together with the priors of Section 4.1 gives

P(a, \alpha, \gamma, \beta, w \mid X, t) \propto \prod_{i=1}^{n}\left\{\frac{a\gamma\lambda_i^{-\gamma} t_i^{\gamma-1}}{1 - (1-\alpha)\exp(-a(t_i/\lambda_i)^{\gamma})}\right\}^{\delta_i}\frac{\alpha\exp(-a(t_i/\lambda_i)^{\gamma})}{1 - (1-\alpha)\exp(-a(t_i/\lambda_i)^{\gamma})}
\times \frac{2 \times 25}{\pi(a^{2} + 25^{2})} \times \frac{2 \times 25}{\pi(\alpha^{2} + 25^{2})} \times \frac{2 \times 25}{\pi(\gamma^{2} + 25^{2})} \times \prod_{j=1}^{J}\frac{1}{\sigma_w\sqrt{2\pi}}\exp\left(-\frac{w_j^{2}}{2\sigma_w^{2}}\right) \times \prod_{k=1}^{6}\frac{1}{5\sqrt{2\pi}}\exp\left(-\frac{\beta_k^{2}}{2 \times 25}\right) \qquad (27)
4.2.3 Posterior density of MOG-L
By using Bayes' theorem, the joint posterior distribution is given as

P(a, \alpha, \gamma, \beta, w \mid X, t) \propto L(t \mid a, \alpha, \gamma, \beta, w, X) \times P(a) \times P(\alpha) \times P(\gamma) \times P(\beta) \times P(w) \qquad (28)

Writing \lambda_i = \exp(w_{[subj_i]} + x_i\beta) and substituting the likelihood (23) together with the priors of Section 4.1 gives

P(a, \alpha, \gamma, \beta, w \mid X, t) \propto \prod_{i=1}^{n}\left\{\frac{a\gamma\lambda_i^{-1}(1 + t_i/\lambda_i)^{-1}}{1 - (1-\alpha)(1 + t_i/\lambda_i)^{-a\gamma}}\right\}^{\delta_i}\frac{\alpha(1 + t_i/\lambda_i)^{-a\gamma}}{1 - (1-\alpha)(1 + t_i/\lambda_i)^{-a\gamma}}
\times \frac{2 \times 25}{\pi(a^{2} + 25^{2})} \times \frac{2 \times 25}{\pi(\alpha^{2} + 25^{2})} \times \frac{2 \times 25}{\pi(\gamma^{2} + 25^{2})} \times \prod_{j=1}^{J}\frac{1}{\sigma_w\sqrt{2\pi}}\exp\left(-\frac{w_j^{2}}{2\sigma_w^{2}}\right) \times \prod_{k=1}^{6}\frac{1}{5\sqrt{2\pi}}\exp\left(-\frac{\beta_k^{2}}{2 \times 25}\right) \qquad (29)
4.3. Implementation using Stan
Bayesian modeling of the MOG-G family in the STAN language involves the creation of program blocks: functions, data, transformed data, parameters, transformed parameters, model, and generated quantities. Running STAN code in R requires the package rstan, which is the interface between R and STAN.
4.3.1 Stan code for MOG-E model
modelMOGE="functions{
vector log_moegs(vector t, real a, real alpha, vector lambda){ vector[num_elements(t)]log_moegs; for(i in 1:num_elements(t)){
log_moegs[i]=log(alpha)-a*t[i]/lambda[i]-log(1-(1-alpha)*exp(-a*t[i]/lambda[i])); }
return log_moegs; }
vector log_moegh(vector t, real a, real alpha, vector lambda){ vector[num_elements(t)]log_moegh; for(i in 1:num_elements(t)){
log_moegh[i]=log(a)-log(lambda[i])-log(1-(1-alpha)*exp(-a*t[i]/lambda[i])); }
return log_moegh; }
real surv_MOEG_lpdf(vector t, vector d, real a, real alpha, vector lambda){ vector[num_elements(t)] llikmoeg; real prob;
llikmoeg=d .* log_moegh(t,a,alpha,lambda)+log_moegs(t,a,alpha,lambda); prob=sum(llikmoeg);
return prob; }}
data{ int N;
vector<lower=0>[N] y; vector<lower=0,upper=1>[N] event; int M;
matrix[N,M] x; int<lower=1>J;
int<lower=1,upper=J>Id[N]; }
parameters{
real<lower=0>a; vector[M] beta; real<lower=0> alpha; vector[J] w;
real<lower=0>sigma_w; }
transformed parameters{ vector[N] linpred; vector<lower=0>[N] lambda; linpred=x*beta; for(i in 1:N){
lambda[i]=exp(w[Id[i]]+linpred[i]); }}
model{
target+=cauchy_lpdf(alpha|0,25)- 1 * cauchy_lccdf(0|0,25); target+=cauchy_lpdf(a|0,25)- 1 * cauchy_lccdf(0|0,25); target+=normal_lpdf(beta|0,5); target+=normal_lpdf(w|0,sigma_w);
target+=cauchy_lpdf(sigma_w|0,25)- 1 * cauchy_lccdf(0|0,25);
target+=surv_MOEG_lpdf(y|event,a,alpha,lambda); }
generated quantities{ vector[N] log_lik; vector[N] yrepmoeg; real dev; dev=0;
for(n in 1:N) log_lik[n]=event[n]*(log(a)-log(lambda[n])-log(1-(1-alpha)*exp(-a*y[n]/ lambda[n])))+log(alpha)-a*y[n]/lambda[n]-log(1-(1-alpha)*exp(-a*y[n]/lambda[n])); {real u;
u=uniform_rng(0,1);
for(n in 1:N) yrepmoeg[n]=(lambda[n]/a)*log(alpha/u+(1-alpha)); }
dev=dev+(-2)*surv_MOEG_lpdf(y|event,a,alpha,lambda); }"
4.3.2 Stan code for MOG-W model
modelMOGW="functions{
vector log_mogws(vector t, real a, real alpha,real gamma, vector lambda){ vector[num_elements(t)]log_mogws; for(i in 1:num_elements(t)){
log_mogws[i]=log(alpha)-a*(t[i]/lambda[i])^(gamma)-log(1-(1-alpha)
*exp(-a*(t[i]/lambda[i])^(gamma))); }
return log_mogws; }
vector log_mogwh(vector t, real a, real alpha, real gamma, vector lambda){ vector[num_elements(t)]log_mogwh; for(i in 1:num_elements(t)){
log_mogwh[i]=log(a)+log(gamma)-gamma*log(lambda[i])+(gamma-1)*log(t[i])
-log(1-(1-alpha)*exp(-a*(t[i]/lambda[i])^(gamma))); }
return log_mogwh; }
real surv_MOGW_lpdf(vector t, vector d, real a, real alpha,real gamma,vector lambda){ vector[num_elements(t)] llikmogw; real prob;
llikmogw=d .* log_mogwh(t,a,alpha,gamma,lambda)+log_mogws(t,a,alpha,gamma,lambda); prob=sum(llikmogw);
return prob; }}
data{ int N;
vector<lower=0>[N] y; vector<lower=0,upper=1>[N] event; int M;
matrix[N,M] x; int<lower=1>J;
int<lower=1,upper=J>Id[N]; }
parameters{ real<lower=0>a; vector[M] beta; real<lower=0> alpha; real<lower=0> gamma; vector[J] w;
real<lower=0>sigma_w; }
transformed parameters{ vector[N] linpred; vector<lower=0>[N] lambda; linpred=x*beta; for(i in 1:N){
lambda[i]=exp(w[Id[i]]+linpred[i]); }}
model{
target+=cauchy_lpdf(alpha|0,25)- 1 * cauchy_lccdf(0|0,25); target+=cauchy_lpdf(a|0,25)- 1 * cauchy_lccdf(0|0,25); target+=cauchy_lpdf(gamma|0,25)- 1 * cauchy_lccdf(0|0,25); target+=normal_lpdf(beta|0,5); target+=normal_lpdf(w|0,sigma_w);
target+=cauchy_lpdf(sigma_w|0,25)- 1 * cauchy_lccdf(0|0,25);
target+=surv_MOGW_lpdf(y|event,a,alpha,gamma,lambda); }
generated quantities{ vector[N] log_lik; vector[N] yrepmogw; real dev; dev=0;
for(n in 1:N) log_lik[n]=event[n]*(log(a)+log(gamma)-gamma*log(lambda[n])+(gamma-1)* log(y[n])-log(1-(1-alpha)*exp(-a*(y[n]/lambda[n])^(gamma))))+log(alpha) -a*(y[n]/lambda[n])^(gamma)-log(1-(1-alpha)*exp(-a*(y[n]/lambda[n])^(gamma))); {real u;
u=uniform_rng(0,1);
for(n in 1:N) yrepmogw[n]=lambda[n]*((1/a)*log(alpha/u+(1-alpha)))^(1/gamma); }
dev=dev+(-2)*surv_MOGW_lpdf(y|event,a,alpha,gamma,lambda); }"
4.3.3 Stan code for MOG-L model
modelMOGL="functions{
vector log_mogls(vector t, real a, real alpha,real gamma, vector lambda){ vector[num_elements(t)]log_mogls; for(i in 1:num_elements(t)){
log_mogls[i]=log(alpha)-a*gamma*log(1+t[i]/lambda[i])-log(1-(1-alpha)
*(1+t[i]/lambda[i])^(-a*gamma)); }
return log_mogls; }
vector log_moglh(vector t, real a, real alpha, real gamma, vector lambda){ vector[num_elements(t)]log_moglh; for(i in 1:num_elements(t)){
log_moglh[i]=log(a)+log(gamma)-log(lambda[i])-log(1+t[i]/lambda[i])-
log(1-(1-alpha)*(1+t[i]/lambda[i])^(-a*gamma)); }
return log_moglh; }
real surv_MOGL_lpdf(vector t, vector d, real a, real alpha,real gamma,vector lambda){ vector[num_elements(t)] llikmogl; real prob;
llikmogl=d .* log_moglh(t,a,alpha,gamma,lambda)+log_mogls(t,a,alpha,gamma,lambda); prob=sum(llikmogl);
return prob; }}
data{ int N;
vector<lower=0>[N] y; vector<lower=0,upper=1>[N] event; int M;
matrix[N,M] x; int<lower=1>J;
int<lower=1,upper=J>Id[N]; }
parameters{ real<lower=0>a; vector[M] beta; real<lower=0> alpha; real<lower=0> gamma; vector[J] w;
real<lower=0>sigma_w; }
transformed parameters{ vector[N] linpred; vector<lower=0>[N] lambda; linpred=x*beta; for(i in 1:N){
lambda[i]=exp(w[Id[i]]+linpred[i]); }}
model{
target+=cauchy_lpdf(alpha|0,25)- 1 * cauchy_lccdf(0|0,25);
target+=cauchy_lpdf(a|0,25)- 1 * cauchy_lccdf(0|0,25); target+=cauchy_lpdf(gamma|0,25)- 1 * cauchy_lccdf(0|0,25); target+=normal_lpdf(beta|0,5); target+=normal_lpdf(w|0,sigma_w);
target+=cauchy_lpdf(sigma_w|0,25)- 1 * cauchy_lccdf(0|0,25);
target+=surv_MOGL_lpdf(y|event,a,alpha,gamma,lambda); }
generated quantities{ vector[N] log_lik; vector[N] yrepmogl; real dev; dev=0;
for(n in 1:N) log_lik[n]=event[n]*(log(a)+log(gamma)-log(lambda[n])-log(1+y[n]/lambda[n]) -log(1-(1-alpha)*(1+y[n]/lambda[n])^(-a*gamma)))+log(alpha)-a*gamma*log(1+y[n]/lambda[n]) -log(1-(1-alpha)*(1+y[n]/lambda[n])^(-a*gamma)); {real u;
u=uniform_rng(0,1);
for(n in 1:N) yrepmogl[n]=lambda[n]*((alpha/u+(1-alpha))^(1/(a*gamma))-1); }
dev=dev+(-2)*surv_MOGL_lpdf(y|event,a,alpha,gamma,lambda); }"
4.4. Fitting with Stan
To fit the survival models based on the MOG-G family, the function stan is used, and the data list datk is passed to it. STAN uses a C++ compiler to sample from the posterior distribution of the model parameters, including the random intercepts w_j for each patient j. To get a summary of the results, the function print is used.
4.4.1 Fitting of MOG-E model
MOGE=stan(model_code = modelMOGE,data=datk,iter=5000,chains = 2) print(MOGE)
Summarizing Output: After fitting the MOG-E survival model to the kidney dataset, the results are given in Table 1. It contains the posterior estimates, standard deviations, credible intervals, n_eff (a crude estimate of the effective sample size), and Rhat, the potential scale reduction factor [16], which assesses the convergence of the Markov chains to the target distribution. Besides Rhat, the traceplot also shows the convergence of the Markov chains. According to [16], acceptable values are n_eff > 100 and Rhat lower than 1.1. Rhat for all parameters of the MOG-E model is close to 1, which means the Markov chains converge to the target distribution, the Monte Carlo error is acceptable, and the effective sample size is reasonable. Here, we can see that the posterior estimates of the parameters Intercept (4.440) and Sex (1.678) are statistically significant, as their 95% credible intervals (CI) do not contain 0. The positive value of the Sex coefficient implies that male patients have a greater chance of infection at the catheter insertion site than female patients. The posterior estimates of Age (-0.002), AN (-0.116), GN (-0.544), and PKD (0.906) are not statistically significant, as the corresponding CIs include 0.
Table 1: Posterior summary of MOG-E model parameters
parameters mean se_mean sd 2.5% 50% 97.5% n_eff Rhat
beta[1] 4.440 0.033 1.728 0.833 4.468 7.612 2699 1.001
beta[2] -0.002 0.000 0.013 -0.029 -0.002 0.025 2584 1.000
beta[3] 1.678 0.007 0.387 0.917 1.678 2.427 3309 1.000
beta[4] -0.116 0.009 0.483 -1.085 -0.112 0.836 2678 1.000
beta[5] -0.544 0.009 0.484 -1.533 -0.533 0.397 2777 1.000
beta[6] 0.906 0.014 0.717 -0.493 0.910 2.317 2456 1.000
a 129.344 78.878 2544.490 0.671 18.762 275.779 1041 1.002
alpha 2.825 0.059 2.186 0.534 2.225 8.820 1397 1.001
sigma_w 0.651 0.012 0.221 0.195 0.660 1.091 367 1.003
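The columns of Table 1 can be reproduced directly from the fitted object, for example (parameter names as declared in the Stan program of Section 4.3.1):

print(MOGE, pars = c("beta", "a", "alpha", "sigma_w"), probs = c(0.025, 0.5, 0.975))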
Graphical Analysis
Figure 1: (a) Traceplot of the MOG-E model parameters; two chains were run, depicted in different colors, and the good mixing of the two chains means the Markov chains converge to the target distribution. (b) Caterpillar plot of the MOG-E model parameters.
Figure 2: (a) Posterior density plot of the MOG-E model parameters. (b) Autocorrelation plot of the MOG-E model parameters; after lag 20 the autocorrelation declines towards zero.
Figure 3: The posterior predictive density (PPD) plot of the MOG-E model is obtained by plotting the data y and overlaying the density of the predicted values yrep, which are generated from the posterior predictive distribution of the model. The PPD plot of the MOG-E model shows that the posterior predictive density fits the data well.
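The diagnostic and posterior predictive plots shown in Figures 1-3 can be produced with rstan and bayesplot along the following lines; this is a sketch, and the particular plotting functions and number of yrep draws are our choices.

library(rstan)
library(bayesplot)
pars <- c("beta[1]","beta[2]","beta[3]","beta[4]","beta[5]","beta[6]","a","alpha","sigma_w")
traceplot(MOGE, pars = pars)                       # Figure 1(a): trace plots
mcmc_intervals(as.array(MOGE), pars = pars)        # Figure 1(b): caterpillar plot
stan_dens(MOGE, pars = pars)                       # Figure 2(a): posterior densities
mcmc_acf_bar(as.array(MOGE), pars = pars)          # Figure 2(b): autocorrelations
yrep <- as.matrix(MOGE, pars = "yrepmoeg")         # posterior predictive draws
ppc_dens_overlay(y = datk$y, yrep = yrep[1:50, ])  # Figure 3: PPD overlay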
4.4.2 Fitting of MOG-W model
MOGW=stan(model_code = modelMOGW,data=datk,iter=5000,chains = 2)
print(MOGW)
Summarizing Output: It is evident from Table 2 that the Rhat values of the MOG-W model parameters are close to 1, which indicates that the Markov chains converge to the target distribution and the effective sample size is large enough for convergence.
Table 2: Posterior summary of MOG-W model parameters
parameters mean se_mean sd 2.5% 50% 97.5% n_eff Rhat
beta[1] 4.289 0.053 2.252 -0.936 4.468 8.381 1807 1.000
beta[2] -0.001 0.000 0.014 -0.028 -0.001 0.027 1892 1.001
beta[3] 1.688 0.007 0.391 0.930 1.686 2.451 2978 1.000
beta[4] -0.164 0.010 0.479 -1.130 -0.155 0.744 2102 1.002
beta[5] -0.586 0.011 0.489 -1.582 -0.577 0.350 1935 1.001
beta[6] 0.841 0.020 0.742 -0.625 0.858 2.323 1383 1.002
a 38.413 2.699 169.093 0.786 15.061 185.706 3926 1.000
alpha 72.737 12.350 726.957 0.109 7.801 335.523 3465 1.001
gamma 0.800 0.010 0.357 0.289 0.744 1.562 1332 1.000
sigma_w 0.622 0.013 0.227 0.175 0.630 1.063 288 1.002
Graphical Analysis
Figure 4: (a) Traceplot of the MOG-W model parameters; two chains were run, depicted in different colors. (b) Caterpillar plot of the MOG-W model parameters.
Figure 5: (a) Posterior density plot of the MOG-W model parameters. (b) Autocorrelation plot of the MOG-W model parameters; after lag 20 the autocorrelation declines towards zero.
Figure 6: The PPD plot of the MOG-W model shows that the posterior predictive density fits the data well, indicating good compatibility of the model with the data.
4.4.3 Fitting of MOG-L model
MOGL=stan(model_code = modelMOGL,data=datk,iter=5000,chains = 2)
print(MOGL)
Table 3: Posterior summary of MOG-L model parameters
parameters mean se_mean sd 2.5% 50% 97.5% n_eff Rhat
beta[1] 5.180 0.115 3.164 -0.916 5.636 10.534 753 1.012
beta[2] -0.002 0.000 0.013 -0.027 -0.002 0.025 2500 1.000
beta[3] 1.693 0.007 0.377 0.944 1.700 2.433 3111 1.000
beta[4] -0.153 0.010 0.494 -1.146 -0.161 0.839 2577 1.000
beta[5] -0.562 0.009 0.470 -1.478 -0.567 0.394 2478 1.000
beta[6] 0.869 0.014 0.713 -0.530 0.867 2.274 2501 1.000
a 41.617 9.152 641.724 0.119 9.765 182.772 4916 1.000
alpha 11.755 1.010 41.049 0.639 3.193 82.591 1653 1.004
gamma 32.510 2.419 163.682 0.129 10.641 152.084 4580 1.000
sigma_w 0.611 0.014 0.226 0.201 0.612 1.058 253 1.013
Graphical Analysis
Figure 7: (a) Traceplot of the MOG-L model parameters; two chains were run, depicted in different colors. (b) Caterpillar plot of the MOG-L model parameters.
Figure 8: (a) Posterior density plot of the MOG-L model parameters. (b) Autocorrelation plot of the MOG-L model parameters; after lag 20 the autocorrelation declines towards zero.
Figure 9: The PPD plot of the MOG-L model shows that the posterior predictive density fits the data well, and the model is compatible with the given data.
4.5. Bayesian model Comparison
To compare the fitted models, we consider model selection criteria such as the Watanabe-Akaike information criterion (WAIC) [25] and the leave-one-out cross-validation information criterion (LOOIC); a lower value of either criterion indicates a better model fit. WAIC and LOOIC are fully Bayesian selection methods, unlike other information criteria. They estimate pointwise out-of-sample prediction accuracy from a fitted Bayesian model using the log-likelihood evaluated at the posterior simulations of the parameters [16]. Although WAIC is asymptotically equal to LOOIC, LOOIC is preferable because of its robustness in finite samples with weak priors or influential observations.
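WAIC and LOOIC can be computed from the pointwise log_lik stored in the generated quantities block using the loo package; a sketch for the MOG-E fit (the other models are analogous) is:

library(loo)
log_lik_moge <- extract_log_lik(MOGE, parameter_name = "log_lik", merge_chains = FALSE)
r_eff <- relative_eff(exp(log_lik_moge))  # relative efficiency of the draws
waic(log_lik_moge)                        # WAIC
loo(log_lik_moge, r_eff = r_eff)          # LOOIC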
Table 4: WAIC and LOOIC values for all models.
Model No of parameters WAIC LOOIC
MOG-E 3 667.9 673.3
MOG-W 4 671.1 675.8
MOG-L 4 670.5 674.0
From Table 4, it is evident that the WAIC and LOOIC values of the MOG-E model are lower than those of the MOG-W and MOG-L models, which means the MOG-E model is a better survival model for the kidney catheter data than the other models.
5. Discussion and Conclusion
In this paper, models based on the MOG-G family were fitted to real survival data including a random effect in the Bayesian setup, implemented in the STAN language using the package rstan of R. For all models, the Markov chains converge to the target distribution, and the covariate Sex is significant. Posterior predictive checks were carried out using the posterior predictive density plots for the MOG-E, MOG-W, and MOG-L models. In the PPD plots, the data y and the replicated data sets yrep show the same behaviour and look similar, which means the replicated data sets could have come from the same model as the given data, and all three models are adequate for predicting future values. Comparing the results obtained through LOOIC and WAIC, it can be concluded that the MOG-E model fits the data better than MOG-W and MOG-L.
References
[1] Yousof, H. M., Afify, A. Z., Nadarajah, S., Hamedani, G., and Aryal, G. R. (2018). The marshall-olkin generalized-g family of distributions with applications. Statistica, 78(3):273-295.
[2] Marshall, A. W. and Olkin, I. (1997). A new method for adding a parameter to a family of distributions with application to the exponential and weibull families. Biometrika, 84(3):641-652.
[3] Alizadeh, M., Tahir, M., Cordeiro, G. M., Mansoor, M., Zubair, M., and Hamedani, G. (2015). The kumaraswamy marshal-olkin family of distributions. Journal of the Egyptian Mathematical Society, 23(3):546-557.
[4] Alizadeh, M., Cordeiro, G. M., Brito, E. d., and B Demeetrio, C. G. (2015). The beta marshall-olkin family of distributions. Journal of Statistical Distributions and Applications, 2(1):1-18.
[5] Handique, L. and Chakraborty, S. (2016). The beta generalized marshall-olkin-g family of distributions. arXiv preprint arXiv:1608.05985.
[6] B Dias, C. R., Cordeiro, G. M., Alizadeh, M., Diniz Marinho, P. R., and Campos Coelho, H. F. (2016). Exponentiated marshall-olkin family of distributions. Journal of Statistical Distributions and Applications, 3(1):1-21.
[7] Chakraborty, S. and Handique, L. (2017). The generalized marshall-olkin-kumaraswamy-g family of distributions. Journal of data Science, 15(3):391-422.
[8] Handique, L. and Chakraborty, S. (2017). The beta generalized marshall-olkin kumaraswamy-g family of distributions with applications. Int. J. Agricult. Stat. Sci, 13(2):721-733.
[9] Handique, L., Chakraborty, S., and de Andrade, T. A. (2019). The exponentiated generalized marshall-olkin family of distribution: its properties and applications. Annals of Data Science, 6(3):391-411.
[10] Korkmaz, M. C., Cordeiro, G. M., Yousof, H. M., Pescim, R. R., Afify, A. Z., and Nadarajah, S. (2019). The weibull marshall-olkin family: Regression model and application to censored data. Communications in Statistics-Theory and Methods, 48(16):4171-4194.
[11] R Core Team (2021). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
[12] Stan Development Team (2017). Stan modeling language users guide and reference manual. Technical report.
[13] Duane, S., Kennedy, A. D., Pendleton, B. J., and Roweth, D. (1987). Hybrid monte carlo. Physics Letters B, 195(2):216-222.
[14] Neal, R. M. et al. (2011). MCMC using hamiltonian dynamics. Handbook of Markov Chain Monte Carlo, 2(11):2.
[15] Hoffman, M. D. and Gelman, A. (2014). The no-u-turn sampler: adaptively setting path lengths in hamiltonian monte carlo. Journal of Machine Learning Research, 15(1):1593-1623.
[16] Gelman, A., Stern, H. S., Carlin, J. B., Dunson, D. B., Vehtari, A., and Rubin, D. B. (2014). Bayesian Data Analysis. Chapman and Hall/CRC.
[17] Collett, D. (2015). Modelling Survival Data in Medical Research. CRC Press.
[18] McGilchrist, C. and Aisbett, C. (1991). Regression with frailty in survival analysis. Biometrics, 461-466.
[19] Therneau, T. M. and Lumley, T. (2014). Package 'survival'. Survival analysis published on CRAN, 2:3.
[20] Khan, N. and Khan, A. A. (2018). Bayesian analysis of topp-leone generalized exponential distribution. Austrian Journal of Statistics, 47(4):1-15.
[21] Akhtar, M. T. and Khan, A. A. (2014). Bayesian analysis of generalized log-burr family with R. SpringerPlus, 3(1):185.
[22] Gelman, A. et al. (2006). Prior distributions for variance parameters in hierarchical models. Bayesian Analysis, 1(3):515-534.
[23] McElreath, R. (2018). Statistical Rethinking: A Bayesian Course with Examples in R and Stan. Chapman and Hall/CRC.
[24] Lawless, J. F. (2011). Statistical Models and Methods for Lifetime Data. John Wiley and Sons.
[25] Watanabe, S. (2010). Asymptotic equivalence of bayes cross validation and widely applicable information criterion in singular learning theory. Journal of Machine Learning Research, 11(12):3571-3594.