ON ESTIMATION AND PREDICTION FOR THE XLINDLEY DISTRIBUTION BASED ON RECORD
DATA
F. Zanjiran1, S.M.T.K. MirMostafaee2,*
1,2Department of Statistics, University of Mazandaran, Iran
1zanjiran@irstat.ir, 2m.mirmostafaee@mail.umz.ac.ir
Abstract
This paper investigates the estimation of the unknown parameter of the XLindley distribution using record values and inter-record times, in both classical and Bayesian frameworks. It also delves into Bayesian prediction of a future record value. We also study the problem of estimation and prediction for the XLindley distribution based on lower records alone. A simulation study, as well as an analysis of a real data example, is conducted for comparison and illustration. The numerical findings underline that including the inter-record times in the study may enhance the performance of the estimators and predictors.
Keywords: XLindley distribution, lower record values, inter-record times, Bayesian estimation and prediction.
1. Introduction
The XLindley distribution was first proposed by [8] as an effective new distribution in modeling lifetime data. Suppose that X is a random variable following the one-parameter XLindley distribution. The probability density function (PDF) and cumulative distribution function (CDF) of X are given by
\[
f(x;\theta) = \frac{\theta^2}{(1+\theta)^2}\,(2+\theta+x)\,e^{-\theta x}, \qquad x>0, \quad (1)
\]
and
\[
F(x;\theta) = 1 - \left(1 + \frac{\theta x}{(1+\theta)^2}\right)e^{-\theta x}, \qquad x>0, \quad (2)
\]
respectively, where $\theta>0$.
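As a quick numerical companion to (1) and (2), the following snippet (our own illustrative sketch, not part of the original analysis; the paper's computations were carried out in R, but Python is used here) checks that the PDF integrates to one and agrees with the numerical derivative of the CDF.

```python
import math

def xlindley_pdf(x, theta):
    # f(x; theta) = theta^2/(1+theta)^2 * (2 + theta + x) * exp(-theta*x), cf. (1)
    return theta ** 2 / (1 + theta) ** 2 * (2 + theta + x) * math.exp(-theta * x)

def xlindley_cdf(x, theta):
    # F(x; theta) = 1 - (1 + theta*x/(1+theta)^2) * exp(-theta*x), cf. (2)
    return 1 - (1 + theta * x / (1 + theta) ** 2) * math.exp(-theta * x)

theta, h = 1.0, 1e-4
# left Riemann sum of the PDF over [0, 20]; the neglected tail is O(e^{-20})
area = sum(xlindley_pdf(k * h, theta) * h for k in range(200000))
assert abs(area - 1.0) < 1e-3
# central difference of the CDF should reproduce the PDF
x = 0.7
num_deriv = (xlindley_cdf(x + h, theta) - xlindley_cdf(x - h, theta)) / (2 * h)
assert abs(num_deriv - xlindley_pdf(x, theta)) < 1e-5
```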
We write $X \sim \mathrm{XL}(\theta)$ if the PDF of $X$ is given by (1). The XLindley distribution enjoys an increasing hazard rate function. Chouia and Zeghdoudi [8] demonstrated that the XLindley distribution can fit better than some other one-parameter distributions such as the exponential, xgamma and Lindley distributions. Owing to the flexibility of the XLindley model, several inferential studies have been carried out since its inception. For example, Alotaibi et al. [2] addressed the estimation problem for the XLindley distribution using adaptive Type-II progressive hybrid censored data, Nassar et al. [31] investigated the reliability estimation of XLindley constant-stress partially accelerated life tests using progressively censored samples, and Alotaibi et al. [3] worked on the reliability estimation under normal operating conditions
*Corresponding Author
for progressively Type-II XLindley censored data. Moreover, Metiri et al. [29] focused on the characterization of XLindley distribution using the relation between the truncated moment and failure rate function or reverse failure rate function.
Suppose that $\{X_n,\ n=1,2,\cdots\}$ is a sequence of independent and identically distributed random variables. If an observation $X_j$ is less than all its preceding observations, then it is termed a lower record value. Similarly, upper record values can be defined based on the comparisons with preceding observations in the sequence. The sequence of lower record values along with the inter-record times can be denoted by $(R, T) = \{R_1, T_1, R_2, T_2, \cdots, R_{m-1}, T_{m-1}, R_m\}$, where $R_i$ represents the $i$-th record value and $T_i$ is the $i$-th inter-record time, i.e., the number of observations needed after the occurrence of $R_i$ to obtain the next record value $R_{i+1}$. Record data play a crucial role in various practical scenarios, see for example [5]. Record values and the related subjects have been studied by many authors; see for example [1, 11, 12, 28]. For instance, Samaniego and Whitaker [38] explored the estimation problem of the mean parameter of the exponential distribution using records and inter-record times. Doostparast [9] delved into the Bayesian and non-Bayesian estimation of the two parameters of the exponential distribution based on records and inter-record times. In a similar study, Doostparast et al. [10] investigated the Bayesian estimation of the parameters of the Pareto distribution utilizing records and inter-record times. Kızılaslan and Nadar [21] estimated the parameter of the proportional reversed hazard rate model based on records and inter-record times. Nadar and Kızılaslan [30] discussed inferential methods for the Burr type XII distribution using record values and inter-record times. Additionally, Kızılaslan and Nadar [22, 23] centered their research on inferential procedures for the generalized exponential and Kumaraswamy distributions based on record values and inter-record time statistics, respectively.
Amini and MirMostafaee [4] examined interval prediction of future order statistics from the exponential distribution based on records given the inter-record times. Pak and Dey [32] developed inferential procedures for the estimation of parameters and prediction of future record values for the power Lindley model using lower record values and inter-record times. Kumar et al. [24] directed their attention towards the estimation and prediction for the unit-Gompertz distribution based on records and inter-record times. Bastan and MirMostafaee [6] explored inferential problems for the Poisson-exponential distribution based on record values and inter-record times. Khoshkhoo Amiri and MirMostafaee [19] studied estimation and prediction issues for the xgamma distribution based on lower records and inter-record times. Most recently, Khoshkhoo Amiri and MirMostafaee [20] addressed the estimation and prediction problems for the Chen distribution, utilizing lower records and inter-record times.
In this paper, we intend to discuss estimation and prediction for the XLindley distribution based on lower records and inter-record times, as well as based on lower records alone. In what follows, first, we obtain maximum likelihood (ML) estimates and asymptotic confidence intervals (ACIs) for the parameter of the XLindley distribution in Section 2. In Section 3, we go through the Bayesian estimation method and find the Bayes estimates of the parameter under a symmetric loss function and an asymmetric loss function. The Bayes estimates do not seem to be expressible in closed forms, so we resort to an approximation method, namely the Metropolis-Hastings algorithm. Section 4 is devoted to the Bayesian prediction of a future lower record value. A simulation study and a real data example are given in Section 5. The numerical outcomes highlight the effect of incorporating inter-record times on the performance of the estimators and predictors. The paper is concluded with several remarks in Section 6.
2. Maximum Likelihood Estimation
In this section, we proceed to obtain the ML estimates, as well as ACIs, for the unknown parameter $\theta$ of the XLindley model based on record data. The record data are obtained through an inverse sampling scheme, where the units are sequentially observed until the $m$-th record occurs. Additionally, for ease of computation, the $m$-th inter-record time is assumed to be one.
2.1. ML Estimation Based on Records and Inter-Record Times
In this subsection, our attention shifts towards the ML estimate and an ACI for the parameter. Let $r=(r_1,\cdots,r_m)$ and $t=(t_1,\cdots,t_{m-1})$ be the observed sets of $R=\{R_1,\cdots,R_m\}$ and $T=\{T_1,\cdots,T_{m-1}\}$, respectively, coming from the $\mathrm{XL}(\theta)$ distribution. Then, the likelihood function of $\theta$, given the observed lower records and inter-record times, becomes
\[
L(\theta; r, t) = \prod_{i=1}^{m} f(r_i)\,[1-F(r_i)]^{t_i-1}
= \left(\frac{\theta}{1+\theta}\right)^{2m} e^{-\theta\sum_{i=1}^{m} r_i} \prod_{i=1}^{m} (2+\theta+r_i)\,[\zeta(r_i,\theta)]^{t_i-1}, \quad (3)
\]
where
\[
\zeta(x,\theta) = \left(1 + \frac{\theta x}{(1+\theta)^2}\right)e^{-\theta x}, \qquad \theta>0. \quad (4)
\]
It is important to note that $t_m$ is set to one for the sake of simplifying the equations. Therefore, the resulting log-likelihood function can be expressed as
\[
l(\theta; r, t) = 2m\ln\theta - 2m\ln(1+\theta) - \theta\sum_{i=1}^{m} r_i + \sum_{i=1}^{m}\ln(2+\theta+r_i) + \sum_{i=1}^{m}(t_i-1)\ln\zeta(r_i,\theta).
\]
Upon taking the partial derivative of the log-likelihood function with respect to (w.r.t.) $\theta$ and setting it equal to zero, we get
\[
\frac{\partial l(\theta; r, t)}{\partial\theta} = \frac{2m}{\theta(1+\theta)} - \sum_{i=1}^{m} r_i + \sum_{i=1}^{m}\frac{1}{2+\theta+r_i} + \sum_{i=1}^{m}(t_i-1)\,\frac{\zeta'(r_i,\theta)}{\zeta(r_i,\theta)} = 0,
\]
where
\[
\zeta'(x,\theta) = \frac{\partial\zeta(x,\theta)}{\partial\theta} = -x\,e^{-\theta x}\left(1 + \frac{\theta x}{(1+\theta)^2} + \frac{\theta-1}{(1+\theta)^3}\right), \qquad \theta>0. \quad (5)
\]
The ML estimate of $\theta$ may be determined by solving the above equation. However, the equation admits no explicit solution, which necessitates the use of a numerical method. Subsequently, our focus shifts to constructing an ACI for the parameter $\theta$.
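Since the score equation has no explicit solution, any one-dimensional optimizer will do. The following sketch (our own illustration of this step, coded in Python although the paper's computations were done in R) maximizes the log-likelihood by golden-section search, using the record data of the real example in Section 5; the ML estimate 0.9535 reported there in Table 4 (Case I) serves as a check.

```python
import math

def zeta(x, theta):
    # zeta(x, theta) = (1 + theta*x/(1+theta)^2) * exp(-theta*x), cf. (4)
    return (1 + theta * x / (1 + theta) ** 2) * math.exp(-theta * x)

def loglik(theta, r, t):
    # log-likelihood based on records and inter-record times, with t_m set to 1
    m = len(r)
    val = 2 * m * math.log(theta) - 2 * m * math.log(1 + theta) - theta * sum(r)
    val += sum(math.log(2 + theta + ri) for ri in r)
    val += sum((ti - 1) * math.log(zeta(ri, theta)) for ri, ti in zip(r, t))
    return val

def mle(r, t, lo=1e-3, hi=20.0, iters=200):
    # golden-section search for the maximizer (the log-likelihood is unimodal here)
    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    for _ in range(iters):
        c, d = b - g * (b - a), a + g * (b - a)
        if loglik(c, r, t) > loglik(d, r, t):
            b = d
        else:
            a = c
    return (a + b) / 2

# lower records and inter-record times from the rainfall example (t_4 = 1)
r = [0.56, 0.29, 0.16, 0.03]
t = [3, 10, 6, 1]
theta_hat = mle(r, t)
assert abs(theta_hat - 0.9535) < 0.02
```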
In this context, Fisher's information is defined as $I(\theta) = -\mathrm{E}\!\left[\dfrac{\partial^2 \ln f_\theta(R,T)}{\partial\theta^2}\right]$, if the integral exists, where $f_\theta(r,t)$ denotes the joint probability function of $R_1, T_1, R_2, T_2, \cdots, R_{m-1}, T_{m-1}, R_m$. The second partial derivative of the log-likelihood function w.r.t. $\theta$ is given by
\[
\frac{\partial^2 l(\theta; r, t)}{\partial\theta^2} = -\frac{2m(1+2\theta)}{[\theta(1+\theta)]^2} - \sum_{i=1}^{m}\frac{1}{(2+\theta+r_i)^2} + \sum_{i=1}^{m}(t_i-1)\,\frac{\zeta''(r_i,\theta)\,\zeta(r_i,\theta) - [\zeta'(r_i,\theta)]^2}{[\zeta(r_i,\theta)]^2},
\]
where
\[
\zeta''(x,\theta) = \frac{\partial\zeta'(x,\theta)}{\partial\theta} = x\,e^{-\theta x}\left(x + \frac{\theta x^2}{(1+\theta)^2} + \frac{2x(\theta-1)}{(1+\theta)^3} - \frac{2(2-\theta)}{(1+\theta)^4}\right), \qquad \theta>0. \quad (6)
\]
Let $\hat\theta_{ML}$ denote the ML estimator (MLE) of $\theta$. Then, the $100(1-\alpha)\%$ modified asymptotic two-sided equi-tailed confidence interval (MATE CI) for $\theta$ can be given by (see for example [25])
\[
\left(\max\left\{0,\ \hat\theta_{ML} - \frac{z_{\alpha/2}}{\sqrt{\hat I(\hat\theta_{ML})}}\right\},\ \hat\theta_{ML} + \frac{z_{\alpha/2}}{\sqrt{\hat I(\hat\theta_{ML})}}\right),
\]
where $z_\gamma$ represents the $\gamma$-th upper quantile of the standard normal distribution and
\[
\hat I(\hat\theta_{ML}) = -\left.\frac{\partial^2 l(\theta; R, T)}{\partial\theta^2}\right|_{\theta=\hat\theta_{ML}}.
\]
2.2. ML Estimation Based on Record Values
The likelihood function of $\theta$ given the lower records $r$ (without considering the inter-record times) is given by
\[
L^*(\theta; r) = \left[\prod_{i=1}^{m-1}\frac{f(r_i)}{F(r_i)}\right] f(r_m)
= \left(\frac{\theta}{1+\theta}\right)^{2m} e^{-\theta\sum_{i=1}^{m} r_i}\, \frac{\prod_{i=1}^{m}(2+\theta+r_i)}{\prod_{i=1}^{m-1}[1-\zeta(r_i,\theta)]}, \quad (7)
\]
where $\zeta(x,\theta)$ is defined in (4).
The corresponding log-likelihood function of $\theta$ is then given by
\[
l^*(\theta; r) = 2m\ln\theta - 2m\ln(1+\theta) - \theta\sum_{i=1}^{m} r_i + \sum_{i=1}^{m}\ln(2+\theta+r_i) - \sum_{i=1}^{m-1}\ln[1-\zeta(r_i,\theta)]. \quad (8)
\]
Taking the first partial derivative of the log-likelihood (8) w.r.t. $\theta$ and equating it to zero, we have
\[
\frac{\partial l^*(\theta; r)}{\partial\theta} = \frac{2m}{\theta(1+\theta)} - \sum_{i=1}^{m} r_i + \sum_{i=1}^{m}\frac{1}{2+\theta+r_i} + \sum_{i=1}^{m-1}\frac{\zeta'(r_i,\theta)}{1-\zeta(r_i,\theta)} = 0,
\]
where $\zeta'(x,\theta)$ is defined in (5).
So the ML estimate may be obtained by solving the above equation with the help of a numerical technique.
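This step can be illustrated analogously to the previous subsection; the following sketch (our own Python illustration; the paper uses R) maximizes $l^*(\theta; r)$ in (8) for the records of the real example in Section 5, recovering the records-only ML estimate 1.8809 reported in Table 4 (Case II).

```python
import math

def zeta(x, theta):
    # zeta(x, theta) = (1 + theta*x/(1+theta)^2) * exp(-theta*x), cf. (4)
    return (1 + theta * x / (1 + theta) ** 2) * math.exp(-theta * x)

def loglik_records(theta, r):
    # log-likelihood l*(theta; r) in (8), based on lower records alone
    m = len(r)
    val = 2 * m * math.log(theta) - 2 * m * math.log(1 + theta) - theta * sum(r)
    val += sum(math.log(2 + theta + ri) for ri in r)
    val -= sum(math.log(1 - zeta(ri, theta)) for ri in r[:-1])
    return val

def mle_records(r, lo=1e-3, hi=20.0, iters=200):
    # golden-section search for the maximizer
    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    for _ in range(iters):
        c, d = b - g * (b - a), a + g * (b - a)
        if loglik_records(c, r) > loglik_records(d, r):
            b = d
        else:
            a = c
    return (a + b) / 2

# lower records from the rainfall example of Section 5
r = [0.56, 0.29, 0.16, 0.03]
theta_star = mle_records(r)
assert abs(theta_star - 1.8809) < 0.02
```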
The second partial derivative of (8) w.r.t. $\theta$ is obtained to be
\[
\frac{\partial^2 l^*(\theta; r)}{\partial\theta^2} = -\frac{2m(1+2\theta)}{[\theta(1+\theta)]^2} - \sum_{i=1}^{m}\frac{1}{(2+\theta+r_i)^2} + \sum_{i=1}^{m-1}\frac{\zeta''(r_i,\theta)[1-\zeta(r_i,\theta)] + [\zeta'(r_i,\theta)]^2}{[1-\zeta(r_i,\theta)]^2},
\]
where $\zeta''(x,\theta)$ is defined in (6).
Let $\hat\theta^*_{ML}$ denote the MLE of $\theta$ based on lower records alone. Following the same approach described in the previous subsection, the $100(1-\alpha)\%$ MATE CI for $\theta$ can be given by
\[
\left(\max\left\{0,\ \hat\theta^*_{ML} - \frac{z_{\alpha/2}}{\sqrt{\hat I^*(\hat\theta^*_{ML})}}\right\},\ \hat\theta^*_{ML} + \frac{z_{\alpha/2}}{\sqrt{\hat I^*(\hat\theta^*_{ML})}}\right),
\]
where
\[
\hat I^*(\hat\theta^*_{ML}) = -\left.\frac{\partial^2 l^*(\theta; R)}{\partial\theta^2}\right|_{\theta=\hat\theta^*_{ML}}.
\]
3. Bayesian Estimation
In the context of Bayesian estimation, the experimenter's information can be conveyed through a probability distribution for the parameter, referred to as the prior distribution. Due to the constraint that the parameter of the XLindley distribution must be positive, we use the popular gamma prior for $\theta$, whose PDF is given by
\[
\pi(\theta) = \frac{b^a\,\theta^{a-1} e^{-b\theta}}{\Gamma(a)}, \qquad \theta>0, \quad (9)
\]
where the positive hyperparameters $a$ and $b$ can be set based on the prior information available to the experimenter. In what follows, we focus on the Bayesian estimation of $\theta$ based on records and inter-record times and based on records alone.
3.1. Bayesian Estimation Based on Records and Inter-Record Times
Using (3) and the prior (9), we can derive the posterior density of $\theta$ given $r$ and $t$ as follows
\[
\pi(\theta|r,t) = \frac{1}{D}\,\frac{\theta^{2m+a-1}}{(1+\theta)^{2m}}\, e^{-\theta(b+\sum_{i=1}^{m} r_i)} \prod_{i=1}^{m}(2+\theta+r_i)\,[\zeta(r_i,\theta)]^{t_i-1},
\]
where $\zeta(x,\theta)$ is defined in (4) and
\[
D = \int_0^\infty \frac{\theta^{2m+a-1}}{(1+\theta)^{2m}}\, e^{-\theta(b+\sum_{i=1}^{m} r_i)} \prod_{i=1}^{m}(2+\theta+r_i)\,[\zeta(r_i,\theta)]^{t_i-1}\, d\theta.
\]
The squared error loss function (SELF) is widely used in Bayesian analyses. However, the SELF may not be appropriate for many real-world scenarios due to its equal weighting of overestimation and underestimation. An alternative asymmetric loss function is the linear-exponential loss function (LELF), proposed by [42], which is given by
\[
L_{LE}(\hat\theta,\theta) = b\left[\exp\{c(\hat\theta-\theta)\} - c(\hat\theta-\theta) - 1\right], \qquad b>0,\ c\neq 0,
\]
where $\hat\theta$ denotes an estimator of $\theta$.
Without loss of generality, we assume $b=1$. The appropriate determination of $c$ involves considering both its sign and magnitude. When $c>0$, overestimation is more serious than underestimation, and vice versa; see [43] for more details. The Bayes estimates of $\theta$ under the SELF and LELF become
\[
\hat\theta_{SE} = \int_0^\infty \theta\,\pi(\theta|r,t)\,d\theta = \frac{1}{D}\int_0^\infty \frac{\theta^{2m+a}}{(1+\theta)^{2m}}\, e^{-\theta(b+\sum_{i=1}^{m} r_i)} \prod_{i=1}^{m}(2+\theta+r_i)\,[\zeta(r_i,\theta)]^{t_i-1}\, d\theta,
\]
and
\[
\hat\theta_{LE} = -\frac{1}{c}\ln \mathrm{E}\!\left[e^{-c\theta}\,\middle|\, r,t\right]
= -\frac{1}{c}\ln\left[\frac{1}{D}\int_0^\infty \frac{\theta^{2m+a-1}}{(1+\theta)^{2m}}\, e^{-\theta(c+b+\sum_{i=1}^{m} r_i)} \prod_{i=1}^{m}(2+\theta+r_i)\,[\zeta(r_i,\theta)]^{t_i-1}\, d\theta\right],
\]
respectively, provided that the integrals exist.
It appears that the above Bayes estimates of $\theta$ may not be expressible in closed forms. Therefore, we resort to an approximation method, namely the Metropolis-Hastings (M-H) algorithm [27, 15]. An M-H algorithm suitable for our scenario can be outlined as follows.

Algorithm 1
Step 1. Start with an initial guess $\theta_0 = \hat\theta_{ML}$ and set $t=1$.
Step 2. Given $\theta_{t-1}$, generate $\theta^*$ from a truncated-normal distribution, $N(\theta_{t-1},\sigma^2)I_{\{\theta>0\}}$. Then, assign $\theta_t = \theta^*$ with probability
\[
p = \min\left\{\frac{\pi(\theta^*|r,t)\,q(\theta_{t-1}|\theta^*)}{\pi(\theta_{t-1}|r,t)\,q(\theta^*|\theta_{t-1})},\ 1\right\},
\]
where $q(x|b)$ represents the density of $N(b,\sigma^2)I_{\{x>0\}}$; otherwise set $\theta_t = \theta_{t-1}$.
Step 3. Set $t = t+1$ and repeat Step 2, $T$ times, where $T$ is a considerably large number. Then $\{\theta_{M+1}, \theta_{M+2}, \cdots, \theta_T\}$ constitutes the generated sample, where $M$ denotes the burn-in period.
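A minimal sketch of Algorithm 1 follows (our own Python illustration; the paper's computations were done in R). It assumes $\sigma = 1$, the prior $a = b = 0.1$, and the record data of the real example in Section 5; note that the Gaussian kernels in $q$ cancel in the acceptance ratio, leaving only the truncation normalizers $\Phi(\cdot/\sigma)$.

```python
import math, random

def zeta(x, theta):
    return (1 + theta * x / (1 + theta) ** 2) * math.exp(-theta * x)

def log_post(theta, r, t, a=0.1, b=0.1):
    # log of the posterior kernel pi(theta | r, t), up to an additive constant
    m = len(r)
    val = (2 * m + a - 1) * math.log(theta) - 2 * m * math.log(1 + theta)
    val -= theta * (b + sum(r))
    val += sum(math.log(2 + theta + ri) for ri in r)
    val += sum((ti - 1) * math.log(zeta(ri, theta)) for ri, ti in zip(r, t))
    return val

def norm_cdf(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def mh_sample(r, t, theta0, n_iter=20000, burn=5000, sigma=1.0, seed=1):
    rng = random.Random(seed)
    chain, theta = [], theta0
    for _ in range(n_iter):
        # propose from N(theta, sigma^2) truncated to (0, inf)
        prop = rng.gauss(theta, sigma)
        while prop <= 0:
            prop = rng.gauss(theta, sigma)
        # q(theta | prop) / q(prop | theta) = Phi(theta/sigma) / Phi(prop/sigma)
        log_ratio = (log_post(prop, r, t) - log_post(theta, r, t)
                     + math.log(norm_cdf(theta / sigma))
                     - math.log(norm_cdf(prop / sigma)))
        if rng.random() < math.exp(min(0.0, log_ratio)):
            theta = prop
        chain.append(theta)
    return chain[burn:]

r = [0.56, 0.29, 0.16, 0.03]
t = [3, 10, 6, 1]
sample = mh_sample(r, t, theta0=0.95)
post_mean = sum(sample) / len(sample)
assert all(th > 0 for th in sample)
assert 0.5 < post_mean < 1.5  # loose band around the paper's reported 0.9729
```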
The approximate Bayes point estimates of $\theta$ under the SELF and LELF are then given by
\[
\hat\theta_{SE,M} = \frac{1}{M^*}\sum_{t=M+1}^{T}\theta_t,
\]
and
\[
\hat\theta_{LE,M} = -\frac{1}{c}\ln\left[\frac{1}{M^*}\sum_{t=M+1}^{T} e^{-c\theta_t}\right],
\]
respectively, with $M^* = T - M$. In Section 5, we have taken $\sigma^2 = 1$.
Let $\theta_{(1)} \le \cdots \le \theta_{(M^*)}$ denote the ordered values of $\theta_{M+1}, \cdots, \theta_T$. Define the intervals $L_j(M^*) = \left(\theta_{(j)},\ \theta_{(j+[(1-\alpha)M^*])}\right)$ for $j = 1, 2, \cdots, M^* - [(1-\alpha)M^*]$. Consequently, the $100(1-\alpha)\%$ Chen and Shao short width credible interval (CSSW CrI) for $\theta$ can be represented as $L_q(M^*)$, where $q$ is determined such that [7]
\[
\theta_{(q+[(1-\alpha)M^*])} - \theta_{(q)} = \min_{1\le j\le M^*-[(1-\alpha)M^*]}\left(\theta_{(j+[(1-\alpha)M^*])} - \theta_{(j)}\right).
\]
3.2. Bayesian Estimation Based on Record Values
Using (7) and the prior (9), the posterior density of $\theta$ given $r$ is derived to be
\[
\pi^*(\theta|r) = \frac{1}{D^*}\,\frac{\theta^{2m+a-1}\, e^{-\theta(b+\sum_{i=1}^{m} r_i)}\prod_{i=1}^{m}(2+\theta+r_i)}{(1+\theta)^{2m}\prod_{i=1}^{m-1}[1-\zeta(r_i,\theta)]},
\]
where
\[
D^* = \int_0^\infty \frac{\theta^{2m+a-1}\, e^{-\theta(b+\sum_{i=1}^{m} r_i)}\prod_{i=1}^{m}(2+\theta+r_i)}{(1+\theta)^{2m}\prod_{i=1}^{m-1}[1-\zeta(r_i,\theta)]}\, d\theta.
\]
The Bayes estimates of $\theta$ under the SELF and LELF become
\[
\hat\theta^*_{SE} = \frac{1}{D^*}\int_0^\infty \frac{\theta^{2m+a}\, e^{-\theta(b+\sum_{i=1}^{m} r_i)}\prod_{i=1}^{m}(2+\theta+r_i)}{(1+\theta)^{2m}\prod_{i=1}^{m-1}[1-\zeta(r_i,\theta)]}\, d\theta,
\]
and
\[
\hat\theta^*_{LE} = -\frac{1}{c}\ln\left[\frac{1}{D^*}\int_0^\infty \frac{\theta^{2m+a-1}\, e^{-\theta(c+b+\sum_{i=1}^{m} r_i)}\prod_{i=1}^{m}(2+\theta+r_i)}{(1+\theta)^{2m}\prod_{i=1}^{m-1}[1-\zeta(r_i,\theta)]}\, d\theta\right],
\]
respectively, provided that the related integrals exist.
It appears that the above Bayes estimates of 0 may not have closed-form expressions. So we may use the M-H algorithm (similar to that described in Algorithm 1) to approximate these Bayes estimates, see Subsection 3.1. We can also obtain the 100(1 — a)% CSSW CrI for 0 using a similar approach detailed in Subsection 3.1.
4. Bayesian Prediction
Let $R_1, T_1, R_2, T_2, \cdots, R_{m-1}, T_{m-1}, R_m$ be the first $m$ lower record values and their corresponding inter-record times from $\mathrm{XL}(\theta)$. Let further $r=(r_1,\cdots,r_m)$ and $t=(t_1,\cdots,t_{m-1})$ be the observed sets of $R=\{R_1,\cdots,R_m\}$ and $T=\{T_1,\cdots,T_{m-1}\}$. We intend to predict the $s$-th unobserved lower record value, $R_s$, where $s>m$. Using the Markovian property of records, the conditional PDF of $R_s$ given $R=r$ and $T=t$, denoted by $f(r_s|\theta,r,t)$, is identical to the conditional PDF of $R_s$ given $R_m=r_m$, denoted by $f(r_s|\theta,r_m)$ (see for example [5, 20]). So, we have
\[
f(r_s|\theta,r,t) = f(r_s|\theta,r_m) = \frac{f(r_s;\theta)\,[Q(r_s,\theta)-Q(r_m,\theta)]^{s-m-1}}{F(r_m;\theta)\,\Gamma(s-m)}
= [Q(r_s,\theta)-Q(r_m,\theta)]^{s-m-1}\,\frac{\left(\frac{\theta}{1+\theta}\right)^2 (\theta+2+r_s)}{[1-\zeta(r_m,\theta)]\,\Gamma(s-m)}\, e^{-\theta r_s}, \quad (10)
\]
where $0 < r_s < r_m$, $Q(x,\theta) = -\ln(F(x;\theta))$ and $\zeta(x,\theta)$ is defined in (4).
The Bayes predictive density of $R_s$ given the lower records and inter-record times is obtained to be
\[
h(r_s|r,t) = \int_0^\infty f(r_s|\theta,r_m)\,\pi(\theta|r,t)\,d\theta.
\]
It can be easily seen that the associated posterior predictive density may not be obtained analytically. Thus, we estimate $h(r_s|r,t)$ by means of a sample generated using the M-H algorithm. Let $\{\theta_u,\ u=1,\cdots,M^*\}$ be the sample generated using Algorithm 1, where $M^* = T-M$. Then, an estimate of $h(r_s|r,t)$ is given by
\[
\hat h(r_s|r,t) = \frac{1}{M^*}\sum_{u=1}^{M^*} f(r_s|\theta_u, r_m).
\]
The approximate predictions of $R_s$ under the SELF and LELF (provided that they exist) can be obtained as
\[
\hat R_s^{SE,M} = \int_0^{r_m} r_s\,\hat h(r_s|r,t)\,dr_s = \frac{1}{M^*}\sum_{u=1}^{M^*}\int_0^{r_m} r_s\, f(r_s|\theta_u,r_m)\,dr_s, \quad (11)
\]
and
\[
\hat R_s^{LE,M} = -\frac{1}{c}\ln\left[\int_0^{r_m} e^{-c r_s}\,\hat h(r_s|r,t)\,dr_s\right]
= -\frac{1}{c}\ln\left[\frac{1}{M^*}\sum_{u=1}^{M^*}\int_0^{r_m} e^{-c r_s}\, f(r_s|\theta_u,r_m)\,dr_s\right], \quad (12)
\]
respectively.
A $100(1-\alpha)\%$ two-sided Bayesian prediction interval for $R_s$ is given by $(L(r,t), U(r,t))$, where $L(r,t)$ and $U(r,t)$ simultaneously satisfy
\[
\int_0^{L(r,t)} h(r_s|r,t)\,dr_s = \frac{\alpha}{2}, \qquad \int_0^{U(r,t)} h(r_s|r,t)\,dr_s = 1-\frac{\alpha}{2}.
\]
A $100(1-\alpha)\%$ approximate two-sided Bayesian prediction interval (ATB PI) for $R_s$ is given by $(L,U)$, where $L$ and $U$ simultaneously satisfy
\[
\frac{1}{M^*}\sum_{u=1}^{M^*}\int_0^{L} f(r_s|\theta_u,r_m)\,dr_s = \frac{\alpha}{2}, \qquad \frac{1}{M^*}\sum_{u=1}^{M^*}\int_0^{U} f(r_s|\theta_u,r_m)\,dr_s = 1-\frac{\alpha}{2}.
\]
4.1. Special Case: s = m + 1
For the special case, when $s=m+1$, $R_{m+1}$ given $R_m=r_m$ follows the XLindley distribution truncated to the interval $(0, r_m)$. So, we have
\[
f(r_{m+1}|\theta,r,t) = f(r_{m+1}|\theta,r_m) = \frac{f(r_{m+1};\theta)}{F(r_m;\theta)} = \frac{\left(\frac{\theta}{1+\theta}\right)^2(\theta+2+r_{m+1})}{1-\zeta(r_m,\theta)}\, e^{-\theta r_{m+1}}, \qquad 0 < r_{m+1} < r_m. \quad (13)
\]
Moreover, we have the following two relations
\[
\int_0^{r_m} r_{m+1}\, f(r_{m+1}|\theta,r_m)\,dr_{m+1} = \frac{\theta + 2 + \frac{2}{\theta} - \left[(\theta+2)(\theta r_m + 1) + r_m(\theta r_m + 2) + \frac{2}{\theta}\right] e^{-\theta r_m}}{(1+\theta)^2\,[1-\zeta(r_m,\theta)]},
\]
and
\[
\int_0^{r_m} e^{-c r_{m+1}}\, f(r_{m+1}|\theta,r_m)\,dr_{m+1} = \frac{\theta^2\left\{1+(\theta+c)(\theta+2) - \left[1+(\theta+c)(\theta+2+r_m)\right]e^{-(\theta+c)r_m}\right\}}{(\theta+c)^2(1+\theta)^2\,[1-\zeta(r_m,\theta)]}.
\]
Therefore, from (11) and (12), the approximate predictions of $R_{m+1}$ under the SELF and LELF can be obtained as
\[
\hat R_{m+1}^{SE,M} = \frac{1}{M^*}\sum_{u=1}^{M^*}\int_0^{r_m} r_{m+1}\, f(r_{m+1}|\theta_u,r_m)\,dr_{m+1}
= \frac{1}{M^*}\sum_{u=1}^{M^*} \frac{\theta_u + 2 + \frac{2}{\theta_u} - \left[(\theta_u+2)(\theta_u r_m+1) + r_m(\theta_u r_m+2) + \frac{2}{\theta_u}\right]e^{-\theta_u r_m}}{(1+\theta_u)^2\,[1-\zeta(r_m,\theta_u)]},
\]
and
\[
\hat R_{m+1}^{LE,M} = -\frac{1}{c}\ln\left[\frac{1}{M^*}\sum_{u=1}^{M^*}\int_0^{r_m} e^{-c r_{m+1}}\, f(r_{m+1}|\theta_u,r_m)\,dr_{m+1}\right]
= -\frac{1}{c}\ln\left[\frac{1}{M^*}\sum_{u=1}^{M^*} \frac{\theta_u^2\left\{1+(\theta_u+c)(\theta_u+2) - \left[1+(\theta_u+c)(\theta_u+2+r_m)\right]e^{-(\theta_u+c)r_m}\right\}}{(\theta_u+c)^2(1+\theta_u)^2\,[1-\zeta(r_m,\theta_u)]}\right],
\]
respectively.
Additionally, a $100(1-\alpha)\%$ ATB PI for $R_{m+1}$ is given by $(L,U)$, where $L$ and $U$ satisfy the following nonlinear equations
\[
\frac{1}{M^*}\sum_{u=1}^{M^*}\frac{1-\zeta(L,\theta_u)}{1-\zeta(r_m,\theta_u)} = \frac{\alpha}{2}, \qquad \frac{1}{M^*}\sum_{u=1}^{M^*}\frac{1-\zeta(U,\theta_u)}{1-\zeta(r_m,\theta_u)} = 1-\frac{\alpha}{2},
\]
where $\zeta(x,\theta)$ is defined in (4).
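Since the averaged conditional CDF on the left-hand sides is increasing in its argument, the two nonlinear equations can be solved by simple bisection. A sketch (our own illustration; the small set of $\theta_u$ values is hypothetical, standing in for an M-H sample):

```python
import math

def zeta(x, theta):
    return (1 + theta * x / (1 + theta) ** 2) * math.exp(-theta * x)

def avg_cdf(x, thetas, rm):
    # (1/M*) * sum_u [1 - zeta(x, theta_u)] / [1 - zeta(r_m, theta_u)]
    return sum((1 - zeta(x, th)) / (1 - zeta(rm, th)) for th in thetas) / len(thetas)

def solve_bound(target, thetas, rm, tol=1e-10):
    # bisection on (0, r_m): avg_cdf is 0 at x = 0 and 1 at x = r_m, increasing
    lo, hi = 0.0, rm
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if avg_cdf(mid, thetas, rm) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# hypothetical posterior draws of theta; r_m as in the real data example
thetas = [0.8, 0.95, 1.0, 1.1, 1.25]
rm, alpha = 0.03, 0.05
L = solve_bound(alpha / 2, thetas, rm)
U = solve_bound(1 - alpha / 2, thetas, rm)
assert 0 < L < U < rm
```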
Remark 1. Using the Markovian property of records, the conditional PDF of $R_s$ given $R=r$ is identical to the conditional PDF of $R_s$ given $R_m=r_m$ (see for example [5, 20]). Therefore, the approximate Bayesian point predictions and a $100(1-\alpha)\%$ ATB PI for $R_s$ based on record values alone can be obtained using the procedure described above, with the difference that the M-H sample, $\{\theta_u,\ u=1,\cdots,M^*\}$, must be generated based on records alone.
5. Numerical Illustration
This section involves a simulation study, as well as a real data analysis.
5.1. A Simulation Study
Here, we conduct a Monte Carlo simulation to evaluate the accuracy of the point and interval estimators and approximate predictors mentioned in this paper. In this simulation study, we set the number of replications to $N^* = 1000$. For each replication, we generate $(m+1)$ records and their associated inter-record times from $\mathrm{XL}(\theta)$. We consider the values of $m$ to be $m = 3, 4, 5$ and the values of the parameter to be $\theta = 0.5, 1$ and $2$. In the context of the Bayesian estimation, we use the approximate non-informative prior with $a = b = 0.1$. A few replications for which the predictions became negative were removed from the simulation.
We obtain the ML estimates and the approximate Bayes estimates based on the first $m$ records and their corresponding $(m-1)$ inter-record times, as well as based on the first $m$ records alone. Furthermore, we use Geweke's test [13], Raftery and Lewis's diagnostic [36, 37] and Heidelberger and Welch's convergence diagnostic [18] to assess the convergence of the generated M-H Markov chains. It is worth noting that Heidelberger and Welch [18] made use of or referenced the findings of [39, 16, 17, 40, 41]. In some cases, we have taken every second sampled value (and adjusted the number of sampled values accordingly) to ensure a convergent M-H Markov chain. All the final chains have sizes equal to 10000. Figure 1 shows the M-H Markov chains (the figure is for $m=4$ and $\theta=1$), from which the convergence of the M-H algorithm may be confirmed.
The performance of the different estimators is compared based on their estimated biases (biases for short) and estimated risks (ERs). Additionally, we evaluate the interval estimators
Figure 1: Plots of Markov chains for $\theta$; the left panel is for the case based on records and inter-record times, whereas the right panel is for the case based on lower records alone ($m=4$ and $\theta=1$).
and predictors using the average width (AW) and coverage probability (CP) criteria. If $\hat\theta$ is an estimator of $\theta$ and $\hat\theta_i$ is the corresponding estimate obtained in the $i$-th replication, then the bias and the ERs of $\hat\theta$ w.r.t. the SELF and LELF are given by
\[
\mathrm{Bias}(\hat\theta) = \frac{1}{N^*}\sum_{i=1}^{N^*}(\hat\theta_i - \theta), \quad (14)
\]
\[
\mathrm{ER}_S(\hat\theta) = \frac{1}{N^*}\sum_{i=1}^{N^*}(\hat\theta_i - \theta)^2, \quad (15)
\]
and
\[
\mathrm{ER}_L(\hat\theta) = \frac{1}{N^*}\sum_{i=1}^{N^*}\left(\exp[c(\hat\theta_i - \theta)] - c(\hat\theta_i - \theta) - 1\right), \quad (16)
\]
respectively.
The point and interval predictions for the $(m+1)$-th record value, namely $R_{m+1}$, are also calculated. In terms of prediction assessment, we consider the estimated bias (bias for short) and the estimated prediction risks (EPRs) w.r.t. the SELF and LELF for the point predictors, which are defined similarly to (14), (15) and (16), respectively. The simulation results are given in Table 1 for point estimation, Table 2 for point prediction and Table 3 for interval estimation and prediction. The results for point estimation and prediction in Tables 1 and 2 are provided for $m = 4$ and $5$ for the sake of brevity, whereas the results presented in Table 3 are provided for $m = 3, 4$ and $5$.
Based on Tables 1-3, we draw the following conclusions:
• The point estimators based on records and inter-record times outperform the corresponding point estimators based on records alone in terms of bias and ER in most cases. Additionally, the biases and EPRs of the approximate point predictors based on records and inter-record times are smaller than those of the approximate point predictors based on records alone in most cases, as well.
• The ERs of the point estimators for $\theta = 1$ and $2$ decrease w.r.t. $m$ in most cases, whereas the EPRs of the point predictors decrease w.r.t. $m$ for all selected values of $\theta$ without any exception.
• The AWs of the 95% approximate interval estimators and predictors based on records and inter-record times are less than those of the 95% approximate interval estimators and predictors based on records alone (except for one case for which they are equal up to 5 decimals).
• The CPs of the 95% approximate interval estimators and predictors are all equal to or close to the nominal value 0.95, as expected.
Table 1: The biases and ERs of the point estimators of $\theta$ based on records and inter-record times (first row) and based on records alone (second row).

                          m = 4                                      m = 5
                  bias     ERs      ER_L      ER_L        bias     ERs      ER_L      ER_L
                                    c = 0.5   c = -0.5                      c = 0.5   c = -0.5
θ = 0.5
MLE               0.0868   0.0971   0.0156    0.0101      0.0669   0.0483   0.0066    0.0056
                  1.6656   99.041   > 100     0.6617      0.1215   > 100    > 100     0.8858
Bayes (SELF)      0.0943   0.0991   0.0156    0.0104      0.0729   0.0510   0.0070    0.0059
                  0.6746   2.7953   1.7739    0.1732      0.6977   3.0185   2.6072    0.1802
Bayes (LELF)      0.0795   0.0821   0.0123    0.0089      0.0634   0.0457   0.0062    0.0053
  c = 0.5         0.3751   0.7616   0.1494    0.0684      0.3854   0.7803   0.1546    0.0697
Bayes (LELF)      0.1112   0.1246   0.0212    0.0126      0.0831   0.0573   0.0079    0.0066
  c = -0.5        1.7028   25.912   > 100     0.6366      1.7503   26.783   > 100     0.6545
θ = 1
MLE               0.2293   0.5745   0.2497    0.0466      0.1564   0.3371   0.1164    0.0300
                  7.4661   > 100    > 100     3.4686      4.8365   > 100    > 100     2.1844
Bayes (SELF)      0.2394   0.5045   0.1371    0.0443      0.1688   0.3186   0.0806    0.0297
                  1.2180   6.3531   43.106    0.3665      1.0927   5.8714   9.0584    0.3326
Bayes (LELF)      0.1652   0.3243   0.0602    0.0317      0.1186   0.2277   0.0412    0.0230
  c = 0.5         0.4764   1.1666   0.2182    0.1077      0.4107   1.0520   0.1980    0.0973
Bayes (LELF)      0.3505   1.2209   14.838    0.0718      0.2316   0.5204   0.5563    0.0406
  c = -0.5        3.9390   75.039   > 100     1.6089      3.4741   67.636   > 100     0.0481
θ = 2
MLE               0.5934   2.9392   2.0688    0.1996      0.4055   1.7905   0.9308    0.1353
                  9.4667   > 100    > 100     4.4454      9.9441   > 100    > 100     4.6788
Bayes (SELF)      0.5169   2.1067   0.7134    0.1628      0.3695   1.4141   0.4473    0.1166
                  1.5136   9.0327   7.3213    0.5359      1.5155   8.9470   6.8274    0.5322
Bayes (LELF)      0.2013   0.9833   0.1922    0.0943      0.1448   0.7758   0.1505    0.0759
  c = 0.5         0.1222   1.0090   0.1527    0.1139      0.1291   1.0102   0.1549    0.1128
Bayes (LELF)      1.0939   6.8180   59.371    0.3483      0.7305   3.6409   17.917    0.2150
  c = -0.5        6.5314   > 100    > 100     2.7891      6.6037   > 100    > 100     2.8190
Table 2: The biases and EPRs of the approximate Bayes point predictors of $R_{m+1}$ based on records and inter-record times (first row) and based on records alone (second row).

                          m = 4                                                 m = 5
            bias        EPR_S      EPR_L      EPR_L         bias        EPR_S      EPR_L      EPR_L
                                   c = 0.5    c = -0.5                             c = 0.5    c = -0.5
θ = 0.5
SELF        0.002842    0.017065   0.002167   0.002128      0.000548    0.006948   0.000912   0.000843
            0.005136    0.017106   0.002211   0.002098      0.001393    0.007154   0.000958   0.000854
LELF       -0.001371    0.017407   0.002159   0.002222     -0.000844    0.006759   0.000865   0.000837
  c = 0.5   0.000895    0.017163   0.002166   0.002155     -0.000003    0.006893   0.000899   0.000841
LELF        0.007078    0.017195   0.002235   0.002097      0.001943    0.007233   0.000975   0.000858
  c = -0.5  0.009345    0.017514   0.002317   0.002102      0.002780    0.007493   0.001031   0.000874
θ = 1
SELF        0.002136    0.002491   0.000312   0.000312      0.001369    0.000653   0.000082   0.000081
            0.002754    0.002453   0.000309   0.000305      0.001667    0.000663   0.000084   0.000082
LELF        0.001516    0.002487   0.000310   0.000313      0.001114    0.000645   0.000081   0.000080
  c = 0.5   0.002130    0.002437   0.000305   0.000305      0.001411    0.000651   0.000082   0.000081
LELF        0.002760    0.002504   0.000315   0.000312      0.001625    0.000664   0.000084   0.000082
  c = -0.5  0.003379    0.000248   0.000313   0.000307      0.001924    0.000678   0.000086   0.000084
θ = 2
SELF        0.000476    0.000378   0.000047   0.000047     -0.000272    0.000105   0.000013   0.000013
            0.000689    0.000397   0.000050   0.000049     -0.000208    0.000106   0.000013   0.000013
LELF        0.000381    0.000374   0.000047   0.000047     -0.000302    0.000105   0.000013   0.000013
  c = 0.5   0.000594    0.000392   0.000049   0.000049     -0.000239    0.000106   0.000013   0.000013
LELF        0.000572    0.000382   0.000048   0.000048     -0.000242    0.000105   0.000013   0.000013
  c = -0.5  0.000785    0.000401   0.000050   0.000050     -0.000177    0.000106   0.000013   0.000013
Table 3: The AWs and CPs of 95% approximate interval estimators and predictors based on records and inter-record times (first row) and based on records alone (second row).

              m = 3                  m = 4                  m = 5
              AW        CP           AW        CP           AW        CP
θ = 0.5
MATE CI       1.09579   0.962        0.83556   0.963        0.70605   0.956
              6.92765   0.964        5.89956   0.959        7.21595   0.960
CSSW CrI      1.03103   0.955        0.80036   0.952        0.68460   0.951
              2.98580   0.957        2.93030   0.952        3.00196   0.957
ATB PI        0.47067   0.945        0.23426   0.938        0.11708   0.948
              0.47077   0.942        0.23428   0.938        0.11708   0.949
θ = 1
MATE CI       2.55093   0.953        1.95860   0.954        1.61990   0.951
              11.8789   0.966        24.2695   0.977        16.4109   0.965
CSSW CrI      2.29057   0.954        1.82186   0.955        1.54003   0.946
              5.46901   0.962        5.77409   0.975        5.43254   0.960
ATB PI        0.18213   0.955        0.08794   0.946        0.04852   0.950
              0.18224   0.955        0.08797   0.946        0.04853   0.950
θ = 2
MATE CI       5.81158   0.946        4.58167   0.956        3.75708   0.956
              29.2559   0.962        32.4821   0.955        33.8740   0.967
CSSW CrI      4.66783   0.952        4.00381   0.958        3.39805   0.953
              8.67913   0.959        9.29240   0.952        9.29188   0.962
ATB PI        0.08007   0.935        0.03415   0.954        0.01698   0.950
              0.08012   0.937        0.03416   0.952        0.01699   0.950
5.2. Real Data Example
Here, we consider the following data on the amount of rainfall (in inches) recorded at the Los Angeles Civic Center in February from 1999 to 2018; visit the website of Los Angeles Almanac: www.laalmanac.com/weather/we08aa.php.
0.56, 5.54, 8.87, 0.29, 4.64, 4.89, 11.02, 2.37, 0.92, 1.64, 3.57, 4.27, 3.29, 0.16, 0.20, 3.58, 0.83, 0.79, 4.17, 0.03.
We have used the Kolmogorov-Smirnov (K-S) test to check whether the XLindley model fits the data. The K-S test statistic confirms that the XLindley distribution is quite suitable for fitting the above data (p-value greater than 0.5). We have extracted the lower records and the corresponding inter-record times as follows:

i      1      2      3      4
r_i    0.56   0.29   0.16   0.03
t_i    3      10     6      1
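The extraction above can be reproduced in a few lines; the following sketch (our own illustration) scans the rainfall series and recovers the lower records and inter-record times shown in the table.

```python
def lower_records(data):
    """Extract lower record values and inter-record times from a sequence."""
    records, times = [], []
    last_pos = 0
    for pos, x in enumerate(data, start=1):
        if not records or x < records[-1]:
            if records:
                # number of observations needed after the previous record
                times.append(pos - last_pos)
            records.append(x)
            last_pos = pos
    return records, times

rainfall = [0.56, 5.54, 8.87, 0.29, 4.64, 4.89, 11.02, 2.37, 0.92, 1.64,
            3.57, 4.27, 3.29, 0.16, 0.20, 3.58, 0.83, 0.79, 4.17, 0.03]
r, t = lower_records(rainfall)
assert r == [0.56, 0.29, 0.16, 0.03]
assert t == [3, 10, 6]
```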
Here, we have used the approximate non-informative prior with $a = b = 0.1$. We have computed the ML and approximate Bayes point estimates, along with the 95% approximate interval estimates of the parameter of the XLindley distribution. Additionally, we have derived the point predictions and 95% ATB PIs for the next future record, namely $R_5$. The numerical results of this example are given in Table 4, where Case I denotes the case based on records and inter-record times, whereas Case II denotes the case based on records alone. Our findings suggest that the subsequent lowest rainfall amount (after 2018) is expected to be around 0.015 inches, which is the predicted 5-th lower record value since 1999.
Table 4: The numerical results of the real data example.
Estimation   MLE      SELF     LELF (c = 0.5)   LELF (c = -0.5)   95% MATE CI        95% CSSW CrI
Case I       0.9535   0.9729   0.9405           1.0087            (0.2470, 1.6601)   (0.3781, 1.7523)
Case II      1.8809   1.8679   1.4782           2.9436            (0, 4.9336)        (0.0400, 4.6741)

Prediction   SELF      LELF (c = 0.5)   LELF (c = -0.5)   95% ATB PI
Case I       0.01495   0.01493          0.01497           (0.00074, 0.02924)
Case II      0.01488   0.01486          0.01490           (0.00073, 0.02923)
6. Concluding Remarks
The XLindley distribution was recently introduced by [8] as a flexible distribution for lifetime phenomena. In our study, first, we obtained the ML estimates of the XLindley parameter based on record values and inter-record times, as well as solely based on records. Then, we considered the Bayesian estimation of the parameter, employing both symmetric and asymmetric loss functions. The Bayesian point estimates involve integrals that seem to lack closed forms, so we have utilized the M-H method to evaluate them. Our study then extended to the prediction of future records; in particular, the immediate next lower record value was explored in detail as a special case. A simulation study has been conducted to evaluate the point and interval estimators of the unknown parameter of the XLindley distribution, along with the approximate point and interval predictors of a future lower record value. The simulation study revealed the impact of including the inter-record times on the performance of the estimators and predictors. Furthermore, a real data set containing rainfall data was analyzed, where a lower record value could serve as an indicator of an impending drought. The predicted values of the 5-th lower record have been obtained in the example. Summing up, the results
of this paper are anticipated to offer practical utility in the estimation and prediction in real
phenomena. All the computations of the paper were carried out using the statistical software R
[35], and the packages coda [33, 34], nleqslv [14] and truncnorm [26] therein.
Data Availability Statement
The data set used in this paper is provided in the manuscript.
Declaration of Conflicting Interests
The Authors declare that there is no conflict of interest.
Funding Details
This research received no specific grant from any funding agency in the public, commercial, or
not-for-profit sectors.
References
[1] Ahmadi, J. and MirMostafaee, S. M. T. K. (2009). Prediction intervals for future records and order statistics coming from two parameter exponential distribution. Statistics & Probability Letters, 79:977-983.
[2] Alotaibi, R., Nassar, M. and Elshahhat, A. (2022). Computational analysis of XLindley parameters using adaptive Type-II progressive hybrid censoring with applications in chemical engineering. Mathematics, 10:3355.
[3] Alotaibi, R., Nassar, M. and Elshahhat, A. (2023). Reliability estimation under normal operating conditions for progressively Type-II XLindley censored data. Axioms, 2023,12:352.
[4] Amini, M. and MirMostafaee, S. M. T. K. (2016). Interval prediction of order statistics based on records by employing inter-record times: A study under two parameter exponential distribution. Metodoloski Zvezki, 13:1-15.
[5] Arnold, B. C., Balakrishnan, N. and Nagaraja, H. N. (1998). Records. John Wiley & Sons.
[6] Bastan, F. and MirMostafaee, S. M. T. K. (2022). Estimation and prediction for the Poisson-exponential distribution based on records and inter-record times: A comparative study. Journal of Statistical Sciences, 15:381-405.
[7] Chen, M.-H. and Shao, Q.-M. (1999). Monte Carlo estimation of Bayesian credible and HPD intervals. Journal of Computational and Graphical Statistics, 8:69-92.
[8] Chouia, S. and Zeghdoudi, H. (2021). The XLindley distribution: Properties and application. Journal of Statistical Theory and Applications, 20:318-327.
[9] Doostparast, M. (2009). A note on estimation based on record data. Metrika, 69:69-80.
[10] Doostparast, M., Akbari, M. G. and Balakrishna, N. (2011). Bayesian analysis for the two-parameter Pareto distribution based on record values and times. Journal of Statistical Computation and Simulation, 81:1393-1403.
[11] Etemad Golestani, B., Ormoz, E. and MirMostafaee, S. M. T. K. (2024). Statistical inference for the inverse Lindley distribution based on lower record values. REVSTAT-Statistical Journal, accepted January 2024.
[12] Fallah, A., Asgharzadeh, A. and MirMostafaee, S. M. T. K. (2018). On the Lindley record values and associated inference. Journal of Statistical Theory and Applications, 17:686-702.
[13] Geweke, J. (1992). Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments. In Bayesian Statistics 4, Eds. J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, Clarendon Press, Oxford, UK, pp. 169-193.
[14] Hasselman, B. (2018). nleqslv: Solve systems of nonlinear equations. R package version 3.3.2, https://CRAN.R-project.org/package=nleqslv.
[15] Hastings, W. K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57:97-109.
[16] Heidelberger, P. and Welch, P. D. (1981). A spectral method for confidence interval generation and run length control in simulations. Communications of the ACM, 24:233-245.
[17] Heidelberger, P. and Welch, P. D. (1981). Adaptive spectral methods for simulation output analysis. IBM Journal of Research and Development, 25:860-876.
[18] Heidelberger, P. and Welch, P. D. (1983). Simulation run length control in the presence of an initial transient. Operations Research, 31:1109-1144.
[19] Khoshkhoo Amiri, Z. and MirMostafaee, S. M. T. K. (2023). Analysis for the xgamma distribution based on record values and inter-record times with application to prediction of rainfall and COVID-19 records. Statistics in Transition new series, 24:89-108.
[20] Khoshkhoo Amiri, Z. and MirMostafaee, S. M. T. K. (2024). Statistical inference for a two-parameter distribution with a bathtub-shaped or increasing hazard rate function based on record values and inter-record times with an application to COVID-19 data. Journal of Statistical Computation and Simulation, DOI:10.1080/00949655.2024.2310682.
[21] Kizilaslan, F. and Nadar, M. (2014). Estimations for proportional reversed hazard rate model distributions based on upper record values and inter-record times. Istatistik: Journal of The Turkish Statistical Association, 7:55-62.
[22] Kizilaslan, F. and Nadar, M. (2015). Estimation with the generalized exponential distribution based on record values and inter-record times. Journal of Statistical Computation and Simulation, 85:978-999.
[23] Kizilaslan, F. and Nadar, M. (2016). Estimation and prediction of the Kumaraswamy distribution based on record values and inter-record times. Journal of Statistical Computation and Simulation, 86:2471-2493.
[24] Kumar, D., Dey, S., Ormoz, E. and MirMostafaee, S. M. T. K. (2020). Inference for the unit-Gompertz model based on record values and inter-record times with an application. Rendiconti del Circolo Matematico di Palermo Series 2, 69:1295-1319.
[25] Lehmann, E. L. and Casella, G. (1998). Theory of Point Estimation, Second Edition. Springer.
[26] Mersmann, O., Trautmann, H., Steuer, D. and Bornkamp, B. (2018). truncnorm: Truncated normal distribution. R package version 1.0-8, https://CRAN.R-project.org/package=truncnorm.
[27] Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H. and Teller, E. (1953). Equation of state calculations by fast computing machines. The Journal of Chemical Physics, 21:1087-1092.
[28] MirMostafaee, S. M. T. K., Asgharzadeh, A. and Fallah, A. (2016). Record values from NH distribution and associated inference. Metron, 74:37-59.
[29] Metiri, F., Zeghdoudi, H. and Ezzebsa, A. (2022). On the characterisation of X-Lindley distribution by truncated moments. Properties and application. Operations Research and Decisions, 32:97-109.
[30] Nadar, M. and Kizilaslan, F. (2015). Estimation and prediction of the Burr type XII distribution based on record values and inter-record times. Journal of Statistical Computation and Simulation, 85:3297-3321.
[31] Nassar, M., Alotaibi, R. and Elshahhat, A. (2023). Reliability estimation of XLindley constant-stress partially accelerated life tests using progressively censored samples. Mathematics, 11:1331.
[32] Pak, A. and Dey, S. (2019). Statistical inference for the power Lindley model based on record values and inter-record times. Journal of Computational and Applied Mathematics, 347:156-172.
[33] Plummer, M., Best, N., Cowles, K. and Vines, K. (2006). CODA: Convergence diagnosis and output analysis for MCMC. R News, 6:7-11.
[34] Plummer, M., Best, N., Cowles, K., Vines, K., Sarkar, D., Bates, D., Almond, R. and Magnusson, A. (2018). coda: Output analysis and diagnostics for MCMC, R package version 0.19-2, https://CRAN.R-project.org/package=coda.
[35] R Core Team (2024). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
[36] Raftery, A. E. and Lewis, S. M. (1992). Comment: One long run with diagnostics: Implementation strategies for Markov chain Monte Carlo. Statistical Science, 7:493-497.
[37] Raftery, A. E. and Lewis, S. M. (1996). Implementing MCMC. In Markov Chain Monte Carlo in Practice, Eds. W. R. Gilks, S. Richardson and D. J. Spiegelhalter, Chapman and Hall/CRC, Boca Raton, pp. 115-130.
[38] Samaniego, F. J. and Whitaker, L. R. (1986). On estimating population characteristics from record-breaking observations. I. Parametric results. Naval Research Logistics Quarterly, 33:531-543.
[39] Schruben, L. W. (1982). Detecting initialization bias in simulation output. Operations Research, 30:569-590.
[40] Schruben, L., Singh, H. and Tierney, L. (1980). A test of initialization bias hypotheses in simulation output. Technical Report 471, School of Operations Research and Industrial Engineering, Cornell University, Ithaca, New York, 14853.
[41] Schruben, L., Singh, H. and Tierney, L. (1983). Optimal tests for initialization bias in simulation output. Operations Research, 31:1167-1178.
[42] Varian, H. R. (1975). Bayesian approach to real estate assessment. In Studies in Bayesian Econometrics and Statistics in Honor of Leonard J. Savage, Eds. S. E. Fienberg and A. Zellner, North-Holland Pub. Co., Amsterdam, pp. 195-208.
[43] Zellner, A. (1986). Bayesian estimation and prediction using asymmetric loss functions. Journal of the American Statistical Association, 81:446-451.