DIFFERENT ESTIMATION METHODS FOR TOPP-LEONE-(A) MODEL: INFERENCE AND APPLICATIONS TO COMPLETE AND CENSORED DATASETS
Adubisi O. D.1
Adubisi C. E.2
1Department of Mathematics and Statistics, Federal University Wukari, Nigeria.
2Department of Physics, University of Ilorin, Nigeria.
Abstract
This work proposes a new two-parameter model, called the Topp-Leone-(A) (TL(A)) model. The main benefit of the new model is that its hazard rate function can be inverted-bathtub shaped, increasing or decreasing, depending on the shape parameter. Its structural properties, including the ordinary moments, quantiles, probability weighted moments, median, entropy and order statistics, are derived. The survival, failure rate, reversed failure rate and cumulative failure rate functions are also derived. Six classical estimation methods are discussed for estimating the parameters of the new model. Monte Carlo experiments and analyses of real datasets are conducted to examine the performance of the classical estimators. Finally, applications to complete and type-II right censored data demonstrate that the TL(A) model is more flexible than several well-known models in the statistical literature.
Keywords: (A)-model, Censoring, Estimation methods, Simulation, Topp-Leone-G class
1. Introduction
In the past decades, classical models have been used extensively for analysing datasets in demography, engineering, finance, medical and social sciences, environmental science, and biological and actuarial studies. In many practical circumstances, however, the classical models do not fit actual datasets adequately. Therefore, various generalizations and extensions of the classical models have been proposed and studied. For example, the inverse Gompertz model was pioneered by [1], the odd Frechet inverse exponential model was studied by [2], the Kumaraswamy inverse Gompertz model was introduced by [3], the odd exponentiated half-logistic exponential model was studied by [4], the Pareto exponential model was proposed by [5], the odd exponentiated skew-t model was pioneered by [6], the type-I half logistic skew-t model was studied by [7], the exponentiated half logistic skew-t model was introduced by [8], the exponentiated odd Lomax exponential model was studied by [9] and the polynomial exponential model was studied by [10], among others. [11] proposed the (A) model, which has only a scale parameter; this makes it unsuitable for modeling many real-life circumstances, hence the need to extend the (A) model to increase its flexibility and capability. The novelty and contribution of this study is the creation of a new two-parameter model known as the Topp-Leone-(A) (TL(A)) model, based on Eqs (3) and (4). The principal aims of this work are to: (i) use the TL-G class to improve the structural properties and flexibility of the (A) model; (ii) provide a new generalized version of the (A) model with a closed-form quantile function; (iii) investigate the important descriptive aspects of the TL(A) model, such as the mode, median, mean, variance (VAR), skewness (Sk), kurtosis (Ku), moments, moment generating function, entropy, probability weighted moment and order statistics; (iv) investigate statistical inference for the TL(A) model using six estimation methods for complete datasets, namely maximum likelihood (MLE), maximum product of spacing (MPS), Anderson-Darling (ANDA), ordinary least squares (OLS), weighted least squares (WLS) and Cramer-von Mises (CVM); and (v) provide better fits than competing generalized statistical models and assess the goodness of fit of the TL(A) model relative to its sub-model, the (A) model. Suppose that Z is a random variable. The cumulative distribution function (CDF) and probability density function (PDF) of the (A) model with scale parameter κ > 0 are, respectively, given by
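$$G(z) = e^{-\frac{1}{\kappa}\left(e^{\kappa/z}-1\right)}, \quad z > 0;\ \kappa > 0, \tag{1}$$
and
$$g(z) = \frac{1}{z^{2}}\, e^{\kappa/z}\, e^{-\frac{1}{\kappa}\left(e^{\kappa/z}-1\right)}, \quad z > 0;\ \kappa > 0. \tag{2}$$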
Recently, the Topp-Leone-G (TL-G) family has become an important generator that has attracted the interest of researchers in distribution theory. [12] introduced the CDF of the TL-G family as
$$F(z) = \left\{1 - [1 - G(z)]^{2}\right\}^{\eta}, \tag{3}$$
and the PDF corresponding to Eq (3) takes the form
$$f(z) = 2\eta\, g(z)\,[1 - G(z)]\left\{1 - [1 - G(z)]^{2}\right\}^{\eta-1}, \tag{4}$$
where η > 0 is a shape parameter, and G(z) and g(z) are the CDF and PDF of a baseline random variable (r.v.) Z.
The remaining parts are outlined as follows: Part 2 introduces the CDF and PDF of the TL(A) model. Part 3 presents several fundamental structural properties of the TL(A) model. Some essential functions used in reliability analysis are introduced in Part 4. The six classical approaches for estimating the parameters of the TL(A) model are discussed in Part 5. The maximum likelihood estimator of the TL(A) model for type-II right censored data is presented in Part 6. The performance of the TL(A) estimators is assessed in Part 7 using Monte Carlo experiments. Three real datasets, two complete and one type-II right censored, are analysed and the empirical results presented in Part 8. Finally, discussions and conclusions are presented in Part 9.
2. Topp-Leone-(A) model
2.1. Genesis of TL(A)
The non-negative r.v. Z is said to have the TL(A) model with parameter vector Υ = (κ, η), written Z ~ TL(A)(Υ), if its CDF takes the form
$$F(z) = \left\{1 - \left[1 - e^{-\frac{1}{\kappa}\left(e^{\kappa/z}-1\right)}\right]^{2}\right\}^{\eta}, \tag{5}$$
and the PDF corresponding to Eq (5) takes the form
$$f(z) = 2\eta\, z^{-2}\, e^{\kappa/z}\, e^{-\frac{1}{\kappa}\left(e^{\kappa/z}-1\right)}\left[1 - e^{-\frac{1}{\kappa}\left(e^{\kappa/z}-1\right)}\right]\left\{1 - \left[1 - e^{-\frac{1}{\kappa}\left(e^{\kappa/z}-1\right)}\right]^{2}\right\}^{\eta-1}, \tag{6}$$
where η > 0 and κ > 0 are the shape and scale parameters, and z > 0. Figure 1 depicts the shapes of the TL(A) PDF for selected values of the parameters η and κ. The PDF can be unimodal, increasing-decreasing, right-skewed, decreasing and heavy-tailed. The failure rate function of the new model, shown in Figure 3, can be inverted-bathtub shaped, increasing or decreasing.
Figure 1: The density function (PDF) plots of the TL(A) model.
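As a numerical companion to Eqs (5) and (6), the following Python sketch evaluates the TL(A) CDF and PDF. The function names `tla_cdf` and `tla_pdf` and the parameter values are illustrative assumptions, not part of the source. The code works on the log scale for stability and assumes κ/z stays below roughly 700 so that the inner exponential does not overflow.

```python
import math

def tla_cdf(z, kappa, eta):
    """TL(A) CDF, Eq (5), for z > 0, kappa > 0, eta > 0."""
    w = math.expm1(kappa / z) / kappa          # (e^{kappa/z} - 1) / kappa
    g = math.exp(-w)                           # baseline (A) CDF, Eq (1)
    # 1 - (1 - G)^2 equals G * (2 - G); use logs to avoid cancellation
    return math.exp(eta * (-w + math.log(2.0 - g)))

def tla_pdf(z, kappa, eta):
    """TL(A) PDF, Eq (6), evaluated as exp(log f)."""
    w = math.expm1(kappa / z) / kappa
    g = math.exp(-w)
    log_f = (math.log(2.0 * eta) - 2.0 * math.log(z) + kappa / z
             - eta * w                         # log of G^eta
             + math.log(-math.expm1(-w))       # log of 1 - G
             + (eta - 1.0) * math.log(2.0 - g))
    return math.exp(log_f)

# Illustrative parameter values (not taken from the source)
for z in (0.5, 1.0, 2.0, 5.0):
    print(z, tla_cdf(z, kappa=3.0, eta=0.5), tla_pdf(z, kappa=3.0, eta=0.5))
```

A quick sanity check is that a central finite difference of `tla_cdf` agrees with `tla_pdf`, which is how Eq (6) follows from Eq (5).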
3. Structural properties of TL(A) model
This part examines some fundamental structural properties of the TL(A) model.
3.1. Quantiles function
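The explicit forms of the quantile function and the median of the TL(A) model are presented in this subpart. The quantile function, found by inverting Eq (5), takes the form
$$z_u = Q(u;\Upsilon) = \frac{\kappa}{\log\left\{1 - \kappa \log\left[1 - \left(1 - u^{1/\eta}\right)^{1/2}\right]\right\}}, \quad 0 < u < 1. \tag{7}$$
By setting u = 0.5 in Eq (7), the median (M) takes the form
$$M = \frac{\kappa}{\log\left\{1 - \kappa \log\left[1 - \left(1 - 0.5^{1/\eta}\right)^{1/2}\right]\right\}}. \tag{8}$$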
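Because Eq (7) is available in closed form, random variates can be generated by inverse-transform sampling. The sketch below is one way to do this in Python; the names `tla_quantile` and `tla_sample` are hypothetical, and the logarithm of the inner term is rewritten algebraically (1 - sqrt(1 - t) = t / (1 + sqrt(1 - t))) purely for numerical stability.

```python
import math
import random

def tla_quantile(u, kappa, eta):
    """Quantile function of the TL(A) model, Eq (7), for 0 < u < 1."""
    t = u ** (1.0 / eta)
    # log[1 - sqrt(1 - t)] rewritten as log t - log(1 + sqrt(1 - t))
    log_g = math.log(u) / eta - math.log1p(math.sqrt(1.0 - t))
    return kappa / math.log1p(-kappa * log_g)

def tla_sample(size, kappa, eta, seed=0):
    """Inverse-transform sampling: plug uniform draws into Eq (7)."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < size:
        u = rng.random()
        if u > 0.0:                    # Eq (7) is defined on the open interval (0, 1)
            draws.append(tla_quantile(u, kappa, eta))
    return draws

median = tla_quantile(0.5, kappa=0.2, eta=0.2)   # Eq (8)
print(round(median, 3))
print(tla_sample(5, kappa=3.0, eta=0.5, seed=1))
```

For η = 0.2 and κ = 0.2 the median computed this way agrees with the value M = 0.331 reported in the first row of Table 1.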
3.2. Moments and moment generating function
If Z ~ TL(A)(Υ), then the rth ordinary moment (OM) of Z is found using
$$\mu'_r = E(Z^r) = \int_0^{\infty} z^r f(z;\Upsilon)\,dz. \tag{9}$$
Substituting Eq (6) into Eq (9), expanding using the Taylor series and invoking the beta function, the OM of the TL(A) model takes the form
$$\mu'_r = \sum_{i=0}^{\eta-1} \delta_i\, \Gamma(h - r + 1), \tag{10}$$
where
$$\delta_i = 2\eta \sum_{j,g=0}^{\infty}\sum_{l=0}^{g}\sum_{h=0}^{\infty} \binom{\eta-1}{i}\binom{2i+1}{j} \frac{(-1)^{g+i+j+l+1}\,(1+j)^{g}\,(1+g)^{h}\,\kappa^{r-g-1}\,l^{\,r-h-1}}{l!\,h!\,(g-l)!}.$$
Similarly, the moment generating function (MGF) of the TL(A) model, say M_Z(t), is found using
$$M_Z(t) = E\left(e^{tZ}\right) = \sum_{r=0}^{\infty}\frac{t^r}{r!}\int_0^{\infty} z^r f(z;\Upsilon)\,dz = \sum_{r=0}^{\infty}\frac{t^r}{r!}\,\mu'_r. \tag{11}$$
Substituting Eq (10) into Eq (11), the MGF takes the form
$$M_Z(t) = \sum_{r=0}^{\infty} \delta_r\, \Gamma(h - r + 1), \tag{12}$$
where
$$\delta_r = 2\eta \sum_{i=0}^{\eta-1}\sum_{j,g=0}^{\infty}\sum_{l=0}^{g}\sum_{h=0}^{\infty} \binom{\eta-1}{i}\binom{2i+1}{j} \frac{(-1)^{g+i+j+l+1}\,(1+j)^{g}\,(1+g)^{h}\,\kappa^{r-g-1}\,t^{r}\,l^{\,r-h-1}}{r!\,l!\,h!\,(g-l)!}.$$
3.3. Entropy measures
Entropy plays a crucial role in computer science, information theory, probability theory and engineering. It is regarded as a measure of the uncertainty associated with a random variable Z; see [13]. The Renyi entropy of Z, say I_τ(Z), is given by
$$I_\tau(Z) = \frac{1}{1-\tau}\log\int_0^{\infty} f^{\tau}(z)\,dz, \quad \tau > 0 \text{ and } \tau \neq 1. \tag{13}$$
If Z ~ TL(A)(Υ), substituting Eq (6) into Eq (13), expanding using the Taylor series and invoking the beta function, I_τ(Z) takes the form
$$I_\tau(Z) = \frac{1}{1-\tau}\log\left[\sum_{i=0}^{\tau(\eta-1)} \delta_i^{*}\, \Gamma(2\tau + h - 1)\right], \tag{14}$$
where
$$\delta_i^{*} = (2\eta)^{\tau} \sum_{j,g=0}^{\infty}\sum_{l=0}^{g}\sum_{h=0}^{\infty} \binom{\tau(\eta-1)}{i}\binom{2i+\tau}{j} \frac{(-1)^{g+i+j+l+1}\,(\tau+j)^{g}\,(\tau+g)^{h}\,\kappa^{-2\tau-g-1}\,l^{-2\tau-h-1}}{l!\,h!\,(g-l)!}.$$
3.4. Probability weighted moment
According to [14], the probability weighted moment (PWM) is a very useful quantity in mathematical statistics. The PWM of Z, say ζ_{s,r}, is given by
$$\zeta_{s,r} = E\left[Z^{s} F^{r}(Z)\right] = \int_0^{\infty} z^{s} F^{r}(z) f(z)\,dz. \tag{15}$$
If Z ~ TL(A)(Υ), substituting Eqs (5) and (6) into Eq (15), expanding using the Taylor series and invoking the beta function, the PWM ζ_{s,r} takes the form
$$\zeta_{s,r} = \sum_{a=0}^{\infty} \delta_a\, \Gamma(h - s + 1), \tag{16}$$
where
$$\delta_a = 2\eta \sum_{b=0}^{\eta(a+1)-1}\sum_{c,i=0}^{\infty}\sum_{g=0}^{i}\sum_{h=0}^{\infty} \binom{r}{a}\binom{\eta(a+1)-1}{b}\binom{2b+1}{c} \frac{(-1)^{b+c+i+g+1}\,(1+c)^{i}\,(1+i)^{h}\,\kappa^{s-i-1}\,g^{\,s-h-1}}{g!\,h!\,(i-g)!}.$$
3.5. Order statistics
Let z_1, z_2, ..., z_N be a random sample from a continuous distribution, and let z_{1:N} < z_{2:N} < ... < z_{N:N} be the order statistics (O.S) obtained from the sample. According to [15], the PDF of the pth O.S is given by
$$f_{p:N}(z) = \frac{f(z)}{B(p, N-p+1)}\,[F(z)]^{p-1}\,[1 - F(z)]^{N-p}, \tag{17}$$
where F(z) and f(z) are the CDF and PDF of the TL(A) model, and B(.,.) is the beta function. Expanding [1 - F(z)]^{N-p}, the PDF of the pth O.S takes the form
$$f_{p:N}(z) = \frac{1}{B(p, N-p+1)}\sum_{l=0}^{N-p}(-1)^{l}\binom{N-p}{l}[F(z)]^{p+l-1} f(z). \tag{18}$$
Substituting Eqs (5) and (6) into Eq (18) and then expanding, the PDF of the pth O.S takes the form
r / \ 2n N~p K(1+g) „ ^
fp:N (z)= B (p, N-p + 1) £ ^, (19)
where
L,=0 Lj,g=0 ^h=0 y l J { b J{ j J ~~
r7(p+/)-i fN - p\( n (p +l) - 1 V 2i + M (-i)g+h+,'+j+' (i+j)gK-g
3.6. Skewness, kurtosis, dispersion index and coefficient of variation
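The quantile function of the TL(A) model in Eq (7) can be used to investigate the effect of the shape parameter on the mean (ME), variance (VAR), standard deviation (STD), median (M), skewness (Sk), kurtosis (Ku), dispersion index (DI) and coefficient of variation (CV). [16] proposed a skewness measure based on the quartiles, known as the Bowley skewness. It is expressed as
$$Sk = \frac{Q\left(\tfrac{3}{4};\Upsilon\right) - 2Q\left(\tfrac{1}{2};\Upsilon\right) + Q\left(\tfrac{1}{4};\Upsilon\right)}{Q\left(\tfrac{3}{4};\Upsilon\right) - Q\left(\tfrac{1}{4};\Upsilon\right)}. \tag{20}$$
Likewise, [17] introduced a kurtosis measure based on the octiles, known as the Moors kurtosis. It is expressed as
$$Ku = \frac{Q\left(\tfrac{7}{8};\Upsilon\right) - Q\left(\tfrac{5}{8};\Upsilon\right) - Q\left(\tfrac{3}{8};\Upsilon\right) + Q\left(\tfrac{1}{8};\Upsilon\right)}{Q\left(\tfrac{6}{8};\Upsilon\right) - Q\left(\tfrac{2}{8};\Upsilon\right)}. \tag{21}$$
The DI shows whether a model is suitable for modeling equi-, under- or over-dispersed datasets. A distribution is considered equi-dispersed if DI = 1, under-dispersed if DI < 1 and over-dispersed if DI > 1. The DI is expressed as
$$DI = \frac{Var(Z)}{E(Z)} \approx \frac{\left[\dfrac{Q\left(\tfrac{3}{4};\Upsilon\right) - Q\left(\tfrac{1}{4};\Upsilon\right)}{1.35}\right]^{2}}{\dfrac{Q\left(\tfrac{1}{4};\Upsilon\right) + Q\left(\tfrac{1}{2};\Upsilon\right) + Q\left(\tfrac{3}{4};\Upsilon\right)}{3}}. \tag{22}$$
The CV is a relative measure of variability and is generally used to compare independent samples based on their variability. A large CV value indicates higher variability. The CV is expressed as
$$CV = \frac{[Var(Z)]^{1/2}}{E(Z)} \approx \frac{\dfrac{Q\left(\tfrac{3}{4};\Upsilon\right) - Q\left(\tfrac{1}{4};\Upsilon\right)}{1.35}}{\dfrac{Q\left(\tfrac{1}{4};\Upsilon\right) + Q\left(\tfrac{1}{2};\Upsilon\right) + Q\left(\tfrac{3}{4};\Upsilon\right)}{3}}, \tag{23}$$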
where Q(.) is the quantile function. The numerical values of the descriptive measures for the TL(A) model under selected values of η and κ are reported in Table 1. The following conclusions are reached:
1. The mean and standard deviation of the TL(A) model increase as the values of κ and η increase. The reported skewness values show that the TL(A) model is positively skewed. Also, as the values of κ and η increase, the skewness and kurtosis values generally decrease.
2. The TL(A) model is suitable for under-dispersed datasets, since DI < 1 for all parameter values considered. As the values of κ and η increase, the DI generally increases and the CV generally decreases.
Table 1: The descriptive measures for the TL(A) model.
Parameters Descriptive measures
η κ ME VAR STD M Sk Ku DI CV
0.2 0.2 0.375 0.072 0.268 0.331 0.365 1.140 0.192 0.715
0.5 0.2 0.664 0.224 0.473 0.592 0.340 1.055 0.337 0.712
1.0 0.5 1.148 0.507 0.712 1.044 0.322 0.997 0.442 0.620
2.0 0.5 1.667 1.061 1.030 1.520 0.317 0.978 0.636 0.618
2.5 1.0 2.097 1.378 1.174 1.933 0.310 0.959 0.657 0.560
3.0 1.5 2.501 1.708 1.307 2.321 0.306 0.943 0.683 0.523
3.5 1.5 2.683 1.991 1.411 2.489 0.306 0.944 0.742 0.526
3.5 2.0 2.885 2.048 1.431 2.691 0.302 0.931 0.710 0.496
4.5 2.5 3.417 2.683 1.638 3.196 0.299 0.923 0.785 0.479
5.0 3.0 3.765 3.042 1.744 3.533 0.297 0.915 0.808 0.463
6.5 3.5 4.383 3.988 1.997 4.117 0.296 0.912 0.910 0.456
7.0 4.0 4.705 4.360 2.088 4.429 0.294 0.906 0.927 0.444
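The quantile-based measures in Eqs (20) to (23) are straightforward to compute once Eq (7) is coded. The Python sketch below does this; the function names are illustrative assumptions, and the mean and standard deviation are the quartile-based approximations used in Eqs (22) and (23), not exact moments.

```python
import math

def tla_quantile(u, kappa, eta):
    """Quantile function of the TL(A) model, Eq (7)."""
    t = u ** (1.0 / eta)
    log_g = math.log(u) / eta - math.log1p(math.sqrt(1.0 - t))
    return kappa / math.log1p(-kappa * log_g)

def descriptive_measures(kappa, eta):
    """Quantile-based measures following Eqs (20)-(23)."""
    q = lambda u: tla_quantile(u, kappa, eta)
    q1, q2, q3 = q(0.25), q(0.5), q(0.75)
    mean = (q1 + q2 + q3) / 3.0                  # quartile-based mean
    std = (q3 - q1) / 1.35                       # quartile-based standard deviation
    var = std ** 2
    sk = (q3 - 2.0 * q2 + q1) / (q3 - q1)        # Eq (20)
    ku = (q(7 / 8) - q(5 / 8) - q(3 / 8) + q(1 / 8)) / (q(6 / 8) - q(2 / 8))  # Eq (21)
    return {"ME": mean, "VAR": var, "STD": std, "M": q2,
            "Sk": sk, "Ku": ku, "DI": var / mean, "CV": std / mean}

# First row of Table 1: eta = 0.2, kappa = 0.2
for name, value in descriptive_measures(kappa=0.2, eta=0.2).items():
    print(name, round(value, 3))
```

With η = 0.2 and κ = 0.2 this reproduces the first row of Table 1 up to rounding in the third decimal, which also confirms the sign pattern used in Eq (21).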
Figure 2 depicts the 3D plots of the mean, variance, skewness and kurtosis of the TL(A) model for some values of η and κ. The plots reveal that, as the values of η and κ increase, the skewness and kurtosis decrease while the mean and variance increase.
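Figure 2: The Mean, Variance, skewness and kurtosis plots of the TL(A) model.
4. Reliability analysis
4.1. Survival and failure rate functions
The survival (reliability) function of Z ~ TL(A)(Υ) takes the form
$$R(z) = 1 - \left\{1 - \left[1 - e^{-\frac{1}{\kappa}\left(e^{\kappa/z}-1\right)}\right]^{2}\right\}^{\eta}, \quad \eta, \kappa > 0. \tag{24}$$
The failure (hazard) rate function (HRF) of Z ~ TL(A)(Υ) takes the form
$$h(z) = \frac{2\eta\, z^{-2}\, e^{\kappa/z}\, e^{-\frac{1}{\kappa}\left(e^{\kappa/z}-1\right)}\left[1 - e^{-\frac{1}{\kappa}\left(e^{\kappa/z}-1\right)}\right]\left\{1 - \left[1 - e^{-\frac{1}{\kappa}\left(e^{\kappa/z}-1\right)}\right]^{2}\right\}^{\eta-1}}{1 - \left\{1 - \left[1 - e^{-\frac{1}{\kappa}\left(e^{\kappa/z}-1\right)}\right]^{2}\right\}^{\eta}}. \tag{25}$$
Moreover, if Z ~ TL(A)(Υ), then the reversed HRF takes the form
$$r(z) = \frac{2\eta\, z^{-2}\, e^{\kappa/z}\, e^{-\frac{1}{\kappa}\left(e^{\kappa/z}-1\right)}\left[1 - e^{-\frac{1}{\kappa}\left(e^{\kappa/z}-1\right)}\right]\left\{1 - \left[1 - e^{-\frac{1}{\kappa}\left(e^{\kappa/z}-1\right)}\right]^{2}\right\}^{\eta-1}}{\left\{1 - \left[1 - e^{-\frac{1}{\kappa}\left(e^{\kappa/z}-1\right)}\right]^{2}\right\}^{\eta}}, \tag{26}$$
and the cumulative HRF takes the form
$$H(z) = -\log\left(1 - \left\{1 - \left[1 - e^{-\frac{1}{\kappa}\left(e^{\kappa/z}-1\right)}\right]^{2}\right\}^{\eta}\right). \tag{27}$$
The shapes of the HRF of the TL(A) model for various selected values of η and κ are depicted in Figure 3. The HRF can be inverted-bathtub shaped, increasing or decreasing.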
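The reliability functions in Eqs (24) to (27) are all simple combinations of the PDF and CDF, which the following Python sketch makes explicit. Function names and parameter values are illustrative assumptions; the log-scale evaluation is a numerical convenience rather than anything prescribed by the source.

```python
import math

def _log_pdf_cdf(z, kappa, eta):
    """Return (log f(z), log F(z)) for the TL(A) model, Eqs (6) and (5)."""
    w = math.expm1(kappa / z) / kappa
    g = math.exp(-w)
    log_cdf = eta * (-w + math.log(2.0 - g))
    log_pdf = (math.log(2.0 * eta) - 2.0 * math.log(z) + kappa / z - eta * w
               + math.log(-math.expm1(-w)) + (eta - 1.0) * math.log(2.0 - g))
    return log_pdf, log_cdf

def survival(z, kappa, eta):
    """R(z) = 1 - F(z), Eq (24)."""
    return -math.expm1(_log_pdf_cdf(z, kappa, eta)[1])

def hazard(z, kappa, eta):
    """h(z) = f(z) / R(z), Eq (25)."""
    log_pdf, _ = _log_pdf_cdf(z, kappa, eta)
    return math.exp(log_pdf) / survival(z, kappa, eta)

def reversed_hazard(z, kappa, eta):
    """r(z) = f(z) / F(z), Eq (26)."""
    log_pdf, log_cdf = _log_pdf_cdf(z, kappa, eta)
    return math.exp(log_pdf - log_cdf)

def cumulative_hazard(z, kappa, eta):
    """H(z) = -log R(z), Eq (27)."""
    return -math.log(survival(z, kappa, eta))

# Illustrative values: the hazard rises and then falls (inverted bathtub)
for z in (0.05, 0.5, 1.0, 5.0, 100.0):
    print(z, hazard(z, kappa=2.0, eta=1.5))
```

For these illustrative parameters the hazard is near zero for very small z, peaks at moderate z and decays again for large z, which is the inverted-bathtub behaviour described above.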
5. Estimation methods
This part discusses estimating the TL(A) parameters via different estimation methods. The method of maximum likelihood (MLE), method of maximum product of spacing (MPS), methods of ordinary least squares (OLS) and weighted least squares (WLS), method of Cramer-Von Mises (CVM) and method of Anderson Darling (ANDA) are considered for the complete data.
5.1. The MLE
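The maximum likelihood (ML) method for estimating the unknown parameters of TL(A)(Υ) from complete samples is considered. Let z_1, z_2, ..., z_s be the observed values of a random sample of size s from TL(A)(Υ). Hence, the log-likelihood function L(Υ) based on Eq (6) takes the form
$$L(\Upsilon) = s\log(2\eta) - 2\sum_{j=1}^{s}\log z_j + \kappa\sum_{j=1}^{s}\frac{1}{z_j} - \frac{1}{\kappa}\sum_{j=1}^{s} v_j + \sum_{j=1}^{s}\log\left(1 - e^{-v_j/\kappa}\right) + (\eta-1)\sum_{j=1}^{s}\log\left[1 - \left(1 - e^{-v_j/\kappa}\right)^{2}\right], \tag{28}$$
where v_j = e^{κ/z_j} - 1. Taking the first partial derivatives of L(Υ) and equating them to zero gives the score equations U(Υ) = (∂L/∂η, ∂L/∂κ)^T = 0, whose components are
$$U_\eta(\Upsilon) = \frac{s}{\eta} + \sum_{j=1}^{s}\log\left[1 - \left(1 - e^{-v_j/\kappa}\right)^{2}\right] = 0 \tag{29}$$
and
$$U_\kappa(\Upsilon) = \sum_{j=1}^{s}\frac{1}{z_j} + \frac{1}{\kappa^{2}}\sum_{j=1}^{s} v_j - \frac{1}{\kappa}\sum_{j=1}^{s}\frac{e^{\kappa/z_j}}{z_j} - \sum_{j=1}^{s} c_j + 2(\eta-1)\sum_{j=1}^{s}\frac{\left(1 - e^{-v_j/\kappa}\right) e^{-v_j/\kappa}\left(v_j - \frac{\kappa}{z_j}e^{\kappa/z_j}\right)}{\kappa^{2}\left[1 - \left(1 - e^{-v_j/\kappa}\right)^{2}\right]} = 0, \tag{30}$$
where
$$c_j = \frac{\left(v_j - \frac{\kappa}{z_j}e^{\kappa/z_j}\right) e^{-v_j/\kappa}}{\kappa^{2}\left(1 - e^{-v_j/\kappa}\right)}.$$
The ML estimates η_ML and κ_ML of the parameters of TL(A)(Υ) are found by maximizing Eq (28) using the optim function in the R programming software.
Figure 3: The HRF plots of the TL(A) model.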
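The source maximizes the log-likelihood with R's optim. As a hedged alternative sketch in Python, note that Eq (29) can be solved for η in closed form when κ is fixed, η(κ) = -s / Σ log[1 - (1 - e^{-v_j/κ})²], so only κ needs a one-dimensional search. The routine below uses a coarse grid followed by a golden-section refinement; the function names, the search interval and the grid size are assumptions, and the code assumes κ/z_j stays well below 700 over that interval.

```python
import math

def loglik(data, kappa, eta):
    """Complete-sample log-likelihood of TL(A), Eq (28)."""
    total = 0.0
    for z in data:
        w = math.expm1(kappa / z) / kappa              # v_j / kappa
        g = math.exp(-w)
        total += (math.log(2.0 * eta) - 2.0 * math.log(z) + kappa / z - w
                  + math.log(-math.expm1(-w))          # log(1 - e^{-v_j/kappa})
                  + (eta - 1.0) * (-w + math.log(2.0 - g)))
    return total

def eta_given_kappa(data, kappa):
    """Closed-form root of the eta-score, Eq (29), for fixed kappa."""
    s_log = 0.0
    for z in data:
        w = math.expm1(kappa / z) / kappa
        s_log += -w + math.log(2.0 - math.exp(-w))     # log[1 - (1 - G_j)^2]
    return -len(data) / s_log

def fit_mle(data, lo=0.05, hi=20.0, grid=400, tol=1e-9):
    """Profile-likelihood search over kappa (grid, then golden section)."""
    profile = lambda k: loglik(data, k, eta_given_kappa(data, k))
    ks = [lo + (hi - lo) * i / grid for i in range(grid + 1)]
    best = max(range(grid + 1), key=lambda i: profile(ks[i]))
    a, b = ks[max(best - 1, 0)], ks[min(best + 1, grid)]
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    while b - a > tol:
        c, d = b - phi * (b - a), a + phi * (b - a)
        if profile(c) > profile(d):
            b = d
        else:
            a = c
    kappa = 0.5 * (a + b)
    return kappa, eta_given_kappa(data, kappa), profile(kappa)

# First dataset of Part 8 (relief times of twenty patients)
relief_times = [1.1, 1.4, 1.3, 1.7, 1.9, 1.8, 1.6, 2.2, 1.7, 2.7,
                4.1, 1.8, 1.5, 1.2, 1.4, 3.0, 1.7, 2.3, 1.6, 2.0]
print(fit_mle(relief_times))
```

The printed estimates can be compared with the TL(A) column of Table 5; small differences from the reported values are possible because the optimizer and its stopping rule differ from those used in the source.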
5.2. The OLS and WLS
Let z_(1:s), z_(2:s), ..., z_(s:s) be the ordered sample of size s from the CDF of TL(A)(Υ) in Eq (5). The ordinary least squares (OLS) estimates η_OLS and κ_OLS can be found by minimizing, with respect to η and κ, the function
$$OL(\eta,\kappa) = \sum_{j=1}^{s}\left[F\left(z_{(j:s)}\,|\,\eta,\kappa\right) - \xi(j,s)\right]^{2}, \tag{31}$$
where ξ(j,s) = j/(s + 1). Equivalently, the OLS estimates can be found by solving the non-linear equations
$$\sum_{j=1}^{s}\left[F\left(z_{(j:s)}\,|\,\eta,\kappa\right) - \xi(j,s)\right]\Delta_i\left(z_{(j:s)}\,|\,\eta,\kappa\right) = 0, \quad i = 1,2, \tag{32}$$
where
$$\Delta_1\left(z_{(j:s)}\,|\,\eta,\kappa\right) = \frac{\partial}{\partial\eta}F\left(z_{(j:s)}\,|\,\eta,\kappa\right), \qquad \Delta_2\left(z_{(j:s)}\,|\,\eta,\kappa\right) = \frac{\partial}{\partial\kappa}F\left(z_{(j:s)}\,|\,\eta,\kappa\right). \tag{33}$$
These equations can be solved numerically. For more details, see [18]. Likewise, the weighted least squares (WLS) estimates η_WLS and κ_WLS can be found by minimizing, with respect to η and κ, the function
$$WL(\eta,\kappa) = \sum_{j=1}^{s}\varphi(j,s)\left[F\left(z_{(j:s)}\,|\,\eta,\kappa\right) - \xi(j,s)\right]^{2}, \tag{34}$$
where φ(j,s) = (s + 1)²(s + 2)/[j(s - j + 1)]. The WLS estimates can also be found by solving the non-linear equations
$$\sum_{j=1}^{s}\varphi(j,s)\left[F\left(z_{(j:s)}\,|\,\eta,\kappa\right) - \xi(j,s)\right]\Delta_i\left(z_{(j:s)}\,|\,\eta,\kappa\right) = 0, \quad i = 1,2, \tag{35}$$
where Δ_1(.|η,κ) and Δ_2(.|η,κ) are given in Eq (33).
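The OLS and WLS criteria in Eqs (31) and (34) can be sketched in Python as follows. The function names are hypothetical, and the synthetic sample placed exactly at the plotting positions j/(s + 1) is an illustrative device: for such a sample both criteria vanish at the generating parameters.

```python
import math

def tla_cdf(z, kappa, eta):
    """TL(A) CDF, Eq (5)."""
    w = math.expm1(kappa / z) / kappa
    return math.exp(eta * (-w + math.log(2.0 - math.exp(-w))))

def tla_quantile(u, kappa, eta):
    """TL(A) quantile function, Eq (7)."""
    t = u ** (1.0 / eta)
    log_g = math.log(u) / eta - math.log1p(math.sqrt(1.0 - t))
    return kappa / math.log1p(-kappa * log_g)

def ols_objective(sample, kappa, eta):
    """OL(eta, kappa), Eq (31), with xi(j, s) = j / (s + 1)."""
    zs = sorted(sample)
    s = len(zs)
    return sum((tla_cdf(z, kappa, eta) - j / (s + 1.0)) ** 2
               for j, z in enumerate(zs, start=1))

def wls_objective(sample, kappa, eta):
    """WL(eta, kappa), Eq (34), with weights (s+1)^2 (s+2) / [j (s-j+1)]."""
    zs = sorted(sample)
    s = len(zs)
    total = 0.0
    for j, z in enumerate(zs, start=1):
        weight = (s + 1.0) ** 2 * (s + 2.0) / (j * (s - j + 1.0))
        total += weight * (tla_cdf(z, kappa, eta) - j / (s + 1.0)) ** 2
    return total

# Synthetic sample sitting exactly on the plotting positions (illustrative)
s = 10
sample = [tla_quantile(j / (s + 1.0), 3.0, 0.5) for j in range(1, s + 1)]
print(ols_objective(sample, 3.0, 0.5), ols_objective(sample, 4.0, 0.5))
print(wls_objective(sample, 3.0, 0.5), wls_objective(sample, 4.0, 0.5))
```

In practice either objective would be passed to a numerical minimizer over (η, κ); that step is omitted here to keep the sketch short.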
5.3. The MPS
The maximum product of spacing (MPS) estimator was proposed by [19, 20]. For an ordered sample z_(1:s), z_(2:s), ..., z_(s:s) from TL(A)(Υ), the uniform spacings are given by
$$D_j(\eta,\kappa) = F\left(z_{(j:s)}\,|\,\eta,\kappa\right) - F\left(z_{(j-1:s)}\,|\,\eta,\kappa\right), \quad j = 1,2,\ldots,s+1, \tag{36}$$
where F(z_(0:s)|η,κ) = 0, F(z_(s+1:s)|η,κ) = 1 and Σ_{j=1}^{s+1} D_j(η,κ) = 1. The MPS estimates η_MPS and κ_MPS can be found by maximizing the geometric mean (GM) of the spacings,
$$GM(\eta,\kappa) = \left[\prod_{j=1}^{s+1} D_j(\eta,\kappa)\right]^{\frac{1}{s+1}}, \tag{37}$$
with respect to η and κ, or equivalently by maximizing the logarithm of the GM of the spacings,
$$LGM(\eta,\kappa) = \frac{1}{s+1}\sum_{j=1}^{s+1}\log D_j(\eta,\kappa). \tag{38}$$
The MPS estimates η_MPS and κ_MPS of TL(A)(Υ) can also be found by solving the non-linear equations
$$\frac{1}{s+1}\sum_{j=1}^{s+1}\frac{1}{D_j(\eta,\kappa)}\left[\Delta_i\left(z_{(j:s)}\,|\,\eta,\kappa\right) - \Delta_i\left(z_{(j-1:s)}\,|\,\eta,\kappa\right)\right] = 0, \quad i = 1,2, \tag{39}$$
where Δ_1(.|η,κ) and Δ_2(.|η,κ) are given in Eq (33).
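The MPS criterion of Eq (38) can be sketched in Python as below. The function names are hypothetical. The sketch assumes no tied observations, since a zero spacing would make the logarithm undefined. By the arithmetic-geometric mean inequality, LGM can never exceed -log(s + 1), and it attains that bound only when all spacings are equal, which the synthetic example illustrates.

```python
import math

def tla_cdf(z, kappa, eta):
    """TL(A) CDF, Eq (5)."""
    w = math.expm1(kappa / z) / kappa
    return math.exp(eta * (-w + math.log(2.0 - math.exp(-w))))

def tla_quantile(u, kappa, eta):
    """TL(A) quantile function, Eq (7)."""
    t = u ** (1.0 / eta)
    log_g = math.log(u) / eta - math.log1p(math.sqrt(1.0 - t))
    return kappa / math.log1p(-kappa * log_g)

def mps_objective(sample, kappa, eta):
    """LGM(eta, kappa), Eq (38): mean log-spacing, to be maximized."""
    zs = sorted(sample)
    s = len(zs)
    # F(z_(0:s)) = 0 and F(z_(s+1:s)) = 1 close the sequence, Eq (36)
    cdf_vals = [0.0] + [tla_cdf(z, kappa, eta) for z in zs] + [1.0]
    spacings = [cdf_vals[j] - cdf_vals[j - 1] for j in range(1, s + 2)]
    return sum(math.log(d) for d in spacings) / (s + 1.0)

# Synthetic sample with equal spacings under kappa = 3, eta = 0.5 (illustrative)
s = 10
sample = [tla_quantile(j / (s + 1.0), 3.0, 0.5) for j in range(1, s + 1)]
print(mps_objective(sample, 3.0, 0.5), -math.log(s + 1.0))
print(mps_objective(sample, 4.0, 0.5))
```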
5.4. The ANDA
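The ANDA estimates η_ANDA and κ_ANDA can be found for TL(A)(Υ) by minimizing the function
$$AD(\eta,\kappa) = -s - \frac{1}{s}\sum_{j=1}^{s}(2j-1)\left[\log F\left(z_{(j:s)}\,|\,\eta,\kappa\right) + \log \bar{F}\left(z_{(s+1-j:s)}\,|\,\eta,\kappa\right)\right], \tag{40}$$
with respect to η and κ, where \bar{F}(.) = 1 - F(.). The ANDA estimates can also be found by solving the non-linear equations
$$\sum_{j=1}^{s}(2j-1)\left[\frac{\Delta_i\left(z_{(j:s)}\,|\,\eta,\kappa\right)}{F\left(z_{(j:s)}\,|\,\eta,\kappa\right)} - \frac{\Delta_i\left(z_{(s+1-j:s)}\,|\,\eta,\kappa\right)}{\bar{F}\left(z_{(s+1-j:s)}\,|\,\eta,\kappa\right)}\right] = 0, \quad i = 1,2, \tag{41}$$
where Δ_1(.|η,κ) and Δ_2(.|η,κ) are given in Eq (33).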
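A minimal Python sketch of the Anderson-Darling criterion in Eq (40) follows. The function names are hypothetical, and the second logarithm is taken of the survival function 1 - F, which is the usual form of this criterion. The example evaluates the criterion on the first dataset of Part 8 at the ML estimates reported in Table 5, purely as an illustration.

```python
import math

def tla_cdf(z, kappa, eta):
    """TL(A) CDF, Eq (5)."""
    w = math.expm1(kappa / z) / kappa
    return math.exp(eta * (-w + math.log(2.0 - math.exp(-w))))

def tla_quantile(u, kappa, eta):
    """TL(A) quantile function, Eq (7)."""
    t = u ** (1.0 / eta)
    log_g = math.log(u) / eta - math.log1p(math.sqrt(1.0 - t))
    return kappa / math.log1p(-kappa * log_g)

def anda_objective(sample, kappa, eta):
    """AD(eta, kappa), Eq (40), to be minimized."""
    zs = sorted(sample)
    s = len(zs)
    total = 0.0
    for j in range(1, s + 1):
        f_low = tla_cdf(zs[j - 1], kappa, eta)     # F at z_(j:s)
        f_high = tla_cdf(zs[s - j], kappa, eta)    # F at z_(s+1-j:s)
        total += (2 * j - 1) * (math.log(f_low) + math.log(1.0 - f_high))
    return -s - total / s

relief_times = [1.1, 1.4, 1.3, 1.7, 1.9, 1.8, 1.6, 2.2, 1.7, 2.7,
                4.1, 1.8, 1.5, 1.2, 1.4, 3.0, 1.7, 2.3, 1.6, 2.0]
print(anda_objective(relief_times, kappa=5.092, eta=0.228))
```

For a single observation located at the median, the criterion reduces to 2 log 2 - 1, which gives a convenient hand check of the implementation.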
5.5. The CVM
The CVM estimates η_CVM and κ_CVM of TL(A)(Υ) are found by minimizing the function
$$CV(\eta,\kappa) = \frac{1}{12s} + \sum_{j=1}^{s}\left[F\left(z_{(j:s)}\,|\,\eta,\kappa\right) - \frac{2(j-1)+1}{2s}\right]^{2}, \tag{42}$$
with respect to η and κ. The CVM estimates can also be found by solving the non-linear equations
$$\sum_{j=1}^{s}\left[F\left(z_{(j:s)}\,|\,\eta,\kappa\right) - \frac{2(j-1)+1}{2s}\right]\Delta_i\left(z_{(j:s)}\,|\,\eta,\kappa\right) = 0, \quad i = 1,2, \tag{43}$$
where Δ_1(.|η,κ) and Δ_2(.|η,κ) are given in Eq (33).
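The Cramer-von Mises criterion of Eq (42) can be sketched in Python as follows. The function names are hypothetical, and the synthetic sample placed at the positions (2j - 1)/(2s) is only an illustration: for such a sample the criterion equals its lower bound 1/(12s) at the generating parameters.

```python
import math

def tla_cdf(z, kappa, eta):
    """TL(A) CDF, Eq (5)."""
    w = math.expm1(kappa / z) / kappa
    return math.exp(eta * (-w + math.log(2.0 - math.exp(-w))))

def tla_quantile(u, kappa, eta):
    """TL(A) quantile function, Eq (7)."""
    t = u ** (1.0 / eta)
    log_g = math.log(u) / eta - math.log1p(math.sqrt(1.0 - t))
    return kappa / math.log1p(-kappa * log_g)

def cvm_objective(sample, kappa, eta):
    """CV(eta, kappa), Eq (42), to be minimized."""
    zs = sorted(sample)
    s = len(zs)
    return 1.0 / (12.0 * s) + sum(
        (tla_cdf(z, kappa, eta) - (2.0 * (j - 1) + 1.0) / (2.0 * s)) ** 2
        for j, z in enumerate(zs, start=1))

# Synthetic sample at the positions (2j - 1) / (2s) (illustrative)
s = 10
sample = [tla_quantile((2.0 * j - 1.0) / (2.0 * s), 3.0, 0.5) for j in range(1, s + 1)]
print(cvm_objective(sample, 3.0, 0.5), 1.0 / (12.0 * s))
print(cvm_objective(sample, 4.0, 0.5))
```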
6. MLE for type-II right censored data
A life-testing experiment is often terminated once a specified number of failures has been observed; the remaining objects are then said to be type-II right censored [21]. Let z_(1), z_(2), ..., z_(p), p < s, denote the ordered values of a random sample z_1, z_2, ..., z_s (failure times), where observation terminates after the pth failure occurs. Then the likelihood function (C_{T-II}) is
$$C_{T\text{-}II} = \frac{s!}{(s-p)!}\left[1 - F\left(z_{(p)};\Upsilon\right)\right]^{s-p}\prod_{j=1}^{p} f\left(z_{(j)};\Upsilon\right). \tag{44}$$
If z_1, z_2, ..., z_s is a random sample from TL(A)(Υ), then the log-likelihood function L**(Υ) of z_(1), z_(2), ..., z_(p), p < s, is
$$L^{**}(\Upsilon) = p\log(2\eta) + \log\left[\frac{s!}{(s-p)!}\right] + (s-p)\log\left\{1 - \left[1 - \left(1 - e^{-v_p/\kappa}\right)^{2}\right]^{\eta}\right\} - 2\sum_{j=1}^{p}\log z_{(j)} + \kappa\sum_{j=1}^{p}\frac{1}{z_{(j)}} - \frac{1}{\kappa}\sum_{j=1}^{p} v_j + \sum_{j=1}^{p}\log\left(1 - e^{-v_j/\kappa}\right) + (\eta-1)\sum_{j=1}^{p}\log\left[1 - \left(1 - e^{-v_j/\kappa}\right)^{2}\right], \tag{45}$$
where v_p = e^{κ/z_(p)} - 1 and v_j = e^{κ/z_(j)} - 1. The ML estimates η_ML and κ_ML of the unknown parameters of TL(A)(Υ) are found by maximizing Eq (45) using the optim function in the R programming software.
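The censored log-likelihood of Eq (45) can be sketched in Python as below. The function names and the `include_constant` flag are hypothetical; the flag only switches the term log[s!/(s - p)!] on or off, which does not affect the maximizer. The example uses the bearing data of Part 8.2 with the first p = 8 ordered failures and the estimates reported in Table 8.

```python
import math

def _log_pdf_cdf(z, kappa, eta):
    """Return (log f(z), log F(z)) for the TL(A) model."""
    w = math.expm1(kappa / z) / kappa              # v / kappa
    g = math.exp(-w)
    log_cdf = eta * (-w + math.log(2.0 - g))
    log_pdf = (math.log(2.0 * eta) - 2.0 * math.log(z) + kappa / z - w
               + math.log(-math.expm1(-w))
               + (eta - 1.0) * (-w + math.log(2.0 - g)))
    return log_pdf, log_cdf

def censored_loglik(observed, s, kappa, eta, include_constant=True):
    """Type-II right censored log-likelihood, Eq (45).

    observed: the p smallest failure times; s: total number of units on test.
    """
    zs = sorted(observed)
    p = len(zs)
    total = sum(_log_pdf_cdf(z, kappa, eta)[0] for z in zs)
    if p < s:
        log_cdf_p = _log_pdf_cdf(zs[-1], kappa, eta)[1]
        total += (s - p) * math.log(-math.expm1(log_cdf_p))   # log[1 - F(z_(p))]
    if include_constant:
        total += math.lgamma(s + 1) - math.lgamma(s - p + 1)  # log[s!/(s-p)!]
    return total

bearing_hours = [152.7, 172, 172.5, 173.5, 193, 204.7, 216.5, 234.9, 262.6, 422.6]
observed = sorted(bearing_hours)[:8]
print(censored_loglik(observed, s=10, kappa=539.86, eta=946.82, include_constant=False))
```

Evaluated without the constant term, the result can be compared with the LL value listed in Table 8.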
7. Monte Carlo experiments
In this part, the average estimates (AEs), absolute biases (AbsBs), mean square errors (MSEs) and mean relative errors (MREs) are computed for the TL(a) parameters (Pa.) using Monte Carlo experiments with complete samples.
7.1. Monte Carlo experiments based on complete data
These Monte Carlo experiments are executed in the R programming software. The sampling distributions are obtained for different sample sizes (T) using 3000 replications for various values of κ and η. The classical estimators discussed in Part 5 for complete data are assessed, and the average estimates (AEs) for each estimator are presented in Tables 2, 3 and 4. Graphical comparisons of the estimators according to the AbsBs, MSEs and MREs for the TL(A) parameters (Pa.) are depicted in Figures 4, 5 and 6. The following conclusions are reached from the plots.
1. The estimators are asymptotically unbiased, since their absolute biases converge to zero as the sample size increases. The estimators are consistent, since their MSEs tend to zero for large sample sizes.
2. The MLE and OLS perform better than the other estimators in terms of minimum AbsBs and MREs in most cases, while the MPS has the largest absolute biases and MREs in most cases. Overall, the results indicate that the MLE, OLS, WLS, ANDA, CVM and MPS all perform quite well in estimating the TL(A) model parameters.
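The structure of such an experiment can be sketched in Python for the ML estimator alone. Everything here is an assumption made for illustration: the function names, the profile-likelihood search used in place of R's optim, the small replication count, and the definitions AbsB = mean|est - true|, MSE = mean(est - true)² and MRE = mean|est - true|/true, which the source does not spell out.

```python
import math
import random

def tla_quantile(u, kappa, eta):
    """TL(A) quantile function, Eq (7)."""
    t = u ** (1.0 / eta)
    log_g = math.log(u) / eta - math.log1p(math.sqrt(1.0 - t))
    return kappa / math.log1p(-kappa * log_g)

def eta_hat(data, kappa):
    """Closed-form root of the eta-score, Eq (29), for fixed kappa."""
    s_log = 0.0
    for z in data:
        w = math.expm1(kappa / z) / kappa
        s_log += -w + math.log(2.0 - math.exp(-w))
    return -len(data) / s_log

def profile_loglik(data, kappa):
    """Eq (28) evaluated at eta = eta_hat(kappa)."""
    eta = eta_hat(data, kappa)
    total = 0.0
    for z in data:
        w = math.expm1(kappa / z) / kappa
        total += (math.log(2.0 * eta) - 2.0 * math.log(z) + kappa / z - w
                  + math.log(-math.expm1(-w))
                  + (eta - 1.0) * (-w + math.log(2.0 - math.exp(-w))))
    return total

def mle(data, lo=0.05, hi=20.0, grid=200, tol=1e-7):
    """Grid search plus golden-section refinement over kappa."""
    ks = [lo + (hi - lo) * i / grid for i in range(grid + 1)]
    best = max(range(grid + 1), key=lambda i: profile_loglik(data, ks[i]))
    a, b = ks[max(best - 1, 0)], ks[min(best + 1, grid)]
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    while b - a > tol:
        c, d = b - phi * (b - a), a + phi * (b - a)
        if profile_loglik(data, c) > profile_loglik(data, d):
            b = d
        else:
            a = c
    kappa = 0.5 * (a + b)
    return kappa, eta_hat(data, kappa)

def monte_carlo(kappa, eta, size, reps, seed=0):
    """Return AE, AbsB, MSE and MRE of the ML estimates of kappa and eta."""
    rng = random.Random(seed)
    truth = {"kappa": kappa, "eta": eta}
    draws = {"kappa": [], "eta": []}
    for _ in range(reps):
        sample = []
        while len(sample) < size:
            u = rng.random()
            if u > 0.0:                            # keep u inside (0, 1)
                sample.append(tla_quantile(u, kappa, eta))
        k_hat, e_hat = mle(sample)
        draws["kappa"].append(k_hat)
        draws["eta"].append(e_hat)
    summary = {}
    for name, values in draws.items():
        errors = [v - truth[name] for v in values]
        summary[name] = {
            "AE": sum(values) / reps,
            "AbsB": sum(abs(e) for e in errors) / reps,
            "MSE": sum(e * e for e in errors) / reps,
            "MRE": sum(abs(e) for e in errors) / (reps * truth[name]),
        }
    return summary

print(monte_carlo(kappa=3.0, eta=0.5, size=40, reps=30, seed=1))
```

With only 30 replications the output is noisy; the source uses 3000 replications and also compares the OLS, WLS, MPS, ANDA and CVM estimators, which could be added by swapping the objective passed to the optimizer.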
Figure 4: The estimators AbsBs, MSEs and MREs when k = 3.0 and n = 0.5 (complete data).
Table 2: The estimators AEs when k = 3.0 and n = 0.5 based on complete data.
T Pa. MLE OLS WLS MPS ANDA CVM
20 K 3.458 2.425 2.493 2.662 3.102 2.659
n 0.522 0.889 0.850 0.804 0.642 0.788
40 K 3.239 2.664 2.704 2.749 3.051 2.805
n 0.510 0.705 0.683 0.673 0.573 0.653
60 K 3.163 2.763 2.777 2.802 3.040 2.862
n 0.507 0.643 0.633 0.626 0.551 0.609
80 K 3.127 2.815 2.820 2.837 3.042 2.901
n 0.504 0.611 0.604 0.597 0.535 0.582
100 K 3.104 2.855 2.861 2.859 3.037 2.933
n 0.504 0.588 0.583 0.583 0.530 0.563
Table 3: The estimators AEs when k = 3.5 and n = 2.0 based on complete data.
T Pa. MLE OLS WLS MPS ANDA CVM
50 K 3.788 3.491 3.562 3.220 3.610 3.750
n 2.003 2.286 2.200 2.460 2.161 2.100
100 K 3.649 3.528 3.572 3.320 3.568 3.660
n 2.004 2.134 2.080 2.270 2.081 2.040
150 K 3.598 3.516 3.552 3.360 3.545 3.600
n 2.005 2.091 2.051 2.190 2.056 2.030
200 K 3.577 3.513 3.542 3.380 3.536 3.580
n 1.999 2.064 2.033 2.150 2.037 2.020
250 K 3.562 3.514 3.538 3.400 3.531 3.570
n 2.000 2.051 2.025 2.120 2.030 2.010
Table 4: The estimators AEs when k = 2.0 and n = 2.5 based on complete data.
T Pa. MLE OLS WLS MPS ANDA CVM
200 K 2.056 2.006 2.029 1.920 2.023 2.060
n 2.492 2.562 2.530 2.650 2.536 2.510
400 K 2.025 2.002 2.015 1.950 2.010 2.030
n 2.502 2.535 2.517 2.590 2.522 2.510
600 K 2.013 2.000 2.008 1.950 2.004 2.020
n 2.506 2.526 2.515 2.570 2.518 2.510
800 K 2.009 1.999 2.005 1.960 2.002 2.010
n 2.504 2.521 2.511 2.550 2.514 2.510
1000 K 2.008 2.000 2.005 1.970 2.003 2.010
n 2.502 2.514 2.507 2.540 2.509 2.500
8. Data applications
The flexibility of the TL(A) model is demonstrated here with three real datasets; two complete data and one type-II right censored data.
8.1. Applications for complete data
Figure 5: The estimators AbsBs, MSEs and MREs when k = 3.5 and n = 2.0 (complete data).
The first dataset, which consists of the relief times of twenty patients receiving an analgesic, was previously analysed by [22]: 1.1, 1.4, 1.3, 1.7, 1.9, 1.8, 1.6, 2.2, 1.7, 2.7, 4.1, 1.8, 1.5, 1.2, 1.4, 3.0, 1.7, 2.3, 1.6, 2.0. The second dataset, which consists of the scores of the general rating of affective symptoms for preschoolers (GRASP), a measure of the emotional and behavioural problems of children, was previously analysed by [1] and [11] (frequencies in parentheses): 19(16), 20(15), 21(14), 22(9), 23(12), 24(10), 25(6), 26(9), 27(8), 28(5), 29(6), 30(4), 31(3), 32(4), 33(1), 34(1), 35(4), 36(2), 37(2), 39(1), 42(1), 44(1). The MLE is used to compare the goodness of fit of the TL(A) model with that of the (A) model, inverse Gompertz (IG) model, Lomax (LOMX) model, Pareto (PE) model, inverse Pareto (IPE) model, Pareto type-I (PETI) model, exponentiated inverse Rayleigh (EIR) model, type-I half logistic skew-t (TIHLST) model, generalized inverse exponential (GIE) model and odd Frechet inverse exponential (OFIE) model. These models are fitted to the two complete datasets and compared using the Kolmogorov-Smirnov test statistic (K.S) with its p-value (PV), the Akaike information criterion (AIC), corrected Akaike information criterion (CAIC), Hannan-Quinn information criterion (HQIC), Bayesian information criterion (BIC), Cramer-von Mises (W*) statistic, Anderson-Darling (A*) statistic and log-likelihood value (LL). The analysis is performed in the R programming software using the fitdistrplus and AdequacyModel packages and the optim function.
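The source does not state the formulas behind the information criteria. The Python sketch below assumes the standard definitions (with k parameters, n observations and maximized log-likelihood LL); the function name is hypothetical. Under this assumption the values for the TL(A) model on the first dataset (LL = -15.650, k = 2, n = 20) agree with the corresponding column of Table 5 up to rounding.

```python
import math

def information_criteria(loglik, n_params, n_obs):
    """AIC, CAIC, BIC and HQIC under their standard definitions (assumed)."""
    aic = -2.0 * loglik + 2.0 * n_params
    caic = aic + 2.0 * n_params * (n_params + 1.0) / (n_obs - n_params - 1.0)
    bic = -2.0 * loglik + n_params * math.log(n_obs)
    hqic = -2.0 * loglik + 2.0 * n_params * math.log(math.log(n_obs))
    return {"AIC": aic, "CAIC": caic, "BIC": bic, "HQIC": hqic}

# TL(A) fit to the first dataset: LL = -15.650, two parameters, n = 20
for name, value in information_criteria(-15.650, n_params=2, n_obs=20).items():
    print(name, round(value, 3))
```

Smaller values of all four criteria indicate a better trade-off between fit and the number of parameters.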
First dataset analysis
The MLEs, K.S statistics and PVs for the first dataset are provided in Table 5 for all the studied models. The results show that the TL(A) model has the largest LL and PV, and the smallest AIC, CAIC, HQIC, BIC and K.S values. This indicates that the TL(A) model fits the first dataset better than the (A), IG, LOMX, PE, IPE, PETI, OFIE, EIR, TIHLST and GIE models. Moreover, the TL(A) model gives a better fit to the first dataset than the Kumaraswamy-transmuted exponentiated modified Weibull (KwTEXMW), McDonald log-logistic (McDLL), beta Weibull (BWE), modified Weibull (MWE), transmuted complementary Weibull geometric (TCWEG) and exponentiated transmuted generalized Rayleigh (ETGRH) models (see Table 5, [22]). Figure 7 depicts the fitted PDFs and fitted CDFs of all the models. The plots support the results presented in Table 5 that the TL(A) model provides the best fit to the first dataset.
Figure 6: The estimators AbsBs, MSEs and MREs when k = 2.0 and n = 2.5 (complete data).
Table 5: The MLEs, KS, PVs, LL, AIC, BIC, CAIC, HQIC, W* and A* values for the first dataset.
Models
TLA A IG LOMX PE IPE PETI GIE TIHLST EIR OFIE
K 5.092 2.402 6.145 28.385 25.011 28.862 1.697 36.485 0.708 3.610 1.329
n 0.228 - 0.110 0.019 47.333 0.059 - 2.232 2.347 2.336 0.906
K.S 0.116 0.385 0.142 0.444 0.437 0.380 0.285 0.134 0.504 0.127 0.358
PV 0.952 0.005 0.812 0.001 0.001 0.006 0.078 0.862 7.6E-05 0.906 0.012
LL -15.650 -23.503 -16.392 -33.142 -33.182 -32.986 -21.207 -16.261 -38.265 -15.868 -26.591
AIC 35.300 49.006 36.783 70.283 70.364 69.972 44.414 36.521 80.529 35.736 57.181
CAIC 36.005 49.229 37.489 70.989 71.070 70.678 44.636 37.227 81.235 36.442 57.887
BIC 37.291 50.002 38.774 72.275 72.356 71.963 45.410 38.513 82.520 37.727 59.172
HQIC 35.688 49.201 37.172 70.672 70.729 70.361 44.609 36.910 80.918 36.125 57.570
W* 0.025 0.028 0.055 0.102 0.102 0.050 0.038 0.054 0.090 0.042 0.032
A* 0.105 0.162 0.332 0.607 0.605 0.293 0.219 0.319 0.533 0.244 0.179
Second dataset analysis
Figure 7: Fitted density function (PDF) plot (left panel) and fitted distribution function (CDF) plot (right panel) for the TL(A) model (first dataset).
The MLEs, K.S statistics and PVs for the second dataset are provided in Table 6 for all the studied models. The results show that the TL(A) model has the largest LL and PV, and the smallest AIC, CAIC, HQIC, BIC and K.S values. This indicates that the TL(A) model fits the second dataset better than the (A), IG, LOMX, PE, IPE, PETI, OFIE, EIR, TIHLST and GIE models. Moreover, the TL(A) model gives a better fit to the second dataset than the generalized exponential (GEX), Gompertz (GTz), extended Gompertz (EGTz) and generalized Gompertz (GGTz) models (see Table 5, [1]). Figure 8 depicts the fitted PDFs and fitted CDFs of all the models. The plots support the results presented in Table 6 that the TL(A) model provides the best fit to the second dataset.
Table 6: The MLEs, KS, PVs, LL, AIC, BIC, CAIC, HQIC, W* and A* values for the second dataset.
Models
TLA A IG LOMX PE IPE PETI GIE TIHLST EIR OFIE
K 86.970 107.354 149.337 3.328 7.808 63.123 0.313 138.914 0.676 10.246 18.987
n 4.809 - 0.175 0.012 185.974 0.379 - 0.218 359.391 40.286 0.943
K.S 0.100 0.130 0.102 0.495 0.532 0.452 0.602 0.108 0.572 0.109 0.446
PV 0.138 0.022 0.120 2.2E-16 2.2E-16 2.2E-16 2.2E-16 0.088 2.2E-16 0.085 2.2E-16
LL -393.202 -404.277 -393.401 -582.812 -573.127 -565.908 -718.074 -399.265 -601.745 -401.694 -520.798
AIC 790.404 810.555 790.801 1169.625 1150.253 1135.816 1438.148 802.530 1207.490 807.388 1045.595
CAIC 790.496 810.585 790.893 1169.716 1150.345 1135.907 1438.179 802.622 1207.582 807.479 1045.687
BIC 796.200 813.453 796.597 1175.420 1156.049 1141.611 1441.046 808.326 1213.286 813.452 1051.391
HQIC 792.759 811.732 793.157 1171.980 1152.608 1138.171 1439.326 804.886 1209.846 809.743 1047.951
W* 0.208 0.209 0.235 0.371 0.388 0.286 0.302 0.246 0.377 0.322 0.227
A* 1.523 1.525 1.708 2.512 2.613 1.994 2.091 1.756 2.545 2.214 1.631
For the TL(A) model, the approximate 95% two-sided confidence intervals (CIs) for the parameters κ and η are [2.276, 7.909] and [-0.176, 0.632] for the first dataset, and [65.743, 108.197] and [-1.249, 10.868] for the second dataset, respectively. The likelihood ratio test (LRT) is used to test whether the fit of the TL(A) model is statistically superior to the fit provided by the (A) model. Table 7 provides the LR statistics and their PVs for the first and second datasets. Based on the PVs, the null hypothesis (H0) is rejected at the α = 0.05 level of significance for both datasets.
Table 7: The LR tests for the first and second datasets.
Model Hypotheses LR PV
First dataset
(A) vs. TL(A) H0 : η = 1 vs. H1 : H0 is false 15.707 0.00074
Second dataset
(A) vs. TL(A) H0 : η = 1 vs. H1 : H0 is false 22.151 0.0000025
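The LR statistic is twice the difference between the maximized log-likelihoods of the two models. The Python sketch below computes it from the LL values in Tables 5 and 6 and, assuming a chi-square reference distribution with one degree of freedom (one restricted parameter, as in the hypotheses of Table 7), obtains the p-value from the complementary error function. The function name is hypothetical.

```python
import math

def lr_test(loglik_null, loglik_alt):
    """LR statistic and its chi-square p-value with 1 degree of freedom (assumed)."""
    lr = 2.0 * (loglik_alt - loglik_null)
    # For 1 d.f., P(chi2 > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(lr / 2.0))
    return lr, p_value

# LL values of the (A) and TL(A) models taken from Tables 5 and 6
print(lr_test(loglik_null=-23.503, loglik_alt=-15.650))     # first dataset
print(lr_test(loglik_null=-404.277, loglik_alt=-393.202))   # second dataset
```

The LR statistics obtained this way match those in Table 7 up to rounding of the LL values, and in both cases the p-value is far below 0.05.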
8.2. Application for censored data
The censored data used here, which represent the fatigue life in hours of 10 bearings of a specific type, were introduced by [23]: 152.7, 172, 172.5, 173.5, 193, 204.7, 216.5, 234.9, 262.6, 422.6. Assume that a type-II right censored sample of size p = 8 is taken from these data. Table 8 shows the MLEs, LL, K.S and PV for the TL(A) model. The K.S statistic and its PV indicate that the TL(A) model fits the data well.
Table 8: The MLEs and performance measures for the type-II right censored data.
Models MLEs LL K.S PV
TL(a) k = 539.86, n = 946.82 -41.129 0.263 0.550
9. Discussions and Conclusion
In this work, a new model called TL(A), which extends and generalizes the (A) model, is proposed. The TL(A) model is characterized by a hazard rate function that can be inverted-bathtub shaped, increasing or decreasing, depending on the shape parameter. Moreover, the TL(A) model is appropriate for testing the goodness of fit of its sub-model, the (A) model. Some structural properties of the TL(A) model, including the ordinary and incomplete moments, MGFs, PWMs, quantiles, Bonferroni and Lorenz curves, entropies, median and order statistics, are derived. Likewise, basic functions used in reliability theory, such as the survival function, HRF, reversed HRF, cumulative HRF, MTTF, MRL and MWT, are derived. Monte Carlo experiments are carried out to assess the performance of the MLE, MPS, ANDA, CVM, OLS and WLS methods according to the AbsBs, MSEs and MREs. The results indicate that the estimators produce good parameter estimates for all the parameter combinations and sample sizes considered. However, the MLE method produced estimates closer to the true TL(A) parameters. This agrees with the reports by [24, 25, 26, 27, 28]. Furthermore, the parameters of the TL(A) model are estimated using the MLE for both complete and type-II right censored data. The two complete datasets are analysed using the TL(A) model and compared with ten competing lifetime models. Likewise, a type-II right censored dataset is analysed using the TL(A) model. The results indicate that the TL(A) model is more flexible for fitting the various datasets.
In future work, the bivariate extension of the TL(A) model, the TL(A)-G family of distributions and the discrete case of the TL(A) model will be addressed.
References
[1] Eliwa, M. S., El-Morshedy, M. and Ibrahim, M. (2019). Inverse Gompertz distribution: properties and different estimation methods with application to complete and censored data. Annals of Data Science, 6(2):321-339.
[2] Alrajhi, S. (2019). The odd Frechet inverse exponential distribution with application. Journal of Nonlinear Sciences & Applications (JNSA), 12(8):1-24.
[3] El-Morshedy, M., El-Faheem, A. A. and El-Dawoody, M. (2020). Kumaraswamy inverse Gompertz distribution: Properties and engineering applications to complete, type-II right censored and upper record data. Plos one, 15(12):e0241970.
[4] Aldahlan, Maha, A. D. and Afify, A. Z. (2020). The odd exponentiated half-logistic exponential distribution: estimation methods and application to engineering data. Mathematics, 8(10):1684.
[5] Rana, M. S. and Rahman, M. M. (2022). The Pareto-Exponential Distribution: Theory and Real-life Applications. European Journal of Mathematics and Statistics, 3(3):30-39.
[6] Adubisi, O. D., Abdulkadir, A. and Chiroma, H. (2021). A Two Parameter Odd Exponentiated Skew-T Distribution With J-Shaped Hazard Rate Function. Journal of Statistical Modeling & Analytics (JOSMA), 3(1): 26-46.
[7] Adubisi, O. D., Abdulkadir, A., Chiroma, H. and Abbas, U. F. (2021). The type I half logistic skew-t distribution: A heavy-tail model with inverted bathtub shaped hazard rate. Asian J Probab Stat, 14:21-40.
[8] Adubisi, O. D., Abdulkadir, A., Abbas, U. F. and Chiroma, H. (2021). Financial data and a new generalization of the skew-t distribution. Covenant Journal of Physical and Life Sciences, 9(2):1-17.
[9] Dhungana, G. P. and Kumar, V. (2022). Exponentiated Odd Lomax Exponential distribution with application to COVID-19 death cases of Nepal. PloS one, 17(6):e0269450.
[10] Beghriche, A., Zeghdoudi, H., Raman, V. and Chouia, S. (2022). New polynomial exponential distribution: properties and applications. Statistics in Transition new series, 23(3):95-112.
[11] Alshenawy, R. (2020). A new one parameter distribution: properties and estimation with applications to complete and type II censored data. Journal of Taibah University for Science, 14(1):11-18.
[12] Al-Shomrani, A., Arif, O., Shawky, A. Hanif, S. and Shahbaz, M. Q. (2016). Topp-Leone family of distributions: Some properties and application. Pakistan Journal of Statistics and Operation Research, 7:443-451.
[13] Renyi, A. (1961). Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability. Contributions to Biology and Problems of Medicine, 4:547-561.
[14] Greenwood, J. A., Landwehr, J. M., Matalas, N. C and Wallis, J. R. (1979). Probability weighted moments: definition and relation to parameters of several distributions expressable in inverse form. Water resources research, 15(5):1049-1054.
[15] David, H. A. and Nagaraja, H. N. Order statistics, John Wiley & Sons, 2004.
[16] Kenney, J. F. Mathematics of statistics, D. Van Nostrand, 1939.
[17] Moors, J. J. A. (1988). A quantile alternative for kurtosis. Journal of the Royal Statistical Society: Series D (The Statistician), 37(1):25-32.
[18] Swain, J. J., Venkatraman, S. and Wilson, J. R. (1988). Least-squares estimation of distribution functions in Johnson's translation system. Journal of Statistical Computation and Simulation, 29(4):271-297.
[19] Cheng, R. C. H and Amin, N. A. K. (1979). Maximum product-of-spacings estimation with applications to the lognormal distribution. Math report, 791.
[20] Cheng, R. C. H and Amin, N. A. K. (1983). Estimating parameters in continuous univariate distributions with a shifted origin. Journal of the Royal Statistical Society: Series B (Methodological), 45(3):394-403.
[21] Zheng, G. and Park, S. (2004). On the Fisher information in multiply censored and progressively censored data. Communications in Statistics-Theory and Methods, 33(8):1821-1835.
[22] Afify, A., Yousof, H. and Nadarajah, S. (2017). The beta transmuted-H family for lifetime data. Statistics and its Interface, 10(3):505-520.
[23] McCool, J. I. (1978). Competing risk and multiple comparison analysis for bearing fatigue tests. ASLE transactions, 21(4):271-284.
[24] Ramos, P. L., Louzada, F., Ramos, E. and Dey, S. (2020). The Fréchet distribution: Estimation and application-An overview. Journal of Statistics and Management Systems, 23(3):549-578.
[25] Chesneau, C., Bakouch, H. S., Ramos, P. L. and Louzada, F. (2020). The polynomial-exponential distribution: A continuous probability model allowing for occurrence of zero values. Communications in Statistics-Simulation and Computation, 22:1-26.
[26] Adubisi, O. D., Abdulkadir, A. and Adashu, D. J. (2022). Improved Parameter Estimators for the Flexible Extended Skew-t Model with Extensive Simulations, Applications and Volatility modeling. Scientific African, e01443.
[27] Adubisi, O. D., Adubisi, C. E. and Abdulkadir, A. (2022). Laplace Transformed Properties of the Extended Power-Gompertz Model: Simulation and Applications. Scientific African, e01523.
[28] Adubisi, O. D., Abdulkadir, A. and Adubisi, C. E. (2022). A new hybrid form of the skew-t distribution: estimation methods comparison via Monte Carlo simulation and GARCH model application. Data Science in Finance and Economics, 2(2):54-79.