The Log-Hamza distribution with statistical properties and
application
An alternative for distributions having domain (0,1).
Aijaz Ahmad*
Department of Mathematics, Bhagwant University, Ajmer, India
Afaq Ahmad
Department of Mathematical Science, IUST
I. H. Dar
Department of Statistics, University of Kashmir, Srinagar, India
Rajnee Tripathi
Department of Mathematics, Bhagwant University, Ajmer, India rajneetripathi@hotmail.com
Abstract
This work proposes a novel two-parameter distribution, the log-Hamza distribution (LHD). Its key property is that it belongs to the family of distributions with support (0,1). Several statistical features of the distribution are studied, including moments, the moment generating function, order statistics, and reliability measures. The probability density function (pdf) and the cumulative distribution function (cdf) are plotted for different parameter values. The parameters of the distribution are estimated by the method of maximum likelihood. Finally, an application to a real data set is used to evaluate the effectiveness of the distribution.
Keywords: Log transformation, Hamza distribution, moments, maximum likelihood estimation.
1. Introduction
Statistical distributions are important for modelling many sorts of data from various disciplines of research. Statisticians have focused their efforts on developing new distributions or generalising existing distributions by introducing additional parameters. The major reason for these extensions is to improve the efficiency of these distributions when analysing increasingly complicated data.
The beta, Kumaraswamy, and Topp-Leone distributions are the most commonly used distributions with bounded support. Among these, the beta distribution is the most common and has applications in many fields of study, including bioscience, engineering, economics, and finance. The fundamental disadvantage of the beta distribution is that its cumulative distribution function (cdf) involves the incomplete beta function, which cannot be written in closed form.
The aim of this paper is to introduce a new distribution that can be considered an alternative within the family of distributions having support (0,1). To achieve this goal, the Hamza distribution is used to generate a new distribution defined on the open interval (0,1). Noteworthy efforts have been made to restrict many continuous distributions to the unit interval, including: Topp and Leone [12], Nadarajah and Kotz [10], Cordeiro and de Castro [3], Gomez-Deniz et al. [4], Mazucheli et al. [7], Ghitany et al. [5], Haq et al. [6], Menezes et al. [8], Rodrigues et al. [11], Aijaz et al. [1].
2. The Log-Hamza Distribution
Suppose a random variable X follows the Hamza distribution with probability density function (pdf)
$$f(x;\alpha,\beta) = \frac{\beta^6}{\alpha\beta^5+120}\left(\alpha + \frac{\beta}{6}x^6\right)e^{-\beta x};\quad x>0,\ \alpha,\beta>0 \tag{1}$$
The corresponding cumulative distribution function (cdf) is given as
$$F(x;\alpha,\beta) = 1 - \left[1 + \frac{\beta x\left((\beta x)^5 + 6(\beta x)^4 + 30(\beta x)^3 + 120(\beta x)^2 + 360(\beta x) + 720\right)}{6(\alpha\beta^5+120)}\right]e^{-\beta x};\quad x>0,\ \alpha,\beta>0 \tag{2}$$
Suppose a random variable $Y = e^{-X}$, so that $X = -\ln(Y)$; then the probability density function (pdf) of Y is given as
$$f(y;\alpha,\beta) = \frac{\beta^6}{\alpha\beta^5+120}\left(\alpha + \frac{\beta}{6}(\ln y)^6\right)y^{\beta-1};\quad 0<y<1,\ \alpha,\beta>0 \tag{3}$$
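As a quick numerical check of equation (3), the following minimal sketch evaluates the density and verifies that it integrates to one. The function names `lhd_pdf` and `integrate_unit` are illustrative (not from the paper), the parameter values are the LHD estimates reported in Table 2, and the midpoint rule after the substitution $y = e^{-z}$ is just one convenient choice of quadrature.

```python
import math

def lhd_pdf(y, alpha, beta):
    """Log-Hamza density of equation (3); zero outside (0, 1)."""
    if not 0.0 < y < 1.0:
        return 0.0
    const = beta**6 / (alpha * beta**5 + 120.0)
    return const * (alpha + beta / 6.0 * math.log(y)**6) * y**(beta - 1.0)

def integrate_unit(g, z_max=100.0, steps=20_000):
    """Midpoint rule for the integral of g over (0, 1), using y = exp(-z)."""
    h = z_max / steps
    total = 0.0
    for i in range(steps):
        z = (i + 0.5) * h
        y = math.exp(-z)
        total += g(y) * y * h   # |dy| = exp(-z) dz
    return total

alpha, beta = 1.9503, 2.0969    # LHD estimates from Table 2 (illustrative choice)
total_mass = integrate_unit(lambda y: lhd_pdf(y, alpha, beta))
print(round(total_mass, 4))
```

The truncation at `z_max = 100` assumes that beta is not very small, so that the tail of the integrand beyond that point is negligible.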
Figure 1 shows some possible shapes of the pdf of the LHD for different values of the parameters.
Figure 1: PDF plots of LHD(α, β), panels 2.1 and 2.2
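The transformation $Y = e^{-X}$ also gives a simple way to simulate from the LHD. The sketch below relies on an observation that is not stated in the paper but follows directly from equation (1): the Hamza density is a two-component mixture of an exponential distribution with rate β (weight $w = \alpha\beta^5/(\alpha\beta^5+120)$) and a gamma distribution with shape 7 and rate β (weight $1-w$). The function name `sample_lhd`, the seed, and the sample size are illustrative assumptions.

```python
import math
import random

def sample_lhd(alpha, beta, rng):
    """Draw one log-Hamza variate as Y = exp(-X), with X ~ Hamza(alpha, beta)."""
    w = alpha * beta**5 / (alpha * beta**5 + 120.0)   # weight of the exponential part
    if rng.random() < w:
        x = rng.expovariate(beta)               # exponential with rate beta
    else:
        x = rng.gammavariate(7.0, 1.0 / beta)   # gamma with shape 7 and scale 1/beta
    return math.exp(-x)

rng = random.Random(2024)                       # fixed seed for reproducibility
alpha, beta = 1.9503, 2.0969                    # LHD estimates from Table 2
draws = [sample_lhd(alpha, beta, rng) for _ in range(50_000)]
sample_mean = sum(draws) / len(draws)

# Exact mean E(Y) = E(exp(-X)) from the same mixture representation
w = alpha * beta**5 / (alpha * beta**5 + 120.0)
ratio = beta / (beta + 1.0)
exact_mean = w * ratio + (1.0 - w) * ratio**7
print(sample_mean, exact_mean)
```

The sample mean should agree with the exact mean up to Monte Carlo error, which gives an independent check on the density in equation (3).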
The corresponding cumulative distribution function (cdf) is given by
$$F(y;\alpha,\beta) = \left[1 + \frac{(\beta\ln y)^6 - 6(\beta\ln y)^5 + 30(\beta\ln y)^4 - 120(\beta\ln y)^3 + 360(\beta\ln y)^2 - 720(\beta\ln y)}{6(\alpha\beta^5+120)}\right]y^{\beta};\quad 0<y<1,\ \alpha,\beta>0 \tag{4}$$
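Equation (4) can be checked against equation (3) numerically: the derivative of the cdf should reproduce the pdf. The sketch below does this with a central finite difference; the function names and the evaluation point `y0 = 0.3` are illustrative assumptions.

```python
import math

def lhd_pdf(y, alpha, beta):
    """Log-Hamza density, equation (3)."""
    if not 0.0 < y < 1.0:
        return 0.0
    const = beta**6 / (alpha * beta**5 + 120.0)
    return const * (alpha + beta / 6.0 * math.log(y)**6) * y**(beta - 1.0)

def lhd_cdf(y, alpha, beta):
    """Log-Hamza cdf, equation (4)."""
    if y <= 0.0:
        return 0.0
    if y >= 1.0:
        return 1.0
    u = beta * math.log(y)
    poly = u**6 - 6*u**5 + 30*u**4 - 120*u**3 + 360*u**2 - 720*u
    return (1.0 + poly / (6.0 * (alpha * beta**5 + 120.0))) * y**beta

alpha, beta = 1.9503, 2.0969      # LHD estimates from Table 2
y0, h = 0.3, 1e-6
slope = (lhd_cdf(y0 + h, alpha, beta) - lhd_cdf(y0 - h, alpha, beta)) / (2.0 * h)
print(slope, lhd_pdf(y0, alpha, beta))
```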
3. Reliability Measures of Log-Hamza Distribution
This section derives several ageing (reliability) indicators for the proposed distribution.
3.1. Survival function
Suppose Y is a continuous random variable with cdf F(y). Then its survival function, also called the reliability function, is defined as
$$S(y) = \Pr(Y > y) = \int_y^{1} f(t)\,dt = 1 - F(y)$$
Therefore, the survival function of the log-Hamza distribution is given as
$$S(y;\alpha,\beta) = 1 - F(y;\alpha,\beta) = 1 - \left[1 + \frac{(\beta\ln y)^6 - 6(\beta\ln y)^5 + 30(\beta\ln y)^4 - 120(\beta\ln y)^3 + 360(\beta\ln y)^2 - 720(\beta\ln y)}{6(\alpha\beta^5+120)}\right]y^{\beta};\quad 0<y<1,\ \alpha,\beta>0 \tag{5}$$
3.2. Hazard rate function
The hazard rate function of a random variable Y is defined as
$$h(y;\alpha,\beta) = \frac{f(y;\alpha,\beta)}{S(y;\alpha,\beta)} \tag{6}$$
Using equations (3) and (4) in equation (6), the hazard rate function of the log-Hamza distribution is given as
$$h(y) = \frac{\beta^6\left(6\alpha + \beta(\ln y)^6\right)y^{\beta-1}}{6(\alpha\beta^5+120) - \left[6(\alpha\beta^5+120) + A\right]y^{\beta}};\quad 0<y<1,\ \alpha,\beta>0$$
where
$$A = (\beta\ln y)^6 - 6(\beta\ln y)^5 + 30(\beta\ln y)^4 - 120(\beta\ln y)^3 + 360(\beta\ln y)^2 - 720(\beta\ln y)$$
Figure 2 shows some possible shapes of the hazard rate function (hrf) of the LHD for different values of the parameters.
Figure 2: HRF plots of LHD(α, β), panels 3.1 and 3.2
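The closed form of the hazard rate can be cross-checked against its definition in equation (6). The sketch below evaluates both the closed form (with the polynomial A) and the ratio f(y)/(1 - F(y)); the function names and the grid of evaluation points are illustrative assumptions.

```python
import math

def lhd_pdf(y, alpha, beta):
    """Log-Hamza density, equation (3), for 0 < y < 1."""
    const = beta**6 / (alpha * beta**5 + 120.0)
    return const * (alpha + beta / 6.0 * math.log(y)**6) * y**(beta - 1.0)

def poly_a(y, beta):
    """Polynomial A in beta*ln(y) that appears in the cdf and the hazard rate."""
    u = beta * math.log(y)
    return u**6 - 6*u**5 + 30*u**4 - 120*u**3 + 360*u**2 - 720*u

def lhd_cdf(y, alpha, beta):
    """Log-Hamza cdf, equation (4), for 0 < y < 1."""
    return (1.0 + poly_a(y, beta) / (6.0 * (alpha * beta**5 + 120.0))) * y**beta

def lhd_hazard(y, alpha, beta):
    """Closed-form hazard rate of the log-Hamza distribution."""
    d = 6.0 * (alpha * beta**5 + 120.0)
    num = beta**6 * (6.0 * alpha + beta * math.log(y)**6) * y**(beta - 1.0)
    den = d - (d + poly_a(y, beta)) * y**beta
    return num / den

alpha, beta = 1.9503, 2.0969      # LHD estimates from Table 2
grid = [0.1, 0.3, 0.5, 0.7, 0.9]
closed = [lhd_hazard(y, alpha, beta) for y in grid]
ratio = [lhd_pdf(y, alpha, beta) / (1.0 - lhd_cdf(y, alpha, beta)) for y in grid]
print(closed)
```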
3.3. Cumulative hazard rate function
The cumulative hazard rate function of a random variable Y is given as
$$H(y;\alpha,\beta) = -\ln\left[1 - F(y;\alpha,\beta)\right] \tag{7}$$
Using equation (4) in equation (7), we obtain the cumulative hazard rate function of the log-Hamza distribution
$$H(y;\alpha,\beta) = -\ln\left\{1 - \left[1 + \frac{(\beta\ln y)^6 - 6(\beta\ln y)^5 + 30(\beta\ln y)^4 - 120(\beta\ln y)^3 + 360(\beta\ln y)^2 - 720(\beta\ln y)}{6(\alpha\beta^5+120)}\right]y^{\beta}\right\} \tag{8}$$
3.4. Mean residual function
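The mean residual life is the expected remaining lifetime of a component given that it has survived beyond a certain duration y. It is extremely significant in reliability investigations.
The mean residual function of a random variable Y can be obtained as
$$m(y;\alpha,\beta) = \frac{1}{S(y;\alpha,\beta)}\int_y^{1} t\,f(t;\alpha,\beta)\,dt - y = \frac{\beta^6}{S(y;\alpha,\beta)(\alpha\beta^5+120)}\int_y^{1}\left(\alpha + \frac{\beta}{6}(\ln t)^6\right)t^{\beta}\,dt - y$$
Making the substitution $\ln(t) = -z$, so that $0 < z < -\ln(y)$, we have
$$m(y;\alpha,\beta) = \frac{\beta^6}{S(y;\alpha,\beta)(\alpha\beta^5+120)}\int_0^{-\ln(y)}\left(\alpha + \frac{\beta}{6}z^6\right)e^{-(\beta+1)z}\,dz - y$$
After solving the integral, we get
$$m(y;\alpha,\beta) = \frac{\beta^6}{S(y;\alpha,\beta)(\alpha\beta^5+120)}\left\{\frac{\alpha\left(1 - y^{\beta+1}\right)}{\beta+1} + \frac{\beta\,\gamma\left(7, \ln\left(y^{-(\beta+1)}\right)\right)}{6(\beta+1)^7}\right\} - y$$
where $\gamma(a,x) = \int_0^{x} u^{a-1}e^{-u}\,du$ denotes the lower incomplete gamma function.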
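The mean residual function can be checked by comparing the incomplete-gamma expression, obtained by integrating $t\,f(t)$ over $(y, 1)$ with the density in equation (3), against direct numerical integration. The sketch below uses the finite-sum form of $\gamma(7, x)$ that holds for an integer first argument; function names, the evaluation point, and the quadrature rule are illustrative assumptions.

```python
import math

def lhd_pdf(y, alpha, beta):
    """Log-Hamza density, equation (3), for 0 < y < 1."""
    const = beta**6 / (alpha * beta**5 + 120.0)
    return const * (alpha + beta / 6.0 * math.log(y)**6) * y**(beta - 1.0)

def lhd_cdf(y, alpha, beta):
    """Log-Hamza cdf, equation (4), for 0 < y < 1."""
    u = beta * math.log(y)
    poly = u**6 - 6*u**5 + 30*u**4 - 120*u**3 + 360*u**2 - 720*u
    return (1.0 + poly / (6.0 * (alpha * beta**5 + 120.0))) * y**beta

def lower_inc_gamma7(x):
    """gamma(7, x) = 6! * (1 - exp(-x) * sum_{k=0}^{6} x^k / k!)."""
    partial = sum(x**k / math.factorial(k) for k in range(7))
    return 720.0 * (1.0 - math.exp(-x) * partial)

def mean_residual(y, alpha, beta):
    """Closed form: integral of t*f(t) over (y, 1), divided by S(y), minus y."""
    const = beta**6 / (alpha * beta**5 + 120.0)
    b1 = beta + 1.0
    x = -b1 * math.log(y)
    integral = alpha * (1.0 - y**b1) / b1 + beta * lower_inc_gamma7(x) / (6.0 * b1**7)
    return const * integral / (1.0 - lhd_cdf(y, alpha, beta)) - y

def mean_residual_numeric(y, alpha, beta, steps=20_000):
    """Same quantity by the midpoint rule on (y, 1)."""
    h = (1.0 - y) / steps
    total = 0.0
    for i in range(steps):
        t = y + (i + 0.5) * h
        total += t * lhd_pdf(t, alpha, beta) * h
    return total / (1.0 - lhd_cdf(y, alpha, beta)) - y

alpha, beta = 1.9503, 2.0969      # LHD estimates from Table 2
y0 = 0.3
m_closed = mean_residual(y0, alpha, beta)
m_numeric = mean_residual_numeric(y0, alpha, beta)
print(m_closed, m_numeric)
```

Because Y is bounded by 1, the mean residual at y must lie between 0 and 1 - y, which gives a further sanity check.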
4. Statistical Properties Of log-Hamza Distribution
This section derives and examines distinct properties of the log-Hamza distribution.
4.1. Moments
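Let Y denote a random variable following the log-Hamza distribution. Then its rth raw moment, denoted $\mu'_r$, is given by
$$\mu'_r = E(Y^r) = \int_0^1 y^r f(y;\alpha,\beta)\,dy = \frac{\beta^6}{\alpha\beta^5+120}\int_0^1 y^{r+\beta-1}\left(\alpha + \frac{\beta}{6}(\ln y)^6\right)dy$$
Making the substitution $\ln(y) = -z$, so that $0 < z < \infty$, we have
$$\mu'_r = \frac{\beta^6}{\alpha\beta^5+120}\int_0^{\infty}\left(\alpha + \frac{\beta}{6}z^6\right)e^{-(\beta+r)z}\,dz$$
After solving the integral, we have
$$\mu'_r = \frac{\beta^6\left[\alpha(\beta+r)^6 + 120\beta\right]}{(\alpha\beta^5+120)(\beta+r)^7}$$
The first four raw moments of the log-Hamza distribution are given as
$$\mu'_1 = \frac{\beta^6\left[\alpha(\beta+1)^6 + 120\beta\right]}{(\alpha\beta^5+120)(\beta+1)^7},\qquad \mu'_2 = \frac{\beta^6\left[\alpha(\beta+2)^6 + 120\beta\right]}{(\alpha\beta^5+120)(\beta+2)^7}$$
$$\mu'_3 = \frac{\beta^6\left[\alpha(\beta+3)^6 + 120\beta\right]}{(\alpha\beta^5+120)(\beta+3)^7},\qquad \mu'_4 = \frac{\beta^6\left[\alpha(\beta+4)^6 + 120\beta\right]}{(\alpha\beta^5+120)(\beta+4)^7}$$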
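A short numerical check of the moment formula: the sketch below evaluates the closed-form rth raw moment implied by the integral $\int_0^\infty(\alpha + \beta z^6/6)e^{-(\beta+r)z}dz$ and compares it with direct integration of $y^r f(y)$. The function names and the quadrature settings are illustrative assumptions; note that setting r = 0 must return 1.

```python
import math

def lhd_pdf(y, alpha, beta):
    """Log-Hamza density, equation (3), for 0 < y < 1."""
    const = beta**6 / (alpha * beta**5 + 120.0)
    return const * (alpha + beta / 6.0 * math.log(y)**6) * y**(beta - 1.0)

def lhd_raw_moment(r, alpha, beta):
    """Closed-form r-th raw moment E(Y^r)."""
    b = beta + r
    return beta**6 * (alpha * b**6 + 120.0 * beta) / ((alpha * beta**5 + 120.0) * b**7)

def numeric_moment(r, alpha, beta, z_max=100.0, steps=20_000):
    """E(Y^r) by the midpoint rule after substituting y = exp(-z)."""
    h = z_max / steps
    total = 0.0
    for i in range(steps):
        y = math.exp(-(i + 0.5) * h)
        total += y**r * lhd_pdf(y, alpha, beta) * y * h
    return total

alpha, beta = 1.9503, 2.0969      # LHD estimates from Table 2
mean = lhd_raw_moment(1, alpha, beta)
variance = lhd_raw_moment(2, alpha, beta) - mean**2
print(mean, variance)
```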
4.2. Moment generating function
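Suppose Y is a random variable following the log-Hamza distribution. Then the moment generating function of the distribution, denoted by $M_Y(t)$, is given by
$$\begin{aligned} M_Y(t) = E\left(e^{tY}\right) &= \int_0^1 e^{ty} f(y;\alpha,\beta)\,dy \\ &= \int_0^1\left(1 + ty + \frac{(ty)^2}{2!} + \frac{(ty)^3}{3!} + \cdots\right)f(y;\alpha,\beta)\,dy \\ &= \sum_{r=0}^{\infty}\frac{t^r}{r!}\int_0^1 y^r f(y;\alpha,\beta)\,dy = \sum_{r=0}^{\infty}\frac{t^r}{r!}E(Y^r) \\ &= \sum_{r=0}^{\infty}\frac{t^r}{r!}\,\frac{\beta^6\left[\alpha(\beta+r)^6 + 120\beta\right]}{(\alpha\beta^5+120)(\beta+r)^7} \end{aligned}$$
The characteristic function of the log-Hamza distribution, denoted $\phi_Y(t)$, is obtained by replacing t with it, where $i = \sqrt{-1}$:
$$\phi_Y(t) = \sum_{r=0}^{\infty}\frac{(it)^r}{r!}\,\frac{\beta^6\left[\alpha(\beta+r)^6 + 120\beta\right]}{(\alpha\beta^5+120)(\beta+r)^7}$$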
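The series for the moment generating function converges quickly because every raw moment of a variable on (0,1) is at most 1. The sketch below truncates the series after a fixed number of terms (60 here, an arbitrary illustrative choice) and checks two basic properties: M(0) = 1 and M'(0) = E(Y). Function names are assumptions.

```python
import math

def lhd_raw_moment(r, alpha, beta):
    """Closed-form r-th raw moment E(Y^r) of the log-Hamza distribution."""
    b = beta + r
    return beta**6 * (alpha * b**6 + 120.0 * beta) / ((alpha * beta**5 + 120.0) * b**7)

def lhd_mgf(t, alpha, beta, terms=60):
    """Truncated series sum_{r} t^r / r! * E(Y^r)."""
    total = 0.0
    coeff = 1.0                      # running value of t^r / r!
    for r in range(terms):
        total += coeff * lhd_raw_moment(r, alpha, beta)
        coeff *= t / (r + 1)
    return total

alpha, beta = 1.9503, 2.0969         # LHD estimates from Table 2
h = 1e-4
mgf_slope_at_zero = (lhd_mgf(h, alpha, beta) - lhd_mgf(-h, alpha, beta)) / (2.0 * h)
print(lhd_mgf(1.0, alpha, beta), mgf_slope_at_zero)
```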
4.3. Incomplete moments
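The general expression for the rth incomplete moment is given as
$$T_r(t) = \int_0^t y^r f(y;\alpha,\beta)\,dy = \frac{\beta^6}{\alpha\beta^5+120}\int_0^t y^{r+\beta-1}\left(\alpha + \frac{\beta}{6}(\ln y)^6\right)dy$$
Making the substitution $\ln(y) = -z$, so that $-\ln(t) < z < \infty$, we have
$$T_r(t) = \frac{\beta^6}{\alpha\beta^5+120}\int_{-\ln(t)}^{\infty}\left(\alpha + \frac{\beta}{6}z^6\right)e^{-(r+\beta)z}\,dz$$
After solving the integral, we get
$$T_r(t) = \frac{\beta^6}{\alpha\beta^5+120}\left[\frac{\alpha\,t^{\beta+r}}{\beta+r} + \frac{\beta\,\Gamma\left(7, \ln\left(t^{-(\beta+r)}\right)\right)}{6(\beta+r)^7}\right]$$
where $\Gamma(a,x) = \int_x^{\infty} u^{a-1}e^{-u}\,du$ denotes the upper incomplete gamma function.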
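Two consistency checks are available for the incomplete moments: with r = 0 the incomplete moment is the cdf in equation (4), and at t = 1 it reduces to the full raw moment. The sketch below implements the expression obtained from the integral $\int_{-\ln t}^{\infty}(\alpha + \beta z^6/6)e^{-(r+\beta)z}dz$, using the finite-sum form of $\Gamma(7, x)$; function names and evaluation points are illustrative assumptions.

```python
import math

def upper_inc_gamma7(x):
    """Gamma(7, x) = 6! * exp(-x) * sum_{k=0}^{6} x^k / k!."""
    return 720.0 * math.exp(-x) * sum(x**k / math.factorial(k) for k in range(7))

def lhd_incomplete_moment(r, t, alpha, beta):
    """r-th incomplete moment: integral of y^r f(y) over (0, t), 0 < t <= 1."""
    const = beta**6 / (alpha * beta**5 + 120.0)
    b = beta + r
    x = -b * math.log(t)
    return const * (alpha * t**b / b + beta * upper_inc_gamma7(x) / (6.0 * b**7))

def lhd_cdf(y, alpha, beta):
    """Log-Hamza cdf, equation (4), for 0 < y < 1."""
    u = beta * math.log(y)
    poly = u**6 - 6*u**5 + 30*u**4 - 120*u**3 + 360*u**2 - 720*u
    return (1.0 + poly / (6.0 * (alpha * beta**5 + 120.0))) * y**beta

def lhd_raw_moment(r, alpha, beta):
    """Closed-form r-th raw moment E(Y^r)."""
    b = beta + r
    return beta**6 * (alpha * b**6 + 120.0 * beta) / ((alpha * beta**5 + 120.0) * b**7)

alpha, beta = 1.9503, 2.0969      # LHD estimates from Table 2
t0 = 0.4
first_incomplete = lhd_incomplete_moment(1, t0, alpha, beta)
print(first_incomplete)
```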
5. Order Statistics of Log-Hamza Distribution
Let $Y_1, Y_2, \ldots, Y_n$ be a random sample of size n from the log-Hamza distribution with pdf f(y) and cdf F(y). Then the probability density function of the kth order statistic is given as
$$f_{Y_{(k)}}(y) = \frac{n!}{(k-1)!\,(n-k)!}\,f(y)\,[F(y)]^{k-1}\,[1 - F(y)]^{n-k} \tag{9}$$
Using equations (3) and (4) in equation (9), we have
$$f_{Y_{(k)}}(y) = \frac{n!}{(k-1)!\,(n-k)!}\,\frac{\beta^6}{\alpha\beta^5+120}\left(\alpha + \frac{\beta}{6}(\ln y)^6\right)y^{\beta-1}\left\{\left[1 + \frac{A(y)}{6(\alpha\beta^5+120)}\right]y^{\beta}\right\}^{k-1}\left\{1 - \left[1 + \frac{A(y)}{6(\alpha\beta^5+120)}\right]y^{\beta}\right\}^{n-k}$$
where
$$A(y) = (\beta\ln y)^6 - 6(\beta\ln y)^5 + 30(\beta\ln y)^4 - 120(\beta\ln y)^3 + 360(\beta\ln y)^2 - 720(\beta\ln y)$$
The pdf of the first order statistic $Y_{(1)}$ of the log-Hamza distribution is given by
$$f_{Y_{(1)}}(y) = \frac{n\beta^6}{\alpha\beta^5+120}\left(\alpha + \frac{\beta}{6}(\ln y)^6\right)y^{\beta-1}\left\{1 - \left[1 + \frac{A(y)}{6(\alpha\beta^5+120)}\right]y^{\beta}\right\}^{n-1}$$
The pdf of the nth order statistic $Y_{(n)}$ of the log-Hamza distribution is given by
$$f_{Y_{(n)}}(y) = \frac{n\beta^6}{\alpha\beta^5+120}\left(\alpha + \frac{\beta}{6}(\ln y)^6\right)y^{\beta-1}\left\{\left[1 + \frac{A(y)}{6(\alpha\beta^5+120)}\right]y^{\beta}\right\}^{n-1}$$
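The order-statistic density in equation (9) can be sketched directly from the pdf and cdf. The example below checks that the density of the 2nd order statistic in a sample of size 5 integrates to one and that the k = 1 case reduces to $n f(y)[1-F(y)]^{n-1}$; the sample size, the index k, and all function names are illustrative assumptions.

```python
import math

def lhd_pdf(y, alpha, beta):
    """Log-Hamza density, equation (3), for 0 < y < 1."""
    const = beta**6 / (alpha * beta**5 + 120.0)
    return const * (alpha + beta / 6.0 * math.log(y)**6) * y**(beta - 1.0)

def lhd_cdf(y, alpha, beta):
    """Log-Hamza cdf, equation (4), for 0 < y < 1."""
    u = beta * math.log(y)
    poly = u**6 - 6*u**5 + 30*u**4 - 120*u**3 + 360*u**2 - 720*u
    return (1.0 + poly / (6.0 * (alpha * beta**5 + 120.0))) * y**beta

def order_stat_pdf(y, k, n, alpha, beta):
    """Density of the k-th order statistic in a sample of size n, equation (9)."""
    coef = math.factorial(n) / (math.factorial(k - 1) * math.factorial(n - k))
    cdf = lhd_cdf(y, alpha, beta)
    return coef * lhd_pdf(y, alpha, beta) * cdf**(k - 1) * (1.0 - cdf)**(n - k)

def integrate_unit(g, z_max=100.0, steps=20_000):
    """Midpoint rule for the integral of g over (0, 1), using y = exp(-z)."""
    h = z_max / steps
    total = 0.0
    for i in range(steps):
        y = math.exp(-(i + 0.5) * h)
        total += g(y) * y * h
    return total

alpha, beta = 1.9503, 2.0969      # LHD estimates from Table 2
n, k = 5, 2
mass = integrate_unit(lambda y: order_stat_pdf(y, k, n, alpha, beta))
print(round(mass, 4))
```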
6. Maximum Likelihood Estimation of log-Hamza Distribution
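Let $y_1, y_2, y_3, \ldots, y_n$ be a random sample drawn from the log-Hamza distribution. The likelihood function of the n observations is given as
$$L = \prod_{i=1}^{n}\frac{\beta^6}{\alpha\beta^5+120}\left(\alpha + \frac{\beta}{6}(\ln y_i)^6\right)y_i^{\beta-1}$$
The log-likelihood function is given as
$$l = 6n\log(\beta) - n\log(\alpha\beta^5+120) + \sum_{i=1}^{n}\log\left(\alpha + \frac{\beta}{6}(\log y_i)^6\right) + (\beta-1)\sum_{i=1}^{n}\log y_i \tag{10}$$
The partial derivatives of the log-likelihood function with respect to α and β are given as
$$\frac{\partial l}{\partial\alpha} = -\frac{n\beta^5}{\alpha\beta^5+120} + \sum_{i=1}^{n}\frac{6}{6\alpha + \beta(\ln y_i)^6} \tag{11}$$
$$\frac{\partial l}{\partial\beta} = \frac{6n}{\beta} - \frac{5n\alpha\beta^4}{\alpha\beta^5+120} + \sum_{i=1}^{n}\frac{(\ln y_i)^6}{6\alpha + \beta(\ln y_i)^6} + \sum_{i=1}^{n}\log(y_i) \tag{12}$$
The ML estimates are obtained by setting equations (11) and (12) equal to zero; since the resulting system is nonlinear in α and β, it has to be solved numerically.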
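The log-likelihood (10) and the score equations (11) and (12) can be sketched as follows. The data are the SC16 values analysed in the data analysis section, and the parameter values are the LHD estimates reported in Table 2; the function names and the finite-difference check of the score are illustrative assumptions rather than the authors' implementation.

```python
import math

DATA = [0.853, 0.759, 0.866, 0.809, 0.717, 0.544, 0.492, 0.403, 0.344, 0.213,
        0.116, 0.116, 0.092, 0.070, 0.059, 0.048, 0.036, 0.029, 0.021, 0.014,
        0.011, 0.008, 0.006]

def log_likelihood(alpha, beta, data):
    """Log-likelihood of equation (10)."""
    n = len(data)
    logs = [math.log(y) for y in data]
    return (6.0 * n * math.log(beta)
            - n * math.log(alpha * beta**5 + 120.0)
            + sum(math.log(alpha + beta / 6.0 * v**6) for v in logs)
            + (beta - 1.0) * sum(logs))

def score(alpha, beta, data):
    """Partial derivatives (11) and (12) of the log-likelihood."""
    n = len(data)
    logs = [math.log(y) for y in data]
    d = alpha * beta**5 + 120.0
    d_alpha = -n * beta**5 / d + sum(6.0 / (6.0 * alpha + beta * v**6) for v in logs)
    d_beta = (6.0 * n / beta - 5.0 * n * alpha * beta**4 / d
              + sum(v**6 / (6.0 * alpha + beta * v**6) for v in logs)
              + sum(logs))
    return d_alpha, d_beta

alpha_hat, beta_hat = 1.9503, 2.0969   # LHD estimates from Table 2
minus_two_l = -2.0 * log_likelihood(alpha_hat, beta_hat, DATA)

# Central finite differences as an independent check of the analytic score
a0, b0, h = 1.0, 1.5, 1e-6
fd_alpha = (log_likelihood(a0 + h, b0, DATA) - log_likelihood(a0 - h, b0, DATA)) / (2 * h)
fd_beta = (log_likelihood(a0, b0 + h, DATA) - log_likelihood(a0, b0 - h, DATA)) / (2 * h)
print(minus_two_l, score(alpha_hat, beta_hat, DATA))
```

Evaluated at the Table 2 estimates, the value of -2l should be close to the figure reported for the LHD in Table 3. In practice the score equations would be passed to a numerical root finder or the log-likelihood to a general-purpose optimiser.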
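For interval estimation and hypothesis tests on the model parameters, an information matrix is required. The 2 by 2 information matrix is
$$I(\xi) = -\frac{1}{n}\begin{bmatrix} E\left(\dfrac{\partial^2 l}{\partial\alpha^2}\right) & E\left(\dfrac{\partial^2 l}{\partial\alpha\,\partial\beta}\right) \\ E\left(\dfrac{\partial^2 l}{\partial\beta\,\partial\alpha}\right) & E\left(\dfrac{\partial^2 l}{\partial\beta^2}\right) \end{bmatrix}$$
where $\xi = (\alpha, \beta)$. The elements of the above information matrix can be obtained by differentiating equations (11) and (12) again partially. Under standard regularity conditions, as $n \to \infty$ the distribution of $\sqrt{n}(\hat\xi - \xi)$ can be approximated by a bivariate normal $N(0, I(\hat\xi)^{-1})$ distribution, which is used to construct approximate confidence intervals for the parameters. Hence the approximate $100(1-\psi)\%$ confidence intervals for α and β are respectively given by
$$\hat\alpha \pm Z_{\psi/2}\sqrt{\frac{I^{-1}_{\alpha\alpha}(\hat\xi)}{n}} \quad\text{and}\quad \hat\beta \pm Z_{\psi/2}\sqrt{\frac{I^{-1}_{\beta\beta}(\hat\xi)}{n}}$$
where $Z_{\psi/2}$ is the upper $\psi/2$ quantile of the standard normal distribution and
$$\frac{\partial^2 l}{\partial\alpha^2} = \frac{n\beta^{10}}{(\alpha\beta^5+120)^2} - \sum_{i=1}^{n}\frac{36}{\left(6\alpha + \beta(\ln y_i)^6\right)^2}$$
$$\frac{\partial^2 l}{\partial\beta^2} = -\frac{6n}{\beta^2} - \frac{5n\alpha\beta^3(480 - \alpha\beta^5)}{(\alpha\beta^5+120)^2} - \sum_{i=1}^{n}\frac{(\ln y_i)^{12}}{\left(6\alpha + \beta(\ln y_i)^6\right)^2}$$
$$\frac{\partial^2 l}{\partial\alpha\,\partial\beta} = \frac{\partial^2 l}{\partial\beta\,\partial\alpha} = -\frac{600n\beta^4}{(\alpha\beta^5+120)^2} - \sum_{i=1}^{n}\frac{6(\ln y_i)^6}{\left(6\alpha + \beta(\ln y_i)^6\right)^2}$$
7. Data Analysis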
This section evaluates a real-world data set to demonstrate the applicability and effectiveness of the log-Hamza distribution. The adaptability of the log-Hamza distribution (LHD) is assessed by comparing its fit with that of other analogous distributions, namely the beta distribution (BD), the Kumaraswamy distribution (KSD) and the Topp-Leone distribution (TLD).
To compare the fitted distributions, we consider the criteria AIC (Akaike information criterion), AICC (corrected Akaike information criterion), BIC (Bayesian information criterion) and HQIC (Hannan-Quinn information criterion), together with the Kolmogorov-Smirnov (K.S) statistic. The distribution with the smaller AIC, AICC, BIC and HQIC values is considered better.
$$AIC = -2l + 2p,\qquad AICC = -2l + \frac{2pm}{m-p-1},\qquad BIC = -2l + p\log(m)$$
$$HQIC = -2l + 2p\log(\log(m)),\qquad K.S = \max_{1\le j\le m}\left(F(x_j) - \frac{j-1}{m},\ \frac{j}{m} - F(x_j)\right)$$
where l denotes the maximized log-likelihood, p is the number of parameters, m is the sample size and $x_j$ is the jth ordered observation.
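The four information criteria are simple functions of -2l, p and m. As a worked example, the sketch below recomputes them for the LHD row of Table 3 from the reported value -2l = -25.551 with p = 2 parameters and m = 23 observations; the function name is an illustrative assumption.

```python
import math

def information_criteria(minus_two_l, p, m):
    """AIC, AICC, BIC and HQIC from -2l, parameter count p and sample size m."""
    aic = minus_two_l + 2.0 * p
    aicc = minus_two_l + 2.0 * p * m / (m - p - 1.0)
    bic = minus_two_l + p * math.log(m)
    hqic = minus_two_l + 2.0 * p * math.log(math.log(m))
    return aic, aicc, bic, hqic

# LHD row of Table 3: -2l = -25.551, two parameters, 23 observations
aic, aicc, bic, hqic = information_criteria(-25.551, p=2, m=23)
print(round(aic, 3), round(aicc, 3), round(bic, 3), round(hqic, 3))
```

The rounded values agree with the LHD entries in Table 3 (-21.551, -20.951, -19.280, -20.980).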
Data set: The following observations are due to Caramanis et al. [2] and Mazumdar and Gaver [9], who compare two distinct algorithms, called SC16 and P3, for estimating unit capacity factors. The values obtained from the algorithm SC16 are
0.853, 0.759, 0.866, 0.809, 0.717, 0.544, 0.492, 0.403, 0.344, 0.213, 0.116, 0.116, 0.092, 0.070, 0.059, 0.048, 0.036, 0.029, 0.021, 0.014, 0.011, 0.008, 0.006.
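The location summaries reported in Table 1 can be reproduced from the 23 observations with the Python standard library. In this sketch the quartiles use the "inclusive" interpolation rule, which is an assumption about how the quartiles in Table 1 were computed; it does reproduce the tabulated values.

```python
import statistics

data = [0.853, 0.759, 0.866, 0.809, 0.717, 0.544, 0.492, 0.403, 0.344, 0.213,
        0.116, 0.116, 0.092, 0.070, 0.059, 0.048, 0.036, 0.029, 0.021, 0.014,
        0.011, 0.008, 0.006]

sample_size = len(data)
sample_mean = statistics.mean(data)
sample_median = statistics.median(data)
# Quartiles by linear interpolation between order statistics ("inclusive" rule)
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

print(sample_size, min(data), max(data))
print(round(q1, 4), round(sample_median, 4), round(sample_mean, 4), round(q3, 4))
```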
The ML estimates of the unknown parameters, with corresponding standard errors in parentheses, are presented in Table 2, and the comparison statistics AIC, AICC, BIC and HQIC and the goodness-of-fit statistic for the data set are displayed in Table 3.

Table 1: Descriptive statistics for data set

| Min. | Max. | 1st Qu. | Med. | Mean | 3rd Qu. | Kurt. | Skew. |
|---|---|---|---|---|---|---|---|
| 0.0060 | 0.8660 | 0.0325 | 0.1160 | 0.2881 | 0.5180 | 1.9741 | 0.7676 |

Table 2: The ML estimates (standard errors in parentheses) for data set

| Model | α | β |
|---|---|---|
| LHD | 1.9503 (1.5513) | 2.0969 (0.2355) |
| BD | 0.4869 (0.1208) | 1.1679 (0.3577) |
| KSD | 0.5043 (1.1862) | 0.0242 (0.3264) |
| TLD | 0.5943 (0.1239) | |

Table 3: Comparison criteria and goodness-of-fit statistics for data set

| Model | -2l | AIC | AICC | BIC | HQIC | K.S statistic | p-value |
|---|---|---|---|---|---|---|---|
| LHD | -25.551 | -21.551 | -20.951 | -19.280 | -20.980 | 0.1034 | 0.9663 |
| BD | -19.214 | -15.214 | -14.614 | -12.943 | -14.643 | 0.183 | 0.4202 |
| KSD | -19.341 | -15.341 | -14.741 | -13.070 | -14.770 | 0.178 | 0.4526 |
| TLD | -16.230 | -14.230 | -14.039 | -13.094 | -13.944 | 0.168 | 0.5273 |
Figure 3: Estimated pdfs of fitted models for the data set
It is observed from Table 3 that the LHD provides a better fit than the competing models: it has the smallest AIC, AICC, BIC and HQIC values, the smallest K-S statistic and the largest p-value.
8. Conclusion
This study proposed a new two-parameter distribution, the log-Hamza distribution, which is defined on the unit interval and can be used for modelling real-life data. Several structural properties of the proposed distribution, including moments, the moment generating function, order statistics and reliability measures, have been discussed. The parameters of the distribution are estimated by the method of maximum likelihood. Finally, the efficiency of the distribution is examined through an application in which it is compared with the beta, Kumaraswamy and Topp-Leone distributions.
References
[1] A. Aijaz, M. Jallal, S.Q. Ain Ul and R. Tripathi. The Hamza distribution with statistical properties and applications. Asian Journal of Probability and Statistics, 8 (2020), 28-42.
[2] M. Caramanis, J. Stremel, W. Fleck and S. Daniel. Probabilistic production costing: an investigation of alternative algorithms. International Journal of Electrical Power and Energy Systems, 5(2) (1983), 75-86.
[3] G.M. Cordeiro and M. de Castro. A new family of generalized distributions. Journal of Statistical Computation and Simulation, 81(7) (2011), 883-898.
[4] Gomez-Deniz, M.A. Sordo and E. Calderin-Ojeda. The Log-Lindley distribution as an alternative to the beta regression model with applications in insurance. Insurance: Mathematics and Economics, 54 (2014), 49-57.
[5] M.E. Ghitany, J. Mazucheli, A.F.B. Menezes and F. Alqallaf. The unit-inverse Gaussian distribution: A new alternative to two-parameter distributions on the unit interval. Commun. Stat. Theory Methods, 48 (2019), 3423-3438.
[6] M.A. Haq, S. Hashmi, K. Aidi, P.L. Ramos and F. Louzada. Unit modified Burr-III distribution: Estimation, characterization and validation test. Ann. Data Sci., 87(15) (2020).
[7] J. Mazucheli, A.F.B. Menezes, L.B. Fernandes, R.P. de Oliveira and M.E. Ghitany. The unit-Weibull distribution and associated inference. J. Appl. Probab. Stat., 13 (2018), 1-22.
[8] A.F.B. Menezes, J. Mazucheli and S. Dey. The unit-Gompertz distribution with applications. Statistica, 79 (2019), 25-43.
[9] M. Mazumdar and D.P. Gaver. On the computation of power-generating system reliability indexes. Technometrics, 26(2) (1984), 173-185.
[10] S. Nadarajah and S. Kotz. Moments of some J-shaped distributions. Journal of Applied Statistics, 30(3) (2003), 311-317.
[11] J. Rodrigues, J.L. Bazan and A.K. A flexible procedure for formulating probability distributions on the unit interval with applications. Commun. Stat. Theory Methods, 49 (2020), 738-754.
[12] C.W. Topp and F.C. Leone. A family of J-shaped frequency functions. Journal of the American Statistical Association, 50(269) (1955), 209-219.