CHARACTERIZATION OF CHANDBHAS-P DISTRIBUTION AND ITS APPLICATIONS IN MEDICAL
SCIENCE
Praseeja C B¹, Prasanth C B²*, C Subramanian³, Unnikrishnan T⁴
¹Research Scholar, Department of Statistics, Annamalai University, Tamil Nadu, India, prasinikhil@gmail.com
²Assistant Professor, Department of Statistics, Sree Keralawarma College, Kerala, [email protected] (*Corresponding author)
³Associate Professor, Department of Statistics, Annamalai University, Tamil Nadu
⁴Assistant Professor, Department of Statistics, Shri C Achuthamenon Govt. College, Kerala
Abstract
The current research studies the length-biased version of the new two-parameter Sujatha distribution, referred to as the ChandBhas-P Distribution (CBPD). Its structural properties are discussed, and the model parameters are estimated by maximum likelihood. The distribution is examined with two real data sets. The first is the birth weight of newborn babies randomly selected from a hospital at Thrissur, Kerala. The second is the weight of children aged between three months and four years, collected from a few babysitting centres and play schools across Thrissur. Both are used to assess goodness of fit.
Keywords: estimation, Survival analysis, two-parameter Sujatha distribution, Length biased distribution, data.
1. Introduction
Statistical probability distributions are often the foundation of data analysis in the real world. However, data from other domains, including environmental, biomedical, financial, and other areas, may not match the conventional distributions. Thus, it becomes necessary to create new distributions that will effectively capture extreme skewness and kurtosis while improving the goodness-of-fit of empirical distributions.
The most common classical distributions often fail to fit and forecast data in applied fields such as engineering, the biological and environmental sciences, the medical and life sciences, finance, and economics. As a result, many generalized families have been developed to extend the classical distributions. These families provide more versatility in applications and have been thoroughly investigated and used to represent data in many applied fields. In statistical distribution theory, this often involves adding a new parameter to an existing family of distribution functions. A class of distribution functions can often be made more flexible by adding parameters, which can be extremely helpful for data analysis.
Weighted distributions play a vital part in distribution theory because introducing an extra parameter adds flexibility and offers a new understanding of the classical distributions already in use. In particular, weighted distributions help in selecting an appropriate model for observed data when samples are taken without a suitable sampling frame. They are used to describe heterogeneity, extraneous variance, and clustered sampling in a data set. A weighted distribution arises when sample observations are recorded with unequal probabilities, and it offers a solution when the observations fall into non-replicated, non-experimental, or non-random categories. Weighted distributions thus provide a unified approach to problems of model formulation and data interpretation: they adjust the probabilities of events as observed and recorded. When the weight function depends only on the length of the unit of interest, the weighted distribution reduces to a length-biased distribution.
The notion of weighted distributions was introduced to record observations according to a given weight function. Fisher [8] used weighted distributions to study how the method of ascertainment affects the distribution of the recorded observations. Rao [13] later developed the idea into a unified theory for modeling statistical data in situations where the usual practice of applying standard distributions was found to be inappropriate. Zelen [18] and Cox [4] first proposed the idea of length-biased sampling. Cox [5] first presented the statistical interpretation of the length-biased distribution in the context of renewal theory. Length-biased sampling situations can arise in survival analysis, reliability theory, clinical trials, and population studies when an appropriate sampling frame is missing.
Various investigators have made significant contributions by developing length-biased probability distributions and applying them to lifetime data sets from diverse fields. Dar et al. [6] presented the Poisson size-biased Lindley distribution and its applications. Ganaie and Rajagopalan [9] described a length-biased weighted quasi-gamma distribution with characterizations and applications. Beghriche and Zeghdoudi [3] studied a size-biased gamma Lindley distribution. Ekhosuehil et al. [7] proposed the Weibull length-biased exponential distribution with statistical properties and applications. Alhyasat et al. [1] introduced the power size-biased two-parameter Akash distribution. Sen et al. [15] discussed the weighted Xgamma distribution with properties and application. Rasool and Ahmad [12] introduced the power length-biased weighted Lomax distribution. Mobarak et al. [10] studied the size-biased weighted transmuted Weibull distribution. Sanat [14] proposed the beta-length biased Pareto distribution along with its properties. Al-Kadim and Hussein [2] presented length-biased weighted exponential and Rayleigh distributions with application. Recently, Mustafa and Khan [11] introduced the length-biased powered inverse Rayleigh distribution with applications, which shows more flexibility than the conventional distribution.
The new two-parameter Sujatha distribution is a two-parameter lifetime model recently suggested by Tesfay and Shanker [16]. The one-parameter Akash and Sujatha distributions are special cases of it. Tesfay and Shanker [17] also proposed a two-parameter Sujatha distribution, which includes the size-biased Lindley and Sujatha distributions as special cases.
2. ChandBhas-P Distribution (CBPD)
The probability density function (pdf) of the new two-parameter Sujatha distribution is
Praseeja C B, Prasanth C B, C Subramanian, Unnikrishnan T RT&A N
CHARACTERIZATION OF CHANDBHAS-P DISTRIBUTION AND '
Volume 18' December 2023
_ITS APPLICATIONS IN MEDICAL SCIENCE_
$$f(x;\theta,\alpha) = \frac{\theta^{3}}{\theta^{2}+\alpha\theta+2}\,(1+\alpha x+x^{2})\,e^{-\theta x}; \quad x>0,\ \theta>0,\ \alpha>0 \tag{1}$$

and the cumulative distribution function (cdf) of the new two-parameter Sujatha distribution is

$$F(x;\theta,\alpha) = 1-\left[1+\frac{\theta x(\theta x+\alpha\theta+2)}{\theta^{2}+\alpha\theta+2}\right]e^{-\theta x}; \quad x>0,\ \theta>0,\ \alpha>0 \tag{2}$$
Suppose $f(x)$ is the pdf of a non-negative random variable $X$. Then the pdf of the corresponding weighted random variable is

$$f_w(x) = \frac{w(x)\,f(x)}{E(w(x))}, \quad x>0,$$

where $w(x)$ is a non-negative weight function and $E(w(x)) = \int w(x)\,f(x)\,dx < \infty$.
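There are several forms of weighted models. In particular, the choice $w(x) = x^{c}$ gives the size-biased (weighted) distribution of order $c$. The length-biased form of the new two-parameter Sujatha distribution is analyzed here. Taking the weight function $w(x) = x$ gives the length-biased distribution with pdf

$$f_l(x) = \frac{x\,f(x)}{E(x)}, \quad x>0, \tag{3}$$

where

$$E(x) = \int_0^\infty x\,f(x;\theta,\alpha)\,dx = \frac{\theta^{2}+2\alpha\theta+6}{\theta(\theta^{2}+\alpha\theta+2)}. \tag{4}$$

Substituting eqs. (1) and (4) into eq. (3) gives the pdf of the CBP distribution,

$$f_l(x) = \frac{x\,\theta^{4}}{\theta^{2}+2\alpha\theta+6}\,(1+\alpha x+x^{2})\,e^{-\theta x}; \quad x>0,\ \theta>0,\ \alpha>0, \tag{5}$$

and the cdf of the CBP distribution is

$$F_l(x) = \int_0^x f_l(x)\,dx = \frac{1}{\theta^{2}+2\alpha\theta+6}\left(\theta^{4}\int_0^x x\,e^{-\theta x}\,dx + \alpha\theta^{4}\int_0^x x^{2}e^{-\theta x}\,dx + \theta^{4}\int_0^x x^{3}e^{-\theta x}\,dx\right). \tag{6}$$

Put $\theta x = t$, so that $\theta\,dx = dt$, $dx = dt/\theta$ and $x = t/\theta$. When $x \to 0$, $t \to 0$, and the upper limit $x$ becomes $\theta x$. After simplification of eq. (6), we get the cdf of the CBP distribution as

$$F_l(x) = \frac{1}{\theta^{2}+2\alpha\theta+6}\left(\theta^{2}\gamma(2,\theta x)+\alpha\theta\,\gamma(3,\theta x)+\gamma(4,\theta x)\right), \tag{7}$$

where $\gamma(s,z) = \int_0^z t^{s-1}e^{-t}\,dt$ is the lower incomplete gamma function.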
Figures 1 and 2 show the shape of the pdf and cdf for different values of $\alpha$ and $\theta$. As $\alpha$ and $\theta$ decrease, the pdf curve becomes less skewed.
Figure 1. pdf of CBP Distribution
Figure 2. cdf of CBP Distribution
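As a rough numerical check of eqs. (5) and (7), the sketch below evaluates the CBP pdf and cdf in plain Python and compares the closed-form cdf with a trapezoidal integral of the pdf. The function names and the parameter values are illustrative assumptions, not taken from the paper. For integer $n$ the lower incomplete gamma function is computed from the standard closed form $\gamma(n,z) = (n-1)!\,\bigl(1 - e^{-z}\sum_{k=0}^{n-1} z^{k}/k!\bigr)$.

```python
import math

def lower_inc_gamma_int(n, z):
    """Lower incomplete gamma gamma(n, z) for a positive integer n (closed form)."""
    partial = sum(z ** k / math.factorial(k) for k in range(n))
    return math.factorial(n - 1) * (1.0 - math.exp(-z) * partial)

def cbpd_pdf(x, theta, alpha):
    """pdf of the CBP distribution, eq. (5)."""
    norm = theta ** 2 + 2 * alpha * theta + 6
    return theta ** 4 / norm * x * (1 + alpha * x + x ** 2) * math.exp(-theta * x)

def cbpd_cdf(x, theta, alpha):
    """cdf of the CBP distribution, eq. (7)."""
    norm = theta ** 2 + 2 * alpha * theta + 6
    z = theta * x
    return (theta ** 2 * lower_inc_gamma_int(2, z)
            + alpha * theta * lower_inc_gamma_int(3, z)
            + lower_inc_gamma_int(4, z)) / norm

def trapezoid(f, a, b, steps=20000):
    """Composite trapezoidal rule, used only as an independent check."""
    h = (b - a) / steps
    inner = sum(f(a + i * h) for i in range(1, steps))
    return h * (0.5 * f(a) + inner + 0.5 * f(b))

theta, alpha = 1.5, 0.5  # illustrative values, not estimates from the paper
numeric = trapezoid(lambda t: cbpd_pdf(t, theta, alpha), 0.0, 2.0)
print(cbpd_cdf(2.0, theta, alpha), numeric)  # the two values should agree closely
```

The closed form avoids any dependency on a special-function library. For very small arguments it loses precision through cancellation, so a library routine for the regularized incomplete gamma function may be preferable in production use.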
3. Characteristics of the CBPD
3.1 Survival function
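$$S(x) = 1 - F_l(x) = 1 - \frac{1}{\theta^{2}+2\alpha\theta+6}\left(\theta^{2}\gamma(2,\theta x)+\alpha\theta\,\gamma(3,\theta x)+\gamma(4,\theta x)\right)$$

3.2 Hazard Rate
The hazard rate, also known as the hazard function or failure rate, is given by

$$h(x) = \frac{f_l(x)}{1-F_l(x)} = \frac{x\,\theta^{4}(1+\alpha x+x^{2})\,e^{-\theta x}}{(\theta^{2}+2\alpha\theta+6)-\left(\theta^{2}\gamma(2,\theta x)+\alpha\theta\,\gamma(3,\theta x)+\gamma(4,\theta x)\right)}$$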
Figures 3 and 4 show the survival function and hazard rate of the new distribution. Figure 3 shows that the survival function of the CBPD decreases as $x$ increases for different values of $\alpha$ and $\theta$. Figure 4 shows the behaviour of the hazard rate for different values of $\alpha$ and $\theta$.
Figure 3. Survival function of CBP Distribution for different $\alpha$ and $\theta$
Figure 4. Hazard rate of CBP Distribution for different $\alpha$ and $\theta$
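The survival function and hazard rate can be sketched in the same way. The code below is a minimal illustration under assumed parameter values. The helper names are hypothetical, and the incomplete gamma function again uses the closed form for integer order.

```python
import math

def lower_inc_gamma_int(n, z):
    """Lower incomplete gamma gamma(n, z) for a positive integer n (closed form)."""
    partial = sum(z ** k / math.factorial(k) for k in range(n))
    return math.factorial(n - 1) * (1.0 - math.exp(-z) * partial)

def cbpd_pdf(x, theta, alpha):
    """pdf of the CBP distribution."""
    norm = theta ** 2 + 2 * alpha * theta + 6
    return theta ** 4 / norm * x * (1 + alpha * x + x ** 2) * math.exp(-theta * x)

def cbpd_survival(x, theta, alpha):
    """S(x) = 1 - F_l(x), with F_l built from lower incomplete gamma terms."""
    norm = theta ** 2 + 2 * alpha * theta + 6
    z = theta * x
    gamma_terms = (theta ** 2 * lower_inc_gamma_int(2, z)
                   + alpha * theta * lower_inc_gamma_int(3, z)
                   + lower_inc_gamma_int(4, z))
    return 1.0 - gamma_terms / norm

def cbpd_hazard(x, theta, alpha):
    """h(x) = f_l(x) / (1 - F_l(x))."""
    return cbpd_pdf(x, theta, alpha) / cbpd_survival(x, theta, alpha)

theta, alpha = 1.5, 0.5  # illustrative values only
for x in (0.5, 1.0, 2.0, 4.0):
    print(x, cbpd_survival(x, theta, alpha), cbpd_hazard(x, theta, alpha))
```

Computing $S(x)$ as one minus the cdf mirrors the formula in the text. Far in the right tail this subtraction loses precision, so the sketch is meant for moderate values of $x$.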
4. Statistical Properties
4.1 Moments
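The $r$th order raw moment $E(X^{r})$ of a random variable $X$ following the CBPD with parameters $\theta$ and $\alpha$ is

$$\begin{aligned}
E(X^{r}) = \mu_r' &= \int_0^\infty x^{r} f_l(x)\,dx \\
&= \int_0^\infty x^{r}\,\frac{x\,\theta^{4}}{\theta^{2}+2\alpha\theta+6}\,(1+\alpha x+x^{2})\,e^{-\theta x}\,dx \\
&= \frac{\theta^{4}}{\theta^{2}+2\alpha\theta+6}\int_0^\infty x^{r+1}(1+\alpha x+x^{2})\,e^{-\theta x}\,dx \\
&= \frac{\theta^{4}}{\theta^{2}+2\alpha\theta+6}\left(\int_0^\infty x^{(r+2)-1}e^{-\theta x}\,dx+\alpha\int_0^\infty x^{(r+3)-1}e^{-\theta x}\,dx+\int_0^\infty x^{(r+4)-1}e^{-\theta x}\,dx\right).
\end{aligned} \tag{8}$$

By simplifying eq. (8), we obtain

$$E(X^{r}) = \mu_r' = \frac{\theta^{2}\,\Gamma(r+2)+\alpha\theta\,\Gamma(r+3)+\Gamma(r+4)}{\theta^{r}(\theta^{2}+2\alpha\theta+6)}. \tag{9}$$

Putting $r = 1, 2, 3, 4$ in eq. (9) gives the first four moments of the CBP distribution:

$$E(X) = \mu_1' = \frac{2\theta^{2}+6\alpha\theta+24}{\theta(\theta^{2}+2\alpha\theta+6)}$$

$$E(X^{2}) = \mu_2' = \frac{6\theta^{2}+24\alpha\theta+120}{\theta^{2}(\theta^{2}+2\alpha\theta+6)}$$

$$E(X^{3}) = \mu_3' = \frac{24\theta^{2}+120\alpha\theta+720}{\theta^{3}(\theta^{2}+2\alpha\theta+6)}$$

$$E(X^{4}) = \mu_4' = \frac{120\theta^{2}+720\alpha\theta+5040}{\theta^{4}(\theta^{2}+2\alpha\theta+6)}$$

$$\text{Variance} = \frac{6\theta^{2}+24\alpha\theta+120}{\theta^{2}(\theta^{2}+2\alpha\theta+6)}-\left(\frac{2\theta^{2}+6\alpha\theta+24}{\theta(\theta^{2}+2\alpha\theta+6)}\right)^{2}$$

$$\text{S.D.}(\sigma) = \sqrt{\frac{6\theta^{2}+24\alpha\theta+120}{\theta^{2}(\theta^{2}+2\alpha\theta+6)}-\left(\frac{2\theta^{2}+6\alpha\theta+24}{\theta(\theta^{2}+2\alpha\theta+6)}\right)^{2}}$$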
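The moment formula in eq. (9) is straightforward to evaluate numerically. The following sketch, with a hypothetical function name and illustrative parameter values, computes the raw moments and derives the mean, variance, and standard deviation from them.

```python
import math

def cbpd_raw_moment(r, theta, alpha):
    """r-th raw moment of the CBP distribution, eq. (9)."""
    norm = theta ** 2 + 2 * alpha * theta + 6
    numerator = (theta ** 2 * math.gamma(r + 2)
                 + alpha * theta * math.gamma(r + 3)
                 + math.gamma(r + 4))
    return numerator / (theta ** r * norm)

theta, alpha = 1.5, 0.5  # illustrative values, not estimates from the paper
mean = cbpd_raw_moment(1, theta, alpha)
second = cbpd_raw_moment(2, theta, alpha)
variance = second - mean ** 2
std_dev = math.sqrt(variance)
print(mean, variance, std_dev)
```

Setting `r = 0` returns 1, which confirms that the density integrates to one, and `r = -1` reproduces the closed form for $E(1/X)$ derived in Section 4.2.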
4.2 Harmonic mean
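$$\begin{aligned}
H.M. = E\left(\frac{1}{X}\right) &= \int_0^\infty \frac{1}{x}\,f_l(x)\,dx \\
&= \int_0^\infty \frac{\theta^{4}}{\theta^{2}+2\alpha\theta+6}\,(1+\alpha x+x^{2})\,e^{-\theta x}\,dx \\
&= \frac{\theta^{4}}{\theta^{2}+2\alpha\theta+6}\left(\int_0^\infty x^{(1)-1}e^{-\theta x}\,dx+\alpha\int_0^\infty x^{(2)-1}e^{-\theta x}\,dx+\int_0^\infty x^{(3)-1}e^{-\theta x}\,dx\right).
\end{aligned} \tag{10}$$

After simplification of eq. (10), we obtain

$$H.M. = \frac{\theta(\theta^{2}+\alpha\theta+2)}{\theta^{2}+2\alpha\theta+6}$$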
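The result of this subsection can be cross-checked against the mean of the parent distribution. The short derivation below is a sketch added for illustration. It uses only the definition of the length-biased pdf, $f_l(x) = x f(x)/E(x)$, and the parent mean $E(x) = (\theta^{2}+2\alpha\theta+6)/\bigl(\theta(\theta^{2}+\alpha\theta+2)\bigr)$.

```latex
E_l\!\left(\frac{1}{X}\right)
  = \int_0^\infty \frac{1}{x}\cdot\frac{x\,f(x)}{E(x)}\,dx
  = \frac{1}{E(x)}\int_0^\infty f(x)\,dx
  = \frac{1}{E(x)}
  = \frac{\theta(\theta^{2}+\alpha\theta+2)}{\theta^{2}+2\alpha\theta+6}.
```

The quantity derived in this subsection is $E(1/X)$. Under the usual convention the harmonic mean is the reciprocal $1/E(1/X)$, which by the identity above equals the mean $E(x)$ of the new two-parameter Sujatha distribution.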
4.3 Moment generating & Characteristic function
Let $X$ be a random variable following the CBP distribution with parameters $\theta$ and $\alpha$. The moment generating function of $X$ is

$$M_X(t) = E(e^{tX}) = \int_0^\infty e^{tx} f_l(x)\,dx.$$

Using the Taylor series expansion, we obtain

$$\begin{aligned}
M_X(t) = E(e^{tX}) &= \int_0^\infty \left(1+tx+\frac{(tx)^{2}}{2!}+\cdots\right) f_l(x)\,dx \\
&= \int_0^\infty \sum_{j=0}^{\infty}\frac{t^{j}}{j!}\,x^{j} f_l(x)\,dx = \sum_{j=0}^{\infty}\frac{t^{j}}{j!}\,\mu_j' \\
&= \sum_{j=0}^{\infty}\frac{t^{j}}{j!}\,\frac{\theta^{2}\,\Gamma(j+2)+\alpha\theta\,\Gamma(j+3)+\Gamma(j+4)}{\theta^{j}(\theta^{2}+2\alpha\theta+6)}.
\end{aligned}$$

Hence,

$$M_X(t) = \frac{1}{\theta^{2}+2\alpha\theta+6}\sum_{j=0}^{\infty}\frac{t^{j}}{j!\,\theta^{j}}\left[\theta^{2}\,\Gamma(j+2)+\alpha\theta\,\Gamma(j+3)+\Gamma(j+4)\right]. \tag{11}$$

Similarly, the characteristic function of the CBP distribution is

$$\varphi_X(t) = M_X(it) = \frac{1}{\theta^{2}+2\alpha\theta+6}\sum_{j=0}^{\infty}\frac{(it)^{j}}{j!\,\theta^{j}}\left[\theta^{2}\,\Gamma(j+2)+\alpha\theta\,\Gamma(j+3)+\Gamma(j+4)\right]. \tag{12}$$

5. Results and Discussions
5.1 Likelihood Ratio Test
Consider a random sample $X_1, X_2, \ldots, X_n$ of size $n$ drawn from the CBP distribution. To examine whether the length-biasing is significant, we test the hypothesis

$$H_0: f(x) = f(x;\theta,\alpha) \quad \text{vs.} \quad H_1: f(x) = f_l(x;\theta,\alpha).$$

To decide whether the random sample of size $n$ comes from the CBP distribution, the following test statistic is used:

$$\Lambda = \frac{L_1}{L_0} = \prod_{i=1}^{n}\frac{f_l(x_i;\theta,\alpha)}{f(x_i;\theta,\alpha)} = \left(\frac{\theta(\theta^{2}+\alpha\theta+2)}{\theta^{2}+2\alpha\theta+6}\right)^{n}\prod_{i=1}^{n}x_i.$$

The null hypothesis is rejected if

$$\Lambda = \left(\frac{\theta(\theta^{2}+\alpha\theta+2)}{\theta^{2}+2\alpha\theta+6}\right)^{n}\prod_{i=1}^{n}x_i > k,$$

or equivalently if

$$\Lambda^{*} = \prod_{i=1}^{n}x_i > k\left(\frac{\theta^{2}+2\alpha\theta+6}{\theta(\theta^{2}+\alpha\theta+2)}\right)^{n},$$

that is,

$$\Lambda^{*} = \prod_{i=1}^{n}x_i > k^{*}, \quad \text{where } k^{*} = k\left(\frac{\theta^{2}+2\alpha\theta+6}{\theta(\theta^{2}+\alpha\theta+2)}\right)^{n}.$$

For a large sample size $n$, $2\log\Lambda$ is distributed as a chi-square distribution with 1 degree of freedom, and this distribution is used to obtain the p-value. We therefore reject the null hypothesis when the probability $p(\Lambda^{*} > \lambda^{*})$ is lower than the chosen level of significance, where $\lambda^{*} = \prod_{i=1}^{n}x_i$ is the observed value of the statistic $\Lambda^{*}$.
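In practice the product $\prod x_i$ overflows or underflows quickly, so the statistic is more safely handled on the log scale. The sketch below computes $\log\Lambda$ from the closed form and checks it against the direct ratio of the two densities. The sample and parameter values are hypothetical, chosen only for illustration.

```python
import math

def ntpsd_pdf(x, theta, alpha):
    """pdf of the new two-parameter Sujatha distribution (null model)."""
    norm = theta ** 2 + alpha * theta + 2
    return theta ** 3 / norm * (1 + alpha * x + x ** 2) * math.exp(-theta * x)

def cbpd_pdf(x, theta, alpha):
    """pdf of the CBP distribution (length-biased alternative)."""
    norm = theta ** 2 + 2 * alpha * theta + 6
    return theta ** 4 / norm * x * (1 + alpha * x + x ** 2) * math.exp(-theta * x)

def log_lrt_statistic(sample, theta, alpha):
    """log(Lambda) = n * log(ratio) + sum(log x_i), with ratio = 1 / E(x)."""
    n = len(sample)
    ratio = (theta * (theta ** 2 + alpha * theta + 2)
             / (theta ** 2 + 2 * alpha * theta + 6))
    return n * math.log(ratio) + sum(math.log(x) for x in sample)

sample = [2.9, 3.6, 2.8, 2.2, 2.4, 1.9]  # hypothetical observations
theta, alpha = 1.1, 0.5                  # hypothetical parameter values
log_lambda = log_lrt_statistic(sample, theta, alpha)
direct = sum(math.log(cbpd_pdf(x, theta, alpha) / ntpsd_pdf(x, theta, alpha))
             for x in sample)
print(log_lambda, direct)  # both routes should give the same value
```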
5.2. MLE of CBP distribution
The parameters of the CBP distribution are estimated by the method of maximum likelihood. Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ from the CBPD. The likelihood function is

$$L(x) = \prod_{i=1}^{n} f_l(x_i) = \frac{\theta^{4n}}{(\theta^{2}+2\alpha\theta+6)^{n}}\prod_{i=1}^{n}\left(x_i\,(1+\alpha x_i+x_i^{2})\,e^{-\theta x_i}\right).$$

The log-likelihood function is

$$\log L = 4n\log\theta - n\log(\theta^{2}+2\alpha\theta+6)+\sum_{i=1}^{n}\log x_i+\sum_{i=1}^{n}\log(1+\alpha x_i+x_i^{2})-\theta\sum_{i=1}^{n}x_i. \tag{13}$$

Differentiating the log-likelihood in eq. (13) with respect to $\theta$ and $\alpha$ and equating to zero, we get

$$\frac{\partial \log L}{\partial\theta} = \frac{4n}{\theta}-n\left(\frac{2\theta+2\alpha}{\theta^{2}+2\alpha\theta+6}\right)-\sum_{i=1}^{n}x_i = 0,$$

$$\frac{\partial \log L}{\partial\alpha} = -n\left(\frac{2\theta}{\theta^{2}+2\alpha\theta+6}\right)+\sum_{i=1}^{n}\left(\frac{x_i}{1+\alpha x_i+x_i^{2}}\right) = 0.$$

This system of nonlinear equations has no closed-form solution and is too complex to solve algebraically. We therefore estimate the parameters of the proposed distribution numerically, using R and Wolfram Mathematica.
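The paper reports using R and Wolfram Mathematica for this step. The Python sketch below is not the authors' procedure: it is a minimal illustration that codes the log-likelihood in eq. (13) and the two score equations, then solves the $\theta$ score equation by bisection for a fixed value of $\alpha$. The sample, the fixed $\alpha$, the bracketing interval, and all function names are assumptions made for the example.

```python
import math

def log_likelihood(theta, alpha, xs):
    """Log-likelihood of the CBP distribution, eq. (13)."""
    n = len(xs)
    norm = theta ** 2 + 2 * alpha * theta + 6
    return (4 * n * math.log(theta) - n * math.log(norm)
            + sum(math.log(x) for x in xs)
            + sum(math.log(1 + alpha * x + x * x) for x in xs)
            - theta * sum(xs))

def score(theta, alpha, xs):
    """Partial derivatives of log L with respect to theta and alpha."""
    n = len(xs)
    norm = theta ** 2 + 2 * alpha * theta + 6
    d_theta = 4 * n / theta - n * (2 * theta + 2 * alpha) / norm - sum(xs)
    d_alpha = -2 * n * theta / norm + sum(x / (1 + alpha * x + x * x) for x in xs)
    return d_theta, d_alpha

def solve_theta(alpha, xs, lo=1e-6, hi=100.0, iterations=200):
    """Bisection on the theta score for fixed alpha.

    Assumes the score is positive at lo and negative at hi, which holds for
    this sample; the bracket is an assumption, not a general guarantee.
    """
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if score(mid, alpha, xs)[0] > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

sample = [2.9, 3.6, 2.8, 2.2, 2.4, 1.9]  # hypothetical observations
alpha_fixed = 0.5                        # hypothetical fixed value of alpha
theta_hat = solve_theta(alpha_fixed, sample)
print(theta_hat, score(theta_hat, alpha_fixed, sample)[0])
```

A full fit would maximize over both parameters at once, for example with a general-purpose optimizer, while respecting the constraints $\theta > 0$ and $\alpha > 0$.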
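Asymptotic normality results are used to obtain confidence intervals. If $\hat{\beta} = (\hat{\theta},\hat{\alpha})$ denotes the MLE of $\beta = (\theta,\alpha)$, then

$$\sqrt{n}\,(\hat{\beta}-\beta) \to N_2\left(0, I^{-1}(\beta)\right),$$

where $I(\beta)$ is Fisher's information matrix, whose entries are obtained from

$$E\left(\frac{\partial^{2}\log L}{\partial\theta^{2}}\right) = -\frac{4n}{\theta^{2}}-n\left(\frac{2(\theta^{2}+2\alpha\theta+6)-(2\theta+2\alpha)^{2}}{(\theta^{2}+2\alpha\theta+6)^{2}}\right),$$

$$E\left(\frac{\partial^{2}\log L}{\partial\alpha^{2}}\right) = n\left(\frac{4\theta^{2}}{(\theta^{2}+2\alpha\theta+6)^{2}}\right)-\sum_{i=1}^{n}E\left(\frac{x_i^{2}}{(1+\alpha x_i+x_i^{2})^{2}}\right),$$

$$E\left(\frac{\partial^{2}\log L}{\partial\theta\,\partial\alpha}\right) = -n\left(\frac{2(\theta^{2}+2\alpha\theta+6)-2\theta(2\theta+2\alpha)}{(\theta^{2}+2\alpha\theta+6)^{2}}\right).$$

Since $\beta$ is unknown, we estimate $I^{-1}(\beta)$ by $I^{-1}(\hat{\beta})$, which can be used to obtain asymptotic confidence intervals for $\theta$ and $\alpha$.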
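The second-derivative expressions can be verified numerically. The sketch below evaluates them on a sample, that is, in their observed form without the expectation, and compares each with a central finite difference of the score equations. The sample, the evaluation point, and the function names are hypothetical.

```python
def score(theta, alpha, xs):
    """Partial derivatives of log L with respect to theta and alpha."""
    n = len(xs)
    norm = theta ** 2 + 2 * alpha * theta + 6
    d_theta = 4 * n / theta - n * (2 * theta + 2 * alpha) / norm - sum(xs)
    d_alpha = -2 * n * theta / norm + sum(x / (1 + alpha * x + x * x) for x in xs)
    return d_theta, d_alpha

def second_derivatives(theta, alpha, xs):
    """Observed second derivatives of log L: (theta-theta, theta-alpha, alpha-alpha)."""
    n = len(xs)
    norm = theta ** 2 + 2 * alpha * theta + 6
    h_tt = -4 * n / theta ** 2 - n * (2 * norm - (2 * theta + 2 * alpha) ** 2) / norm ** 2
    h_ta = -n * (2 * norm - 2 * theta * (2 * theta + 2 * alpha)) / norm ** 2
    h_aa = (4 * n * theta ** 2 / norm ** 2
            - sum(x * x / (1 + alpha * x + x * x) ** 2 for x in xs))
    return h_tt, h_ta, h_aa

def central_diff(f, t, h=1e-5):
    """Central finite difference of a scalar function."""
    return (f(t + h) - f(t - h)) / (2 * h)

sample = [2.9, 3.6, 2.8, 2.2, 2.4, 1.9]  # hypothetical observations
theta, alpha = 1.2, 0.4                  # hypothetical evaluation point
h_tt, h_ta, h_aa = second_derivatives(theta, alpha, sample)
fd_tt = central_diff(lambda t: score(t, alpha, sample)[0], theta)
fd_ta = central_diff(lambda a: score(theta, a, sample)[0], alpha)
fd_aa = central_diff(lambda a: score(theta, a, sample)[1], alpha)
print((h_tt, fd_tt), (h_ta, fd_ta), (h_aa, fd_aa))
```

Only the $\alpha$-$\alpha$ entry depends on the data, so it is the only one where the expectation differs from the observed value. The matrix of negated second derivatives may fail to be positive definite away from the maximum or when $\hat{\alpha}$ lies near the boundary, in which case the resulting standard errors should be treated with caution.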
6. Applications of CBP Distribution in real life Data.
We analyzed two real biomedical data sets by fitting the CBP distribution, and compared the fit with that of some related distributions: the New Two-Parameter Sujatha, Two-Parameter Sujatha, Sujatha, Exponential, and Lindley distributions.
Data set I: The following set of data represents the birth weight (Kg) of randomly selected new born babies from a hospital at Thrissur, Kerala (Table 1).
Table 1: Data regarding the birth weight (Kg) of new born babies
2.910 3.640 2.770 2.190 2.420 1.910
3.1091 3.385 3.145 3.215 2.495 3.515
4.380 3.110 3.280 3.860 4.040 4.170
3.640 3.210 2.870 3.230 4.220 1.520
2.610 3.360 2.840 3.140 2.530 1.580
4.275 3.380 4.870 3.100 2.800 4.280
Data set II: The following data (Table 2) represent the weight (kg) of 100 randomly selected children aged between three months and four years, collected from a few babysitting centres and play schools across Thrissur.
Table 2: Weight (kg) of 100 children (aged between 3 months and 4 years) from babysitting records.
8.75 12.25 12.50 6.50 16.00 14.50 6.75 14.50 7.50 9.00
7.50 7.00 7.75 7.50 7.25 7.00 6.50 6.75 10.00 5.50
8.75 6.75 12.25 12.50 6.50 14.00 14.50 6.50 15.50 12.50
10.00 8.75 7.25 8.00 7.50 9.00 7.00 8.75 7.00 7.50
12.00 6.75 5.00 7.50 6.50 7.25 5.25 7.50 5.50 8.75
7.25 14.50 6.75 7.50 7.00 6.75 12.25 8.75 7.00 7.50
7.00 7.75 7.50 7.25 7.00 6.50 6.75 10.00 9.00 7.00
6.75 12.25 12.50 6.50 16.00 14.50 6.50 16.00 5.50 6.75
8.75 7.25 8.00 7.50 9.00 7.00 8.75 7.00 12.50 6.50
6.75 5.00 7.50 6.50 7.25 5.25 7.50 5.50 6.50 7.50
Table 3: Performance and comparison of fitted distributions for Data set I

| Distribution | MLE | S.E. | -2logL | AIC | BIC | AICC |
|---|---|---|---|---|---|---|
| CBP distribution | α = 0.001, θ = 1.1139 | α: 0.009, θ: 0.0495 | 118.94 | 123.07 | 126.29 | 123.49 |
| Two Parameter Sujatha | α = 0.001, θ = 0.819 | α: 0.011, θ: 0.0539 | 125.79 | 129.83 | 133.01 | 130.20 |
| New Two Parameter Sujatha | α = 0.001, θ = 0.698 | α: 0.020, θ: 0.2110 | 133.96 | 137.97 | 141.09 | 138.32 |
| Sujatha | θ = 0.691 | θ: 0.067 | 134.19 | 136.20 | 137.790 | 136.29 |
| Exponential | θ = 0.3047 | θ: 0.051 | 157.59 | 159.56 | 161.137 | 159.68 |
| Lindley | θ = 0.50694 | θ: 0.061509 | 144.19 | 146.19 | 147.775 | 146.31 |
Table 4: Performance and comparison of fitted distributions for Data set II

| Distribution | MLE | S.E. | -2logL | AIC | BIC | AICC |
|---|---|---|---|---|---|---|
| CBP distribution | α = 0.001, θ = 0.469 | α: 0.020, θ: 0.025 | 223.80 | 227.81 | 231.29 | 228.10 |
| Two Parameter Sujatha | α = 0.001, θ = 0.347 | α: 0.0001, θ: 0.019 | 233.79 | 237.79 | 241.36 | 238.09 |
| New Two Parameter Sujatha | α = 0.001, θ = 0.350 | α: 0.010, θ: 0.022 | 234.20 | 238.20 | 241.69 | 238.49 |
| Sujatha | θ = 0.336 | θ: 0.029 | 236.58 | 238.58 | 240.36 | 238.67 |
| Exponential | θ = 0.121 | θ: 0.0183 | 273.58 | 275.58 | 277.36 | 275.67 |
| Lindley | θ = 0.2208 | θ: 0.0237 | 251.81 | 253.81 | 255.59 | 253.90 |
R software is used to estimate the unknown parameters and to calculate the model comparison criteria (Tables 3 and 4). We use the standard criteria AIC (Akaike Information Criterion), AICC (Akaike Information Criterion Corrected), BIC (Bayesian Information Criterion), and -2logL to compare the performance of the CBP distribution with that of the new two-parameter Sujatha, two-parameter Sujatha, Sujatha, Lindley, and Exponential distributions. The distribution with the lowest values of AIC, AICC, BIC, and -2logL is taken as the best-fitting distribution. From Tables 3 and 4, the CBP distribution has lower AIC, AICC, BIC, and -2logL values than the New Two-Parameter Sujatha, Two-Parameter Sujatha, Sujatha, Lindley, and Exponential distributions. The CBP distribution therefore provides a better fit than these distributions for the two biomedical data sets, which establishes the usefulness of the new distribution.
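The comparison criteria follow directly from -2logL, the number of fitted parameters $k$, and the sample size $n$. The sketch below uses the standard textbook definitions, which are assumed here to be the ones the authors applied; the input values are placeholders. Values recomputed this way need not match Tables 3 and 4 to the last digit.

```python
import math

def information_criteria(neg2_loglik, k, n):
    """Return (AIC, BIC, AICC) from -2 log L, parameter count k, sample size n."""
    aic = neg2_loglik + 2 * k
    bic = neg2_loglik + k * math.log(n)
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)
    return aic, bic, aicc

# Placeholder inputs: a two-parameter model fitted to n = 36 observations.
aic, bic, aicc = information_criteria(neg2_loglik=100.0, k=2, n=36)
print(aic, bic, aicc)
```

BIC penalizes each extra parameter by $\log n$ rather than 2, so for $n > 7$ it favours the one-parameter models more strongly than AIC does.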
7. Conclusions
In this paper, a novel distribution called the ChandBhas-P distribution was introduced by applying the length-biased approach to a conventional distribution, the new two-parameter Sujatha distribution. Its statistical properties were derived and studied. The harmonic mean, moments, shapes of the pdf and cdf, mean, variance, and standard deviation were obtained, along with the survival function, hazard rate function, order statistics, reverse hazard rate function, and Bonferroni and Lorenz curves. The parameters of the distribution were estimated by maximum likelihood. Two biomedical data sets were used to assess the new distribution, and the findings show that the CBP distribution offers a better fit than the new two-parameter Sujatha, two-parameter Sujatha, Sujatha, Exponential, and Lindley distributions for such biomedical data.
No Conflict of interest.
We declare there is no conflict of interest.
No funding agencies.
There are no funding agencies for this research article.
References
[1] Alhyasat, K., Kamarulzaman, I., Al-Omari, A. I. and Abu Bakar, M. A. (2020). Power size biased two parameter Akash distribution. Statistics in Transition new series, Vol. 21, No. 3, 73-91.
[2] Al-Kadim, K. A. and Hussein, N. A. (2014). New proposed length-biased weighted exponential and Rayleigh distribution with application. Mathematical Theory and Modeling, 4(7), 137-152.
[3] Beghriche, A. and Zeghdoudi, H. (2019). A size biased gamma Lindley distribution. Thailand Statistician, 17(2), 179-189.
[4] Cox, D. R. (1969). Some sampling problems in technology, in New Development in Survey Sampling, Johnson, N. L. and Smith, H., Jr. (eds.), New York: Wiley-Interscience, 506-527.
[5] Cox, D. R. (1962). Renewal theory, Barnes and Noble, New York.
[6] Dar, S. A., Hassan, A. and Ahmad, P. B. (2022). Poisson size-biased Lindley distribution and its applications. International Journal of Modeling, Simulation and Scientific Computing, Vol. 13, No. 04, 2250031.
[7] Ekhosuehil, N., Kenneth, G. E. and Kevin, U. K. (2020). The Weibull length biased exponential distribution: Statistical properties and applications. Journal of Statistical and Econometric Methods, Vol. 9, No. 4, 15-30.
[8] Fisher, R. A. (1934). The effects of methods of ascertainment upon the estimation of frequencies. Annals of Eugenics, 6, 13-25.
[9] Ganaie, R. A. and Rajagopalan, V. (2020). Length biased weighted quasi gamma distribution with characterizations and applications. International Journal of Innovative Technology and Exploring Engineering (IJITEE), 9(5), 1110-1117.
[10] Mobarak, M. A., Nofal, Z. and Mahdy, M. (2017). On size-biased weighted transmuted Weibull distribution. International Journal of Advanced Research in Computer Science and Software Engineering, Vol. 7, Issue 3, 317-325.
[11] Mustafa, A. and Khan, M. I. (2022). The length-biased powered inverse Rayleigh distribution with applications. J. Appl. Math. & Informatics, 40(1-2), 1-13.
[12] Rasool, S. U. and Ahmad, S. P. (2022). Power length biased weighted Lomax distribution. RT &A, 17(4), 543-558.
[13] Rao, C. R. (1965). On discrete distributions arising out of method of ascertainment, in Classical and Contagious Discrete Distributions, G. P. Patil (ed.), Pergamon Press and Statistical Publishing Society, Calcutta, 320-332.
[14] Sanat, P. (2016). Beta-length biased Pareto distribution and its properties. Journal of Emerging Technologies and Innovative Research (JETIR), Vol. 3, Issue 6, 553-557.
[15] Sen, S., Chandra, N. and Maiti, S. S. (2017). The weighted Xgamma distribution: Properties and application. Journal of Reliability and Statistical Studies, 10(1), 43-58.
[16] Tesfay, M. and Shanker, R. (2018). A new two-parameter Sujatha distribution with properties and applications. Turkiye Klinikleri J Biostat, 10(2), 96-113.
[17] Tesfay, M. and Shanker, R. (2018). A two-parameter Sujatha distribution. Biometrics & Biostatistics International Journal, 7(3), 188-197.
[18] Zelen, M. (1974). Problems in cell kinetic and the early detection of disease, in Reliability and Biometry, F. Proschan and R. J. Serfling (eds.), SIAM, Philadelphia, 701-706.