M. U. Faruk, A. M. Isa, A. Kaigama RT&A, No 1 (77)
SINE WEIBULL DISTRIBUTION Volume 19, March 2024
SINE-WEIBULL DISTRIBUTION: MATHEMATICAL PROPERTIES AND APPLICATION TO REAL DATASETS
1Muhammad Umar Faruk, 2Alhaji Modu Isa, 3Aishatu Kaigama
3Department of Mathematics and Computer Science, Borno State University, Nigeria
alhajimoduisa@gmail.com
Abstract
Adding parameters is a common way to extend families of distributions, either to gain flexibility or to construct covariate models. In this study, a trigonometric-type distribution called the Sine-Weibull distribution was developed by adopting the Weibull distribution as the baseline distribution and the Sine-G family as the generator, which yields a flexible probability distribution without the need for extra parameters. Several mathematical properties of the distribution were derived, including its moments, moment generating function, entropy, and order statistics. The parameters of the new distribution were estimated by the method of maximum likelihood, and the applicability of the Sine-Weibull distribution was demonstrated using real data.
Keywords: Sine-G Family, Weibull Distribution, Probability Distribution, Maximum Likelihood Estimator
I. Introduction
Distribution functions, their properties and their interrelationships play a significant role in modeling naturally occurring phenomena. For this reason, a large number of distribution functions that are applicable to many real-life events have been proposed and defined in the literature. Various methods exist for defining statistical distributions, and many of them arose from the need to model naturally occurring events. For example, the Normal distribution addresses real-valued variables that tend to cluster around a single mean value, while the Poisson distribution models discrete rare events. A few other distributions are defined as functions of one or more existing distributions.
Statistical distributions are widely applied to explain real-world phenomena. Because of this utility, their theory is widely studied and new distributions continue to be developed. In probability theory and statistics, the search for more effective and flexible probability distributions remains active [1]. Numerous standard distributions have been used extensively over the past decades for modeling data in fields such as engineering, economics, finance, and the biological, environmental and medical sciences. Generalizing these standard distributions has produced several compound distributions that are more flexible than their baseline distributions. For this reason, several methods for generating new families of distributions have been studied.
The Weibull distribution is a continuous probability distribution. It is one of the distributions used to describe particle size, with major applications in survival analysis, weather forecasting and reliability engineering. It was named after the Swedish mathematician Waloddi Weibull, who described it in detail in 1951, although it was first identified by [2] and first applied by [3] to describe a particle size distribution. The Weibull distribution has a scale parameter and a shape parameter. It has become very popular for analyzing lifetime data and for many applications where a skewed distribution is required. Introducing one or more new shape parameters embeds a model in a larger family of distributions, can produce markedly skewed and heavy-tailed distributions, and gives the new distribution greater flexibility.
Even when there is uncertainty about the future in real life, decisions still need to be taken. Thus, uncertainty issues must be dealt with by decision-making processes. Probability is one of the frequently employed strategies for addressing uncertainty in planning and management. In order to create a family of hybrid distributions that are more effective than their parent distributions, many researchers have focused on the idea of combining two or more probability distributions. By adding one or more parameters, these distributions become more flexible and can track a variety of random phenomena that are difficult to model using their parent distributions. The laws of generality, which state that when a particular distribution has more than four parameters, it undermines the performance of the model, can sometimes be breached by such compounding or extended distributions.
Many researchers have proposed new trigonometric families of distributions in recent times. These include the exponentiated sine-generated family of distributions [4], the Sin-G class of distributions [5], the Sec-G class [7], the Sine Square distribution [8], the Sine Inverse Lomax generated family [9], the Sine Burr XII distribution [10], the Sine Kumaraswamy-G family of distributions [11], the Sine Topp-Leone family [12], the Sine-Exponential distribution [13] and the Sine Power Lomax distribution [14].
The quest for more efficient and flexible probability distributions remains strong in probability theory and statistics, yet no single probability distribution is suitable for all datasets. There is therefore a need for extended forms of existing distributions that offer more adaptable alternative models or a better representation of the data. This has motivated extensions of the classical Weibull distribution and leaves a gap for a distribution, the Sine-Weibull distribution, capable of handling datasets that are negatively or positively skewed. Hence, this research aims to develop a new probability distribution called the Sine-Weibull distribution.
II. Methods
2.1 The Weibull Distribution
A continuous random variable X is said to follow a Weibull distribution with shape parameter $k$ and scale parameter $\lambda$ if its cdf is expressed as
$$H(x; k, \lambda) = 1 - e^{-(x/\lambda)^{k}}, \quad x > 0 \tag{1}$$
and the pdf is expressed as
$$h(x; k, \lambda) = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^{k}}, \quad x > 0 \tag{2}$$
2.2 Sine-G Family of Probability Distributions
Let $H(x; \xi)$ be the cumulative distribution function (cdf) of a univariate continuous distribution and $h(x; \xi)$ be the corresponding probability density function (pdf). The Sine-G family of probability distributions according to Kumar et al. [5] is given by
$$F(x; \xi) = \int_{0}^{\frac{\pi}{2}H(x;\xi)} \cos t \, dt = \sin\left\{\frac{\pi}{2}H(x; \xi)\right\} \tag{3}$$
and its corresponding pdf is given by
$$f(x; \xi) = \frac{\pi}{2}\, h(x; \xi)\cos\left\{\frac{\pi}{2}H(x; \xi)\right\} \tag{4}$$
where $H(x; \xi)$ and $h(x; \xi)$ are the cdf and the pdf of any baseline distribution with parameter vector $\xi$.
2.3 The New Sine-Weibull Distribution
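The pdf and cdf of the new Sine-Weibull distribution are given in equations (5) and (6):
$$f(x; k, \lambda) = \frac{\pi}{2}\,\frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^{k}} \cos\left\{\frac{\pi}{2}\left[1 - e^{-(x/\lambda)^{k}}\right]\right\} \tag{5}$$
and
$$F(x; k, \lambda) = \sin\left\{\frac{\pi}{2}\left[1 - e^{-(x/\lambda)^{k}}\right]\right\} \tag{6}$$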
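As a numerical sanity check on the Sine-Weibull pdf and cdf, the following Python sketch evaluates both functions and confirms that the density integrates to approximately one. The function names (`sw_pdf`, `sw_cdf`, `midpoint_integral`) and the parameter values k = 1.5 and λ = 2 are illustrative assumptions and do not come from the paper.

```python
import math

def sw_cdf(x, k, lam):
    """Sine-Weibull cdf: sin((pi/2) * (1 - exp(-(x/lam)**k))) for x > 0."""
    if x <= 0.0:
        return 0.0
    return math.sin(0.5 * math.pi * (1.0 - math.exp(-((x / lam) ** k))))

def sw_pdf(x, k, lam):
    """Sine-Weibull pdf: (pi/2) * Weibull pdf * cos((pi/2) * Weibull cdf)."""
    if x <= 0.0:
        return 0.0
    z = (x / lam) ** k
    weibull_pdf = (k / lam) * (x / lam) ** (k - 1.0) * math.exp(-z)
    weibull_cdf = 1.0 - math.exp(-z)
    return 0.5 * math.pi * weibull_pdf * math.cos(0.5 * math.pi * weibull_cdf)

def midpoint_integral(f, a, b, panels=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / panels
    return h * sum(f(a + (i + 0.5) * h) for i in range(panels))

K, LAM = 1.5, 2.0  # illustrative shape and scale, not fitted values
total_mass = midpoint_integral(lambda x: sw_pdf(x, K, LAM), 0.0, 40.0)
print(f"integral of pdf over (0, 40): {total_mass:.6f}")
print(f"cdf at x = 2: {sw_cdf(2.0, K, LAM):.6f}")
```

The upper limit 40 is chosen so that the neglected tail mass is negligible for these parameter values; other parameter choices may need a different limit.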
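The survival function S(x), hazard function h(x), reverse hazard function r(x) and the quantile function Q(u) are given below:
$$S(x) = 1 - F(x) = 1 - \sin\left\{\frac{\pi}{2}\left[1 - e^{-(x/\lambda)^{k}}\right]\right\} \tag{7}$$
$$h(x) = \frac{f(x)}{1 - F(x)} = \frac{\frac{\pi}{2}\,\frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^{k}} \cos\left\{\frac{\pi}{2}\left[1 - e^{-(x/\lambda)^{k}}\right]\right\}}{1 - \sin\left\{\frac{\pi}{2}\left[1 - e^{-(x/\lambda)^{k}}\right]\right\}} \tag{8}$$
$$r(x) = \frac{f(x)}{F(x)} = \frac{\frac{\pi}{2}\,\frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^{k}} \cos\left\{\frac{\pi}{2}\left[1 - e^{-(x/\lambda)^{k}}\right]\right\}}{\sin\left\{\frac{\pi}{2}\left[1 - e^{-(x/\lambda)^{k}}\right]\right\}} \tag{9}$$
$$Q(u) = F^{-1}(u) = \lambda\left\{-\log\left(1 - \frac{2\sin^{-1}u}{\pi}\right)\right\}^{1/k}, \quad 0 < u < 1 \tag{10}$$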
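The survival, hazard and quantile functions of the Sine-Weibull distribution can be sketched in Python as below, together with inverse-transform sampling based on the quantile function. The function names, the seed and the parameter values are illustrative assumptions. The survival function is computed as 2 sin²((π/4) e^{-(x/λ)^k}), which is algebraically equal to 1 − F(x) but loses less precision in the right tail.

```python
import math
import random

def sw_cdf(x, k, lam):
    """Sine-Weibull cdf for x > 0 (zero otherwise)."""
    if x <= 0.0:
        return 0.0
    return math.sin(0.5 * math.pi * (1.0 - math.exp(-((x / lam) ** k))))

def sw_pdf(x, k, lam):
    """Sine-Weibull pdf for x > 0 (zero otherwise)."""
    if x <= 0.0:
        return 0.0
    z = (x / lam) ** k
    weibull_pdf = (k / lam) * (x / lam) ** (k - 1.0) * math.exp(-z)
    return 0.5 * math.pi * weibull_pdf * math.cos(0.5 * math.pi * (1.0 - math.exp(-z)))

def sw_survival(x, k, lam):
    """S(x) = 1 - F(x), written as 2*sin^2((pi/4)*exp(-z)) for tail accuracy."""
    if x <= 0.0:
        return 1.0
    return 2.0 * math.sin(0.25 * math.pi * math.exp(-((x / lam) ** k))) ** 2

def sw_hazard(x, k, lam):
    """h(x) = f(x) / S(x); usable while exp(-(x/lam)**k) has not underflowed."""
    return sw_pdf(x, k, lam) / sw_survival(x, k, lam)

def sw_quantile(u, k, lam):
    """Q(u) = lam * (-log(1 - (2/pi)*asin(u)))**(1/k) for 0 <= u < 1."""
    if not 0.0 <= u < 1.0:
        raise ValueError("u must lie in [0, 1)")
    return lam * (-math.log(1.0 - (2.0 / math.pi) * math.asin(u))) ** (1.0 / k)

def sw_sample(n, k, lam, seed=None):
    """Draw n values by applying the quantile function to uniform variates."""
    rng = random.Random(seed)
    return [sw_quantile(rng.random(), k, lam) for _ in range(n)]

K, LAM = 1.5, 2.0  # illustrative values only
draws = sw_sample(5000, K, LAM, seed=1)
sample_median = sorted(draws)[len(draws) // 2]
print(f"theoretical median: {sw_quantile(0.5, K, LAM):.4f}")
print(f"sample median:      {sample_median:.4f}")
print(f"hazard at x = 1:    {sw_hazard(1.0, K, LAM):.4f}")
```

Comparing the sample median with Q(0.5) is a quick check that the quantile function really inverts the cdf.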
2.4. Parameter Estimation
The parameters of the newly developed Sine-Weibull distribution will be estimated using the method of maximum likelihood (MLE). Moments and the moment generating function (mgf) will be used to determine the mean, variance, skewness and kurtosis, among other properties, of the proposed distribution.
2.4.1 Method of Maximum Likelihood
Let $Y_1, Y_2, \ldots, Y_n$ be an independent, identically distributed (iid) random sample of a random variable Y with pdf $f(y; \delta)$. The likelihood function $L(\delta; y)$ of $Y_1, Y_2, \ldots, Y_n$ is the joint density function regarded as a function of the parameter. That is,
$$L(\delta; y) = \prod_{i=1}^{n} f(y_i; \delta)$$
It is more convenient to use the log-likelihood,
$$\ell(\delta; y) = \ln L(\delta; y)$$
The estimate of the parameter can be obtained by taking the derivative of the log-likelihood function with respect to the parameter and equating it to zero, that is,
$$\frac{\partial}{\partial \delta} \ln L(\delta; y) = 0 \tag{11}$$
2.4.2 Maximum Likelihood of Sine-Weibull Distribution
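Let $x_1, x_2, \ldots, x_n$ be a random sample of size n from a Sine-Weibull distribution with pdf given by (5). The likelihood function $L(\lambda; x)$ of this sample is given as
$$L(\lambda; x) = \prod_{i=1}^{n} f(x_i; k, \lambda) = \prod_{i=1}^{n} \frac{\pi}{2}\,\frac{k}{\lambda}\left(\frac{x_i}{\lambda}\right)^{k-1} e^{-(x_i/\lambda)^{k}} \cos\left\{\frac{\pi}{2}\left[1 - e^{-(x_i/\lambda)^{k}}\right]\right\}$$
Taking the log of the likelihood function gives
$$\ell(\lambda; x) = n \ln\left(\frac{\pi k}{2\lambda}\right) + (k - 1)\sum_{i=1}^{n} \ln\left(\frac{x_i}{\lambda}\right) - \sum_{i=1}^{n}\left(\frac{x_i}{\lambda}\right)^{k} + \sum_{i=1}^{n} \ln\cos\left\{\frac{\pi}{2}\left[1 - e^{-(x_i/\lambda)^{k}}\right]\right\}$$
The argument of the cosine lies strictly between 0 and $\pi/2$ for every $x_i > 0$, so the cosine term is positive, depends on $\lambda$, and is retained in the differentiation. To maximize the log-likelihood, we take the derivative with respect to $\lambda$ and equate it to zero:
$$\frac{\partial \ell}{\partial \lambda} = -\frac{nk}{\lambda} + \frac{k}{\lambda}\sum_{i=1}^{n}\left(\frac{x_i}{\lambda}\right)^{k} + \frac{\pi k}{2\lambda}\sum_{i=1}^{n}\left(\frac{x_i}{\lambda}\right)^{k} e^{-(x_i/\lambda)^{k}} \tan\left\{\frac{\pi}{2}\left[1 - e^{-(x_i/\lambda)^{k}}\right]\right\} = 0 \tag{12}$$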
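The maximization over λ can also be carried out numerically. The Python sketch below, under stated assumptions, holds the shape k fixed, simulates a Sine-Weibull sample by inverse-transform sampling, and maximizes the log-likelihood in λ with a golden-section search. The search method, the bracket, the seed and all function names are choices made for this illustration and are not taken from the paper. The search also assumes the log-likelihood is unimodal in λ on the bracket.

```python
import math
import random

def sw_quantile(u, k, lam):
    """Sine-Weibull quantile function, used here only to simulate data."""
    return lam * (-math.log(1.0 - (2.0 / math.pi) * math.asin(u))) ** (1.0 / k)

def sw_loglik(lam, k, data):
    """Sine-Weibull log-likelihood as a function of lam, with k held fixed."""
    total = len(data) * math.log(0.5 * math.pi * k / lam)
    for x in data:
        z = (x / lam) ** k
        total += (k - 1.0) * math.log(x / lam) - z
        total += math.log(math.cos(0.5 * math.pi * (1.0 - math.exp(-z))))
    return total

def golden_section_max(f, a, b, tol=1e-8):
    """Maximize f on [a, b], assuming f has a single interior maximum."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while (b - a) > tol:
        if fc > fd:  # the maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:        # the maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

rng = random.Random(2024)
TRUE_K, TRUE_LAM = 1.5, 2.0  # hypothetical values used to simulate data
sample = [sw_quantile(rng.uniform(1e-12, 1.0 - 1e-12), TRUE_K, TRUE_LAM)
          for _ in range(2000)]

mean = sum(sample) / len(sample)
lam_hat = golden_section_max(lambda lam: sw_loglik(lam, TRUE_K, sample),
                             0.05 * mean, 20.0 * mean)
print(f"estimated lambda: {lam_hat:.4f} (simulated with lambda = {TRUE_LAM})")
```

In practice k is usually unknown as well, in which case both parameters would be estimated jointly with a two-dimensional optimizer.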
Solving equation (12) for $\lambda$ gives the maximum likelihood estimator of the parameter $\lambda$.
2.5. Some Mathematical Properties
2.5.1 Moment
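Moments play a vital role in statistical analysis, particularly in real applications. Suppose that X is a random variable and r is a non-negative integer. The rth moment of X is the quantity $E(X^r)$, provided the expectation exists, and is given by
$$E(X^r) = \int_{0}^{\infty} x^{r} f(x)\,dx \tag{13}$$
The rth moment of the proposed Sine-Weibull distribution is derived as follows:
$$E(X^r) = \int_{0}^{\infty} x^{r}\,\frac{\pi}{2}\,\frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^{k}} \cos\left\{\frac{\pi}{2}\left[1 - e^{-(x/\lambda)^{k}}\right]\right\} dx$$
Substituting $t = 1 - e^{-(x/\lambda)^{k}}$, so that $x = \lambda\left[-\ln(1 - t)\right]^{1/k}$ and $dt = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^{k}} dx$, gives
$$E(X^r) = \frac{\pi}{2}\,\lambda^{r}\, I_r(k), \qquad I_r(k) = \int_{0}^{1} \left[-\ln(1 - t)\right]^{r/k} \cos\left(\frac{\pi t}{2}\right) dt \tag{14}$$
The integral $I_r(k)$ equals $2/\pi$ when r = 0, which confirms that the pdf integrates to one, and it can be evaluated numerically for r > 0.
The first and second moments (when r = 1 and r = 2) are therefore given below:
$$E(X) = \frac{\pi}{2}\,\lambda\, I_1(k), \qquad E(X^2) = \frac{\pi}{2}\,\lambda^{2}\, I_2(k) \tag{15}$$
The variance is given below:
$$V(X) = E(X^2) - [E(X)]^2 = \frac{\pi}{2}\,\lambda^{2}\left[I_2(k) - \frac{\pi}{2}\, I_1(k)^{2}\right] \tag{16}$$
and the standard deviation is
$$S = \lambda\sqrt{\frac{\pi}{2}\left[I_2(k) - \frac{\pi}{2}\, I_1(k)^{2}\right]} \tag{17}$$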
2.5.2 Skewness and Kurtosis of the Sine-Weibull Distribution
The skewness and kurtosis of the Sine-Weibull distribution are obtained from the third and fourth moments, respectively, divided by the corresponding power of the standard deviation S of the distribution. The moment-based measures of skewness ($\alpha_3$) and kurtosis ($\alpha_4$) are
$$\alpha_3 = \frac{E(X^3)}{S^3} \tag{18}$$
$$\alpha_4 = \frac{E(X^4)}{S^4} \tag{19}$$
where $E(X^r)$ is the rth moment and S is the standard deviation from Section 2.5.1.
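The moment-based quantities of Sections 2.5.1 and 2.5.2 can be evaluated numerically. The Python sketch below uses the change of variable t = 1 − e^{-(x/λ)^k}, under which E(X^r) = λ^r ∫₀¹ [−ln(1 − t)]^{r/k} (π/2) cos(πt/2) dt, and approximates the integral with a midpoint rule. It then forms the variance and the ratios E(X³)/S³ and E(X⁴)/S⁴ as this section defines them. The function name, the panel count and the parameter values are illustrative assumptions.

```python
import math

def sw_raw_moment(r, k, lam, panels=20000):
    """Approximate E(X**r) for the Sine-Weibull distribution.

    Midpoint rule applied to
    lam**r * integral_0^1 (-ln(1-t))**(r/k) * (pi/2) * cos(pi*t/2) dt.
    """
    h = 1.0 / panels
    total = 0.0
    for i in range(panels):
        t = (i + 0.5) * h
        weight = 0.5 * math.pi * math.cos(0.5 * math.pi * t)
        total += (-math.log1p(-t)) ** (r / k) * weight
    return lam ** r * h * total

K, LAM = 1.5, 2.0  # illustrative values only
m1, m2, m3, m4 = (sw_raw_moment(r, K, LAM) for r in (1, 2, 3, 4))

variance = m2 - m1 ** 2
std_dev = math.sqrt(variance)
alpha3 = m3 / std_dev ** 3  # raw-moment ratio, as defined in (18)
alpha4 = m4 / std_dev ** 4  # raw-moment ratio, as defined in (19)

print(f"mean = {m1:.4f}, variance = {variance:.4f}")
print(f"alpha3 = {alpha3:.4f}, alpha4 = {alpha4:.4f}")
```

Because the Sine-Weibull cdf is never below the Weibull cdf at the same point, the computed mean should not exceed the Weibull mean λΓ(1 + 1/k), which gives a simple plausibility check.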
2.5.3 Entropy
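The entropy of a random variable X is a measure of the variation of uncertainty. Many entropy measures have been studied and discussed in the literature, but the Rényi entropy is perhaps one of the most popular. The Rényi entropy of X with the proposed density function is given by
$$I_R(p) = \frac{1}{1 - p}\log\left(\int_{0}^{\infty} f(x)^{p}\,dx\right) \tag{20}$$
where p > 0 and p ≠ 1. Inserting equation (5) into (20) gives
$$I_R(p) = \frac{1}{1 - p}\log\int_{0}^{\infty}\left[\frac{\pi}{2}\,\frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^{k}} \cos\left\{\frac{\pi}{2}\left[1 - e^{-(x/\lambda)^{k}}\right]\right\}\right]^{p} dx \tag{21}$$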
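The integral in the Rényi entropy has to be evaluated numerically in general. The Python sketch below does this with a midpoint rule on a truncated range. The truncation point, the panel count, the function names and the parameter values are assumptions made for illustration. The sketch assumes k ≥ 1 so that the integrand stays bounded near zero, and it assumes p is not so small that the truncated tail matters.

```python
import math

def sw_pdf(x, k, lam):
    """Sine-Weibull pdf for x > 0 (zero otherwise)."""
    if x <= 0.0:
        return 0.0
    z = (x / lam) ** k
    weibull_pdf = (k / lam) * (x / lam) ** (k - 1.0) * math.exp(-z)
    return 0.5 * math.pi * weibull_pdf * math.cos(0.5 * math.pi * (1.0 - math.exp(-z)))

def sw_renyi_entropy(p, k, lam, panels=20000, z_max=60.0):
    """Approximate (1/(1-p)) * log(integral of pdf**p) by a midpoint rule.

    The range is truncated at the point where (x/lam)**k equals z_max.
    """
    if p <= 0.0 or p == 1.0:
        raise ValueError("p must be positive and different from 1")
    upper = lam * z_max ** (1.0 / k)
    h = upper / panels
    integral = h * sum(sw_pdf((i + 0.5) * h, k, lam) ** p for i in range(panels))
    return math.log(integral) / (1.0 - p)

K = 1.5  # illustrative shape value
entropy_lam1 = sw_renyi_entropy(2.0, K, 1.0)
entropy_lam2 = sw_renyi_entropy(2.0, K, 2.0)
print(f"Renyi entropy (p = 2), lambda = 1: {entropy_lam1:.6f}")
print(f"Renyi entropy (p = 2), lambda = 2: {entropy_lam2:.6f}")
```

Since λ is a scale parameter, doubling λ should raise the entropy by log 2, which the two printed values can be used to confirm.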
2.5.4 Order Statistics
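Suppose that $x_1, x_2, \ldots, x_n$ is a random sample of size n from a probability distribution with pdf f(x) and cdf F(x) as defined in (5) and (6), respectively. The pdf of the pth order statistic can be expressed as
$$f_{p:n}(x) = \frac{n!}{(p - 1)!\,(n - p)!}\, f(x)\,[F(x)]^{p-1}\,[1 - F(x)]^{n-p} \tag{22}$$
The pdf of the pth order statistic of the proposed Sine-Weibull distribution is given by:
$$f_{p:n}(x) = \frac{n!}{(p - 1)!\,(n - p)!}\,\frac{\pi}{2}\,\frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^{k}} \cos\left\{\frac{\pi}{2}\left[1 - e^{-(x/\lambda)^{k}}\right]\right\} \left[\sin\left\{\frac{\pi}{2}\left[1 - e^{-(x/\lambda)^{k}}\right]\right\}\right]^{p-1} \left[1 - \sin\left\{\frac{\pi}{2}\left[1 - e^{-(x/\lambda)^{k}}\right]\right\}\right]^{n-p} \tag{23}$$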
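The density of the pth order statistic can be evaluated directly from the Sine-Weibull pdf and cdf. The Python sketch below does so and checks numerically that the resulting density integrates to approximately one. The function names, the choice p = 2 and n = 5, and the parameter values are illustrative assumptions.

```python
import math

def sw_cdf(x, k, lam):
    """Sine-Weibull cdf for x > 0 (zero otherwise)."""
    if x <= 0.0:
        return 0.0
    return math.sin(0.5 * math.pi * (1.0 - math.exp(-((x / lam) ** k))))

def sw_pdf(x, k, lam):
    """Sine-Weibull pdf for x > 0 (zero otherwise)."""
    if x <= 0.0:
        return 0.0
    z = (x / lam) ** k
    weibull_pdf = (k / lam) * (x / lam) ** (k - 1.0) * math.exp(-z)
    return 0.5 * math.pi * weibull_pdf * math.cos(0.5 * math.pi * (1.0 - math.exp(-z)))

def sw_order_stat_pdf(x, p, n, k, lam):
    """pdf of the p-th order statistic in a Sine-Weibull sample of size n."""
    if not 1 <= p <= n:
        raise ValueError("p must satisfy 1 <= p <= n")
    coeff = math.factorial(n) // (math.factorial(p - 1) * math.factorial(n - p))
    big_f = sw_cdf(x, k, lam)
    return coeff * sw_pdf(x, k, lam) * big_f ** (p - 1) * (1.0 - big_f) ** (n - p)

def midpoint_integral(f, a, b, panels=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / panels
    return h * sum(f(a + (i + 0.5) * h) for i in range(panels))

K, LAM = 1.5, 2.0  # illustrative values only
order_mass = midpoint_integral(lambda x: sw_order_stat_pdf(x, 2, 5, K, LAM), 0.0, 40.0)
print(f"integral of the 2nd order statistic pdf (n = 5): {order_mass:.6f}")
```

Setting p = 1 gives the density of the sample minimum and p = n the density of the sample maximum.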
III. Results
3.1 Application
Model comparison was based on information criteria. The AIC aims to identify the best approximating model for the unknown true data-generating process. The BIC differs from the AIC only in its penalty term, which depends on the sample size n; models that minimize the BIC are selected. From a Bayesian perspective, the BIC is designed to find the most probable model given the data.
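The criteria can be computed from the maximized log-likelihood as in the Python sketch below. The paper does not state the formulas it used, so the definitions here are assumptions: AIC = 2p − 2ℓ, BIC = p ln n − 2ℓ, CAIC = −2ℓ + 2pn/(n − p − 1) and HQIC = 2p ln(ln n) − 2ℓ, where ℓ is the maximized log-likelihood, p the number of parameters and n the sample size. The input values in the usage lines are hypothetical and are not results from the paper.

```python
import math

def information_criteria(neg_loglik, n_params, n_obs):
    """Return AIC, BIC, CAIC and HQIC from the negative maximized log-likelihood.

    The formulas are assumed conventions, not definitions taken from the paper.
    """
    two_nll = 2.0 * neg_loglik
    return {
        "AIC": two_nll + 2.0 * n_params,
        "BIC": two_nll + n_params * math.log(n_obs),
        "CAIC": two_nll + 2.0 * n_params * n_obs / (n_obs - n_params - 1.0),
        "HQIC": two_nll + 2.0 * n_params * math.log(math.log(n_obs)),
    }

# Hypothetical inputs chosen only to show the call.
crit = information_criteria(neg_loglik=50.0, n_params=2, n_obs=30)
for name, value in crit.items():
    print(f"{name}: {value:.4f}")
```

All four criteria share the −2ℓ term and differ only in how strongly they penalize the number of parameters, so the model with the smallest value is preferred under each.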
3.1.1 Dataset
One dataset was considered for illustrative purposes and comparison with the baseline distribution and other competitors. The comparison was done with Weibull distribution and Lomax distribution. We estimated the unknown parameters of the distribution by the maximum-likelihood method. We obtain the values of the Akaike information criterion (AIC), Bayesian information criterion (BIC) and consistent Akaike information criterion (CAIC) for the newly developed distribution as well as the competitors. The dataset consists of thirty successive values of March precipitation (in inches) in Minneapolis/St [16]. The data are as follows:
0.77 1.74 0.81 1.2 1.95 1.2 0.47 1.43 3.37 2.2 3.0 3.09
1.51 2.1 0.52 1.62 1.31 0.32 0.59 0.81 2.81 1.87 1.18 1.35
4.75 2.48 0.96 1.89 0.9 2.05
Table 1: Summary Statistics of the dataset
Data      Minimum   Q1      Median   Mean    Q3      Maximum
Dataset   0.92      1.302   1.544    1.658   1.814   5.306
Table 1 gives the summary statistics of the dataset: the minimum, the first quartile (Q1), the median, the mean, the third quartile (Q3) and the maximum.
Table 2: MLE, AIC, CAIC, BIC, and HQIC of the data set
Distribution   MLE        AIC        CAIC       BIC        HQIC
Sine-Weibull   55.61173   115.2235   115.3472   120.4388   117.3322
Weibull        150.5514   305.1029   305.2266   310.3132   307.2716
Lomax          150.5514   303.1029   303.1437   310.3132   304.1572
Table 2 presents the results of the analysis of the dataset. The fit of the Sine-Weibull distribution was compared with that of the Weibull and Lomax distributions to assess the performance of the model. The proposed Sine-Weibull distribution is the best-fitting of the three models because it has the smallest AIC, CAIC, BIC and HQIC.
IV. Discussion
There has been a growing interest among statisticians and applied researchers in developing flexible lifetime models for the betterment of modelling survival data. In this paper, we introduced a two-parameter Sine-Weibull distribution which is obtained by considering a Weibull distribution as the baseline. We study some of its statistical and mathematical properties. Maximum Likelihood Estimation was used in parameter estimation. The usefulness of the new distribution was illustrated via the analysis of real data sets. We hope that the proposed extended model will attract wider applications.
References
[1] Alzaatreh, A., Lee, C., & Famoye, F. (2013). A new method for generating families of continuous distributions. Metron, 71(1), 63-79.
[2] Frechet, M. (1927). Sur la loi de probabilité de l'écart maximum. Ann. de la Soc. polonaise de Math, 6, 93-116.
[3] Rosin, P. and Rammler, E. (1933) The Laws Governing the Fineness of powdered coal. Journal of the Institute of Fuel, 7, 29-36
[4] Muhammad, M., Alshanbari, H. M., Alanzi, A. R. A., Liu, L., Sami, W., Chesneau, C., & Jamal, F. (2021). A New Generator of Probability Models: The Exponentiated Sine-G Family for Lifetime Studies. Entropy, 23, 1394.
[5] Kumar, D., Singh, U., & Singh, S. K. (2015). A new distribution using sine function-its application to bladder cancer patients' data. Journal of Statistics Applications & Probability, 4(3), 417.
[6] Souza, L., Junior, W. R. O., de Brito, C. C. R., Chesneau, C., Ferreira, T. A. E., Soares, L. (2019). General properties for the Cos-G class of distributions with applications. Eurasian Bulletin of Mathematics, 2(2), 63-79.
[7] Souza, L., de Oliveira, W. R., de Brito, C. C. R., Chesneau, C., Fernandes, R., & Ferreira, T. A. (2022). Sec-G class of distributions: Properties and applications. Symmetry, 14(2), 299.
[8] Al-Faris, R. Q., & Khan, S. (2008). Sine square distribution: a new statistical model based on the sine function. Journal of Applied Probability and Statistics, 3(1), 163-173.
[9] Fayomi, A., Algarni, A., & Almarashi, A. M. (2021). Sine Inverse Lomax Generated Family of Distributions with Applications. Mathematical Problems in Engineering, 1-11.
[10] Isa, A. M., Ali, B. A., & Zannah, U. (2022). Sine Burr XII Distribution: Properties and Application to Real Data Sets. Arid Zone Journal of Basic and Applied Sciences, 1(3), 48-58.
[11] Chesneau, C., & Jamal, F. (2020). The sine Kumaraswamy-G family of distributions. Journal of Mathematical Extension, 15.
[12] Al-Babtain, A. A., Elbatal, I., Chesneau, C., & Elgarhy, M. (2020). Sine Topp-Leone-G family of distributions: Theory and applications. Open Physics, 18(1), 574-593.
[13] Isa, A. M., Bashiru, S. O., Ali, B. A., Adepoju, A. A., & Itopa, I. I. (2022). Sine-Exponential Distribution: Its Mathematical Properties and Application to Real Dataset. UMYU Scientifica, 1(1), 127-131.
[14] Nagarjuna, V. B. V., Vardhan, R.V. and Chesneau, C. (2021). On the Accuracy of the Sine Power Lomax Model for Data Fitting. Modelling, 2, 78-104.
[15] Hinkley, D. (1977). On quick choice of power transformation. Journal of the Royal Statistical Society: Series C (Applied Statistics), 26(1), 67-69.