CC BY

A New Class of Sin-G Family of Distributions with Applications to Medical Data

Laxmi Prasad Sapkota, Pankaj Kumar and Vijay Kumar

Department of Mathematics and Statistics, DDU Gorakhpur University, Gorakhpur, UP, India.

Abstract

This article is dedicated to the study of the new class of distributions and one of its particular members. Based on the ratio of CDF G(x) and 1 + G(x) of the baseline distribution, we have developed the new trigonometric family of distributions by transforming the sine function, and we named it the new class sin-G (NCS-G) family of distributions. The general properties of the suggested family of distributions are provided. Using the inverted Weibull distribution as a baseline distribution, we have introduced a member of the suggested family having a reverse-j or increasing, or inverted bathtub-shaped hazard function. Some statistical properties of this NCS-IW distribution are explored. The associated parameters of the new distribution are estimated through the MLE method. To assess the estimation procedure, we conducted a Monte Carlo simulation and found that even for small samples, biases and mean square errors decreased as the size of the sample increased. Two real medical data sets are considered for the application of the NCS-IW distribution. Using some criteria for model selection and goodness of fit test statistics, we empirically proved that the suggested model performs better than six other existing models (most of which have more parameters).

Keywords: Sine-G distribution, Inverse Weibull distribution, Maximum Likelihood Estimation, Entropy, Quantile Function

1. INTRODUCTION

Statistical distributions are frequently used to investigate real-world phenomena. The theory of statistical distributions is extensively studied, as are new developments in their application, and several families of distributions have been developed to describe various real-world phenomena. This development in distribution theory is ongoing. Many probability distributions proposed in the literature have a large number of parameters to make the model more versatile. However, as Marshall and Olkin [17] note, estimating these parameters numerically can be challenging. Hence, it is better to create models with fewer parameters and greater flexibility for modeling actual data. To achieve this objective, several researchers have sought new distributions based on trigonometric functions, which have attracted attention in recent years because of their flexibility and mathematical tractability. Among the various trigonometric G-family members, Kumar et al. [15] defined a new class of distributions using the sine function and introduced the sin-exponential model as its member. The cumulative distribution function (CDF) of this family is given by

$$F(x; \xi) = \sin\left\{\frac{\pi}{2} K(x; \xi)\right\}, \quad x \in \mathbb{R}, \tag{1}$$

where $K(x; \xi)$ is the CDF of any continuous baseline distribution. Similarly, Souza [24] introduced another trigonometric model based on the sine function, and Gomez-Deniz and Calderin-Ojeda [9] defined the arctan-G family of distributions using the arctangent function and used it to characterize Norwegian fire insurance data.

This family was introduced with an underlying Pareto distribution, giving a new model named the Pareto arctan distribution, which was found to offer an excellent fit compared with other well-known distributions. Similarly, the hyperbolic cosine-F family of distributions was defined using a hyperbolic trigonometric function by Kharazmi and Saadatinik [14], and the hyperbolic cosine Rayleigh distribution was defined by Sakthivel and Rajkumar [22]. Using a technique similar to that of sin-G, Souza et al. [25] introduced the Cos-G family of distributions together with the Cos-Weibull distribution as a member of the Cos-G class. Souza et al. [26] also introduced another sin-G class, as defined by Kumar et al. [15], with a bathtub-shaped, reverse-j, or increasing failure rate function, and studied the sine inverse Weibull distribution as a particular member. The CDF of the Sin-G class of distributions is

$$F(x; \xi) = \int_0^{\frac{\pi}{2} K(x; \xi)} \cos(t)\,dt = \sin\left\{\frac{\pi}{2} K(x; \xi)\right\}, \quad x \in \mathbb{R}, \tag{2}$$

where $K(x; \xi)$ is the CDF of any parent distribution and $\xi > 0$ is the vector of parameters of the parent distribution. Mahmood et al. [16] also developed a new sin-G family and analyzed the sin-inverse Weibull model in particular. Chesneau and Jamal [6] defined the sine Kumaraswamy-G family of distributions, which adds two extra parameters to this family. Muhammad et al. [19] defined the exponentiated sine-G family and analyzed the exponentiated sine-Weibull distribution as a particular member. Another trigonometric probability model, introduced by Chaudhary et al. [3], is the arctan generalized exponential distribution. Using the sine-G family of distributions, Isa et al. [12] developed a new two-parameter model called the sine Burr XII distribution. Thus, trigonometric distributions are built from simple functions and are mathematically tractable (see [15], [26]). Further, the sine transformation can remarkably enhance the flexibility of $G(x)$ without any additional parameters (Chesneau and Jamal [6]). These features motivate our interest in the sine transformation family. In this study, we develop a new family of trigonometric models using the sine function, which we call the "new class of sine-G family" (NCS-G) of distributions. The remainder of this study is organized as follows. Section 2 introduces the model development methodology as well as some key functions of the distribution family. Some general properties and parameter estimation of the NCS-G family are presented in Sections 3 and 4, respectively. In Section 5, a particular member of the NCS-G family is introduced, together with a detailed study and application of this model. Finally, we present the conclusion in Section 6.

2. The NCS-G Family of Distribution (NCS-G FD)

Using the T-X approach proposed by Alzaatreh et al. [1], this study proposes a new family of distributions known as the NCS-G family of distributions. Let $G(x; \xi)$ be the baseline CDF of a continuous random variable $X$, with baseline PDF $g(x; \xi)$, and let $\xi > 0$ be the vector of associated parameters. Then the CDF $F(x; \xi)$ of the NCS-G FD is defined as

$$F(x; \xi) = \int_0^{\pi \frac{G(x; \xi)}{1 + G(x; \xi)}} \cos(t)\,dt = \sin\left\{\pi \frac{G(x; \xi)}{1 + G(x; \xi)}\right\}, \quad x \in \mathbb{R}. \tag{3}$$

Differentiating the CDF defined in Equation (3), the PDF $f(x; \xi)$ of the family is expressed as

$$f(x; \xi) = \pi \cos\left\{\pi \frac{G(x; \xi)}{1 + G(x; \xi)}\right\} \frac{g(x; \xi)}{(1 + G(x; \xi))^2}, \quad x \in \mathbb{R}. \tag{4}$$
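As an illustration of Equations (3) and (4), the following minimal Python sketch evaluates the NCS-G CDF and PDF from the values of a baseline CDF and PDF. The unit exponential baseline and all function names are assumptions made only for this example; they are not part of the paper.

```python
import math

def ncs_g_cdf(G):
    """NCS-G CDF, Equation (3), given the baseline CDF value G in [0, 1]."""
    return math.sin(math.pi * G / (1.0 + G))

def ncs_g_pdf(G, g):
    """NCS-G PDF, Equation (4), given baseline CDF value G and baseline PDF value g."""
    return math.pi * math.cos(math.pi * G / (1.0 + G)) * g / (1.0 + G) ** 2

# Hypothetical baseline for illustration only: unit exponential distribution.
def base_cdf(x):
    return 1.0 - math.exp(-x)

def base_pdf(x):
    return math.exp(-x)

x = 0.7
F = ncs_g_cdf(base_cdf(x))               # CDF of the NCS-G member at x
f = ncs_g_pdf(base_cdf(x), base_pdf(x))  # PDF of the NCS-G member at x
print(F, f)
```

Because $G/(1+G)$ lies in $[0, 1/2]$, the argument of the sine stays in $[0, \pi/2]$, so the transformed function increases from 0 to 1 as required of a CDF.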

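2.1. Reliability Function

The reliability function of the NCS-G FD is given by

$$R(x; \xi) = 1 - \sin\left\{\pi \frac{G(x; \xi)}{1 + G(x; \xi)}\right\}, \quad x \in \mathbb{R}. \tag{5}$$

2.2. Hazard Function

The hazard function of the NCS-G FD is given as

$$H(x; \xi) = \pi \cos\left\{\pi \frac{G(x; \xi)}{1 + G(x; \xi)}\right\} \frac{g(x; \xi)}{(1 + G(x; \xi))^2} \left[1 - \sin\left\{\pi \frac{G(x; \xi)}{1 + G(x; \xi)}\right\}\right]^{-1}, \quad x \in \mathbb{R}. \tag{6}$$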

2.3. The Quantile Function (QF)

The $p$th quantile can be calculated by solving $Q(p) = F^{-1}(p)$. The QF of the NCS-G FD is given by

$$Q_X(p; \xi) = G^{-1}\left[\frac{\sin^{-1}(p)}{\pi - \sin^{-1}(p)}\right], \tag{7}$$

where $p$ has the $U(0,1)$ distribution.

2.4. Random Deviate Generation

Random deviates for the NCS-G FD can be generated from

$$x = G^{-1}\left[\frac{\sin^{-1}(u)}{\pi - \sin^{-1}(u)}\right], \tag{8}$$

where $u \sim U(0,1)$.
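The inversion described above can be sketched in a few lines of Python. This is only an illustrative sketch: the unit exponential baseline quantile and the function names are assumptions for the example, and any baseline with a computable inverse CDF could be substituted.

```python
import math
import random

def ncs_g_quantile(p, base_quantile):
    """Quantile of the NCS-G family: G^{-1}(asin(p) / (pi - asin(p)))."""
    a = math.asin(p)
    return base_quantile(a / (math.pi - a))

# Hypothetical baseline for illustration: unit exponential quantile function.
def exp_quantile(u):
    return -math.log1p(-u)

def sample(n, base_quantile, seed=0):
    """Draw n random deviates by feeding U(0,1) draws through the quantile function."""
    rng = random.Random(seed)
    return [ncs_g_quantile(rng.random(), base_quantile) for _ in range(n)]

draws = sample(5, exp_quantile, seed=42)
print(draws)
```

Since the argument passed to the baseline quantile is $\sin^{-1}(p)/(\pi - \sin^{-1}(p))$, which lies in $[0, 1)$ for $p \in [0, 1)$, the baseline inverse CDF is always evaluated inside its domain.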

2.5. Skewness and Kurtosis

Bowley's measure of skewness was defined by Kenney and Keeping [13] as

$$S_k(B) = \frac{Q(3/4; \xi) + Q(1/4; \xi) - 2Q(1/2; \xi)}{Q(3/4; \xi) - Q(1/4; \xi)}, \tag{9}$$

and the coefficient of kurtosis defined by Moors [18] is given by

$$K_u(M) = \frac{Q(0.875; \xi) - Q(0.625; \xi) + Q(0.375; \xi) - Q(0.125; \xi)}{Q(3/4; \xi) - Q(1/4; \xi)}. \tag{10}$$
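Both quantile-based measures need only a quantile function, so they are straightforward to compute. The sketch below is a hedged illustration: the function names are hypothetical, and the NCS-G quantile with a unit exponential baseline is an assumption used only to have something to evaluate.

```python
import math

def bowley_skewness(Q):
    """Bowley's skewness from a quantile function Q(p)."""
    return (Q(0.75) + Q(0.25) - 2.0 * Q(0.5)) / (Q(0.75) - Q(0.25))

def moors_kurtosis(Q):
    """Moors's kurtosis from a quantile function Q(p)."""
    return (Q(0.875) - Q(0.625) + Q(0.375) - Q(0.125)) / (Q(0.75) - Q(0.25))

# Hypothetical example: NCS-G quantile with a unit exponential baseline.
def ncs_exp_quantile(p):
    a = math.asin(p)
    return -math.log1p(-a / (math.pi - a))

print(bowley_skewness(ncs_exp_quantile), moors_kurtosis(ncs_exp_quantile))
```

Because these measures depend only on quantiles, they remain defined even when the moments of the distribution do not exist.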

3. General Properties of NCS-G FD

3.1. Linear form

Using the following series expansions, we can express the density function of the NCS-G FD in a linear form:

$$\cos x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \frac{x^8}{8!} - \cdots, \quad -\infty < x < \infty, \tag{11}$$

$$(1 + x)^c = \sum_{n=0}^{\infty} \binom{c}{n} x^n = 1 + \frac{c}{1!}x + \frac{c(c-1)}{2!}x^2 + \frac{c(c-1)(c-2)}{3!}x^3 + \cdots, \quad |x| < 1. \tag{12}$$

Applying Equation (11) to the cosine term in Equation (4), the PDF of the NCS-G FD is

$$f(x; \xi) = g(x; \xi) \sum_{i=0}^{\infty} \frac{\pi^{2i+1} (-1)^{i}}{(2i)!} (1 + G(x; \xi))^{-2(i+1)} (G(x; \xi))^{2i}. \tag{13}$$

Further expanding $(1 + G(x; \xi))^{-2(i+1)}$ in Equation (13) using the generalized binomial series (12), the expression for $f(x; \xi)$ becomes

$$f(x; \xi) = g(x; \xi) \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} T_{ij} \{G(x; \xi)\}^{2i+j}, \quad x \in \mathbb{R}, \tag{14}$$

where

$$T_{ij} = \frac{\pi^{2i+1} (-1)^{i}}{(2i)!} \binom{-2(i+1)}{j}. \tag{15}$$

3.2. Moments

The $r$th order non-central moment ($\mu'_r$) for the NCS-G FD is

$$\mu'_r = E(X^r) = \int_{-\infty}^{\infty} x^r f(x)\,dx = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} T_{ij} \int_{-\infty}^{\infty} x^r \{G(x; \xi)\}^{2i+j} g(x; \xi)\,dx. \tag{16}$$

Moments can also be calculated using the quantile function (for more detail, see Balakrishnan and Cohen [2]). Let $G(x; \xi) = p$, so that $g(x; \xi)\,dx = dp$ with $0 < p < 1$. Then the $r$th moment can be computed using

$$\mu'_r = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} T_{ij} \int_0^1 p^{2i+j} \{Q_G(p)\}^r\,dp, \quad 0 < p < 1, \tag{17}$$

where $Q_G(p)$ is the QF of the baseline distribution.

3.3. Moment Generating Function

The MGF ($M_X(t)$) for the NCS-G FD is

$$M_X(t) = \sum_{k=0}^{\infty} \frac{t^k}{k!} \mu'_k = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} \frac{t^k}{k!} T_{ij} \int_{-\infty}^{\infty} x^k g(x; \xi) \{G(x; \xi)\}^{2i+j}\,dx. \tag{18}$$

Using the quantile function, the MGF can be expressed as

$$M_X(t) = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} \frac{t^k}{k!} T_{ij} \int_0^1 p^{2i+j} \{Q_G(p)\}^k\,dp, \quad 0 < p < 1, \tag{19}$$

where $Q_G(p)$ is the QF of the baseline distribution.

3.4. Incomplete Moment

The incomplete moment can be defined as $M_r(y) = \int_{-\infty}^{y} x^r f(x)\,dx$. Therefore, the incomplete moment for the NCS-G FD is given by

$$M_r(y) = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} T_{ij} \int_{-\infty}^{y} x^r g(x; \xi) \{G(x; \xi)\}^{2i+j}\,dx. \tag{20}$$

Alternatively, $M_r(y)$ may be expressed in terms of the QF as

$$M_r(y) = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} T_{ij} \int_0^{G(y)} p^{2i+j} \{Q_G(p)\}^r\,dp, \quad 0 < p < 1. \tag{21}$$

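3.5. Mean Residual Life (MRL)

The MRL of the random variable $X$ is defined as

$$M(y) = \frac{1}{\bar{F}(y)} \left[\mu - \int_{-\infty}^{y} x f(x)\,dx\right] - y, \tag{22}$$

where $\mu = \mu'_1$ is the mean and $\bar{F}(y) = 1 - F(y)$ is the reliability function. Therefore, the MRL for the NCS-G FD is given by

$$M(y) = \frac{1}{\bar{F}(y)} \left[\mu - \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} T_{ij} \int_{-\infty}^{y} x\,g(x; \xi) \{G(x; \xi)\}^{2i+j}\,dx\right] - y. \tag{23}$$

Alternatively, $M(y)$ can be expressed in terms of the QF as

$$M(y) = \frac{1}{\bar{F}(y)} \left[\mu - \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} T_{ij} \int_0^{G(y)} p^{2i+j} Q_G(p)\,dp\right] - y. \tag{24}$$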

3.6. Inequality Measure

In several fields, including insurance, econometrics, and reliability, the Lorenz and Bonferroni curves are employed to measure inequality in quantities such as income and poverty.

i) Lorenz Curve

The Lorenz curve is written as $L_F(y) = \frac{1}{\mu} \int_{-\infty}^{y} x f(x)\,dx$; hence the Lorenz curve for the NCS-G FD is given by

$$L_F(y) = \frac{1}{\mu} \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} T_{ij} \int_{-\infty}^{y} x\,g(x; \xi) \{G(x; \xi)\}^{2i+j}\,dx. \tag{25}$$

Alternatively, it can be written in terms of the QF as

$$L_F(y) = \frac{1}{\mu} \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} T_{ij} \int_0^{G(y)} p^{2i+j} Q_G(p)\,dp. \tag{26}$$

ii) Bonferroni Curve

The Bonferroni curve can be calculated using $B_F(y) = L_F(y)/F(y)$. From Equation (25), the Bonferroni curve for the NCS-G FD is calculated as

$$B_F(y) = \frac{1}{\mu F(y)} \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} T_{ij} \int_{-\infty}^{y} x\,g(x; \xi) \{G(x; \xi)\}^{2i+j}\,dx. \tag{27}$$

3.7. Entropy

Entropy quantifies the uncertainty or variation of a random variable. Its application spans numerous disciplines, including econometrics, probability theory, engineering, insurance, and the life sciences. There are several types of entropy, some of which are as follows.

i) Renyi's Entropy

Renyi [20] introduced the entropy measure

$$R_\rho(X) = \frac{1}{1 - \rho} \log \int_{-\infty}^{\infty} \{f(x)\}^\rho\,dx, \tag{28}$$

where $\rho > 0$ and $\rho \neq 1$. For the NCS-G FD, $[f(x; \xi)]^\rho$ can be written in the form

$$[f(x; \xi)]^\rho = \pi^\rho (g(x; \xi))^\rho \left[\cos\left\{\pi \frac{G(x; \xi)}{1 + G(x; \xi)}\right\}\right]^\rho (1 + G(x; \xi))^{-2\rho}. \tag{29}$$

By considering the Taylor series of the function

$$[\cos(\pi s)]^\rho, \quad s = \frac{G(x; \xi)}{1 + G(x; \xi)}, \tag{30}$$

at the point $s = 1/4$, we can write

$$[\cos(\pi s)]^\rho = \sum_{k=0}^{\infty} \sum_{r=0}^{k} a_k \binom{k}{r} (-1)^{k-r} \left(\frac{1}{4}\right)^{k-r} s^r, \tag{31}$$

where $a_k = \frac{1}{k!} \left[\{\cos(\pi s)\}^\rho\right]^{(k)} \big|_{s = 1/4}$. Using this relation, Equation (29) becomes

$$[f(x; \xi)]^\rho = \pi^\rho (g(x; \xi))^\rho \sum_{k=0}^{\infty} \sum_{r=0}^{k} a_k \binom{k}{r} (-1)^{k-r} \left(\frac{1}{4}\right)^{k-r} (G(x; \xi))^r (1 + G(x; \xi))^{-(2\rho + r)}. \tag{32}$$

Further expanding Equation (32) using the generalized binomial series, the expression for $[f(x; \xi)]^\rho$ becomes

$$[f(x; \xi)]^\rho = \pi^\rho \sum_{k=0}^{\infty} \sum_{r=0}^{k} \sum_{m=0}^{\infty} (-1)^{m+k-r} a_k \binom{k}{r} \left(\frac{1}{4}\right)^{k-r} \binom{(2\rho + r) + m - 1}{m} (G(x; \xi))^{r+m} (g(x; \xi))^\rho. \tag{33}$$

Substituting $[f(x; \xi)]^\rho$ into Equation (28), Renyi's entropy for the NCS-G FD is given by

$$R_\rho(X) = \frac{1}{1 - \rho} \log\left[\sum_{k=0}^{\infty} \sum_{r=0}^{k} \sum_{m=0}^{\infty} Z_{krm} \int_{-\infty}^{\infty} (g(x; \xi))^\rho (G(x; \xi))^{r+m}\,dx\right], \tag{34}$$

where $Z_{krm} = (-1)^{m+k-r} \pi^\rho a_k \binom{k}{r} \left(\frac{1}{4}\right)^{k-r} \binom{(2\rho + r) + m - 1}{m}$.

ii) q-Entropy

The q-entropy is given by

$$H(\rho) = \frac{1}{1 - \rho} \log\left[1 - \int_{-\infty}^{\infty} \{f(x)\}^\rho\,dx\right], \tag{35}$$

where $\rho > 0$ and $\rho \neq 1$. Substituting $[f(x; \xi)]^\rho$ from Equation (33) into the expression for $H(\rho)$, the q-entropy for the NCS-G FD is given by

$$H(\rho) = \frac{1}{1 - \rho} \log\left[1 - \sum_{k=0}^{\infty} \sum_{r=0}^{k} \sum_{m=0}^{\infty} Z_{krm} \int_{-\infty}^{\infty} (g(x; \xi))^\rho (G(x; \xi))^{r+m}\,dx\right], \tag{36}$$

where $\rho > 0$ and $\rho \neq 1$.

iii) Shannon's Entropy

Shannon's entropy for a random variable $X$ with PDF $f(x)$ is the particular case of Renyi's entropy obtained as $\rho \to 1$. It is defined as $\eta_X = E(-\log f(X))$. For the NCS-G FD it is given by

$$\eta_X = E\left[-\log\left\{\sum_{i=0}^{\infty} \sum_{j=0}^{\infty} T_{ij}\,g(x; \xi) (G(x; \xi))^{2i+j}\right\}\right]. \tag{37}$$

4. Estimation Method NCS-G FD

4.1. Maximum Likelihood Estimation (MLE)

In this section, the parameters of the NCS-G FD are estimated using the MLE method. Let $x_1, \ldots, x_n$ be a random sample of size $n$ from the NCS-G FD with $(p \times 1)$ parameter vector $\xi = (\xi_1, \ldots, \xi_p)^T$. The log density and the total log-likelihood function, respectively, are given by

$$\ell(x; \xi) = \log \pi + \log\left[\cos\left\{\pi \frac{G(x; \xi)}{1 + G(x; \xi)}\right\}\right] - 2 \log(1 + G(x; \xi)) + \log g(x; \xi), \tag{38}$$

and

$$\ell(\underline{x}; \xi) = n \log \pi + \sum_{i=1}^{n} \log\left[\cos\left\{\pi \frac{G(x_i; \xi)}{1 + G(x_i; \xi)}\right\}\right] - 2 \sum_{i=1}^{n} \log(1 + G(x_i; \xi)) + \sum_{i=1}^{n} \log g(x_i; \xi). \tag{39}$$

Partially differentiating Equation (39) with respect to $\xi_k$ gives the components of the score vector:

$$\frac{\partial \ell}{\partial \xi_k} = -\pi \sum_{i=1}^{n} \tan\left\{\pi \frac{G(x_i; \xi)}{1 + G(x_i; \xi)}\right\} \frac{G'_k(x_i; \xi)}{(1 + G(x_i; \xi))^2} - 2 \sum_{i=1}^{n} \frac{G'_k(x_i; \xi)}{1 + G(x_i; \xi)} + \sum_{i=1}^{n} \frac{g'_k(x_i; \xi)}{g(x_i; \xi)}, \quad k = 1, \ldots, p,$$

where $g'_k(x_i; \xi) = \partial g(x_i; \xi)/\partial \xi_k$ and $G'_k(x_i; \xi) = \partial G(x_i; \xi)/\partial \xi_k$.
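The total log-likelihood of the family can be coded once and reused for any baseline. The following Python sketch is an illustration under stated assumptions: the exponential baseline with rate `lam` and the function names are hypothetical and serve only to make the example runnable; in practice the value would be passed to a numerical optimizer.

```python
import math

def ncs_g_loglik(xs, G, g):
    """Total NCS-G log-likelihood for data xs, baseline CDF G and baseline PDF g."""
    total = 0.0
    for x in xs:
        Gx = G(x)
        total += (math.log(math.pi)
                  + math.log(math.cos(math.pi * Gx / (1.0 + Gx)))
                  - 2.0 * math.log(1.0 + Gx)
                  + math.log(g(x)))
    return total

# Hypothetical baseline for illustration: exponential distribution with rate lam.
def make_exp_baseline(lam):
    return (lambda x: 1.0 - math.exp(-lam * x),
            lambda x: lam * math.exp(-lam * x))

G, g = make_exp_baseline(0.8)
print(ncs_g_loglik([0.5, 1.0, 2.0], G, g))
```

Keeping the baseline as a pair of callables means the same function serves every member of the family; only `G` and `g` change.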

4.2. Method of Least Square Estimation (LSE)

Swain et al. [27] introduced the ordinary and weighted least-squares estimators for distribution parameters. Let $x_{(1)}, \ldots, x_{(n)}$ be the order statistics of a random sample of size $n$ from $F(x; \xi)$. The LSE for the NCS-G FD can be obtained by minimizing

$$K(X; \xi) = \sum_{i=1}^{n} \left[F(x_{(i)}; \xi) - \frac{i}{n + 1}\right]^2 \tag{40}$$

with respect to $\xi$. For the NCS-G FD, this becomes

$$K(X; \xi) = \sum_{i=1}^{n} \left[\sin\left\{\pi \frac{G(x_{(i)}; \xi)}{1 + G(x_{(i)}; \xi)}\right\} - \frac{i}{n + 1}\right]^2. \tag{41}$$

Differentiating Equation (41) with respect to $\xi_k$, we get

$$\frac{\partial K}{\partial \xi_k} = 2\pi \sum_{i=1}^{n} \left[\sin\left\{\pi \frac{G(x_{(i)}; \xi)}{1 + G(x_{(i)}; \xi)}\right\} - \frac{i}{n + 1}\right] \cos\left\{\pi \frac{G(x_{(i)}; \xi)}{1 + G(x_{(i)}; \xi)}\right\} \frac{G'_k(x_{(i)}; \xi)}{(1 + G(x_{(i)}; \xi))^2}, \tag{42}$$

where $G'_k(x_{(i)}; \xi) = \partial G(x_{(i)}; \xi)/\partial \xi_k$. Solving $\partial K/\partial \xi_k = 0$ for $k = 1, \ldots, p$ gives the LSEs.
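The least-squares criterion compares the fitted CDF at each order statistic with the plotting position $i/(n+1)$. A minimal Python sketch of that objective follows; the exponential baseline, the sample values, and the function names are assumptions for illustration, and the minimization itself is left to whatever optimizer is at hand.

```python
import math

def ncs_g_cdf(x, base_cdf):
    """NCS-G CDF built from a baseline CDF."""
    G = base_cdf(x)
    return math.sin(math.pi * G / (1.0 + G))

def lse_objective(sample, base_cdf):
    """Sum of squared gaps between the fitted CDF and i/(n+1) at the order statistics."""
    xs = sorted(sample)
    n = len(xs)
    return sum((ncs_g_cdf(x, base_cdf) - i / (n + 1)) ** 2
               for i, x in enumerate(xs, start=1))

# Hypothetical baseline for illustration: exponential distribution with rate lam.
def make_exp_cdf(lam):
    return lambda x: 1.0 - math.exp(-lam * x)

data = [0.2, 0.5, 0.9, 1.4, 2.3]  # made-up values, not from the paper
print(lse_objective(data, make_exp_cdf(1.0)))
```

The objective is zero exactly when every order statistic sits at its plotting position under the candidate parameters, which is why smaller values indicate a closer fit.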

4.3. Cramer-von Mises Minimum Distance Estimator (CVME)

Cramer-von Mises estimators (CVMEs) are statistical estimators that minimize the distance between the estimated and the empirical CDF. These estimators are considered to have a lower bias compared to other minimum distance estimators. For the NCS-G FD, the CVMEs are obtained by minimizing

$$C(X; \xi) = \frac{1}{12n} + \sum_{i=1}^{n} \left[F(x_{(i)}; \xi) - \frac{2i - 1}{2n}\right]^2 \tag{43}$$

with respect to $\xi$. For the NCS-G FD, this becomes

$$C(X; \xi) = \frac{1}{12n} + \sum_{i=1}^{n} \left[\sin\left\{\pi \frac{G(x_{(i)}; \xi)}{1 + G(x_{(i)}; \xi)}\right\} - \frac{2i - 1}{2n}\right]^2. \tag{44}$$

Differentiating Equation (44) with respect to $\xi_k$, we get

$$\frac{\partial C}{\partial \xi_k} = 2\pi \sum_{i=1}^{n} \left[\sin\left\{\pi \frac{G(x_{(i)}; \xi)}{1 + G(x_{(i)}; \xi)}\right\} - \frac{2i - 1}{2n}\right] \cos\left\{\pi \frac{G(x_{(i)}; \xi)}{1 + G(x_{(i)}; \xi)}\right\} \frac{G'_k(x_{(i)}; \xi)}{(1 + G(x_{(i)}; \xi))^2}, \tag{45}$$

where $G'_k(x_{(i)}; \xi) = \partial G(x_{(i)}; \xi)/\partial \xi_k$. Solving $\partial C/\partial \xi_k = 0$ for $k = 1, \ldots, p$ gives the CVMEs.
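The Cramer-von Mises criterion differs from least squares only in its plotting positions, $(2i-1)/(2n)$, and the constant $1/(12n)$. The sketch below is an illustrative Python version of that objective; the exponential baseline, sample values, and function names are assumptions made for the example.

```python
import math

def ncs_g_cdf(x, base_cdf):
    """NCS-G CDF built from a baseline CDF."""
    G = base_cdf(x)
    return math.sin(math.pi * G / (1.0 + G))

def cvm_objective(sample, base_cdf):
    """Cramer-von Mises distance between the fitted and the empirical CDF."""
    xs = sorted(sample)
    n = len(xs)
    total = 1.0 / (12.0 * n)
    for i, x in enumerate(xs, start=1):
        total += (ncs_g_cdf(x, base_cdf) - (2 * i - 1) / (2.0 * n)) ** 2
    return total

# Hypothetical baseline for illustration: exponential distribution with rate lam.
def make_exp_cdf(lam):
    return lambda x: 1.0 - math.exp(-lam * x)

data = [0.2, 0.5, 0.9, 1.4, 2.3]  # made-up values, not from the paper
print(cvm_objective(data, make_exp_cdf(1.0)))
```

The constant $1/(12n)$ does not affect the minimizer, but keeping it makes the returned value equal to the usual Cramer-von Mises statistic.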

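5. Special Member of NCS-G FD

Several distributions can be generalized using the NCS-G FD. Here we take the inverse Weibull (IW) distribution as the parent distribution to introduce a special member.

5.1. A New Class Sin Inverse Weibull (NCS-IW) Distribution

The CDF and PDF of the IW distribution are respectively given by

$$G(x; \delta, \theta) = \exp(-\theta x^{-\delta}); \quad x > 0,\ \delta > 0,\ \theta > 0$$

and

$$g(x; \delta, \theta) = \delta \theta x^{-(\delta + 1)} \exp(-\theta x^{-\delta}).$$

The CDF and PDF of the NCS-IW distribution are given by

$$F(x; \theta, \delta) = \sin\left\{\pi \frac{\exp(-\theta x^{-\delta})}{1 + \exp(-\theta x^{-\delta})}\right\}; \quad x > 0, \tag{46}$$

$$f(x; \theta, \delta) = \pi \theta \delta x^{-(\delta + 1)} \cos\left\{\pi \frac{\exp(-\theta x^{-\delta})}{1 + \exp(-\theta x^{-\delta})}\right\} \frac{\exp(-\theta x^{-\delta})}{(1 + \exp(-\theta x^{-\delta}))^2}; \quad x > 0. \tag{47}$$

The reliability and hazard functions, respectively, are given by

$$R(x; \theta, \delta) = 1 - \sin\left\{\pi \frac{\exp(-\theta x^{-\delta})}{1 + \exp(-\theta x^{-\delta})}\right\}; \quad x > 0, \tag{48}$$

and

$$h(x; \theta, \delta) = \pi \theta \delta x^{-(\delta + 1)} \frac{\exp(-\theta x^{-\delta})}{(1 + \exp(-\theta x^{-\delta}))^2} \cos\left\{\pi \frac{\exp(-\theta x^{-\delta})}{1 + \exp(-\theta x^{-\delta})}\right\} \left[1 - \sin\left\{\pi \frac{\exp(-\theta x^{-\delta})}{1 + \exp(-\theta x^{-\delta})}\right\}\right]^{-1}; \quad x > 0. \tag{49}$$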
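The four functions of the NCS-IW distribution can be evaluated directly from the inverse Weibull CDF. The following Python sketch is a minimal illustration; the function names and the parameter values in the usage line are assumptions chosen for the example, not values from the paper.

```python
import math

def iw_cdf(x, delta, theta):
    """Inverse Weibull CDF: exp(-theta * x^(-delta)), x > 0."""
    return math.exp(-theta * x ** (-delta))

def ncs_iw_cdf(x, delta, theta):
    G = iw_cdf(x, delta, theta)
    return math.sin(math.pi * G / (1.0 + G))

def ncs_iw_pdf(x, delta, theta):
    G = iw_cdf(x, delta, theta)
    g = delta * theta * x ** (-(delta + 1.0)) * G  # inverse Weibull PDF
    return math.pi * math.cos(math.pi * G / (1.0 + G)) * g / (1.0 + G) ** 2

def ncs_iw_reliability(x, delta, theta):
    return 1.0 - ncs_iw_cdf(x, delta, theta)

def ncs_iw_hazard(x, delta, theta):
    return ncs_iw_pdf(x, delta, theta) / ncs_iw_reliability(x, delta, theta)

# Illustrative parameter values (assumed for the example).
print(ncs_iw_cdf(1.3, 2.0, 1.5), ncs_iw_pdf(1.3, 2.0, 1.5), ncs_iw_hazard(1.3, 2.0, 1.5))
```

Writing the hazard as the ratio of the PDF to the reliability function keeps the code in one-to-one correspondence with the definitions and avoids duplicating the long expression.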

The possible shapes of the PDF and HRF of the NCS-IW distribution are shown in Figure 1; the HRF can be reverse-j, inverted bathtub-shaped, or increasing. The quantile function and random deviate generation for the NCS-IW distribution, respectively, are given by

$$Q_X(p) = \left[-\frac{1}{\theta} \log\left(\frac{\sin^{-1} p}{\pi - \sin^{-1} p}\right)\right]^{-1/\delta}, \quad 0 < p < 1, \tag{50}$$

and

$$x = \left[-\frac{1}{\theta} \log\left(\frac{\sin^{-1} u}{\pi - \sin^{-1} u}\right)\right]^{-1/\delta}, \quad 0 < u < 1. \tag{51}$$
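Random deviates from the NCS-IW distribution can be produced by inverting the CDF, as the following Python sketch illustrates. It solves $F(x) = u$ for the inverse Weibull baseline $G(x) = \exp(-\theta x^{-\delta})$; the function names, seed, and parameter values are assumptions for the example.

```python
import math
import random

def ncs_iw_quantile(p, delta, theta):
    """Quantile of NCS-IW for 0 < p < 1, obtained by solving F(x) = p."""
    a = math.asin(p)
    ratio = a / (math.pi - a)  # baseline CDF value G at the quantile
    return (-math.log(ratio) / theta) ** (-1.0 / delta)

def ncs_iw_sample(n, delta, theta, seed=0):
    """Draw n deviates; uniform draws equal to 0 are skipped to keep log() finite."""
    rng = random.Random(seed)
    out = []
    while len(out) < n:
        u = rng.random()
        if u > 0.0:
            out.append(ncs_iw_quantile(u, delta, theta))
    return out

# Illustrative parameter values (assumed for the example).
print(ncs_iw_sample(5, delta=0.5, theta=0.75, seed=123))
```

A quick way to check such a sampler is to push a quantile back through the CDF and confirm that the original probability is recovered.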

5.2. Linear Expansion

Using Equation (14), Equation (47) can be expressed in linear form as

$$f(x; \delta, \theta) = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} B_{ij}\,x^{-(\delta + 1)} \exp\left\{-(2i + j + 1)\theta x^{-\delta}\right\}, \tag{52}$$

where $B_{ij} = \delta \theta\,T_{ij}$ and $T_{ij}$ is defined in Equation (15).

Figure 1: Shapes of PDF and HRF of NCS-IW distribution. Panel legends: $(\delta, \theta)$ = (0.15, 0.50), (2.75, 0.82), (5.00, 1.25), (2.20, 1.75), (0.75, 1.50) in one panel and (0.15, 1.50), (0.75, 2.50), (1.50, 2.25), (2.50, 3.75), (3.50, 4.50) in the other.

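5.3. Moments

Using the PDF defined in Equation (52), the $r$th order non-central moment ($\mu'_r$) for the NCS-IW distribution can be presented as

$$\mu'_r = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} B^*_{ij} \frac{\Gamma\left(\frac{\delta - r}{\delta}\right)}{\left[\theta\{(2i + j) + 1\}\right]^{\frac{\delta - r}{\delta}}}; \quad \delta > r, \tag{53}$$

where $B^*_{ij} = B_{ij}/\delta$ and $\Gamma(\cdot)$ is the gamma function.

5.4. Moment Generating Function (MGF)

The MGF ($M_X(t)$) for the NCS-IW distribution is

$$M_X(t) = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} \frac{t^k B^*_{ij}}{k!} \frac{\Gamma\left(\frac{\delta - k}{\delta}\right)}{\left[\theta\{(2i + j) + 1\}\right]^{\frac{\delta - k}{\delta}}}; \quad \delta > k. \tag{54}$$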

5.5. Incomplete moment

The incomplete moment for the NCS-IW distribution is presented as

$$M_r(y) = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} B_{ij} \int_0^{y} x^{r - (\delta + 1)} \exp\left\{-(2i + j + 1)\theta x^{-\delta}\right\} dx = \frac{1}{\delta} \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} B_{ij} \frac{\Gamma\left(\frac{\delta - r}{\delta}, (2i + j + 1)\theta y^{-\delta}\right)}{\{(2i + j + 1)\theta\}^{\frac{\delta - r}{\delta}}}, \tag{55}$$

where $\Gamma(a, z) = \int_z^{\infty} t^{a-1} e^{-t}\,dt$ is the upper incomplete gamma function.

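5.6. Mean Residual Life

The MRL for the NCS-IW distribution is given by

$$M(y) = \frac{1}{\bar{F}(y)} \left[\mu - \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} B_{ij} \int_0^{y} x^{-\delta} \exp\left\{-(2i + j + 1)\theta x^{-\delta}\right\} dx\right] - y = \frac{1}{\bar{F}(y)} \left[\mu - \frac{1}{\delta} \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} B_{ij} \frac{\Gamma\left(\frac{\delta - 1}{\delta}, (2i + j + 1)\theta y^{-\delta}\right)}{\{(2i + j + 1)\theta\}^{\frac{\delta - 1}{\delta}}}\right] - y, \tag{56}$$

where $\bar{F}(y) = 1 - F(y)$ and $\Gamma(a, z) = \int_z^{\infty} t^{a-1} e^{-t}\,dt$ is the upper incomplete gamma function. Using Equations (9) and (10) for the NCS-IW distribution, we have plotted the skewness and kurtosis in Figure 2 for different values of the parameters $\delta$ and $\theta$.

Figure 2: Skewness and Kurtosis plots of NCS-IW distribution.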

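5.7. Entropy

i) Renyi's Entropy

Renyi's entropy for the NCS-IW distribution is given by

$$R_\rho(X) = \frac{1}{1 - \rho} \log\left[\sum_{k=0}^{\infty} \sum_{r=0}^{k} \sum_{m=0}^{\infty} Z_{krm} (\delta\theta)^\rho \int_0^{\infty} x^{-\rho(\delta + 1)} \exp\left(-(r + m + \rho)\theta x^{-\delta}\right) dx\right]$$

$$= \frac{1}{1 - \rho} \log\left[\sum_{k=0}^{\infty} \sum_{r=0}^{k} \sum_{m=0}^{\infty} Z_{krm} \frac{(\delta\theta)^\rho}{\delta} \frac{\Gamma\left(\frac{(\rho - 1)(\delta + 1)}{\delta} + 1\right)}{\{(r + m + \rho)\theta\}^{\frac{(\rho - 1)(\delta + 1)}{\delta} + 1}}\right], \tag{57}$$

where $Z_{krm} = (-1)^{m+k-r} \pi^\rho a_k \binom{k}{r} \left(\frac{1}{4}\right)^{k-r} \binom{(2\rho + r) + m - 1}{m}$.

ii) q-Entropy

The q-entropy for the NCS-IW distribution is given by

$$H(\rho) = \frac{1}{1 - \rho} \log\left[1 - \sum_{k=0}^{\infty} \sum_{r=0}^{k} \sum_{m=0}^{\infty} Z_{krm} (\delta\theta)^\rho \int_0^{\infty} x^{-\rho(\delta + 1)} \exp\left(-(r + m + \rho)\theta x^{-\delta}\right) dx\right]$$

$$= \frac{1}{1 - \rho} \log\left[1 - \sum_{k=0}^{\infty} \sum_{r=0}^{k} \sum_{m=0}^{\infty} Z_{krm} \frac{(\delta\theta)^\rho}{\delta} \frac{\Gamma\left(\frac{(\rho - 1)(\delta + 1)}{\delta} + 1\right)}{\{(r + m + \rho)\theta\}^{\frac{(\rho - 1)(\delta + 1)}{\delta} + 1}}\right], \tag{58}$$

where $\rho > 0$, $\rho \neq 1$, and $Z_{krm}$ is as defined above.

iii) Shannon's Entropy

The Shannon entropy for the NCS-IW distribution is given by

$$\eta_X = E\left[-\log\left\{\sum_{i=0}^{\infty} \sum_{j=0}^{\infty} B_{ij}\,x^{-(\delta + 1)} \exp\left(-\theta(2i + j + 1)x^{-\delta}\right)\right\}\right], \tag{59}$$

where $B_{ij}$ is as defined in Equation (52).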

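5.8. Inequality Measure

i) Lorenz Curve

The Lorenz curve for the NCS-IW distribution is given by

$$L_F(y) = \frac{1}{\mu} \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} B_{ij} \int_0^{y} x^{-\delta} \exp\left(-\theta(2i + j + 1)x^{-\delta}\right) dx = \frac{1}{\delta\mu} \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} B_{ij} \frac{\Gamma\left(\frac{\delta - 1}{\delta}, (2i + j + 1)\theta y^{-\delta}\right)}{\{(2i + j + 1)\theta\}^{\frac{\delta - 1}{\delta}}}, \tag{60}$$

where $\Gamma(a, z) = \int_z^{\infty} t^{a-1} e^{-t}\,dt$ is the upper incomplete gamma function.

ii) Bonferroni Curve

The Bonferroni curve for the NCS-IW distribution is given by

$$B_F(y) = \frac{1}{\mu F(y)} \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} B_{ij} \int_0^{y} x^{-\delta} \exp\left(-\theta(2i + j + 1)x^{-\delta}\right) dx = \frac{1}{\delta\mu F(y)} \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} B_{ij} \frac{\Gamma\left(\frac{\delta - 1}{\delta}, (2i + j + 1)\theta y^{-\delta}\right)}{\{(2i + j + 1)\theta\}^{\frac{\delta - 1}{\delta}}}, \tag{61}$$

where $\Gamma(a, z)$ is the upper incomplete gamma function defined above.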

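5.9. Estimation MLE for NCS-IW distribution

We now investigate the MLE for estimating the parameters $\delta$ and $\theta$ of the NCS-IW model. Let $X = (x_1, \ldots, x_n)^T$ be a vector of $n$ independent observations from the NCS-IW distribution. Then, the log-likelihood is given by

$$\ell(x; \delta, \theta) = n \log(\pi\theta\delta) - (\delta + 1) \sum_{i=1}^{n} \log x_i + \sum_{i=1}^{n} \log\left[\cos\left\{\pi \frac{\exp(-\theta x_i^{-\delta})}{1 + \exp(-\theta x_i^{-\delta})}\right\}\right] - 2 \sum_{i=1}^{n} \log\left(1 + \exp(-\theta x_i^{-\delta})\right) - \theta \sum_{i=1}^{n} x_i^{-\delta}. \tag{62}$$

Partially differentiating Equation (62) with respect to $\delta$ and $\theta$ gives the components of the score vector $U(\delta, \theta) = (\partial\ell/\partial\delta, \partial\ell/\partial\theta)^T$. Since $\partial \exp(-\theta x^{-\delta})/\partial\delta = \theta x^{-\delta} \log(x) \exp(-\theta x^{-\delta})$ and $\partial \exp(-\theta x^{-\delta})/\partial\theta = -x^{-\delta} \exp(-\theta x^{-\delta})$, we obtain

$$\frac{\partial \ell}{\partial \delta} = \frac{n}{\delta} - \sum_{i=1}^{n} \log x_i - \pi\theta \sum_{i=1}^{n} \frac{x_i^{-\delta} \log(x_i) \exp(-\theta x_i^{-\delta})}{\left(1 + \exp(-\theta x_i^{-\delta})\right)^2} \tan\left\{\pi \frac{\exp(-\theta x_i^{-\delta})}{1 + \exp(-\theta x_i^{-\delta})}\right\} - 2\theta \sum_{i=1}^{n} \frac{x_i^{-\delta} \log(x_i) \exp(-\theta x_i^{-\delta})}{1 + \exp(-\theta x_i^{-\delta})} + \theta \sum_{i=1}^{n} x_i^{-\delta} \log(x_i) \tag{63}$$

and

$$\frac{\partial \ell}{\partial \theta} = \frac{n}{\theta} + \pi \sum_{i=1}^{n} \frac{x_i^{-\delta} \exp(-\theta x_i^{-\delta})}{\left(1 + \exp(-\theta x_i^{-\delta})\right)^2} \tan\left\{\pi \frac{\exp(-\theta x_i^{-\delta})}{1 + \exp(-\theta x_i^{-\delta})}\right\} + 2 \sum_{i=1}^{n} \frac{x_i^{-\delta} \exp(-\theta x_i^{-\delta})}{1 + \exp(-\theta x_i^{-\delta})} - \sum_{i=1}^{n} x_i^{-\delta}. \tag{64}$$

The MLEs of $\delta$ and $\theta$ are obtained by maximizing $\ell(x; \delta, \theta)$ in $\delta$ and $\theta$, which can be done by solving simultaneously the equations $\partial\ell/\partial\delta = 0$ and $\partial\ell/\partial\theta = 0$.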
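Because the score equations have no closed-form solution, it is worth checking a coded gradient numerically before handing it to an optimizer. The Python sketch below builds the NCS-IW log-likelihood and its score by the chain rule and compares the score with central finite differences. The function names, the sample, and the trial parameter values are assumptions for illustration; the paper's own computations used R, and this is not that code.

```python
import math

def loglik(xs, delta, theta):
    """NCS-IW log-likelihood for observations xs (all x > 0)."""
    total = len(xs) * math.log(math.pi * delta * theta)
    for x in xs:
        w = x ** (-delta)
        E = math.exp(-theta * w)          # inverse Weibull CDF value
        u = math.pi * E / (1.0 + E)
        total += (-(delta + 1.0) * math.log(x) + math.log(math.cos(u))
                  - 2.0 * math.log(1.0 + E) - theta * w)
    return total

def score(xs, delta, theta):
    """Partial derivatives of loglik with respect to delta and theta (chain rule)."""
    d_delta = len(xs) / delta
    d_theta = len(xs) / theta
    for x in xs:
        w = x ** (-delta)
        E = math.exp(-theta * w)
        u = math.pi * E / (1.0 + E)
        dE_ddelta = theta * w * math.log(x) * E
        dE_dtheta = -w * E
        # derivative of log cos(u) - 2 log(1 + E) with respect to E
        common = -math.pi * math.tan(u) / (1.0 + E) ** 2 - 2.0 / (1.0 + E)
        d_delta += -math.log(x) + common * dE_ddelta + theta * w * math.log(x)
        d_theta += common * dE_dtheta - w
    return d_delta, d_theta

sample = [1.1, 1.4, 1.3, 1.7, 1.9, 1.8, 1.6, 2.2]  # illustrative values
h = 1e-6
num_delta = (loglik(sample, 2.0 + h, 3.0) - loglik(sample, 2.0 - h, 3.0)) / (2 * h)
num_theta = (loglik(sample, 2.0, 3.0 + h) - loglik(sample, 2.0, 3.0 - h)) / (2 * h)
print(score(sample, 2.0, 3.0), (num_delta, num_theta))
```

Agreement between the two printed pairs gives some assurance that the analytic derivatives carry the right signs, which is easy to get wrong because the inverse Weibull CDF decreases in $\theta$ but increases in $\delta$ only when $x > 1$.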

5.10. Simulation

Using the maxLik R package introduced by Henningsen and Toomet [10], we generated samples from the quantile function defined in Equation (50) for various parameter combinations of the NCS-IW distribution and calculated the MLEs for each sample using the maxLik() function with the BFGS algorithm. This allows us to test parameter estimation problems such as the sharpness or flatness of the likelihood function, as well as estimate the size and direction (underestimate or overestimate) of the MLEs bias. Sample sizes of 20, 30, 40, 50, and 75 are used in the simulation. The procedure is repeated 10,000 times, and the average estimate value, bias, and mean square error (MSE) are calculated. The experiment is summarized in Table 1, which shows the average estimate, bias, and MSEs for each parameter. As can be seen, the MLE method consistently overestimates the parameter S and underestimates the parameter 0, but as sample size increases, MLEs gradually approach the actual values of S and 0 .

Table 1: The estimated values, biases, and MSEs based on 10,000 simulations of the NCS-IW distribution.

      Actual values    MLEs               Bias                MSEs
n     δ      θ         δ        θ         δ        θ          δ        θ
20    0.25   0.50      0.268    0.4796    0.018    -0.0204    0.0029   0.0219
20    0.50   0.75      0.5372   0.7263    0.0372   -0.0237    0.0115   0.0318
20    0.75   1.00      0.805    0.9816    0.055    -0.0184    0.026    0.0415
30    0.25   0.50      0.2621   0.4869    0.0121   -0.0131    0.0016   0.0142
30    0.50   0.75      0.5241   0.7343    0.0241   -0.0157    0.0066   0.0215
30    0.75   1.00      0.7874   0.9848    0.0374   -0.0152    0.0154   0.0277
40    0.25   0.50      0.2593   0.4889    0.0093   -0.0111    0.0012   0.0109
40    0.50   0.75      0.5175   0.7377    0.0175   -0.0123    0.0046   0.0157
40    0.75   1.00      0.7768   0.9911    0.0268   -0.0089    0.0103   0.0201
50    0.25   0.50      0.257    0.4919    0.007    -0.0081    0.0009   0.0087
50    0.50   0.75      0.5146   0.7398    0.0146   -0.0102    0.0037   0.0129
50    0.75   1.00      0.7696   0.992     0.0196   -0.008     0.0078   0.0159
75    0.25   0.50      0.2546   0.4943    0.0046   -0.0057    0.0006   0.0059
75    0.50   0.75      0.5089   0.7444    0.0089   -0.0056    0.0022   0.0084
75    0.75   1.00      0.7646   0.994     0.0146   -0.006     0.0052   0.0105
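The three summary columns of such a simulation study follow from the replicated estimates by simple averaging. The Python sketch below shows one way to compute them, assuming the estimates from each replication have already been collected in a list; the function name and the numbers in the usage line are hypothetical and are not taken from the study.

```python
def summarize(estimates, true_value):
    """Return (average estimate, bias, MSE) over Monte Carlo replications."""
    m = len(estimates)
    average = sum(estimates) / m
    bias = average - true_value
    mse = sum((e - true_value) ** 2 for e in estimates) / m
    return average, bias, mse

# Made-up replicated estimates of a parameter whose true value is 0.25.
avg, bias, mse = summarize([0.27, 0.24, 0.29, 0.26], 0.25)
print(avg, bias, mse)
```

Under this definition a positive bias means the estimator overestimates the parameter on average, which is how the signs in the bias columns should be read.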

5.11. Application

Using two real data sets, we illustrate the application of the NCS-IW distribution in this section. The data sets are as follows.

i) Data set

Data set 1 (cancer data):

The data set contains the survival times of 44 patients with head and neck cancer who received radiotherapy; it was reported by Efron [8]. The data are:

"12.20, 23.56, 23.74, 25.87, 31.98, 37, 41.35, 47.38, 55.46, 58.36, 63.47, 68.46, 78.26, 74.47, 81.43, 84, 92, 94, 110, 112, 119, 127, 130, 133, 140, 146, 155, 159, 173, 179, 194, 195, 209, 249, 281, 319, 339, 432, 469, 519, 633, 725, 817, 1776".

Data set 2 (relief time data):

This data set, taken from Clark and Gross [7], gives the relief times of 20 patients receiving an analgesic. The data are:

"1.1, 1.4, 1.3, 1.7, 1.9, 1.8, 1.6, 2.2, 1.7, 2.7, 4.1, 1.8, 1.5, 1.2, 1.4, 3, 1.7, 2.3, 1.6, 2.0".

ii) Model Analysis

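We calculate some well-known goodness-of-fit statistics to analyze data sets 1 and 2. The fitted models are evaluated using the log-likelihood value (-2logL), the Akaike information criterion (AIC), the Hannan-Quinn information criterion (HQIC), and the Anderson-Darling (AD), Kolmogorov-Smirnov (KS), and Cramér-von Mises (CVM) statistics with their p-values (for more detail, see Chen and Balakrishnan [5]). All computations are carried out in R. To compare fitting capability, we selected the following models: inverse Weibull (IW), transformed sine Weibull (TSW) (Sakthivel and Rajkumar [23]), arctan generalized exponential (AGE) (Chaudhary et al. [3]), arctan Lomax (ALomx) (Chaudhary and Kumar [4]), arcsine exponential (ASE) (Rahman [21]), and arcsine exponentiated Weibull (ASEW) (He et al. [11]). The PDFs of the candidate models are as follows:

$$f_{IW}(x; \delta, \theta) = \delta\theta x^{-\delta - 1} e^{-\theta x^{-\delta}}; \quad x, \delta, \theta > 0.$$

The PDF of the TSW model, $f_{TSW}(x; \alpha, \theta, \lambda)$ with $x, \alpha, \theta, \lambda > 0$, is given in Sakthivel and Rajkumar [23].

$$f_{AGE}(x; \alpha, \theta, \lambda) = \frac{\alpha\theta\lambda}{\arctan(\alpha)} \frac{e^{-\lambda x} (1 - e^{-\lambda x})^{\theta - 1}}{1 + \left\{\alpha\left(1 - (1 - e^{-\lambda x})^{\theta}\right)\right\}^2}; \quad x, \alpha, \theta, \lambda > 0.$$

$$f_{ALomx}(x; \alpha, \theta, \lambda) = \frac{\alpha\theta\lambda}{\arctan(\alpha)} \frac{(1 + \theta x)^{-\lambda - 1}}{1 + \left\{\alpha (1 + \theta x)^{-\lambda}\right\}^2}; \quad x, \alpha, \theta, \lambda > 0.$$

$$f_{ASE}(x; \alpha) = \frac{\sqrt{e^{-x/\alpha}}}{\pi\alpha \sqrt{1 - e^{-x/\alpha}}}; \quad x, \alpha > 0.$$

$$f_{ASEW}(x; \alpha, \theta, \lambda) = \frac{2\alpha\theta\lambda\,x^{\alpha - 1} e^{-\lambda x^{\alpha}} \left(1 - e^{-\lambda x^{\alpha}}\right)^{\theta - 1}}{\pi \sqrt{1 - \left(1 - e^{-\lambda x^{\alpha}}\right)^{2\theta}}}; \quad x, \alpha, \theta, \lambda > 0.$$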

Tables 2 and 3 present the estimated parameters and their standard errors (SE, in parentheses) for the models under study, obtained by the MLE method for the cancer and relief time data. Similarly, Tables 4 and 5 present the model selection and goodness-of-fit statistics (log-likelihood, AIC, HQIC, KS, CVM, and AD) for both data sets. The suggested model attains values of these statistics that are the lowest or among the lowest when compared with IW, AGE, ALomx, ASE, ASEW, and TSW. Hence NCS-IW is flexible (even relative to the four trigonometric distributions having three parameters) and provides a good fit. The fitted models are also illustrated graphically in Figures 5 and 6, which likewise indicate that the NCS-IW model performs well compared with the candidate models.
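For readers who wish to recompute the fit statistics, the following Python sketch evaluates -2logL and the KS statistic of the NCS-IW model on data set 1. It assumes that the first NCS-IW entries of Table 2 are read as δ = 0.6317 and θ = 32.4048; that reading, and all function names, are assumptions of this example. No output values are asserted here, and the printed numbers should be compared with Table 4.

```python
import math

cancer = [12.20, 23.56, 23.74, 25.87, 31.98, 37, 41.35, 47.38, 55.46, 58.36,
          63.47, 68.46, 78.26, 74.47, 81.43, 84, 92, 94, 110, 112, 119, 127,
          130, 133, 140, 146, 155, 159, 173, 179, 194, 195, 209, 249, 281,
          319, 339, 432, 469, 519, 633, 725, 817, 1776]

delta, theta = 0.6317, 32.4048  # assumed reading of the NCS-IW row of Table 2

def cdf(x):
    E = math.exp(-theta * x ** (-delta))
    return math.sin(math.pi * E / (1.0 + E))

def logpdf(x):
    E = math.exp(-theta * x ** (-delta))
    return (math.log(math.pi * delta * theta) - (delta + 1.0) * math.log(x)
            - theta * x ** (-delta)
            + math.log(math.cos(math.pi * E / (1.0 + E)))
            - 2.0 * math.log(1.0 + E))

neg2loglik = -2.0 * sum(logpdf(x) for x in cancer)

# Two-sided KS distance between the empirical and the fitted CDF.
xs = sorted(cancer)
n = len(xs)
ks = max(max(i / n - cdf(x), cdf(x) - (i - 1) / n)
         for i, x in enumerate(xs, start=1))

print(neg2loglik, ks)
```

The KS distance is taken as the larger of the gaps just before and just after each jump of the empirical CDF, which is the usual way to evaluate the supremum for a continuous fitted model.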

Figure 3: KS and P-P plots (data-I).

Figure 4: KS and P-P plots (data-II).

Table 2: MLEs with SE (in parentheses) (data-I).

Distribution Parameter(SE) Parameter(SE) Parameter(SE)

NCS-IW(δ, θ) IW(δ, θ) AGE(α, θ, λ) ALomx(α, θ, λ) ASE(α) ASEW(α, θ, λ) TSW(α, θ, λ)

0.6317(0.0508) 0.9985(0.0393) 0.0179(0.5939) 27.525(5.8997) 341.8104(4.1943) 0.4578(0.0133) 0.0039(0.0025)

32.4048(6.9827) 75.557(4.4651) 1.0688(0.2216) 0.0640(0.0335)

13.1876(4.5521) 0.9742(0.1073)

0.0047(9.00E-04) 1.5273(0.2833)

0.4031(0.0919) 0.1327(0.1312)

Table 3: MLEs with SE (in parentheses) (data-II).

Distribution Parameter(SE) Parameter(SE) Parameter(SE)

NCS-IW(δ, θ) IW(δ, θ) AGE(α, θ, λ) ALomx(α, θ, λ) ASE(α) ASEW(α, θ, λ) TSW(α, θ, λ)

2.3934(0.4249) 4.0175(0.706) 29.0366(6.6483) 187.9197(5.1477) 127.8946(4.8432) 1.0488(0.1284) 0.0811(0.0398)

6.0185(1.3910) 6.0224(2.0083) 2.9010(3.1180) 0.2891(0.3043)

104.561(19.0921) 2.9331(0.4532)

2.5293(0.567) 12.8568(11.0058)

3.1656(0.1303) 0.1297(0.1388)

Table 4: Some selection criteria and goodness-of-fit statistics (data-I).


Distribution -2logL AIC HQIC KS(p-value) CVM(p-value) AD(p-value)

NCS-IW 556.1646 560.1646 561.4879 0.0706(0.9698) 0.0302(0.9768) 0.2106(0.9873)

IW 559.1617 563.1617 564.4851 0.0916(0.8218) 0.0806(0.6906) 0.5084(0.7373)

AGE 563.9111 569.9111 571.8961 0.1496(0.2518) 0.1963(0.2753) 1.0219(0.3455)

ALomx 556.8248 562.8248 564.8098 0.0532(0.9990) 0.0142(0.9998) 0.1482(0.9988)

ASE 587.1019 589.1019 589.7636 0.2771(0.0018) 1.0569(0.0017) 5.3298(0.0020)

ASEW 554.7389 560.7389 562.7239 0.0634(0.9896) 0.0214(0.9959) 0.1351(0.9994)

TSW 561.8178 567.8178 569.8028 0.1126(0.5930) 0.1014(0.5799) 0.6674(0.5857)

Table 5: Some selection criteria and goodness-of-fit statistics (data-II).

Distribution -2logL AIC HQIC KS(p-value) CVM(p-value) AD(p-value)

NCS-IW 31.0171 35.0171 35.4059 0.0975(0.9913) 0.0254(0.9906) 0.1594(0.9979)

IW 30.8174 34.8174 35.2062 0.1020(0.9854) 0.0266(0.988) 0.1545(0.9984)

AGE 36.8149 42.8149 43.398 0.1193(0.9385) 0.0577(0.8338) 0.5597(0.6847)

ALomx 35.4117 41.4117 41.9949 0.1136(0.9587) 0.0565(0.8416) 0.4783(0.7670)

ASE 154.7472 156.7472 156.9416 0.8863(0.0000) 5.1247(0.0000) 31.4397(0.0000)

ASEW 31.1885 37.1885 37.7716 0.1170(0.9470) 0.0363(0.9551) 0.2096(0.9877)

TSW 39.7066 45.7066 46.2898 0.1694(0.6147) 0.1415(0.4194) 0.8932(0.4170)
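The AIC and HQIC columns of Tables 4 and 5 can be reproduced from the -2logL column with the standard definitions $AIC = -2\log L + 2k$ and $HQIC = -2\log L + 2k \log(\log n)$, where $k$ is the number of parameters and $n$ the sample size. The short Python sketch below applies them to the NCS-IW rows; the function names are hypothetical.

```python
import math

def aic(neg2loglik, k):
    """Akaike information criterion from -2logL and k parameters."""
    return neg2loglik + 2 * k

def hqic(neg2loglik, k, n):
    """Hannan-Quinn information criterion from -2logL, k parameters, n observations."""
    return neg2loglik + 2 * k * math.log(math.log(n))

# NCS-IW rows: data set 1 (n = 44) and data set 2 (n = 20), k = 2 parameters.
print(aic(556.1646, 2), hqic(556.1646, 2, 44))
print(aic(31.0171, 2), hqic(31.0171, 2, 20))
```

Both criteria penalize the number of parameters, which is why a two-parameter model can rank ahead of a three-parameter model that has a slightly smaller -2logL.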

Figure 5: Estimated PDF (left) and empirical vs estimated CDF (right) (data-I).

Figure 6: Estimated PDF (left) and empirical vs estimated CDF (right) (data-II).

6. Conclusion

Based on the ratio of CDF G(x) and 1 + G(x) of baseline distribution, we developed the new trigonometric family of distributions by transforming the sine function and we named it the new class sin-G family of distributions. General properties of the suggested family of distributions are provided. Using Inverse Weibull distribution as a baseline distribution, we have introduced a member of the suggested family having reverse-j or increasing or inverted bathtub-shaped hazard function. Some statistical characteristics of this NCS-IW distribution are explored. The associated parameters of the new distribution are estimated through the MLE method. To assess the estimation procedure, we conducted a Monte Carlo simulation and found that even for small samples, biases and mean square errors decreased as the size of the sample increased. Two real medical data sets are considered for the application of the NCS-IW distribution. Using some model selection criteria and goodness of fit test statistics, we empirically proved that the suggested model performs better than six other existing models (most of which have more parameters). Hence, we expect that the suggested family and its member distribution can be used in broader areas like medical science, reliability engineering, survival analysis, etc., and one can generate a new model using this family of distributions in the future.

References

[1] Alzaatreh, A., Lee, C., & Famoye, F. (2013). A new method for generating families of continuous distributions. Metron, 71(1), 63-79.

[2] Balakrishnan, N., & Cohen, A. C. (1991). Order statistics & inference: estimation methods. Academic Press, London.

[3] Chaudhary, A. K., Sapkota, L. P., & Kumar, V. (2021). Some properties and applications of arctan generalized exponential distribution. International Journal of Innovative Research in Science, Engineering and Technology (IJIRSET), 10(1), 456-468.

[4] Chaudhary, A. K., & Kumar, V. (2021). The ArcTan Lomax distribution with properties and applications. International Journal of Scientific Research in Science, Engineering and Technology, 4099, 117-125.

[5] Chen, G., & Balakrishnan, N. (1995). A general purpose approximate goodness-of-fit test. Journal of Quality Technology, 27(2), 154-161.

[6] Chesneau, C., & Jamal, F. (2020). The sine Kumaraswamy-G family of distributions. Journal of Mathematical Extension, 15.

[7] Clark, V. A., & Gross, A. J. (1975). Survival distributions: reliability applications in the biomedical sciences. New York, John Wiley & Sons.

[8] Efron, B. (1988). Logistic regression, survival analysis, and the Kaplan-Meier curve. Journal of the American Statistical Association, 83(402), 414-425.

[9] Gomez-Deniz, E., & Calderin-Ojeda, E. (2015). Modelling insurance data with the Pareto ArcTan distribution. ASTIN Bulletin: The Journal of the IAA, 45(3), 639-660.

[10] Henningsen, A., & Toomet, O. (2011). maxLik: A package for maximum likelihood estimation in R. Computational Statistics, 26, 443-458.

[11] He, W., Ahmad, Z., Afify, A. Z., & Goual, H. (2020). The arcsine exponentiated-X family: validation and insurance application. Complexity, 1-18.

[12] Isa, A. M., Ali, B. A., & Zannah, U. (2022). Sine Burr XII Distribution: Properties and Application to Real Data Sets. AJBAR, 1(3), 48-58.

[13] Kenney, J. F., & Keeping, E. S. (1962). Mathematics of Statistics, 3rd edn, Chapman and Hall Ltd, New Jersey.

[14] Kharazmi, O., & Saadatinik, A. (2016). Hyperbolic cosine-F families of distributions with an application to exponential distribution. Gazi University Journal of Science, 29(4), 811-829.

[15] Kumar, D., Singh, U., & Singh, S. K. (2015). A new distribution using sine function-its application to bladder cancer patients data. Journal of Statistics Applications & Probability, 4(3), 417.

[16] Mahmood, Z., Chesneau, C., & Tahir, M. H. (2019). A new sine-G family of distributions: properties and applications. Bull. Comput. Appl. Math., 7(1), 53-81.

[17] Marshall, A. W., & Olkin, I. (2007). Life distributions (Vol. 13). Springer, New York.

[18] Moors, J. J. A. (1988). A quantile alternative for kurtosis. Journal of the Royal Statistical Society: Series D (The Statistician), 37(1), 25-32.

[19] Muhammad, M., Alshanbari, H. M., Alanzi, A. R., Liu, L., Sami, W., Chesneau, C., & Jamal, F. (2021). A new generator of probability models: the exponentiated sine-G family for lifetime studies. Entropy, 23(11), 1394.

[20] Renyi, A. (1961). On measures of entropy and information. In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Contributions to the Theory of Statistics (Vol. 4, pp. 547-562). University of California Press.

[21] Rahman, M. M. (2021). Arcsine-G Family of Distributions. J. Stat. Appl. Pro. Lett., 8(3), 169-179.

[22] Sakthivel, K. M., & Rajkumar, J. (2020). Hyperbolic cosine Rayleigh distribution and its application to breaking stress of carbon fibers. Journal of the Indian Society for Probability and Statistics, 21(2), 471-485.

[23] Sakthivel, K. M., & Rajkumar, J. (2021). Transmuted sine-G family of distributions: theory and applications. Statistics and Applications (accepted 10 August 2021).

[24] Souza, L. (2015). New trigonometric classes of probabilistic distributions (Doctoral dissertation, Thesis, Universidade Federal Rural de Pernambuco).

[25] Souza, L., Junior, W. R. D. O., de Brito, C. C. R., Ferreira, T. A., & Soares, L. G. (2019a). General properties for the Cos-G class of distributions with applications. Eurasian Bulletin of Mathematics (ISSN: 2687-5632), 63-79.

[26] Souza, L., Junior, W., De Brito, C., Chesneau, C., Ferreira, T., & Soares, L. (2019b). On the Sin-G class of distributions: theory, model and application. Journal of Mathematical Modeling, 7(3), 357-379.

[27] Swain, J. J., Venkatraman, S., & Wilson, J. R. (1988). Least-squares estimation of distribution functions in Johnson's translation system. Journal of Statistical Computation and Simulation, 29(4), 271-297.
