CC BY
The Poisson-Shukla Distribution and its Applications

Kamlesh Kumar Shukla

Department of Statistics, Mainefhi College of Science, Asmara, Eritrea, Email: kkshukla22@gmail.com

Rama Shanker

Department of Statistics, Assam University, Silchar, India, Email: shankerrama2009@gmail.com

Abstract

In this paper, the Poisson-Shukla distribution (PSD), a Poisson mixture of the Shukla distribution introduced by Shukla and Shanker (2019), is studied. The expression for its rth factorial moment about the origin is derived, and expressions for its mean and variance are given. Maximum likelihood estimation of its parameters is discussed. The goodness of fit of the proposed distribution is illustrated with two count datasets, and its fit is found to be quite satisfactory compared with the Poisson distribution (PD), Poisson-Lindley distribution (PLD), Poisson-weighted Lindley distribution (P-WLD), generalized Poisson-Lindley distribution (GPLD) and negative binomial distribution (NBD).

Keywords: Shukla distribution, Moments based measures, Estimation of parameter, Goodness of fit

1. Introduction

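The probability density function (pdf) and the cumulative distribution function (cdf) of the Shukla distribution proposed by Shukla and Shanker (2019) are given by

$$f(x;\theta,\alpha) = \frac{\theta^{\alpha+1}}{\theta^{\alpha+1}+\Gamma(\alpha+1)}\left(\theta + x^{\alpha}\right)e^{-\theta x};\quad x>0,\ \theta>0,\ \alpha>0 \tag{1.1}$$

$$F(x;\theta,\alpha) = 1-\frac{\theta^{\alpha}\left(\theta + x^{\alpha}\right)e^{-\theta x}+\alpha\,\Gamma(\alpha,\theta x)}{\theta^{\alpha+1}+\Gamma(\alpha+1)};\quad x>0,\ \theta>0,\ \alpha>0 \tag{1.2}$$

where $\Gamma(\alpha, z)=\int_z^\infty t^{\alpha-1}e^{-t}\,dt$ is the upper incomplete gamma function.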
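As a quick sanity check of the density (1.1), the following minimal Python sketch evaluates the pdf and confirms numerically that it integrates to one. The function names `shukla_pdf` and `trapezoid`, the parameter values and the integration limits are illustrative assumptions, not part of the paper.

```python
import math

def shukla_pdf(x, theta, alpha):
    """Shukla density (1.1): normalising constant times (theta + x^alpha) exp(-theta x)."""
    norm = theta ** (alpha + 1) / (theta ** (alpha + 1) + math.gamma(alpha + 1))
    return norm * (theta + x ** alpha) * math.exp(-theta * x)

def trapezoid(f, a, b, n=200_000):
    """Composite trapezoidal rule on [a, b] with n equal sub-intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

# Illustrative parameter values (assumed, not taken from the paper).
theta, alpha = 1.5, 2.0

# The upper limit 60 is an assumption: the tail beyond it is negligible here.
area = trapezoid(lambda x: shukla_pdf(x, theta, alpha), 0.0, 60.0)
print(area)
```

The printed area should be very close to 1, which is what the normalising constant in (1.1) is designed to achieve.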

This distribution, being a convex combination of an exponential ($\theta$) distribution and a gamma ($\alpha+1$, $\theta$) distribution, has been proposed for modeling lifetime data from engineering and biomedical science. The statistical properties of the Shukla distribution, including its shapes, moments, skewness, kurtosis, hazard rate function, mean residual life function, stochastic ordering and mean deviations, along with estimation of its parameters using maximum likelihood estimation, are given in Shukla and Shanker (2019).

The main motivation behind this new discrete distribution is the observation by Shukla and Shanker (2019) that the Shukla distribution provides a better fit for modeling lifetime data than the exponential distribution, the Lindley distribution of Lindley (1958), the gamma distribution, the generalized Lindley distribution of Zakerzadeh and Dolati (2009) and the weighted Lindley distribution given by Ghitany et al. (2011). It is therefore expected that the Poisson mixture of the Shukla distribution would provide a better fit for modeling count data than the Poisson mixtures of these distributions, namely the Poisson-Lindley distribution (PLD) of Sankaran (1970), the negative binomial distribution (NBD) of Greenwood and Yule (1920), the generalized Poisson-Lindley distribution (GPLD) of Mahmoudi and Zakerzadeh (2010) and the Poisson-weighted Lindley distribution (P-WLD) introduced by El-Monsef and Sohsah (2014) and studied by Shanker and Shukla (2019). The probability mass functions of PLD, NBD, GPLD and P-WLD are presented in the following table.

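Table 1: The pmf of the Poisson-Lindley distribution (PLD), negative binomial distribution (NBD), generalized Poisson-Lindley distribution (GPLD) and Poisson-weighted Lindley distribution (P-WLD)

| Distribution | Probability mass function (pmf) | Introducer (year) |
|---|---|---|
| PLD | $P(x;\theta)=\dfrac{\theta^{2}(x+\theta+2)}{(\theta+1)^{x+3}};\ x=0,1,2,\ldots;\ \theta>0$ | Sankaran (1970) |
| NBD | $P(X=x)=\dfrac{\Gamma(x+\alpha)}{\Gamma(x+1)\,\Gamma(\alpha)}\left(\dfrac{\theta}{\theta+1}\right)^{\alpha}\left(\dfrac{1}{\theta+1}\right)^{x};\ x=0,1,2,\ldots$ | Greenwood and Yule (1920) |
| GPLD | $P(x;\theta,\alpha)=\dfrac{\theta^{3}\left\{\alpha x^{2}+(\theta+3\alpha+1)x+(\theta^{2}+3\theta+2\alpha+2)\right\}}{(\theta^{2}+\theta+2\alpha)(\theta+1)^{x+3}};\ x=0,1,2,\ldots$ | Mahmoudi and Zakerzadeh (2010) |
| P-WLD | $P(x;\theta,\alpha)=\dfrac{\Gamma(x+\alpha)}{\Gamma(x+1)\,\Gamma(\alpha)\,(\theta+\alpha)}\cdot\dfrac{\theta^{\alpha+1}(x+\theta+\alpha+1)}{(\theta+1)^{x+\alpha+1}};\ x=0,1,2,\ldots;\ \theta>0,\ \alpha>0$ | El-Monsef and Sohsah (2014) |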

In the present paper, the Poisson-Shukla distribution (PSD), obtained by compounding the Poisson distribution with the Shukla distribution, is proposed. The paper is divided into six sections. Section 2 derives the pmf of PSD and shows its shapes for varying values of the parameters. Moments and moments-based measures of the proposed distribution are discussed in section 3. Sections 4 and 5 deal with the estimation of parameters using maximum likelihood estimation and with the goodness of fit of the proposed distribution, respectively. Finally, conclusions are given in section 6.

2. Poisson-Shukla Distribution

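Assuming that the parameter $\lambda$ of the Poisson distribution follows the Shukla distribution (1.1), the Poisson mixture of the Shukla distribution can be obtained as

$$P(x;\theta,\alpha)=\int_{0}^{\infty}\frac{e^{-\lambda}\lambda^{x}}{x!}\cdot\frac{\theta^{\alpha+1}}{\theta^{\alpha+1}+\Gamma(\alpha+1)}\left(\theta+\lambda^{\alpha}\right)e^{-\theta\lambda}\,d\lambda \tag{2.1}$$

$$=\frac{\theta^{\alpha+1}}{\left\{\theta^{\alpha+1}+\Gamma(\alpha+1)\right\}\Gamma(x+1)}\int_{0}^{\infty}e^{-(\theta+1)\lambda}\left(\theta\lambda^{x}+\lambda^{x+\alpha}\right)d\lambda$$

$$=\left(\frac{\theta}{\theta+1}\right)^{\alpha+1}\left(\frac{1}{\theta+1}\right)^{x}\frac{\theta(\theta+1)^{\alpha}\,\Gamma(x+1)+\Gamma(x+\alpha+1)}{\Gamma(x+1)\left\{\theta^{\alpha+1}+\Gamma(\alpha+1)\right\}};\quad x=0,1,2,\ldots;\ \theta>0,\ \alpha>0 \tag{2.2}$$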
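The pmf (2.2) is straightforward to evaluate numerically. The sketch below is one possible Python implementation; the function name `psd_pmf`, the use of `math.lgamma` for the gamma ratio and the truncation point of the sum are assumptions made here for illustration. The parameter values are the PSD estimates reported for the second dataset in section 5, used only as an example.

```python
import math

def psd_pmf(x, theta, alpha):
    """Poisson-Shukla pmf (2.2) for a non-negative integer x."""
    d = theta ** (alpha + 1) + math.gamma(alpha + 1)
    # (theta/(theta+1))^(alpha+1) * (1/(theta+1))^x, computed on the log scale.
    log_front = (alpha + 1) * math.log(theta / (theta + 1)) - x * math.log(theta + 1)
    # Gamma(x+alpha+1)/Gamma(x+1), via log-gamma to avoid overflow for large x.
    ratio = math.exp(math.lgamma(x + alpha + 1) - math.lgamma(x + 1))
    return math.exp(log_front) * (theta * (theta + 1) ** alpha + ratio) / d

# Example values: the PSD estimates quoted in the paper for the second dataset.
theta, alpha = 1.4384, 0.7011

# The pmf should sum to one; truncating at 500 terms is an assumption
# that is harmless here because the tail decays geometrically.
total = sum(psd_pmf(x, theta, alpha) for x in range(500))
print(total)

# Expected frequencies for a sample of size 300.
expected = [300 * psd_pmf(x, theta, alpha) for x in range(3)]
print([round(e, 1) for e in expected])
```

With these values the first expected frequencies come out close to the PSD column reported for the second dataset (about 158.9 and 76.8 for x = 0 and x = 1), which supports the reconstruction of (2.2).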

We name this distribution the "Poisson-Shukla distribution (PSD)". The behavior of the pmf of PSD for different values of the parameters is shown in figure 1.

Fig. 1: Behavior of the pmf of PSD for different values of the parameters $\theta$ and $\alpha$

3. Moments

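The $r$th factorial moment about the origin of PSD (2.2) can be obtained as

$$\mu_{(r)}'=E\left[E\left(X^{(r)}\mid\lambda\right)\right],\quad\text{where } X^{(r)}=X(X-1)(X-2)\cdots(X-r+1).$$

Using (2.1), the $r$th factorial moment about the origin of PSD (2.2) can be written as

$$\mu_{(r)}'=\frac{\theta^{\alpha+1}}{\theta^{\alpha+1}+\Gamma(\alpha+1)}\int_{0}^{\infty}\left[\sum_{x=0}^{\infty}x^{(r)}\frac{e^{-\lambda}\lambda^{x}}{x!}\right]\left(\theta+\lambda^{\alpha}\right)e^{-\theta\lambda}\,d\lambda$$

$$=\frac{\theta^{\alpha+1}}{\theta^{\alpha+1}+\Gamma(\alpha+1)}\int_{0}^{\infty}\left[\lambda^{r}\sum_{x=r}^{\infty}\frac{e^{-\lambda}\lambda^{x-r}}{(x-r)!}\right]\left(\theta+\lambda^{\alpha}\right)e^{-\theta\lambda}\,d\lambda$$

Taking $x+r$ in place of $x$ within the bracket, we get

$$\mu_{(r)}'=\frac{\theta^{\alpha+1}}{\theta^{\alpha+1}+\Gamma(\alpha+1)}\int_{0}^{\infty}\lambda^{r}\left[\sum_{x=0}^{\infty}\frac{e^{-\lambda}\lambda^{x}}{x!}\right]\left(\theta+\lambda^{\alpha}\right)e^{-\theta\lambda}\,d\lambda$$

The expression within the bracket is clearly unity and hence we have

$$\mu_{(r)}'=\frac{\theta^{\alpha+1}}{\theta^{\alpha+1}+\Gamma(\alpha+1)}\int_{0}^{\infty}\lambda^{r}\left(\theta+\lambda^{\alpha}\right)e^{-\theta\lambda}\,d\lambda$$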

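Using the gamma integral and a little algebraic simplification, we finally get a general expression for the $r$th factorial moment about the origin of PSD (2.2) as

$$\mu_{(r)}'=\frac{\theta^{\alpha+1}\,\Gamma(r+1)+\Gamma(\alpha+r+1)}{\theta^{r}\left\{\theta^{\alpha+1}+\Gamma(\alpha+1)\right\}};\quad r=1,2,3,\ldots \tag{3.1}$$

The first four factorial moments about the origin can thus be obtained as

$$\mu_{(1)}'=\frac{\theta^{\alpha+1}+\Gamma(\alpha+2)}{\theta\left\{\theta^{\alpha+1}+\Gamma(\alpha+1)\right\}}$$

$$\mu_{(2)}'=\frac{2\theta^{\alpha+1}+\Gamma(\alpha+3)}{\theta^{2}\left\{\theta^{\alpha+1}+\Gamma(\alpha+1)\right\}}$$

$$\mu_{(3)}'=\frac{6\theta^{\alpha+1}+\Gamma(\alpha+4)}{\theta^{3}\left\{\theta^{\alpha+1}+\Gamma(\alpha+1)\right\}}$$

$$\mu_{(4)}'=\frac{24\theta^{\alpha+1}+\Gamma(\alpha+5)}{\theta^{4}\left\{\theta^{\alpha+1}+\Gamma(\alpha+1)\right\}}$$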
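The closed form for the factorial moments can be checked against a direct summation over the pmf. The following Python sketch does this for r = 1 to 4; the helper names (`psd_pmf`, `factorial_moment`, `numeric_factorial_moment`), the example parameter values and the truncation of the infinite sum at 400 terms are assumptions introduced for illustration.

```python
import math

def psd_pmf(x, theta, alpha):
    """Poisson-Shukla pmf for a non-negative integer x."""
    d = theta ** (alpha + 1) + math.gamma(alpha + 1)
    log_front = (alpha + 1) * math.log(theta / (theta + 1)) - x * math.log(theta + 1)
    ratio = math.exp(math.lgamma(x + alpha + 1) - math.lgamma(x + 1))
    return math.exp(log_front) * (theta * (theta + 1) ** alpha + ratio) / d

def factorial_moment(r, theta, alpha):
    """Closed-form r-th factorial moment about the origin."""
    d = theta ** (alpha + 1) + math.gamma(alpha + 1)
    return (theta ** (alpha + 1) * math.gamma(r + 1) + math.gamma(alpha + r + 1)) / (theta ** r * d)

def falling_factorial(x, r):
    """x (x-1) ... (x-r+1)."""
    out = 1
    for j in range(r):
        out *= x - j
    return out

def numeric_factorial_moment(r, theta, alpha, upper=400):
    """Truncated sum of x^(r) P(x); the cut-off `upper` is an assumption."""
    return sum(falling_factorial(x, r) * psd_pmf(x, theta, alpha) for x in range(upper))

theta, alpha = 1.4384, 0.7011  # example values only
for r in range(1, 5):
    print(r, factorial_moment(r, theta, alpha), numeric_factorial_moment(r, theta, alpha))
```

The two columns printed for each r should agree to many decimal places, since the geometric tail of the pmf makes the truncation error negligible.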

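Using the relationship between moments about the origin and factorial moments about the origin, the first two moments about the origin of PSD can be expressed as

$$\mu_{1}'=\mu_{(1)}'=\frac{\theta^{\alpha+1}+\Gamma(\alpha+2)}{\theta\left\{\theta^{\alpha+1}+\Gamma(\alpha+1)\right\}}$$

$$\mu_{2}'=\mu_{(2)}'+\mu_{(1)}'=\frac{\theta^{\alpha+2}+2\theta^{\alpha+1}+\theta\,\Gamma(\alpha+2)+\Gamma(\alpha+3)}{\theta^{2}\left\{\theta^{\alpha+1}+\Gamma(\alpha+1)\right\}}$$

Similarly, the third and fourth moments about the origin can be obtained. The variance of PSD can thus be obtained as

$$\mu_{2}=\mu_{2}'-\left(\mu_{1}'\right)^{2}=\frac{\theta^{\alpha+2}+2\theta^{\alpha+1}+\theta\,\Gamma(\alpha+2)+\Gamma(\alpha+3)}{\theta^{2}\left\{\theta^{\alpha+1}+\Gamma(\alpha+1)\right\}}-\left[\frac{\theta^{\alpha+1}+\Gamma(\alpha+2)}{\theta\left\{\theta^{\alpha+1}+\Gamma(\alpha+1)\right\}}\right]^{2}$$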
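To make the mean and variance expressions concrete, consider the special case $\alpha = 1$. This value is chosen here only because it makes the gamma functions trivial ($\Gamma(2)=1$, $\Gamma(3)=2$, $\Gamma(4)=6$); it is not singled out in the paper.

```latex
% Special case alpha = 1 (illustrative assumption), so theta^{alpha+1} + Gamma(alpha+1) = theta^2 + 1.
\begin{align*}
\mu_1' &= \frac{\theta^{2}+2}{\theta(\theta^{2}+1)}, \\
\mu_2' &= \frac{\theta^{3}+2\theta^{2}+2\theta+6}{\theta^{2}(\theta^{2}+1)}, \\
\mu_2  &= \mu_2' - (\mu_1')^{2}
        = \frac{(\theta^{3}+2\theta^{2}+2\theta+6)(\theta^{2}+1)-(\theta^{2}+2)^{2}}{\theta^{2}(\theta^{2}+1)^{2}}
        = \frac{\theta^{5}+\theta^{4}+3\theta^{3}+4\theta^{2}+2\theta+2}{\theta^{2}(\theta^{2}+1)^{2}}.
\end{align*}
% Check at theta = 1: mean = 3/2, second raw moment = 11/2, variance = 11/2 - 9/4 = 13/4.
```

At $\theta = 1$ this gives a mean of 1.5 and a variance of 3.25, so in this particular case the variance exceeds the mean.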

The expressions for the third and fourth moments about the mean are lengthy and hence are not given here. However, they can be obtained easily if required.

4. Maximum Likelihood Estimation (MLE)

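Let $(x_1, x_2, \ldots, x_n)$ be a random sample of size $n$ from the PSD (2.2) and let $f_x$ be the observed frequency in the sample corresponding to $X = x$ $(x = 1,2,3,\ldots,k)$ such that $\sum_{x=1}^{k} f_x = n$, where $k$ is the largest observed value having non-zero frequency. The likelihood function $L$ of the PSD (2.2) is given by

$$L=\left(\frac{\theta}{\theta+1}\right)^{n(\alpha+1)}\frac{1}{\left\{\theta^{\alpha+1}+\Gamma(\alpha+1)\right\}^{n}}\cdot\frac{1}{(\theta+1)^{\sum_{x=1}^{k}x f_x}}\prod_{x=1}^{k}\left[\frac{\theta(\theta+1)^{\alpha}\,\Gamma(x+1)+\Gamma(x+\alpha+1)}{\Gamma(x+1)}\right]^{f_x}$$

The log-likelihood function is thus obtained as

$$\log L=n(\alpha+1)\log\left(\frac{\theta}{\theta+1}\right)-n\log\left[\theta^{\alpha+1}+\Gamma(\alpha+1)\right]-n\bar{x}\log(\theta+1)+\sum_{x=1}^{k}f_x\left[\log\left\{\theta(\theta+1)^{\alpha}\,\Gamma(x+1)+\Gamma(x+\alpha+1)\right\}-\log\Gamma(x+1)\right]$$

where $\bar{x}$ is the sample mean.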
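The log-likelihood can be coded directly from a frequency table. The Python sketch below is a minimal illustration; the function name `psd_loglik` and the dictionary representation of the data are assumptions, and the sum is taken over every class in the table, including the zero class that the example data contain. The frequencies and parameter values are those reported for the second dataset in section 5.

```python
import math

def psd_loglik(theta, alpha, freq):
    """Log-likelihood of the PSD for grouped counts.

    freq maps each observed count x to its frequency f_x.
    """
    n = sum(freq.values())
    sum_x = sum(x * f for x, f in freq.items())  # equals n times the sample mean
    ll = n * (alpha + 1) * math.log(theta / (theta + 1))
    ll -= n * math.log(theta ** (alpha + 1) + math.gamma(alpha + 1))
    ll -= sum_x * math.log(theta + 1)
    for x, f in freq.items():
        # log{theta(theta+1)^a Gamma(x+1) + Gamma(x+a+1)} - log Gamma(x+1),
        # rewritten with the gamma ratio for numerical stability.
        ratio = math.exp(math.lgamma(x + alpha + 1) - math.lgamma(x + 1))
        ll += f * math.log(theta * (theta + 1) ** alpha + ratio)
    return ll

# Observed frequencies of the second dataset (classes 0 to 6, n = 300).
table3 = {0: 155, 1: 83, 2: 33, 3: 14, 4: 11, 5: 3, 6: 1}

# Evaluate at the PSD estimates quoted in the paper for this dataset.
minus_two_loglik = -2.0 * psd_loglik(1.4384, 0.7011, table3)
print(round(minus_two_loglik, 2))
```

Evaluated at the reported estimates, the value of -2 log L comes out close to the 766.32 listed for PSD in the second goodness-of-fit table.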

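The maximum likelihood estimates $(\hat\theta,\hat\alpha)$ of the parameters $(\theta,\alpha)$ of PSD (2.2) are the solutions of the following log-likelihood equations

$$\frac{\partial\log L}{\partial\theta}=\frac{n(\alpha+1)}{\theta(\theta+1)}-\frac{n(\alpha+1)\theta^{\alpha}}{\theta^{\alpha+1}+\Gamma(\alpha+1)}-\frac{n\bar{x}}{\theta+1}+\sum_{x=1}^{k}\frac{(\theta\alpha+\theta+1)(\theta+1)^{\alpha-1}\,\Gamma(x+1)\,f_x}{\theta(\theta+1)^{\alpha}\,\Gamma(x+1)+\Gamma(x+\alpha+1)}=0$$

$$\frac{\partial\log L}{\partial\alpha}=n\log\left(\frac{\theta}{\theta+1}\right)-\frac{n\left\{\theta^{\alpha+1}\log\theta+\Gamma(\alpha+1)\,\psi(\alpha+1)\right\}}{\theta^{\alpha+1}+\Gamma(\alpha+1)}+\sum_{x=1}^{k}\frac{\left\{\theta(\theta+1)^{\alpha}\log(\theta+1)\,\Gamma(x+1)+\Gamma(x+\alpha+1)\,\psi(x+\alpha+1)\right\}f_x}{\theta(\theta+1)^{\alpha}\,\Gamma(x+1)+\Gamma(x+\alpha+1)}=0,$$

where $\psi(\alpha)=\dfrac{d}{d\alpha}\ln\Gamma(\alpha)$ is the digamma function. These two log-likelihood equations cannot be solved directly because they do not have closed-form solutions. Therefore, to find the maximum likelihood estimates of the parameters, an iterative method such as the Fisher scoring method, bisection method, regula falsi method or Newton-Raphson method can be used. In this paper the Newton-Raphson method has been used, implemented in R.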
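The iterative maximisation can be illustrated without derivatives. The paper uses Newton-Raphson in R; the Python sketch below deliberately swaps in a simple derivative-free compass (pattern) search, which is easier to write safely by hand. The function names, starting point, step size, tolerance and the upper bound on α are all assumptions made for this illustration, and the result is not claimed to reproduce the estimates reported in the paper.

```python
import math

def psd_loglik(theta, alpha, freq):
    """PSD log-likelihood for grouped counts (freq maps x to f_x)."""
    n = sum(freq.values())
    sum_x = sum(x * f for x, f in freq.items())
    ll = n * (alpha + 1) * math.log(theta / (theta + 1))
    ll -= n * math.log(theta ** (alpha + 1) + math.gamma(alpha + 1))
    ll -= sum_x * math.log(theta + 1)
    for x, f in freq.items():
        ratio = math.exp(math.lgamma(x + alpha + 1) - math.lgamma(x + 1))
        ll += f * math.log(theta * (theta + 1) ** alpha + ratio)
    return ll

def compass_search(freq, start=(1.0, 1.0), step=0.5, tol=1e-6, max_iter=20_000):
    """Maximise the log-likelihood by axis-aligned trial moves.

    A trial move is accepted only if it increases the log-likelihood;
    when no move helps, the step is halved. This is NOT Newton-Raphson.
    """
    theta, alpha = start
    best = psd_loglik(theta, alpha, freq)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for d_theta, d_alpha in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            t, a = theta + d_theta, alpha + d_alpha
            # Keep both parameters positive; the cap on alpha (assumed) avoids gamma overflow.
            if t <= 0.0 or a <= 0.0 or a > 50.0:
                continue
            value = psd_loglik(t, a, freq)
            if value > best:
                theta, alpha, best = t, a, value
                improved = True
                break
        if not improved:
            step /= 2.0
    return theta, alpha, best

# Observed frequencies of the second dataset (classes 0 to 6, n = 300).
table3 = {0: 155, 1: 83, 2: 33, 3: 14, 4: 11, 5: 3, 6: 1}
theta_hat, alpha_hat, ll_hat = compass_search(table3)
print(theta_hat, alpha_hat, -2.0 * ll_hat)
```

Because every accepted move increases the log-likelihood, the search is monotone, but it may stop at a local maximum and is slower than Newton-Raphson; in practice several starting points should be tried.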

5. Goodness of fit

The PSD has been fitted to a number of datasets to test its goodness of fit along with PD, PLD, NBD, GPLD and P-WLD. Because PLD, NBD, GPLD and P-WLD are always over-dispersed, their comparison regarding goodness of fit is justifiable. Maximum likelihood estimation has been used to fit PD, PLD, NBD, GPLD, P-WLD and PSD to two observed count datasets. The first dataset, in table 2, gives the number of European red mites on apple leaves and is available in Bliss (1953). The second dataset, in table 3, gives mammalian cytogenetic dosimetry lesions in rabbit lymphoblast induced by streptonigrin (NSC-45383) and is available in Catcheside et al. (1946).

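Table 2: Observed and expected number of European red mites on apple leaves, available in Bliss (1953)

| Number of European red mites per leaf | Observed frequency | PD | PLD | NBD | GPLD | P-WLD | PSD |
|---|---|---|---|---|---|---|---|
| 0 | 70 | 47.6 | 67.2 | 69.5 | 69.8 | 69.8 | 71.3 |
| 1 | 38 | 54.6 | 38.9 | 37.6 | 36.7 | 36.8 | 35.0 |
| 2 | 17 | 31.3 | 21.2 | 20.1 | 20.1 | 20.1 | 19.5 |
| 3 | 10 | 11.9 | 11.1 | 10.7 | 10.9 | 10.9 | 11.1 |
| 4 | 9 | 3.4 | 5.7 | 5.7 | 5.8 | 5.8 | 6.2 |
| 5 | 3 | 0.8 | 2.8 | 3.0 | 3.1 | 3.0 | 3.4 |
| 6 | 2 | 0.2 | 1.4 | 1.6 | 1.6 | 1.6 | 1.7 |
| 7 | 1 | 0.1 | 0.9 | 0.9 | 0.8 | 0.8 | 0.8 |
| 8 | 0 | 0.1 | 0.8 | 0.9 | 1.2 | 1.2 | 1.0 |
| Total | 150 | 150.0 | 150.0 | 150.0 | 150.0 | 150.0 | 150.0 |
| ML estimate | | $\hat\theta$ = 1.14666 | $\hat\theta$ = 1.26010 | $\hat\alpha$ = 1.02459, $\hat p$ = 0.52811 | $\hat\theta$ = 1.09620, $\hat\alpha$ = 0.78005 | $\hat\theta$ = 1.09141, $\hat\alpha$ = 0.82194 | $\hat\theta$ = 1.8444, $\hat\alpha$ = 3.1231 |
| Standard error (first estimate) | | 0.08743 | 0.11390 | 0.42097 | 0.25400 | 0.26231 | 0.6409 |
| Standard error (second estimate) | | | | 0.40136 | 0.31550 | 0.25230 | 2.0978 |
| $\chi^2$ | | 26.50 | 2.49 | 2.91 | 2.43 | 2.41 | 2.09 |
| d.f. | | 2 | 4 | 3 | 3 | 3 | 3 |
| p-value | | 0.0000 | 0.5595 | 0.4057 | 0.4880 | 0.4917 | 0.5539 |
| -2 log L | | 485.61 | 445.02 | 469.68 | 444.62 | 425.35 | 444.09 |
| AIC | | 487.61 | 447.02 | 447.02 | 448.62 | 429.35 | 448.09 |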

Table 3: Mammalian cytogenetic dosimetry lesions in rabbit lymphoblast induced by streptonigrin (NSC-45383), exposure: 90 μg/kg

| Class/Exposure (μg/kg) | Observed frequency | PD | PLD | NBD | GPLD | P-WLD | PSD |
|---|---|---|---|---|---|---|---|
| 0 | 155 | 127.8 | 158.3 | 155.1 | 155.3 | 155.9 | 158.9 |
| 1 | 83 | 109.0 | 77.2 | 80.6 | 80.1 | 80.0 | 76.8 |
| 2 | 33 | 46.5 | 35.9 | 36.7 | 36.9 | 36.7 | 35.5 |
| 3 | 14 | 13.2 | 16.1 | 15.9 | 16.0 | 15.9 | 16.0 |
| 4 | 11 | 2.8 | 7.1 | 6.7 | 6.7 | 6.7 | 7.1 |
| 5 | 3 | 0.5 | 3.1 | 2.8 | 2.8 | 2.7 | 3.2 |
| 6 | 1 | 0.2 | 2.3 | 2.2 | 2.2 | 2.1 | 2.5 |
| Total | 300 | 300.0 | 300.0 | 300.0 | 300.0 | 300.0 | 300.0 |
| ML estimate | | $\hat\theta$ = 0.85333 | $\hat\theta$ = 1.61761 | $\hat\theta$ = 1.56009, $\hat\alpha$ = 1.33128 | $\hat\theta$ = 1.80860, $\hat\alpha$ = 1.18743 | $\hat\theta$ = 1.82011, $\hat\alpha$ = 1.16320 | $\hat\theta$ = 1.4384, $\hat\alpha$ = 0.7011 |
| SE($\hat\theta$) | | 0.05333 | 0.11327 | 0.41479 | 0.40045 | 0.41992 | 0.3398 |
| SE($\hat\alpha$) | | | | 0.33752 | 0.37007 | 0.32483 | 1.1181 |
| $\chi^2$ | | 24.969 | 1.51 | 1.60 | 1.69 | 1.78 | 1.40 |
| d.f. | | 2 | 3 | 2 | 2 | 2 | 2 |
| p-value | | 0.0000 | 0.6799 | 0.4488 | 0.42955 | 0.4106 | 0.4965 |
| -2 log L | | 800.92 | 766.10 | 765.86 | 765.79 | 834.51 | 766.32 |
| AIC | | 802.92 | 768.10 | 769.86 | 769.79 | 838.51 | 770.32 |
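The summary rows of the goodness-of-fit table can be recomputed from the observed and expected frequencies. The Python sketch below does this for the PSD column of the second dataset. Pooling classes 4 to 6 into one class is an assumption inferred from the reported 2 degrees of freedom (5 classes, minus 1, minus 2 estimated parameters); the paper does not state its pooling rule. The function name `pooled_chi_square` is likewise illustrative.

```python
import math

def pooled_chi_square(observed, expected, pool_from):
    """Pearson chi-square after pooling all classes from index pool_from onward."""
    obs = list(observed[:pool_from]) + [sum(observed[pool_from:])]
    exp = list(expected[:pool_from]) + [sum(expected[pool_from:])]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(obs, exp))
    return chi2, len(obs)

# Observed frequencies and PSD expected frequencies for classes 0 to 6.
observed = [155, 83, 33, 14, 11, 3, 1]
expected_psd = [158.9, 76.8, 35.5, 16.0, 7.1, 3.2, 2.5]

# Assumed pooling: classes 4, 5 and 6 are merged into a single class.
chi2, classes = pooled_chi_square(observed, expected_psd, pool_from=4)

n_params = 2                      # theta and alpha
df = classes - 1 - n_params       # degrees of freedom

# For exactly 2 degrees of freedom the chi-square upper tail is exp(-x/2);
# this shortcut is not valid for other degrees of freedom.
p_value = math.exp(-chi2 / 2.0)

# AIC = -2 log L + 2 * (number of parameters), using the tabulated -2 log L.
aic = 766.32 + 2 * n_params

print(round(chi2, 2), df, round(p_value, 4), round(aic, 2))
```

Under this pooling assumption the chi-square statistic, degrees of freedom, p-value and AIC agree with the values tabulated for PSD (1.40, 2, 0.4965 and 770.32).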

The fitted plots of the distributions for the datasets in tables 2 and 3 are presented in figures 3 and 4, respectively.

Fig. 3: Fitted plot of distributions for the dataset in table 2

Fig. 4: Fitted plot of distributions for the dataset in table 3

It is clear from the goodness of fit of the considered distributions in tables 2 and 3 and from the fitted plots that PSD gives a closer fit than the competing distributions, having the smallest $\chi^2$ value for both datasets.

6. Concluding Remarks

In this paper the Poisson-Shukla distribution (PSD) has been proposed. The expression for the rth factorial moment about the origin has been derived, and from it the mean and variance have been obtained. Maximum likelihood estimation of its parameters has been discussed. The goodness of fit of PSD relative to PD, PLD, NBD, GPLD and P-WLD has been discussed with two observed real datasets. It is observed that PSD gives a much better fit than PD, PLD, NBD, GPLD and P-WLD on both datasets, and hence it can be considered an important discrete distribution for modeling count data over these distributions.

References

1. Bliss, C.I. (1953): Fitting negative binomial distribution to biological data, Biometrics, 9, 177-200.

2. Catcheside, D.G., Lea, D.E. and Thoday, J.M. (1946): The production of chromosome structural changes in Tradescantia microspores in relation to dosage, intensity and temperature, Journal of Genetics, 47, 137-149.

3. El-Monsef, M.M.E. and Sohsah, N.M. (2014): Poisson-Weighted Lindley Distribution, Jokull Journal, 64(5), 192-202.

4. Ghitany, M.E., Alqallaf, F., Al-Mutairi, D.K. and Husain, H.A. (2011): A two-parameter weighted Lindley distribution and its applications to survival data, Mathematics and Computers in Simulation, 81, 1190-1201.

5. Greenwood, M. and Yule, G.U. (1920): An inquiry into the nature of frequency distributions representative of multiple happenings with particular reference to the multiple attacks of disease or of repeated accidents, Journal of the Royal Statistical Society, 83(2), 115-121.

6. Lindley, D.V. (1958): Fiducial distributions and Bayes theorem, Journal of the Royal Statistical Society, 20(1), 102-107.

7. Mahmoudi, E. and Zakerzadeh, H. (2010): Generalized Poisson-Lindley distribution, Communications in Statistics-Theory & Methods, 39, 1785-1798.

8. Sankaran, M. (1970): The discrete Poisson-Lindley distribution, Biometrics, 26, 145-149.

9. Shanker, R. and Shukla, K.K. (2019): On Poisson weighted Lindley distribution and its Applications, Journal of Scientific Research, 11(1), 1-13.

10. Shukla, K.K. and Shanker, R. (2019): Shukla distribution and its Application, R.T.&A, 14(3), 46-55.

11. Zakerzadeh, H. and Dolati, A. (2009): Generalized Lindley distribution, Journal of Mathematical Extension, 3(2), 13-25.
