CC BY

AN INFERENTIAL STUDY OF DISCRETE BURR-HATKE EXPONENTIAL DISTRIBUTION UNDER COMPLETE AND CENSORED DATA

ARVIND PANDEY, RAVINDRA PRATAP SINGH*, ABHISHEK TYAGI

Department of Statistics, Central University of Rajasthan, Rajasthan-305817, India
Department of Statistics, Chaudhary Charan Singh University, Meerut-250004, India

arvindmzu@gmail.com stats.rpsingh@gmail.com* abhishektyagi033@gmail.com

Abstract

In this article, a new one-parameter discrete distribution called the discrete Burr-Hatke exponential distribution is introduced and its mathematical characteristics are thoroughly investigated. The proposed distribution is capable of modelling over-dispersed, positively skewed, decreasing failure rate, and randomly right-censored data. We derive many of its statistical properties, including moments, skewness, kurtosis, mean residual life and mean past lifetime, index of dispersion, coefficient of variation, stress-strength parameter, quantile function, and order statistics. The method of maximum likelihood is used to estimate the model's unknown parameter under complete and censored data. In addition, a technique for generating randomly right-censored data from the proposed model is provided. To evaluate the behaviour of the estimator with complete and censored data, two simulation studies are presented. Two complete and two censored datasets from various disciplines are studied to demonstrate the significance of the suggested distribution in comparison to existing discrete probability distributions.

Keywords: Burr-Hatke exponential distribution, Method of maximum likelihood, Discrete distribution, Random censoring, Simulation study

1. Introduction

Many continuous lifetime models have been proposed and investigated in reliability theory. However, measuring the life of a component on a continuous scale is frequently impossible or inconvenient. For example, the lifetime of an on/off switching device in reliability engineering, or the survival time of patients suffering from diseases such as lung cancer and the period from remission to relapse in survival analysis, may be recorded as a number of days, weeks, and so on. Furthermore, count phenomena arise in a wide range of practical scenarios, including the number of earthquakes in a calendar year, the number of absences, the number of accidents, the number of species in ecology, the number of insurance claims, and the number of deaths or daily cases due to the COVID-19 pandemic observed over a specified duration. In all of these circumstances, it is more appropriate to measure these characteristics on a discrete scale than on a continuous one.

Several conventional discrete distributions, such as the binomial, Poisson and geometric, as well as more recently developed discrete models, are available for analysing the characteristics discussed above. Nevertheless, the search for new discrete distributions that are appropriate under various scenarios is still underway. One prominent area of study in this field is the development of discrete distributions by discretizing suitable continuous probability distributions. Discretization of a continuous distribution can be accomplished by a variety of methods, one of the most widely used being that of [1], in which a discrete normal distribution is constructed from the survival function of its continuous counterpart. Chakraborty [2] named this technique the survival discretization method. One of its most important advantages is that the resulting discrete distribution has the same functional form of the survival function as its continuous version, so many of the reliability characteristics of the distribution remain unchanged. According to this methodology, for a given continuous random variable (RV) $Y$ with survival function (SF) $S_Y(y) = P(Y > y)$, the random variable $X = \lfloor Y \rfloor$, the largest integer less than or equal to $Y$, has the probability mass function (PMF)

$$
\begin{aligned}
P(X = x) &= P(x \le Y < x + 1) \\
&= P(Y > x) - P(Y > x + 1) \\
&= S_Y(x) - S_Y(x + 1); \quad x = 0, 1, 2, \ldots
\end{aligned}
\tag{1}
$$
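As a minimal sketch of the survival discretization step in Equation (1), the snippet below discretizes an exponential lifetime, for which the result is known to be a geometric distribution. The function name `discretize_sf`, the rate value and the truncation at 200 terms are illustrative assumptions, not part of the original article.

```python
import math

def discretize_sf(sf, x):
    """PMF of X = floor(Y) at the integer x, given the continuous SF of Y (Equation (1))."""
    return sf(x) - sf(x + 1)

# Illustrative continuous model (an assumption): exponential with rate 0.7.
rate = 0.7
exp_sf = lambda y: math.exp(-rate * y)

# PMF of the discretized variable on 0, 1, ..., 199.
pmf = [discretize_sf(exp_sf, x) for x in range(200)]

# For an exponential SF the discretized PMF is geometric: (1 - q) * q**x with q = exp(-rate).
q = math.exp(-rate)
geometric = [(1 - q) * q ** x for x in range(200)]
```

The same helper can be applied to any continuous survival function, which is how the DBHE model is obtained from the BHE model in Section 2.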

Many scholars have discretized well-known continuous distributions using this approach. For instance, [3] investigated the discrete Rayleigh distribution and [4] the discrete Maxwell distribution. In addition, [5] investigated the discrete Burr and discrete Pareto distributions, the discrete inverse Weibull distribution was developed by [6], and the discrete-continuous Burr III distribution was defined by [7]. For more studies on discrete distributions, one can refer to [8], [9], [10], [11] and the references cited therein. Recently, [12] developed a discrete analogue of the odd Weibull-G family of distributions and studied its properties and its classical and Bayesian estimation, with applications to count data on the number of new coronavirus cases.

In many circumstances, data collection is restricted by constraints such as time or money, making it hard to obtain the entire dataset. This form of incomplete data is referred to as censored data. Various censoring mechanisms are available in the literature to examine such datasets. One of the most widely applicable is random censoring, which covers studies in which subjects can be censored at any time during the experiment. Random censoring arises in clinical trials or medical studies where patients do not finish the course of treatment and leave before the endpoint. Randomly censored lifetime data are common in many applications, such as medical science, biology and reliability studies, and must be analysed properly to draw correct inferences and appropriate research conclusions. Random censoring has been widely investigated in the literature for continuous models; see [13]. For discrete models, censoring has been studied by only a few authors, namely [14] and [15]. Recently, [16] developed the discrete inverted Nadarajah-Haghighi distribution and estimated its parameters under complete and randomly right-censored data.

The majority of existing discrete models were developed to assess count data and, in most cases, they do not analyse censored data accurately. This motivates us to develop a more appropriate discrete distribution that is capable not only of analysing count data but also of modelling censored data. Therefore, in this article, we propose a discrete analogue of the Burr-Hatke exponential model using approach (1) and name it the discrete Burr-Hatke exponential (DBHE) distribution. The objectives of developing the DBHE model are as follows:

a) To construct a discrete model capable of modelling both complete and censored data,

b) To design a discrete model with more flexibility and fewer parameters so that the form of diverse distributional properties can be easily handled,

c) Numerous practical studies, such as those of newly developed engineering systems and infant mortality, have shown a decreasing failure rate; consequently, we wish to construct a discrete model with a decreasing failure rate function,

d) To develop a model that can fit positively skewed, leptokurtic and over-dispersed real data,

e) To produce a discrete model that can provide consistently better fits than other well-known discrete models in the existing statistical literature.

The rest of the article is structured as follows. Section 2 introduces the DBHE distribution. Some significant distributional and survival features are investigated in Section 3. In Section 4, we use the maximum likelihood approach to estimate the parameter of the DBHE distribution with complete data and present numerical illustrations based on simulated and real-world datasets. Section 5 discusses the maximum likelihood estimator (MLE) of the model's parameter under randomly right-censored data, gives a technique for generating censored observations from the proposed model, and presents numerical examples using randomly right-censored simulated and real data. Section 6 concludes with some final observations.

2. The DBHE distribution

The Burr-Hatke exponential (BHE) distribution was proposed by [17]. The probability density function (PDF) and SF of the BHE distribution are given as

$$ f(y; \theta) = \frac{\theta (2 + \theta y)}{(1 + \theta y)^2} \exp(-\theta y); \quad y > 0,\ \theta > 0, \tag{2} $$

and

$$ S(y; \theta) = P(Y > y) = \frac{\exp(-\theta y)}{1 + \theta y}; \quad y > 0,\ \theta > 0, \tag{3} $$

respectively. The BHE distribution is right-skewed with a decreasing hazard rate function (HRF). This model is very useful for analysing reliability and medical data that exhibit a decreasing hazard rate. Since it is built on an exponential baseline distribution, it may be regarded as an alternative to several one-parameter exponential-type families of distributions. Applying methodology (1) to the SF (3), the PMF of the DBHE model is obtained as

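$$ P_X(x; \theta) = \left[\frac{1}{1 + \theta x} - \frac{\exp(-\theta)}{1 + \theta + \theta x}\right] \exp(-\theta x); \quad x = 0, 1, 2, \ldots;\ \theta > 0. \tag{4} $$

The CDF corresponding to Equation (4) is given by

$$ F_X(x; \theta) = 1 - \frac{\exp(-\theta (x + 1))}{1 + \theta + \theta x}; \quad x = 0, 1, 2, \ldots;\ \theta > 0. \tag{5} $$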
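Equations (4) and (5) translate directly into code. The following is a minimal sketch; the function names and the truncation of the support at 200 points are assumptions made for illustration only.

```python
import math

def dbhe_pmf(x, theta):
    """P(X = x) from Equation (4)."""
    bracket = 1.0 / (1.0 + theta * x) - math.exp(-theta) / (1.0 + theta + theta * x)
    return bracket * math.exp(-theta * x)

def dbhe_cdf(x, theta):
    """P(X <= x) from Equation (5)."""
    return 1.0 - math.exp(-theta * (x + 1)) / (1.0 + theta + theta * x)

theta = 0.5
probs = [dbhe_pmf(x, theta) for x in range(200)]

# The probabilities should sum to (numerically) one, and partial sums should match the CDF.
total = sum(probs)
partial = sum(probs[:4])      # P(X <= 3) by summation
closed_form = dbhe_cdf(3, theta)
```

Checking that the partial sums of the PMF agree with the closed-form CDF is a quick way to catch transcription errors in either formula.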

Figure 1: The PMF plots of the DBHE model for different values of $\theta$.

Figure 1 shows the PMF plots for different values of the model parameter. From Figure 1, we can conclude that the PMF of the DBHE distribution is unimodal and right-skewed. The behaviour of the PMF at the endpoints is as follows:

$$ \lim_{x \to 0} P_X(x; \theta) = 1 - \frac{\exp(-\theta)}{1 + \theta}, $$

$$ \lim_{x \to \infty} P_X(x; \theta) = \lim_{\theta \to 0} P_X(x; \theta) = 0. $$

AN INFERENTIAL STUDY OF DISCRETE BURR-HATKE RT&A, No 4 (71)

EXPONENTIAL DISTRIBUTION Volume 17, December 2022

3. Distributional Properties

3.1. Recurrence Relation for Probabilities

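To obtain the probability mass at successive values of $X$, we can use the following recursive relation:

$$ P_X(x + 1; \theta) = \left(\frac{1}{1 + \theta + \theta x} - \frac{\exp(-\theta)}{1 + 2\theta + \theta x}\right) \left(\frac{1}{1 + \theta x} - \frac{\exp(-\theta)}{1 + \theta + \theta x}\right)^{-1} \exp(-\theta)\, P_X(x; \theta). $$

It can be observed that $\{P_X(x + 1)\}^2 \le P_X(x)\, P_X(x + 2)$ for all $x$. As a result, the DBHE distribution is log-convex. Owing to this log-convexity, the proposed distribution has a non-increasing failure rate [18].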
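The recurrence can be used to build the probabilities one step at a time, starting from $P_X(0; \theta) = 1 - \exp(-\theta)/(1 + \theta)$. The sketch below does this and then checks the log-convexity inequality numerically on the first few support points; the function names, the value $\theta = 0.5$ and the number of terms are illustrative assumptions.

```python
import math

def dbhe_pmf(x, theta):
    """Direct evaluation of Equation (4), used here only as a cross-check."""
    return (1.0 / (1.0 + theta * x)
            - math.exp(-theta) / (1.0 + theta + theta * x)) * math.exp(-theta * x)

def step_ratio(x, theta):
    """P(x + 1) / P(x) according to the recurrence relation."""
    num = 1.0 / (1.0 + theta + theta * x) - math.exp(-theta) / (1.0 + 2.0 * theta + theta * x)
    den = 1.0 / (1.0 + theta * x) - math.exp(-theta) / (1.0 + theta + theta * x)
    return num / den * math.exp(-theta)

def probabilities_by_recurrence(theta, n_terms):
    """P(0), P(1), ..., P(n_terms - 1) built recursively from P(0)."""
    probs = [1.0 - math.exp(-theta) / (1.0 + theta)]
    for x in range(n_terms - 1):
        probs.append(probs[-1] * step_ratio(x, theta))
    return probs

theta = 0.5
recursive = probabilities_by_recurrence(theta, 30)
direct = [dbhe_pmf(x, theta) for x in range(30)]

# Log-convexity: P(x + 1)^2 <= P(x) * P(x + 2) on the computed range.
log_convex = all(recursive[x + 1] ** 2 <= recursive[x] * recursive[x + 2]
                 for x in range(len(recursive) - 2))
```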

3.2. Moments, Skewness and Kurtosis

Moments of a probability distribution are an important tool for measuring its properties, such as the mean, variance, skewness and kurtosis. If $F(x)$ is the CDF of a discrete random variable, then the $r$th raw moment of this random variable can be obtained from the formula

$$ E(X^r) = \sum_{x=0}^{\infty} \left\{(x + 1)^r - x^r\right\} \left\{1 - F(x)\right\}. $$

Using the above expression, the $r$th raw moment of the DBHE distribution, denoted by $\mu'_r$, can be written as

$$ \mu'_r = E(X^r) = \exp(-\theta) \sum_{x=0}^{\infty} \frac{(x + 1)^r - x^r}{1 + \theta + \theta x} \exp(-\theta x). \tag{6} $$

Using the ratio test, it is easily seen that the series in Equation (6) is convergent, which implies the existence of the $r$th moment of the proposed distribution.

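Now, using Equation (6), the first four raw moments of the DBHE distribution are

$$ \mu'_1 = E(X) = \exp(-\theta) \sum_{x=0}^{\infty} \frac{\exp(-\theta x)}{1 + \theta + \theta x}, \tag{7} $$

$$ \mu'_2 = E(X^2) = \exp(-\theta) \sum_{x=0}^{\infty} \frac{2x + 1}{1 + \theta + \theta x} \exp(-\theta x), \tag{8} $$

$$ \mu'_3 = E(X^3) = \exp(-\theta) \sum_{x=0}^{\infty} \frac{3x^2 + 3x + 1}{1 + \theta + \theta x} \exp(-\theta x), \tag{9} $$

$$ \mu'_4 = E(X^4) = \exp(-\theta) \sum_{x=0}^{\infty} \frac{4x^3 + 6x^2 + 4x + 1}{1 + \theta + \theta x} \exp(-\theta x). \tag{10} $$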
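Since the series in Equations (6)-(10) have no closed form, they are evaluated numerically in practice. The sketch below truncates the series of Equation (6) after a fixed number of terms and derives the variance, index of dispersion and coefficient of variation from the first two raw moments. The function names and the truncation point `n_terms` are assumptions; a smaller $\theta$ gives a heavier tail and needs more terms.

```python
import math

def dbhe_raw_moment(r, theta, n_terms=5000):
    """Truncated evaluation of the series in Equation (6)."""
    total = 0.0
    for x in range(n_terms):
        weight = (x + 1) ** r - x ** r
        total += weight * math.exp(-theta * (x + 1)) / (1.0 + theta + theta * x)
    return total

def dbhe_summary(theta):
    """Mean, variance, index of dispersion and coefficient of variation."""
    mean = dbhe_raw_moment(1, theta)
    variance = dbhe_raw_moment(2, theta) - mean ** 2
    return {
        "mean": mean,
        "variance": variance,
        "iod": variance / mean,
        "cov": math.sqrt(variance) / mean,
    }

summary = dbhe_summary(0.5)
```

A useful sanity check is that the mean obtained this way agrees with the direct sum of $x\,P_X(x; \theta)$ over the support.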

The variance of the DBHE distribution is given by

$$ \mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \exp(-\theta) \sum_{x=0}^{\infty} \frac{(2x + 1) \exp(-\theta x)}{1 + \theta + \theta x} - \left(\exp(-\theta) \sum_{x=0}^{\infty} \frac{\exp(-\theta x)}{1 + \theta + \theta x}\right)^2. $$

Using the raw moments in (7)-(10), the skewness and kurtosis follow from the usual moment relations; the kurtosis, for example, is

$$ K = \frac{E(X^4) - 4E(X^3)E(X) + 6E(X^2)[E(X)]^2 - 3[E(X)]^4}{[\mathrm{Var}(X)]^2}. $$

Table 1 presents some numerical results for the mean, variance, skewness and kurtosis of the DBHE distribution for different values of $\theta$.

Table 1: Mean, variance, skewness and kurtosis for different values of $\theta$.

Measure $\downarrow$ $\theta \rightarrow$ 0.1 0.2 0.3 0.5 0.7 0.9 1 1.5 2

Mean 4.6575 1.8326 0.9674 0.3692 0.1701 0.0863 0.0629 0.0149 0.0041

Variance 52.5614 13.4619 5.7640 1.7577 0.7144 0.3336 0.2356 0.0505 0.0130

Skewness 4.8141 5.4455 6.9073 11.9474 20.6287 34.9582 45.2208 156.6311 517.1785

Kurtosis 10.7294 10.8777 12.1065 17.1201 26.0157 40.5689 50.8788 160.0551 504.2940

From Table 1, it is clear that:

1. As the parameter's value increases, the values of mean and variance of the DBHE distribution decrease, whereas the values of skewness and kurtosis increase.

2. The proposed model is appropriate for modelling positively skewed and leptokurtic data.

3.3. Index of Dispersion and Coefficient of Variation

The index of dispersion (IOD) is a measure used to determine the possibility of over-dispersion (under-dispersion) of the model under study. An IOD greater than one indicates over-dispersion, whereas an IOD lower than one indicates under-dispersion. Equi-dispersion is indicated when the IOD is equal to one. The expression for IOD of the DBHE distribution is

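$$ \mathrm{IOD}(X) = \frac{\mathrm{Var}(X)}{E(X)} = \frac{\exp(-\theta) \sum_{x=0}^{\infty} \frac{(2x + 1) \exp(-\theta x)}{1 + \theta + \theta x} - \left(\exp(-\theta) \sum_{x=0}^{\infty} \frac{\exp(-\theta x)}{1 + \theta + \theta x}\right)^2}{\exp(-\theta) \sum_{x=0}^{\infty} \frac{\exp(-\theta x)}{1 + \theta + \theta x}}. \tag{11} $$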

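Furthermore, the coefficient of variation (COV) is a measure of data variability and is commonly used to compare the variability of independent samples. The larger the COV, the more erratic the data. If $X$ follows the DBHE model, its COV may be represented as

$$ \mathrm{COV}(X) = \frac{[\mathrm{Var}(X)]^{1/2}}{E(X)} = \frac{\left[\exp(-\theta) \sum_{x=0}^{\infty} \frac{(2x + 1) \exp(-\theta x)}{1 + \theta + \theta x} - \left(\exp(-\theta) \sum_{x=0}^{\infty} \frac{\exp(-\theta x)}{1 + \theta + \theta x}\right)^2\right]^{1/2}}{\exp(-\theta) \sum_{x=0}^{\infty} \frac{\exp(-\theta x)}{1 + \theta + \theta x}}. \tag{12} $$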

The numerical values of IOD and COV are shown in Table 2 for a variety of model parameter values.

Table 2: Index of dispersion and coefficient of variation of the DBHE distribution for different values of $\theta$.

Measure $\downarrow$ $\theta \rightarrow$ 0.1 0.2 0.3 0.5 0.7 0.9 1 1.5 2

IOD 11.2853 7.3457 5.9583 4.7613 4.1991 3.8674 3.7481 3.3862 3.2143

COV 1.5566 2.0021 2.4818 3.5913 4.9680 6.6959 7.7212 15.0746 28.1402

From Table 2, it can be seen that as the parameter's value increases, the IOD decreases and the COV increases. Since the IOD exceeds 1 throughout, the proposed model is appropriate for modelling over-dispersed data.

3.4. Quantile Function

The point $x_q$ is known as the $q$th quantile of a discrete random variable $X$ if it satisfies $P(X \le x_q) \ge q$ and $P(X \ge x_q) \ge 1 - q$, that is, $F(x_q - 1) < q \le F(x_q)$ (see [19]).

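Using this result together with the CDF (5), the $q$th quantile of the DBHE distribution can be obtained as

$$ x_q = \lceil x^{*} \rceil, \quad \text{with } x^{*} \text{ the solution of } 1 - \frac{\exp(-\theta (x^{*} + 1))}{1 + \theta + \theta x^{*}} = q, \tag{13} $$

where $\lceil \cdot \rceil$ is the ceiling function that returns the smallest integer greater than or equal to its argument.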

A random integer can easily be sampled from the proposed distribution by using Equation (13) with $q$ drawn from the uniform distribution on the unit interval, $U(0, 1)$. In particular, setting $q = 0.5$ gives the median of the proposed distribution.
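The inverse-transform sampling described above can be sketched as follows. Rather than solving for the quantile in closed form, this illustration simply searches upward for the smallest integer whose CDF value reaches $q$, which is the defining property $F(x_q - 1) < q \le F(x_q)$. The function names, the seed and the sample size are assumptions made for the example.

```python
import math
import random

def dbhe_cdf(x, theta):
    """P(X <= x) from Equation (5)."""
    return 1.0 - math.exp(-theta * (x + 1)) / (1.0 + theta + theta * x)

def dbhe_quantile(q, theta):
    """Smallest integer x >= 0 with F(x) >= q, for 0 <= q < 1."""
    x = 0
    while dbhe_cdf(x, theta) < q:
        x += 1
    return x

def dbhe_sample(n, theta, rng):
    """Draw n values by inverse-transform sampling with uniform variates from rng."""
    return [dbhe_quantile(rng.random(), theta) for _ in range(n)]

theta = 0.55
median = dbhe_quantile(0.5, theta)
sample = dbhe_sample(20000, theta, random.Random(2022))
sample_mean = sum(sample) / len(sample)
```

The linear search is adequate for moderate $\theta$; for very small $\theta$ the quantiles become large and a bisection over $x$ would be a more economical design.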

3.5. Order Statistics

Order statistics have several applications in reliability engineering and life testing. Let $X_1, X_2, \ldots, X_n$ be a random sample from the DBHE distribution, and let $X_{(1)} \le X_{(2)} \le \ldots \le X_{(n)}$ denote the corresponding order statistics. Then the CDF of the $r$th order statistic, say $Z = X_{(r)}$, is given by

$$ F_r(z; \theta) = \sum_{i=r}^{n} \binom{n}{i} [F(z; \theta)]^{i} [1 - F(z; \theta)]^{n-i} = \sum_{i=r}^{n} \sum_{k=0}^{n-i} (-1)^k \binom{n}{i} \binom{n-i}{k} \left[1 - \frac{\exp(-\theta (z + 1))}{1 + \theta + \theta z}\right]^{i+k}. \tag{14} $$

The corresponding PMF of the $r$th order statistic is

$$ f_r(z) = F_r(z) - F_r(z - 1) = \sum_{i=r}^{n} \sum_{k=0}^{n-i} (-1)^k \binom{n}{i} \binom{n-i}{k} \left\{\left[1 - \frac{\exp(-\theta (z + 1))}{1 + \theta + \theta z}\right]^{i+k} - \left[1 - \frac{\exp(-\theta z)}{1 + \theta z}\right]^{i+k}\right\}. \tag{15} $$

In particular, putting $r = 1$ and $r = n$ in Equation (15) gives the PMF of the minimum, $\min\{X_{(1)}, X_{(2)}, \ldots, X_{(n)}\}$, and of the maximum, $\max\{X_{(1)}, X_{(2)}, \ldots, X_{(n)}\}$, respectively.

3.6. Survival Characteristics

The survival function of the proposed distribution is

$$ S(x; \theta) = P(X > x) = \frac{\exp(-\theta (x + 1))}{1 + \theta + \theta x}; \quad x = 0, 1, 2, \ldots $$

The hazard rate is a reliability characteristic that describes the system's failure behaviour over time. The discrete HRF of the DBHE distribution is given by

$$ h(x; \theta) = P(X = x \mid X \ge x) = \frac{P(X = x)}{S(x - 1; \theta)} = \frac{1 + \theta + \theta x - \exp(-\theta)(1 + \theta x)}{1 + \theta + \theta x}; \quad x = 0, 1, 2, \ldots, \tag{16} $$

provided that $S(x - 1; \theta) > 0$.

Figure 2 shows the HRF plots of the DBHE distribution for different values of $\theta$. It is noted that the HRF is decreasing.

Figure 2: The HRF plots of the DBHE model for different values of $\theta$.

The reverse hazard rate function of the DBHE distribution is given by

$$ h^{*}(x; \theta) = P(X = x \mid X \le x) = \frac{P(X = x)}{F(x; \theta)} = \frac{\left[\frac{1}{1 + \theta x} - \frac{\exp(-\theta)}{1 + \theta + \theta x}\right] \exp(-\theta x)}{1 - \frac{\exp(-\theta (x + 1))}{1 + \theta + \theta x}}. \tag{17} $$

The second rate of failure of the proposed model is given by

$$ h^{**}(x; \theta) = \log\left\{\frac{S(x - 1)}{S(x)}\right\} = \theta + \log(1 + \theta + \theta x) - \log(1 + \theta x). \tag{18} $$
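The decreasing shape of the hazard rate and the link between Equations (16) and (18) can be checked numerically. The sketch below is illustrative only; the function names and the value $\theta = 0.5$ are assumptions. It relies on the identity $h^{**}(x) = -\log\{1 - h(x)\}$, which follows from $S(x)/S(x - 1) = 1 - h(x)$.

```python
import math

def dbhe_pmf(x, theta):
    """P(X = x) from Equation (4)."""
    return (1.0 / (1.0 + theta * x)
            - math.exp(-theta) / (1.0 + theta + theta * x)) * math.exp(-theta * x)

def dbhe_hazard(x, theta):
    """Equation (16): P(X = x | X >= x)."""
    return 1.0 - math.exp(-theta) * (1.0 + theta * x) / (1.0 + theta + theta * x)

def dbhe_second_failure_rate(x, theta):
    """Equation (18): log{S(x - 1) / S(x)}."""
    return theta + math.log(1.0 + theta + theta * x) - math.log(1.0 + theta * x)

theta = 0.5
hazards = [dbhe_hazard(x, theta) for x in range(20)]
second_rates = [dbhe_second_failure_rate(x, theta) for x in range(20)]

# A decreasing failure rate means each hazard value is smaller than the previous one.
is_decreasing = all(b < a for a, b in zip(hazards, hazards[1:]))
```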

3.7. Mean Residual and Mean Past Lifetime

The mean residual life (MRL) function, which represents the ageing mechanism, is used in a wide variety of fields, including reliability engineering, survival analysis and biomedical research. It is well established in the literature that the MRL function uniquely characterises the distribution function $F$, since it contains all of the information about the model. In the discrete setup, the MRL, denoted by $m(i)$, may be defined as

$$ m(i) = E(X - i \mid X \ge i) = \frac{1}{S(i)} \sum_{j=i+1}^{\infty} S(j); \quad i = 0, 1, 2, \ldots, $$

where $S(\cdot)$ is the SF, taken here as $S(j) = P(X \ge j)$. If $X$ has the DBHE distribution with parameter $\theta$, then the MRL function of $X$ is

$$ m(i) = \frac{1 + \theta i}{\exp(-\theta i)} \sum_{j=i+1}^{\infty} \frac{\exp(-\theta j)}{1 + \theta j}. $$

The mean past life (MPL) function, also known as the expected inactivity time function (EITF) and denoted by $m^{*}(i)$, measures the time elapsed since the failure of $X$ given that the system has failed at some point before $i$. In the discrete setting, the MPL function can be defined as

$$ m^{*}(i) = E(i - X \mid X < i) = \frac{1}{F(i - 1)} \sum_{k=1}^{i} F(k - 1); \quad i = 1, 2, \ldots $$

Substituting the CDF (5) into the expression for $m^{*}(i)$ gives the MPL of the proposed model.

3.8. Stress-Strength Parameter

Stress-strength analysis has been used extensively in reliability modelling. Suppose the random variables $X$ and $Y$ denote the strength and the stress of a system, respectively (both taking non-negative values). Then the stress-strength reliability $R = P[X > Y]$ can be defined as

$$ R = P[X > Y] = \sum_{x=0}^{\infty} P_X(x)\, F_Y(x), $$

where $P_X(x)$ and $F_Y(x)$ denote, respectively, the PMF and CDF of the independent discrete random variables $X$ and $Y$. Let $X \sim \mathrm{DBHE}(\theta_2)$ and $Y \sim \mathrm{DBHE}(\theta_1)$; then $R$ for the DBHE distribution is

$$ R = \sum_{x=0}^{\infty} \left[1 - \frac{\exp(-\theta_1 (x + 1))}{1 + \theta_1 + \theta_1 x}\right] \left[\frac{1}{1 + \theta_2 x} - \frac{\exp(-\theta_2)}{1 + \theta_2 + \theta_2 x}\right] \exp(-\theta_2 x). \tag{19} $$

Since it is difficult to obtain an explicit expression for $R$, we evaluate it numerically for different values of $\theta_1$ and $\theta_2$. The numerical results are presented in Table 3.

Table 3: The numerical values of $R$ for different combinations of $\theta_1$ and $\theta_2$.

$\theta_1 \downarrow$ $\theta_2 \rightarrow$ 0.05 0.1 0.25 0.5 1 2 5

0.05 0.51381 0.35905 0.18823 0.09638 0.03710 0.00830 0.00020

0.1 0.66849 0.51372 0.30089 0.16324 0.06501 0.01478 0.00036

0.25 0.81902 0.69564 0.46859 0.27722 0.11696 0.02740 0.00067

0.5 0.87830 0.77949 0.56485 0.35267 0.15502 0.03718 0.00092

1 0.90091 0.81440 0.61111 0.39304 0.17724 0.04320 0.00107

2 0.90559 0.82202 0.62219 0.40351 0.18342 0.04496 0.00112

5 0.90593 0.82258 0.62304 0.40435 0.18394 0.04511 0.00112

From this table, we observe that for any fixed value of $\theta_1$, $R$ decreases as $\theta_2$ increases, whereas for a fixed value of $\theta_2$, $R$ increases as $\theta_1$ increases.

4. Analysis of complete data under DBHE distribution

In this section, we estimate the unknown parameter of the DBHE distribution using the MLE method. An algorithm for generating random data is presented. We also present numerical examples based on empirical and real-world datasets to demonstrate the utility of the proposed approach for evaluating complete data.

4.1. Maximum Likelihood Estimation with Complete Data

Let $x = (x_1, x_2, \ldots, x_n)$ be a random sample from the DBHE distribution. Then the log-likelihood function can be written as

$$ \log L(x; \theta) = -\theta \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} \log\left(\frac{1}{1 + \theta x_i} - \frac{\exp(-\theta)}{1 + \theta + \theta x_i}\right). \tag{20} $$

Differentiating Equation (20) with respect to the parameter $\theta$ gives the non-linear likelihood equation

$$ \sum_{i=1}^{n} \frac{\frac{\exp(-\theta)}{1 + \theta + \theta x_i} \left(\frac{1 + x_i}{1 + \theta + \theta x_i} + 1\right) - \frac{x_i}{(1 + \theta x_i)^2}}{\frac{1}{1 + \theta x_i} - \frac{\exp(-\theta)}{1 + \theta + \theta x_i}} - \sum_{i=1}^{n} x_i = 0. \tag{21} $$

The solution of Equation (21) gives the MLE of $\theta$. However, Equation (21) has no explicit solution and therefore has to be solved by iterative methods such as Newton-Raphson or Nelder-Mead.
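As an illustration of the numerical step, the sketch below maximises the log-likelihood (20) directly with a golden-section search instead of solving Equation (21); this is a substitute for the Newton-Raphson or Nelder-Mead routines named above, chosen only because it needs no external library. The search interval and function names are assumptions. The data are the carious-teeth frequencies reported for Dataset I in Section 4.3, with the three observations in the "4 or more" class taken to be exactly 4, an assumption that is consistent with the reported sample mean of 0.67.

```python
import math

def log_likelihood(theta, data):
    """Equation (20) for a complete sample."""
    return sum(
        -theta * x + math.log(1.0 / (1.0 + theta * x)
                              - math.exp(-theta) / (1.0 + theta + theta * x))
        for x in data
    )

def golden_section_max(f, lo, hi, tol=1e-8):
    """Maximise f on [lo, hi], assuming a single interior peak."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2.0

# Dataset I frequencies for 0, 1, 2, 3 and ">= 4" (the last class assumed equal to 4).
data = [0] * 64 + [1] * 17 + [2] * 10 + [3] * 6 + [4] * 3

theta_hat = golden_section_max(lambda t: log_likelihood(t, data), 1e-3, 5.0)
neg_log_lik = -log_likelihood(theta_hat, data)
```

The resulting estimate and negative log-likelihood can be compared with the DBHE column of Table 6.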

4.2. Numerical Illustration Using Simulated Data

In this subsection, we perform a Monte Carlo simulation study with 1,000 replications to show how well the MLE estimates the unknown parameter of the DBHE distribution. The true parameter values are taken as $\theta = 0.05$, $\theta = 0.25$ and $\theta = 0.5$; these values were chosen arbitrarily, and other values could equally be used. Random samples from the DBHE distribution are generated with sample sizes $n = 15, 20, 25, \ldots, 100$ using Equation (13). The simulation results are interpreted on the basis of the mean square errors (MSEs) and absolute biases (ABs), where

$$ \mathrm{MSE} = \frac{1}{1000} \sum_{j=1}^{1000} \left(\hat{\theta}_j - \theta\right)^2 \quad \text{and} \quad \mathrm{AB} = \frac{1}{1000} \sum_{j=1}^{1000} \left|\hat{\theta}_j - \theta\right|. $$

Here, $\hat{\theta}_j$ is the estimate of $\theta$ obtained in the $j$th replication.

The simulation results are graphically summarized and displayed in Figure 3.

Figure 3: Plots of the MSEs and ABs for different values of $\theta$ for complete data.

Figure 3 illustrates that the MSEs of the MLE tend to zero as $n$ increases, which indicates the consistency of the estimator. Furthermore, the ABs also decline towards zero as $n$ increases.

4.3. Real Data Analysis

In this section, we illustrate the utility of the DBHE distribution by examining two real-world datasets. Several criteria are used to compare the fitted models, including the negative log-likelihood (-logL), the Akaike information criterion (AIC), the Bayesian information criterion (BIC), the Hannan-Quinn information criterion (HQIC), and the Chi-square ($\chi^2$) statistic with its associated P-value. The descriptive summaries of the datasets are shown in Table 4. From this table, we can see that the IOD of both datasets is greater than 1, indicating that they call for discrete distributions that can accommodate over-dispersion. The models competing with the DBHE distribution are listed in Table 5.

Table 4: Descriptive statistics of the datasets.

| Data | n | Mean | Variance | Skewness | Kurtosis | IOD | COV |
|---|---|---|---|---|---|---|---|
| Dataset I | 100 | 0.67 | 1.1526 | 2.4697 | 4.532 | 1.7203 | 1.6024 |
| Dataset II | 400 | 0.5475 | 1.1256 | 9.7478 | 15.6829 | 2.0558 | 1.9378 |

Table 5: The competitive models of the DBHE distribution.

Distribution Abbreviation Parameter(s) Author(s)

Geometric Geo e -

Discrete Lindley DLi A [20]

Discrete Lindley-Two Parameter DLi-II P, ß [21]

Discrete Pareto DPa ß [5]

Discrete linear failure rate DLFR A1A2 [22]

Discrete inverse Weibull DIW a,ß [6]

Discrete log-logistic DLogL S,A [23]

Discrete Nielsen DN p, e [24]

Negative Binomial NB v® -

Zero-Inflated Negative Binomial ZINB V, 0, œ -

Poisson- Lindley PL e [25]

Generalized Poisson-Lindley GPL e, a [26]

Dataset I: The first dataset consists of the total number of carious teeth among the four deciduous molars in a sample of 100 children aged 10 and 11 years [5]. The expected frequencies of the fitted models, along with their MLEs, standard errors (SEs), -logL values and goodness-of-fit measures, are presented in Table 6. Since the values of -logL, the $\chi^2$ test statistic, AIC, BIC, CAIC and HQIC for the DBHE distribution are the smallest among the models considered, the new distribution appears to be a very suitable model for this dataset. Similarly, the higher P-value of the $\chi^2$ statistic for the DBHE distribution shows its advantage over the other candidate models in terms of model fit.

Table 6: The MLEs (SEs) and goodness-of-fit statistics for different models under Dataset I.

| X | Observed frequency | DBHE | Geo | DLi | DLi-II | DPa | DLFR | DIW | DLogL |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 64 | 62.80 | 59.88 | 57.13 | 59.88 | 69.04 | 59.9 | 63.3 | 62.73 |
| 1 | 17 | 21.37 | 24.02 | 26.88 | 24.02 | 15.37 | 24.01 | 22.48 | 22.42 |
| 2 | 10 | 8.60 | 9.64 | 10.45 | 9.64 | 6.01 | 9.63 | 6.44 | 7.01 |
| 3 | 6 | 3.78 | 3.87 | 3.71 | 3.87 | 3.01 | 3.86 | 2.76 | 2.98 |
| ≥4 | 3 | 3.45 | 2.59 | 1.83 | 2.59 | 6.57 | 2.6 | 5.02 | 4.86 |
| Total | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| MLE (SE) | | 0.55043 (0.064) | 0.59879 (0.038) | 0.274 (0.029) | 0.401 (0.269), 0.478 (0.529) | 0.184 (0.032) | 0.401 (0.056), 1.0 (0.044) | 0.633 (0.049), 1.576 (0.251) | 0.745 (0.101), 1.768 (0.267) |
| -logL | | 112.328 | 112.474 | 113.68 | 112.475 | 116.83 | 112.470 | 116.275 | 115.470 |
| $\chi^2$ | | 1.575 | 3.347 | 6.638 | 3.347 | 3.225 | 3.340 | 3.503 | 2.783 |
| D.F. | | 2 | 2 | 2 | 1 | 2 | 1 | 1 | 1 |
| P-value | | 0.455 | 0.188 | 0.036 | 0.067 | 0.199 | 0.068 | 0.061 | 0.095 |
| AIC | | 226.656 | 226.947 | 229.36 | 228.950 | 235.66 | 228.940 | 236.550 | 234.940 |
| BIC | | 229.261 | 229.552 | 232.96 | 234.160 | 238.27 | 234.150 | 241.760 | 240.150 |
| CAIC | | 226.697 | 226.988 | 229.39 | 229.073 | 235.70 | 229.063 | 236.673 | 235.063 |
| HQIC | | 227.710 | 228.001 | 230.41 | 231.058 | 236.72 | 231.048 | 238.658 | 237.048 |

Dataset II: The second dataset represents the number of chromatid aberrations in 24 hours [28]. The expected frequencies of the fitted models, along with their MLEs, SEs, -logL values and goodness-of-fit measures, are presented in Table 7. Comparing the values of -logL, the $\chi^2$ test statistic, P-value, AIC, BIC, CAIC and HQIC, we again find that the DBHE distribution fits this dataset better than the other five models under study.

Table 7: The MLEs (SEs) and goodness-of-fit statistics for different models under Dataset II.

| X | Observed frequency | DBHE | DN | NB | ZINB | PL | GPL |
|---|---|---|---|---|---|---|---|
| 0 | 268 | 269.36 | 270.14 | 270.18 | 270.18 | 257.02 | 269.24 |
| 1 | 87 | 80.48 | 79.40 | 78.55 | 78.55 | 93.39 | 78.70 |
| 2 | 26 | 29.28 | 29.21 | 29.84 | 29.84 | 32.76 | 30.86 |
| 3 | 9 | 11.76 | 11.88 | 12.22 | 12.22 | 11.21 | 12.55 |
| 4 | 4 | 5.01 | 5.11 | 5.19 | 5.19 | 3.77 | 5.13 |
| 5 | 2 | 2.22 | 2.28 | 2.25 | 2.25 | 1.25 | 2.09 |
| 6 | 1 | 0.90 | 1.05 | 0.99 | 0.99 | 0.41 | 0.85 |
| 7 | 3 | 0.47 | 0.93 | 0.78 | 0.78 | 0.13 | 0.35 |
| Total | 400 | 400 | 400 | 400 | 400 | 400 | 400 |
| MLE (SE) | | 0.63026 (0.037) | 0.5301 (0.0601), 1.1089 (0.2179) | 0.5475 (0.1539), 0.6200 (0.1270) | 0.5475 (0.1701), 0.6200 (0.3383), 0.00008 (0.2989) | 2.379 (0.169) | 1.576 (0.259), 0.473 (0.159) |
| -logL | | 399.342 | 399.410 | 399.860 | 399.860 | 399.857 | 400.553 |
| $\chi^2$ | | 1.781 | 1.924 | 2.416 | 2.416 | 6.283 | 2.940 |
| D.F. | | 3 | 2 | 2 | 1 | 3 | 2 |
| P-value | | 0.619 | 0.382 | 0.299 | 0.120 | 0.098 | 0.229 |
| AIC | | 800.683 | 802.820 | 803.720 | 805.720 | 801.714 | 805.106 |
| BIC | | 804.675 | 810.803 | 811.703 | 817.694 | 805.706 | 813.089 |
| CAIC | | 800.693 | 802.850 | 803.750 | 805.781 | 801.724 | 805.136 |
| HQIC | | 802.264 | 805.981 | 806.881 | 810.462 | 803.295 | 808.267 |

5. Analysis of randomly censored data under DBHE distribution

In this section, we derive the MLE of the unknown parameter of the DBHE distribution for randomly right-censored data. An algorithm for generating randomly right-censored data from the DBHE model is presented. We also present numerical examples based on simulated and real-world datasets to show the usefulness of the proposed approach for evaluating randomly censored data.

5.1. Maximum Likelihood Estimation with Randomly Censored Data

Because right-censored observations are present, the contribution of the $i$th individual to the likelihood function based on a random sample $(x_i, d_i)$ of size $n$ is given by

$$ L_i = [f(x_i)]^{d_i} [S(x_i)]^{1 - d_i}, $$

where $d_i$ is a censoring indicator variable, that is, $d_i = 1$ for an observed lifetime and $d_i = 0$ for a censored lifetime ($i = 1, 2, \ldots, n$). Assuming the DBHE model, the likelihood function for $\theta$ is given by

$$ L(\theta \mid x, d) = \prod_{i=1}^{n} \left\{\left[\frac{1}{1 + \theta x_i} - \frac{\exp(-\theta)}{1 + \theta + \theta x_i}\right] \exp(-\theta x_i)\right\}^{d_i} \left\{\frac{\exp(-\theta x_i)}{1 + \theta x_i}\right\}^{1 - d_i}, \tag{22} $$

where $d = (d_1, d_2, \ldots, d_n)$. The corresponding log-likelihood function is

$$ \log L(\theta \mid x, d) = \sum_{i=1}^{n} d_i \log\left\{\frac{1}{1 + \theta x_i} - \frac{\exp(-\theta)}{1 + \theta + \theta x_i}\right\} + \sum_{i=1}^{n} (d_i - 1) \log(1 + \theta x_i) - \theta \sum_{i=1}^{n} x_i. \tag{23} $$

Taking the first derivative of Equation (23) with respect to $\theta$ and setting it equal to zero gives the likelihood equation for the parameter $\theta$. Since this equation does not admit a closed-form solution, an appropriate numerical method, such as Newton-Raphson iteration, is used to obtain the MLE of $\theta$.

5.2. Algorithm to Simulate Random Right-Censored Data

We now present a simple approach for generating randomly right-censored data from the suggested model. The algorithm is as follows:

Step 1: Fix the value of the parameter $\theta$.

Step 2: Draw $n$ pseudo-random numbers from the uniform distribution, i.e. $u_i \sim U(0, 1)$; $i = 1, 2, \ldots, n$.

Step 3: Obtain $x'_i = F^{-1}(u_i; \theta)$; $i = 1, 2, \ldots, n$, where $F^{-1}(\cdot)$ is defined in Equation (13).

Step 4: Draw $n$ pseudo-random numbers $c_i \sim U(0, \max(x'_i))$; $i = 1, 2, \ldots, n$. This is the distribution that controls the censoring mechanism.

Step 5: If $x'_i \le c_i$, then set $x_i = [x'_i]$ and $d_i = 1$; otherwise set $x_i = [c_i]$ and $d_i = 0$, for $i = 1, 2, \ldots, n$. The pairs $(x_1, d_1), (x_2, d_2), \ldots, (x_n, d_n)$ obtained in this way form the randomly right-censored data.
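Steps 1 to 5 can be sketched in a few lines. In this illustration the quantile is found by a simple upward search on the CDF (5), and the bracket $[\cdot]$ applied to the censoring time is taken to be the integer part (floor); both choices, as well as the function names, the seed and the sample size, are assumptions rather than details stated in the algorithm.

```python
import math
import random

def dbhe_cdf(x, theta):
    """P(X <= x) from Equation (5)."""
    return 1.0 - math.exp(-theta * (x + 1)) / (1.0 + theta + theta * x)

def dbhe_quantile(u, theta):
    """Smallest integer x >= 0 with F(x) >= u (Step 3)."""
    x = 0
    while dbhe_cdf(x, theta) < u:
        x += 1
    return x

def simulate_censored(n, theta, rng):
    """Return a list of (x_i, d_i) pairs; d_i = 1 observed, d_i = 0 censored."""
    # Steps 2 and 3: uniform variates mapped to DBHE lifetimes.
    lifetimes = [dbhe_quantile(rng.random(), theta) for _ in range(n)]
    upper = max(lifetimes)
    pairs = []
    for lifetime in lifetimes:
        # Step 4: censoring time from U(0, max lifetime).
        censor_time = rng.uniform(0, upper)
        # Step 5: keep the lifetime if it occurs no later than the censoring time.
        if lifetime <= censor_time:
            pairs.append((lifetime, 1))
        else:
            pairs.append((math.floor(censor_time), 0))  # floor is an assumption
    return pairs

sample = simulate_censored(200, 0.25, random.Random(1))
censored_share = sum(1 for _, d in sample if d == 0) / len(sample)
```

Because the censoring times are uniform on an interval that ends at the largest simulated lifetime, the censoring proportion is not fixed in advance; it depends on $\theta$ and on the sample itself.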

5.3. Numerical Illustration Using Simulated Random Right-Censored Data

This subsection presents a simulation study to evaluate the performance of the MLE using randomly right-censored data. The study is based on random samples from the DBHE distribution of sizes $n = 20, 25, \ldots, 100$. The values of $\theta$ are set to 0.05, 0.25 and 0.50. The procedure described above is used to generate the required randomly right-censored data. All simulation findings are based on 1,000 replications for each combination of parameter value and sample size. From these 1,000 values, we estimate the MSE and AB of the parameter estimate, and the resulting graphs are given in Figure 4.

Figure 4: Plots of the MSEs and ABs for different values of $\theta$ under censored data.

As seen in Figure 4, the MSEs of the MLE approach zero as $n$ increases, which illustrates the estimator's consistency. The ABs also tend to zero as $n$ increases.

5.4. Application to Real Data Analysis

Here, we examine two real datasets to illustrate the applicability of the DBHE model to randomly

censored data. The following datasets and their fitting are described as follows:

Dataset III: This dataset is obtained from [29]. The data below are remission times, in weeks, for a group of 30 patients with leukaemia who received similar treatment.

1, 1, 2, 4, 4, 6, 6, 6, 7, 8, 9, 9, 10, 12, 13, 14, 18, 19, 24, 26, 29, 31*, 42, 45*, 50*, 57, 60, 71*, 85*, 91.

The observations with asterisks indicate censored times. The MLE (SE) of θ for the given dataset is 0.0201 (0.0008). We used the Kolmogorov-Smirnov (K-S) test to check whether the given data follow the DBHE distribution. The calculated value of the K-S statistic is 0.13333 and the P-value is 0.9525. These values indicate that the DBHE distribution can be used to model this data.
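For readers who want to work with Dataset III in the (x_i, d_i) form produced by the algorithm of Section 5.2, the following sketch converts the asterisk notation into pairs, with d_i = 0 for censored times and d_i = 1 otherwise. The helper name parse_censored and the variable names are illustrative only.

```python
RAW_DATASET_III = (
    "1, 1, 2, 4, 4, 6, 6, 6, 7, 8, 9, 9, 10, 12, 13, 14, 18, 19, 24, 26, "
    "29, 31*, 42, 45*, 50*, 57, 60, 71*, 85*, 91"
)


def parse_censored(raw):
    """Turn 'value' / 'value*' tokens into (time, indicator) pairs."""
    pairs = []
    for token in raw.split(","):
        token = token.strip()
        # A trailing asterisk marks a censored time (indicator 0).
        censored = token.endswith("*")
        pairs.append((int(token.rstrip("*")), 0 if censored else 1))
    return pairs


dataset_iii = parse_censored(RAW_DATASET_III)
n_censored = sum(1 for _, d in dataset_iii if d == 0)
```

Here dataset_iii contains the 30 remission times, and n_censored counts the asterisked observations.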

Dataset IV: Here, we analyze another real dataset obtained from [29]. The data below show survival times (in months) of patients with Hodgkin's disease who were treated with nitrogen mustards.

1.05, 2.92, 3.61, 4.20, 4.49, 6.72, 7.31, 9.08, 9.11, 14.49*, 16.85, 18.82*, 26.59*, 30.26*, 41.34*.

The observations with asterisks represent censored times. For this dataset, the MLE (SE) of θ is 0.0311 (0.0027). We also performed the K-S test to check whether the data follow the DBHE distribution; the K-S statistic is 0.2 with a P-value of 0.9383. Thus, the DBHE distribution fits the data very well.

6. Conclusions

In this paper, we have proposed the discrete Burr-Hatke exponential (DBHE) distribution. Although it has only one parameter, the model is flexible in fitting, as it is capable of modelling right-skewed, decreasing failure rate, and over-dispersed count datasets. Some of its fundamental properties have been discussed in detail. The unknown parameter of the DBHE distribution has been estimated under complete and censored data using the maximum likelihood approach. We have provided an algorithm to generate randomly right-censored data. Additionally, the performance of the estimator under complete and censored data has been examined through an extensive simulation study. Finally, the flexibility of the DBHE distribution has been demonstrated empirically using four real-life applications consisting of two complete and two censored datasets. Hence, we conclude that the proposed model can serve a wide spectrum of applications in domains such as medicine, reliability, and survival analysis.

Acknowledgement

The authors would like to express their gratitude to the editor for the careful handling of the paper and for giving us the opportunity to improve it.

References

[1] Roy, D. (2003). The discrete normal distribution. Commun. Statist. Theor. Meth. 32(10):1871-1883.

[2] Chakraborty, S. (2015). Generating discrete analogues of continuous probability distributions-A survey of methods and constructions. Journal of Statistical Distributions and Applications, 2(1), 6.

[3] Roy, D. (2004). Discrete Rayleigh distribution. IEEE Trans. Reliab. 53:255-260.

[4] Krishna, H., & Pundir, P. S. (2007). Discrete maxwell distribution. InterStat, 3.

[5] Krishna, H., & Pundir, P. S. (2009). Discrete Burr and discrete Pareto distributions. Statistical Methodology, 6(2), 177-188.

[6] Jazi, M. A., Lai, C. D., & Alamatsaz, M. H. (2010). A discrete inverse Weibull distribution and estimation of its parameters. Statistical Methodology, 7(2), 121-132.

[7] Al-Huniti, A. A., & AL-Dayian, G. R. (2012). Discrete Burr type III distribution. American Journal of Mathematics and Statistics, 2(5), 145-152.

[8] Alamatsaz, M. H., Dey, S., Dey, T., & Harandi, S. S. (2016). Discrete generalized Rayleigh distribution. Pakistan journal of statistics, 32(1).

[9] Jayakumar, K., & Babu, M. G. (2018). Discrete Weibull geometric distribution and its properties. Communications in Statistics-Theory and Methods, 47(7), 1767-1783.

[10] Tyagi, A., Choudhary, N., & Singh, B. (2019). Discrete additive Perks-Weibull distribution: Properties and applications. Life Cycle Reliability and Safety Engineering, 8(3), 183-199.

[11] Tyagi, A., Choudhary, N., & Singh, B. (2020). A new discrete distribution: Theory and applications to discrete failure lifetime and count data. J. Appl. Probab. Statist, 15, 117-143.

[12] El-Morshedy, M., Eliwa, M. S., & Tyagi, A. (2021). A discrete analogue of odd Weibull-G family of distributions: properties, classical and Bayesian estimation with applications to count data. Journal of Applied Statistics, 1-25.

[13] Garg, R., Dube, M., & Krishna, H. (2020). Estimation of parameters and reliability characteristics in Lindley distribution using randomly censored data. Statistics, Optimization & Information Computing, 8(1), 80-97.

[14] Krishna, H., & Goel, N. (2017). Maximum likelihood and Bayes estimation in randomly censored geometric distribution. Journal of Probability and Statistics, 2017.

[15] Achcar, J. A., Martinez, E. Z., de Freitas, B. C. L., & de Oliveira Peres, M. V. (2021). Classical and Bayesian inference approaches for the exponentiated discrete Weibull model with censored data and a cure fraction. Pakistan Journal of Statistics and Operation Research, 467-481.

[16] Singh, B., Singh, R. P., Nayal, A. S., & Tyagi, A. (2022). Discrete Inverted Nadarajah-Haghighi Distribution: Properties and Classical Estimation with Application to Complete and Censored data. Statistics, Optimization & Information Computing, 10(4), 1293-1313.

[17] Yadav, A. S., Altun, E., & Yousof, H. M. (2019). Burr-Hatke Exponential Distribution: A Decreasing Failure Rate Model Statistical Inference and Applications. Annals of Data Science, 1-20.

[18] Gupta, P. L., Gupta, R. C., & Tripathi, R. C. (1997). On the monotonic properties of discrete failure rates. Journal of Statistical Planning and Inference, 65(2), 255-268.

[19] Rohatgi, V. K., & Saleh, A. M. E. (2015). An introduction to probability and statistics. John Wiley & Sons.

[20] Gomez-Deniz, E., & Calderin-Ojeda, E. (2011). The discrete Lindley distribution: properties and applications. Journal of Statistical Computation and Simulation, 81(11), 1405-1416.

[21] Hussain, T., Aslam, M., & Ahmad, M. (2016). A two parameter discrete Lindley distribution. Revista Colombiana de Estadística, 39(1), 45-61.

[22] Kumar, C., Tripathi, Y. M., & Rastogi, M. K. (2017). On a discrete analogue of linear failure rate distribution. American journal of mathematical and management sciences, 36(3), 229-246.

[23] Para, B. A., & Jan, T. R. (2016). Discrete version of log-logistic distribution and its applications in genetics. Int. J. Mod. Math. Sci, 14(4), 407-422.

[24] Castellares, F., Lemonte, A. J., & Santos, M. A. (2020). On the Nielsen distribution. Brazilian Journal of Probability and Statistics, 34(1), 90-111.

[25] Sankaran, M. (1970). 275. Note: The discrete Poisson-Lindley distribution. Biometrics, 145-149.

[26] Mahmoudi, E., & Zakerzadeh, H. (2010). Generalized Poisson-Lindley distribution. Communications in Statistics-Theory and Methods, 39(10), 1785-1798.

[27] Phyo, I. (1973). Use of a Chain Binomial in the Epidemiology of Caries. Journal of dental research, 52(4), 750-752.

[28] Castellares, F., Lemonte, A. J., & Santos, M. A. (2020). On the Nielsen distribution. Brazilian Journal of Probability and Statistics, 34(1), 90-111.

[29] Lawless, J. F. (2011). Statistical models and methods for lifetime data (Vol. 362). John Wiley & Sons.
