ESTIMATION OF DIFFERENT ENTROPIES OF INVERSE RAYLEIGH DISTRIBUTION UNDER MULTIPLE
CENSORED DATA
Hemani Sharma and Parmil Kumar
Department of Statistics, University of Jammu, J&K. [email protected], [email protected]
Abstract
The inverse Rayleigh distribution finds widespread application in life testing and reliability research, particularly in scenarios involving multiply censored data. In this context, the Renyi, Havrda and Charvat, and Tsallis entropies of the inverse Rayleigh distribution are derived. The maximum likelihood approach is used to obtain the entropy estimators as well as approximate confidence intervals. The mean squared errors, approximate confidence intervals, and their average lengths are computed. To illuminate the behavior of the estimates across varying sample sizes, a comprehensive simulation study is conducted. Its outcomes consistently reveal a downward trend in mean squared errors and average lengths as the sample size increases. Additionally, as the censoring level diminishes, the entropy estimators progressively converge towards their true values. For practical demonstration, the effectiveness of the approach is showcased through the analysis of two real-world datasets. These applications underscore the real-world relevance of the methodology, further validating its utility in addressing complex scenarios involving censored data and the inverse Rayleigh distribution.
Keywords: inverse Rayleigh distribution, Renyi entropy, Havrda and Charvat entropy, Tsallis entropy, multiple censored.
1. Introduction
The concept of entropy measurement is essential in many fields, including statistics, economics, and physical, chemical, and biological phenomena. Entropy was first proposed as a thermodynamic state variable in classical thermodynamics, and it rests on principles from probability theory and mathematical statistics. Although the term information theory does not have a precise meaning, it can be thought of as the study of problems involving any probabilistic system. Entropy is the amount of information contained in a sample. One of the most important aspects of statistics is the study of probability distributions, and every probability distribution contains some element of uncertainty. Entropy provides a quantitative estimate of this uncertainty: it measures the disorder or randomness of a probabilistic system with a large number of equally likely states, and it is zero when the system is in a specified state with no uncertainty. In other words, the entropy of a random variable measures the amount of information required, on average, to describe that variable. Shannon [13] established the concept of entropy as a measure of information. Here, we focus our attention on three entropy measures: the Renyi [11], Havrda and Charvat [7], and Tsallis [15] entropies. The Renyi entropy [11] comes from information theory, whereas the Tsallis entropy [15] comes from statistical physics, and both have a wide range of applications in their respective fields. These three entropy measures are defined, accordingly, for
an arbitrary variable X with Probability Density Function (PDF) f(x; φ), where φ denotes the corresponding parameters:

R_\delta(X; \varphi) = \frac{1}{1-\delta} \log \left[ \int_{-\infty}^{\infty} f(x; \varphi)^{\delta} \, dx \right],    (1)

where δ ≠ 1 and δ > 0,

HC_\delta(X; \varphi) = \frac{1}{2^{1-\delta} - 1} \left[ \int_{-\infty}^{\infty} f(x; \varphi)^{\delta} \, dx - 1 \right],    (2)

where δ ≠ 1 and δ > 0, and

T_\delta(X; \varphi) = \frac{1}{\delta - 1} \left[ 1 - \int_{-\infty}^{\infty} f(x; \varphi)^{\delta} \, dx \right],    (3)

where δ ≠ 1 and δ > 0.
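All three measures are simple functionals of the integral of f^δ, so they translate directly into code. Below is a minimal R sketch (the function name entropy_measures is ours, not from any package) that evaluates the three definitions for an arbitrary density by numerical quadrature:

# Numerical evaluation of the Renyi (1), Havrda-Charvat (2) and Tsallis (3)
# entropies of a density f; the function name is illustrative only.
entropy_measures <- function(f, delta, lower = -Inf, upper = Inf) {
  stopifnot(delta > 0, delta != 1)
  I <- integrate(function(x) f(x)^delta, lower, upper)$value  # integral of f^delta
  list(renyi   = log(I) / (1 - delta),
       havrda  = (I - 1) / (2^(1 - delta) - 1),
       tsallis = (1 - I) / (delta - 1))
}

# Example: for the standard normal with delta = 2, the Renyi entropy
# should equal log(2 * sqrt(pi)), approximately 1.2655.
entropy_measures(dnorm, delta = 2)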
Lord Rayleigh [12] initially proposed the Rayleigh distribution in relation to an acoustic problem. Since then, a great deal of work has been done with this distribution in numerous domains of science and technology. An important property of the Rayleigh distribution is that its hazard function is an increasing function of time. If the random variable Y has a Rayleigh distribution, the random variable X = 1/Y has an inverse Rayleigh distribution (IRD). Trayer [14] proposed the inverse Rayleigh distribution. The IRD is used in a variety of applications, including life testing and reliability studies. A random variable X is said to have an inverse Rayleigh distribution if its PDF and CDF have the following forms:
f(x; \sigma) = \frac{2\sigma^2}{x^3} \exp\left[ -\left( \frac{\sigma}{x} \right)^2 \right]; \quad x > 0, \ \sigma > 0,    (4)

F(x; \sigma) = \exp\left[ -\left( \frac{\sigma}{x} \right)^2 \right]; \quad x > 0, \ \sigma > 0.    (5)
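Since the CDF (5) is invertible, inverse-transform sampling gives X = σ/√(−log U) for U ~ Uniform(0,1). A minimal R sketch, with our own (hypothetical) function names dinvray, pinvray and rinvray:

# PDF (4), CDF (5) and an inverse-transform sampler for the IR distribution;
# function names are hypothetical, not from an existing package.
dinvray <- function(x, sigma) (2 * sigma^2 / x^3) * exp(-(sigma / x)^2)
pinvray <- function(x, sigma) exp(-(sigma / x)^2)
rinvray <- function(n, sigma) sigma / sqrt(-log(runif(n)))  # solves F(x) = u

# Quick check that (4) integrates to one:
integrate(dinvray, 0, Inf, sigma = 1.2)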
Wong and Chan [17] explored the entropy of ordered sequences and order statistics. The entropy of upper record values was studied by Baratpour et al. [4], whereas the entropy of lower record values was proposed by Morabbi and Razmkhah [10]. Abo-Eleneen [1] discussed the entropy of progressively censored samples, Cho et al. [5] estimated the entropy for the Rayleigh distribution via doubly-generalized Type II hybrid censored samples using maximum likelihood and Bayes estimators, and Hassan and Zaky [6] investigated point and interval estimation of the Shannon entropy for the inverse Weibull distribution under multiple censored data. Bantan et al. [3] used multiple censored data to derive the Renyi and q-entropy for the inverse Lomax distribution. To measure the Lomax distribution's dynamic cumulative residual Renyi entropy, Al-Babtain et al. [2] explored Bayesian and non-Bayesian techniques.
However, the estimation of entropy measures for the inverse Rayleigh (IR) distribution, such as the Renyi, Havrda and Charvat, and Tsallis entropies, is still an unresolved subject. This study fills the gap by examining the problem in the context of multiple censored data, a common scenario in which several censoring levels are naturally present, as in many life-testing and survival-analysis settings. In our study, the Renyi, Havrda and Charvat, and Tsallis entropies are derived after obtaining the maximum likelihood estimator of σ. A comprehensive numerical analysis is carried out, demonstrating that the derived estimates behave well across a range of sample sizes. The mean squared errors, approximate confidence intervals, and associated average lengths are considered as benchmarks. According to our numerical findings, the values of the mean squared errors and average lengths decrease as the sample size rises. Furthermore, as the censoring level is reduced, the Renyi, Havrda and Charvat, and Tsallis entropy estimates approach their true values. The findings are illustrated using two real-life data sets.
The rest of the article is organised as follows: Section 2 gives the Renyi, Havrda and Charvat, and Tsallis entropies for the inverse Rayleigh (IR) distribution. Section 3 focuses on how they can be estimated using multiple censored data. Section 4 contains the simulation and numerical results. Section 5 demonstrates how the method can be applied to real-world data sets. Section 6 ends with some concluding comments.
2. Expressions of the Renyi, Havrda and Charvat, and Tsallis entropies
Let X be a random variable following the IR distribution with parameter σ. Using (1) and (4), the Renyi entropy of X with φ = σ is given as

R_\delta(X; \sigma) = \frac{1}{1-\delta} \log \left[ \int_0^{\infty} \left( \frac{2\sigma^2}{x^3} \right)^{\delta} \exp\left( -\delta \left( \frac{\sigma}{x} \right)^2 \right) dx \right].

Put \frac{\sigma}{x} = y \Rightarrow x = \frac{\sigma}{y} \Rightarrow dx = -\frac{\sigma}{y^2} \, dy, so that

R_\delta(X; \sigma) = \frac{1}{1-\delta} \log \left[ \frac{2^{\delta}}{\sigma^{\delta-1}} \int_0^{\infty} y^{3\delta-2} \exp(-\delta y^2) \, dy \right].    (6)

Put y^2 = t \Rightarrow y = \sqrt{t} \Rightarrow dy = \frac{1}{2\sqrt{t}} \, dt. Then

R_\delta(X; \sigma) = \frac{1}{1-\delta} \log \left[ \frac{2^{\delta-1}}{\sigma^{\delta-1}} \int_0^{\infty} t^{\frac{3\delta-1}{2} - 1} \exp(-\delta t) \, dt \right],

and recognising the gamma integral \int_0^{\infty} t^{\frac{3\delta-1}{2}-1} \exp(-\delta t) \, dt = \Gamma\left( \frac{3\delta-1}{2} \right) \big/ \delta^{\frac{3\delta-1}{2}} yields

R_\delta(X; \sigma) = \frac{1}{1-\delta} \log \left[ \left( \frac{2}{\sigma} \right)^{\delta-1} \frac{\Gamma\left( \frac{3\delta-1}{2} \right)}{\delta^{\frac{3\delta-1}{2}}} \right],    (7)

with δ ≠ 1, δ > 0 and 3δ − 1 > 0.
Similarly, on using (2) and (3) together with the integral evaluated in the derivation of (7), the Havrda and Charvat entropy and the Tsallis entropy of X are given by

HC_\delta(X; \sigma) = \frac{1}{2^{1-\delta} - 1} \left[ \int_0^{\infty} \left( \frac{2\sigma^2}{x^3} \right)^{\delta} \exp\left( -\delta \left( \frac{\sigma}{x} \right)^2 \right) dx - 1 \right] = \frac{1}{2^{1-\delta} - 1} \left[ \left( \frac{2}{\sigma} \right)^{\delta-1} \frac{\Gamma\left( \frac{3\delta-1}{2} \right)}{\delta^{\frac{3\delta-1}{2}}} - 1 \right],    (8)

with δ ≠ 1, δ > 0 and 3δ − 1 > 0, and

T_\delta(X; \sigma) = \frac{1}{\delta - 1} \left[ 1 - \int_0^{\infty} \left( \frac{2\sigma^2}{x^3} \right)^{\delta} \exp\left( -\delta \left( \frac{\sigma}{x} \right)^2 \right) dx \right] = \frac{1}{\delta - 1} \left[ 1 - \left( \frac{2}{\sigma} \right)^{\delta-1} \frac{\Gamma\left( \frac{3\delta-1}{2} \right)}{\delta^{\frac{3\delta-1}{2}}} \right],    (9)

with δ ≠ 1, δ > 0 and 3δ − 1 > 0.
The expressions of the Renyi, Havrda and Charvat, and Tsallis entropies of X, stated simply as functions of the parameter σ, are given by Equations (7), (8), and (9), respectively.
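As a sanity check, the closed forms (7)-(9) can be compared with direct numerical integration of f(x; σ)^δ. A short R sketch, reusing dinvray from the sketch in Section 1 (the helper name ir_entropies is ours):

# Closed-form entropies (7)-(9) of the IR distribution.
ir_entropies <- function(sigma, delta) {
  stopifnot(delta > 0, delta != 1, 3 * delta - 1 > 0)
  I <- (2 / sigma)^(delta - 1) * gamma((3 * delta - 1) / 2) /
       delta^((3 * delta - 1) / 2)           # integral of f(x; sigma)^delta
  c(renyi   = log(I) / (1 - delta),          # equation (7)
    havrda  = (I - 1) / (2^(1 - delta) - 1), # equation (8)
    tsallis = (1 - I) / (delta - 1))         # equation (9)
}

# The closed-form integral should agree with quadrature, e.g. for
# sigma = 1.2 and delta = 1.5:
integrate(function(x) dinvray(x, 1.2)^1.5, 0, Inf)$value
(2 / 1.2)^0.5 * gamma(1.75) / 1.5^1.75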
3. Entropy Estimation
Let X be a random variable with PDF and CDF f(x; φ) and F(x; φ), respectively. We observe n values x_1, x_2, ..., x_n based on n units under a given test, where n_f and n_m are the numbers of failed and censored units, respectively. The likelihood function for φ is as follows:
L(\varphi) = K \prod_{i=1}^{n} [f(x_i; \varphi)]^{\xi_{i,f}} \, [1 - F(x_i; \varphi)]^{\xi_{i,m}},    (10)
where K is a constant.
\xi_{i,f} = 1 if the ith unit failed, and 0 otherwise (so \sum_{i=1}^{n} \xi_{i,f} = n_f); \xi_{i,m} = 1 if the ith unit is censored, and 0 otherwise (so \sum_{i=1}^{n} \xi_{i,m} = n_m).
By inserting (4) and (5) in (10), the likelihood function of the IR distribution based on multiple censored samples is given by
L(\sigma) = K \prod_{i=1}^{n} \left[ \frac{2\sigma^2}{x_i^3} \exp\left( -\left( \frac{\sigma}{x_i} \right)^2 \right) \right]^{\xi_{i,f}} \left[ 1 - \exp\left( -\left( \frac{\sigma}{x_i} \right)^2 \right) \right]^{\xi_{i,m}}.    (11)
The log-likelihood function is given by

\log L(\sigma) = \log K + 2 \sum_{i=1}^{n} \xi_{i,f} \log(\sigma\sqrt{2}) - \sum_{i=1}^{n} \xi_{i,f} \log(x_i^3) - \sum_{i=1}^{n} \xi_{i,f} \left( \frac{\sigma}{x_i} \right)^2 + \sum_{i=1}^{n} \xi_{i,m} \log\left[ 1 - \exp\left( -\left( \frac{\sigma}{x_i} \right)^2 \right) \right].

The MLE is obtained by maximizing L(σ) with respect to σ, i.e., by solving

\frac{\partial \log L(\sigma)}{\partial \sigma} = \frac{2 n_f}{\sigma} - \sum_{i=1}^{n} \xi_{i,f} \, \frac{2\sigma}{x_i^2} + \sum_{i=1}^{n} \xi_{i,m} \, \frac{\frac{2\sigma}{x_i^2} \exp\left( -\left( \frac{\sigma}{x_i} \right)^2 \right)}{1 - \exp\left( -\left( \frac{\sigma}{x_i} \right)^2 \right)} = 0.    (12)

The above equation is not in closed form and therefore cannot be solved analytically, so the MLE of σ is obtained numerically with the help of Matlab.
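For illustration, a minimal R sketch of this step, with indicator vectors xi_f and xi_m as in (10) (the bracketing interval below is a practical choice, not prescribed in the paper):

# Log-likelihood of sigma under multiple censoring, up to the constant log K.
loglik_ir <- function(sigma, x, xi_f, xi_m) {
  sum(xi_f * (log(2 * sigma^2) - 3 * log(x) - (sigma / x)^2)) +
    sum(xi_m * log(1 - exp(-(sigma / x)^2)))
}

# MLE of sigma by one-dimensional search; equivalent to solving (12).
mle_sigma <- function(x, xi_f, xi_m) {
  optimize(loglik_ir, interval = c(1e-6, 10 * max(x)),
           x = x, xi_f = xi_f, xi_m = xi_m, maximum = TRUE)$maximum
}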
On substituting the MLE of σ in (7), (8) and (9), estimates of the entropies R_\delta(X; \sigma), HC_\delta(X; \sigma) and T_\delta(X; \sigma) are, respectively, given by

\hat{R}_\delta(X; \hat{\sigma}) = \frac{1}{1-\delta} \log \left[ \left( \frac{2}{\hat{\sigma}} \right)^{\delta-1} \frac{\Gamma\left( \frac{3\delta-1}{2} \right)}{\delta^{\frac{3\delta-1}{2}}} \right],    (13)

with δ ≠ 1, δ > 0 and 3δ − 1 > 0;

\hat{HC}_\delta(X; \hat{\sigma}) = \frac{1}{2^{1-\delta} - 1} \left[ \left( \frac{2}{\hat{\sigma}} \right)^{\delta-1} \frac{\Gamma\left( \frac{3\delta-1}{2} \right)}{\delta^{\frac{3\delta-1}{2}}} - 1 \right],    (14)

with δ ≠ 1, δ > 0 and 3δ − 1 > 0; and

\hat{T}_\delta(X; \hat{\sigma}) = \frac{1}{\delta - 1} \left[ 1 - \left( \frac{2}{\hat{\sigma}} \right)^{\delta-1} \frac{\Gamma\left( \frac{3\delta-1}{2} \right)}{\delta^{\frac{3\delta-1}{2}}} \right],    (15)

with δ ≠ 1, δ > 0 and 3δ − 1 > 0.
Under suitable regularity conditions, the MLE estimators are consistent and asymptotically normally distributed for large sample sizes. At the confidence level 100(1 − α)% with α ∈ (0,1), the approximate confidence interval for the Renyi entropy can be constructed from

P\left[ -z_{\alpha/2} \leq \frac{\hat{R}_\delta(X) - R_\delta(X)}{\hat{\sigma}_{\hat{R}_\delta(X)}} \leq z_{\alpha/2} \right] = 1 - \alpha,    (16)

where z_{\alpha/2} is the 100(1 − α/2) standard normal percentile and α is the significance level. As a result, approximate confidence bounds for the Renyi entropy can be determined, such that

P\left[ \hat{R}_\delta(X) - z_{\alpha/2} \, \hat{\sigma}_{\hat{R}_\delta(X)} \leq R_\delta(X) \leq \hat{R}_\delta(X) + z_{\alpha/2} \, \hat{\sigma}_{\hat{R}_\delta(X)} \right] = 1 - \alpha,    (17)

where L = \hat{R}_\delta(X) - z_{\alpha/2} \hat{\sigma}_{\hat{R}_\delta(X)} and U = \hat{R}_\delta(X) + z_{\alpha/2} \hat{\sigma}_{\hat{R}_\delta(X)} are the lower and upper confidence limits for R_\delta(X), and \hat{\sigma}_{\hat{R}_\delta(X)} is the standard deviation of the estimator. With α = 0.05, the approximate confidence limits for the Renyi entropy are constructed at the 95% confidence level. Similar results hold for HC_\delta(X) and T_\delta(X).
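Since the paper does not spell out how \hat{\sigma}_{\hat{R}_\delta(X)} is computed, the R sketch below assumes a delta-method standard error, se(\hat{R}_\delta) ≈ |∂R_\delta/∂σ| se(\hat{\sigma}), where differentiating (7) gives ∂R_\delta/∂σ = 1/σ:

# Approximate 100(1 - alpha)% CI for the Renyi entropy; the delta-method
# standard error se_sigma / sigma_hat is an assumption, not stated in the paper.
renyi_ci <- function(sigma_hat, se_sigma, delta, alpha = 0.05) {
  I  <- (2 / sigma_hat)^(delta - 1) * gamma((3 * delta - 1) / 2) /
        delta^((3 * delta - 1) / 2)
  R  <- log(I) / (1 - delta)          # plug-in estimate (13)
  se <- se_sigma / sigma_hat          # |dR/dsigma| = 1/sigma for (7)
  z  <- qnorm(1 - alpha / 2)
  c(estimate = R, lower = R - z * se, upper = R + z * se)
}

Analogous intervals for HC_\delta(X) and T_\delta(X) follow with the corresponding derivatives of (14) and (15).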
4. Simulation Study
The procedure adopted to examine the performance of the proposed estimators given by (13), (14) and (15) is as follows (a minimal R sketch of the procedure is given after the concluding observations below):
• 1000 random samples of sizes n = 50, 100, 150, 200, 300, 400 are generated from the IR distribution based on multiple censored samples, using the method described in [16].
• The parameter values are selected as δ = 0.4, 1.2, 1.5 and σ = 1.2. The censoring level (CL) for failures is chosen as CL = 0.5 and 0.7.
• The estimated value of σ and the true values of R_δ(X; σ), HC_δ(X; σ) and T_δ(X; σ) are obtained from (12), (7), (8) and (9), and the estimates given by (13), (14) and (15) are calculated, respectively.
• Finally, the averages of the derived estimates, their MSEs, and the ALs are computed at the 95% confidence level. All calculations are done using the software Matlab and R.
From the tables, the following conclusions have been made:
• As the sample size grows, the bias and MSEs of entropy estimates fall.
• Additionally, as the sample size grows, the ALs of estimates diminish.
• As the sample size expands, the entropy estimations approach their true values.
• The MSE of entropy estimates at CL = 0.5 is usually less than the MSE of estimates at CL = 0.7.
These findings demonstrate the high precision of our entropy estimates.
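A condensed R sketch of the simulation for the Renyi entropy follows; the censoring mechanism below, which flags each unit as censored with probability CL, is an assumed simplification of the multiple-censoring scheme of [16], and rinvray and mle_sigma come from the earlier sketches:

# Monte Carlo sketch for the Renyi entropy estimator (13).
set.seed(1)
renyi_true <- function(sigma, delta)
  log((2 / sigma)^(delta - 1) * gamma((3 * delta - 1) / 2) /
      delta^((3 * delta - 1) / 2)) / (1 - delta)

sim_renyi <- function(n, sigma = 1.2, delta = 0.4, CL = 0.5, reps = 1000) {
  est <- replicate(reps, {
    x    <- rinvray(n, sigma)        # IR sample
    xi_m <- rbinom(n, 1, CL)         # 1 = censored, 0 = failed (assumed scheme)
    renyi_true(mle_sigma(x, 1 - xi_m, xi_m), delta)  # plug-in estimate (13)
  })
  truth <- renyi_true(sigma, delta)
  c(true = truth, mean = mean(est),
    bias = mean(est) - truth, mse = mean((est - truth)^2))
}
sim_renyi(n = 100)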
Table 1: Renyi Entropy Estimates at CL = 0.5 (σ = 1.2, δ = 0.4)

n     Actual Value   Estimate   Bias     MSE        AL
50    0.9930         1.1060     0.1130   6.33e-05   0.0395
100   0.9930         0.9089     0.0841   5.40e-05   0.0190
150   0.9930         0.9465     0.0495   4.71e-05   0.0121
200   0.9930         0.9506     0.0424   1.79e-05   0.0079
300   0.9930         0.9560     0.0370   4.56e-06   0.0064
400   0.9930         0.9943     0.0013   9.52e-09   0.0047
Table 2: Renyi Entropy Estimates at CL = 0.7 (σ = 1.2, δ = 0.4)

n     Actual Value   Estimate   Bias     MSE        AL
50    0.9930         0.8798     0.1132   3.19e-04   0.0382
100   0.9930         1.0934     0.1004   1.01e-04   0.0219
150   0.9930         1.0694     0.0764   3.19e-04   0.0140
200   0.9930         1.0533     0.0603   2.42e-05   0.0099
300   0.9930         0.9561     0.0369   1.95e-05   0.0071
400   0.9930         0.9864     0.0065   7.26e-07   0.0044
Table 3: HC Entropy Estimates at CL = 0.5 (σ = 1.2, δ = 1.5)

n     Actual Value   Estimate   Bias     MSE        AL
50    6.5486         6.0175     0.5311   0.0056     0.2407
100   6.5486         6.0733     0.4753   0.0023     0.1215
150   6.5486         6.2092     0.3394   7.67e-04   0.0828
200   6.5486         6.3493     0.1993   1.98e-04   0.0635
300   6.5486         6.6832     0.1346   6.03e-05   0.0446
400   6.5486         6.5667     0.0181   8.16e-07   0.0328
Table 4: HC Entropy Estimates at CL = 0.7 (σ = 1.2, δ = 1.5)

n     Actual Value   Estimate   Bias     MSE        AL
50    6.5486         5.8458     0.7028   0.0099     0.2338
100   6.5486         5.9364     0.6122   0.0037     0.1187
150   6.5486         6.1541     0.3945   0.0010     0.0821
200   6.5486         6.2186     0.3300   5.44e-05   0.0622
300   6.5486         6.4073     0.1413   6.65e-05   0.0427
400   6.5486         6.5249     0.0237   1.40e-07   0.0326
Table 5: Tsallis Entropy Estimates at CL = 0.5 (σ = 1.2, δ = 1.2)

n     Actual Value   Estimate   Bias     MSE        AL
50    11.9156        12.5135    0.5979   0.0036     0.4883
100   11.9156        12.2072    0.2916   0.0017     0.2434
150   11.9156        12.0878    0.1722   9.88e-05   0.1607
200   11.9156        12.0497    0.1341   1.19e-05   0.1198
300   11.9156        11.9792    0.0636   2.02e-05   0.0806
400   11.9156        11.9027    0.0130   4.19e-07   0.0595
Table 6: Tsallis Entropy Estimates at CL = 0.7 (σ = 1.2, δ = 1.2)

n     Actual Value   Estimate   Bias     MSE        AL
50    11.9156        12.6254    0.7098   0.0101     0.5050
100   11.9156        12.3534    0.4377   0.0019     0.2471
150   11.9156        12.1353    0.2197   3.21e-04   0.1618
200   11.9156        12.0573    0.1417   1.00e-04   0.1206
300   11.9156        11.8533    0.0623   1.29e-05   0.0790
400   11.9156        11.9585    0.0429   4.60e-07   0.0598
Figure 1: (a) Bias of Renyi, Havrda and Charvat, and Tsallis entropy at CL = 0.5 and (b) Bias of Renyi, Havrda and Charvat, and Tsallis entropy at CL = 0.7

Figure 2: (a) Average Length of Renyi, Havrda and Charvat, and Tsallis entropy at CL = 0.5 and (b) Average Length of Renyi, Havrda and Charvat, and Tsallis entropy at CL = 0.7

Figure 3: (a) MSE of Renyi entropy at CL = 0.5 and CL = 0.7, (b) MSE of Havrda and Charvat entropy at CL = 0.5 and CL = 0.7 and (c) MSE of Tsallis entropy at CL = 0.5 and CL = 0.7
5. Data Analysis
To demonstrate the effectiveness of our estimation methods, we utilize the dataset pertaining to fatigue failure times of twenty-three ball bearings as documented in [8]. This dataset has been extensively employed in various research investigations.
Dataset I: 0.1788, 0.2892, 0.3300, 0.4152, 0.4212, 0.4560, 0.4840, 0.5184, 0.5196, 0.5412, 0.5556, 0.6780, 0.6864, 0.6888, 0.8412, 0.9312, 0.9864, 1.0512, 1.0584, 1.2792, 1.2804, 1.7340. The Kolmogorov-Smirnov (K-S) distance and its corresponding p-value for this dataset are 0.1440 and 0.6988, respectively. These values suggest that the observed data align well with the inverse Rayleigh distribution. This assertion gains further validation from the empirical Cumulative Distribution Function (ECDF) plot, the quantile-quantile (Q-Q) plot, and the histogram shown in figures 4 and 5. Derived from the complete sample, the maximum likelihood estimate of the parameter σ is 0.4681, with a standard error of 0.0499.
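These numbers can be reproduced along the following lines (an R sketch reusing mle_sigma from Section 3; the K-S test below treats the fitted σ as fixed):

# Complete-sample fit for Dataset I: all units are failures.
x1 <- c(0.1788, 0.2892, 0.3300, 0.4152, 0.4212, 0.4560, 0.4840, 0.5184,
        0.5196, 0.5412, 0.5556, 0.6780, 0.6864, 0.6888, 0.8412, 0.9312,
        0.9864, 1.0512, 1.0584, 1.2792, 1.2804, 1.7340)
n <- length(x1)

# For a complete IR sample the MLE has the closed form sqrt(n / sum(1/x^2)),
# which agrees with the numerical MLE from mle_sigma.
sigma_hat <- sqrt(n / sum(1 / x1^2))           # approximately 0.468
mle_sigma(x1, xi_f = rep(1, n), xi_m = rep(0, n))

# K-S test against the fitted CDF (5), treating sigma_hat as fixed.
ks.test(x1, function(q) exp(-(sigma_hat / q)^2))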
Figure 4: (a) ECDF plot for dataset I (b) Q-Q plot for dataset I

Figure 5: Plot of the fitted density for dataset I
Dataset II: The second dataset, sourced from [9], comprises monthly actual tax revenues in Egypt from January 2006 to November 2010, measured in 1000 million Egyptian pounds: 5.9, 20.4, 14.9, 16.2, 17.2, 7.8, 6.1, 9.2, 10.2, 9.6, 13.3, 8.5, 21.6, 18.5, 5.1, 6.7, 17, 8.6, 9.7, 39.2, 35.7, 15.7, 9.7, 10, 4.1, 36, 8.5, 8, 9.2, 26.2, 21.9, 16.7, 21.3, 35.4, 14.3, 8.5, 10.6, 19.1, 20.5, 7.1, 7.7, 18.1, 16.5, 11.9, 7, 8.6, 12.5, 10.3, 11.2, 6.1, 8.4, 11, 11.6, 11.9, 5.2, 6.8, 8.9, 7.1, 10.8. The Kolmogorov-Smirnov (K-S) distance and its corresponding p-value for this dataset are 0.08219 and 0.8203, respectively, suggesting a good fit to the inverse Rayleigh distribution. This assertion gains further support from the visual analyses, including the ECDF plot, Q-Q plot, and histogram depicted in figures 6 and 7. The maximum likelihood estimate of the parameter σ, obtained from the complete dataset, is 9.3595, with a standard error of 0.6092. Tables 7 and 8 present estimates of the different entropy measures for both datasets. These tables reveal that as the parameter δ increases, the Renyi entropy demonstrates an ascending trend, whereas the Tsallis and HC entropies exhibit a descending trend. Additionally, the estimates are notably influenced by the level of censoring.
Figure 6: (a) ECDF plot for dataset II (b) Q-Q plot for dataset II

Figure 7: Plot of the fitted density for dataset II
Table 7: Estimates of Renyi, Tsallis and HC entropy at CL = 0.5 and CL = 0.7 for Dataset I

         CL = 0.5                             CL = 0.7
δ     R_δ(X)     T_δ(X)     HC_δ(X)     R_δ(X)     T_δ(X)     HC_δ(X)
1.2   -2.6986    13.5778    14.7472     -2.2308    12.8115    15.4360
2     -1.7333    6.6593     2.4727      -1.2654    4.5447     2.7547
Table 8: Estimates of Renyi, HC and Tsallis entropy at CL = 0.5 and CL = 0.7 for Dataset II

         CL = 0.5                             CL = 0.7
δ     R_δ(X)     HC_δ(X)    T_δ(X)      R_δ(X)     HC_δ(X)    T_δ(X)
1.2   0.3961     9.6191     20.7654     0.4187     9.5983     20.8244
2     1.3615     1.2562     12.4407     1.3841     1.2505     12.6790
6. Conclusion
In this article, the Renyi, Havrda and Charvat, and Tsallis entropies of the inverse Rayleigh distribution are estimated using multiple censored data. Using maximum likelihood and a plug-in approach, we present an efficient estimation strategy. The behaviour of the Renyi, Havrda and Charvat, and Tsallis entropy estimates is measured in terms of mean squared errors and average lengths. According to the numerical results, the bias and mean squared errors of our estimators decrease as the sample size grows. It is also worth noting that as the sample size grows, the average lengths of our estimators shrink. As a result, the proposed estimates prove to be efficient, giving new valuable tools with potential relevance in a wide range of applications involving the inverse Rayleigh distribution's entropy. The paper concludes with applications to two real-world data sets. In upcoming research endeavors, one could explore the assessment of entropies using both Bayesian and E-Bayesian methodologies across various censoring scenarios.
Acknowledgments: This work is supported by the Department of Science and Technology (DST).
References
[1] Abo-Eleneen, Z. A. (2011). The entropy of progressively censored samples. Entropy, 13(2):437-449.
[2] Al-Babtain, A. A., Hassan, A. S., Zaky, A. N., Elbatal, I. and Elgarhy, M. (2021). Dynamic cumulative residual Renyi entropy for Lomax distribution: Bayesian and non-Bayesian methods. AIMS Mathematics, 6(4):3889-3914.
[3] Bantan, R. A. R., Elgarhy, M., Chesneau, C. and Jamal, F. (2020). Estimation of entropy for inverse Lomax distribution under multiple censored data. Entropy, 22(6).
[4] Baratpour, S., Ahmadi, J. and Arghami, N.R. (2007). Entropy properties of record statistics. Statistical Papers, 48:197-213.
[5] Cho, Y., Sun, H. and Lee, K. (2014). An estimation of the entropy for a Rayleigh distribution based on doubly-generalized Type-II hybrid censored samples. Entropy, 16(7):3655-3669.
[6] Hassan, A. S. and Zaky, A. N. (2019). Estimation of entropy for inverse Weibull distribution under multiple censored data. Journal of Taibah University for Science, 13(1):331-337.
[7] Havrda, J. and Charvat, F. (1967). Quantification method in classification processes: concept of structural α-entropy. Kybernetika, 3:30-35.
[8] Lawless, J. F. (2011). Statistical models and methods for lifetime data. John Wiley & Sons, New York, NY, USA.
[9] Mead, M. E. (2016). On five-parameter Lomax distribution: properties and applications. Pak. J. Stat. Oper. Res., 1:185-199.
[10] Morabbi, H. and Razmkhah, M. (2010). Entropy of hybrid censoring schemes. Journal of Statistical Research of Iran, 6(2).
[11] Renyi, A. (1961). On the measure of entropy and information. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, 1:547-561.
[12] Rayleigh, J. W. S. (1880). On the resultant of a large number of vibrations of the same pitch and of arbitrary phase. Philosophical Magazine, 10:73-78.
[13] Shannon, C. E. (1948). A mathematical theory of communication. The Bell System Technical Journal, 27(3):379-423.
[14] Trayer, V. N. (1964). Inverse Rayleigh (IR) model. Proceedings of the Academy of Science, Doklady Akad, Nauk Belarus, USSR.
[15] Tsallis, C. (1988). Possible generalization of Boltzmann-Gibbs statistics. Journal of Statistical Physics, 52:479-487.
[16] Wang, F. K. and Cheng, Y. F. (2010). EM algorithm for estimating the Burr XII parameters with multiple censored data. Quality and Reliability Engineering International, 26.
[17] Wong, K.M. and Chan, S. (1990). The entropy of ordered sequences and order statistics. IEEE Transactions on Information Theory, 36(2):276-284.