NEW MEDIAN BASED ALMOST UNBIASED EXPONENTIAL TYPE RATIO ESTIMATORS IN THE ABSENCE OF AUXILIARY VARIABLE
Sajad Hussain*, Vilayat Ali Bhat1
*University School of Business, Chandigarh University, Mohali, Gharuan - 140413, INDIA
1Department of Statistics, Pondicherry University, Puducherry - 605014, INDIA
[email protected], [email protected]
Abstract
Bias and the unavailability of an auxiliary variable are two major concerns when estimating a population mean; both can be addressed by proposing unbiased estimators that do not require an auxiliary variable. In this paper, unbiased exponential type estimators of the population mean are therefore proposed. The estimators are constructed in the absence of an auxiliary variable by taking advantage of the population and sample medians of the study variable. The expressions for the bias and mean square error (MSE) are obtained up to the first order of approximation. The conditions under which the suggested estimators have a lower MSE than the existing estimators are also deduced. In comparison with the existing estimators, the suggested estimators of the population mean are found to have the lowest MSE and hence the highest efficiency. They are also the least affected by influential observations when estimating the population mean of skewed data. The theoretical findings of the paper are validated by a numerical study.
Keywords: Dual estimator, Median, Study variable, Unbiased Estimator, Mean square error
1. Introduction
In sampling, a representative part of the population, called the sample, is studied to determine parameters or characteristics of the population such as the mean, variance, median and correlation coefficient. The sample mean is a good estimator of the population mean. The population mean can be estimated more precisely by using auxiliary information: in his pioneering work, Cochran [2] proposed a ratio estimator that is more precise than the sample mean estimator when the study and auxiliary variables are positively correlated, although this estimator is biased. Using auxiliary information, which is often obtained at an extra survey cost, authors such as Sisodia and Dwivedi [9], Yadav and Kadilar [10] and Ekpenyong and Enang [15] proposed modified ratio estimators that are more efficient than the sample mean and the classical ratio estimator. Since the aforementioned estimators are inapplicable in the absence of an auxiliary variable, Subramani [6] offered a precise ratio estimator of the mean of a skewed population that uses the median of the study variable instead. The median of the study variable may be readily available without exact information on every data point (see Subramani [6]), and these median based estimators are robust in nature since outliers have the least impact on them. The exponential ratio estimator of the population mean introduced by Bahl and Tuteja [1] can be applied even when the correlation is weak. Later, to improve on the traditional exponential ratio estimator, Singh et al. [12], Yasmeen et al. [8], Zaman and Kadilar [4], Hussain et al. [3] and several others suggested various modified exponential ratio type estimators.
The estimators discussed above are all biased, which could cause the population mean to be overestimated or underestimated. As a result, authors such as Singh et al. [13], Yadav et al. [11] and Singh et al. [14] proposed almost unbiased estimators of the population mean in the presence of an auxiliary variable. The auxiliary variable may not always be available; therefore, the present study proposes high precision exponential type estimators of the population mean that work in the absence of an auxiliary variable, may be unbiased in nature, and are also able to handle voluminous data influenced by outliers.
2. Material and Methods
Assume that a random sample of size $n$ is selected from a finite population of $N$ units by simple random sampling without replacement (SRSWOR). The goal of the study is to estimate the population mean $\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i$ without the use of auxiliary variable information. For the existing estimators that require it, assume that data on an auxiliary variable $X$, correlated with the study variable $Y$, are accessible for every member of the population. The notations and formulae used in the paper are as follows.
| Study variable | Auxiliary variable |
|---|---|
| $C_y = \frac{S_y}{\bar{Y}}$ is the coefficient of variation. | $C_x = \frac{S_x}{\bar{X}}$ is the coefficient of variation. |
| $S_y^2 = \frac{1}{N-1}\sum_{i=1}^{N}(Y_i - \bar{Y})^2$ is the population mean square. | $S_x^2 = \frac{1}{N-1}\sum_{i=1}^{N}(X_i - \bar{X})^2$ is the population mean square. |
| $s_y^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})^2$ is the sample mean square. | $s_x^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$ is the sample mean square. |
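Further,
$M$ ($m$) is the population (sample) median of the study variable.
$\bar{M} = \frac{1}{{}^{N}C_{n}}\sum_{i=1}^{{}^{N}C_{n}} m_i$ is the average of the sample medians of the study variable, so that $\mathrm{Bias}(m) = \bar{M} - M$.
$m_i$ is the sample median of the $i$th sample ($i = 1, 2, \ldots, {}^{N}C_{n}$), where ${}^{N}C_{n}$ is the number of samples of size $n$ from $N$. Since the study variable data are skewed, $M \neq \bar{Y}$.
$C_m = \frac{S_m}{M}$, $C_{ym} = \frac{S_{ym}}{\bar{Y}M}$, $S_{ym} = \frac{1}{{}^{N}C_{n}}\sum_{i=1}^{{}^{N}C_{n}}(\bar{y}_i - \bar{Y})(m_i - M)$, $S_m^2 = \frac{1}{{}^{N}C_{n}}\sum_{i=1}^{{}^{N}C_{n}}(m_i - M)^2$.
$\rho = \frac{S_{xy}}{S_x S_y}$ is the population correlation coefficient between $X$ and $Y$, and $C_{yx} = \rho C_y C_x$.
$\gamma = \frac{1-f}{n}$, where the sampling fraction $f = \frac{n}{N}$.
$\theta = \frac{r\bar{X}}{2(r\bar{X}+s)}$.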
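The median-based quantities above are defined over all ${}^{N}C_{n}$ possible samples, so for a small population they can be obtained by direct enumeration. The following is a minimal sketch under that assumption; the function name `median_constants` and the toy population are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import comb
from statistics import mean, median


def median_constants(population, n):
    """Enumerate every SRSWOR sample of size n and summarise the sample medians."""
    N = len(population)
    Y_bar = mean(population)          # population mean
    M = median(population)            # population median
    samples = list(combinations(population, n))
    k = len(samples)                  # number of possible samples
    assert k == comb(N, n)
    y_bars = [mean(s) for s in samples]
    meds = [median(s) for s in samples]
    M_bar = sum(meds) / k             # average of the sample medians
    S2_m = sum((m - M) ** 2 for m in meds) / k
    S_ym = sum((yb - Y_bar) * (m - M) for yb, m in zip(y_bars, meds)) / k
    gamma = (1 - n / N) / n           # (1 - f) / n with f = n / N
    return {
        "k": k,
        "M_bar": M_bar,
        "bias_m": M_bar - M,
        "S2_m": S2_m,
        "S_ym": S_ym,
        "gamma": gamma,
    }


toy_population = [2, 4, 6, 8, 30]     # hypothetical skewed data
constants = median_constants(toy_population, 3)
print(constants)
```

Full enumeration is only practical when ${}^{N}C_{n}$ is moderate; for larger populations the same quantities would have to be approximated, for example from a pilot survey.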
3. Examining Current Ratio Type Estimators
The sample mean is the fundamental estimator of the population mean when no auxiliary variable is used:
$$t_1 = \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i.$$
The bias and MSE of the estimator $t_1$ up to $O(n^{-1})$ are
$$\mathrm{Bias}(t_1) = 0, \tag{1}$$
$$\mathrm{MSE}(t_1) = \gamma\bar{Y}^2 C_y^2. \tag{2}$$
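Making use of an auxiliary variable, Cochran [2] proposed a ratio estimator, which is more efficient than the estimator $t_1$ if $\frac{C_x}{2C_y} < \rho < +1$, as
$$t_2 = \bar{y}\,\frac{\bar{X}}{\bar{x}}.$$
The bias and MSE expressions of the estimator $t_2$ up to $O(n^{-1})$ are
$$\mathrm{Bias}(t_2) = \gamma\bar{Y}\left(C_x^2 - C_{yx}\right), \tag{3}$$
$$\mathrm{MSE}(t_2) = \gamma\bar{Y}^2\left(C_y^2 + C_x^2 - 2C_{yx}\right). \tag{4}$$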
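Bahl and Tuteja [1] proposed an exponential ratio type estimator that remains effective even when $X$ and $Y$ are weakly correlated:
$$t_3 = \bar{y}\exp\left(\frac{\bar{X} - \bar{x}}{\bar{X} + \bar{x}}\right).$$
The estimator $t_3$ is more efficient than the sample mean estimator $t_1$ if $\frac{1}{4} < \rho_{xy}\frac{C_y}{C_x} < \frac{3}{4}$. The bias and MSE up to $O(n^{-1})$ are
$$\mathrm{Bias}(t_3) = \gamma\bar{Y}\left[\frac{3}{8}C_x^2 - \frac{1}{2}C_{yx}\right], \tag{5}$$
$$\mathrm{MSE}(t_3) = \gamma\bar{Y}^2\left(C_y^2 + \frac{C_x^2}{4} - C_{yx}\right). \tag{6}$$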
A family of modified exponential ratio estimators was presented by Singh et al. [12] employing various well-known auxiliary variable characteristics, such as correlation coefficient, coefficient of variation, skewness, etc. as
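$$t_4 = \bar{y}\exp\left[\frac{(r\bar{X} + s) - (r\bar{x} + s)}{(r\bar{X} + s) + (r\bar{x} + s)}\right].$$
The bias and MSE expressions up to $O(n^{-1})$ for the estimator $t_4$ are
$$\mathrm{Bias}(t_4) = \gamma\bar{Y}\left(\theta^2 C_x^2 - \theta C_{yx}\right), \tag{7}$$
$$\mathrm{MSE}(t_4) = \gamma\bar{Y}^2\left(C_y^2 + \theta^2 C_x^2 - 2\theta C_{yx}\right). \tag{8}$$
A median-based ratio type estimator that does not use an auxiliary variable was proposed by Subramani [6] as
$$t_5 = \bar{y}\,\frac{M}{m}.$$
The estimator $t_5$ is biased, with bias and MSE up to $O(n^{-1})$ given by
$$\mathrm{Bias}(t_5) = \gamma\bar{Y}\left[C_m^2 - C_{ym} - \frac{\mathrm{Bias}(m)}{\gamma M}\right], \tag{9}$$
$$\mathrm{MSE}(t_5) = \gamma\bar{Y}^2\left(C_y^2 + \frac{\bar{Y}^2}{M^2}C_m^2 - 2\frac{\bar{Y}}{M}C_{ym}\right). \tag{10}$$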
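To make the existing estimators concrete, the sketch below evaluates $t_1$, $t_2$, $t_3$ and $t_5$ on a single sample. The sample values, the population mean `X_bar` and the population median `M` are hypothetical illustration values, and the function names simply mirror the estimator labels used above.

```python
import math
from statistics import mean, median


def t1(y):
    """Sample mean estimator."""
    return mean(y)


def t2(y, x, X_bar):
    """Classical ratio estimator: y_bar * X_bar / x_bar."""
    return mean(y) * X_bar / mean(x)


def t3(y, x, X_bar):
    """Exponential ratio estimator: y_bar * exp((X_bar - x_bar) / (X_bar + x_bar))."""
    x_bar = mean(x)
    return mean(y) * math.exp((X_bar - x_bar) / (X_bar + x_bar))


def t5(y, M):
    """Median-based ratio estimator: y_bar * M / m, no auxiliary variable needed."""
    return mean(y) * M / median(y)


# Hypothetical sample and population constants, for illustration only.
y_sample = [10, 12, 20]
x_sample = [1, 2, 3]
X_bar, M = 2.5, 15

estimates = {
    "t1": t1(y_sample),
    "t2": t2(y_sample, x_sample, X_bar),
    "t3": t3(y_sample, x_sample, X_bar),
    "t5": t5(y_sample, M),
}
print(estimates)
```

Only `t5` can be evaluated without the auxiliary sample `x_sample`, which is the practical point of the median-based approach.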
The sample mean estimator is unbiased, whereas all the other estimators discussed above, viz. those of Cochran [2], Bahl and Tuteja [1], Singh et al. [12] and Subramani [6], are biased.
4. Proposed Exponential Ratio Estimators
In the absence of auxiliary variable, the proposed exponential ratio type estimators of the study are as
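$$t_{ue1} = \bar{y}\left[\alpha\exp\left(\frac{M - m}{aM}\right) + (1 - \alpha)\exp\left(\frac{m - M}{aM}\right)\right],$$
$$t_{ue2} = \bar{y}\left[\beta\exp\left(\frac{M - m}{bm}\right) + (1 - \beta)\exp\left(\frac{m - M}{bm}\right)\right].$$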
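As an illustration, the two proposed forms, $t_{ue1}$ with $aM$ in the denominator of the exponent and $t_{ue2}$ with $bm$, can be evaluated on a sample as sketched below. The sample, the population median `M` and the constants `alpha`, `beta`, `a` and `b` are hypothetical placeholders; in the paper the constants are determined later from unbiasedness and minimum-MSE conditions.

```python
import math
from statistics import mean, median


def t_ue1(y, M, alpha, a):
    """y_bar * [alpha*exp((M - m)/(a*M)) + (1 - alpha)*exp((m - M)/(a*M))]."""
    m = median(y)
    u = (M - m) / (a * M)
    return mean(y) * (alpha * math.exp(u) + (1 - alpha) * math.exp(-u))


def t_ue2(y, M, beta, b):
    """y_bar * [beta*exp((M - m)/(b*m)) + (1 - beta)*exp((m - M)/(b*m))]."""
    m = median(y)
    u = (M - m) / (b * m)
    return mean(y) * (beta * math.exp(u) + (1 - beta) * math.exp(-u))


# Hypothetical sample and constants, for illustration only.
y_sample = [10, 12, 20]
est1 = t_ue1(y_sample, M=15, alpha=0.5, a=1.0)
est2 = t_ue2(y_sample, M=15, beta=0.5, b=1.0)
print(est1, est2)
```

When the sample median equals the population median, both exponents vanish and each estimator reduces to the sample mean, whatever the constants.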
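The non-zero constants $a$ and $b$ are chosen such that the estimators $t_{ue1}$ and $t_{ue2}$ are unbiased, and the constants $\alpha$ and $\beta$ are chosen such that the MSE of $t_{ue1}$ and $t_{ue2}$ is minimum. Consider
$$\bar{y} = \bar{Y}(1 + e_0) \quad\text{and}\quad m = M(1 + e_1).$$
Therefore,
$$E(e_0) = 0, \qquad E(e_1) = \frac{\mathrm{Bias}(m)}{M},$$
$$E(e_0^2) = \gamma C_y^2; \qquad E(e_1^2) = \gamma C_m^2; \qquad E(e_0 e_1) = \gamma C_{ym}.$$
Expressing the estimators $t_{ue1}$ and $t_{ue2}$ in terms of $e_i$ ($i = 0, 1$), the equations obtained are
$$t_{ue1} = \bar{Y}(1 + e_0)\left[\alpha\exp\left(\frac{-e_1}{a}\right) + (1 - \alpha)\exp\left(\frac{e_1}{a}\right)\right], \tag{11}$$
$$t_{ue2} = \bar{Y}(1 + e_0)\left[\beta\exp\left(\frac{-e_1}{b(1 + e_1)}\right) + (1 - \beta)\exp\left(\frac{e_1}{b(1 + e_1)}\right)\right]. \tag{12}$$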
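Expanding the right-hand sides of equations (11) and (12) and retaining terms only up to the second degree, the reduced equations are
$$t_{ue1} = \bar{Y}\left[1 + e_0 + (1 - 2\alpha)\frac{e_1}{a} + \frac{e_1^2}{2a^2} + (1 - 2\alpha)\frac{e_0 e_1}{a}\right]$$
$$\Rightarrow\; t_{ue1} - \bar{Y} = \bar{Y}\left[e_0 + (1 - 2\alpha)\frac{e_1}{a} + \frac{e_1^2}{2a^2} + (1 - 2\alpha)\frac{e_0 e_1}{a}\right], \tag{13}$$
$$t_{ue2} = \bar{Y}\left[1 + e_0 + (1 - 2\beta)\frac{e_1}{b} + \left(\frac{1}{2b} + 2\beta - 1\right)\frac{e_1^2}{b} + (1 - 2\beta)\frac{e_0 e_1}{b}\right]$$
$$\Rightarrow\; t_{ue2} - \bar{Y} = \bar{Y}\left[e_0 + (1 - 2\beta)\frac{e_1}{b} + \left(\frac{1}{2b} + 2\beta - 1\right)\frac{e_1^2}{b} + (1 - 2\beta)\frac{e_0 e_1}{b}\right]. \tag{14}$$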
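The coefficient of $e_1^2$ in (14) is less obvious than in (13) because the sample median also appears in the denominator of the exponent. A sketch of the intermediate step is given below, assuming $|e_1| < 1$ so that $(1 + e_1)^{-1}$ can be expanded; the shorthand $u$ is introduced here only for this illustration.

```latex
\begin{align*}
u &= \frac{e_1}{b(1+e_1)} \approx \frac{e_1 - e_1^2}{b}, \\
\exp(-u) &\approx 1 - \frac{e_1}{b} + \frac{e_1^2}{b} + \frac{e_1^2}{2b^2}, \qquad
\exp(u) \approx 1 + \frac{e_1}{b} - \frac{e_1^2}{b} + \frac{e_1^2}{2b^2}, \\
\beta\exp(-u) + (1-\beta)\exp(u)
  &\approx 1 + (1-2\beta)\frac{e_1}{b}
   + \left(\frac{1}{2b} + 2\beta - 1\right)\frac{e_1^2}{b}.
\end{align*}
```

Multiplying the last line by $(1 + e_0)$ and dropping terms of degree three gives the bracket in (14).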
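The bias of the estimators $t_{ue1}$ and $t_{ue2}$ is obtained by taking expectations on both sides of (13) and (14) as
$$\mathrm{Bias}(t_{ue1}) = \gamma\bar{Y}\left[\frac{1}{2a^2}C_m^2 + \frac{1}{a}(1 - 2\alpha)\left(C_{ym} + \frac{\mathrm{Bias}(m)}{\gamma M}\right)\right], \tag{15}$$
$$\mathrm{Bias}(t_{ue2}) = \gamma\bar{Y}\,\frac{1}{b}\left[\left(\frac{1}{2b} + 2\beta - 1\right)C_m^2 + (1 - 2\beta)\left(C_{ym} + \frac{\mathrm{Bias}(m)}{\gamma M}\right)\right]. \tag{16}$$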
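Squaring equations (13) and (14), taking expectations and retaining terms up to the second degree, the mean square errors of $t_{ue1}$ and $t_{ue2}$ are obtained as
$$\mathrm{MSE}(t_{ue1}) = \gamma\bar{Y}^2\left[C_y^2 + (1 - 2\alpha)^2\frac{C_m^2}{a^2} + \frac{2}{a}(1 - 2\alpha)C_{ym}\right], \tag{17}$$
$$\mathrm{MSE}(t_{ue2}) = \gamma\bar{Y}^2\left[C_y^2 + (1 - 2\beta)^2\frac{C_m^2}{b^2} + \frac{2}{b}(1 - 2\beta)C_{ym}\right]. \tag{18}$$
The estimator $t_{ue1}$ is unbiased if
$$a = \frac{C_m^2}{2(2\alpha - 1)\left(C_{ym} + \frac{\mathrm{Bias}(m)}{\gamma M}\right)}, \tag{19}$$
whereas the estimator $t_{ue2}$ is unbiased if
$$b = \frac{C_m^2}{2(2\beta - 1)\left(C_{ym} + \frac{\mathrm{Bias}(m)}{\gamma M} - C_m^2\right)}. \tag{20}$$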
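The unbiasedness conditions can be checked numerically by substituting the resulting $a$ and $b$ back into the bias expressions (15) and (16). The sketch below does this with the population P1 constants reported in Table 1; the values $\alpha = \beta = 0.25$ are arbitrary illustration values (any value other than $1/2$ would do), and the function names are hypothetical.

```python
def unbiasing_constants(alpha, beta, C2_m, C_ym, bias_m, gamma, M):
    """Constants a and b that set the first-order biases (15) and (16) to zero."""
    K = C_ym + bias_m / (gamma * M)          # C_ym + Bias(m) / (gamma * M)
    a = C2_m / (2 * (2 * alpha - 1) * K)
    b = C2_m / (2 * (2 * beta - 1) * (K - C2_m))
    return a, b, K


def bias_tue1(alpha, a, C2_m, K, gamma, Y_bar):
    """First-order bias of t_ue1, equation (15)."""
    return gamma * Y_bar * (C2_m / (2 * a ** 2) + (1 - 2 * alpha) * K / a)


def bias_tue2(beta, b, C2_m, K, gamma, Y_bar):
    """First-order bias of t_ue2, equation (16)."""
    inner = (1 / (2 * b) + 2 * beta - 1) * C2_m + (1 - 2 * beta) * K
    return gamma * Y_bar * inner / b


# Population P1 constants as reported in Table 1 of the paper.
N, n = 34, 5
Y_bar, M, M_bar = 856.4118, 767.5, 736.9811
C2_m, C_ym = 0.100833, 0.073140
gamma = (1 - n / N) / n
bias_m = M_bar - M

alpha = beta = 0.25                          # arbitrary illustration values
a, b, K = unbiasing_constants(alpha, beta, C2_m, C_ym, bias_m, gamma, M)
residual_1 = bias_tue1(alpha, a, C2_m, K, gamma, Y_bar)
residual_2 = bias_tue2(beta, b, C2_m, K, gamma, Y_bar)
print(a, b, residual_1, residual_2)
```

Both residuals vanish up to floating-point error, which is only a check of the algebra to the first order of approximation, not of exact unbiasedness.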
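Substituting the values of $a$ and $b$ from (19) and (20) in equations (17) and (18) respectively, the following equations are obtained:
$$\mathrm{MSE}(t_{ue1}) = \gamma\bar{Y}^2\left[C_y^2 + 4(1 - 2\alpha)^4\frac{\left(C_{ym} + \frac{\mathrm{Bias}(m)}{\gamma M}\right)^2}{C_m^2} - 4(1 - 2\alpha)^2\frac{\left(C_{ym} + \frac{\mathrm{Bias}(m)}{\gamma M}\right)}{C_m^2}C_{ym}\right], \tag{21}$$
$$\mathrm{MSE}(t_{ue2}) = \gamma\bar{Y}^2\left[C_y^2 + 4(1 - 2\beta)^4\frac{\left(C_{ym} + \frac{\mathrm{Bias}(m)}{\gamma M} - C_m^2\right)^2}{C_m^2} - 4(1 - 2\beta)^2\frac{\left(C_{ym} + \frac{\mathrm{Bias}(m)}{\gamma M} - C_m^2\right)}{C_m^2}C_{ym}\right]. \tag{22}$$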
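Differentiating equations (21) and (22) with respect to $\alpha$ and $\beta$ respectively and equating to zero, the optimum values of $\alpha$ and $\beta$ are obtained as
$$\alpha = \frac{1}{2} \pm \frac{1}{2}\sqrt{\frac{C_{ym}}{2\left(C_{ym} + \frac{\mathrm{Bias}(m)}{\gamma M}\right)}}$$
and
$$\beta = \frac{1}{2} \pm \frac{1}{2}\sqrt{\frac{C_{ym}}{2\left(C_{ym} + \frac{\mathrm{Bias}(m)}{\gamma M} - C_m^2\right)}}.$$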
The constants $a$, $b$, $\alpha$ and $\beta$ contain the unknowns $C_m$ and $C_{ym}$, whose values are considered to be known in advance; if unknown, they can be determined from past surveys, from the experience gathered by the researcher over time, or from a pilot survey (see Srivenkataramana & Tracy [17], Singh & Kumar [16] and the references cited therein). Using the optimum values of $\alpha$ and $\beta$ in equations (21) and (22) respectively, the minimum MSE of the estimators $t_{ue1}$ and $t_{ue2}$ up to $O(n^{-1})$ is obtained as
$$\mathrm{MSE}_{\min}(t_{uei}) = \gamma\bar{Y}^2\left[C_y^2 - \frac{C_{ym}^2}{C_m^2}\right], \quad i = 1, 2. \tag{23}$$
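The minimisation step can be checked numerically. Writing $K = C_{ym} + \mathrm{Bias}(m)/(\gamma M)$, setting the derivative of (21) to zero gives $(1 - 2\alpha)^2 = C_{ym}/(2K)$, and substituting back should reproduce (23). The sketch below uses hypothetical constants chosen so that this ratio is non-negative, which is needed for a real-valued $\alpha$; none of the numbers come from the paper.

```python
import math


def mse_tue1(alpha, C2_y, C2_m, C_ym, K, gamma, Y_bar):
    """MSE (17) of t_ue1 with a fixed by the unbiasedness condition (19)."""
    a = C2_m / (2 * (2 * alpha - 1) * K)
    d = 1 - 2 * alpha
    return gamma * Y_bar ** 2 * (C2_y + d ** 2 * C2_m / a ** 2 + 2 * d * C_ym / a)


def optimal_alpha(C_ym, K):
    """Both roots of (1 - 2*alpha)^2 = C_ym / (2*K); requires a non-negative ratio."""
    ratio = C_ym / (2 * K)
    if ratio < 0:
        raise ValueError("no real-valued optimum: C_ym / (2K) is negative")
    root = math.sqrt(ratio)
    return 0.5 + 0.5 * root, 0.5 - 0.5 * root


# Hypothetical constants, for illustration only.
C2_y, C2_m, C_ym, K = 0.12, 0.10, 0.07, 0.10
gamma, Y_bar = 0.1, 100.0

alpha_plus, alpha_minus = optimal_alpha(C_ym, K)
mse_plus = mse_tue1(alpha_plus, C2_y, C2_m, C_ym, K, gamma, Y_bar)
mse_minus = mse_tue1(alpha_minus, C2_y, C2_m, C_ym, K, gamma, Y_bar)
mse_min_formula = gamma * Y_bar ** 2 * (C2_y - C_ym ** 2 / C2_m)   # equation (23)
print(mse_plus, mse_minus, mse_min_formula)
```

Both roots give the same MSE, and other choices of $\alpha$ give a larger value, in line with (23) being the minimum over the unbiased members of the class.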
5. Theoretical Efficiency Comparisons
From equations (2), (4), (6), (8), (10) and (23), the circumstances and conditions in which the suggested estimators outperform the sample mean estimator, the existing estimators of Cochran [2], Bahl and Tuteja [1], Singh et al. [12] and Subramani [6] are obtained as
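$$\mathrm{MSE}_{\min}(t_{uei}) < \mathrm{MSE}(t_1) \;\Rightarrow\; \gamma\bar{Y}^2\left[C_y^2 - \frac{C_{ym}^2}{C_m^2}\right] < \gamma\bar{Y}^2 C_y^2, \quad\text{if } C_{ym}^2 > 0. \tag{24}$$
$$\mathrm{MSE}_{\min}(t_{uei}) < \mathrm{MSE}(t_2) \;\Rightarrow\; \gamma\bar{Y}^2\left[C_y^2 - \frac{C_{ym}^2}{C_m^2}\right] < \gamma\bar{Y}^2\left(C_y^2 + C_x^2 - 2C_{yx}\right), \quad\text{if } C_{ym}^2 > C_m^2\left(2C_{yx} - C_x^2\right). \tag{25}$$
$$\mathrm{MSE}_{\min}(t_{uei}) < \mathrm{MSE}(t_3) \;\Rightarrow\; \gamma\bar{Y}^2\left[C_y^2 - \frac{C_{ym}^2}{C_m^2}\right] < \gamma\bar{Y}^2\left(C_y^2 + \frac{C_x^2}{4} - C_{yx}\right), \quad\text{if } C_{ym}^2 > \frac{C_m^2}{4}\left(4C_{yx} - C_x^2\right). \tag{26}$$
$$\mathrm{MSE}_{\min}(t_{uei}) < \mathrm{MSE}(t_4) \;\Rightarrow\; \gamma\bar{Y}^2\left[C_y^2 - \frac{C_{ym}^2}{C_m^2}\right] < \gamma\bar{Y}^2\left(C_y^2 + \theta^2 C_x^2 - 2\theta C_{yx}\right), \quad\text{if } C_{ym}^2 > C_m^2\left(2\theta C_{yx} - \theta^2 C_x^2\right). \tag{27}$$
$$\mathrm{MSE}_{\min}(t_{uei}) < \mathrm{MSE}(t_5) \;\Rightarrow\; \gamma\bar{Y}^2\left[C_y^2 - \frac{C_{ym}^2}{C_m^2}\right] < \gamma\bar{Y}^2\left(C_y^2 + \frac{\bar{Y}^2}{M^2}C_m^2 - 2\frac{\bar{Y}}{M}C_{ym}\right), \quad\text{if } C_{ym}^2 > C_m^2\left[2\frac{\bar{Y}}{M}C_{ym} - \frac{\bar{Y}^2}{M^2}C_m^2\right]. \tag{28}$$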
The proposed median based unbiased exponential ratio type estimators $t_{uei}$ ($i = 1, 2$) are more precise than the estimators $t_1$ to $t_5$ under the conditions (24) to (28).
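Conditions (24), (25), (26) and (28) involve only tabulated constants, so they can be checked directly. The sketch below does so with the values reported for populations P1 and P2 in Table 1; condition (27) is left out because the constants $r$ and $s$ behind $\theta$ are not reported. The dictionary layout and function name are hypothetical.

```python
# Constants as reported in Table 1 of the paper (R denotes Y_bar / M).
POPULATIONS = {
    "P1": dict(R=1.1158, C2_x=0.088563, C2_m=0.100833, C_ym=0.073140, C_yx=0.047257),
    "P2": dict(R=1.0247, C2_x=0.007845, C2_m=0.006606, C_ym=0.005394, C_yx=0.005275),
}


def efficiency_conditions(R, C2_x, C2_m, C_ym, C_yx):
    """Evaluate the inequalities (24), (25), (26) and (28) for one population."""
    lhs = C_ym ** 2
    return {
        "(24) vs t1": lhs > 0,
        "(25) vs t2": lhs > C2_m * (2 * C_yx - C2_x),
        "(26) vs t3": lhs > C2_m / 4 * (4 * C_yx - C2_x),
        "(28) vs t5": lhs > C2_m * (2 * R * C_ym - R ** 2 * C2_m),
    }


results = {name: efficiency_conditions(**c) for name, c in POPULATIONS.items()}
for name, checks in results.items():
    print(name, checks)
```

For both populations every evaluated condition holds, which agrees with the ordering of the MSE values in the numerical study.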
6. Numerical Study Comparisons
For the numerical study, data from two populations, P1 and P2, containing influential observations have been considered, as given in Table 1. The data set of population P1 is sourced from Singh and Chaudhary [7], where the study variable is the area under wheat cultivation in the year 1974 and the auxiliary variable is the cultivated area under wheat in the year 1971. The data set of population P2 is taken from Mukhopadhyay [5], where the study variable is the amount of raw materials for 20 jute mills and the auxiliary variable is the number of workers.
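Table 1: Summary statistics of the population data sets.

| Data constants | P1 | P2 |
|---|---|---|
| $N$ | 34 | 20 |
| $n$ | 5 | 5 |
| ${}^{N}C_{n}$ | 278256 | 15504 |
| $\bar{Y}$ | 856.4118 | 41.50000 |
| $\bar{M}$ | 736.9811 | 40.05520 |
| $M$ | 767.5000 | 40.50000 |
| $\bar{X}$ | 208.8824 | 441.9500 |
| $\bar{Y}/M$ | 1.115800 | 1.024700 |
| $C_y^2$ | 0.125014 | 0.008338 |
| $C_x^2$ | 0.088563 | 0.007845 |
| $C_m^2$ | 0.100833 | 0.006606 |
| $C_{ym}$ | 0.073140 | 0.005394 |
| $C_{yx}$ | 0.047257 | 0.005275 |
| $\rho_{yx}$ | 0.449100 | 0.652200 |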
It can be observed from Table-1 that a sample of 5 units has been drawn from two populations P1 and P2 having size 34 and 20 respectively. The values of different parameters like population mean, population median, coefficient of variation etc. of the study and auxiliary variable are obtained and can be seen from the table.
Table 2: MSE, Bias and PRE of the estimators $t_1$, $t_2$, $t_3$, $t_4$, $t_5$ and $t_{uei}$.

| Estimator | P1 MSE | P1 Bias | P1 PRE | P2 MSE | P2 Bias | P2 PRE |
|---|---|---|---|---|---|---|
| $t_1$ | 15641.306 | 0.000 | 100.000 | 2.154 | 0.000 | 100.000 |
| $t_2$ | 14896.738 | 6.035 | 104.998 | 1.455 | 0.016 | 148.041 |
| $t_3$ | 12498.850 | 1.399 | 125.142 | 1.297 | 0.002 | 166.076 |
| $t_4$ | 12499.093 | 0.227 | 125.139 | 1.298 | 0.004 | 165.948 |
| $t_5$ | 10926.773 | 38.100 | 143.147 | 1.090 | 0.463 | 197.615 |
| $t_{uei}$ | 9003.545 | 0.000 | 173.724 | 1.016 | 0.000 | 212.007 |
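The MSE and PRE columns of Table 2 can be recomputed from the Table 1 constants using expressions (2), (4), (6), (10) and (23). The sketch below does this for $t_1$, $t_2$, $t_3$, $t_5$ and $t_{uei}$; $t_4$ is omitted because $r$ and $s$ are not reported. It assumes that PRE is taken relative to $t_1$, that is $100 \times \mathrm{MSE}(t_1)/\mathrm{MSE}(t)$, which is consistent with the tabulated values; the function and dictionary names are hypothetical.

```python
# Constants as reported in Table 1 of the paper.
TABLE1 = {
    "P1": dict(N=34, n=5, Y_bar=856.4118, M=767.5, C2_y=0.125014,
               C2_x=0.088563, C2_m=0.100833, C_ym=0.073140, C_yx=0.047257),
    "P2": dict(N=20, n=5, Y_bar=41.5, M=40.5, C2_y=0.008338,
               C2_x=0.007845, C2_m=0.006606, C_ym=0.005394, C_yx=0.005275),
}


def mse_values(N, n, Y_bar, M, C2_y, C2_x, C2_m, C_ym, C_yx):
    """First-order MSEs from equations (2), (4), (6), (10) and (23)."""
    gamma = (1 - n / N) / n
    scale = gamma * Y_bar ** 2
    R = Y_bar / M
    return {
        "t1": scale * C2_y,
        "t2": scale * (C2_y + C2_x - 2 * C_yx),
        "t3": scale * (C2_y + C2_x / 4 - C_yx),
        "t5": scale * (C2_y + R ** 2 * C2_m - 2 * R * C_ym),
        "tue": scale * (C2_y - C_ym ** 2 / C2_m),
    }


def pre_values(mse):
    """Percent relative efficiency with respect to the sample mean t1 (assumed)."""
    return {name: 100 * mse["t1"] / value for name, value in mse.items()}


mse = {pop: mse_values(**c) for pop, c in TABLE1.items()}
pre = {pop: pre_values(m) for pop, m in mse.items()}
for pop in TABLE1:
    for name in mse[pop]:
        print(pop, name, round(mse[pop][name], 3), round(pre[pop][name], 3))
```

Because the inputs are rounded as printed in Table 1, the recomputed figures agree with Table 2 only up to small rounding differences.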
The suggested median based exponential ratio estimators $t_{uei}$ ($i = 1, 2$) have the lowest MSE values for both population data sets P1 and P2, as can be seen from Table 2. Compared with the sample mean estimator and the estimators of Cochran [2], Bahl and Tuteja [1], Singh et al. [12] and Subramani [6], the PRE of the suggested estimators is the highest. Furthermore, as the estimators suggested in the paper are unbiased, they may be used to address the issue of under- or overestimating the population mean.
7. Discussion
The paper presents two estimators of the population mean, $t_{ue1}$ and $t_{ue2}$, based on the population median. Since the median is a parameter that is least influenced by outliers, the proposed estimators may work efficiently for skewed data, as is evident from the numerical study. The empirical study shows that the MSE of $t_{uei}$ is 9003.545 and 1.016 for populations P1 and P2 respectively, which is the minimum among all the estimators considered for comparison. The minimum MSE indicates that the estimators $t_{uei}$ are the most efficient. Further, the bias values show that $t_{uei}$ is unbiased, which addresses the problem of under- or overestimation. The population and sample medians used in the construction of the estimators $t_{uei}$ are those of the study variable only, which is a clear advantage since the auxiliary variable may not always be available.
8. Conclusion
The conclusions regarding the suggested median-based almost unbiased exponential ratio estimators of the population mean, proposed in the absence of an auxiliary variable, are as follows:
• The proposed estimators $t_{ue1}$ and $t_{ue2}$ are median based and are therefore least influenced by outliers present in the data set.
• The estimators proposed in the study are more precise for a skewed data set than the other estimators considered.
References
[1] Bahl, S. and Tuteja, R. K. (1991). Ratio and product type exponential estimator. Information and Optimization Sciences, 12(1), 159-163.
[2] Cochran, W. G. (1940). The estimation of the yields of the cereal experiments by sampling for the ratio of grain to total produce. The Journal of Agricultural Science, 30, 262-275.
[3] Hussain, S., Sharma, M. and Bhat, M. I. J. (2021). Optimum exponential ratio type estimators for estimating the population mean. Journal of Statistics Applications and Probability Letters, 8(2), 73-82.
[4] Zaman, T. and Kadilar, C. (2019). Novel family of exponential estimators using information of auxiliary attribute. Journal of Statistics and Management Systems, 22(8), 1499-1509.
[5] Mukhopadhyay, P. (2005). Theory and methods of survey sampling, 2nd edition. PHI Learning, New Delhi.
[6] Subramani, J. (2016). A new median based ratio estimator for estimation of the finite population mean. Statistics in Transition New Series, 17(4), 1-14.
[7] Singh, D. and Chaudhary, F. S. (1986). Theory and analysis of sample survey designs. New Age International Publisher, New Delhi.
[8] Yasmeen, U., Noor ul Amin, M. and Hanif, M. (2016). Exponential ratio and product type estimators of population mean. Journal of Statistics and Management System, 19(1), 55-71.
[9] Sisodia, B. V. S. and Dwivedi, V. K. (1981). A modified ratio estimator using coefficient of variation of auxiliary variable. Journal of the Indian Society of Agricultural Statistics, 33(2), 13-18.
[10] Yadav, S. K. and Kadilar, C. (2013). Improved class of ratio and product estimators. Applied Mathematics and Computation, 219(22), 10726-10731.
[11] Yadav, R., Upadhyaya, L. N., Singh, H. P. and Chatterjee, S. (2012). Almost unbiased ratio and product type exponential estimators. Statistics in Transition New Series, 13(3), 537-550.
[12] Singh, R., Chauhan, P., Sawan, N. and Smarandache, F. (2009). Improvement in estimating the population mean using exponential estimators in simple random sampling. Bulletin of Statistics and Economics, 3(A09), 13-18.
[13] Singh, R., Kumar, M. and Smarandache, F. (2008). Almost unbiased estimator for estimating population mean using known value of some population parameter(s). Pakistan Journal of Statistics and Operation Research, 4(2), 63-76.
[14] Singh, R., Gupta, S. B. and Malik, S. (2016). Almost unbiased estimator using known value of population parameter(s) in sample surveys. Journal of Modern Applied Statistical Methods, 15(1), 30.
[15] Ekpenyong, E. J. and Enang, E. I. (2015). A modified class of ratio and product estimators of population mean in simple random sampling using information on auxiliary variable. Journal of Statistics, 22, 1-8.
[16] Singh, H. P. and Kumar, S. (2008). A general family of estimators of finite population ratio, product and mean using two phase sampling scheme in the presence of non-response. Journal of Statistical Theory and Practice, 2(4), 677-692.
[17] Srivenkataramana, T. and Tracy, D. S. (1980). An alternative to ratio method in sample surveys. Annals of the Institute of Statistical Mathematics, 32, 111-120.