AN IMPROVED DIFFERENCE CUM-EXPONENTIAL RATIO TYPE ESTIMATOR IN RANKED SET SAMPLING
Khalid Ul Islam Rather1*, Asad Ali2, M. Iqbal Jeelani3
Division of Statistics and Computer Science, SKUAST-Jammu, India (1, 3); Department of Economics and Statistics, University of Management and Technology, Lahore, Pakistan (2).
Abstract
Ranked set sampling is an approach to data collection that combines simple random sampling with the field investigator's professional knowledge and judgment to pick places to collect samples. Alternatively, field screening measurements can replace professional judgment when appropriate. The approach continues to stimulate substantial methodological research. The use of ranked set sampling increases the chance that the collected samples will yield representative measurements. This results in better estimates of the mean as well as improved performance of many statistical procedures. Moreover, ranked set sampling can be more cost-efficient than simple random sampling because fewer samples need to be collected and measured. The use of professional judgment in the process of selecting sampling locations is a powerful incentive to use ranked set sampling. In this paper, we introduce an approach to mean estimators in ranked set sampling. The amount of information carried by the auxiliary variable is measured on populations and samples, and, to use this information in the estimator, the basic ratio and the generalized exponential ratio estimators are developed into an improved difference cum exponential ratio type estimator under ranked set sampling, in order to estimate the population mean $\bar Y$ of the study variate Y using a single auxiliary variable X. The expression for the mean squared error of the proposed estimator under ranked set sampling is derived, and theoretical comparisons are made with competing estimators. We show that the proposed estimator has a lower mean squared error than the existing estimators. In addition, these theoretical results are supported with the aid of some real data sets using RStudio.
The mathematical form of the estimator has been developed, and its efficiency conditions have been derived in relation to various existing estimators from the literature. By assigning various values to the constants used in the construction of the proposed estimator, we also provide several special cases of the estimator.
Keywords: Ranked Set Sampling; Exponential Ratio Type Estimator; Ratio Estimator; Mean Square Error (MSE); Efficiency; RStudio.
1. Introduction
It is well known that information on an auxiliary variable is commonly used to increase efficiency and precision in sample surveys. It also plays a role in related methods of estimation, such as the ratio, product, and regression methods. If the correlation between the study variable (Y) and the auxiliary variable (X) is highly positive, the ratio method of estimation is used; if the correlation is highly negative, the product method of estimation is employed effectively. In recent years, many articles on estimators for the population mean have appeared in the sampling theory literature. Examples include unbiased estimators in general form for estimating the finite population mean in stratified random sampling [1]; a generalized ratio estimator using some robust measures with a single auxiliary variable [2 and 3]; efficient families of ratio-type estimators for the finite population mean using the known correlation coefficient between the study variable and the auxiliary variable [5 and 6]; and estimation of a rare and clustered population mean using stratified adaptive cluster sampling and using an auxiliary character in stratified random sampling [7 and 8]. The estimation of the population mean using an auxiliary attribute under ranked set sampling (RSS) is considered in [9, 10 and 11]. The problem of exponential estimators for the population mean, under RSS, using an auxiliary attribute, and in two-phase sampling, is considered in [12, 13, 14, and 15].
In addition to the simple random sampling (SRS) method, RSS, which may be considered a controlled random sampling design, was first introduced by [16] to estimate pasture yield. The RSS procedure involves randomly drawing n sets of n units each from the population whose mean is to be estimated. It is assumed that the units in each set can be ranked visually. From the first set of n units, the unit ranked lowest is measured. From the second set of n units, the unit ranked second lowest is measured. This process continues until the unit ranked nth is measured from the nth set. The gain in efficiency was illustrated by [16] with a computation involving five distributions. As a simple introduction to the concept of RSS, when X is a random variable with density function F(x) and $(x_1, x_2, \ldots, x_n)$ are the unobserved values from n units, we may rank them by visual inspection or on the basis of a concomitant variable. RSS involves selecting one unit from every ranked set of m units for quantification.
The RSS method can be briefly described step by step as follows:
Step 1: Randomly select $m^2$ units from the target population.
Step 2: Allocate the $m^2$ selected units as randomly as possible into m sets, each of size m.
Step 3: Without knowing any values of the variable of interest, rank the units within each set with respect to the variable of interest. This may be based on personal professional judgment or on a concomitant variable correlated with the variable of interest.
Step 4: Choose a sample for actual quantification by including the smallest ranked unit in the first set, the second smallest ranked unit in the second set, and so on, until the largest ranked unit is selected from the last set.
Step 5: Repeat Steps 1 through 4 for r cycles to obtain a sample of size n = mr for actual quantification.
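The five steps above can be sketched in Python. This is a minimal illustration, not the authors' implementation: the function name `ranked_set_sample`, the `rank_key` argument (a stand-in for judgment ranking or ranking on a concomitant variable) and the `seed` argument are all hypothetical, and `r` denotes the number of cycles. Units are drawn without replacement within a cycle, while different cycles are drawn independently.

```python
import random


def ranked_set_sample(population, m, r, rank_key=None, seed=None):
    """Return a ranked set sample of m * r units from a list `population`.

    m        : set size (also the number of sets per cycle)
    r        : number of cycles
    rank_key : function used to rank units within a set (assumption:
               stands in for visual or concomitant-variable ranking)
    """
    if m * m > len(population):
        raise ValueError("population must contain at least m^2 units")
    rng = random.Random(seed)
    if rank_key is None:
        rank_key = lambda unit: unit
    sample = []
    for _ in range(r):                              # Step 5: repeat for r cycles
        units = rng.sample(population, m * m)       # Step 1: draw m^2 units
        for i in range(m):
            one_set = units[i * m:(i + 1) * m]      # Step 2: i-th set of size m
            ranked = sorted(one_set, key=rank_key)  # Step 3: rank within the set
            sample.append(ranked[i])                # Step 4: keep the i-th ranked unit
    return sample
```

For example, `ranked_set_sample(list_of_pairs, m=10, r=3, rank_key=lambda u: u[0])` would rank (x, y) pairs on the auxiliary value x and return 30 pairs for measurement.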
When ranking is done on the auxiliary variable, let $(y_{[i]}, x_{(i)})$ denote the ith judgment ordering in the ith set for the study variable and the ith order statistic in the ith set for the auxiliary variable, respectively.
In the remainder of this article, estimators for the population mean under RSS are reviewed in Section 2, the estimator adapted from SRS to RSS is given in Section 3, and theoretical and numerical comparisons of this estimator with the existing adapted estimators in the literature are performed in Sections 4 and 5, respectively.
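2. Estimators in literature
The estimator of the population ratio using RSS, as defined by [19], is

$$\hat R_{RSS} = \frac{\bar y_{[n]}}{\bar x_{(n)}}, \qquad (2.1)$$

where $\bar y_{[n]} = \frac{1}{n}\sum_{i=1}^{n} y_{[i]}$ and $\bar x_{(n)} = \frac{1}{n}\sum_{i=1}^{n} x_{(i)}$. Note that the estimator in (2.1) can also be used for the population total and mean. Then, the estimator for the population mean can be written as follows [17]:

$$\bar y_{r,RSS} = \bar y_{[n]}\,\frac{\bar X}{\bar x_{(n)}}, \qquad (2.2)$$

where it is assumed that the population mean $\bar X$ of the auxiliary variable x is known. The MSE of the estimator in (2.2) can be given by

$$MSE(\bar y_{r,RSS}) = \frac{1}{mr}\left(S_y^2 - 2RS_{yx} + R^2S_x^2\right) - \frac{1}{m^2r}\left(\sum_{i=1}^{m}\tau_{y(i)}^2 - 2R\sum_{i=1}^{m}\tau_{yx(i)} + R^2\sum_{i=1}^{m}\tau_{x(i)}^2\right), \qquad (2.3)$$

where $R = \bar Y/\bar X$ is the population ratio, $S_x^2$ is the population variance of the auxiliary variable, $S_y^2$ is the population variance of the study variable, $S_{yx}$ is the population covariance between the auxiliary and study variables, $\tau_{x(i)} = (\mu_{x(i)} - \bar X)$, $\tau_{y(i)} = (\mu_{y(i)} - \bar Y)$ and $\tau_{yx(i)} = (\mu_{y(i)} - \bar Y)(\mu_{x(i)} - \bar X)$. Here, $\bar Y$ is the population mean of the study variable. Note that the values of $\mu_{x(i)}$ and $\mu_{y(i)}$ depend on the order statistics from some specific distributions, and these values can be found in [19]. Note also that the values of $\mu_{x(i)}$ and $\mu_{y(i)}$ can be taken to be the same in the absence of judgment error if the variables have the same distribution (see the appendix of [20]).
The following estimator, obtained by adapting [21] to RSS, was proposed by [22]:

$$\bar y_{k,RSS} = k\,\bar y_{[n]}\,\frac{\bar X}{\bar x_{(n)}}, \qquad (2.4)$$

where k is a constant.
The MSE of the estimator in (2.4) is given by

$$MSE(\bar y_{k,RSS}) = \frac{1}{mr}\left(k^{*2}S_y^2 - 2Rk^{*}S_{yx} + R^2S_x^2\right) + \bar Y^2(k^{*} - 1)^2 - \frac{1}{m^2r}\left(k^{*2}\sum_{i=1}^{m}\tau_{y(i)}^2 - 2Rk^{*}\sum_{i=1}^{m}\tau_{yx(i)} + R^2\sum_{i=1}^{m}\tau_{x(i)}^2\right), \qquad (2.5)$$

where

$$k^{*} = \frac{1 + \gamma\rho C_xC_y - W_{yx(i)}}{1 + \gamma C_y^2 - W_{y[i]}^2}.$$

Here, $W_{yx(i)} = \frac{1}{m^2r\bar X\bar Y}\sum_{i=1}^{m}\tau_{yx(i)}$ and $W_{y[i]}^2 = \frac{1}{m^2r\bar Y^2}\sum_{i=1}^{m}\tau_{y(i)}^2$, $\gamma = \frac{1}{mr}$, $C_x$ and $C_y$ are the population coefficients of variation of the auxiliary and study variables, respectively, and $\rho$ is the population correlation between the auxiliary and the study variables.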
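As an illustration of the two estimators reviewed in this section, the sketch below computes them from the measured RSS values. It assumes the standard forms, namely the ratio estimator $\bar y_{[n]}\bar X/\bar x_{(n)}$ and its version scaled by a constant k. The function name `ratio_estimator_rss` and its argument names are hypothetical.

```python
def ratio_estimator_rss(y, x, x_bar_pop, k=1.0):
    """Ratio-type estimate of the population mean of y from an RSS sample.

    y, x      : measured study and auxiliary values of the sampled units
    x_bar_pop : known population mean of the auxiliary variable
    k         : constant multiplier; k = 1 gives the usual ratio estimator,
                any other value gives the k-scaled estimator
    """
    if len(y) != len(x) or len(y) == 0:
        raise ValueError("y and x must be non-empty and of equal length")
    y_bar = sum(y) / len(y)   # sample mean of the study variable
    x_bar = sum(x) / len(x)   # sample mean of the auxiliary variable
    return k * y_bar * x_bar_pop / x_bar
```

Calling the function with the default `k=1.0` gives the ratio estimate; passing a value of k below 1 shrinks that estimate towards zero.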
3. Proposed estimator
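An improved difference cum-exponential ratio type estimator is defined for estimating $\bar Y$ as follows [18 and 21]:

$$\bar y_{RK} = \left\{t_1\bar y_{[n]} + t_2(\bar X - \bar x_{(n)})\right\}\exp\left(\frac{\bar X - \bar x_{(n)}}{\bar X + \bar x_{(n)}}\right), \qquad (3.1)$$

where $t_1$ and $t_2$ are constants. To obtain the MSE of $\bar y_{RK}$, write $\bar y_{[n]} = \bar Y(1 + e_0)$ and $\bar x_{(n)} = \bar X(1 + e_1)$, such that $E(e_0) = E(e_1) = 0$, and

$$E(e_0^2) = \frac{1}{mr}\frac{1}{\bar Y^2}\left[S_y^2 - \frac{1}{m}\sum_{i=1}^{m}\tau_{y(i)}^2\right] = \left[\gamma C_y^2 - W_{y[i]}^2\right],$$

$$E(e_1^2) = \frac{1}{mr}\frac{1}{\bar X^2}\left[S_x^2 - \frac{1}{m}\sum_{i=1}^{m}\tau_{x(i)}^2\right] = \left[\gamma C_x^2 - W_{x(i)}^2\right],$$

$$E(e_0e_1) = \frac{1}{mr}\frac{1}{\bar X\bar Y}\left[S_{yx} - \frac{1}{m}\sum_{i=1}^{m}\tau_{yx(i)}\right] = \left[\gamma C_{yx} - W_{yx(i)}\right],$$

where $\gamma = \frac{1}{mr}$, $C_{yx} = \rho C_yC_x$ and $W_{x(i)}^2 = \frac{1}{m^2r\bar X^2}\sum_{i=1}^{m}\tau_{x(i)}^2$. Expressing (3.1) in terms of e's,

$$\bar y_{RK} = \left\{t_1\bar Y(1 + e_0) + t_2\left(\bar X - \bar X(1 + e_1)\right)\right\}\exp\left(\frac{\bar X - \bar X(1 + e_1)}{\bar X + \bar X(1 + e_1)}\right)$$

$$= \left\{t_1\bar Y + t_1\bar Ye_0 - t_2\bar Xe_1\right\}\exp\left(-\frac{e_1}{2}\left(1 + \frac{e_1}{2}\right)^{-1}\right). \qquad (3.2)$$

Expanding the right-hand side of (3.2) and retaining terms up to the second power of e's,

$$\bar y_{RK} = \left\{t_1\bar Y + t_1\bar Ye_0 - t_2\bar Xe_1\right\}\exp\left(-\frac{e_1}{2} + \frac{e_1^2}{4}\right)$$

$$\bar y_{RK} = \left\{t_1\bar Y + t_1\bar Ye_0 - t_2\bar Xe_1\right\}\left\{1 - \frac{e_1}{2} + \frac{e_1^2}{4} + \frac{e_1^2}{8}\right\}. \qquad (3.3)$$

From (3.3),

$$\bar y_{RK} - \bar Y = \bar Y\left\{(t_1 - 1) + t_1e_0 - \frac{t_1e_1}{2} - t_2\frac{\bar X}{\bar Y}e_1 - \frac{t_1e_0e_1}{2} + t_2\frac{\bar X}{\bar Y}\frac{e_1^2}{2} + \frac{3t_1e_1^2}{8}\right\}. \qquad (3.4)$$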
Squaring (3.4) and then taking expectation of both sides, the MSE of the estimator is MS£(F$<) = F2{tx- t^ + t22fi2^3 - t^flp?}
Where,
<Pl = + C.- - 2Cyx] - {W,[j] + W2W - 2Wyx[j]}
<?2 = W® - {W.[i]}
r3 i r3
(3.3)
(3.4) (3.5)
= 7 (4Q2 - CyxJ + l^w^] - wy.["]J
<?4 = - Cyx} + {W*["] - Wyx[j]}
Obtain the optimum t1 and t2 to minimize MSf (?$<). Differentiate M5£(F$<) with respect to ^ and t2 and equating the derivatives to zero, optimum values of t1 and t2 is given by 2<P2<?3
10A£ -?42
= ^2^4 ^2— '
Substituting the value of ^ op£ and t2op£ in (3.5), we get the minimum value of MS£(Y$<) as MS£m"n(?$<) — ?2{t!- t^ - t22fi2^>)
(3.6)
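The derivation in this section relies on a second-order expansion of the exponential factor. As a quick numerical check, the sketch below compares the exact factor $\exp\{(\bar X - \bar x)/(\bar X + \bar x)\}$, with $\bar x = \bar X(1 + e_1)$, against the approximation $1 - e_1/2 + 3e_1^2/8$ (the sum of the $e_1^2/4$ and $e_1^2/8$ terms). The function names are hypothetical and the check is illustrative only.

```python
import math


def exact_exp_factor(e1):
    """exp((Xbar - xbar) / (Xbar + xbar)) with xbar = Xbar * (1 + e1).

    The population mean Xbar cancels, leaving exp(-e1 / (2 + e1)).
    """
    return math.exp(-e1 / (2.0 + e1))


def second_order_factor(e1):
    """Expansion of the same factor up to the second power of e1."""
    return 1.0 - e1 / 2.0 + 3.0 * e1 ** 2 / 8.0


for e1 in (-0.1, -0.01, 0.01, 0.1):
    gap = abs(exact_exp_factor(e1) - second_order_factor(e1))
    print(f"e1 = {e1:+.2f}  absolute gap = {gap:.2e}")
```

The gap shrinks roughly with the cube of $e_1$, which is what one expects when terms beyond the second power are dropped.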
4. Efficiency
In this section, the performance of the proposed estimator is compared with that of the traditional ratio estimator in RSS and that of the estimator of [23], respectively, as follows:
MS£(yr$ss) - MS£mj-(F$K) > 0 {(1 - tx2)^ + t^ + t22fi2^>} > 0 (4.1)
MS£(5w) - ) > 0
{(fc* - 1)2 + (1 - tx2)^ + t^ + t22fi2^>} > 0
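(4.2)
Table 1: Some members of the exponential ratio type estimator in ranked set sampling
Estimator | $t_1$ | $t_2$
$\bar y_{RK1} = \{\bar y_{[n]} + (\bar X - \bar x_{(n)})\}\exp\left(\frac{\bar X - \bar x_{(n)}}{\bar X + \bar x_{(n)}}\right)$ | 1 | 1
$\bar y_{RK2} = \{(\bar X - \bar x_{(n)})\}\exp\left(\frac{\bar X - \bar x_{(n)}}{\bar X + \bar x_{(n)}}\right)$ | 0 | 1
$\bar y_{RK3} = \{\bar y_{[n]}\}\exp\left(\frac{\bar X - \bar x_{(n)}}{\bar X + \bar x_{(n)}}\right)$ | 1 | 0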
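The members listed in Table 1 can be computed with one small function. The sketch below assumes the estimator has the form $\{t_1\bar y + t_2(\bar X - \bar x)\}\exp\{(\bar X - \bar x)/(\bar X + \bar x)\}$, with the sample means taken from the RSS sample. The names `diff_cum_exp_ratio` and `SPECIAL_CASES` are hypothetical.

```python
import math


def diff_cum_exp_ratio(y_bar, x_bar, x_bar_pop, t1, t2):
    """Difference cum exponential ratio type estimate of the population mean.

    y_bar, x_bar : RSS sample means of the study and auxiliary variables
    x_bar_pop    : known population mean of the auxiliary variable
    t1, t2       : constants that select a member of the family
    """
    difference_part = t1 * y_bar + t2 * (x_bar_pop - x_bar)
    exponential_part = math.exp((x_bar_pop - x_bar) / (x_bar_pop + x_bar))
    return difference_part * exponential_part


# (t1, t2) pairs for the three members listed in Table 1.
SPECIAL_CASES = {"RK1": (1, 1), "RK2": (0, 1), "RK3": (1, 0)}


def special_case(name, y_bar, x_bar, x_bar_pop):
    """Evaluate one of the Table 1 members by name."""
    t1, t2 = SPECIAL_CASES[name]
    return diff_cum_exp_ratio(y_bar, x_bar, x_bar_pop, t1, t2)
```

When the sample mean of the auxiliary variable equals its population mean, the exponential factor is 1 and the estimate reduces to $t_1\bar y$.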
5. Numerical example
To observe the performance of the estimators, we use some real-life populations. The descriptions of these populations are given below:
Population I {source: [24]}
Y: Acceleration of automobiles
X: Engine horsepower of automobiles
Objective: To estimate the population mean of acceleration of automobiles. The summary statistics are given below:
N = 392, n = 30, m = 10, r = 3, $\mu_x$ = 104.4694, $\mu_y$ = 15.5413, $S_y$ = 2.7589, $S_x$ = 38.4912, $C_x$ = 0.3684, $C_y$ = 0.1775, $C_{xy}$ = -0.0451, $\beta_2(x)$ = 0.6541, $\beta_1(x)$ = 1.079, $\rho_{xy}$ = 0.9091
Population II {source: [25]}
Y: Body Mass Index (BMI) of Crohn's disease patients
X: Weight of Crohn's disease patients
Objective: To estimate the population mean of Body Mass Index (BMI) of Crohn's disease patients. The summary statistics are given below:
N = 117, n = 20, m = 5, r = 4, $\mu_x$ = 69.0256, $\mu_y$ = 26.0624, $S_y$ = 4.9888, $S_x$ = 14.2438, $C_x$ = 0.2063, $C_y$ = 0.1914, $C_{xy}$ = 0.0325, $\beta_2(x)$ = 0.7746, $\beta_1(x)$ = 0.6571, $\rho_{xy}$ = 0.8222
Population III {source: [26]}
Y: Body Mass Index (BMI)
X: Thigh Circumference
Objective: To estimate the population mean of Body Mass Index (BMI). The summary statistics are given below:
N = 36, n = 8, m = 4, r = 2, $\mu_x$ = 49.3806, $\mu_y$ = 25.678, $S_y$ = 3.8198, $S_x$ = 3.7599, $C_x$ = 0.0761, $C_y$ = 0.1488, $C_{xy}$ = 0.0066, $\beta_2(x)$ = -0.6159, $\beta_1(x)$ = -0.0607, $\rho_{xy}$ = 0.9848
Percent Relative Efficiencies (PREs) of the proposed estimators, along with competitor estimators from the literature, are presented in Tables 2, 3 and 4 for the different real-life populations.
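The text does not spell out how a PRE is computed. A common convention, assumed here, is that the PRE of an estimator relative to a reference estimator equals 100 times the ratio of the reference MSE to the estimator's MSE, so values above 100 favour the estimator. The sketch below builds a lower-triangular PRE layout like the ones in the tables. The function names and the MSE values are hypothetical and are not taken from the paper.

```python
def percent_relative_efficiency(mse_reference, mse_estimator):
    """PRE of an estimator relative to a reference (assumed convention)."""
    if mse_reference <= 0 or mse_estimator <= 0:
        raise ValueError("MSE values must be positive")
    return 100.0 * mse_reference / mse_estimator


def pre_table(mse_by_name):
    """Lower-triangular PRE table: each row is compared with itself and
    with every estimator listed before it."""
    names = list(mse_by_name)
    table = {}
    for i, row in enumerate(names):
        table[row] = {
            col: percent_relative_efficiency(mse_by_name[col], mse_by_name[row])
            for col in names[: i + 1]
        }
    return table


# Hypothetical MSE values, for illustration only.
example_mse = {"ratio": 4.0, "k_type": 2.0, "proposed": 1.0}
example_table = pre_table(example_mse)
```

With these made-up values, the last row of `example_table` holds the PREs of the hypothetical "proposed" entry relative to each earlier entry and to itself.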
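Table 2: PRE of Estimators for Population I
Estimator | $\bar y_{r,RSS}$ | $\bar y_{k,RSS}$ | $\bar y_{RK1}$ | $\bar y_{RK2}$ | $\bar y_{RK3}$ | $\bar y_{RK}$
$\bar y_{r,RSS}$ | 100 | | | | |
$\bar y_{k,RSS}$ | 212.19 | 100 | | | |
$\bar y_{RK1}$ | 245.19 | 231.72 | 100 | | |
$\bar y_{RK2}$ | 241.45 | 210.47 | 98.81 | 100 | |
$\bar y_{RK3}$ | 238.97 | 189.37 | 90.84 | 93.08 | 100 |
$\bar y_{RK}$ | 361.74 | 275.18 | 249.18 | 245.15 | 213.49 | 100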
Table 2 reports the percent relative efficiencies (PREs) of the estimators for Population I. The proposed difference cum exponential ratio type estimator in ranked set sampling is the best estimator for Population I, in the sense of having the highest PRE relative to the existing estimators $\bar y_{r,RSS}$ and $\bar y_{k,RSS}$. The generalized form of the proposed estimator has a PRE of 361.74 relative to the existing estimator $\bar y_{r,RSS}$ and of 275.18 relative to $\bar y_{k,RSS}$.
Moreover, the special cases $\bar y_{RK1}$, $\bar y_{RK2}$ and $\bar y_{RK3}$ of the proposed generalized estimator are also more efficient than the existing estimators. These results suggest using the proposed difference cum exponential ratio type estimator to estimate the population mean of acceleration of automobiles.
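Table 3: PRE of Estimators for Population II
Estimator | $\bar y_{r,RSS}$ | $\bar y_{k,RSS}$ | $\bar y_{RK1}$ | $\bar y_{RK2}$ | $\bar y_{RK3}$ | $\bar y_{RK}$
$\bar y_{r,RSS}$ | 100 | | | | |
$\bar y_{k,RSS}$ | 204.74 | 100 | | | |
$\bar y_{RK1}$ | 238.48 | 238.29 | 100 | | |
$\bar y_{RK2}$ | 237.37 | 204.28 | 93.92 | 100 | |
$\bar y_{RK3}$ | 221.49 | 174.28 | 89.32 | 82.74 | 100 |
$\bar y_{RK}$ | 352.86 | 252.48 | 248.82 | 229.23 | 190.48 | 100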
Table 3 reports the percent relative efficiencies (PREs) of the estimators for Population II. The proposed difference cum exponential ratio type estimator in ranked set sampling is again the best estimator for Population II, in the sense of having the highest PRE relative to the existing estimators $\bar y_{r,RSS}$ and $\bar y_{k,RSS}$. The generalized form of the proposed estimator has a PRE of 352.86 relative to the existing estimator $\bar y_{r,RSS}$ and of 252.48 relative to $\bar y_{k,RSS}$. Moreover, the special cases $\bar y_{RK1}$, $\bar y_{RK2}$ and $\bar y_{RK3}$ of the proposed generalized estimator are also more efficient than the existing estimators. These results suggest using the proposed difference cum exponential ratio type estimator to estimate the population mean of Body Mass Index (BMI) of Crohn's disease patients.
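Table 4: PRE of Estimators for Population III
Estimator | $\bar y_{r,RSS}$ | $\bar y_{k,RSS}$ | $\bar y_{RK1}$ | $\bar y_{RK2}$ | $\bar y_{RK3}$ | $\bar y_{RK}$
$\bar y_{r,RSS}$ | 100 | | | | |
$\bar y_{k,RSS}$ | 238.48 | 100 | | | |
$\bar y_{RK1}$ | 275.28 | 264.82 | 100 | | |
$\bar y_{RK2}$ | 263.82 | 249.27 | 98.47 | 100 | |
$\bar y_{RK3}$ | 239.83 | 237.42 | 97.38 | 98.37 | 100 |
$\bar y_{RK}$ | 384.27 | 283.38 | 259.37 | 278.38 | 239.57 | 100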
Table 4 reports the percent relative efficiencies (PREs) of the estimators for Population III. The proposed difference cum exponential ratio type estimator in ranked set sampling is again the best estimator for Population III, in the sense of having the highest PRE relative to the existing estimators $\bar y_{r,RSS}$ and $\bar y_{k,RSS}$. The generalized form of the proposed estimator has a PRE of 384.27 relative to the existing estimator $\bar y_{r,RSS}$ and of 283.38 relative to $\bar y_{k,RSS}$. Moreover, the special cases $\bar y_{RK1}$, $\bar y_{RK2}$ and $\bar y_{RK3}$ of the proposed generalized estimator are also more efficient than the existing estimators. These results suggest using the proposed difference cum exponential ratio type estimator to estimate the population mean of Body Mass Index (BMI).
6. Conclusion
In this article, an improved difference cum exponential ratio type estimator has been proposed under the RSS design. The mathematical form of the estimator has been derived, and its efficiency conditions have been formulated with respect to some existing estimators from the literature. Further, we present some special cases of the proposed estimator, obtained by assigning different values to the constants used in its construction. To compare the efficiency of the proposed estimator with that of some existing estimators, we used real-life populations to estimate the population mean of acceleration of automobiles, the population mean of Body Mass Index (BMI) of Crohn's disease patients, and the population mean of Body Mass Index (BMI). The results from these populations show that the proposed estimator and its special cases perform efficiently compared with the existing estimators. We also observe that the efficiency of the proposed estimator and its special cases increases as the correlation between the study and auxiliary variables increases. Therefore, the proposed estimator is recommended for estimating the population mean when the correlation between the study and auxiliary variables is strongly positive.
References
[1] Cekim, H.O., Kadilar, C. (2018). New families of unbiased estimators in stratified random sampling. Journal of Statistics and Management Systems, 21: 1481-1499.
[2] Qureshi, M.N., Kadilar, C., Noor ul Amin, M. Hanif, M. (2018). Rare and clustered population estimation using the adaptive cluster sampling with some robust measures. Journal of Statistical Computation and Simulation, 88: 2761-2774.
[3] Singh, G.N., Singh, A.K., Kadilar, C. (2018). Almost unbiased estimation procedures of population mean in two-occasion successive sampling. Hacettepe Journal of Mathematics and Statistics, 47: 1268-1280.
[4] Irfan, M., Javed, M., Lin, Z. (2018). Efficient ratio-type estimators of finite population mean based on correlation coefficient. Scientia Iranica, 25: 2361-2372.
[5] Irfan, M., Javed, M., Lin, Z. (2019). Enhanced estimation of population mean in the presence of auxiliary information. Journal of King Saud University,31: 1373-1378.
[6] Iqbal, K., Moeen, M., Ali, AL., et al. (2020). Mixture regression cum ratio estimators of population mean under stratified random sampling. Journal of Statistical Computation and Simulation. 90: 854-868.
[7] Qureshi, M.N., Kadilar, C., Hanif, M. (2020). Estimation of rare and clustered population mean using stratified adaptive cluster sampling, Environmental and Ecological Statistics, 27: 151-170.
[8] Zaman, T., Kadilar, C. (2020). On estimating the population mean using auxiliary character in stratified random sampling. Journal of Statistics and Management Systems, 23: 1415-1426.
[9] Ali, A., Butt, M. M., Iqbal, K., et al. (2021). Estimation of Population Mean by Using a Generalized Family of Estimators Under Classical Ranked Set Sampling. RMS: Research in Mathematics & Statistics, 8:1, 1948184.
[10] Ali, A., Butt, M.M., Azad, M.D. (2021). Stratified Extreme-cum-Median Ranked Set Sampling. Pakistan Journal of Statistics, 37: 215-235.
[11] Rather, K.U.I., Kadilar, C. (2021). Exponential type estimator for the population mean under ranked set sampling. Journal of Statistics: Advances in Theory and Applications, 25: 1-12.
[12] Zaman, T., Kadilar, C. (2019). Novel family of exponential estimators using information of auxiliary attribute. Journal of Statistics and Management Systems, 22: 1499-1509.
[13] Zaman, T. (2020). Generalized exponential estimators for the finite population mean. Statistics in Transition, 21: 159-168.
[14] Zaman, T., Kadilar, C. (2021). New class of exponential estimators for finite population mean in two-phase sampling, Communications in Statistics - Theory and Methods, 50: 874-889.
[15] Rather, K.U.I., Eda, K.G., Unal, C., Jeelani, M.I. (2022). New exponential ratio estimator in ranked set sampling. Pakistan Journal of Statistics and Operation Research, 18: 403-409.
[16] McIntyre, G. A. (1952). A method for unbiased selective sampling using ranked sets. Australian Journal of Agricultural Research, 3: 385-390.
[17] Al-Omari, A.I., Bouza, C.N. (2014). Review of ranked set sampling: Modifications and applications, Revista Investigacion Operacional, 35: 215-235.
[18] Samawi, H. M., Muttlak, H. A., (1996). Estimation of ratio using rank set sampling. Biometrical Journal, 38: 753-764.
[19] Arnold, B. C., Balakrishnan N., Nagaraja, H.N., (1993). A first course in order statistics. John Wiley, New York.
[20] Dell, T. R., Clutter, J. L., (1972). Ranked set sampling theory with order statistics background. Biometrics, 28: 545-555.
[21] Prasad, B., (1989). Some improved ratio type estimators of population mean and ratio in finite population sample surveys. Communications in Statistics-Theory and Methods, 18: 379-392.
[22] Kadilar, C., Unyazici, Y., Cingi, H. (2009). Ratio estimator for the population mean using ranked set sampling. Statistical Papers, 50: 301-309.
[23] Kadilar, C., Cingi, H., (2005). A new ratio estimator in stratified random sampling. Communications in Statistics: Theory and Methods, 34: 597-602.
[24] James, G., Witten, D., Hastie, T., et al. (2013). An introduction to statistical learning (Vol. 112, p. 18). New York: Springer.
[25] Daly, M.J., Rioux, J.D., Schaffner, S.F., (2001). High-resolution haplotype structure in the human genome. Nature genetics, 29: 229-232.
[26] Husby, C.E., Stasny, E.A., Wolfe, D.A. (2005). An application of ranked set sampling for mean and median estimation using USDA crop production data. Journal of agricultural, biological, and environmental statistics, 10: 354-373.