RATIO ESTIMATOR OF POPULATION MEAN USING A NEW LINEAR COMBINATION UNDER RANKED SET SAMPLING
Saba Riyaz1, Khalid Ul Islam Rather2*, Showkat Maqbool3 and T. R. Jan1
Department of Statistics, University of Kashmir, Srinagar - 190006, J&K, India1,4
Division of Statistics and Computer Science, SKUAST-Jammu, India.2
Division of AGB, FVSC & AH, SKUAST-K, Shuhama, J&K, India3
Abstract
Ranked set sampling (RSS) is an approach to data collection that combines simple random sampling with the field investigator's professional knowledge and judgment to pick places to collect samples, and it continues to stimulate substantial methodological research. Alternatively, field screening measurements can replace professional judgment when appropriate. The use of ranked set sampling increases the chance that the collected samples will yield representative measurements. This results in better estimates of the mean as well as improved performance of many statistical procedures. Moreover, ranked set sampling can be more cost-efficient than simple random sampling because fewer samples need to be collected and measured. The use of professional judgment in the process of selecting sampling locations is a powerful incentive to use ranked set sampling. In this paper, we introduce an approach to mean estimation in ranked set sampling that uses the information carried by a single auxiliary variable X. Building on the basic ratio and the generalized exponential ratio estimators, we suggest an improved difference cum exponential ratio type estimator under ranked set sampling for the population mean of the study variate Y. The expression for the mean squared error of the proposed estimator under ranked set sampling is derived, and theoretical comparisons are made with competing estimators. We show that the proposed estimator has a lower mean squared error than the existing estimators. In addition, these theoretical results are supported with the aid of some real data sets using RStudio.
The estimator's mathematical form has been developed, and its efficiency conditions have been derived in relation to various existing estimators from the literature. By assigning various values to the constants used in the construction of the proposed estimator, we also provide several special cases of the estimator.
Keywords: mean squared error, auxiliary variable, median, coefficient of variation, kurtosis
1.Introduction
It is well known that information on an auxiliary variable is commonly used to increase efficiency and precision in sample surveys. It also plays a role in the related methods of estimation, such as ratio, product, and regression. If the correlation between the study variable (Y) and the auxiliary variable (X) is highly positive, the ratio method of estimation is used; if this correlation is highly negative, the product method of estimation is employed effectively. In recent years, there have been many articles on estimators for the population mean in the sampling theory literature, such as unbiased estimators in general form for estimating the finite population mean in stratified random sampling [1]; generalized ratio estimators using some robust measures with a single auxiliary variable [2 and 3]; efficient families of ratio-type estimators of the finite population mean using a known correlation coefficient between the study variable and the auxiliary variable [5 and 6]; and estimation of a rare and clustered population mean using stratified adaptive cluster sampling and using an auxiliary character in stratified random sampling [7 and 8]. Estimation of the population mean using an auxiliary attribute under ranked set sampling (RSS) was studied in [9, 10 and 11]. Exponential estimators of the population mean under RSS using an attribute, and under two-phase sampling, were considered by [12, 13, 14, and 15].
In addition to the Simple Random Sampling (SRS) method, RSS, which may be considered a controlled random sampling design, was first introduced to estimate pasture yield by [16]. The RSS procedure involves randomly drawing n sets of n units each from the population for which the mean is to be estimated. It is assumed that the units in each set can be ranked visually. From the first set of n units, the unit ranked lowest is measured. From the second set of n units, the unit ranked second lowest is measured. This process continues until the nth ranked unit is measured from the nth set. The gain in efficiency was illustrated by a computation involving five distributions in [16]. As a simple introduction to the concept of RSS, when X is a random variable with a density function F(x) and $(x_1, x_2, \ldots, x_n)$ are the unobserved values from n units, we may rank them by visual inspection or based on a concomitant variable. RSS involves selecting one unit from every ranked set of m units for quantification.
The RSS method can be briefly described step by step as follows:
Step 1: Randomly select $m^2$ units from the target population.
Step 2: Allocate the $m^2$ selected units as randomly as possible into m sets, each of size m.
Step 3: Without knowing any values of the variable of interest, rank the units within each set with respect to the variable of interest. This may be based on personal professional judgment or done with a concomitant variable correlated with the variable of interest.
Step 4: Choose a sample for actual quantification by including the smallest ranked unit in the first set, the second smallest ranked unit in the second set, and so on, until the largest ranked unit is selected from the last set.
Step 5: Repeat Steps 1 through 4 for n cycles to obtain a sample of size mn for actual quantification.
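The five steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used for the results in this paper: the function name `ranked_set_sample`, its parameters, and the choice to draw each cycle independently from the full population are assumptions made here for illustration.

```python
import random


def ranked_set_sample(population, m, n_cycles, rng, rank_key=lambda unit: unit):
    """Return a ranked set sample of size m * n_cycles (illustrative sketch).

    population : list of units (numbers, or tuples such as (x, y))
    m          : set size
    n_cycles   : number of cycles (Step 5)
    rng        : a random.Random instance
    rank_key   : value used for ranking within a set (Step 3), e.g. the
                 auxiliary variable when units are (x, y) pairs
    """
    sample = []
    for _cycle in range(n_cycles):
        # Step 1: draw m^2 units without replacement (assumption: each cycle
        # is drawn independently from the whole population).
        units = rng.sample(population, m * m)
        # Step 2: rng.sample returns the units in random order, so slicing
        # allocates them at random into m sets of size m.
        sets = [units[i * m:(i + 1) * m] for i in range(m)]
        for i, one_set in enumerate(sets):
            # Step 3: rank the units within the set.
            ranked = sorted(one_set, key=rank_key)
            # Step 4: keep the (i+1)-th smallest unit of the (i+1)-th set.
            sample.append(ranked[i])
    return sample


if __name__ == "__main__":
    rng = random.Random(2023)
    # Hypothetical population of (x, y) pairs; ranking is done on x only.
    pop = [(x, 2 * x + rng.gauss(0, 1)) for x in range(1, 50)]
    print(ranked_set_sample(pop, m=4, n_cycles=2, rng=rng, rank_key=lambda u: u[0]))
```

Passing `rank_key=lambda u: u[0]` ranks on the auxiliary variable only, so the study variable is never inspected before the final units are chosen.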
When ranking is done on the auxiliary variable, let $y_{[i]}$ and $x_{(i)}$ denote the ith judgment ordering in the ith set for the study variable and the ith order statistic in the ith set for the auxiliary variable, respectively.
In the remainder of this article, existing estimators of the population mean are reviewed in Section 2, the estimator adapted from SRS to RSS is given in Section 3, and theoretical and numerical comparisons of the adapted estimator with existing estimators in the literature are made in Sections 4 and 5, respectively.
2.Estimators in literature
The classical ratio estimator for the population mean $\bar{Y}$, presented by [1], is given as:
$\bar{y}_{R} = \frac{\bar{y}}{\bar{x}}\bar{X} = \hat{R}\bar{X}$
(2.1)
where $\bar{X}$, the population mean of the auxiliary variable $x$, is a known value, $\bar{y}$ is the sample mean of the study variable, $\bar{x}$ is the sample mean of the auxiliary variable, and $\hat{R} = \bar{y}/\bar{x}$.
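The mean squared error of the ratio estimator, to the first degree of approximation, is given as
$MSE(\bar{y}_{R}) \cong \theta \bar{Y}^2 (C_y^2 + C_x^2 - 2\rho C_x C_y)$
or
$MSE(\bar{y}_{R}) \cong \theta (S_y^2 + R^2 S_x^2 - 2R\rho S_x S_y)$
(2.2)
where $\theta = \frac{1}{n}$ (on ignoring $f = \frac{n}{N}$), $R = \frac{\bar{Y}}{\bar{X}}$, $C_x = \frac{S_x}{\bar{X}}$, $C_y = \frac{S_y}{\bar{Y}}$, $\rho = \frac{S_{xy}}{S_x S_y}$, $S_{xy} = \frac{\sum_{i=1}^{N}(x_i - \bar{X})(y_i - \bar{Y})}{N-1}$,
$S_x^2 = \frac{\sum_{i=1}^{N}(x_i - \bar{X})^2}{N-1}$ and $S_y^2 = \frac{\sum_{i=1}^{N}(y_i - \bar{Y})^2}{N-1}$.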
A modified ratio estimator for $\bar{Y}$ is given by [15] as
$\bar{y}_{SD} = \bar{y}\left[\frac{\bar{X} + C_x}{\bar{x} + C_x}\right]$
(2.3)
where $C_x$ is the known coefficient of variation of the auxiliary variable. For the case when the coefficient of kurtosis is known, [14] suggested the estimator
$\bar{y}_{SK} = \bar{y}\left[\frac{\bar{X} + \beta_2(x)}{\bar{x} + \beta_2(x)}\right]$
(2.4a)
Among the ratio and product estimators suggested by [18], the ratio estimator is:
$\bar{y}_{US} = \bar{y}\left[\frac{\bar{X} C_x + \beta_2(x)}{\bar{x} C_x + \beta_2(x)}\right]$
(2.4b)
where $C_x$ and $\beta_2(x)$ are the known values of the coefficient of variation and the coefficient of kurtosis of the auxiliary variable.
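To the first degree of approximation, the mean squared errors (MSE) of these estimators are:
$MSE(\bar{y}_{SD}) \cong \theta \bar{Y}^2 [C_y^2 + \theta_1^2 C_x^2 - 2\theta_1 \rho C_x C_y]$
(2.5)
$MSE(\bar{y}_{SK}) \cong \theta \bar{Y}^2 [C_y^2 + \theta_2^2 C_x^2 - 2\theta_2 \rho C_x C_y]$
(2.6)
$MSE(\bar{y}_{US}) \cong \theta \bar{Y}^2 [C_y^2 + \theta_3^2 C_x^2 - 2\theta_3 \rho C_x C_y]$
(2.7)
where $\theta_1 = \frac{\bar{X}}{\bar{X} + C_x}$, $\theta_2 = \frac{\bar{X}}{\bar{X} + \beta_2(x)}$ and $\theta_3 = \frac{\bar{X} C_x}{\bar{X} C_x + \beta_2(x)}$.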
RATIO AND PRODUCT ESTIMATORS UNDER RSS
The classical ratio estimator proposed by [13] under ranked set sampling is given by: $\hat{R}_{RSS} = \frac{\bar{y}_{[n]}}{\bar{x}_{(n)}}$
where $\bar{y}_{[n]} = \frac{1}{n}\sum_{i=1}^{n} y_{[i]}$ and $\bar{x}_{(n)} = \frac{1}{n}\sum_{i=1}^{n} x_{(i)}$ are the ranked set sample means for the variables $y$ and $x$, respectively.
Assuming that the population mean of the auxiliary variable is known, [4] gave the ratio estimator using ranked set sampling as
$\bar{y}_{R,RSS} = \bar{y}_{[n]}\left(\frac{\bar{X}}{\bar{x}_{(n)}}\right)$
(2.8)
Its bias and MSE are given by
$B(\bar{y}_{R,RSS}) \cong \bar{Y}[\theta(C_x^2 - \rho C_x C_y) - (W_{x(i)}^2 - W_{yx(i)})]$
(2.9)
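$MSE(\bar{y}_{R,RSS}) \cong \frac{1}{mr}(S_y^2 + R^2 S_x^2 - 2R S_{yx}) - \frac{1}{m^2 r}\left(\sum_{i=1}^{m}\tau_{y[i]}^2 + R^2 \sum_{i=1}^{m}\tau_{x(i)}^2 - 2R\sum_{i=1}^{m}\tau_{yx(i)}\right)$
(2.10)
where $W_{yx(i)} = \frac{1}{m^2 r \bar{X}\bar{Y}}\sum_{i=1}^{m}\tau_{yx(i)}$, $W_{x(i)}^2 = \frac{1}{m^2 r \bar{X}^2}\sum_{i=1}^{m}\tau_{x(i)}^2$ and $W_{y[i]}^2 = \frac{1}{m^2 r \bar{Y}^2}\sum_{i=1}^{m}\tau_{y[i]}^2$, with $\tau_{x(i)} = \mu_{x(i)} - \bar{X}$, $\tau_{y[i]} = \mu_{y[i]} - \bar{Y}$ and $\tau_{yx(i)} = (\mu_{y[i]} - \bar{Y})(\mu_{x(i)} - \bar{X})$.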
The following modified ratio estimator of the finite population mean was presented by [7], using the coefficient of variation of an auxiliary variable in ranked set sampling.
$\bar{y}_{M1,RSS} = \bar{y}_{[n]}\left[\frac{\bar{X} + C_x}{\bar{x}_{(n)} + C_x}\right]$
(2.11)
Its bias has been derived as
$B(\bar{y}_{M1,RSS}) = \bar{Y}[\theta\{\lambda^2 C_x^2 - \lambda\rho C_x C_y\} - \{\lambda^2 W_{x(i)}^2 - \lambda W_{yx(i)}\}]$
(2.12)
and its MSE as
$MSE(\bar{y}_{M1,RSS}) = \bar{Y}^2[\theta\{C_y^2 + \lambda^2 C_x^2 - 2\lambda\rho C_x C_y\} - \{W_{y[i]} - \lambda W_{x(i)}\}^2]$
(2.13)
where $\lambda = \frac{\bar{X}}{\bar{X} + C_x}$.
They have also proposed a ratio-type estimator utilizing the coefficients of variation and kurtosis under RSS as
$\bar{y}_{M2,RSS} = \bar{y}_{[n]}\left[\frac{\bar{X} C_x + \beta_2(x)}{\bar{x}_{(n)} C_x + \beta_2(x)}\right]$
(2.14)
with bias and MSE given as
$B(\bar{y}_{M2,RSS}) = \bar{Y}[\theta\{\lambda^2 C_x^2 - \lambda\rho C_x C_y\} - \{\lambda^2 W_{x(i)}^2 - \lambda W_{yx(i)}\}]$
$MSE(\bar{y}_{M2,RSS}) = \bar{Y}^2[\theta\{C_y^2 + \lambda^2 C_x^2 - 2\lambda\rho C_x C_y\} - \{W_{y[i]} - \lambda W_{x(i)}\}^2]$
(2.15)
where $\lambda = \frac{\bar{X} C_x}{\bar{X} C_x + \beta_2(x)}$.
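The RSS estimators reviewed in this section share one structure: the ranked set sample mean of the study variable is multiplied by a ratio in which a known constant is added to both the population mean and the ranked set sample mean of the auxiliary variable. The sketch below illustrates that structure only. The function name `shifted_ratio_estimate` and the parameter `shift` are hypothetical, and the forms assumed for the three estimators (no shift for the ratio estimator of [4], a shift of the coefficient of variation for the first estimator of [7], and the coefficient-of-variation and kurtosis form for the second) are the readings adopted here, not a definitive implementation.

```python
def shifted_ratio_estimate(y_bar_rss, x_bar_rss, X_bar, shift=0.0):
    """Ratio-type estimate y_bar * (X_bar + shift) / (x_bar + shift).

    y_bar_rss : ranked set sample mean of the study variable
    x_bar_rss : ranked set sample mean of the auxiliary variable
    X_bar     : known population mean of the auxiliary variable
    shift     : known constant added to numerator and denominator
                (hypothetical name; 0 gives the plain ratio estimator)
    """
    return y_bar_rss * (X_bar + shift) / (x_bar_rss + shift)


def cv_kurtosis_estimate(y_bar_rss, x_bar_rss, X_bar, C_x, beta2_x):
    """Estimate of the form y_bar * (X_bar*C_x + beta2) / (x_bar*C_x + beta2).

    Dividing numerator and denominator by C_x shows that this is the same
    as a shifted ratio estimate with shift = beta2 / C_x (for C_x != 0).
    """
    return y_bar_rss * (X_bar * C_x + beta2_x) / (x_bar_rss * C_x + beta2_x)


if __name__ == "__main__":
    # Made-up summary values, for illustration only.
    y_bar, x_bar, X_bar, C_x, beta2 = 10.0, 4.0, 5.0, 0.5, 3.0
    print(shifted_ratio_estimate(y_bar, x_bar, X_bar))             # plain ratio form
    print(shifted_ratio_estimate(y_bar, x_bar, X_bar, shift=C_x))  # shift by C_x
    print(cv_kurtosis_estimate(y_bar, x_bar, X_bar, C_x, beta2))   # C_x and kurtosis form
```

Writing the estimators through a single `shift` argument makes it easy to see that they differ only in the known constant, which is also the quantity that enters the first-order bias and MSE through the multiplier on the auxiliary-variable terms.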
3.Proposed estimator
We propose the following ratio-type estimator for $\bar{Y}$ under the RSS scheme, using a linear combination of known population parameters of the auxiliary variable such as the coefficient of variation, the kurtosis, the median and the tri-mean. Motivated by [7, 15], a ratio estimator based on RSS is proposed as
$\bar{y}_{PRSS} = \bar{y}_{[n]}\left[\frac{\bar{X} + C_x M_d}{\bar{x}_{(n)} + C_x M_d}\right]$
(3.1)
where $C_x$ is the coefficient of variation and $M_d$ is the median of the auxiliary variable $x$, both of which are assumed to be known.
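To obtain the bias and MSE of $\bar{y}_{PRSS}$, we put $\bar{y}_{[n]} = \bar{Y}(1+\epsilon_0)$ and $\bar{x}_{(n)} = \bar{X}(1+\epsilon_1)$, so that $E(\epsilon_0) = E(\epsilon_1) = 0$ and, with $\theta = \frac{1}{mr}$,
$V(\epsilon_0) = E(\epsilon_0^2) = \frac{V(\bar{y}_{[n]})}{\bar{Y}^2} = \frac{1}{mr}\frac{1}{\bar{Y}^2}\left[S_y^2 - \frac{1}{m}\sum_{i=1}^{m}\tau_{y[i]}^2\right] = [\theta C_y^2 - W_{y[i]}^2]$
Similarly, $V(\epsilon_1) = E(\epsilon_1^2) = [\theta C_x^2 - W_{x(i)}^2]$
and $Cov(\epsilon_0, \epsilon_1) = E(\epsilon_0 \epsilon_1) = \frac{Cov(\bar{y}_{[n]}, \bar{x}_{(n)})}{\bar{X}\bar{Y}} = \frac{1}{mr}\frac{1}{\bar{X}\bar{Y}}\left[S_{yx} - \frac{1}{m}\sum_{i=1}^{m}\tau_{yx(i)}\right] = [\theta\rho C_x C_y - W_{yx(i)}]$
To find the first degree of approximation, we assume that the sample size is large enough for $|\epsilon_0|$ and $|\epsilon_1|$ to be small, so that terms involving $\epsilon_0$ and/or $\epsilon_1$ of degree higher than two are negligible.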
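The bias and MSE can be found as below:
$B(\bar{y}_{PRSS}) = E(\bar{y}_{PRSS}) - \bar{Y}$
Here $\bar{y}_{PRSS} = \bar{Y}(1+\epsilon_0)(1+\lambda\epsilon_1)^{-1}$, where $\lambda = \frac{\bar{X}}{\bar{X} + C_x M_d}$.
Suppose $|\lambda\epsilon_1| < 1$, so that $(1+\lambda\epsilon_1)^{-1}$ is expandable:
$\bar{y}_{PRSS} = \bar{Y}(1+\epsilon_0)\{1 - \lambda\epsilon_1 + \lambda^2\epsilon_1^2 - \ldots\}$
(using the Taylor series expansion [20], where terms in $\epsilon_1$ of power higher than two are neglected). Taking expectations and using $E(\epsilon_0) = E(\epsilon_1) = 0$,
$B(\bar{y}_{PRSS}) = \bar{Y}[\lambda^2 E(\epsilon_1^2) - \lambda E(\epsilon_0 \epsilon_1)]$
$= \bar{Y}[\lambda^2\{\theta C_x^2 - W_{x(i)}^2\} - \lambda\{\theta\rho C_x C_y - W_{yx(i)}\}]$
$= \bar{Y}[\theta\{\lambda^2 C_x^2 - \lambda\rho C_x C_y\} - \{\lambda^2 W_{x(i)}^2 - \lambda W_{yx(i)}\}]$
Now $MSE(\bar{y}_{PRSS}) = E(\bar{y}_{PRSS} - \bar{Y})^2 = \bar{Y}^2 E[\epsilon_0 - \lambda\epsilon_1 + \lambda^2\epsilon_1^2 - \lambda\epsilon_0\epsilon_1]^2$
Retaining terms up to the second degree and taking expectations gives
$MSE(\bar{y}_{PRSS}) \cong \bar{Y}^2[\theta\{C_y^2 + \lambda^2 C_x^2 - 2\lambda\rho C_x C_y\} - \{W_{y[i]} - \lambda W_{x(i)}\}^2]$
(3.2)
where $\lambda = \frac{\bar{X}}{\bar{X} + C_x M_d}$.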
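As a numerical aid, the first-order MSE expression used for the RSS ratio-type estimators can be evaluated with a small helper. The sketch assumes the form: squared population mean of the study variable, times the SRS-type term (theta times the usual combination of coefficients of variation and correlation) minus the squared difference of the two W terms that capture the gain from ranking. The function name, the argument names, and the example values are hypothetical and are not taken from the data in this paper.

```python
def first_order_mse(Y_bar, theta, C_y, C_x, rho, lam, W_y=0.0, W_x=0.0):
    """First-order MSE of a ratio-type estimator under RSS (sketch).

    Assumed form:
        Y_bar^2 * [ theta * (C_y^2 + lam^2 * C_x^2 - 2 * lam * rho * C_x * C_y)
                    - (W_y - lam * W_x)^2 ]
    Setting W_y = W_x = 0 removes the ranking term and leaves the
    corresponding SRS-type expression.
    """
    srs_part = theta * (C_y ** 2 + lam ** 2 * C_x ** 2 - 2.0 * lam * rho * C_x * C_y)
    ranking_gain = (W_y - lam * W_x) ** 2
    return Y_bar ** 2 * (srs_part - ranking_gain)


if __name__ == "__main__":
    # Made-up inputs, for illustration only.
    print(first_order_mse(Y_bar=2.0, theta=0.125, C_y=1.0, C_x=1.0, rho=0.5, lam=1.0))
    print(first_order_mse(Y_bar=2.0, theta=0.125, C_y=1.0, C_x=1.0, rho=0.5, lam=1.0,
                          W_y=0.2, W_x=0.1))
```

Because the ranking term enters as a squared quantity that is subtracted, the RSS expression can never exceed its SRS-type counterpart for the same multiplier `lam`.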
4.Efficiency
For efficiency comparison of our proposed estimator with those existing in literature, we consider the following equations.
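$MSE(\bar{y}_{PRSS}) < MSE(\bar{y}_{SD})$ (proposed versus [15]) [Eq. (3.2) and Eq. (2.5)]
i.e., $\bar{Y}^2[\theta\{C_y^2 + \lambda^2 C_x^2 - 2\lambda\rho C_x C_y\} - \{W_{y[i]} - \lambda W_{x(i)}\}^2] < \theta\bar{Y}^2[C_y^2 + \theta_1^2 C_x^2 - 2\theta_1\rho C_x C_y]$
where $\lambda = \frac{\bar{X}}{\bar{X} + C_x M_d}$ and $\theta_1 = \frac{\bar{X}}{\bar{X} + C_x}$.
$MSE(\bar{y}_{PRSS}) < MSE(\bar{y}_{SK})$ (proposed versus [14]) [Eq. (3.2) and Eq. (2.6)]
i.e., $\bar{Y}^2[\theta\{C_y^2 + \lambda^2 C_x^2 - 2\lambda\rho C_x C_y\} - \{W_{y[i]} - \lambda W_{x(i)}\}^2] < \theta\bar{Y}^2[C_y^2 + \theta_2^2 C_x^2 - 2\theta_2\rho C_x C_y]$
where $\lambda = \frac{\bar{X}}{\bar{X} + C_x M_d}$ and $\theta_2 = \frac{\bar{X}}{\bar{X} + \beta_2(x)}$.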
5.Numerical example
To illustrate the suggested estimator in practice, we use the data from [21], which give the population size (in 1000s) of 49 cities in 1920 and 1930. The example uses the 1930 population as the study variable y and the 1920 population as the auxiliary variable x. Table 1 lists the characteristics of this population.
We draw a simple random sample of 16 units from the population of 49 cities, recording the study variable and the auxiliary variable, and divide the sample into 4 sets of size 4. After ranking the units within each set, the ith ranked unit from the ith set is selected, producing a ranked set sample of m = 4 units. Repeating this procedure once more yields a further m = 4 units, so that a ranked set sample of 8 units is obtained in this manner.
Table 1: Data Statistics
..=49
...=103.14
�3(..)=76
..1(..)=2.20
..=8
...=127.8
�4(..)=144
..2(..)=7.22
..=4
....=32.8720
�1(..)=70
..=2
....=56.2342
�2(..)=87
..=0.18
�1(..)=70
�3(..)=93.5
....=130.77
..2(..)=42
�4(..)=103.5
Table 2: MSE and PRE values of the estimators
Estimator                 MSE        PRE
$\bar{y}_{R}$             638.900    -
$\bar{y}_{SD}$            838.138    76.228
$\bar{y}_{SK}$            761.490    83.901
$\bar{y}_{US}$            703.065    90.874
$\bar{y}_{R,RSS}$         622.145    102.693
$\bar{y}_{M1,RSS}$        617.114    103.530
$\bar{y}_{M2,RSS}$        516.486    123.701
$\bar{y}_{PRSS}$          273.867    233.289
Table 2 shows the MSE and percent relative efficiency (PRE) values of the estimators discussed in our study. The table reveals that the MSE of the proposed estimator is smaller than the MSEs of the existing estimators with which it is compared.
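The PRE column of Table 2 is consistent with taking the first row as the reference and computing 100 times the reference MSE divided by the MSE of each estimator. The short sketch below recomputes the column from the tabulated MSE values. The row labels are descriptive placeholders chosen here (an assumption based on the order in which the estimators are introduced), and small differences in the last decimal place can arise because the tabulated MSE values are themselves rounded.

```python
# MSE values copied from Table 2, in the order the rows appear there.
# Labels are descriptive placeholders (an assumption), not the paper's symbols.
mse_table = [
    ("row 1: classical ratio (reference)", 638.900),
    ("row 2", 838.138),
    ("row 3", 761.490),
    ("row 4", 703.065),
    ("row 5", 622.145),
    ("row 6", 617.114),
    ("row 7", 516.486),
    ("row 8: proposed", 273.867),
]


def percent_relative_efficiency(reference_mse, estimator_mse):
    """PRE = 100 * MSE(reference) / MSE(estimator)."""
    return 100.0 * reference_mse / estimator_mse


reference_mse = mse_table[0][1]
pre_values = [percent_relative_efficiency(reference_mse, mse) for _, mse in mse_table]

if __name__ == "__main__":
    for (label, mse), pre in zip(mse_table, pre_values):
        print(f"{label:38s} MSE = {mse:8.3f}  PRE = {pre:8.3f}")
```

A PRE above 100 means the estimator has a smaller MSE than the reference, which is the case for all RSS-based rows in the table.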
Simulation
The general settings for the simulation are $\rho_{xy}$ = 0.65, 0.70, 0.75; set size m = 3, 4, 5; and number of cycles r = 2 and 3. The mean estimators are computed over 10,000 trials. A population of size 1000 is generated from a bivariate normal distribution with mean 5 and standard deviation 1. The relative efficiency (RE) values are calculated over the complete simulation as ratios of mean squared errors, with the ratio estimator under RSS, $\bar{y}_{R,RSS}$, as the reference estimator:
$RE_i = \frac{MSE(\bar{y}_{R,RSS})}{MSE(\bar{y}_{Mi,RSS})}$, $i = 1, 2$, and $RE_3 = \frac{MSE(\bar{y}_{R,RSS})}{MSE(\bar{y}_{PRSS})}$
(5.1)
Using (5.1), the calculated RE values for the normal distribution are given in Table 3.
Table 3. Relative efficiency values under the N(5,1) distribution
              $\rho$ = 0.65         $\rho$ = 0.70         $\rho$ = 0.75
m     RE      r = 2     r = 3      r = 2     r = 3      r = 2     r = 3
3     RE1     1.0108    1.0101     1.0113    1.0106     1.0119    1.0110
3     RE2     1.2805    1.2646     1.2958    1.2783     1.3144    1.2951
3     RE3     3.0311    3.1004     3.2559    3.3483     3.5712    3.7019
4     RE1     1.0103    1.0094     1.0108    1.0098     1.0113    1.0102
4     RE2     1.2699    1.2485     1.2842    1.2607     1.3016    1.2756
4     RE3     3.0767    3.1757     3.3165    3.4501     3.6567    3.8489
5     RE1     1.0099    1.0087     1.0103    1.0090     1.0108    1.0094
5     RE2     1.2593    1.2323     1.2725    1.2429     1.2886    1.2558
5     RE3     3.1248    3.2578     3.3811    3.5629     3.7489    4.0155
Table 3 shows that the suggested estimator gave the best results under the normal distribution in the simulation study. In general, the RE value of the proposed estimator increases with the correlation and with the number of cycles.
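A Monte Carlo comparison of this kind can be sketched as follows. This is a simplified illustration under stated assumptions and is not the code behind Table 3: the population is generated once with both variables distributed as N(5, 1) and correlation `rho`, ranking is done on the auxiliary variable, and the competing estimator is a generic ratio-type estimator with a known constant `shift` added to numerator and denominator. The function name `simulate_re`, the parameter `shift`, the seed, and the default number of trials are all hypothetical.

```python
import random
import statistics


def simulate_re(rho, m, r, shift, n_pop=1000, trials=2000, seed=1):
    """Monte Carlo relative efficiency of a shifted ratio estimator under RSS.

    Returns MSE(plain RSS ratio estimator) / MSE(shifted ratio estimator),
    both estimated from the same ranked set samples.
    """
    rng = random.Random(seed)
    # Bivariate normal population: x, y ~ N(5, 1) with correlation rho.
    population = []
    for _unit in range(n_pop):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = 5.0 + z1
        y = 5.0 + rho * z1 + (1.0 - rho ** 2) ** 0.5 * z2
        population.append((x, y))
    X_bar = statistics.fmean([u[0] for u in population])
    Y_bar = statistics.fmean([u[1] for u in population])

    sq_err_ref = 0.0
    sq_err_mod = 0.0
    for _trial in range(trials):
        xs, ys = [], []
        for _cycle in range(r):
            units = rng.sample(population, m * m)
            for i in range(m):
                # Rank each set on the auxiliary variable x only.
                ranked = sorted(units[i * m:(i + 1) * m], key=lambda u: u[0])
                xs.append(ranked[i][0])
                ys.append(ranked[i][1])
        x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
        ref = y_bar * X_bar / x_bar                      # plain ratio estimator
        mod = y_bar * (X_bar + shift) / (x_bar + shift)  # shifted ratio estimator
        sq_err_ref += (ref - Y_bar) ** 2
        sq_err_mod += (mod - Y_bar) ** 2
    return sq_err_ref / sq_err_mod


if __name__ == "__main__":
    for rho in (0.65, 0.70, 0.75):
        print(rho, simulate_re(rho=rho, m=3, r=2, shift=1.0))
```

Evaluating both estimators on the same samples keeps the comparison paired, so the ratio is less noisy than it would be with independent runs. The value of `shift` should be replaced by whatever known constant the estimator under study uses.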
6.Conclusion
In this paper, a modified ratio estimator of the population mean has been proposed under ranked set sampling (RSS), using the coefficient of variation and the median of the auxiliary variable. A large-sample approximation to the mean squared error of this estimator has been derived and compared with the MSE of the usual ratio estimator and of other existing estimators of the same class. The numerical illustration shows that the proposed estimator performs much better than the classical ratio estimator as well as the estimators given by [7] based on RSS. Thus, the proposed estimator can be used instead, in order to increase the efficiency of parameter estimates.
References
[1] Cochran, W. G. (1940). The estimation of the yields of cereal experiments by sampling for the ratio of grain to total produce. The Journal of Agricultural Science, 30(2): 262-275.
[2] Javaid, S., and Maqbool, S. (2016). Modified variance estimation using mid-range of an auxiliary variable. International Journal of Agricultural and Statistical Sciences, 12(2): 347-350.
[3] Jeelani, M. I., Bouza, C. N., and Sharma, M. (2018). Modified ratio estimator under rank set sampling. Investigacion Operacional, 38(1): 103-106.
[4] Kadilar, C., Unyazici, Y., and Cingi, H. (2009). Ratio estimator for the population mean using ranked set sampling. Statistical Papers, 50(2): 301-309.
[5] Maqbool, S., Raja, T. A., and Javaid, S. (2016). Generalized modified ratio estimator using non-conventional location parameter. International Journal of Agricultural and Statistical Sciences, 12(1).
[6] McIntyre, G. A. (1952). A method for unbiased selective sampling, using ranked sets. Australian Journal of Agricultural Research, 3(4): 385-390.
[7] Mehta, N., and Mandowara, V. L. (2016). A modified ratio-cum-product estimator of finite population mean using ranked set sampling. Communications in Statistics-Theory and Methods, 45(2): 267-276.
[8] Murthy, M. N. (1964). Product method of estimation. Sankhya: The Indian Journal of Statistics, Series A, 69-74.
[9] Raja, T. A., and Maqbool, S. (2021). On modified ratio estimator using a new linear combination. International Journal of Agricultural and Statistical Sciences, 17(1): 209-211.
[10] Rather, K. U. I., and Kadilar, C. (2021). Exponential type estimator for the population mean under ranked set sampling. Journal of Statistics: Advances in Theory and Applications, 25(1): 1-12.
[11] Rather, K. U. I., Eda, K. G., and Unal, C. (2022). New exponential ratio estimator in ranked set sampling. Pakistan Journal of Statistics and Operation Research, 18(2): 403-409.
[12] Robson, D. S. (1957). Applications of multivariate polykays to the theory of unbiased ratio-type estimation. Journal of the American Statistical Association, 52(280): 511-522.
[13] Samawi, H. M., and Muttlak, H. A. (1996). Estimation of ratio using rank set sampling. Biometrical Journal, 38(6): 753-764.
[14] Singh, H. P., and Kakran, M. S. (1993). A modified ratio estimator using known coefficient of kurtosis of an auxiliary character. Unpublished paper.
[15] Sisodia, B. V. S., and Dwivedi, V. K. (1981). Modified ratio estimator using coefficient of variation of auxiliary variable. Journal-Indian Society of Agricultural Statistics.
[16] Tailor, R., and Sharma, B. (2009). A modified ratio-cum-product estimator of finite population mean using known coefficient of variation and coefficient of kurtosis. Population, 10: 1.
[17] Takahasi, K., and Wakimoto, K. (1968). On unbiased estimates of the population mean based on the sample stratified by means of ordering. Annals of the Institute of Statistical Mathematics, 20(1): 1-31.
[18] Upadhyaya, L. N., and Singh, H. P. (1999). Use of transformed auxiliary variable in estimating the finite population mean. Biometrical Journal: Journal of Mathematical Methods in Biosciences, 41(5): 627-636.
[19] Wang, T., Li, Y., and Cui, H. (2007). On weighted randomly trimmed means. Journal of Systems Science and Complexity, 20(1): 47-65.
[20] Wolter, K. M. (1985). Introduction to Variance Estimation. Springer-Verlag.
[21] Cochran, W. G. (1977). Sampling Techniques. John Wiley and Sons.