A CLASS OF LOGARITHMIC-CUM-EXPONENTIAL ESTIMATORS FOR POPULATION MEAN WITH RISK ANALYSIS USING DOUBLE SAMPLING
Diwakar Shukla1, Astha Jain2
1,2 Department of Mathematics and Statistics, Dr. Harisingh Gour Vishwavidyalaya, Sagar, M.P., 470003 [email protected], [email protected]
Abstract
In double sampling, part of the survey resources is used to collect information on an auxiliary variable in order to improve the efficiency of the estimate of the population mean of the character under study. Some authors have suggested exponential-type estimators, while others have advocated log-type estimators, but specific situations call for a combination of the two. This paper presents a class of logarithmic-cum-exponential ratio estimators in the double sampling setup. Expressions for the bias and mean squared error of the proposed class are derived for two cases (sub-sample and independent sample). Persons involved in a sample survey sometimes face a risk to life, for example when collecting data in Naxalite-affected areas, working in dense forest, interviewing during the spread of an epidemic, or collecting data in a politically disturbed region. Such risk may affect the accuracy and efficiency of estimation. A linear risk function is used with the proposed class of estimators, and the two cases of double sampling are compared in terms of relative efficiency with respect to risk. The proposed class of estimators is found to have a lower mean squared error than the simple mean, usual ratio, usual exponential and usual log estimators in the double sampling setup. These theoretical results are supported by a numerical example and by a risk-function-based simulation study. Optimal sample sizes under risk are derived and compared for the two cases.
Keywords: Exponential estimator, Logarithmic estimator, Mean squared error, Bias, Risk function, Risk analysis, Survey sampling, Double sampling, Simple random sampling without replacement (SRSWOR).
1. Introduction
In double sampling, part of the resources available for the survey is used to collect data on an auxiliary variable, because the population mean of the auxiliary variable is assumed to be unknown. These data are collected through a sample at the preliminary level and then used to estimate the population mean (or population total).
Among studies on estimators in double sampling, Sahoo et al. [9] discussed estimating the population mean using a regression-type estimator, which advanced the analytical approach to estimation under the double sampling scheme. Bahl and Tuteja [2] developed exponential-type ratio and product estimators for the SRSWOR setup, which were later extended by many authors to a variety of other sampling schemes. Bhushan et al. [5] suggested a double sampling ratio-type estimator using two auxiliary variables and discussed the asymptotic properties of the estimators, including bias and mean squared error. Shabbir and Gupta [14] suggested an exponential ratio-type estimator for estimating the population mean in the setup of stratified sampling. That proposal was found to perform better than the usual mean, usual ratio, usual exponential ratio and traditional regression estimators.
Zahoor et al. [17] suggested a regression estimator in double sampling using multi-auxiliary information in the presence of non-response and measurement error in the second-phase sample. This extends the work of Azeem [8], who suggested ratio and ratio-cum-exponential estimators in double sampling for the population mean, incorporating the possibility of non-response and measurement error. Wu and Luan [6] remarked that the major advantage of double sampling is the gain in precision without a substantial increase in cost. Sanaullah et al. [10] suggested generalized exponential-type estimators for the stratified double sampling setup. Sanaullah et al. [12] developed generalized exponential-type estimators for estimating the population variance in double sampling with the help of two auxiliary variables. Zaman and Kadilar [18] proposed exponential ratio-type estimation procedures in the stratified two-phase sampling setup. Shukla and Alim [1] proposed a parameter estimation approach based on double sampling and showed its application in a big-data environment. Bhushan and Gupta [3] discussed some log-type estimators using an auxiliary attribute. In another useful contribution, Bhushan and Kumar [4] proposed log-type estimators for the population mean under ranked set sampling.
1.1. Risk in data collection
When a sample survey is conducted by the personal interview method, some areas may be politically disturbed, some may be dangerous because they are forest areas, some may be risky because of Naxalite movement, and a few may be at risk from the intense spread of an epidemic (such as Covid-19). Such exposure may put the lives of the field workers involved in data collection at risk. Consider an example in which the areas of a district exposed to risk are identified as A, B, C, D, each having different zones z1, z2, z3, z4, z5, with the percentage of risk varying over zones.
Table 1: Risk distribution as per area and zones
                    Zones with risk (ri)
Area of District    z1     z2     z3     z4     z5     Overhead risk (r')
A                   25%    10%    20%    30%    7%     8%
B                   15%    13%    28%    12%    22%    10%
C                   35%    14%    5%     25%    10%    11%
D                   16%    11%    18%    19%    23%    13%
The risk per unit (ri) belongs to the zones, and the overhead risk r' belongs to the geographical areas of a district.
Drawing motivation and a scientific approach from the above contributions, this paper considers the development of a new class of estimators under risk to the life of the surveyor during data collection using double sampling.
1.2. Symbols used for population
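Consider a finite population of size $N$. Let $D$ be the variable of main interest and $A$ an auxiliary variable correlated with $D$. The pairs $(D_i, A_i)$, $i = 1, 2, 3, \ldots, N$, represent the population values, such that

$$\bar{D} = \frac{1}{N}\sum_{i=1}^{N} D_i, \qquad \bar{A} = \frac{1}{N}\sum_{i=1}^{N} A_i \tag{1.1}$$

$$S_d^2 = \frac{1}{N-1}\sum_{i=1}^{N}(D_i - \bar{D})^2, \qquad S_{da} = \frac{1}{N-1}\sum_{i=1}^{N}(D_i - \bar{D})(A_i - \bar{A}), \qquad S_a^2 = \frac{1}{N-1}\sum_{i=1}^{N}(A_i - \bar{A})^2 \tag{1.2}$$

$$C_{da} = \frac{S_{da}}{\bar{D}\,\bar{A}} \tag{1.3}$$

$$C_d = \frac{S_d}{\bar{D}}, \qquad C_a = \frac{S_a}{\bar{A}}, \qquad \rho = \frac{S_{da}}{S_d S_a}, \qquad M = \rho\,\frac{C_d}{C_a} \tag{1.4}$$

where $C_d$ and $C_a$ denote the coefficients of variation and $\rho$ the correlation coefficient.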
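As an illustration of the definitions in (1.1) to (1.4), the following minimal Python sketch computes the population quantities from paired values. The function name `population_summary` and the four-unit toy population are hypothetical and are used only to exercise the formulas; they are not part of the paper's data.

```python
import math

def population_summary(D, A):
    """Population quantities of (1.1)-(1.4) for paired value lists D and A."""
    N = len(D)
    D_bar = sum(D) / N
    A_bar = sum(A) / N
    # Variances and covariance use the divisor N - 1, as in (1.2).
    S2_d = sum((x - D_bar) ** 2 for x in D) / (N - 1)
    S2_a = sum((y - A_bar) ** 2 for y in A) / (N - 1)
    S_da = sum((x - D_bar) * (y - A_bar) for x, y in zip(D, A)) / (N - 1)
    C_d = math.sqrt(S2_d) / D_bar
    C_a = math.sqrt(S2_a) / A_bar
    rho = S_da / math.sqrt(S2_d * S2_a)
    return {
        "D_bar": D_bar, "A_bar": A_bar,
        "S2_d": S2_d, "S2_a": S2_a, "S_da": S_da,
        "C_da": S_da / (D_bar * A_bar),
        "C_d": C_d, "C_a": C_a, "rho": rho,
        "M": rho * C_d / C_a,
    }

# Hypothetical toy population in which D is exactly twice A.
summary = population_summary([2.0, 4.0, 6.0, 8.0], [1.0, 2.0, 3.0, 4.0])
```

Because the toy values are perfectly proportional, both the correlation coefficient and M come out as 1, which gives a quick sanity check on the implementation.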
1.3. Notations in SRSWOR Setup:
Assume that information about the variable of main interest D is not available, so a simple random sample of size n (n < N) is drawn without replacement to estimate it. In usual practice, the population mean of the auxiliary variable A is assumed to be available. The number of all possible samples is $\binom{N}{n}$.
Figure 1: Population and Sample
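Let the values of a random sample drawn by SRSWOR be $(d_i, a_i)$, $i = 1, 2, 3, \ldots, n$; then the sample statistics are defined as:

$$\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i, \qquad \bar{a} = \frac{1}{n}\sum_{i=1}^{n} a_i \tag{1.5}$$

$$s_d^2 = \frac{1}{n-1}\sum_{i=1}^{n}(d_i - \bar{d})^2, \qquad s_{da} = \frac{1}{n-1}\sum_{i=1}^{n}(d_i - \bar{d})(a_i - \bar{a}), \qquad s_a^2 = \frac{1}{n-1}\sum_{i=1}^{n}(a_i - \bar{a})^2 \tag{1.6}$$

$$m = \frac{s_{da}}{s_a^2} \tag{1.7}$$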
1.4. Some usual estimators in SRSWOR
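(a) Usual Ratio Estimator: $\hat{D}_{R} = \bar{d}\,\dfrac{\bar{A}}{\bar{a}}$
(b) Usual Product Estimator: $\hat{D}_{P} = \bar{d}\,\dfrac{\bar{a}}{\bar{A}}$
(c) Usual Regression Estimator: $\hat{D}_{Re} = \bar{d} + M(\bar{A} - \bar{a})$
(d) Usual Log Estimator: $\hat{D}_{L} = \bar{d}\left[1 + \log\left(\dfrac{\bar{A}}{\bar{a}}\right)\right]$
(e) Usual Exponential Estimator: $\hat{D}_{Ex} = \bar{d}\,\exp\left(\dfrac{\bar{A} - \bar{a}}{\bar{A} + \bar{a}}\right)$

Some useful symbols are:

$$V_{qs} = \frac{E\{(\bar{d} - \bar{D})^q(\bar{a} - \bar{A})^s\}}{\bar{D}^q\bar{A}^s}, \qquad V'_{qs} = \frac{E\{(\bar{d} - \bar{D})^q(\bar{a}' - \bar{A})^s\}}{\bar{D}^q\bar{A}^s}; \qquad q, s = 0, 1, 2$$

$$V_{20} = \left(\frac{1}{n} - \frac{1}{N}\right)C_d^2, \qquad V_{11} = \left(\frac{1}{n} - \frac{1}{N}\right)\rho C_d C_a, \qquad V_{02} = \left(\frac{1}{n} - \frac{1}{N}\right)C_a^2$$

$$V'_{11} = \left(\frac{1}{n'} - \frac{1}{N}\right)\rho C_d C_a, \qquad V'_{02} = \left(\frac{1}{n'} - \frac{1}{N}\right)C_a^2$$

where $\bar{a}'$ is the mean of the first-phase sample of size $n'$ introduced in Section 2.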
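The symbols have their usual meaning as adopted by survey practitioners in the literature. The bias, Bias(·), and mean squared error, MSE(·), of the above estimators under SRSWOR are:

$$Bias(\hat{D}_{R}) = \bar{D}\,[V_{02} - V_{11}], \qquad MSE(\hat{D}_{R}) = \bar{D}^2[V_{20} - 2V_{11} + V_{02}] \tag{1.8}$$

$$Bias(\hat{D}_{P}) = \bar{D}\,[V_{02} + V_{11}], \qquad MSE(\hat{D}_{P}) = \bar{D}^2[V_{20} + 2V_{11} + V_{02}] \tag{1.9}$$

$$Bias(\hat{D}_{Re}) = \bar{D}\,[V_{02} - M V_{11}], \qquad MSE(\hat{D}_{Re}) = \bar{D}^2[V_{20} - 2M V_{11} + M^2 V_{02}] \tag{1.10}$$

$$Bias(\hat{D}_{L}) = \bar{D}\,[V_{02} - V_{11}], \qquad MSE(\hat{D}_{L}) = \bar{D}^2[V_{20} - 2V_{11} + V_{02}] \tag{1.11}$$

$$Bias(\hat{D}_{Ex}) = \bar{D}\left[\frac{3}{8}V_{02} - \frac{1}{2}V_{11}\right], \qquad MSE(\hat{D}_{Ex}) = \bar{D}^2\left[V_{20} + \frac{1}{4}V_{02} - V_{11}\right] \tag{1.12}$$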
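The first-order MSE expressions (1.8), (1.9) and (1.12) can be evaluated numerically as in the sketch below. The function names `v_terms` and `srswor_mse` and all input values are hypothetical, chosen only to show how the V-terms feed into the formulas; the entry for the sample mean uses the standard SRSWOR variance, that is, the squared population mean times V20.

```python
def v_terms(n, N, C_d, C_a, rho):
    """First-order terms V20, V11, V02 for a SRSWOR sample of size n from N units."""
    f = 1.0 / n - 1.0 / N
    return f * C_d ** 2, f * rho * C_d * C_a, f * C_a ** 2

def srswor_mse(D_bar, n, N, C_d, C_a, rho):
    """First-order MSEs of the mean, ratio, product and exponential estimators."""
    V20, V11, V02 = v_terms(n, N, C_d, C_a, rho)
    return {
        "mean": D_bar ** 2 * V20,
        "ratio": D_bar ** 2 * (V20 - 2 * V11 + V02),        # (1.8)
        "product": D_bar ** 2 * (V20 + 2 * V11 + V02),      # (1.9)
        "exponential": D_bar ** 2 * (V20 + V02 / 4 - V11),  # (1.12)
    }

# Hypothetical inputs, not taken from the paper's data.
mse = srswor_mse(D_bar=50.0, n=20, N=200, C_d=0.4, C_a=0.5, rho=0.8)
```

With these illustrative inputs the exponential estimator has the smallest first-order MSE and the product estimator the largest, as expected for a positively correlated auxiliary variable.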
2. Double Sampling Approach
When information about the population mean of the auxiliary variable is not available, the sample can be obtained during the survey, with extra risk and effort, using two different strategies.
Let $n'$ be the size of the first-phase sample with values $(a'_1, a'_2, \ldots, a'_{n'})$ and mean $\bar{a}' = \frac{1}{n'}\sum_{i=1}^{n'} a'_i$.
• Case I: When the second-phase sample of size n is a sub-sample of the first-phase sample of size n'.
[Figure 2 diagram: Population (N); first-phase preliminary sample (n' < N) giving $\bar{a}'$; second-phase main sample (n < n')]
Figure 2: Sampling strategy under case I
• Case II: When the second-phase sample of size n is drawn independently of the first-phase sample of size n'.
Figure 3: Sampling strategy under case II
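The two strategies differ only in where the second-phase units come from. A minimal sketch of the selection step is given below, assuming units are identified by the indices 0 to N-1; the function name `draw_two_phase` and the sizes used are hypothetical.

```python
import random

def draw_two_phase(N, n_first, n, case, seed=None):
    """Return (first_phase, second_phase) index lists drawn by SRSWOR."""
    rng = random.Random(seed)
    first = rng.sample(range(N), n_first)
    if case == "I":
        # Case I: the second-phase sample is a sub-sample of the first phase.
        second = rng.sample(first, n)
    elif case == "II":
        # Case II: the second-phase sample is drawn independently from the population.
        second = rng.sample(range(N), n)
    else:
        raise ValueError("case must be 'I' or 'II'")
    return first, second

first_i, second_i = draw_two_phase(N=100, n_first=40, n=15, case="I", seed=1)
first_ii, second_ii = draw_two_phase(N=100, n_first=40, n=15, case="II", seed=1)
```

Under case I every second-phase unit also belongs to the first phase, whereas under case II the two samples may or may not overlap.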
2.1. Some existing estimators in double sampling
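In the double sampling setup, the existing estimators with their respective biases, Bias(·)_I and Bias(·)_II, and mean squared errors, MSE(·)_I and MSE(·)_II, under case I and case II are given below.

(a) Simple random sample mean estimator:

$$\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i \tag{2.1}$$

$$V(\bar{d}) = \bar{D}^2 V_{20} \tag{2.2}$$

where V(·) denotes the variance of the estimator.

(b) Usual Ratio Estimator:

$$\hat{D}_{Rd} = \bar{d}\left(\frac{\bar{a}'}{\bar{a}}\right) \tag{2.3}$$

$$Bias(\hat{D}_{Rd})_{I} = \bar{D}\,[(V_{02} - V'_{02}) - (V_{11} - V'_{11})] \tag{2.4}$$

$$Bias(\hat{D}_{Rd})_{II} = \bar{D}\,[(V_{02} + V'_{02}) - V_{11}] \tag{2.5}$$

$$MSE(\hat{D}_{Rd})_{I} = \bar{D}^2[V_{20} + (V_{02} - V'_{02}) - 2(V_{11} - V'_{11})] \tag{2.6}$$

$$MSE(\hat{D}_{Rd})_{II} = \bar{D}^2[V_{20} + (V_{02} + V'_{02}) - 2V_{11}] \tag{2.7}$$

(c) Usual Exponential Ratio Estimator:

$$\hat{D}_{Exd} = \bar{d}\,\exp\left(\frac{\bar{a}' - \bar{a}}{\bar{a}' + \bar{a}}\right) \tag{2.8}$$

$$Bias(\hat{D}_{Exd})_{I} = \bar{D}\left[\frac{3}{8}(V_{02} - V'_{02}) - \frac{1}{2}(V_{11} - V'_{11})\right] \tag{2.9}$$

$$Bias(\hat{D}_{Exd})_{II} = \bar{D}\left[\frac{1}{8}(3V_{02} - V'_{02}) - \frac{1}{2}V_{11}\right] \tag{2.10}$$

$$MSE(\hat{D}_{Exd})_{I} = \bar{D}^2\left[V_{20} + \frac{1}{4}(V_{02} - V'_{02}) - (V_{11} - V'_{11})\right] \tag{2.11}$$

$$MSE(\hat{D}_{Exd})_{II} = \bar{D}^2\left[V_{20} + \frac{1}{4}(V_{02} + V'_{02}) - V_{11}\right] \tag{2.12}$$

(d) Usual Log Ratio Estimator:

$$\hat{D}_{Lod} = \bar{d}\left[1 + \log\left(\frac{\bar{a}'}{\bar{a}}\right)\right] \tag{2.13}$$

$$Bias(\hat{D}_{Lod})_{I} = \bar{D}\left[\frac{1}{2}(V_{02} - V'_{02}) - (V_{11} - V'_{11})\right] \tag{2.14}$$

$$Bias(\hat{D}_{Lod})_{II} = \bar{D}\left[\frac{1}{2}V_{02} + V'_{02} - V_{11}\right] \tag{2.15}$$

$$MSE(\hat{D}_{Lod})_{I} = \bar{D}^2[V_{20} + (V_{02} - V'_{02}) - 2(V_{11} - V'_{11})] \tag{2.16}$$

$$MSE(\hat{D}_{Lod})_{II} = \bar{D}^2[V_{20} + (V_{02} + V'_{02}) - 2V_{11}] \tag{2.17}$$

(e) Usual Regression Estimator:

$$\hat{D}_{Red} = \bar{d} + M(\bar{a}' - \bar{a}) \tag{2.18}$$

$$Bias(\hat{D}_{Red})_{I} = \bar{D}\,[(V_{02} - V'_{02}) - (V_{11} - V'_{11})] \tag{2.19}$$

$$Bias(\hat{D}_{Red})_{II} = \bar{D}\,[V_{02} + V'_{02} - V_{11}] \tag{2.20}$$

$$MSE(\hat{D}_{Red})_{I} = \bar{D}^2[V_{20} + M^2(V_{02} - V'_{02}) - 2M(V_{11} - V'_{11})] \tag{2.21}$$

$$MSE(\hat{D}_{Red})_{II} = \bar{D}^2[V_{20} + M^2(V_{02} + V'_{02}) - 2M V_{11}] \tag{2.22}$$

where M is the regression coefficient.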
2.2. Motivation
Estimators suggested for simple random sampling, double sampling and stratified sampling may be of the usual type, the exponential type or the log type. Sometimes the data follow a pattern different from the exponential or log type; it may be a mixture of the two (Fig. 4c). This motivates the search for a new combined class of log-cum-exponential type estimators, which this paper considers in the double sampling setup. Several authors have suggested estimators for relationships between the variables D and A of the kind shown in Fig. 4a and Fig. 4b, but the relationship of the type in Fig. 4c has yet to be explored. This paper focuses on estimation methodologies for the relationship shown in Fig. 4c under the double sampling setup.
Figure 4: Graphical pattern of relationship. (a) D = log(A), log-type graph; (b) D = e^A, exponential-type graph; (c) D = log(A) e^A, log-exponential type graph
3. Proposed class of Logarithmic-Exponential Type Estimators
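A family of estimators under double sampling is proposed to estimate the unknown population mean of the study variable D, assuming the presence of auxiliary information A:

$$\hat{D}_{LEd} = \bar{d}\,\exp\left\{\left[1 - \left(\frac{\bar{a}'}{\bar{a}}\right)^{\alpha}\right]\left[1 + \beta\log\left(\frac{\bar{a}'}{\bar{a}}\right)\right]\right\} \tag{3.1}$$

assuming an expo-log type relationship between D and A (Fig. 4c), where $\alpha$ and $\beta$ are constants that may take positive or negative real values.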
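A minimal sketch of how a member of the class could be computed from the three sample means is given below. It assumes the estimator has the form of the sample mean of D multiplied by exp{[1 - (a'/a)^alpha][1 + beta log(a'/a)]}, as read from the large-sample expression used in the proofs; the function name `log_exp_estimator` and the numerical inputs are hypothetical.

```python
import math

def log_exp_estimator(d_bar, a_bar, a1_bar, alpha, beta):
    """Logarithmic-cum-exponential estimator of the population mean.

    d_bar  : second-phase sample mean of the study variable D
    a_bar  : second-phase sample mean of the auxiliary variable A
    a1_bar : first-phase sample mean of A (assumed positive, like a_bar)
    """
    ratio = a1_bar / a_bar
    exponent = (1.0 - ratio ** alpha) * (1.0 + beta * math.log(ratio))
    return d_bar * math.exp(exponent)

# Hypothetical sample means.
estimate = log_exp_estimator(d_bar=10.0, a_bar=4.0, a1_bar=5.0, alpha=-1.0, beta=0.0)
no_adjustment = log_exp_estimator(d_bar=10.0, a_bar=4.0, a1_bar=5.0, alpha=0.0, beta=1.0)
```

With alpha = 0 the exponent vanishes and the estimator reduces to the sample mean; the same happens whenever the two auxiliary means coincide.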
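Theorem 1. The biases of the proposed class of estimators for the sub-sample (Case I) and independent sample (Case II) are, respectively,

$$Bias(\hat{D}_{LEd})_{I} = \alpha\bar{D}\,[(V_{11} - V'_{11}) - \beta(V_{02} - V'_{02})] \tag{3.2}$$

$$Bias(\hat{D}_{LEd})_{II} = \alpha\bar{D}\,[V_{11} - \beta(V_{02} + V'_{02})] \tag{3.3}$$

where Bias(·)_I and Bias(·)_II refer to the case I and case II strategies respectively.

Proof. For the large-sample approximation, define quantities $e_0, e_1, e_2$ with $|e_0| < 1$, $|e_1| < 1$, $|e_2| < 1$ such that

$$\bar{d} = \bar{D}(1 + e_0), \qquad \bar{a} = \bar{A}(1 + e_1), \qquad \bar{a}' = \bar{A}(1 + e_2)$$

where $\bar{a}' = \frac{1}{n'}\sum_{i=1}^{n'} a'_i$ and $(a'_1, a'_2, \ldots, a'_{n'})$ is the first-phase sample of size $n'$. Then

$$E(e_0) = E(e_1) = E(e_2) = 0$$

Moreover,

$$E(e_0^2) = \left(\frac{1}{n} - \frac{1}{N}\right)C_d^2, \qquad E(e_1^2) = \left(\frac{1}{n} - \frac{1}{N}\right)C_a^2, \qquad E(e_2^2) = \left(\frac{1}{n'} - \frac{1}{N}\right)C_a^2$$

$$E(e_0e_1) = \left(\frac{1}{n} - \frac{1}{N}\right)\rho C_d C_a, \qquad E(e_0e_2) = \left(\frac{1}{n'} - \frac{1}{N}\right)\rho C_d C_a, \qquad E(e_1e_2) = \left(\frac{1}{n'} - \frac{1}{N}\right)C_a^2$$

The general expression for the bias of $\hat{D}_{LEd}$ is

$$Bias(\hat{D}_{LEd}) = E(\hat{D}_{LEd}) - \bar{D}$$

Under the large-sample approximation,

$$\hat{D}_{LEd} = \bar{D}(1 + e_0)\exp\left\{\left[1 - (1 + e_2)^{\alpha}(1 + e_1)^{-\alpha}\right]\left[1 + \beta\log\left((1 + e_2)(1 + e_1)^{-1}\right)\right]\right\}$$

Since $|e_0| < 1$, $|e_1| < 1$ and $|e_2| < 1$, using a Taylor series expansion and ignoring terms $e_0^i e_1^j e_2^k$ of higher order, that is, with $(i + j + k) > 2$,

$$\hat{D}_{LEd} = \bar{D}\left[1 + e_0 + \alpha(e_1 - e_2) + \alpha e_0(e_1 - e_2) - \beta\alpha(e_1^2 + e_2^2 - 2e_1e_2)\right]$$

Taking expectations with $E(e_0) = E(e_1) = E(e_2) = 0$, and noting that under case I $E[e_0(e_1 - e_2)] = V_{11} - V'_{11}$ and $E(e_1^2 + e_2^2 - 2e_1e_2) = V_{02} - V'_{02}$, the bias of the proposed class of estimators is

$$Bias(\hat{D}_{LEd})_{I} = \alpha\bar{D}\,[(V_{11} - V'_{11}) - \beta(V_{02} - V'_{02})] \tag{3.4}$$

$$Bias(\hat{D}_{LEd})_{II} = \alpha\bar{D}\,[V_{11} - \beta(V_{02} + V'_{02})] \tag{3.5}$$

since $E(e_0e_2) = V'_{11} = 0$ and $E(e_1e_2) = 0$ for case II, because the first-phase sample of size $n'$ is independent of the second-phase sample of size $n$.
■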
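The bias expressions of Theorem 1 are simple enough to evaluate directly. The sketch below assumes the forms Bias_I = alpha * Dbar * [(V11 - V'11) - beta (V02 - V'02)] and Bias_II = alpha * Dbar * [V11 - beta (V02 + V'02)]; the function names and the numerical V-terms are hypothetical (they correspond to n = 20, n' = 80, N = 200, C_d = 0.4, C_a = 0.5, rho = 0.8).

```python
def bias_case_i(D_bar, alpha, beta, V11, V11p, V02, V02p):
    """First-order bias when the second-phase sample is a sub-sample (case I)."""
    return alpha * D_bar * ((V11 - V11p) - beta * (V02 - V02p))

def bias_case_ii(D_bar, alpha, beta, V11, V02, V02p):
    """First-order bias when the two samples are independent (case II)."""
    return alpha * D_bar * (V11 - beta * (V02 + V02p))

# Hypothetical V-terms; V11p and V02p stand for the primed (first-phase) terms.
V11, V11p, V02, V02p = 0.0072, 0.0012, 0.01125, 0.001875

b_i = bias_case_i(50.0, alpha=-1.0, beta=0.0, V11=V11, V11p=V11p, V02=V02, V02p=V02p)
b_ii = bias_case_ii(50.0, alpha=1.0, beta=1.0, V11=V11, V02=V02, V02p=V02p)
```

Setting alpha = 0 gives zero bias in both cases, in line with the fact that the estimator then reduces to the sample mean.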
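Theorem 2. The mean squared errors of the proposed class of estimators for the sub-sample (Case I) and independent sample (Case II) are, respectively,

$$MSE(\hat{D}_{LEd})_{I} = \bar{D}^2\left[V_{20} + 2\alpha(V_{11} - V'_{11}) + \alpha^2(V_{02} - V'_{02})\right] \tag{3.6}$$

$$MSE(\hat{D}_{LEd})_{II} = \bar{D}^2\left[V_{20} + 2\alpha V_{11} + \alpha^2(V_{02} + V'_{02})\right] \tag{3.7}$$

Proof. The proposed class in double sampling is

$$\hat{D}_{LEd} = \bar{d}\,\exp\left\{\left[1 - \left(\frac{\bar{a}'}{\bar{a}}\right)^{\alpha}\right]\left[1 + \beta\log\left(\frac{\bar{a}'}{\bar{a}}\right)\right]\right\}$$

which in terms of the large-sample approximation is

$$\hat{D}_{LEd} = \bar{D}(1 + e_0)\exp\left\{\left[1 - (1 + e_2)^{\alpha}(1 + e_1)^{-\alpha}\right]\left[1 + \beta\log\left((1 + e_2)(1 + e_1)^{-1}\right)\right]\right\}$$

Using $|e_0| < 1$, $|e_1| < 1$ and $|e_2| < 1$ and a Taylor series expansion up to the first order of approximation, one gets

$$\hat{D}_{LEd} = \bar{D}\,[1 + e_0 + \alpha(e_1 - e_2)]$$

by ignoring terms $e_0^i e_1^j e_2^k$ of higher order, that is, with $(i + j + k) > 1$, $i, j, k = 0, 1, 2, \ldots$ Subtracting $\bar{D}$ and squaring both sides,

$$(\hat{D}_{LEd} - \bar{D})^2 = \bar{D}^2\left[e_0^2 + 2\alpha e_0(e_1 - e_2) + \alpha^2(e_1 - e_2)^2\right]$$

Taking expectations on both sides,

$$E(\hat{D}_{LEd} - \bar{D})^2 = \bar{D}^2 E\left[e_0^2 + 2\alpha e_0(e_1 - e_2) + \alpha^2(e_1 - e_2)^2\right]$$

So the mean squared errors for Case I and Case II are

$$MSE(\hat{D}_{LEd})_{I} = \bar{D}^2\left[V_{20} + 2\alpha(V_{11} - V'_{11}) + \alpha^2(V_{02} - V'_{02})\right] \tag{3.8}$$

and

$$MSE(\hat{D}_{LEd})_{II} = \bar{D}^2\left[V_{20} + 2\alpha V_{11} + \alpha^2(V_{02} + V'_{02})\right] \tag{3.9}$$

since $E(e_0e_2) = V'_{11} = 0$ for case II.
■

Remark 1: Gain in precision under case I and case II

$$MSE(\hat{D}_{LEd})_{I} - MSE(\hat{D}_{LEd})_{II} = -2\bar{D}^2\left(\alpha^2 V'_{02} + \alpha V'_{11}\right) \tag{3.10}$$

The gain in precision depends on the sign of $V'_{11}$. In general, case I is better, but case II of double sampling is more efficient than case I when $\alpha^2 V'_{02} + \alpha V'_{11} < 0$, that is, when $\alpha$ lies strictly between $0$ and $-V'_{11}/V'_{02}$.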
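The two MSE expressions of Theorem 2 and the difference in Remark 1 can be checked numerically. The sketch below assumes MSE_I = Dbar^2 [V20 + 2 alpha (V11 - V'11) + alpha^2 (V02 - V'02)] and MSE_II = Dbar^2 [V20 + 2 alpha V11 + alpha^2 (V02 + V'02)]; function names and inputs are hypothetical.

```python
def v_terms(n, n_first, N, C_d, C_a, rho):
    """V-terms for second-phase size n and first-phase size n_first."""
    f = 1.0 / n - 1.0 / N
    f1 = 1.0 / n_first - 1.0 / N
    return {
        "V20": f * C_d ** 2, "V11": f * rho * C_d * C_a, "V02": f * C_a ** 2,
        "V11p": f1 * rho * C_d * C_a, "V02p": f1 * C_a ** 2,  # primed terms
    }

def mse_case_i(D_bar, v, alpha):
    return D_bar ** 2 * (v["V20"] + 2 * alpha * (v["V11"] - v["V11p"])
                         + alpha ** 2 * (v["V02"] - v["V02p"]))

def mse_case_ii(D_bar, v, alpha):
    return D_bar ** 2 * (v["V20"] + 2 * alpha * v["V11"]
                         + alpha ** 2 * (v["V02"] + v["V02p"]))

# Hypothetical design and population values.
v = v_terms(n=20, n_first=80, N=200, C_d=0.4, C_a=0.5, rho=0.8)
alpha = -0.5
gap = mse_case_i(50.0, v, alpha) - mse_case_ii(50.0, v, alpha)
remark_1 = -2 * 50.0 ** 2 * (alpha ** 2 * v["V02p"] + alpha * v["V11p"])
```

For these inputs alpha = -0.5 lies between 0 and -V'11/V'02 = -0.64, so the gap is positive and case II has the smaller first-order MSE, which illustrates the condition stated in Remark 1.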
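Remark 2: Some particular estimators in the proposed class are given in Table 2.
Table 2: Estimators as members of the proposed class.
Estimators    α    β
$\hat{D}_1 = \bar{d}\exp\{[1 - (\bar{a}'/\bar{a})^{-1}][1 - \log(\bar{a}'/\bar{a})]\}$    -1    -1
$\hat{D}_2 = \bar{d}\exp\{1 - (\bar{a}'/\bar{a})^{-1}\}$    -1    0
$\hat{D}_3 = \bar{d}\exp\{[1 - (\bar{a}'/\bar{a})^{-1}][1 + \log(\bar{a}'/\bar{a})]\}$    -1    1
$\hat{D}_4 = \bar{d}$    0    -1
$\hat{D}_5 = \bar{d}$    0    0
$\hat{D}_6 = \bar{d}$    0    1
$\hat{D}_7 = \bar{d}\exp\{[1 - (\bar{a}'/\bar{a})][1 - \log(\bar{a}'/\bar{a})]\}$    1    -1
$\hat{D}_8 = \bar{d}\exp\{1 - (\bar{a}'/\bar{a})\}$    1    0
$\hat{D}_9 = \bar{d}\exp\{[1 - (\bar{a}'/\bar{a})][1 + \log(\bar{a}'/\bar{a})]\}$    1    1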
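Table 3: Mean Squared Error of Estimators under case I as members of proposed class
Mean Squared Error    α    β
$MSE(\hat{D}_1)_I = \bar{D}^2[V_{20} - 2(V_{11} - V'_{11}) + (V_{02} - V'_{02})]$    -1    -1
$MSE(\hat{D}_2)_I = \bar{D}^2[V_{20} - 2(V_{11} - V'_{11}) + (V_{02} - V'_{02})]$    -1    0
$MSE(\hat{D}_3)_I = \bar{D}^2[V_{20} - 2(V_{11} - V'_{11}) + (V_{02} - V'_{02})]$    -1    1
$V(\hat{D}_4) = \bar{D}^2 V_{20}$    0    -1
$V(\hat{D}_5) = \bar{D}^2 V_{20}$    0    0
$V(\hat{D}_6) = \bar{D}^2 V_{20}$    0    1
$MSE(\hat{D}_7)_I = \bar{D}^2[V_{20} + 2(V_{11} - V'_{11}) + (V_{02} - V'_{02})]$    1    -1
$MSE(\hat{D}_8)_I = \bar{D}^2[V_{20} + 2(V_{11} - V'_{11}) + (V_{02} - V'_{02})]$    1    0
$MSE(\hat{D}_9)_I = \bar{D}^2[V_{20} + 2(V_{11} - V'_{11}) + (V_{02} - V'_{02})]$    1    1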
3.1. Optimal sub-class of estimators
Differentiating MSE(·) with respect to $\alpha$ and equating to zero, one obtains the optimum value of $\alpha$ as
Case I

$$\alpha = \alpha_{opt} = -\rho\,\frac{C_d}{C_a} = -M \tag{3.11}$$
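Table 4: Mean Squared Error of Estimators under case II as members of proposed class
Mean Squared Error    α    β
$MSE(\hat{D}_1)_{II} = \bar{D}^2[V_{20} - 2V_{11} + (V_{02} + V'_{02})]$    -1    -1
$MSE(\hat{D}_2)_{II} = \bar{D}^2[V_{20} - 2V_{11} + (V_{02} + V'_{02})]$    -1    0
$MSE(\hat{D}_3)_{II} = \bar{D}^2[V_{20} - 2V_{11} + (V_{02} + V'_{02})]$    -1    1
$V(\hat{D}_4) = \bar{D}^2 V_{20}$    0    -1
$V(\hat{D}_5) = \bar{D}^2 V_{20}$    0    0
$V(\hat{D}_6) = \bar{D}^2 V_{20}$    0    1
$MSE(\hat{D}_7)_{II} = \bar{D}^2[V_{20} + 2V_{11} + (V_{02} + V'_{02})]$    1    -1
$MSE(\hat{D}_8)_{II} = \bar{D}^2[V_{20} + 2V_{11} + (V_{02} + V'_{02})]$    1    0
$MSE(\hat{D}_9)_{II} = \bar{D}^2[V_{20} + 2V_{11} + (V_{02} + V'_{02})]$    1    1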
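Table 5: Bias of Estimators under case I as members of proposed class
Bias    α    β
$Bias(\hat{D}_1)_I = -\bar{D}\,[(V_{11} - V'_{11}) + (V_{02} - V'_{02})]$    -1    -1
$Bias(\hat{D}_2)_I = -\bar{D}\,(V_{11} - V'_{11})$    -1    0
$Bias(\hat{D}_3)_I = -\bar{D}\,[(V_{11} - V'_{11}) - (V_{02} - V'_{02})]$    -1    1
$Bias(\hat{D}_4) = 0$    0    -1
$Bias(\hat{D}_5) = 0$    0    0
$Bias(\hat{D}_6) = 0$    0    1
$Bias(\hat{D}_7)_I = \bar{D}\,[(V_{11} - V'_{11}) + (V_{02} - V'_{02})]$    1    -1
$Bias(\hat{D}_8)_I = \bar{D}\,(V_{11} - V'_{11})$    1    0
$Bias(\hat{D}_9)_I = \bar{D}\,[(V_{11} - V'_{11}) - (V_{02} - V'_{02})]$    1    1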
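Table 6: Bias of Estimators under case II as members of proposed class
Bias    α    β
$Bias(\hat{D}_1)_{II} = -\bar{D}\,[V_{11} + (V_{02} + V'_{02})]$    -1    -1
$Bias(\hat{D}_2)_{II} = -\bar{D}\,V_{11}$    -1    0
$Bias(\hat{D}_3)_{II} = -\bar{D}\,[V_{11} - (V_{02} + V'_{02})]$    -1    1
$Bias(\hat{D}_4) = 0$    0    -1
$Bias(\hat{D}_5) = 0$    0    0
$Bias(\hat{D}_6) = 0$    0    1
$Bias(\hat{D}_7)_{II} = \bar{D}\,[V_{11} + (V_{02} + V'_{02})]$    1    -1
$Bias(\hat{D}_8)_{II} = \bar{D}\,V_{11}$    1    0
$Bias(\hat{D}_9)_{II} = \bar{D}\,[V_{11} - (V_{02} + V'_{02})]$    1    1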
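Case II

$$\alpha = \alpha_{opt} = -\frac{V_{11}}{V_{02} + V'_{02}} = -\frac{1}{(1 + \delta)}\,\rho\,\frac{C_d}{C_a} = -\frac{M}{(1 + \delta)}, \qquad \text{where } \delta = \frac{V'_{02}}{V_{02}} \tag{3.12}$$

The mean squared errors under the optimum value $\alpha = \alpha_{opt}$ [as per (3.8), (3.9)] are
Case I

$$[MSE(\hat{D}_{LEd})_{I}]_{opt} = \bar{D}^2 C_d^2\left[\left(\frac{1}{n} - \frac{1}{N}\right) - \left(\frac{1}{n} - \frac{1}{n'}\right)\rho^2\right] \tag{3.13}$$

Case II

$$[MSE(\hat{D}_{LEd})_{II}]_{opt} = \bar{D}^2 C_d^2\left(\frac{1}{n} - \frac{1}{N}\right)\left[1 - \frac{\rho^2}{1 + \delta}\right] \tag{3.14}$$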
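The optimum values can be verified by minimising the quadratic MSE expressions in alpha. The sketch below assumes that the optimum is obtained by setting the derivative of each MSE to zero, which gives alpha_opt = -(V11 - V'11)/(V02 - V'02) for case I and alpha_opt = -V11/(V02 + V'02) for case II, and compares the resulting MSE with the closed forms Dbar^2 C_d^2 [(1/n - 1/N) - (1/n - 1/n') rho^2] and Dbar^2 C_d^2 (1/n - 1/N)[1 - rho^2/(1 + delta)]. All names and inputs are hypothetical.

```python
def v_terms(n, n_first, N, C_d, C_a, rho):
    f = 1.0 / n - 1.0 / N
    f1 = 1.0 / n_first - 1.0 / N
    return {
        "V20": f * C_d ** 2, "V11": f * rho * C_d * C_a, "V02": f * C_a ** 2,
        "V11p": f1 * rho * C_d * C_a, "V02p": f1 * C_a ** 2,  # primed terms
    }

def mse_case_i(D_bar, v, alpha):
    return D_bar ** 2 * (v["V20"] + 2 * alpha * (v["V11"] - v["V11p"])
                         + alpha ** 2 * (v["V02"] - v["V02p"]))

def mse_case_ii(D_bar, v, alpha):
    return D_bar ** 2 * (v["V20"] + 2 * alpha * v["V11"]
                         + alpha ** 2 * (v["V02"] + v["V02p"]))

def alpha_opt_case_i(v):
    # Root of d(MSE_I)/d(alpha) = 0; equals -rho * C_d / C_a.
    return -(v["V11"] - v["V11p"]) / (v["V02"] - v["V02p"])

def alpha_opt_case_ii(v):
    # Root of d(MSE_II)/d(alpha) = 0.
    return -v["V11"] / (v["V02"] + v["V02p"])

# Hypothetical design and population values.
D_bar, n, n_first, N, C_d, C_a, rho = 50.0, 20, 80, 200, 0.4, 0.5, 0.8
v = v_terms(n, n_first, N, C_d, C_a, rho)
delta = v["V02p"] / v["V02"]

a1, a2 = alpha_opt_case_i(v), alpha_opt_case_ii(v)
opt_i = mse_case_i(D_bar, v, a1)
opt_ii = mse_case_ii(D_bar, v, a2)

closed_i = D_bar ** 2 * C_d ** 2 * ((1 / n - 1 / N) - (1 / n - 1 / n_first) * rho ** 2)
closed_ii = D_bar ** 2 * C_d ** 2 * (1 / n - 1 / N) * (1 - rho ** 2 / (1 + delta))
```

Evaluating either MSE slightly away from its optimum alpha gives a larger value, which is a quick way to confirm that the stationary point is a minimum.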
4. Comparison with existing estimators
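The existing estimators are less efficient than the proposed estimators under case I and case II, respectively, when the following conditions hold:
(1) Simple random sample mean estimator ($\bar{d}$):
Case I: $\alpha < \dfrac{-2(V_{11} - V'_{11})}{(V_{02} - V'_{02})}$, Case II: $\alpha < \dfrac{-2V_{11}}{(V_{02} + V'_{02})}$
(2) Usual Ratio Estimator ($\hat{D}_{Rd}$) [eq. (2.3)]
Case I: $\alpha < 1 - 2\left(\rho\,\dfrac{C_d}{C_a}\right)$, Case II: $\alpha < 1 - \dfrac{2}{(1 + \delta)}\left(\rho\,\dfrac{C_d}{C_a}\right)$
(3) Usual Exponential Ratio Estimator ($\hat{D}_{Exd}$) [eq. (2.8)]
Case I: $\alpha < \dfrac{1}{2}\left[1 - 4\rho\,\dfrac{C_d}{C_a}\right]$, Case II: $\alpha < \dfrac{1}{2}\left[1 - \dfrac{4}{(1 + \delta)}\left(\rho\,\dfrac{C_d}{C_a}\right)\right]$
(4) Usual Log Ratio Estimator ($\hat{D}_{Lod}$) [eq. (2.13)]
Case I: $\alpha < 1 - 2\left(\rho\,\dfrac{C_d}{C_a}\right)$, Case II: $\alpha < 1 - \dfrac{2}{(1 + \delta)}\left(\rho\,\dfrac{C_d}{C_a}\right)$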
(5) Usual Regression Estimator (DRed)[eq(2.18)]
Case I: a < -2 ( pC ) ,
Case II: a 1
(1 + S)
Cd 'cT
a
5. Risk function and the Proposed estimator
The risk in data collection in a dangerous area while implementing a sampling procedure is classified as
(a) Total risk
(b) Per-unit respondent contact risk (infection, injury, life risk)
(c) General risk (area-dependent risk)
Risk is associated with various ground conditions, such as the risk in hilly areas during data collection, the risk in reaching the household, the risk of non-response, the risk of dangerous situations, the risk of attack on the life of the surveyor and the risk of epidemic. The following symbols are used for risk:
r' : Overhead risk
r0 : Total risk
r1 : Risk per unit for collecting information on variables D and A using the second sample of size n.
r2 : Risk per unit in the first sample of size n' for collecting information on the auxiliary variable A.
The linear risk function for collecting information is:

$$r_0 = r' + r_1 n + r_2 n'$$
It is of interest to determine $n$ and $n'$ for a given total risk $r_0$ such that the MSE of $\hat{D}_{LEd}$ is minimum. To minimize the MSE under the risk constraint, the function $\phi$ is formed as follows.
Case I

$$\phi = [MSE(\hat{D}_{LEd})_{I}]_{opt} + \lambda(r' + r_1 n + r_2 n' - r_0)$$

where $\lambda$ is a Lagrange multiplier. Differentiating with respect to $n$ and $n'$ and equating to zero, the optimum values of $n$ and $n'$ are

$$n_{opt} = \frac{(r_0 - r')\sqrt{r_1 R}}{r_1 M_1}, \qquad n'_{opt} = \frac{(r_0 - r')\sqrt{-r_2(R - C_d^2)}}{r_2 M_1} \tag{5.1}$$

where

$$M_1 = \left[\sqrt{r_1 R} + \sqrt{-r_2(R - C_d^2)}\right], \qquad R = \left[C_d^2 + 2\alpha C_{da} + \alpha^2 C_a^2\right]$$
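Case II

$$\phi = [MSE(\hat{D}_{LEd})_{II}]_{opt} + \lambda(r' + r_1 n + r_2 n' - r_0)$$

where $\lambda$ is a Lagrange multiplier. Differentiating with respect to $n$ and $n'$ and equating to zero, the optimum values of $n$ and $n'$ under case II are

$$n_{opt} = \frac{(r_0 - r')\sqrt{r_1 R}}{r_1 M_2}, \qquad n'_{opt} = \frac{(r_0 - r')\sqrt{r_2\,\alpha^2 C_a^2}}{r_2 M_2} \tag{5.2}$$

where

$$M_2 = \left[\sqrt{r_1 R} + \sqrt{r_2(\alpha^2 C_a^2)}\right], \qquad R = \left[C_d^2 + 2\alpha C_{da} + \alpha^2 C_a^2\right]$$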
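The ratio of the optimal $n$ to the optimal $n'$ under fixed total risk $r_0$ is
Case I

$$\left(\frac{n_{opt}}{n'_{opt}}\right) = \frac{r_2\sqrt{r_1 R}}{r_1\sqrt{-r_2(R - C_d^2)}}$$

Case II

$$\left(\frac{n_{opt}}{n'_{opt}}\right) = \frac{r_2\sqrt{r_1 R}}{r_1\sqrt{r_2\,\alpha^2 C_a^2}}$$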
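The optimal sample sizes under the linear risk function can be computed as in the sketch below. It assumes the Lagrangian solution in which n is proportional to sqrt(R / r1) and n' is proportional to sqrt(T / r2), with T = -(R - C_d^2) for case I and T = alpha^2 C_a^2 for case II, scaled so that the risk constraint r0 = r' + r1 n + r2 n' holds exactly. The function name `optimal_sizes` and the risk values r0, r', r1, r2 are hypothetical; the coefficients are those reported in Table 8, and the sizes are returned as real numbers that a practitioner would round.

```python
import math

def optimal_sizes(r0, r_over, r1, r2, C_d2, C_a2, C_da, alpha, case):
    """Optimal (n, n') under the risk constraint r0 = r_over + r1*n + r2*n'."""
    R = C_d2 + 2 * alpha * C_da + alpha ** 2 * C_a2
    if case == "I":
        T = -(R - C_d2)           # coefficient of 1/n' in MSE_I is (R - C_d2) <= 0
    elif case == "II":
        T = alpha ** 2 * C_a2     # coefficient of 1/n' in MSE_II
    else:
        raise ValueError("case must be 'I' or 'II'")
    if R <= 0 or T <= 0:
        raise ValueError("R and the first-phase term must be positive")
    M = math.sqrt(r1 * R) + math.sqrt(r2 * T)
    n_opt = (r0 - r_over) * math.sqrt(r1 * R) / (r1 * M)
    n_first_opt = (r0 - r_over) * math.sqrt(r2 * T) / (r2 * M)
    return n_opt, n_first_opt

# Coefficients from Table 8; alpha set to -C_da / C_a^2 (that is, -M).
C_d2, C_a2, C_da = 4.534, 4.356, 4.43
alpha = -C_da / C_a2
# Hypothetical risk values, for illustration only.
r0, r_over, r1, r2 = 60.0, 10.0, 2.0, 0.5

n_i, n1_i = optimal_sizes(r0, r_over, r1, r2, C_d2, C_a2, C_da, alpha, "I")
n_ii, n1_ii = optimal_sizes(r0, r_over, r1, r2, C_d2, C_a2, C_da, alpha, "II")
```

Both solutions exhaust the available risk budget exactly, and the ratio n_i / n1_i reproduces the case I ratio of optimal sample sizes.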
6. Empirical risk based Study
Consider a positively correlated population with two variables D and A (data source: 6th Minor Irrigation Census, Village Schedule, Assam [19]) with N = 100.
The values of the variables D and A are shown in Table 7, where A represents the geographical area and D represents the net sown area in hectares.
Table 7: Population Undertaken.
Di 152 98 75 68 60 295 72 125 16 260
Ai 165 111 80 79 78 319 86 189 26 380
Di 62 95 210 95 175 180 100 37 87 96
Ai 74 123 220 123 185 197 120 48 105 109
Di 80 148 85 98 38 95 200 84 18 38
Ai 110 158 121 108 40 110 350 95 28 46
Di 53 69 30 55 29 75 78 48 81 75
Ai 71 81 45 63 45 89 110 59 95 92
Di 103 97 82 25 76 70 57 182 55 85
Ai 113 105 96 35 94 81 70 192 65 122
Di 70 24 190 53 190 158 80 93 176 81
Ai 75 34 200 67 232 169 100 103 186 89
Moreover, population parameters are in the Table 8.
Table 8: Population Parameters
$\bar{D} = 135$    $S_d^2 = 82327$    $C_d^2 = 4.534$    $S_{ad} = 96274.91$
$\bar{A} = 161$    $S_a^2 = 113076.5$    $C_a^2 = 4.356$    $C_{ad} = 4.43$
Table 9: PREs of different estimators with respect to proposed estimator in double sampling
Estimators    PRE (Case I)    PRE (Case II)
Simple random sampling mean ($\bar{d}$)    22.13%    56.083%
Ratio estimator ($\hat{D}_{Rd}$)    0.009%    41.493%
Exponential ratio estimator ($\hat{D}_{Exd}$)    6.856%    2.011%
Log ratio estimator ($\hat{D}_{Lod}$)    0.009%    41.493%
where PRE is the Percentage Relative Efficiency, defined as:

$$(PRE)_{I,II} = \frac{MSE(\hat{D}_{LEd})_{I,II}}{MSE(T)_{I,II}} \times 100 \tag{6.1}$$
Here T represents estimators such as the usual ratio, usual exponential ratio and usual log ratio estimators. It is observed that in case I, at $\alpha_{opt}$, the proposed estimator is 22.13% efficient over the sample mean estimator and 6.85% better than the exponential ratio estimator, and it has the same PRE (0.009%) against the usual ratio and usual log ratio estimators. Moreover, in case II, at $\alpha_{opt}$, the proposed estimator is 56% efficient over the sample mean estimator, 41.4% efficient over the ratio estimator, 2% efficient over the exponential estimator and 41.4% efficient over the log ratio estimator.
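A comparison of this kind can also be explored by Monte Carlo simulation. The sketch below is not the paper's study: it uses a synthetic, strongly correlated population rather than the census data, draws repeated case I double samples, and compares the empirical MSE of the sample mean with that of the proposed estimator at alpha = -1 and beta = 0. The function names, the population model, the sample sizes and the seeds are all hypothetical.

```python
import math
import random

def log_exp_estimator(d_bar, a_bar, a1_bar, alpha, beta):
    """Proposed estimator: d_bar * exp{[1 - (a'/a)^alpha][1 + beta*log(a'/a)]}."""
    ratio = a1_bar / a_bar
    return d_bar * math.exp((1.0 - ratio ** alpha) * (1.0 + beta * math.log(ratio)))

def simulate_case_i(D, A, n_first, n, alpha, beta, reps, seed):
    """Empirical MSEs of the sample mean and the proposed estimator under case I."""
    rng = random.Random(seed)
    N = len(D)
    D_bar = sum(D) / N
    se_mean = 0.0
    se_prop = 0.0
    for _ in range(reps):
        first = rng.sample(range(N), n_first)   # first-phase sample
        second = rng.sample(first, n)           # sub-sample of the first phase
        a1_bar = sum(A[i] for i in first) / n_first
        a_bar = sum(A[i] for i in second) / n
        d_bar = sum(D[i] for i in second) / n
        est = log_exp_estimator(d_bar, a_bar, a1_bar, alpha, beta)
        se_mean += (d_bar - D_bar) ** 2
        se_prop += (est - D_bar) ** 2
    return se_mean / reps, se_prop / reps

# Synthetic population (an assumption): D roughly proportional to A plus noise.
pop_rng = random.Random(1)
A = [pop_rng.uniform(20.0, 200.0) for _ in range(500)]
D = [0.9 * a + pop_rng.gauss(0.0, 5.0) for a in A]

mse_mean, mse_prop = simulate_case_i(D, A, n_first=200, n=40,
                                     alpha=-1.0, beta=0.0, reps=2000, seed=7)
```

Because D is nearly proportional to A in this synthetic population, the value alpha = -1 is close to the case I optimum, so the proposed estimator is expected to show a clearly smaller empirical MSE than the sample mean.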
In Figure 5, over the general variation of $\alpha$, case I has a lower MSE than case II. On reaching $\alpha_{opt}$, however, both cases achieve the same MSE level, equivalent to that of the regression estimator in double sampling.
Figure 6 reveals the variation of the optimum sample sizes ($n_{opt}$ and $n'_{opt}$) with the total risk $r_0$. It is observed that increasing the fixed risk $r_0$ leads to a larger $n'_{opt}$ (first sample) in comparison with the second-sample optimum $n_{opt}$. A low level of risk indicates that equal (but small) $n$ and $n'$ should be used by survey practitioners.
Figure 5: Comparison between the MSEs of the proposed class under case I and case II over variation of $\alpha$
Figure 6: nopt and n'opt for case I over change to total risk r0
Figure 7 depicts a similar pattern for $n_{opt}$ and $n'_{opt}$ under case II as the total risk $r_0$ varies. Interestingly, as the total risk $r_0$ increases, case II needs a smaller optimum first-phase (preliminary) sample than case I.
Figure 8 reveals an interesting feature of cases I and II: the ratio $(n_{opt}/n'_{opt})$ is higher in case II than in case I. This feature confirms that if $r_2$ increases over fixed $r_1$, then $n_{opt}$ increases over fixed $n'_{opt}$, but the increment is higher in case II than in case I.
Figure 7: Variation of $n_{opt}$ and $n'_{opt}$ for case II over change in total risk $r_0$
Figure 8: Variation of $(n_{opt}/n'_{opt})$ with respect to the ratio $(r_2/r_1)$
7. Conclusion
To recapitulate, this paper presents a new class of estimators for estimating the unknown population mean in double sampling in the presence of auxiliary information. Some authors in the literature have proposed exponential-type estimators and others log-type estimators. The suggested estimation procedure is a combined class of estimators incorporating both the exponential and the log-type structure. Its properties are discussed and compared in the setup of double sampling under the case I and case II sampling strategies. The proposed class is found to be conditionally efficient over the usual exponential-type and usual log-type estimators (Table 9). Moreover, a linear risk function is used with three risk parameters $r_0$, $r_1$, $r_2$, and expressions for the optimal sample sizes $n_{opt}$ and $n'_{opt}$ are derived. The risk-based simulation study reveals that increasing the fixed risk $r_0$ leads to a larger $n'_{opt}$ (first sample) relative to $n_{opt}$, whereas a low level of risk calls for equal (but small) $n$ and $n'$. Case II needs a smaller preliminary sample size than case I. When the optimum ratio of sample sizes $(n_{opt}/n'_{opt})$ is considered against the risk ratio $(r_2/r_1)$, the case I curve lies consistently below the case II curve, indicating a lower comparative optimum sample ratio in double sampling when using the suggested expo-log estimator at the choice $\alpha = \alpha_{opt}$.
References
[1] Alim A. and Shukla D. (2021). Double sampling based parameter estimation in Big Data and application in Control Charts, Reliability (RT&M), 16(62):72-86.
[2] Bahl, S. and Tuteja, R.K. (1991). Ratio and product type estimators, Journal of Information and Optimization Sciences, 12(1):159-164.
[3] Bhushan, S. and Gupta, R. (2019). Some log-type classes of estimator using auxiliary variable attribute, Advances in Computational Science and Technology, 12(2):99-108.
[4] Bhushan, S. and Kumar, A. (2020). Log-type estimators of population mean in rank set sampling, Predictive analysis using statistics and Big Data: Concepts and Modeling, 28:47-74.
[5] Bhushan, S., Pandey, A. and Shubra, K. (2008). A class of estimators in double sampling using two auxiliary variables, Journal of Reliability and Statistical studies, 1(1):67-73.
[6] Changbao Wu and Ying L. (2003). Optimal calibration estimators under two-phase sampling, Quality Engineering, 19(2):119-131.
[7] Kumari, C. and Thakur, R. K. (2021). An efficient log-type class of estimators using auxiliary information under double sampling, Journal of Statistics Application and Probability, 10(1):197-202.
[8] Muhammad A. and Muhammad H. (2014). On estimation of population mean in the presence of measurement error and non-response, Pak. J. Statist., 31(5):657-670.
[9] Sahoo J., Sahoo L.N. and Mohanty S. (1993). A regression approach for estimation in two-phase sampling using auxiliary variables, Current Science, 65:73-75.
[10] Sanaullah, A., Ali, H. A., Amits, Muhammad Noor ul and Hanif, M. (2014). Generalised exponential chain ratio estimators under stratified two-phase random sampling, Applied Mathematics and Computation, 226:541-547.
[11] Sanaullah, A., Amin, Muhammad Noor-ul and Hanif, M. (2015). Generalized exponential-type ratio-cum-ratio and product-cum-product estimators for population mean in presence of non-response under stratified two-phase random sampling, Pakistan Journal of Statistics, 31(1):71-94.
[12] Sanaullah, A., Hanif, M. and Asghar, A. (2016). Generalized exponential estimators for population variance under two-phase sampling, International Journal of Applied Computation and Mathematics, 2:75-84.
[13] Shabbir, J., Ahmed S., Sanaullah, A. and Onyange, R. (2021). Measuring performance of ratio-exponential-log type general class of estimators using two auxiliary variables, Mathematical Problems in Engineering, 2021(3):1-12.
[14] Shabbir, J. and Sat G. (2011). On estimating finite population mean in simple and stratified random sampling, Communication in Statistics (Theory and Methods), 40:199-212.
[15] Shabbir, J., Sat G. and Masood, S. (2022). An improved class of estimators for finite population mean in simple random sampling, Communication in Statistics, Theory and Method, 51(11):3508-3520.
[16] Shukla, D. (2002). FT estimator under two phase sampling, Metron (International Journal of Statistics), 60(1-2):97-106.
[17] Zahoor, A., Iqra M. and Muhammad H. (2014). Regression estimator in two phase sampling using Multi-auxiliary information in the presence of non-response and measurement error at second phase, Journal of Applied Probability and Statistics, 9(2):41-5.
[18] Zaman, T. and Kadilar, C. (2021). Exponential ratio and product type estimators of mean in stratified two-phase sampling, AIMS Mathematics, 6(5):4265-4279.
[19] https://data.gov.in/ Data source: 6th Minor Irrigation Census, Village Schedule, Assam.