Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
CALIBRATED ESTIMATORS OF FINITE POPULATION COVARIANCE Aleksandras Plikusas1 and Dalius Pumputis2 1 Vilnius University, Institute of Mathematics and Informatics, Lithuania e-mail: [email protected] 2 Institute of Mathematics and Informatics, Vilnius Pedagogical University, Lithuania e-mail: [email protected] Abstract Calibrated estimators of the population covariance are constructed using auxiliary information in order to get more accurate estimates. Different distance measures are used to construct calibrated estimators. Four estimators of the covariance are presented in the paper. The experimental comparison of the considered estimators will be presented when correlation between study variable and known auxiliary variable be 0.8, 0.6, 0.4, 0.2. 1 Introduction Calibrated estimators are widely used in finite population statistics to improve the quality of estimators, using auxiliary information. The idea of calibration technique for estimating of population totals was presented in (J.-C. Deville, C.-E. Särndal 1992, 376–382 p.). Recently the calibration technique has been widely used in the presence of non-response (S. Lundström 1997). The calibrated estimators of a ratio of two totals were introduced in (A. Plikusas 2003, 543–547 p.). The calibrated estimator of a ratio as well as of population variance and covariance can be defined in different ways. We can choose a different calibration equation and use different distance functions. Five distance functions have been presented in (J.-C. Deville, C.-E. Särndal 1992, 376–382 p.), but only one of them ( L1 ) is being used in practice. This distance function is the simplest one and there exists an explicit solution of calibration equations when calibrating the estimator of the total as well as of the ratio (A. Plikusas 2001, 457–462 p.). An undesirable property of this distance function is that for some populations calibrated weights can be negative. 2 Calibration problem Consider a finite population U {u1 , u2 , ..., u N } of N elements. Let y and z be two study variables defined on the population U and taking values { y1 , y2 , ..., yN } and {z1 , z2 , ..., z N } respectively. The values of the variables y and z are not known. We are interested in the estimation of the covariance S yz . Let us consider the estimator of the covariance Sˆ yz 1 1 d k yk N 1 ks N d y z is i i k 1 N d z . is i i Here s denotes a probability sample set, d k 1/ k are sample design weights, k is a probability of inclusion of the element k into the sample s. Suppose, that some auxiliary information is available. Let a variable x y with the population values {x y1 , x y 2 , ..., x yN } and a variable xz with the values {x z1 , x z 2 , ..., x zN } be auxiliary variables with the known covariance S x y xz . The covariance estimator is constructed using the known auxiliary variables. It is known that in case the auxiliary variables are well correlated with study variables, the variance of the calibrated estimator of the covariance is lower. Using auxiliary variables the calibrated estimator Sˆ wyz 1 1 1 wk y k wi yi z k wi z i of the covariance S yz is defined under the N 1 ks N is N is following conditions: a) the weights wk estimate the known covariance S x y xz without error: Sˆ wx y xz S x y xz (1) b) the distance between the design weights d k and calibrated weights wk is minimal according to some loss function L. 3 Examples of distance measures Let us introduce free additional weights qk , k 1,..., N . One can modify calibrated estimators by choosing q k . A number of a different estimators can be derived as a special case of the calibrated estimator by choosing weights q k . We can also put qk 1 for all k. The following loss functions can be considered: L1 ks ( wk d k ) 2 ( wk d k ) 2 w w 1 , L2 k log k ( wk d k ), L3 2 , d k qk d k qk qk ks q k ks d w (w d k ) 2 1 1 L4 k log k ( wk d k ), L5 k , L6 qk d k qk wk q k ks ks ks q k 2 wk 1 , dk 2 1 L7 ks q k wk 1 . d k The functions L1 L5 are mentioned in (see J.-C. Deville, C.-E. Särndal 1992, 376–382 p.). The distance measures L6 and L7 are introduced in (A. Plikusas 2003, 543–547 p.). 4 Results Let us introduce some notations ˆ w1 1 N w x ks k yk , ˆ w2 1 N w x ks k zk , N̂ w Nˆ bi x yi xzi w 2 x yi ˆ w 2 xzi ˆ w1 ˆ w1ˆ w 2 , N w ks k , B N 1S x y xz 2 N Nˆ w ˆ w1 ˆ w 2 d k x yk x zk . ks Proposition 1. The weights wi , which satisfy the calibration equation (1) and minimize the loss function L1 ( wk d k ) 2 can be expressed as wi d i i , with d k qk ks 1 i 1 B d k bk qk x yk x zk bi qi . ks Proposition 2. The weights wi , which satisfy the calibration equation (1) and minimize the loss function L3 2 ( wk d k ) 2 ks qk 1 2 here is a properly chosen root of the equation 1 wk q k2 bk2 x yk x zk 2 wk q k bk x yk x zk B 0 . 4 ks ks Proposition 3. The weights wi , which satisfy the calibration equation (1) and minimize the loss 1 function L6 ks q k 2 wk 1 can be expressed as wi d i i , with dk 1 i 1 B d k2 bk qk x yk x zk d i bi qi ks Proposition 4. The weights wi , which satisfy the calibration equation (1) and minimize the loss 2 1 function L7 ks q k 2 can be expressed as wi d i i , with i 1 bi qi 1 , wk 2 1 can be expressed as wi d i i , with i 1 d i bi qi 1 , d k here is a properly chosen root of the equation wk d k2 q k2 bk2 x yk x zk 2 2 wk d k q k bk x yk x zk B 0 . ks ks References J.-C. Deville, C.-E. Särndal, Calibration estimators in survey sampling, Journal of the American Statistical Association, 87, 376-382 (1992). S. Lundström, Calibration as Standard Method for Treatment of Nonresponse, Doctoral dissertation, Stockholm University (1997). A. Plikusas, Calibrated weights for the estimators of the ratio, Lith. Math. J., 43, 543-547 (2003). C.-E. Särndal, B. Swensson, J.Wretman, Model Assisted Survey Sampling, Springer-Verlag, NewYork.