Article type: Advanced Review
Computational rank-based statistics
Joseph W. McKean, Western Michigan University
Jeff T. Terpstra, North Dakota State University
John D. Kloke, Pomona College
Keywords
least absolute deviation/least absolute value, Newton method/Newton-Raphson method, rank method/rank statistics/rank tests, robust regression, statistical software
Abstract
This review discusses two algorithms which can be used to compute rank-based regression
estimates. For completeness, a brief overview of rank-based inference procedures in the context
of a linear model is presented. The discussion includes geometry, estimation, inference, and
diagnostics. In regard to computing the rank-based estimates, we discuss two approaches. The first approach is based on an algebraic identity which allows one to compute the (Wilcoxon) estimates using an $L_1$ regression routine. The other approach is a Newton-type algorithm. In addition, we discuss how rank-based inference can be generalized to nonlinear and random effects models. Some simple examples using existing statistical software are also presented for the sake of illustration and comparison.
Traditional least squares (LS) procedures offer the user an encompassing methodology for
analyzing models, linear or non-linear. These procedures are based on the simple premise of
fitting the model by minimizing the Euclidean distance between the vector of responses and the
model. Besides the fit, the LS procedures include diagnostics to check the quality of fit and an
array of inference procedures including confidence intervals (regions) and tests of hypotheses.
LS procedures, though, are not robust. One outlier can spoil the LS fit, its associated inference
and even its diagnostic procedures (i.e. methods which should detect the outliers).
Rank-based procedures also offer the user a complete methodology. The only essential change
is to replace the Euclidean norm by another norm, so that the geometry remains the same. As
with the LS procedures, these rank-based procedures offer the user diagnostic tools to check the
quality of fit and associated inference procedures. Further, in contrast to the LS procedures, they
are robust to the effect of outliers. They are generalizations of simple nonparametric rank procedures, such as the Wilcoxon one- and two-sample methods, and they retain the high efficiency of these simple rank methods. Further, depending on the knowledge of the underlying
error distribution, this rank-based analysis can be optimized by the choice of the norm (scores).
Weighted versions of the fit can obtain high (50%) breakdown.
Rank-Based Inference for Linear Models
For brevity, we discuss rank-based inference in terms of a linear model. This analysis, though,
has been generalized to many different models. For instance, time series models are discussed
in Terpstra et al. (2000, 2001a, 2001b) and Terpstra and Rao (2001), nonlinear models are discussed in Abebe and McKean (2007), and random effect models are discussed in Kloke et al. (2008).
What follows then is a very brief description of the rank-based procedures for linear models. For
more details, see Chapters 3-5 of Hettmansperger and McKean (1998). Let $Y$ denote an $n \times 1$ vector of responses and assume that it follows the linear model,

$$Y = \alpha 1_n + X\beta + e, \qquad (1)$$

where $1_n$ denotes an $n \times 1$ vector of ones, $\alpha$ is the intercept parameter, $X$ is an $n \times p$ matrix of predictors, $\beta$ is a $p \times 1$ vector of regression coefficients, and $e$ is an $n \times 1$ vector of random errors. The design matrix $X$ can contain incidence vectors as well as continuous predictors (covariates). Let $\Omega$ be the range of $X$ and assume without loss of generality that $X$ is centered and has full column rank, so that the dimension of $\Omega$ is $p$.
The geometry of the rank-based procedures is the same as least squares (LS), except that instead of the Euclidean norm, the following pseudo-norm is used:

$$\|v\|_\varphi = \sum_{i=1}^{n} a(R(v_i))\, v_i, \qquad (2)$$

where the scores are generated as $a(i) = \varphi(i/(n+1))$ for a nondecreasing function $\varphi$, defined on the interval $(0,1)$, and $R(v_i)$ is the rank of $v_i$ among $v_1, \dots, v_n$. We assume without loss of generality that the scores sum to zero. Two of the most popular score functions are the Wilcoxon ($\varphi(u) = \sqrt{12}\,(u - 1/2)$) and the sign ($\varphi(u) = \mathrm{sgn}(u - 1/2)$).
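For illustration, the pseudo-norm in (2) with Wilcoxon scores can be coded in a few lines of R (a minimal sketch; the names wilcoxon_phi and pseudo_norm are ours, not from any package):

    # Wilcoxon score function: phi(u) = sqrt(12) * (u - 1/2)
    wilcoxon_phi <- function(u) sqrt(12) * (u - 0.5)

    # Pseudo-norm ||v||_phi = sum a(R(v_i)) v_i, with a(i) = phi(i / (n + 1))
    pseudo_norm <- function(v, phi = wilcoxon_phi) {
      n <- length(v)
      a <- phi(rank(v) / (n + 1))  # scores evaluated at the ranks of v
      sum(a * v)
    }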
The rank-based estimate of $\beta$ is given by

$$\hat\beta_\varphi = \operatorname{argmin}_\beta \|Y - X\beta\|_\varphi. \qquad (3)$$

Since $\|\cdot\|_\varphi$ is a norm, $\|Y - X\beta\|_\varphi$ is a convex function of $\beta$; hence, $\hat\beta_\varphi$ exists. Because the scores sum to zero and the ranks are invariant to a constant shift, the intercept cannot be estimated using the pseudo-norm. Instead, it is usually estimated as the median of the residuals. That is, $\hat\alpha = \mathrm{med}_i \{y_i - x_i'\hat\beta_\varphi\}$, where $x_i'$ is the $i$th row of $X$.
The rank-based estimate generalizes robust estimates in the simple location problems. For example, in the two-sample location problem, $X$ is the incidence vector of the two samples. If Wilcoxon scores are used, then $\hat\beta_\varphi$ is the Hodges-Lehmann estimate (i.e. the median of all differences between the samples).
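In R, for instance, this estimate is a one-liner (a minimal sketch, with y1 and y2 denoting the two samples):

    # Hodges-Lehmann estimate: median of all pairwise differences y2_j - y1_i
    hodges_lehmann <- function(y1, y2) median(outer(y2, y1, "-"))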
Let $f$ denote the probability density function (pdf) of $e$. Then, under regularity conditions,

$$\hat\beta_\varphi \text{ is approximately } N_p\!\left(\beta,\ \tau_\varphi^2 (X'X)^{-1}\right), \qquad (4)$$

where

$$\tau_\varphi = \left[\int_0^1 \varphi(u)\,\varphi_f(u)\,du\right]^{-1} \quad \text{and} \quad \varphi_f(u) = -\frac{f'(F^{-1}(u))}{f(F^{-1}(u))},$$

with $F$ denoting the cumulative distribution function of $e$.
It follows from this asymptotic theory that the asymptotic relative efficiency between the rank-based and least squares estimates of $\beta$ is $\sigma^2/\tau_\varphi^2$, where $\sigma^2$ denotes the variance of $e$. This is the same efficiency as in the simple location problems. In particular, for Wilcoxon scores and normal errors, this efficiency is 0.955. For error distributions with thicker tails, this efficiency can be quite high. Furthermore, depending on the knowledge of the error pdf $f$, appropriate scores can result in asymptotically efficient estimates.
Next, consider the general linear hypothesis $H_0: M\beta = 0$, where $M$ is a $q \times p$ matrix of full row rank. Denote the reduced model subspace by $\omega = \{v \in \Omega : v = X\beta \text{ and } M\beta = 0\}$. There are several rank-based tests: a Wald-type test based on the asymptotic distribution of $\hat\beta_\varphi$, an aligned rank (scores-type) test, and a reduction in dispersion test. The reduction in dispersion test is given by

$$F_\varphi = \frac{RD_\varphi / q}{\hat\tau_\varphi / 2},$$

where $RD_\varphi = D(Y, \omega) - D(Y, \Omega)$ and $D(Y, V)$ denotes the distance between $Y$ and the corresponding subspace $V$. While $q F_\varphi$ is asymptotically $\chi^2(q)$ under $H_0$, small sample studies indicate that $F_\varphi$ should be compared with the same $F$-critical values as the LS test. For the aligned rank test, the reduced model residuals are obtained and then the test statistic is an adjusted gradient test based on the ranks of these residuals. In the literature, LS is often used for the reduced model fit. In this case, the aligned rank test is not robust. A better choice is to use the rank-based estimate for the reduced model fit. Rank-based confidence intervals for linear functions of $\beta$ are the same as those for LS except the estimate of $\sigma$ is replaced by the estimate of $\tau_\varphi$.
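As an illustration, a minimal R sketch of the reduction in dispersion test, assuming the full and reduced model dispersions and an estimate of $\tau_\varphi$ have already been computed (the function name is ours):

    # Reduction in dispersion test: compare F_phi with F(q, n - p - 1) critical values
    drop_in_dispersion_test <- function(D_reduced, D_full, q, tau_hat, n, p) {
      F_phi <- ((D_reduced - D_full) / q) / (tau_hat / 2)
      p_value <- pf(F_phi, df1 = q, df2 = n - p - 1, lower.tail = FALSE)
      list(statistic = F_phi, p.value = p_value)
    }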
This rank-based inference inherits the good efficiency properties of the estimates discussed above. Further, the influence functions of both the estimates and the tests are bounded in the $Y$ (response) space. They are not, however, bounded in the $x$ (factor) space, although this is irrelevant for designed experiments. The rank-based high breakdown procedures discussed in the sequel have high breakdown and bounded influence in both the $Y$ and $x$ spaces and are therefore more appropriate for observational studies.
Diagnostics for the rank-based analysis are discussed in McKean et al. (1993). In particular, studentized residuals are presented which, for each point, adjust for both its error variation and its location in factor space. These performed well in a Monte Carlo study for autoregressive rank-based procedures as discussed by Terpstra et al. (2003).
The main computational problem for these rank-based procedures concerns the computation of $\hat\beta_\varphi$. We discuss two algorithms. The first algorithm is for Wilcoxon scores, for which a simple identity enables the computation to be carried out as an $L_1$ estimate on differences of residuals. The other algorithm is a Newton-type algorithm which can be used for general scores.
Computing Rank-based Regression Estimates
$L_1$-Algorithms for Wilcoxon and Sign Scores
The Wilcoxon estimate of $\beta$ is defined to be a minimum of the following dispersion function:

$$D_W(\beta) = \sqrt{12} \sum_{i=1}^{n} \left[ \frac{R(e_i(\beta))}{n+1} - \frac{1}{2} \right] e_i(\beta), \qquad (5)$$

where $e_i(\beta) = y_i - x_i'\beta$ and $R(e_i(\beta))$ denotes the rank of $e_i(\beta)$ among $e_1(\beta), \dots, e_n(\beta)$. However, it can be shown (e.g. Hettmansperger, 1984) that

$$D_W(\beta) = \sum_{i<j} b_{ij} \left| e_i(\beta) - e_j(\beta) \right| \qquad (6)$$

for all $\beta$, where $b_{ij} \equiv \sqrt{12}/[2(n+1)]$. Note that multiplying the right-hand side of (6) by a constant does not change the estimate of $\beta$. Hence, for the Wilcoxon estimate, we can assume the weights are equal to one without loss of generality. This algebraic identity can be exploited to readily compute the Wilcoxon estimate of $\beta$. For example, to compute the Wilcoxon estimate one can simply use an $L_1$ regression routine with $y_i - y_j$ and $x_i - x_j$ playing the role of the response variables and design points, respectively. Recall that the $L_1$ regression estimate minimizes $\sum_{i=1}^{n} |y_i - x_i'\beta|$, which corresponds to the $L_1$ norm.

Incidentally, it is interesting to point out that the $L_1$ regression estimate can be viewed as a rank-based estimate as well. For instance, the identity on page 188 of Hettmansperger and McKean (1998) illustrates that the $L_1$ regression estimate essentially corresponds to $\hat\beta_\varphi$ with sign scores, $\varphi(u) = \mathrm{sgn}(u - 1/2)$.
Many of the mainstream statistical software packages have the ability to compute $L_1$ estimates. For example, in SAS, $L_1$ regression estimates can be computed using the IML procedure and the LAV subroutine. In S-PLUS, $L_1$ regression estimates are computed via the l1fit function. To our knowledge, R does not provide an explicit $L_1$ regression function. Nevertheless, since $L_1$ regression estimates are equivalent to (median) quantile regression estimates, the quantreg package written by Roger Koenker can be used to calculate the Wilcoxon estimate. We refer the interested reader to Koenker (2005) for more information on quantile regression. This R implementation is discussed in Terpstra and McKean (2005), as the Wilcoxon estimate is a special case of their so-called class of weighted Wilcoxon (WW) estimates.¹
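To make the pairwise-difference construction concrete, here is a minimal R sketch based on the quantreg package; wilcoxon_fit is our own illustrative name, not a function from the package:

    library(quantreg)

    # Wilcoxon fit via an L1 (median quantile) regression on pairwise differences
    wilcoxon_fit <- function(x, y) {
      x <- as.matrix(x)
      ij <- combn(length(y), 2)                      # all pairs i < j
      dy <- y[ij[1, ]] - y[ij[2, ]]                  # response differences
      dx <- x[ij[1, ], , drop = FALSE] - x[ij[2, ], , drop = FALSE]
      rq(dy ~ dx - 1, tau = 0.5)                     # no intercept: differencing removes it
    }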
Weighted Wilcoxon Estimates
Briefly, a WW-estimate corresponds to a minimum of (6) where the weights are no longer constant. In terms of computing the WW-estimate there is essentially no difference; simply use an $L_1$ regression routine with $b_{ij}(y_i - y_j)$ and $b_{ij}(x_i - x_j)$ playing the role of the response variables and design points, respectively. This is basically how weighted LS regression estimators are calculated.
Of course, the simplest weighting scheme corresponds to $b_{ij} \equiv 1$ (i.e. the Wilcoxon estimate). The Wilcoxon estimate, though, is not robust against outlying points in the design. However, robustness against outlying design points can be achieved by selecting an appropriate weighting scheme. Weights that depend on the design points only, say of the form $b_{ij} = h(x_i)\,h(x_j)$, are typically referred to as Mallows weights and are discussed in Sievers (1983), Naranjo and Hettmansperger (1994), and Hettmansperger and McKean (1998). As discussed by Naranjo and Hettmansperger, an appropriately chosen set of Mallows weights can yield a bounded influence function and a positive breakdown point. For instance, for the simple linear regression model, $b_{ij} = |x_i - x_j|^{-1}$ yields the well-known Theil (1950) estimate (i.e. the median of the pairwise slopes), which has a 33% breakdown point (e.g. Theorem 5.7.3 of Hettmansperger and McKean).
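A weighted version of the earlier sketch illustrates this; scaling both the response and design differences by $b_{ij}$ yields the weighted $L_1$ problem (ww_fit and theil_fit are our own names, and the Theil weighting assumes distinct design points):

    # Weighted Wilcoxon fit: minimize sum_{i<j} b_ij |dy_ij - dx_ij' beta|
    ww_fit <- function(x, y, weight_fun = function(xi, xj) 1) {
      x <- as.matrix(x)
      ij <- combn(length(y), 2)
      i <- ij[1, ]; j <- ij[2, ]
      b <- mapply(function(ii, jj) weight_fun(x[ii, ], x[jj, ]), i, j)
      dy <- b * (y[i] - y[j])
      dx <- b * (x[i, , drop = FALSE] - x[j, , drop = FALSE])
      quantreg::rq(dy ~ dx - 1, tau = 0.5)
    }

    # Theil weights for simple linear regression: b_ij = 1 / |x_i - x_j|
    theil_fit <- function(x, y) ww_fit(x, y, function(xi, xj) 1 / abs(xi - xj))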
A 50% breakdown estimate can be obtained by allowing the weights to depend on both the design points and the residuals from an initial fit. These weights are typically referred to as Schweppe weights. As an example, let

$$b_{ij} = \min\left\{1,\ \frac{c\,\hat\sigma^2}{|\hat e_i\,\hat e_j|}\,\min\left\{1,\ \frac{b}{Q_i}\right\}\min\left\{1,\ \frac{b}{Q_j}\right\}\right\}, \qquad (7)$$

where $Q_i$ denotes a robust version of the squared Mahalanobis distance of $x_i$, $\hat\sigma$ is a robust estimate of scale, and $b$ and $c$ are tuning constants. The tuning constant $b$ is set at the 95th percentile of the $\chi^2(p)$ distribution. Lastly,

$$\hat e_i = y_i - \hat\alpha_0 - x_i'\hat\beta_0 \qquad (8)$$

denotes the $i$th residual based on an initial estimate. Thus, these weights incorporate information from both the design points and the responses via the initial residuals. The weights in (7) and (8) were suggested by Chang et al. (1999) and are further discussed in Section 5.8.1 of Hettmansperger and McKean (1998). This particular WW-estimate is referred to as the HBR-estimate since it acquires a 50% breakdown point provided the initial estimates (i.e. the initial regression, scatter, and scale estimates) are 50% breakdown estimates. These weights, as well as others, are available in R via the suite of functions discussed by Terpstra and McKean (2005).
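A rough R sketch of computing such weights, using the robustbase package for the high breakdown initial estimates; the function name, the default value of the tuning constant c_tune, and the exact form of (7) as reconstructed above are our assumptions:

    library(robustbase)

    hbr_weights <- function(x, y, c_tune = 4) {        # c_tune: illustrative default
      x <- as.matrix(x)
      n <- nrow(x); p <- ncol(x)
      # High breakdown initial estimates: LTS regression, MCD scatter, MAD scale
      fit0 <- ltsReg(x, y)
      e    <- fit0$residuals
      sig  <- mad(e)
      mcd  <- covMcd(x)
      Q    <- mahalanobis(x, center = mcd$center, cov = mcd$cov)
      h    <- pmin(1, qchisq(0.95, df = p) / Q)        # design-point factor min{1, b/Q_i}
      # Pairwise weights b_ij from (7)
      ij <- combn(n, 2)
      i <- ij[1, ]; j <- ij[2, ]
      pmin(1, (c_tune * sig^2 / abs(e[i] * e[j])) * h[i] * h[j])
    }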
A Newton Algorithm for General Scores
A finite algorithm to minimize $D(\beta)$ is discussed in Osborne (1985). However, the algorithm we describe in this section minimizes $D(\beta)$ using a Newton type of algorithm based on the asymptotic quadraticity of $D(\beta)$. It is generally much faster than algorithms based on the gradient and is currently used in MINITAB (i.e. the RREGRESS command) and RGLM² (Kapenga, McKean, and Vidmar, 1988).

Our discussion is brief; more details can be found in Section 3.7 of Hettmansperger and McKean (1998). The algorithm begins with an initial estimate (e.g. least squares) which we denote as $\hat\beta^{(0)}$. Based on the asymptotic quadraticity of $D(\beta)$, the first Newton step is given by

$$\hat\beta^{(1)} = \hat\beta^{(0)} + \hat\tau_\varphi\,(X'X)^{-1} S\!\left(Y - X\hat\beta^{(0)}\right), \qquad (9)$$

where $S(Y - X\hat\beta^{(0)})$ denotes the negative of the gradient of $D(\beta)$ evaluated at $\hat\beta^{(0)}$. The subsequent iterated estimates are often called $k$-step estimates. Under regularity conditions, it can be shown that the $k$-step estimates and $\hat\beta_\varphi$ are asymptotically equivalent; see McKean and Hettmansperger (1978). At that time, this was an important result for computing R-estimates. For instance, one could simply use one or two steps. Because of the speed of modern computing, the importance of (9) now is that it is the basis for a Newton algorithm to obtain fully iterated estimates. This more formal algorithm, written in terms of a QR-decomposition of $X$, is presented next.
Consider the QR-decomposition of $X$, which is given by $X = QR$, where $Q$ is an $n \times n$ orthogonal matrix and $R$ is an $n \times p$ upper triangular matrix of rank $p$; see Stewart (1973). Writing $Q = [Q_1\ Q_2]$, where $Q_1$ is $n \times p$, it follows that the columns of $Q_1$ form an orthonormal basis for $\Omega$, the column space of $X$. In particular, the projection matrix onto $\Omega$ is given by $H = Q_1 Q_1'$. The software package LAPACK³ is a collection of subroutines which efficiently computes QR-decompositions, and it further has routines which obtain projections of vectors.
Note that we can write the $k$th Newton step in terms of residuals as

$$\hat e^{(k)} = \hat e^{(k-1)} - \hat\tau_\varphi\, H\, a\!\left(R(\hat e^{(k-1)})\right),$$

where $a(R(\hat e))$ denotes the vector whose $i$th component is $a(R(\hat e_i))$. Let $D^{(k)}$ denote the dispersion function evaluated at $\hat e^{(k)}$. The Newton step is a step from $\hat e^{(k-1)}$ along the direction $\hat\tau_\varphi H a(R(\hat e^{(k-1)}))$. If $D^{(k)} < D^{(k-1)}$ the step is successful; otherwise, a linear search can be made along the direction to find a value which minimizes $D$. This would then become the $k$th step residual. Such a search can be performed using methods such as false position, as discussed in Chapter 3 of Hettmansperger and McKean (1998). Stopping rules can be based on the relative drop in dispersion. For instance, one can stop when $(D^{(k-1)} - D^{(k)})/D^{(k-1)} < \epsilon$, where $\epsilon$ is a specified tolerance. A similar stopping rule can be based on the relative size of the step. Upon stopping at step $k$, obtain the fitted value $\hat Y = Y - \hat e^{(k)}$ and then obtain the estimate of $\beta$ by solving $X\hat\beta_\varphi = \hat Y$.
This is the algorithm for the package RGLM, except that, when steps fail, an average over the past history of directions is used. Also, Abebe and McKean (2007) discuss the convergence of this algorithm in a probability sense.
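The following R sketch outlines the iteration just described, under stated assumptions: Wilcoxon scores by default, a user-supplied estimate tau_hat, and simple step-halving in place of a false position search; all names are ours:

    rank_newton <- function(X, y, tau_hat,
                            phi = function(u) sqrt(12) * (u - 0.5),
                            eps = 1e-6, max_steps = 50) {
      X <- as.matrix(X); n <- nrow(X)
      a <- function(e) phi(rank(e) / (n + 1))   # scores at the ranks
      D <- function(e) sum(a(e) * e)            # dispersion of the residuals
      Q1 <- qr.Q(qr(X))                         # orthonormal basis for col(X)
      e <- residuals(lm(y ~ X - 1))             # initial (LS) residuals
      for (k in seq_len(max_steps)) {
        dir  <- drop(tau_hat * (Q1 %*% crossprod(Q1, a(e))))  # tau * H * a(R(e))
        step <- 1
        repeat {                                # halve the step until dispersion drops
          e_new <- e - step * dir
          if (D(e_new) < D(e) || step < 1e-8) break
          step <- step / 2
        }
        done <- (D(e) - D(e_new)) / D(e) < eps  # relative drop in dispersion
        e <- e_new
        if (done) break
      }
      yhat <- y - e                             # fitted values at convergence
      list(beta = qr.solve(X, yhat), residuals = e, dispersion = D(e))
    }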
The QR-decomposition can readily be used to form a reduced model design matrix for the reduction in dispersion and the aligned rank tests of general linear hypotheses of the form $H_0: M\beta = 0$. Let $U$ be a $p \times (p - q)$ matrix whose columns consist of an orthonormal basis for the null space of $M$. Then, the matrix $X_\omega = XU$ is a design matrix for the reduced model, which is easily computed by LAPACK routines.
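In R, the same construction takes a couple of lines (a sketch; the QR-decomposition of $M'$ yields an orthonormal basis whose trailing columns span the null space of $M$):

    # Reduced model design matrix X_omega = X U, with col(U) = null space of M
    reduced_design <- function(X, M) {
      q <- nrow(M); p <- ncol(M)
      U <- qr.Q(qr(t(M)), complete = TRUE)[, (q + 1):p, drop = FALSE]
      X %*% U
    }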
Rank-Based Estimation for Other Models
Nonlinear Models
The algorithms discussed in the previous section can also be used for nonlinear models. In this case the model is of the form $Y = \eta(\theta) + e$, where $\eta(\theta)$ is a vector with components $g(x_i; \theta)$ for some specified function $g$. The rank-based estimate is $\hat\theta_\varphi = \operatorname{argmin}_\theta \|Y - \eta(\theta)\|_\varphi$. Since the model is nonlinear, this is not necessarily a convex function of $\theta$, but the usual Gauss-Newton type of algorithm can be used to obtain the rank-based fit. Recall that this is an iterated algorithm which uses the Taylor series expansion of the function $\eta(\theta)$, evaluated at the current estimate, to obtain the estimate at the next iteration. Thus, each of the iterations consists of the rank-based fit of a linear model. As such, the fits can be computed using the above algorithms; see Abebe and McKean (2007) for details.
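Schematically, one such iteration might look as follows in R, assuming a user-supplied model function eta, its Jacobian jac, and the linear rank fit rank_newton sketched earlier; this is our illustration, not the authors' implementation:

    # One rank-based Gauss-Newton step: linearize eta at theta and fit the
    # working linear model (y - eta(theta)) on the Jacobian by rank methods.
    gauss_newton_step <- function(theta, y, eta, jac, tau_hat) {
      r <- y - eta(theta)                 # current residuals
      J <- jac(theta)                     # n x length(theta) Jacobian of eta
      delta <- rank_newton(J, r, tau_hat)$beta
      theta + delta                       # updated estimate; iterate to convergence
    }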
Random Effect Models
Here, we briefly look at a mixed model which includes a random effect. Consider an experiment performed over $m$ random blocks, where block $k$ has $n_k$ observations. Within block $k$, let $Y_k$, $X_k$, and $e_k$ denote respectively the $n_k \times 1$ vector of responses, the $n_k \times p$ design matrix, and the $n_k \times 1$ vector of errors. Let $1_{n_k}$ denote an $n_k \times 1$ vector of ones. Then, the model for $Y_k$ is

$$Y_k = \alpha 1_{n_k} + X_k\beta + b_k 1_{n_k} + e_k, \qquad k = 1, \dots, m, \qquad (10)$$

where $b_k$ is the random effect. For a matrix formulation of the model, let $n = \sum_{k=1}^{m} n_k$ denote the total sample size. Let $Y = (Y_1', \dots, Y_m')'$, $X = (X_1', \dots, X_m')'$, and $e = (e_1', \dots, e_m')'$. We can now write the mixed model as

$$Y = \alpha 1_n + X\beta + Zb + e, \qquad (11)$$

where $Z$ is the $n \times m$ incidence matrix denoting block membership.
For this article, we are interested in the computation of rank-based estimates of the vector of fixed effects, $\beta$. We present two estimators. Note that for the estimation of the fixed effects, the geometry is the same as that for the linear model in (1). Hence, our first estimator is the rank-based estimator given in (3), $\hat\beta_\varphi$. Its computation remains the same as discussed above. For this model, asymptotic theory is presented in Kloke et al. (2008). Correct standardization of the associated inference, of course, involves estimation of variance components, which are also discussed in Kloke et al.

For the second estimator, consider the case where the fixed effects model is complete within each block. For example, this occurs in a multi-center clinical design where the block is synonymous with center. The rank-based estimator $\hat\beta_\varphi$ can be used. However, the following estimator is invariant to the random effects. For each center $k = 1, \dots, m$, consider Jaeckel's (1972) dispersion function given by
$$D_k(\beta) = \|Y_k - X_k\beta\|_{\varphi,k}, \qquad (12)$$

where $\|\cdot\|_{\varphi,k}$ is the pseudo-norm in equation (2), but here over the space $R^{n_k}$. That is,

$$\|v\|_{\varphi,k} = \sum_{i=1}^{n_k} a(R(v_i))\, v_i, \qquad (13)$$

where $R(v_i)$ denotes the rank of $v_i$ among $v_1, \dots, v_{n_k}$. We use the same scores for each center. Since the rankings are within center, it is easy to see from (13) that $\|v + \theta 1_{n_k}\|_{\varphi,k} = \|v\|_{\varphi,k}$; hence, $D_k(\beta)$ is invariant to the random effect $b_k$. Our objective function is the dispersion function given by

$$D_{MR}(\beta) = \sum_{k=1}^{m} D_k(\beta), \qquad (14)$$

where the subscript $MR$ stands for multiple rankings (i.e. a ranking for each center). Thus, $D_{MR}(\beta)$ is invariant to the vector of random effects $b$. Since the functions $D_k(\beta)$ are convex, their sum is also. An estimator of the fixed effects is given by the value that minimizes $D_{MR}(\beta)$. That is,

$$\hat\beta_{MR} = \operatorname{argmin}_\beta D_{MR}(\beta). \qquad (15)$$
This estimator was proposed by Rashid et al. (2008). Asymptotic theory follows if the number of centers is held fixed and $n_k \to \infty$ for each $k$. Since $D_{MR}(\beta)$ is convex, there is software available to compute $\hat\beta_{MR}$. The function $D_{MR}(\beta)$ can be minimized using the Nelder-Mead algorithm (Olsson and Nelson, 1975) or a quasi-Newton algorithm. Neither algorithm requires second derivatives. In SAS, we can use the subroutines NLPNMS (for the Nelder-Mead algorithm) and NLPQN (for the quasi-Newton algorithm) of PROC NLP to minimize the dispersion function. In the statistical software R, the non-linear minimization function nlm can be used. Also, based on the theory of the estimator, a Newton step algorithm can be written similar to the algorithm discussed above for R-estimates; see Kloke and McKean (2004).
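For illustration, a minimal R sketch that evaluates $D_{MR}$ and minimizes it with the built-in optimizer optim (Nelder-Mead); block is a factor giving center membership, and all function names are ours:

    # Multiple rankings dispersion: rank within each center, then sum
    D_MR <- function(beta, X, y, block,
                     phi = function(u) sqrt(12) * (u - 0.5)) {
      e <- drop(y - as.matrix(X) %*% beta)
      sum(tapply(e, block, function(ek) {
        nk <- length(ek)
        sum(phi(rank(ek) / (nk + 1)) * ek)   # Jaeckel dispersion within a center
      }))
    }

    # Minimize D_MR starting from, say, a least squares fit
    fit_MR <- function(X, y, block) {
      beta0 <- coef(lm(y ~ as.matrix(X) - 1))
      optim(beta0, D_MR, X = X, y = y, block = block,
            method = "Nelder-Mead")$par
    }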
Examples
Quadratic Regression Data
The purpose of this example is to illustrate the computational approaches discussed above. More
specifically, we compute the Wilcoxon estimate for a simulated data set using the R code
discussed by Terpstra and McKean (2005), MINITAB’s RREGRESS command, and RGLM. In
addition, we compute the least squares estimate for the sake of comparison.
The model for the data is a polynomial model of degree two. That is,

$$y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + e_i, \qquad (16)$$

where the distribution of $e_i$ is a mixture of a standard normal distribution (75%) and a normal distribution with mean zero and standard deviation 26 (25%). This type of distribution is commonly referred to as a contaminated normal distribution in the robust and nonparametric literature. Thus, the tails of this distribution are heavier than those of the standard normal distribution. The $n = 20$ generated observations for this example are given in the Supplementary Information section.
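A sketch of how such data can be generated in R; the true values of $\beta_0$ and $\beta_1$ are not stated in the text, so the values below (and the seed) are placeholders, with the design taken from Table 3:

    set.seed(1)                          # placeholder seed
    x <- seq(0.5, 10, by = 0.5)          # the design in Table 3
    n <- length(x)
    contam <- rbinom(n, 1, 0.25)         # 25% contamination indicator
    e <- rnorm(n, sd = ifelse(contam == 1, 26, 1))
    y <- 0 + 1 * x + 0.15 * x^2 + e      # beta0, beta1 are placeholders; beta2 = 0.15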
The estimates and corresponding standard errors are given in Table 1.
Table 1: Wilcoxon and Least Squares Estimates for the Quadratic Regression Data

    Estimate        $\hat\beta_0$  SE($\hat\beta_0$)  $\hat\beta_1$  SE($\hat\beta_1$)  $\hat\beta_2$  SE($\hat\beta_2$)
    Wilcoxon        -0.0870         1.2516            0.9420         0.5482              0.1561        0.0514
    Least Squares   -4.6357        10.4246            4.6754         4.5860             -0.3275        0.4304
The Wilcoxon estimates that appear in Table 1 were obtained using the R code by Terpstra and
McKean (2005). The estimates produced by MINITAB’s RREGRESS command and RGLM are the
same (to three decimal places) as those given in Table 1 and are therefore not presented. We
note that the RGLM estimates are the same to four decimal places. Thus, the three packages
used to compute the Wilcoxon estimate are in agreement.
On the other hand, the least squares estimate is quite different from the Wilcoxon estimate. Most notable is the estimate pertaining to the coefficient on the quadratic term (i.e. $\beta_2$). The inference based on the Wilcoxon estimate indicates that this coefficient is significantly different from zero (as it should be) and is very close to the true value of the parameter (i.e. 0.15). In contrast, the least squares estimate is not significantly different from zero and is, in fact, negative. The fits and residual plots given in the Supplementary Information section indicate that the 19th observation is largely responsible for the poor fit of the least squares estimate. As previously mentioned, the reason that this does not happen for the Wilcoxon estimate is that the influence function for the Wilcoxon estimate is bounded in $Y$.
Lastly, we also tested $H_0: \beta_2 = 0$ using the Wilcoxon version of the reduction in dispersion test $F_\varphi$ given above. Again, we used the R code of Terpstra and McKean (2005), which produced a test statistic value of 10.9369 and a p-value of 0.0042. We note that the (standard normal) p-value for the Wald test, based on the information given in Table 1, is 0.0024. Hence, contrary to the least squares procedures, the reduction in dispersion test and the Wald test are not equivalent. Nevertheless, both tests are in agreement as to the significance of the result.
Stars Data
In this example we analyze the well-known stars data, which appears in many textbooks and
journal articles pertaining to robust regression techniques. Briefly, the data set consists of two
measurements on 47 stars. The first measurement is the independent variable and represents
the logarithm of the temperature of the star while the second measurement is the response
variable which represents the logarithm of the light intensity of the star. The actual data, along
with a more detailed description, is given in Rousseeuw and Leroy (1987).
The stars data set is commonly used to illustrate the effects of bad leverage points on different types of estimates. Here, a bad leverage point denotes a point where the value of $x$ is outlying with respect to the other design points and the value of $y$ is inconsistent with the fit of the majority of the data. The stars data contains at least four such points. In fact, all four of these values relate to giant stars.
In what follows we will compare the Wilcoxon (WIL) estimate to the HBR weighted Wilcoxon
estimate in (7) and (8). All of the analyses were done using the R code of Terpstra and McKean
(2005). To begin, the estimates and corresponding standard errors are given in Table 2.
Table 2: Wilcoxon and HBR Estimates for the Stars Data

    Estimate   $\hat\beta_0$  SE($\hat\beta_0$)  $\hat\beta_1$  SE($\hat\beta_1$)
    Wilcoxon    7.2029         1.3370            -0.4766        0.3083
    HBR        -3.4692         1.6824             1.9167        0.3896
To summarize, the Wilcoxon and HBR estimates are clearly different. Most notably, the sign on
the slope parameter for the Wilcoxon estimate is negative while the sign of the slope parameter
for the HBR estimate is positive. As the top plot in Figure 2 of the Supplementary Information
section indicates, the HBR fit is undoubtedly the better fit for this data set. The Wilcoxon estimate
is heavily influenced by the four giant stars. Of course, this illustrates the well-known fact that the Wilcoxon estimate is not robust against outliers in the design, as its influence function is unbounded in $x$.
It is clear from this example that the Wilcoxon estimate and a corresponding WW-estimate can be quite different. Thus, it is desirable to have a diagnostic that compares the Wilcoxon estimate ($b_{ij} \equiv 1$ for all $i, j$) to a WW-estimate ($b_{ij} \ne 1$ for some $i, j$). This can be accomplished using the following diagnostic:

$$TDBETAS = \left(\hat b_{WIL} - \hat b_{WW}\right)'\, \widehat{\mathrm{Cov}}\!\left(\hat b_{WIL}\right)^{-1} \left(\hat b_{WIL} - \hat b_{WW}\right). \qquad (17)$$

Here, $\hat b_{WIL}$ and $\hat b_{WW}$ denote the parameter estimates (including the intercept) for the Wilcoxon and WW-estimate, respectively. The variance-covariance matrix in (17) is an estimate of the variance-covariance matrix given in (4). The two fits are declared different if the diagnostic exceeds the benchmark $4(p+1)^2/n$. See, for example, McKean et al. (1996a, 1996b). As the bottom plot in Figure 2 of the Supplementary Information section indicates, the Wilcoxon and HBR estimates are clearly different in this regard.
In such instances, it is often informative (especially in multiple linear regression problems) to determine which fits are different. A diagnostic which compares the individual fits is given by

$$CFITS_i = \frac{\hat Y_{WIL,i} - \hat Y_{WW,i}}{\widehat{SE}\!\left(\hat Y_{WIL,i}\right)}, \qquad (18)$$

where the denominator of (18) corresponds to the estimated standard error of $\hat Y_{WIL,i}$. Here, $\hat Y_{WIL,i}$ and $\hat Y_{WW,i}$ denote the Wilcoxon and WW fits for the $i$th case. Hettmansperger and McKean (1998) suggest using $2\sqrt{(p+1)/n}$ as a benchmark for declaring two individual fits different. Again, referring to the bottom plot in Figure 2 of the Supplementary Information section, we see that the Wilcoxon and HBR fits are vastly different for the stars data set. Lastly, we note that the intent of these diagnostics is to determine differences in fits as opposed to determining which fit is better. The question as to which fit is better should be answered via a standard residual analysis based on the studentized residuals.
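A rough R sketch of both diagnostics, assuming the two coefficient vectors, the fitted values, and an estimated covariance matrix for the Wilcoxon estimate are available; all argument names are ours, and the benchmarks follow the reconstructions in (17) and (18):

    # TDBETAS (17): overall difference between the Wilcoxon and WW fits
    tdbetas <- function(b_wil, b_ww, cov_wil, n) {
      d <- b_wil - b_ww
      td <- drop(t(d) %*% solve(cov_wil) %*% d)
      list(td = td, benchmark = 4 * length(d)^2 / n)   # length(d) = p + 1
    }

    # CFITS (18): casewise difference between the fitted values
    cfits <- function(yhat_wil, yhat_ww, se_yhat_wil, p) {
      n <- length(yhat_wil)
      list(cfits = (yhat_wil - yhat_ww) / se_yhat_wil,
           benchmark = 2 * sqrt((p + 1) / n))
    }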
Conclusion
As discussed in this review, rank-based analyses for linear and nonlinear models can range from highly efficient to highly robust, depending on the score and/or weight functions employed. With the exception of MINITAB’s RREGRESS command, this valuable class of estimators has yet to be implemented in any mainstream statistical software package. Nevertheless, Wilcoxon estimates, and more generally weighted Wilcoxon estimates, can be easily implemented in any software package that has the ability to compute an $L_1$ regression estimate. Most notable is the collection of R functions discussed by Terpstra and McKean (2005). However, other statistical software packages can be used to obtain these analyses as well. For example, in S-PLUS, $L_1$ regression estimates are computed via the l1fit function. In SAS, $L_1$ regression estimates can be computed using the IML procedure and the LAV subroutine. Furthermore, the HBR weights defined in (7) and (8) can be computed using calls to MAD, MCD, and LTS routines. On the other hand, rank-based estimates that utilize a nonlinear score function can be computed using the Newton algorithm discussed in this review. As previously mentioned, the web-based RGLM program and MINITAB’s RREGRESS command implement this algorithm. Furthermore, both programs have the capability of producing a complete inference, including residual analyses.
Notes
¹ The R code for the WW-estimates can be found on the web at: http://www.jstatsoft.org/v14/i07.
² RGLM can be accessed via the web at: http://www.stat.wmich.edu/slab/RGLM/.
³ LAPACK can be found on the web at: http://www.netlib.org/lapack/.
References
Abebe, A. & McKean, J. W. (2007). Highly efficient nonlinear regression based on the Wilcoxon
norm. In Umbach, D. E. (Ed.), A Festschrift in Honor of Mir Masoom Ali, (pp. 340–357).
Chang, W. H., McKean, J. W., Naranjo, J. D., & Sheather, S. J. (1999). High-breakdown rank
regression. Journal of the American Statistical Association, 94(445), 205–219.
Hettmansperger, T. P. (1984). Statistical inference based on ranks. Wiley Series in Probability
and Mathematical Statistics: Probability and Mathematical Statistics. New York: John Wiley &
Sons Inc.
Hettmansperger, T. P. & McKean, J. W. (1998). Robust nonparametric statistical methods,
volume 5 of Kendall’s Library of Statistics. London: Edward Arnold.
Jaeckel, L. A. (1972). Estimating regression coefficients by minimizing the dispersion of the
residuals. Annals of Mathematical Statistics, 43, 1449–1458.
Kapenga, J. A., McKean, J. W., & Vidmar, T. J. (1988). RGLM: users manual. American
Statistical Association short course on robust statistical procedures for the analysis of linear and
nonlinear models, New Orleans.
Kloke, J. D. & McKean, J. W. (2004). Rank-based analyses of one-way repeated measures
models. In ASA Proceedings of the Joint Statistical Meetings, (pp. 1980–1985). American
Statistical Association.
Kloke, J. D., McKean, J. W., & Rashid, M. (2008). R-estimates of the fixed effects in a mixed
model. Manuscript submitted for publication.
Koenker, R. (2005). Quantile regression, volume 38 of Econometric Society Monographs.
Cambridge: Cambridge University Press.
McKean, J. W. & Hettmansperger, T. P. (1978). A robust analysis of the general linear model
based on one step R-estimates. Biometrika, 65(3), 571–579.
McKean, J. W., Naranjo, J. D., & Sheather, S. J. (1996a). Diagnostics to detect differences in
robust fits of linear models. Computational Statistics, 11(3), 223–243.
McKean, J. W., Naranjo, J. D., & Sheather, S. J. (1996b). An efficient and high breakdown
procedure for model criticism. Communications in Statistics: Theory and Methods, 25, 2575–
2595.
McKean, J. W., Sheather, S. J., & Hettmansperger, T. P. (1993). The use and interpretation of
residuals based on robust estimation. Journal of the American Statistical Association, 88(424),
1254–1263.
Naranjo, J. D. & Hettmansperger, T. P. (1994). Bounded influence rank regression. Journal of the
Royal Statistical Society. Series B. Methodological, 56(1), 209–220.
Olsson, D. M. & Nelson, L. S. (1975). The Nelder-Mead simplex procedure for function
minimization. Technometrics, 17, 45–51.
Osborne, M. R. (1985). Finite algorithms in optimization and data analysis. Wiley Series in
Probability and Mathematical Statistics: Applied Probability and Statistics. Chichester: John Wiley
& Sons Ltd.
Rashid, M. M., McKean, J. W., & Kloke, J. D. (2008). R-estimates and associated inferences for
mixed models with covariates in a multi-center clinical trial. Manuscript submitted for publication.
Rousseeuw, P. J. & Leroy, A. M. (1987). Robust regression and outlier detection. Wiley Series in
Probability and Mathematical Statistics: Applied Probability and Statistics. New York: John Wiley
& Sons Inc.
Sievers, G. L. (1983). A weighted dispersion function for estimation in linear models.
Communications in Statistics. A. Theory and Methods, 12(10), 1161–1179.
Stewart, G. W. (1973). Introduction to matrix computations. New York: Academic Press.
Computer Science and Applied Mathematics.
Terpstra, J. T. & McKean, J. W. (2005). Rank-based analyses of linear models using R. Journal
of Statistical Software, 14(7), 1–26.
Terpstra, J. T., McKean, J. W., & Anderson, K. (2003). Studentized autoregressive time series
residuals. Computational Statistics, 18(1), 123–141.
Terpstra, J. T., McKean, J. W., & Naranjo, J. D. (2000). Highly efficient weighted Wilcoxon
estimates for autoregression. Statistics, 35(1), 45–80.
Terpstra, J. T., McKean, J. W., & Naranjo, J. D. (2001a). GR-estimates for an autoregressive time
series. Statistics & Probability Letters, 51(2), 165–172.
Terpstra, J. T., McKean, J. W., & Naranjo, J. D. (2001b). Weighted Wilcoxon estimates for
autoregression. Australian & New Zealand Journal of Statistics, 43(4), 399–419.
Terpstra, J. T. & Rao, M. B. (2001). Generalized rank estimates for an autoregressive time series:
a U-statistic approach. Statistical Inference for Stochastic Processes, 4(2), 155–179.
Theil, H. (1950). A rank-invariant method of linear and polynomial regression analysis (parts 1-3).
Ned. Akad. Wetensch. Proc. Ser. A, 53, 386–392, 521–525, 1397–1412.
Supplementary Information
Table 3. Quadratic Regression Data Set

    x:   0.5      1.0      1.5      2.0      2.5
    y:  -0.0966  -0.1054   2.2856   4.0856   3.8817

    x:   3.0      3.5      4.0      4.5      5.0
    y:   2.2940   4.1968   6.1877   7.9462   7.3224

    x:   5.5      6.0      6.5      7.0      7.5
    y:   9.3391  11.1738  13.0702  12.8161  15.7984

    x:   8.0      8.5      9.0      9.5     10.0
    y:  19.8282  37.0221  21.5988 -42.5976  24.9298
Figure 1. Fits (row 1) and residual plots (row 2) corresponding to the least squares
(column 1) and Wilcoxon (column 2) estimates of the quadratic regression data.
[Figure 2: top panel, log light intensity versus log temperature with the two fits; bottom panel, CFITS for WIL and HBR by case, annotated TDBETA: 65.96, Benchmark: 0.34.]
Figure 2. These plots compare the Wilcoxon and HBR estimates for the stars data. The top
plot shows the Wilcoxon (dotted line) and HBR (solid line) fits along with the stars data.
The bottom plot compares the Wilcoxon and HBR fits for each point via the CFITS
diagnostic.