M243 PPSS Washington’s Birthday

1. Two inference problems.
   (a) Can we generalize our conclusions to a larger group of units?
   (b) Can we infer cause and effect in the relationship that we have found?

2. Population. A population is a well-defined collection of individuals (the possible cases).

3. Sample. A sample is a subset of the population.
   (a) We want our sample to be representative of our population.
   (b) A method of choosing a sample is biased if it tends to produce samples that are unrepresentative of the population in some way.
   (c) To ensure an unbiased method, we introduce randomness.
   (d) A simple random sample of size n is a sample produced by a method that ensures that each group of n individuals in the population is equally likely to be the sample.

4. Parameter. A parameter is a numerical characteristic of (a model of) the population. Parameters are (almost) always denoted by Greek letters.

5. Statistic. A statistic is a numerical characteristic of the sample. A statistic is computed from the data. Statistics are never denoted by Greek letters.

6. Error. Sampling error is the error in estimating the parameter (using statistics) that is due to the fact that we are using a sample rather than the whole population. (We will never know how large the sampling error is.)

Useful R

    > sample(x,5,replace=F)

Homework

1. Read Chapter 12, pp. 287–294.
2. Practice problems (due Thursday, February 25): 12.1, 3, 5, 7, 9
3. Problems to turn in (due Friday, February 26): 12.26

    > plot(Hand~Height)
    > l=lm(Hand~Height)
    > l

    Call:
    lm(formula = Hand ~ Height)

    Coefficients:
    (Intercept)       Height
        -4.7883       0.3756

    > summary(l)

    Call:
    lm(formula = Hand ~ Height)

    Residuals:
         Min       1Q   Median       3Q      Max
    -2.99817 -1.12483  0.00183  1.12407  3.25072

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) -4.78832    5.00777  -0.956    0.347
    Height       0.37555    0.07213   5.206 1.75e-05 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 1.534 on 27 degrees of freedom
    Multiple R-squared: 0.501,  Adjusted R-squared: 0.4825
    F-statistic: 27.11 on 1 and 27 DF,  p-value: 1.750e-05

    > cor(Hand,Height)
    [1] 0.7077955
    > anova(l)
    Analysis of Variance Table

    Response: Hand
              Df Sum Sq Mean Sq F value    Pr(>F)
    Height     1 63.779  63.779  27.105 1.750e-05 ***
    Residuals 27 63.531   2.353
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    > sd(residuals(l))
    [1] 1.506309
    > sd(Hand)
    [1] 2.132322
    > plot(residuals(l)~Height)

Three most important descriptions of the relationship:

1. We would predict an increase of 0.38 cm in hand span for each additional 1 inch of height.
2. Height “explains” 50% of the variation in hand span, at least for these individuals.
3. We would predict an increase of 0.7 standard deviations in hand span for each increase of 1 standard deviation in height.
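
The numbers behind the second and third descriptions can be recovered from the printed output alone. The following is a minimal sketch, not part of the original handout: it assumes only the values copied from the anova(l) and summary(l) output above, and the variable names (ss_height, ss_resid, and so on) are made up for illustration.

```r
# Values copied from the anova(l) and summary(l) output above.
ss_height <- 63.779   # sum of squares attributed to Height
ss_resid  <- 63.531   # residual sum of squares
df_resid  <- 27       # residual degrees of freedom (n - 2, so n = 29)

# Description 2: share of the variation in Hand "explained" by Height.
ss_total  <- ss_height + ss_resid
r_squared <- ss_height / ss_total

# Description 3: the correlation is the square root of R-squared,
# taken with the sign of the slope (positive here).
r <- sqrt(r_squared)

# Residual standard error divides by the residual df (27) ...
resid_se <- sqrt(ss_resid / df_resid)
# ... while sd(residuals(l)) divides by n - 1 = 28, so it is slightly smaller.
sd_resid <- sqrt(ss_resid / (df_resid + 1))

# Compare with "Multiple R-squared", cor(Hand,Height),
# "Residual standard error" and sd(residuals(l)) in the output above.
print(round(c(r_squared = r_squared, r = r,
              resid_se = resid_se, sd_resid = sd_resid), 4))
```

This also shows why the output reports two slightly different spreads for the residuals (1.534 and 1.506): both use the same residual sum of squares, and they differ only in the divisor.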