Download Notes - Wharton Statistics

Statistics 112 Notes 1

Reading: Review Chapter 2.

I. Basic Goal of Statistical Inference

[Diagram: Population and Sample of Data, linked by "Inference about population using statistical tools".]

Example: The data set birthweight.JMP contains a sample of the birthweights of 1236 babies born in the United States from the Child Health and Development Studies. From this sample, we would like to learn something about the population distribution of birthweights in the United States. This would help doctors judge when a baby has an abnormally low or high birthweight.

II. Properties of Random Variable

A random variable is a variable whose value is a numerical outcome of a random phenomenon.

Examples:
1. Toss two fair coins. The number of heads $Y$ in the two tosses is a random variable.
2. Observe the birthweight of a randomly chosen baby from the U.S. population. The baby's birthweight is a random variable.

Probability distribution of a random variable: The proportion of times the random variable takes on each of its possible values over many repetitions of the random phenomenon.

Example 1: For tossing two fair coins independently,
\[ P(Y = 0) = \frac{1}{4}, \quad P(Y = 1) = \frac{1}{2}, \quad P(Y = 2) = \frac{1}{4}. \]

Independence of random variables: Captures the idea that two random variables $X$ and $Y$ are unrelated, so that knowing the value of $X$ does not help to predict $Y$. The formal definition is that $X$ and $Y$ are independent if the chance that simultaneously $X = x$ and $Y = y$ can be found by multiplying the separate probabilities:
\[ P(X = x, Y = y) = P(X = x) \cdot P(Y = y) \quad \text{for every } x, y. \]

Check your understanding: For the population of people, do you think $X$ = height and $Y$ = weight are independent? For undergraduates, is it plausible that $X$ = age and $Y$ = gender are independent? If I flip two fair coins, a dime and a quarter, so that $P(HH) = P(HT) = P(TH) = P(TT) = \frac{1}{4}$, is it true or false that getting a head on the dime is independent of getting a head on the quarter?
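The two-coin example can also be worked by brute-force enumeration. The following is a minimal Python sketch, not part of the original notes: it lists the four equally likely (dime, quarter) outcomes, tabulates the distribution of $Y$, and compares joint probabilities with products of separate probabilities. The names `outcomes`, `prob`, `dist_Y` and `product_rule_holds` are illustrative choices, not anything defined in the course.

```python
from fractions import Fraction
from itertools import product

# Two fair coins: the four (dime, quarter) outcomes are equally likely.
outcomes = list(product("HT", repeat=2))
p_outcome = Fraction(1, len(outcomes))  # 1/4 for each outcome


def prob(event):
    """Probability of an event given as a predicate on (dime, quarter)."""
    return sum((p_outcome for o in outcomes if event(o)), Fraction(0))


# Probability distribution of Y = number of heads in the two tosses.
dist_Y = {y: prob(lambda o, y=y: o.count("H") == y) for y in (0, 1, 2)}

# Compare P(dime = x, quarter = y) with P(dime = x) * P(quarter = y)
# for every combination of x, y in {head, tail}.
product_rule_holds = all(
    prob(lambda o: o == (x, y))
    == prob(lambda o: o[0] == x) * prob(lambda o: o[1] == y)
    for x in "HT"
    for y in "HT"
)

for y, p in dist_Y.items():
    print(f"P(Y = {y}) = {p}")
print("Product rule holds for every (x, y):", product_rule_holds)
```

Using `Fraction` keeps the probabilities exact, so the comparison in `product_rule_holds` is an exact equality test rather than a floating-point approximation.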
Expected value (mean) of a random variable: The mean value of the random variable over many repetitions of the random phenomenon. The expected value of a random variable is the sum of its possible values weighted by their probabilities.

Example 1 continued: For tossing two fair coins independently,
\[ E(Y) = 0 \cdot \frac{1}{4} + 1 \cdot \frac{1}{2} + 2 \cdot \frac{1}{4} = 1, \]
so I expect 1 head when I flip two fair coins. I might actually get 0 heads or 2 heads, but 1 head is what is expected on average.

Variance and standard deviation: The standard deviation of a random variable $Y$ measures how far $Y$ typically is from its expectation $E(Y)$. Being too high is as bad as being too low: we care about the size of the errors, not their signs. So we look at the squared difference between $Y$ and $E(Y)$, namely $D = \{Y - E(Y)\}^2$, which is itself a random variable. The variance of $Y$ is the expected value of $D$, and the standard deviation is the square root of the variance:
\[ Var(Y) = E[\{Y - E(Y)\}^2] \quad \text{and} \quad SD(Y) = \sqrt{Var(Y)}. \]

Example 1 continued: Toss two fair coins independently, so that
\[ P(Y = 0) = \frac{1}{4}, \quad P(Y = 1) = \frac{1}{2}, \quad P(Y = 2) = \frac{1}{4}, \quad E(Y) = 1. \]
$D = \{Y - E(Y)\}^2$ takes the value $(0 - 1)^2 = 1$ with probability 1/4, the value $(1 - 1)^2 = 0$ with probability 1/2, and the value $(2 - 1)^2 = 1$ with probability 1/4. The variance of $Y$ is the expected value of $D$, namely
\[ Var(Y) = E(D) = 1 \cdot \frac{1}{4} + 0 \cdot \frac{1}{2} + 1 \cdot \frac{1}{4} = \frac{1}{2}. \]
So the standard deviation is
\[ SD(Y) = \sqrt{Var(Y)} = \sqrt{\frac{1}{2}} \approx 0.707. \]
So when I flip two fair coins, I expect one head, but often I get 0 or 2 heads instead, and the typical deviation from what I expect is 0.707 heads. This 0.707 reflects the fact that I get exactly what I expect, namely 1 head, half the time, but I get one more than I expect a quarter of the time and one less than I expect a quarter of the time.

Check your understanding: If a random variable has zero variance, how often does it differ from its expectation?

Consider the height $Y$ of a randomly chosen adult male in the U.S.
What is a reasonable number for $E(Y)$? Pick one: 4 feet, 5'9", 7 feet. What is a reasonable number for $SD(Y)$? Pick one: 1 inch, 4 inches, 3 feet.

III. Normal distribution

Continuous random variable: A continuous random variable can take values with any number of decimals, like 1.2361248912. Weight measured perfectly, with all the decimals and no rounding, is a continuous random variable. Because it can take so many different values, each individual value winds up having probability zero. If I ask you to guess someone's weight, not approximately (say, to the nearest millionth of a gram) but exactly, to all the decimals, there is no way you can guess correctly: each value with all the decimals has probability zero. But for an interval, say to the nearest kilogram, there is a nonzero chance that you can guess correctly. This idea is captured by the density function.

Practical Note: We will often model a random variable as continuous even if it can take only a finite number of values, as long as there are many such values. For example, it is reasonable to model a child's birthweight in ounces as a continuous random variable.

Density functions: A density function defines probability for a continuous random variable. It attaches zero probability to every single number, but positive probability to ranges (e.g., to the nearest kilogram). The probability that the random variable $Y$ takes values between 3.9 and 6.2 is the area under the density function between 3.9 and 6.2. The total area under the density function is 1.

Normal distribution: A random variable is said to have a Normal distribution if it has the Normal density, which is the familiar bell-shaped curve.

[Figure: bell-shaped Normal density curve]

The standard Normal distribution has expected value 0 and standard deviation 1.
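The statement that probability is area under the density can be checked numerically. The sketch below is an illustration rather than part of the course materials. It assumes the usual formula for the standard Normal density, $\frac{1}{\sqrt{2\pi}} e^{-z^2/2}$, which these notes do not write out, and approximates areas with a simple midpoint rule. The function names and the cut-off at $\pm 10$ (standing in for the whole real line) are choices made for this example.

```python
import math


def std_normal_density(z):
    """Standard Normal density (formula assumed; not given in the notes)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def area_under(density, a, b, steps=20_000):
    """Midpoint-rule approximation to the area under `density` on [a, b]."""
    width = (b - a) / steps
    return sum(density(a + (i + 0.5) * width) for i in range(steps)) * width


# Total area: +/-10 stands in for the whole real line (the tails beyond are negligible).
total_area = area_under(std_normal_density, -10.0, 10.0)

# Probability of an interval = area under the density over that interval.
prob_interval = area_under(std_normal_density, -1.0, 1.0)

# A single point is an interval of width zero, so its area (probability) is zero.
prob_point = area_under(std_normal_density, 0.5, 0.5)

print(f"total area     ~ {total_area:.6f}")
print(f"P(-1 <= Z <= 1) ~ {prob_interval:.4f}")
print(f"P(Z = 0.5)      = {prob_point}")
```

Any other density could be passed to `area_under` in the same way; the midpoint rule is used only because it is short, not because it is the most accurate method.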
The probability that a random variable with a standard Normal distribution takes on values between -1 and 1 is about 2/3, and the probability that it takes on values between -2 and 2 is about 0.95. (To be more precise, there is a 95% chance that a standard Normal random variable will be between -1.96 and 1.96.)

If $Z$ is a standard Normal random variable and $\mu$ and $\sigma > 0$ are two numbers, then $Y = \mu + \sigma Z$ has the Normal distribution with mean $\mu$ and standard deviation $\sigma$. The density function for $Y$ continues to be bell shaped.

For a Normal random variable $Y$ with mean $\mu$ and standard deviation $\sigma$, we can find the probability that $Y$ is between $a$ and $b$ using the standard Normal density:
\[ P(a \le Y \le b) = P(a \le \mu + \sigma Z \le b) = P\left( \frac{a - \mu}{\sigma} \le Z \le \frac{b - \mu}{\sigma} \right). \]

Check your understanding: A company that offers an expensive stereo component is considering offering a warranty on the component. Suppose the population of lifetimes of the components is a Normal distribution with a mean of 84 months and a standard deviation of 7 months. What is the probability that a randomly chosen stereo component will last between 86 and 90 months?

IV. Inference

The population is typically described by certain parameters about which we would like to make inferences based on a sample. Suppose a population is such that a randomly chosen member of the population has a Normal distribution with mean $\mu$ and standard deviation $\sigma$. The parameters $\mu$ and $\sigma$ describe the population.

We obtain a random sample $Y_1, \ldots, Y_n$ from the population. A random sample means that we draw individuals at random from the population, each with equal probability.

We would like to make inferences about $\mu$ and $\sigma$:
Point Estimates: best estimates of $\mu$ and $\sigma$.
95% Confidence Intervals for $\mu$: an interval that is likely to contain the true $\mu$; the interval will contain the true $\mu$ in 95% of random samples.

Point estimates:
\[ \hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} Y_i, \qquad \hat{\sigma} = \sqrt{ \frac{1}{n-1} \sum_{i=1}^{n} (Y_i - \hat{\mu})^2 } \]

Approximate 95% CI:
\[ \hat{\mu} \pm 2 \frac{\hat{\sigma}}{\sqrt{n}} \]

JMP computes these estimates automatically. Click Analyze, then Distribution. Put the Y variable in Y, Columns. For the birthweight data, the output is:

Distributions: Birthweight

[Figure: histogram of Birthweight, horizontal axis running from about 50 to 170]

Quantiles
100.0%   maximum    176.00
 99.5%              169.82
 97.5%              155.08
 90.0%              142.30
 75.0%   quartile   131.00
 50.0%   median     120.00
 25.0%   quartile   108.25
 10.0%               97.00
  2.5%               81.00
  0.5%               65.56
  0.0%   minimum     55.00

Moments
Mean             119.57686
Std Dev          18.236452
Std Err Mean     0.5187177
upper 95% Mean   120.59453
lower 95% Mean   118.5592
N                1236

So $\hat{\mu} = 119.58$, $\hat{\sigma} = 18.24$, and the 95% CI is (118.56, 120.59).

Check your understanding: Based on these estimates, what is the approximate probability that a baby will have a birthweight of less than 100 ounces?

Preview of the rest of the course: While understanding the distribution of birthweights is of some interest, a more interesting question is how birthweights vary with certain characteristics, such as whether the mother smokes. This course will focus on making inferences about the mean of a response variable $Y$ (e.g., birthweight) for the subpopulation of individuals with covariates $X_1, \ldots, X_p$ (e.g., whether the mother smokes, gestation length) and on making inferences about how the mean of $Y$ changes as $X_1, \ldots, X_p$ change.
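For readers without JMP, the point estimates and the approximate 95% interval can be reproduced in a few lines. The following is a rough Python sketch under stated assumptions: `point_estimates` and `approx_ci95` are hypothetical helper names, the interval uses the notes' "plus or minus 2 standard errors" approximation, and `toy_sample` is made-up data included only to exercise the formulas (it is not birthweight data). The summary values plugged in are the ones shown in the JMP Moments table.

```python
import math


def point_estimates(sample):
    """Return (mu_hat, sigma_hat): sample mean and sample SD with divisor n - 1."""
    n = len(sample)
    mu_hat = sum(sample) / n
    sigma_hat = math.sqrt(sum((y - mu_hat) ** 2 for y in sample) / (n - 1))
    return mu_hat, sigma_hat


def approx_ci95(mu_hat, sigma_hat, n):
    """Approximate 95% CI for the mean: mu_hat +/- 2 * sigma_hat / sqrt(n)."""
    half_width = 2.0 * sigma_hat / math.sqrt(n)
    return mu_hat - half_width, mu_hat + half_width


# Summary values copied from the JMP Moments table for the birthweight data.
mu_hat, sigma_hat, n = 119.57686, 18.236452, 1236
std_err = sigma_hat / math.sqrt(n)  # compare with JMP's "Std Err Mean"
ci_low, ci_high = approx_ci95(mu_hat, sigma_hat, n)
print(f"standard error ~ {std_err:.4f}")
print(f"approximate 95% CI ~ ({ci_low:.2f}, {ci_high:.2f})")

# Made-up toy sample (NOT birthweight data), just to exercise point_estimates.
toy_sample = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_mu, toy_sigma = point_estimates(toy_sample)
print(f"toy sample: mu_hat = {toy_mu}, sigma_hat = {toy_sigma:.4f}")
```

The interval computed this way is slightly wider than the one JMP reports, presumably because JMP uses a multiplier a little below 2 rather than the rounded value used in the approximation.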
