Hypothesis Testing
CJ 526

Probability Review
P = number of times an event can occur / total number of possible events

Bounding rule of probability
Minimum value is 0
Maximum value is 1

Probability
The probability of an event NOT occurring is the complement of the event
Probability of an illness = .2
Probability that the illness will not occur = 1 - probability of the event, or 1 - .2 = .8
The odds of an event are the ratio of the probability that it occurs to the probability that it does not
Odds of illness = .2/.8, or 1 to 4; the odds of not getting ill are 4 to 1

Addition rule of p
What is the probability of either one event OR another occurring?
If the events are mutually exclusive, simply add the probabilities (Venn diagram)
What is the p of having a boy or a girl?
P = .5 + .5 = 1

Multiplication rule
What is the probability of A and B occurring?
If the events are independent of one another, their probabilities can be multiplied
What is the p of having both schizophrenia and epilepsy?

Probability distributions
A probability distribution is theoretical: we expect it based on the laws of probability
That is different from an empirical distribution, which is one we actually observe

Normal probability distribution
Probability distribution for continuous events
Probability of an event occurring is higher in the center of the curve
Declines for events at each of the two ends (tails) of the distribution
Neither of the tails touches the x axis (they extend to infinity)

Normal distribution
Theoretical probability distribution
Unimodal, symmetrical, bell-shaped curve
Symmetrical: draw a line down the center, and the left and right halves would be mirror images
Can be expressed as a mathematical formula (p. 220)
Family of normal distributions
Dependent on mean and SD (Illustrate)
More spread out: larger SD
Narrower: smaller SD

Variations
Skewness: skewed to the right or the left, as opposed to symmetrical
Kurtosis: degree of "peakedness" or "flatness"

Area under the normal curve
Remember that for any continuous distribution there is a mean and an SD
Example: mean = 10 and SD = 2
If the distribution is not skewed, the majority (about 2/3) of scores will be from 8 to 12
8 and 12 are each one SD from the mean
See p. 225
If a distribution is normal, we can express distance from the mean in terms of z scores
A z score = (a score - the mean) / SD
If we convert all our raw scores to z scores, we get what is called the standard normal distribution
It STANDARDIZES our scores

Standard normal distribution
Distributions of different measures can then be compared against one another
The standard normal distribution has a mean of 0 and an SD of 1
If you use the formula for z scores, all the scores can be converted
If a distribution has a mean of 10, the z score for 10 will be (10 - 10)/SD = 0
If a distribution has a mean of 10 and an SD of 2, the z score for 12 would be z = (12 - 10)/2 = 1
The z score for 8 would be z = (8 - 10)/2 = -1
The negative and positive signs have meaning: a + sign means a score is above the mean
A minus sign means the score is below the mean
The z score also tells us about magnitude: the larger the absolute value of the z score, the further the score is from the mean, and the smaller the absolute value, the closer to the mean
We can also make statements about where an individual score is in relation to the rest of the distribution
.3413 (or 34.13%) of scores will fall between the mean and +1 SD
.3413 (or 34.13%) of scores will fall between the mean and -1 SD
.6826 (or 68.26%) of scores will be between -1 and +1 SD on a normal distribution
Thus, when we see a mean and SD, if the measure is normally distributed, about 2/3 of the scores will fall between the mean - the SD and the mean + the SD
50% of the scores will be above the mean
50% of the scores will be below the mean
.1359 (13.59%) will fall between -1 and -2 SD, and the same proportion between +1 and +2 SD
.0215 (2.15%) will fall between -2 and -3 SD, and the same proportion between +2 and +3 SD
See p. 223, illustrate

Standardized normal distribution
Tells us about any normally distributed measure
Example of IQ scores, mean = 100, SD = 15
About 2/3 between 85 and 115
Fewer (about 13.5%) between 115 and 130, and the same between 70 and 85
About 2% between 130 and 145, and the same between 55 and 70
SAT scores, mean = 500, SD = 100
Illustrate
Use of z table, p. 724
Reading the table

Utility of the normal distribution
Use of the normal distribution underlies many statistical tests
Many variables are not normally distributed
However, the normal distribution is useful anyway because of the apparent validity of the Central Limit Theorem

Sampling distributions
To understand the Central Limit Theorem, we need to understand sampling distributions
Say we draw many samples and calculate a statistic, such as a mean, for each sample
When we draw the samples, the mean will not be the same each time; there will be variation
If you were to obtain some measure on several samples of patients with the same disorder, there would be variation in the mean of the measure for each sample.
There is an actual mean for the entire population of patients that have the disorder, but it is not known, because we don't have measures for the whole population
However, we could obtain means based on a large number of samples
Central limit theorem: if a very large (in theory, infinite) number of random samples of size n are drawn from a population, the sampling distribution of the sample means will approach a normal distribution as n increases (even if the measure is not itself normally distributed)

Number of subjects
With sample sizes greater than 100, the Central Limit Theorem can be used
If the measure is not terribly skewed, samples of around 50 could be used
With sample sizes of less than 50, the Central Limit Theorem probably should not be used

Application of the central limit theorem (ex)
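
The rules from the probability review (complement, odds, addition, multiplication) can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the last call uses a hypothetical pair of independent births, since the slides give no numbers for the schizophrenia and epilepsy question.

```python
def complement(p):
    """Probability that an event does NOT occur."""
    return 1 - p


def odds(p):
    """Odds of an event: P(occurs) / P(does not occur)."""
    return p / (1 - p)


def p_either(p_a, p_b):
    """Addition rule. Valid only if A and B are mutually exclusive."""
    return p_a + p_b


def p_both(p_a, p_b):
    """Multiplication rule. Valid only if A and B are independent."""
    return p_a * p_b


p_ill = 0.2
print(round(complement(p_ill), 2))  # 0.8
print(round(odds(p_ill), 2))        # 0.25, i.e. 1 to 4
print(p_either(0.5, 0.5))           # 1.0 (boy or girl)
# Hypothetical illustration: two independent births, both boys.
print(p_both(0.5, 0.5))             # 0.25
```

The two rule functions do not verify their preconditions; deciding whether events really are mutually exclusive or independent is up to the analyst.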
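
The z score formula and the areas quoted for the standard normal distribution can be reproduced with Python's standard library. The sketch below assumes Python 3.8 or later (for `statistics.NormalDist`); the helper names `z_score` and `area_between` are illustrative and do not come from the slides.

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Convert a raw score to a z score: (score - mean) / SD."""
    return (x - mean) / sd


STANDARD_NORMAL = NormalDist()  # mean 0, SD 1


def area_between(z_low, z_high):
    """Proportion of a normal distribution between two z scores."""
    return STANDARD_NORMAL.cdf(z_high) - STANDARD_NORMAL.cdf(z_low)


# Examples from the slides: mean = 10, SD = 2.
print(z_score(12, 10, 2))   # 1.0
print(z_score(8, 10, 2))    # -1.0
# IQ example: mean = 100, SD = 15.
print(z_score(115, 100, 15))  # 1.0

print(round(area_between(0, 1), 4))   # 0.3413
print(round(area_between(-1, 1), 4))  # 0.6827 (a z table gives .6826 by doubling .3413)
print(round(area_between(1, 2), 4))   # 0.1359
print(round(area_between(2, 3), 4))   # 0.0214 (a z table gives .0215 from rounded entries)
```

The small differences in the last decimal place arise because printed z tables round each entry to four places before subtracting.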
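
One way to see the Central Limit Theorem at work is a small simulation: draw many samples from a clearly skewed population and look at how the sample means behave. The sketch below is an illustration under stated assumptions, not the example from the lecture. The exponential population, the seeds, the sample size of 100, and the function name `sample_means` are all chosen here for demonstration. It also uses the standard result, not stated on the slides, that the SD of the sample means is approximately the population SD divided by the square root of n.

```python
import random
import statistics


def sample_means(population, n, num_samples, seed=0):
    """Draw num_samples random samples of size n (with replacement)
    and return the mean of each sample."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.choices(population, k=n))
        for _ in range(num_samples)
    ]


# Assumed population: 10,000 values from a right-skewed exponential distribution.
pop_rng = random.Random(42)
population = [pop_rng.expovariate(1.0) for _ in range(10_000)]

n = 100
means = sample_means(population, n=n, num_samples=2000)

pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)
standard_error = pop_sd / n ** 0.5  # expected SD of the sample means

# If the sample means are roughly normal, about 2/3 of them should fall
# within one standard error of the population mean.
within_one = sum(abs(m - pop_mean) <= standard_error for m in means) / len(means)

print("population mean:", round(pop_mean, 3))
print("mean of sample means:", round(statistics.mean(means), 3))
print("proportion within 1 SE:", round(within_one, 3))
```

Rerunning with a smaller n (for example 5) should show the distribution of sample means keeping more of the population's skew, which is consistent with the sample-size guidance given under "Number of subjects".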