The Role of Statistics

Sexual Discrimination Problem

A large company had to downsize and fire 10 employees. Of these 10 employees, 5 were women. However, only 1/3 of the company’s employees were women. This discrepancy has led the women who were fired to file a sexual discrimination lawsuit. Do they have a legitimate claim?

What are the two possibilities in this case?
They have a legitimate claim: the company fired a higher proportion of women on purpose.
They don’t have a legitimate claim: this could have occurred by random chance.

Which of the two possibilities can we actually assess? Not the first one: we cannot know what the boss was thinking. However, we can estimate the probability of getting a result as surprising as this by random chance.

Simulate the firing by using a population of beads to represent the population of the company (white = women, black = men). Draw 10 beads at random and count the number of women fired (# of white beads).

Collect class data and estimate the probability of having 5 or more women fired by random chance (company is telling the truth).

Does this give evidence of discrimination (the women were fired on purpose)? NO! Since it is somewhat likely to get 5 or more women by random chance alone, we do not have evidence that women were discriminated against.

Summary: Based on the makeup of the company, we would expect to have 3 or 4 women fired. However, firing 5 or more women could have occurred by random chance, so we should not decide the company is guilty. How many women fired would make you suspicious? In statistics, it is always possible that we make the wrong decision. More on this later…

STATISTICS is the science of collecting, analyzing, and drawing conclusions from data. Statistics is also the art of distilling meaning from data.
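
The bead-drawing simulation in the discrimination problem can also be run on a computer. The following is a minimal sketch, not the class procedure itself: the company size (300 employees, 100 of them women) is an assumption chosen only to match the stated 1/3 proportion, and the function names are hypothetical. It estimates the probability of 5 or more women among 10 randomly chosen employees and compares the estimate with the exact probability obtained by counting.

```python
import random
from math import comb


def simulate_firings(total=300, women=100, fired=10, threshold=5,
                     trials=10_000, seed=0):
    """Estimate P(at least `threshold` women among `fired`) by simulation.

    The staff sizes are assumed values, not figures from the problem.
    """
    rng = random.Random(seed)
    # "W" plays the role of a white bead, "M" a black bead.
    staff = ["W"] * women + ["M"] * (total - women)
    hits = 0
    for _ in range(trials):
        drawn = rng.sample(staff, fired)  # draw without replacement
        if drawn.count("W") >= threshold:
            hits += 1
    return hits / trials


def exact_probability(total=300, women=100, fired=10, threshold=5):
    """Exact probability by counting equally likely groups of `fired` people."""
    men = total - women
    favorable = sum(comb(women, k) * comb(men, fired - k)
                    for k in range(threshold, fired + 1))
    return favorable / comb(total, fired)


estimate = simulate_firings()
exact = exact_probability()
print(f"simulated: {estimate:.3f}, exact: {exact:.3f}")
```

With these assumed numbers, both values should come out at roughly one in five, which is consistent with the conclusion that firing 5 or more women is somewhat likely by random chance alone.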

The POPULATION OF INTEREST is the entire collection of individuals or objects about which information is desired. When you study an entire population, it is called a CENSUS. A SAMPLE is a subset of the population, selected for study in some prescribed manner.

DESCRIPTIVE statistics is the branch of statistics that studies methods for summarizing data. INFERENTIAL statistics is the branch of statistics which involves generalizing about a population based on information from a sample of that population. Statistical INFERENCE is the process of drawing these generalizations.

A VARIABLE is any characteristic whose value may change from one individual to another. Ex:
DATA results from making observations on one or more variables. It is important to remember that a set of information is not data unless it comes in a context.

A DISTRIBUTION shows the values a variable can take and how often it takes those values. Ex:

A UNIVARIATE data set consists of observations on a single variable. Ex:
A BIVARIATE data set consists of observations of two variables for each member of the sample. Ex:

A variable is CATEGORICAL (or qualitative) if the possible responses fall into categories. Ex:
A variable is NUMERICAL (or quantitative) if the possible responses are numerical in nature. Ex:

Quantitative variables usually include units, which tell how the variable was measured. For example, if you are told the weight of an animal is 12, you wouldn’t know very much until you were informed of the unit (e.g. tons or milligrams).

Observations of categorical data are usually recorded with words (e.g. Honda, brown), but can also be recorded with numbers. Area codes are an example. Living in the 626 area code isn’t necessarily better than living in the 310 area code, even though it is higher numerically.
In cases like these, the numbers are just labels for different categories.

Many variables can be used as a categorical variable or a quantitative variable. For example, scores on the STAR test are recorded numerically, but also placed into categories such as “proficient” and “basic”.

Numerical data is DISCRETE if the possible values are isolated points on the number line. Ex:
Numerical data is CONTINUOUS if the possible values form an entire interval on the number line. Ex:
In general, you MEASURE continuous variables and COUNT discrete variables.

For each of the following variables, determine if it is categorical or numerical. If it is numerical, determine if it is continuous or discrete:
length of a pen; type of pen; number of pens in a box
color of pants; number of pockets; length of inseam
subject of book; number of pages; area of a page