Chapter one: The Nature of Probability and Statistics

What is statistics?
Statistics is the science of conducting studies to collect, organize, summarize, analyze, and draw conclusions from data.
Always ask the 3 w’s: Who, what, and why?

1-1: Descriptive and inferential statistics
Variable: a characteristic or attribute that can assume different values.
Data: the values (measurements or observations) that the variables can assume.
Random variables: variables whose values are determined by chance.
Data set: a collection of data values.
Data value (datum): each value in the data set.

Two types of statistics
Descriptive statistics: consists of the collection, organization, summarization, and presentation of data.
Inferential statistics: consists of generalizing from samples to populations, performing estimations and hypothesis tests, determining relationships among variables, and making predictions. (Inferential statistics uses probability, the chance of something occurring.)

Sample vs population
Population: all subjects (human or otherwise) that are being studied.
Sample: a group of subjects selected from a population.

Hypothesis testing
An area of inferential statistics used in the decision-making process for evaluating claims about a population, based on information obtained from samples.

1-2: Variables and types of data
Two types of data: qualitative vs quantitative
Qualitative variables: can be placed into distinct categories according to some characteristic or attribute.
Quantitative variables: are numerical and can be ordered or ranked.
Two types of quantitative variables:
Discrete variables: assume values that can be counted.
Continuous variables: can assume an infinite number of values between any two specific values. They are obtained by measuring and often include fractions and decimals.
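The population/sample and descriptive/inferential distinctions above can be illustrated with a short Python sketch. The exam scores, the population size of 500, and the sample size of 30 are all made-up assumptions for illustration; only the standard library is used.

```python
import random
import statistics

# Hypothetical population: exam scores for all 500 students in a school.
rng = random.Random(42)  # fixed seed so the run is repeatable
population = [rng.randint(40, 100) for _ in range(500)]

# Sample: a group of subjects selected from the population.
sample = rng.sample(population, 30)

# Descriptive statistics: summarize the data that were actually collected.
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)

# Inferential statistics would use sample_mean to estimate the population
# mean. In practice the population value is usually unknown; it is computed
# here only so the two numbers can be compared.
population_mean = statistics.mean(population)

print(f"sample mean = {sample_mean:.1f}, sample sd = {sample_sd:.1f}")
print(f"population mean = {population_mean:.1f}")
```

The sample mean will generally be close to, but not equal to, the population mean; quantifying that gap is the job of inferential statistics.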
Boundaries: see handout.

Levels of measurement
Nominal, Ordinal, Interval, Ratio

Nominal (sounds like “names”)
Categorical/qualitative.
Consists of a set of categories that have different labels.
Often dichotomous (e.g., biological sex or yes-or-no questions).
Another example: country of residency.
ADVANCED: What kind of statistics use nominal scales? Pearson chi-square (both the independent and dependent variables are measured on a nominal scale, such as handedness and dyslexia).

Ordinal (sounds like “order”)
Categorical/qualitative.
A set of categories organized in an ordered sequence.
Ranks; Likert scale. (An order exists, but the quantitative differences are unknown, e.g., name your five closest friends.)
ADVANCED: Spearman correlation (Likert items).

Interval (numerical scale with a meaningful order)
Quantitative differences between numbers on the scale reflect equal differences in magnitude.
Limitation: no true zero. Examples: Celsius (zero is not the absence of temperature), calendar years (zero is not the absence of time), SAT (lowest score 200), IQ (lowest is 40).
ADVANCED: Pearson correlation (IQ and SAT scores), (temperature and SAT).

Ratio (interval + natural zero point)
Differences between two values can be expressed as a ratio (values can be multiplied or divided). With an interval scale you can add and subtract but cannot multiply or divide.
Examples: height or weight, number of times out of the country, number of items recalled on a memory test, reaction time.
ADVANCED: Pearson correlation.
***Note: many statisticians do not differentiate between ratio and interval.

1-3: Data collection and sampling techniques
Name and define the four basic sampling methods:
Random: subjects are selected by using chance (think of the rectangles).
Systematic: selecting every kth subject (e.g., students entering the classroom).
Stratified: dividing the population into groups (called strata) and sampling from each group (think of Freshman, Sophomore, Junior, Senior).
Cluster: selecting intact groups, such as the schools in one geographic area of a large school district.
Another popular technique: convenience sampling.
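The four basic sampling methods can be sketched in Python as follows. This is a minimal illustration, not a prescribed procedure: the 40-student population, the school names, the sample sizes, and k = 5 are all hypothetical values chosen to keep the example small.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 40 students, each with a class year and a school.
years = ["Freshman", "Sophomore", "Junior", "Senior"]
schools = ["North", "South", "East", "West", "Central"]
population = [
    {"id": i, "year": years[i % 4], "school": schools[i % 5]}
    for i in range(40)
]

# Random: every subject has the same chance of being selected.
random_sample = rng.sample(population, 8)

# Systematic: choose a random starting point, then take every kth subject.
k = 5
start = rng.randrange(k)
systematic_sample = population[start::k]

# Stratified: divide into strata (class years), then sample within each one.
stratified_sample = []
for year in years:
    stratum = [s for s in population if s["year"] == year]
    stratified_sample.extend(rng.sample(stratum, 2))

# Cluster: randomly choose whole groups (schools) and keep everyone in them.
chosen_schools = rng.sample(schools, 2)
cluster_sample = [s for s in population if s["school"] in chosen_schools]

print(len(random_sample), len(systematic_sample),
      len(stratified_sample), len(cluster_sample))
```

Note the contrast between the last two methods: a stratified sample takes some subjects from every group, while a cluster sample takes every subject from some groups.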
1-4: Observational and experimental studies
Observational study: the researcher merely observes what is happening or what has happened in the past and tries to draw conclusions based on these observations.
Experimental study: the researcher manipulates one of the variables and tries to determine how the manipulation influences other variables.
[Quasi-experimental study: uses intact groups.]
WATCH: Brown eyes BLUE eyes (https://youtu.be/KpRQ0-ZGNZk)

Variable review
Independent variable (also called the explanatory variable): the one being manipulated.
Dependent variable (also called the outcome variable): the resultant variable. It is studied to see whether it has changed significantly due to the manipulation of the independent variable.
Treatment group and control group.
Confounding variable: one that influences the dependent (outcome) variable but was not separated from the independent variable.

1-5: Uses and misuses of statistics
Recall the 3 w’s.
Suspect samples
Ambiguous averages
Changing the subject
Detached statistics
Implied connections
Misleading graphs
Faulty survey questions

Suspect samples
Too small a sample.
Bad selection of the sample (convenience sampling).

Ambiguous averages
Measures of central tendency are the mean, median, mode, and midrange. When someone says “average,” which one are they talking about?
Real estate example.

Changing the subject
Different values are used to represent the same data.
Using percentages vs actual numbers for wow factor.

Detached statistics
“Advil works 3 times faster.” “1/3 fewer calories.” “Low fat.” (Compared with what?)

Implied connections
“Eating fish may help to reduce your cholesterol.”
“Studies suggest that using our exercise machine will reduce your weight.”
“Taking calcium will lower blood pressure in some people.”

Misleading graphs

Faulty survey questions
“Do you feel that the school should build a new football stadium?” vs “Do you favor increasing school taxes for a new athletic field?”
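To see why “average” is ambiguous, the four measures of central tendency named under Ambiguous averages can be computed on the same data. The sketch below assumes a made-up list of home sale prices (in thousands of dollars) in the spirit of the real estate example; the values are hypothetical and were chosen so that one expensive house pulls the measures apart.

```python
import statistics

# Hypothetical home sale prices, in thousands of dollars (made-up values).
prices = [150, 160, 160, 175, 180, 195, 1200]

mean = statistics.mean(prices)               # sum of values / number of values
median = statistics.median(prices)           # middle value of the sorted data
mode = statistics.mode(prices)               # most frequent value
midrange = (min(prices) + max(prices)) / 2   # halfway between lowest and highest

print(f"mean     = {mean:.1f}")      # 317.1
print(f"median   = {median}")        # 175
print(f"mode     = {mode}")          # 160
print(f"midrange = {midrange}")      # 675.0
```

All four numbers could honestly be reported as the “average” price, yet they range from 160 to 675, so it matters which measure a claim is using.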