1 in 8 women (12.5%) get breast cancer, so P(breast cancer if female) = 0.125
1 in 800 men (0.125%) get breast cancer, so P(breast cancer if male) = 0.00125

Statistics: Unlocking the Power of Data Lock5

Two-Way Table

Create a two-way table, with 1000 each of males and females.

Gender    Breast Cancer          No Breast Cancer    Total
Female    0.125*1000 = 125       875                 1000
Male      0.00125*1000 = 1.25    998.75              1000
Total     126.25                 1873.75             2000

What’s the overall (unconditional) probability of breast cancer?

Conditional Probability

Gender    Breast Cancer    No Breast Cancer    Total
Female    125              875                 1000
Male      1.25             998.75              1000
Total     126.25           1873.75             2000

What’s P(breast cancer if female)?
What’s P(female if breast cancer)?

P(A if B) is NOT the same as P(B if A)!!!

Odds Ratio

The odds ratio (OR) is the ratio of the odds of an event in one group to the odds of the same event in another group.

Odds ratio for breast cancer comparing females to males:

\[
\text{OR} = \frac{\text{odds of getting breast cancer for females}}{\text{odds of getting breast cancer for males}}
= \frac{P(\text{breast cancer if female}) \,/\, \bigl(1 - P(\text{breast cancer if female})\bigr)}{P(\text{breast cancer if male}) \,/\, \bigl(1 - P(\text{breast cancer if male})\bigr)}
\]

\[
\text{OR} = \frac{(1/8) \,/\, (1 - 1/8)}{(1/800) \,/\, (1 - 1/800)} = \frac{1/7}{1/799} = \frac{799}{7} \approx 114.14
\]

Unit A Essential Synthesis

The Big Picture

[Diagram: a sample is drawn from the population by sampling; descriptive statistics summarize the sample; statistical inference reasons from the sample back to the population.]

Chapter 1: Data Collection

Was the sample randomly selected?
  Yes: Possible to generalize to the population
  No: Should not generalize to the population

Was the explanatory variable randomly assigned?
  Yes: Possible to make conclusions about causality
  No: Cannot make conclusions about causality

Chapter 2: Descriptive Statistics

The type of summary statistics and visualization methods depends on the type of variable(s) being analyzed (categorical or quantitative).

Variable(s): Categorical
  Visualization: bar chart, pie chart
  Summary statistics: frequency table, relative frequency table, proportion, odds

Variable(s): Quantitative
  Visualization: dotplot, histogram, boxplot
  Summary statistics: mean, median, max, min, standard deviation, z-score, range, IQR, five number summary

Variable(s): Categorical vs Categorical
  Visualization: side-by-side bar chart, segmented bar chart
  Summary statistics: two-way table, difference in proportions, odds ratio

Variable(s): Quantitative vs Categorical
  Visualization: overlaid histograms, parallel dotplots, side-by-side boxplots
  Summary statistics: statistics by group, difference in means

Variable(s): Quantitative vs Quantitative
  Visualization: scatterplot
  Summary statistics: correlation

Descriptive Statistics

Think of a topic or question you would like to use data to help you answer.
What would the cases be?
What would the variables be? (Limit to one or two variables)

How would you visualize and summarize the variable or relationship between variables?
a) bar chart/pie chart, proportions, frequency table/relative frequency table, odds
b) dotplot/histogram/boxplot, mean/median, sd/range/IQR, five number summary
c) side-by-side or segmented bar charts, difference in proportions, two-way table, odds ratio, conditionals
d) side-by-side boxplot, difference in means
e) scatterplot, correlation
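
Worked check of the breast cancer example

The questions on the Two-Way Table, Conditional Probability, and Odds Ratio slides can be checked numerically. The short Python sketch below is not part of the original slides. It assumes the same hypothetical 1000 females and 1000 males used in the two-way table, and the variable names and the `odds` helper are chosen here purely for illustration.

```python
# Conditional probabilities taken from the slides.
p_cancer_given_female = 1 / 8     # 0.125
p_cancer_given_male = 1 / 800     # 0.00125

# Assumption from the two-way table slide: 1000 people in each group.
n_female = 1000
n_male = 1000

# Fill in the "Breast Cancer" column of the two-way table.
cancer_female = p_cancer_given_female * n_female   # about 125
cancer_male = p_cancer_given_male * n_male         # about 1.25
cancer_total = cancer_female + cancer_male         # about 126.25
n_total = n_female + n_male                        # 2000

# Overall (unconditional) probability of breast cancer in this table.
p_cancer = cancer_total / n_total

# Reverse conditional: P(female if breast cancer).
p_female_given_cancer = cancer_female / cancer_total


def odds(p):
    """Convert a probability p into odds p / (1 - p)."""
    return p / (1 - p)


# Odds ratio comparing females to males.
odds_ratio = odds(p_cancer_given_female) / odds(p_cancer_given_male)

print(f"P(breast cancer)           = {p_cancer:.4f}")
print(f"P(breast cancer if female) = {p_cancer_given_female:.4f}")
print(f"P(female if breast cancer) = {p_female_given_cancer:.4f}")
print(f"Odds ratio (female vs male) = {odds_ratio:.2f}")
```

With equal group sizes, P(female if breast cancer) comes out near 0.99 while P(breast cancer if female) is 0.125, which illustrates why P(A if B) is not the same as P(B if A). Note that the overall probability depends on the assumed 50/50 split between females and males.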