Introduction, Chapter 1

Descriptive statistics: the collection, presentation and description of data in the form of ______, ______, and _________________ that provide meaningful information about the data.

Inferential statistics: deals with the _________ of data as well as drawing ________ and making generalizations based on data for a larger group of subjects.

Individual: ___________________________________________________________________________
Observation: _________________________________________________________________________
Random variables:

Variables: qualitative (categorical) versus quantitative (numerical)
  Categorical variable: individuals can be placed into one of several distinct categories.
    Nominal versus ordinal categorical variables
      Nominal: ________ not meaningful
      Ordinal: _________ may be meaningful
  Quantitative variables: take numerical values for which arithmetic operations such as adding and averaging make sense.

Population: ________________________________________________________________________
Sample: ___________________________________________________________________________

Numerical summaries for populations and samples: “parameters” vs. “statistics”
  Sample statistic: ____________________________________________________________________
  Population parameter: _______________________________________________________________

Descriptive Statistics, Chapters 2-4

Displaying distributions with graphs:
  i. Categorical data
     A. Bar charts: show the amount of data that belongs to each category as proportionally sized ___________________.
     B. Pie charts: show the amount of data that belongs to each category as a proportional slice of the circle.
     C. Pareto charts: bar charts with the bars arranged by magnitude, i.e. from ___________________, in order to highlight the categories with the highest frequencies.
  ii. Quantitative data
     Histograms
     Time plots
     Box plots
  iii. Describing distributions
     A. Center: associated with locating the “middle” of the data
        Mean:
        Median:
        Mode:
        How to find the median when the sample size n is odd/even
          Odd:
          Even:
     B. Spread
        Range: ______ - ______
        IQR: ______ - ______
        5-number summary: _____, ______, ______, ______, ______
        Boxplot: a graphical display of the 5-number summary is also called ________.
     C. Shape (modality & skewness)
        A histogram with one mode is called _________, with two modes _________, and with three or more modes ___________.
        Skewness: right-skewed, symmetric, left-skewed
     D. Outliers: 1.5 * (Q3 - Q1)
     E. Variance:
     F. Standard deviation:
     G. Measures of variability: the most common measures of variability are the ________, the _____________, the __________, and the ______________.

Choosing appropriate measures of center and spread depending on the shape of the distribution
  If the data are reasonably symmetric and no outliers are present, _________ and ________ can be used.
  If the data are skewed and/or outliers are present, use the _________ and ___________, which you then combine into the 5-number summary.

The Normal Distribution Model, Chapter 12

Notation: X ∼ N (___, ____)
  o Characterizing parameters: ____________, _________________
  o Overall shape: _________________________________________
  o Area under each curve: ____
  o Relationship between height and variance of the distribution:
  o The higher the distribution, the ________________________________________

68-95-99.7 rule (empirical rule); note that results based on the 68-95-99.7 rule are approximations to what we would get using z-score calculations and Table A.
  o The 68-95-99.7 rule applies to _____ normal distribution (i.e. for any choice of µ and σ)
  o 68%:
  o 95%:
  o 99.7%:

Standard Normal distribution, Z ∼ N (___, ___)
  o Standardizing: transform any normal distribution with _____ and __________ to a standard normal distribution, i.e. from ______________ to __________________.
  o z-scores: z = ___________
  o Observations _________ than the mean are __________, observations _________ than the mean are _________.
  o Back transformation: x = __________ (finding an x-value for a given area under the normal curve)

Studies and Sampling, Chapter 13

Survey:
Simple random sample:
Stratified samples:
Systematic sampling:
Cluster sampling:
Convenience sampling:
Voluntary response sample:

Accuracy vs. precision
  The choice of the sample, i.e. how the data were collected, determines the _________ of the sample.
  ________ relates to how well the sample represents the population and thus how well the sample statistic represents the population parameter.
  The sample size, i.e. how large the sample is, determines the __________ of the parameter estimate calculated from the sample.
  _______ relates to how much variability there is in the estimate. The less variability, the more precise the estimate.
  Accuracy and precision are different: a large sample size does not _______________________, but it reduces variability in the estimate, yielding more precision. It does not matter how precise our estimate is if the estimate is not accurate.

Difference between μ and x̄
  Population mean μ:
  Sample mean x̄:
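
Worked example (not part of the original guide). The quantities named above (the 5-number summary, the 1.5 * (Q3 - Q1) outlier rule, z-scores, back transformation, and the 68-95-99.7 rule) can be checked numerically. The sketch below is a minimal illustration, assuming Python's standard `statistics` module. The function names and the sample data are hypothetical, chosen only for illustration. Textbooks differ in how they compute Q1 and Q3, so the "inclusive" quartile method used here may give slightly different values than a hand calculation or Table A.

```python
from statistics import NormalDist, mean, median, quantiles


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max).

    Assumption: quartiles use the 'inclusive' method of statistics.quantiles;
    other textbook conventions can give slightly different Q1 and Q3.
    """
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    return min(data), q1, q2, q3, max(data)


def outliers_1_5_iqr(data):
    """Return values lying more than 1.5 * (Q3 - Q1) beyond the quartiles."""
    _, q1, _, q3, _ = five_number_summary(data)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low or x > high]


def z_score(x, mu, sigma):
    """Standardize x from N(mu, sigma) to the standard normal scale."""
    return (x - mu) / sigma


def back_transform(z, mu, sigma):
    """Convert a z-score back to the original scale."""
    return mu + z * sigma


def area_within(k):
    """Area under the standard normal curve within k standard deviations."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(k) - std_normal.cdf(-k)


def x_for_left_area(p, mu, sigma):
    """Find the x-value with area p to its left under N(mu, sigma)."""
    z = NormalDist().inv_cdf(p)
    return back_transform(z, mu, sigma)


if __name__ == "__main__":
    # Hypothetical, right-skewed sample with one large value.
    sample = [2, 4, 4, 5, 6, 7, 8, 9, 30]
    print("mean:", mean(sample), "median:", median(sample))
    print("5-number summary:", five_number_summary(sample))
    print("outliers:", outliers_1_5_iqr(sample))

    # Hypothetical normal model with mu = 100 and sigma = 15.
    print("z for x = 130:", z_score(130, 100, 15))
    print("x with 97.5% of the area to its left:", x_for_left_area(0.975, 100, 15))
    for k in (1, 2, 3):
        print(f"area within {k} sd:", round(area_within(k), 4))
```

In the sample data the single large value pulls the mean above the median, which is the situation where the guide recommends the median and the 5-number summary over the mean and standard deviation.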