1st exam review sheet

Introduction, Chapter 1
Descriptive statistics: the collection, presentation, and description of data in the form of ______, ______,
and _________________ that provide meaningful information about the data.
Inferential statistics: deals with the _________ of data as well as drawing ________ and making
generalizations based on data for a larger group of subjects.
Individual: ___________________________________________________________________________
Observation: _________________________________________________________________________
Random variables:
Variables: qualitative (categorical) versus quantitative (numerical)
Categorical variable: individuals can be placed into one of several distinct categories.
Nominal versus ordinal categorical variables
Nominal: ________ not meaningful
Ordinal: _________ may be meaningful
Quantitative variables: take numerical values for which arithmetic operations such as adding and
averaging make sense.
Population: ________________________________________________________________________
Sample: ___________________________________________________________________________
Numerical summaries for populations and samples: “parameters” vs. “statistics”
Sample statistic: ____________________________________________________________________
Population parameter: _______________________________________________________________
Descriptive Statistics, Chapters 2-4
Displaying distribution with graphs:
i. categorical data
A. bar charts: shows the amount of data that belong to each category as proportionally sized
___________________.
B. pie charts: shows the amount of data that belong to each category as a proportional slice of
the circle.
C. pareto charts: bar charts whose bars are arranged by magnitude, i.e. from
___________________, in order to highlight the categories with the highest frequencies.
ii. Quantitative data
o Histogram
o time-plots
o box-plots
iii. describing distributions
A. center: associated with locating the “middle” of the data
o Mean:
o Median:
o Mode:
How to find the median when the sample size n is odd/even
o Odd:
o Even:
B. spread
o Range: ______ - ______
o IQR: ______ - ______
o 5-number summary: _____, ______, ______, ______, ______
o Boxplot: the graphical display of the 5-number summary, also called a ________.
C. shape (modality & skewness)
o A histogram with one mode is called _________, with two modes _________, and with three or
more modes ___________.
o Skewness (labels of the three example histograms): right-skewed, symmetric, left-skewed
D. outliers: 1.5 * (Q3-Q1)
E. variance
F. Standard deviation:
G. Measures of variability: the most common measures of variability are the ________, the
_____________, the __________, and the ______________.
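The measures of center and spread listed above can be tried out on a small data set. The following is only a sketch using Python's standard `statistics` module; the data values are made up for illustration, and the quartile convention (`method="inclusive"`) is an assumption, since textbooks and software compute Q1 and Q3 in slightly different ways.

```python
import statistics

# Hypothetical sample, chosen only for illustration (n = 8, an even sample size).
data = [2, 4, 4, 5, 7, 9, 10, 12]

# Center
mean = statistics.mean(data)
median = statistics.median(data)            # n even: average of the two middle values
mode = statistics.mode(data)                # most frequent value
odd_median = statistics.median([3, 1, 2])   # n odd: the single middle value after sorting

# Spread
data_range = max(data) - min(data)
# Quartile conventions vary; "inclusive" is one common choice (an assumption here).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1
five_number_summary = (min(data), q1, q2, q3, max(data))

# 1.5 * (Q3 - Q1) rule: values beyond these fences are flagged as possible outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

# Sample variance (divides by n - 1) and sample standard deviation (its square root).
variance = statistics.variance(data)
sd = statistics.stdev(data)

print("mean, median, mode:", mean, median, mode)
print("5-number summary:", five_number_summary)
print("range, IQR:", data_range, iqr)
print("possible outliers:", outliers)
print("variance, sd:", variance, sd)
```

If your course's Table or software reports slightly different quartiles for the same data, that is most likely due to a different quartile convention, not an error.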
Choosing appropriate measures of center and spread depending on the shape of the distribution
o If the data are reasonably symmetric and no outliers are present, _________ and ________ can
be used.
o If the data are skewed and/or outliers are present, use the _________ and ___________, which
you then combine into the 5-number summary.
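To see why the shape of the distribution matters for this choice, it can help to compare the summaries on two nearly identical data sets. The sketch below uses made-up values and a hypothetical helper `summarize`; the only difference between the two lists is one extreme value. The quartile convention is again an assumption.

```python
import statistics

def summarize(values):
    """Return mean, sd, median, and IQR for a list of numbers (illustrative helper)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "median": statistics.median(values),
        "iqr": q3 - q1,
    }

symmetric = [48, 49, 50, 51, 52]        # hypothetical, roughly symmetric data
with_outlier = [48, 49, 50, 51, 152]    # same data, largest value replaced by an extreme one

a = summarize(symmetric)
b = summarize(with_outlier)

for name in ("mean", "sd", "median", "iqr"):
    print(f"{name:>6}: symmetric = {a[name]:.2f}, with outlier = {b[name]:.2f}")
```

In this example the median and IQR are the same for both lists, while the mean and standard deviation change substantially when the extreme value is present.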
The Normal Distribution Model, Chapter 12
Notation: X ∼ N (___, ____)
o Characterizing parameters: ____________, _________________
o Overall shape: _________________________________________
o Area under each curve: ____
o Relationship between height and variance of the distribution:
o The higher the distribution, the ________________________________________
68-95-99.7 rule (empirical rule); note that results based on the 68-95-99.7 rule are
approximations to what we would get using z-score calculations and Table A.
o The 68-95-99.7 rule applies to _____ normal distribution (i.e. for any choice of µ and σ)
o 68%:
o 95%:
o 99.7%:
Standard Normal distribution, Z ∼ N (___, ___)
o Standardizing: transform any normal distribution with _____ and __________ to a
standard normal distribution, i.e. from ______________ to __________________.
o z-scores: z = ___________
o observations _________ than the mean are __________, observations _________ than
the mean are _________.
Back transformation: x = __________ (finding an x-value for a given area under the normal curve)
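The 68-95-99.7 rule, standardizing, and the back transformation can be checked numerically instead of with Table A. This is a sketch using `statistics.NormalDist` from the Python standard library; the values µ = 100, σ = 15, x = 130, and the area 0.975 are assumptions picked only for illustration.

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0       # hypothetical mean and standard deviation
X = NormalDist(mu, sigma)     # a normal distribution with these parameters
Z = NormalDist()              # standard normal distribution (mean 0, sd 1)

def area_within(k):
    """Area under the curve of X within k standard deviations of the mean."""
    return X.cdf(mu + k * sigma) - X.cdf(mu - k * sigma)

# Exact areas that the 68-95-99.7 rule approximates.
coverage = [area_within(k) for k in (1, 2, 3)]
print("within 1, 2, 3 sd:", [round(c, 4) for c in coverage])

# Standardizing: convert an x-value to a z-score.
x = 130.0
z = (x - mu) / sigma
area_x = X.cdf(x)             # area to the left of x on the original scale
area_z = Z.cdf(z)             # area to the left of z on the standard scale
print("z-score:", z, "area:", round(area_z, 4))

# Back transformation: find the x-value with a given area to its left.
z_star = Z.inv_cdf(0.975)
x_star = mu + z_star * sigma
print("z* and x*:", round(z_star, 3), round(x_star, 2))
```

The areas on the original and standardized scales agree, which is why one table for Z is enough for any normal distribution.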
Studies and Sampling, Chapter 13
Survey:
Simple random sample:
Stratified samples:
Systematic sampling:
Cluster sampling:
Convenience sampling:
Voluntary response sample:
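The four probability-based designs in this list can be illustrated with a short simulation. The sketch below uses Python's `random` module on a made-up population of 100 individuals split into 4 groups; the population, group labels, and sample sizes are all assumptions. Convenience and voluntary response samples are left out because they do not involve random selection by the researcher.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100 individuals, each belonging to one of 4 groups.
population = [{"id": i, "group": i % 4} for i in range(100)]

# Collect the members of each group (used as strata, and later as clusters).
groups = {}
for person in population:
    groups.setdefault(person["group"], []).append(person)

# Simple random sample: every set of 12 individuals is equally likely.
srs = rng.sample(population, 12)

# Stratified sample: a separate simple random sample of 3 from every group.
stratified = [p for members in groups.values() for p in rng.sample(members, 3)]

# Systematic sample: random starting point, then every k-th individual on the list.
k = 10
start = rng.randrange(k)
systematic = population[start::k]

# Cluster sample: randomly choose whole groups and take everyone in them.
chosen_groups = rng.sample(sorted(groups), 2)
cluster = [p for g in chosen_groups for p in groups[g]]

print("SRS ids:", sorted(p["id"] for p in srs))
print("stratified ids:", sorted(p["id"] for p in stratified))
print("systematic ids:", [p["id"] for p in systematic])
print("cluster groups:", chosen_groups, "size:", len(cluster))
```

Note the difference between the last two group-based designs: the stratified sample takes a few individuals from every group, while the cluster sample takes all individuals from a few groups.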
Accuracy vs. precision
o The choice of the sample, i.e. how the data were collected, determines the _________ of the
sample.
________ relates to how well the sample represents the population and thus how well the
sample statistic represents the population parameter.
o The sample size, i.e. how large the sample is, determines the __________ of the parameter
estimate calculated from the sample.
_______ relates to how much variability there is in the estimate. The less variability, the more
precise the estimate.
Accuracy and precision are different: a large sample size does not _______________________, but
reduces variability in the estimate yielding more precision. It does not matter how precise our estimate
is if the estimate is not accurate.
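The difference between the two ideas can be seen in a small simulation. The sketch below is built entirely on assumptions: a made-up population of 10,000 values, a deliberately flawed sampling method that only reaches the upper half of the population, and sample sizes of 10 and 200. For each combination it repeats the sampling 500 times and records how far the sample means are from the population mean on average, and how much they vary.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Hypothetical population and its mean (the population parameter mu).
population = [rng.gauss(50, 10) for _ in range(10_000)]
mu = statistics.mean(population)

# A flawed sampling frame: only the upper half of the population can be selected.
upper_half = sorted(population)[5_000:]

def sample_means(pool, n, repeats=500):
    """Draw `repeats` samples of size n from pool and return their sample means."""
    return [statistics.mean(rng.sample(pool, n)) for _ in range(repeats)]

results = {}
for label, pool in (("random", population), ("biased", upper_half)):
    for n in (10, 200):
        means = sample_means(pool, n)
        off_target = statistics.mean(means) - mu   # how far off the estimates are on average
        variability = statistics.stdev(means)      # how much the estimates vary
        results[(label, n)] = (off_target, variability)
        print(f"{label:>6}, n={n:>3}: off target by {off_target:6.2f}, "
              f"variability {variability:5.2f}")
```

With the flawed method, increasing n from 10 to 200 shrinks the variability of the estimates, but they remain off target by about the same amount.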
Difference between μ and x̄
Population mean μ:
Sample mean x̄: