REVIEW: Midterm Exam
Spring 2012

Introduction
• Important definitions:
  • Data
  • Statistics
  • A population
  • A census
  • A sample

Types of Data
• Parameter (describes a characteristic of the population)
• Statistic (describes a characteristic of the sample)
• Qualitative data (categorical or attribute data)
• Quantitative data:
  • Discrete
  • Continuous
• Levels of measurement:
  • Nominal
  • Ordinal
  • Interval
  • Ratio

Design of Experiments
• An observational study (no attempt to modify the subjects)
• An experiment (treatment group vs. control group)
Types of observational studies:
• Cross-sectional
• Retrospective (or case-control)
• Prospective (or longitudinal or cohort)

Problems
• Confounding (the effects of different variables cannot be separated)
How to solve this problem?
• Blinding (placebo effect, single-blind, double-blind)
• Blocking
• Randomization:
  • Completely randomized design
  • Randomized block design

Sampling strategies
• Random sample
• Simple random sample (assumed throughout the book)
• Systematic sampling
• Convenience sampling
• Stratified sampling
• Cluster sampling
• Sampling error: difference between the sample result and the true population result
• Nonsampling error: sample data incorrectly collected

Important characteristics of data
• Center
• Variation
• Distribution
• Outliers
• Time

Frequency distribution
• Counts of data values individually or by groups of intervals
• Other forms:
  • Relative frequency distribution (divide each class frequency by the total of all frequencies)
  • Cumulative frequency distribution (cumulative totals)
• Histogram: graphical representation of the frequency distribution

Other graphs
• Relative frequency histogram
• Frequency polygon
• Dotplots
• Stem-and-leaf plots
• Pareto chart
• Pie charts
• Scatter diagrams
• Time series graphs

Examples: histogram and scatter plot

Measures of center
• Sample mean: $\bar{x} = \frac{\sum x}{n}$
• Median: middle value of the sorted data
• Mode: most frequent value
  • Bimodal
  • Multimodal
  • No mode
• Midrange $= \frac{\text{maximum value} + \text{minimum value}}{2}$

Skewed distributions

Measures of Variation
• Range = (Maximum value) - (Minimum value)
• Sample standard deviation (variation from the mean): $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
• Population standard deviation: $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$
• Sample variance: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
• Population variance: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$

Measures of Variation (Cont.)
• Sample coefficient of variation: $CV = \frac{s}{\bar{x}} \cdot 100\%$
• Population coefficient of variation: $CV = \frac{\sigma}{\mu} \cdot 100\%$

Range Rule of Thumb
• $s \approx \frac{\text{range}}{4}$
• Minimum usual value: (mean) - 2 x (standard deviation)
• Maximum usual value: (mean) + 2 x (standard deviation)

Rule for data with a bell-shaped distribution
• About 68% of all values fall within 1 standard deviation of the mean
• About 95% of all values fall within 2 standard deviations of the mean
• About 99.7% of all values fall within 3 standard deviations of the mean

Z Scores
• Sample: $z = \frac{x - \bar{x}}{s}$
• Population: $z = \frac{x - \mu}{\sigma}$
• Ordinary values: -2 ≤ z score ≤ 2
• Unusual values: z score < -2 or z score > 2

Quartiles and Percentiles
• Quartiles: separate a data set into four parts
  • Q1 (first): separates the bottom 25% of the sorted values from the top 75%
  • Q2 (second): same as the median
  • Q3 (third): separates the bottom 75% of the sorted values from the top 25%
• Percentiles: separate the data into 100 parts (P1, P2, …, P99)
• Percentile of value x $= \frac{\text{number of values less than } x}{\text{total number of values}} \cdot 100$
• Interquartile range = Q3 - Q1

Boxplots

Probability
• Definitions:
  • An event
  • A simple event
  • The sample space
• Notation:
  • P: probability
  • A, B and C: specific events
  • P(A): probability of event A occurring

Definitions of Probability
• Relative frequency approximation: $P(A) = \frac{\text{number of times } A \text{ occurred}}{\text{number of times the procedure was repeated}}$
• Classical approach (equally likely outcomes): $P(A) = \frac{\text{number of ways } A \text{ can occur}}{\text{number of different simple events}} = \frac{s}{n}$
• Subjective probability
• Law of large numbers: A procedure is repeated many times.
  The relative frequency probability then tends to the actual probability.

Properties of probability
• The probability of an impossible event is 0
• The probability of an event that is certain is 1
• For any event A, 0 ≤ P(A) ≤ 1
• P(complement of event A) $= P(\bar{A}) = 1 - P(A)$
• Addition rule: P(A or B) = P(in a single trial, event A occurs or event B occurs or they both occur) = P(A) + P(B) - P(A and B),
  or $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
• Events A and B are disjoint if they cannot occur at the same time, so that $P(A \cap B) = 0$

Multiplication Rule
• P(A and B) = P(event A occurs in a first trial and event B occurs in a second trial)
• $P(A \text{ and } B) = P(A) \cdot P(B \mid A)$
• Independent events: $P(B \mid A) = P(B)$
• If A and B are independent: $P(A \text{ and } B) = P(A) \cdot P(B)$
• Conditional probability: $P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)}$

Bayes Theorem
• $P(A \mid B) = \frac{P(A) \cdot P(B \mid A)}{P(A) \cdot P(B \mid A) + P(\bar{A}) \cdot P(B \mid \bar{A})}$

Probability distributions
• Definitions:
  • Random variable (x): numerical value given to an outcome of a procedure. Example: number of mountain lions seen on the UCSC campus last year
  • Probability distribution (P(x)): gives the probability of each value of the random variable
• Types of random variables:
  • Discrete
  • Continuous

Requirements of a probability distribution
• $\sum P(x) = 1$ (discrete case)
• 0 ≤ P(x) ≤ 1
• Expected value of a discrete random variable: $E = \sum [x \cdot P(x)]$

Discrete distributions:
• Binomial
• Poisson

Binomial distribution
• Requirements:
  • Fixed number of trials
  • Trials are independent
  • Each trial can be a success or a failure
  • Probabilities remain constant
• Random variable: x = number of successes among n trials
• $P(x) = \frac{n!}{(n - x)! \, x!} \cdot p^x \cdot q^{n - x}$ (you can also use the Binomial Table)
  • n = number of trials
  • p = probability of success in one trial
  • q = probability of failure in one trial (q = 1 - p)

Mean, variance and standard deviation of the Binomial distribution
• Mean: $\mu = np$
• Variance: $\sigma^2 = npq$
• Standard deviation: $\sigma = \sqrt{npq}$
• Maximum usual value: $\mu + 2\sigma$
• Minimum usual value: $\mu - 2\sigma$

Poisson distribution
• Requirements:
  • Random variable x is the number of occurrences of an event over some interval
  • The occurrences must be random
  • The occurrences must be independent
• $P(x) = \frac{\mu^x \cdot e^{-\mu}}{x!}$
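The binomial and Poisson probability formulas in this section can be evaluated directly. The following is a minimal sketch in Python using only the standard library; the function names `binomial_pmf`, `poisson_pmf` and `usual_range`, and the numbers in the usage lines, are illustrative assumptions and do not come from the review.

```python
import math

def binomial_pmf(x, n, p):
    """P(x) = n! / ((n - x)! x!) * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return math.comb(n, x) * p**x * q**(n - x)

def poisson_pmf(x, mu):
    """P(x) = mu^x * e^(-mu) / x!, where mu is the mean number of occurrences."""
    return mu**x * math.exp(-mu) / math.factorial(x)

def usual_range(mu, sigma):
    """Minimum and maximum usual values: mu - 2*sigma and mu + 2*sigma."""
    return mu - 2 * sigma, mu + 2 * sigma

# Illustrative numbers only (not from the review):
n, p = 100, 0.5
mu = n * p                             # binomial mean np
sigma = math.sqrt(n * p * (1 - p))     # binomial standard deviation sqrt(npq)
print(binomial_pmf(2, 4, 0.5))         # probability of 2 successes in 4 trials
print(poisson_pmf(3, 2.0))             # probability of 3 occurrences when mu = 2
print(usual_range(mu, sigma))          # usual values for the binomial above
```

`math.comb(n, x)` computes the binomial coefficient n! / ((n - x)! x!) without forming the large factorials explicitly.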
The Poisson distribution depends only on $\mu$ (the mean of the process).

Mean, variance and standard deviation of the Poisson distribution
• Mean: $\mu$
• Variance: $\sigma^2 = \mu$
• Standard deviation: $\sigma = \sqrt{\mu}$
• Maximum usual value: $\mu + 2\sigma$
• Minimum usual value: $\mu - 2\sigma$

Continuous distributions
• Uniform distribution
• Normal distribution
• Density curve: graph of a continuous distribution
• Properties:
  • The area below the curve is equal to 1
  • Every point on the curve has a height greater than or equal to zero

Uniform and Normal distributions

Sampling distributions
• Variation of the value of a statistic from sample to sample: sampling variability
• Sampling distribution of the sample mean
• Sampling distribution of the sample proportion

CENTRAL LIMIT THEOREM:
• The random variable x has a distribution (normal or not) with mean $\mu$ and standard deviation $\sigma$
• The distribution of the sample means approaches a normal distribution as the sample size increases

Mean and standard deviation of the sample mean
• Mean: $\mu_{\bar{x}} = \mu$
• Standard deviation: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

Normal approximation to the Binomial
• If np ≥ 5 and nq ≥ 5, a Binomial random variable x can be approximated with a Normal distribution with mean and standard deviation:
  • Mean: $\mu = np$
  • Standard deviation: $\sigma = \sqrt{npq}$

Confidence Interval for the Population Proportion (p)
• p = population proportion
• $\hat{p} = \frac{x}{n}$ = sample proportion of successes
• $\hat{q} = 1 - \hat{p}$ = sample proportion of failures
Procedure to build a CI of confidence level $(1 - \alpha)$:
1) Check the normal approximation to the Binomial distribution (np ≥ 5 and nq ≥ 5)
2) Get the critical value $z_{\alpha/2}$
3) Evaluate the margin of error: $E = z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}\hat{q}}{n}}$
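4) Confidence interval, written in any of these forms:
  • $\hat{p} - E < p < \hat{p} + E$
  • $\hat{p} \pm E$
  • $(\hat{p} - E, \hat{p} + E)$
5) Interpret results

Sample size for estimating proportion p
• $\hat{p}$ is given: $n = \frac{[z_{\alpha/2}]^2 \, \hat{p}\hat{q}}{E^2}$
• $\hat{p}$ is not given ($\hat{p}$ is assumed = 0.5): $n = \frac{[z_{\alpha/2}]^2 \cdot 0.25}{E^2}$

Finding $\hat{p}$ (point estimate) and E from the Confidence Interval
• Point estimate: $\hat{p} = \frac{(\text{upper confidence limit}) + (\text{lower confidence limit})}{2}$
• Margin of error: $E = \frac{(\text{upper confidence limit}) - (\text{lower confidence limit})}{2}$

Confidence Interval for the Population Mean ($\mu$), $\sigma$ known
• Check requirements:
  • Sample is a simple random sample
  • Population standard deviation $\sigma$ is known
  • Population is normally distributed or n > 30
• Procedure:
1) Check normality requirements
2) Get the critical value $z_{\alpha/2}$
3) Evaluate the margin of error: $E = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$
4) Confidence interval, written in any of these forms:
  • $\bar{x} - E < \mu < \bar{x} + E$
  • $\bar{x} \pm E$
  • $(\bar{x} - E, \bar{x} + E)$
5) Interpret results

Sample size for estimating the Mean
• $n = \left[\frac{z_{\alpha/2} \cdot \sigma}{E}\right]^2$
• Values of $\sigma$, $z_{\alpha/2}$ and E are given

Confidence Interval for the Population Mean ($\mu$), $\sigma$ not known
• In this case we use the Student t distribution with n - 1 degrees of freedom
• Check requirements:
  • Sample is a simple random sample
  • Population standard deviation is estimated by s (sample standard deviation)
  • Population is normally distributed or n > 30
• Procedure:
1) Check normality requirements
2) Get the critical value $t_{\alpha/2}$ with n - 1 degrees of freedom
3) Evaluate the margin of error: $E = t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$
4) Confidence interval, written in any of these forms:
  • $\bar{x} - E < \mu < \bar{x} + E$
  • $\bar{x} \pm E$
  • $(\bar{x} - E, \bar{x} + E)$
5) Interpret results

Finding the point estimate and E from the Confidence Interval
• Point estimate of $\mu$: $\bar{x} = \frac{(\text{upper confidence limit}) + (\text{lower confidence limit})}{2}$
• Margin of error: $E = \frac{(\text{upper confidence limit}) - (\text{lower confidence limit})}{2}$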
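The confidence interval procedures above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a definitive implementation: the function names and the example numbers are assumptions made here, the normal-approximation check uses the sample proportion in place of the unknown p, and the sample size is rounded up to the next whole number as a convention. The t critical value is passed in as an argument (for example, read from a t table), because the standard library has no Student t distribution.

```python
import math
from statistics import NormalDist

def z_critical(confidence):
    """Critical value z_{alpha/2} for a confidence level of 1 - alpha."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def proportion_ci(x, n, confidence=0.95):
    """Confidence interval for p from x successes in n trials."""
    p_hat = x / n
    q_hat = 1 - p_hat
    # Step 1: normal approximation check (assumption: p_hat stands in for p)
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("normal approximation requirement not met")
    # Steps 2 and 3: critical value and margin of error
    E = z_critical(confidence) * math.sqrt(p_hat * q_hat / n)
    # Step 4: interval (p_hat - E, p_hat + E)
    return p_hat - E, p_hat + E

def mean_ci(x_bar, sd, n, critical_value):
    """Interval x_bar +/- E with E = critical_value * sd / sqrt(n).

    Use sd = sigma with a z critical value when sigma is known, or
    sd = s with a t critical value (n - 1 degrees of freedom) otherwise.
    """
    E = critical_value * sd / math.sqrt(n)
    return x_bar - E, x_bar + E

def sample_size_proportion(E, confidence=0.95, p_hat=0.5):
    """n = z^2 * p_hat * q_hat / E^2, rounded up (p_hat = 0.5 if not given)."""
    z = z_critical(confidence)
    return math.ceil(z**2 * p_hat * (1 - p_hat) / E**2)

def estimate_and_margin(lower, upper):
    """Recover the point estimate and E from the confidence limits."""
    return (upper + lower) / 2, (upper - lower) / 2

# Illustrative numbers only (not from the review):
low, high = proportion_ci(50, 100)
print(low, high)
print(estimate_and_margin(low, high))
print(mean_ci(100, 15, 36, z_critical(0.95)))
print(sample_size_proportion(0.03))
```

Passing the critical value into `mean_ci` keeps one function for both the z case and the t case, which mirrors how the two procedures differ only in step 2.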
