Lecturer: Oliver F. Shyr
• What is statistics
  - Collecting informative data
  - Displaying data in tables and charts
  - Interpreting these data
  - Drawing conclusions about the subject under study
• Statistics in our daily life
  - Employment rate
  - Consumer price index (CPI)
  - Gallup Poll
  - Quality and productivity improvement
• Applications of statistics
  - Education: Monitoring Performance of Training Programs
  - Consumer Protection Programs: Monitoring Advertisement Claims
  - Food Production: Plant Breeding
  - Construction Engineering: Performance of Building Beams
  - Medical Sciences: Comparison of Medical Treatments
  - Business Management: Effectiveness of Sales Promotion Strategies
• Basic Concepts
  - Population (example: Census)
  - Sample
  - Random sample versus misleading sample
    - Call-in
    - Telephone Interviews
    - Mail-in
    - Internet Polls
    - Home Interviews
    - Focus Group Interviews

• Objectives of Statistics
  - To make inferences about a population from an analysis of information contained in sample data, and to assess the extent of uncertainty involved in these inferences
  - To design the process and extent of sampling so that the observations form a basis for drawing valid inferences

• Sample Cost versus Sample Size: Using a Random Number Table to Select a Sample
• Types of data
  - Qualitative or categorical data
  - Quantitative or numerical data
    - Discrete variables
    - Continuous variables
• Summarization & description of the overall pattern
  - To detect outliers
• Computation of numerical measures
  - Mean and variance
• Describing data by tables and graphs (EXCEL)
  - Categorical data
    - Table of relative frequency
    - Pie chart of relative percentages
    - Pareto diagram of frequency distribution
  - Numerical data
    - Measures of center: mean versus median
    - Measures of variation: variance and standard deviation
    - Frequency distribution for a continuous variable (example: paying attention in class)
    - Ordered data: lower quartile Q1 = 25th percentile, second quartile Q2 = median, upper quartile Q3 = 75th percentile
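The numerical measures listed above can be computed with Python's standard library, as a rough sketch. The data values are made up for illustration, and the "inclusive" quartile method is only one of several conventions; a textbook or EXCEL may use a slightly different rule.

```python
import statistics

# Hypothetical sample (e.g. minutes of attention per class); made up for illustration.
data = [12, 15, 15, 18, 20, 22, 25, 30, 45]

mean = statistics.mean(data)          # measure of center, sensitive to the large value 45
median = statistics.median(data)      # measure of center, robust to outliers
variance = statistics.variance(data)  # sample variance, divisor n - 1
std_dev = statistics.stdev(data)      # square root of the sample variance

# quantiles(n=4) returns the three cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

print(f"mean={mean:.2f} median={median} variance={variance:.2f} sd={std_dev:.2f}")
print(f"Q1={q1} Q2={q2} Q3={q3}")
```

Here the mean exceeds the median because the single large observation pulls the mean upward, which is the usual argument for reporting both.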

• Simpson's paradox (Johnson & Bhattacharyya)
  - Aggregation of data from different sources may lead to a reversed conclusion (p. 85)
• A designed experiment for making a comparison
  - Quitting smoking with a medicated patch (p. 89)
• Scatter diagram of bivariate numerical data
  - A least squares approach: regression analysis (p. 91)
• The correlation coefficient
  - A measure of linear relation (p. 95)
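A least squares line and the correlation coefficient can both be obtained from the same three sums of squares. The sketch below assumes simple linear regression of y on x; the function name and the data are hypothetical, not taken from the textbook.

```python
import math

def least_squares_fit(xs, ys):
    """Return (slope, intercept, r) for the least squares line y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / math.sqrt(sxx * syy)  # correlation coefficient, always between -1 and 1
    return slope, intercept, r

# Hypothetical bivariate data, made up for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, r = least_squares_fit(xs, ys)
print(f"y = {intercept:.2f} + {slope:.2f} x, r = {r:.4f}")
```

A value of r near 1 or -1 indicates a strong linear relation; it says nothing about nonlinear patterns, which is why the scatter diagram is drawn first.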
• Probabilities express the chances of events that cannot be predicted with certainty
• An experiment
  - A process that has various outcomes
• The sample space
  - The collection of all possible distinct outcomes associated with an experiment
• An event
  - A set of elementary outcomes with a designated feature
• Examples
• Three axioms of probability
  - The probability of an event must lie between 0 and 1
  - The probability of an event is the sum of the probabilities assigned to all outcomes contained in the event
  - The sum of the probabilities of all distinct outcomes must be 1
• Methods of assigning probability
  - Equally likely elementary outcomes: uniform probability model
  - Probability as the long-run relative frequency
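The two methods of assigning probability can be compared on a fair die, as a small sketch. The helper names are hypothetical, and the simulation uses a fixed seed only so that the run is repeatable.

```python
import random
from fractions import Fraction

def uniform_probability(event, sample_space):
    """Uniform model: favourable outcomes divided by all equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

def relative_frequency(event, sample_space, trials, seed=0):
    """Long-run relative frequency: the share of simulated trials in which the event occurs."""
    rng = random.Random(seed)
    outcomes = sorted(sample_space)
    hits = sum(rng.choice(outcomes) in event for _ in range(trials))
    return hits / trials

die = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}

print(uniform_probability(even, die))         # exact value under the uniform model
print(relative_frequency(even, die, 20_000))  # settles near the exact value as trials grow
```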
• Event relations
  - Complement
  - Union
  - Intersection
• Two laws of probability
  - Law of Complement: $P(\bar{A}) = 1 - P(A)$
  - Addition Law: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
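Both laws can be checked by enumerating a small sample space. The sketch below assumes two fair dice and two events chosen purely for illustration; Python sets give complement, union, and intersection directly.

```python
from fractions import Fraction
from itertools import product

# Sample space for rolling two fair dice: 36 equally likely ordered pairs.
space = set(product(range(1, 7), repeat=2))

def prob(event):
    """Uniform probability model on the two-dice sample space."""
    return Fraction(len(event), len(space))

A = {o for o in space if o[0] + o[1] == 7}  # the two dice sum to 7
B = {o for o in space if o[0] == 3}         # the first die shows 3
complement_A = space - A

# Law of Complement
print(prob(complement_A), "=", 1 - prob(A))
# Addition Law: the intersection is subtracted once because it is counted twice
print(prob(A | B), "=", prob(A) + prob(B) - prob(A & B))
```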
• Conditional probability and inference
  - Multiplication law of probability: $P(A \cap B) = P(A)\,P(B \mid A) = P(B)\,P(A \mid B)$
  - Independence: $P(B \mid A) = P(B)$, equivalently $P(A \cap B) = P(A)\,P(B)$
• Bayes's Rule
  - The rule of total probability: $P(A) = P(A \mid B)\,P(B) + P(A \mid \bar{B})\,P(\bar{B})$
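As a worked sketch, the rule of total probability supplies the denominator of Bayes's Rule, $P(B \mid A) = P(A \mid B)\,P(B) / P(A)$. The three input probabilities below are hypothetical numbers chosen for illustration (for instance, B could be "has a condition" and A "tests positive"); exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def total_probability(p_a_given_b, p_a_given_not_b, p_b):
    """P(A) = P(A|B) P(B) + P(A|not B) P(not B)."""
    return p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)

def bayes_posterior(p_a_given_b, p_a_given_not_b, p_b):
    """P(B|A) = P(A|B) P(B) / P(A), with P(A) from the rule of total probability."""
    return p_a_given_b * p_b / total_probability(p_a_given_b, p_a_given_not_b, p_b)

# Hypothetical inputs, not taken from the course material.
p_b = Fraction(1, 100)              # P(B)
p_a_given_b = Fraction(95, 100)     # P(A | B)
p_a_given_not_b = Fraction(5, 100)  # P(A | not B)

p_a = total_probability(p_a_given_b, p_a_given_not_b, p_b)
posterior = bayes_posterior(p_a_given_b, p_a_given_not_b, p_b)
print(p_a, posterior, float(posterior))
```

With these inputs the posterior P(B | A) is far smaller than P(A | B), because B is rare to begin with.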
• Random sampling from a finite population
  - The rule of combinations
• Definition of a random variable
  - Probability function of a discrete RV
  - Probability function of a continuous RV
• Bernoulli Trials
  - Binomial distribution
    - Examples: 5.11, 5.12
    - Sampling (key fact 5.5)
  - Geometric distribution
    - Example: hit a home run in a baseball game
  - Pascal distribution *
    - Example: food processing - a pack of good apples
• Poisson Process
  - Poisson distribution
    - Example: traffic volume passing a toll booth
    - Examples: 5.15, 5.16
  - Exponential distribution
    - Example: headway of a city bus
    - Example: 7.85
  - Erlang distribution **
    - Example: headway of a Jitney (community bus)
• Expectation and standard deviation of a probability function
  - Poisson
  - Exponential
  - Erlang
• The Poisson Approximation to the Binomial Distribution *
  - Rail fatigue / pavement failure
  - Car accident
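The Poisson approximation replaces a binomial distribution with many trials n and a small success probability p by a Poisson distribution with mean np. The sketch below compares the two probability functions; the values of n and p are hypothetical stand-ins for a rare event such as an accident.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Hypothetical rare-event setting: many trials, small success probability.
n, p = 1000, 0.002
lam = n * p  # the approximating Poisson mean

for k in range(6):
    print(k, round(binomial_pmf(k, n, p), 5), round(poisson_pmf(k, lam), 5))

# Largest pointwise gap between the two probability functions for k = 0..10.
max_gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(11))
```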
• EXCEL
  - Find the PDF & CDF of the above distributions
  - Use Random Number Generator to simulate data from uniform & exponential distributions
  - Calculate the means and the variances of these distributions
  - Compare the sample mean / variance with the population mean / variance
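The same exercise can be sketched outside EXCEL. The slides do not give the distribution parameters, so this example assumes a uniform distribution on [0, 1] (population mean 1/2, variance 1/12) and an exponential distribution with rate 1 (population mean 1, variance 1); the sample size and seed are also arbitrary choices.

```python
import random
import statistics

def summarize(draws):
    """Return the sample mean and the sample variance (divisor n - 1)."""
    return statistics.fmean(draws), statistics.variance(draws)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 50_000

uniform_draws = [rng.random() for _ in range(n)]      # assumed Uniform(0, 1)
exp_draws = [rng.expovariate(1.0) for _ in range(n)]  # assumed Exponential with rate 1

uniform_mean, uniform_var = summarize(uniform_draws)
exp_mean, exp_var = summarize(exp_draws)

print(f"uniform:     sample mean {uniform_mean:.4f} vs 0.5, sample variance {uniform_var:.4f} vs {1/12:.4f}")
print(f"exponential: sample mean {exp_mean:.4f} vs 1.0, sample variance {exp_var:.4f} vs 1.0")
```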

• Variables and Density Curves
  - Key Fact 6.1 & 6.2
• A Normally Distributed Variable
  - Gaussian function and Gaussian Bell Curve
    - http://en.wikipedia.org/wiki/Normal_distribution
  - Key fact 6.3
  - Example 6.1, 6.2

• Standard normal distribution (ND) table
  - Key fact 6.4
  - Examples 6.4, 6.8, 6.9
• Using Normal Probability to Detect Outliers
  - Examples: 6.14, 6.15
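A standard normal table lookup corresponds to evaluating the standard normal CDF after standardizing. The sketch below uses `statistics.NormalDist` from the Python standard library; the three-standard-deviation cutoff for flagging outliers is an assumption made for this example, and the mean and standard deviation are hypothetical.

```python
from statistics import NormalDist

standard = NormalDist()  # standard normal: mean 0, standard deviation 1

def z_score(x, mu, sigma):
    """Standardize x so that it can be looked up in a standard normal table."""
    return (x - mu) / sigma

def is_outlier(x, mu, sigma, cutoff=3.0):
    """Flag x when it lies more than `cutoff` standard deviations from the mean (assumed rule)."""
    return abs(z_score(x, mu, sigma)) > cutoff

# Probability that a normal variable falls within 2 standard deviations of its mean.
p_within_2 = standard.cdf(2) - standard.cdf(-2)
print(round(p_within_2, 4))

# Hypothetical variable with mean 100 and standard deviation 15.
print(is_outlier(152.5, 100, 15), is_outlier(110, 100, 15))
```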
• Normal Approximation to the Binomial Distribution
  - Example 6.17, 6.18
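The normal approximation matches a Binomial(n, p) variable with a normal variable that has mean np and standard deviation sqrt(np(1 - p)). The sketch below adds the usual continuity correction of 0.5 at each end of the interval; n, p, and the interval are hypothetical.

```python
import math
from statistics import NormalDist

def binomial_interval_exact(n, p, lo, hi):
    """Exact P(lo <= X <= hi) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(lo, hi + 1))

def binomial_interval_normal(n, p, lo, hi):
    """Normal approximation with continuity correction."""
    approx = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))
    return approx.cdf(hi + 0.5) - approx.cdf(lo - 0.5)

# Hypothetical setting: 100 fair coin tosses, between 45 and 55 heads inclusive.
exact = binomial_interval_exact(100, 0.5, 45, 55)
approx = binomial_interval_normal(100, 0.5, 45, 55)
print(round(exact, 4), round(approx, 4))
```

The approximation is generally considered adequate when both np and n(1 - p) are reasonably large; it degrades for p close to 0 or 1.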
• Other Related Distributions
  - Log-normal distribution
    - http://en.wikipedia.org/wiki/Log-normal_distribution
  - Student's t distribution
    - http://en.wikipedia.org/wiki/Student%27s_t-distribution
  - F distribution
    - http://en.wikipedia.org/wiki/F_distribution
  - $\chi^2$ distribution
    - http://en.wikipedia.org/wiki/Chi-squared_distribution
• Z-test (Normal distribution) with known $\sigma^2$
  - Examples: 7.1, 7.3, 7.5
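• Central Limit Theorem
  - For a large sample size $n$, the sample mean approximately follows a normal distribution: $\bar{x} \sim N(\mu, \sigma^2/n)$
  - Examples: 7.9, 7.10
  - If $\sigma^2$ is unknown, then it is estimated by the sample variance $s^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 / (n-1)$
  - $E\left[\sum_{i=1}^{n} (x_i - \bar{x})^2\right] = (n-1)\,\sigma^2$, so $E(s^2) = \sigma^2$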
• EXCEL
  - Redo examples: 6.11, 6.12, 6.14, 6.15, & 6.19
  - Redo examples: 7.1, 7.5, 7.9, 8.2, 8.6, 8.9
  - Illustrate the Central Limit Theorem by showing distribution charts for 100 sample means
    - Each sample has a size of 40
    - Three types of random variables are tested:
      - Uniform distribution with $X \in [0, 2]$
      - Exponential distribution with $\lambda = 1$
      - Log-normal distribution with $\mu = 0$, $\sigma = 1$
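The exercise can also be sketched in Python rather than EXCEL. The code below draws 100 samples of size 40 from each of the three distributions and compares the spread of the sample means with the value $\sigma/\sqrt{n}$ suggested by the Central Limit Theorem. The function name, the seed, and the use of summary numbers instead of charts are choices made for this example.

```python
import math
import random
import statistics

def sample_means(draw, num_samples=100, sample_size=40, seed=0):
    """Return the means of `num_samples` independent samples, each of size `sample_size`."""
    rng = random.Random(seed)
    return [
        statistics.fmean([draw(rng) for _ in range(sample_size)])
        for _ in range(num_samples)
    ]

# (name, sampler, population mean, population variance)
cases = [
    ("uniform on [0, 2]", lambda rng: rng.uniform(0, 2), 1.0, 1 / 3),
    ("exponential, rate 1", lambda rng: rng.expovariate(1.0), 1.0, 1.0),
    ("log-normal, mu=0, sigma=1", lambda rng: rng.lognormvariate(0, 1),
     math.exp(0.5), (math.e - 1) * math.e),
]

for name, draw, mu, var in cases:
    means = sample_means(draw)
    print(f"{name}: mean of sample means {statistics.fmean(means):.3f} (population {mu:.3f}), "
          f"sd of sample means {statistics.stdev(means):.3f} (theory {math.sqrt(var / 40):.3f})")
```

A histogram of each `means` list would play the role of the distribution charts; the skewed log-normal case typically looks the least bell-shaped at this sample size.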

• Point estimation of mean
  - Case Study: Chips Ahoy!
  - Examples: 8.2, 8.3

• Margin of Error and Sample Size
  - Examples: 8.6, 8.7
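For a confidence interval for the mean with known $\sigma$, the margin of error is $E = z_{\alpha/2}\,\sigma/\sqrt{n}$, and solving for $n$ gives the sample size needed for a target margin. The sketch below assumes a known population standard deviation; the numeric inputs are hypothetical.

```python
import math
from statistics import NormalDist

def margin_of_error(sigma, n, confidence=0.95):
    """E = z * sigma / sqrt(n), where z is the upper alpha/2 point of the standard normal."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sigma / math.sqrt(n)

def required_sample_size(sigma, margin, confidence=0.95):
    """Smallest n whose margin of error does not exceed `margin` (rounded up)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)

# Hypothetical inputs: sigma = 100, current sample size 25, target margin 20.
print(round(margin_of_error(100, 25), 2))
print(required_sample_size(100, 20))
```

Because $n$ grows with the square of $1/E$, halving the margin of error requires roughly four times the sample size.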
• t-test when the population variance is unknown
  - One-tailed and two-tailed tests for small samples
  - Examples: 8.9, 8.10

• Computer Lab Exercise
  - Redo Case Study for Confidence Interval with $\sigma = 100$
  - Redo Example 8.9 for CI with & without outliers

• Case Study: Gender & Sense of Direction
• Applications of descriptive data analysis diagrams
Type of Diagram
Normal prob. plot
Boxplot
Stem-and-leaf diagram
Normality Check

X

Symmetric Check
X


Outlier Check


X
• Null and Alternative Hypotheses
  - Null hypothesis: the one to be tested
  - Alternative hypothesis: an alternative to the null
  - Hypothesis test: to decide whether the null should be rejected in favor of the alternative
  - Example 9.2, 9.3

• Logic of hypothesis testing
  - Type I ($\alpha$) and Type II ($\beta$) Errors
  - Example 9.4
  - Key fact 9.1 & 9.2
  - Level of significance

• Hypotheses Testing
  - Critical-Value approach and rejection region
  - Examples 9.5, 9.6
  - P-Value approach
  - Examples 9.7, 9.8, 9.10
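The critical-value approach and the P-value approach always lead to the same decision. The sketch below shows both for a two-tailed z-test of a population mean, assuming the population standard deviation is known; all numeric inputs are hypothetical.

```python
import math
from statistics import NormalDist

def z_test_two_tailed(x_bar, mu0, sigma, n, alpha=0.05):
    """Return (z, critical value, P-value) for H0: mu = mu0 against Ha: mu != mu0."""
    z = (x_bar - mu0) / (sigma / math.sqrt(n))
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # rejection region: |z| > critical
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))    # probability of a result at least as extreme
    return z, critical, p_value

# Hypothetical data: sample mean 54 from n = 25, testing mu0 = 50 with sigma = 10.
alpha = 0.05
z, critical, p_value = z_test_two_tailed(54, 50, 10, 25, alpha)

reject_by_critical_value = abs(z) > critical
reject_by_p_value = p_value < alpha
print(z, round(critical, 3), round(p_value, 4), reject_by_critical_value, reject_by_p_value)
```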

• Hypothesis tests for one population mean when $\sigma^2$ is known
  - Examples 9.11, 9.13, and 9.14 (EXCEL)
• Hypothesis tests for one population mean when $\sigma^2$ is unknown
  - P-value for a t-test (small sample)
  - Examples 9.15, 9.16, and 9.17 (EXCEL)
• Non-parametric method
  - The Wilcoxon Signed-Rank Test
  - Example 9.18, 9.19
• Type II Error Probabilities
  - Power and Power Curves
  - Examples 9.23, 9.24, 9.25
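Power is one minus the Type II error probability $\beta$, evaluated at a particular true mean. The sketch below assumes a right-tailed z-test with known $\sigma$, for which the power at a true mean $\mu_a$ is $1 - \Phi\left(z_\alpha - (\mu_a - \mu_0)\sqrt{n}/\sigma\right)$; the numeric inputs are hypothetical.

```python
import math
from statistics import NormalDist

def power_right_tailed_z(mu_a, mu0, sigma, n, alpha=0.05):
    """Power of the right-tailed z-test of H0: mu = mu0 when the true mean is mu_a."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)
    shift = (mu_a - mu0) * math.sqrt(n) / sigma  # true mean offset in standard-error units
    return 1 - std_normal.cdf(z_alpha - shift)

# Hypothetical setting: mu0 = 50, sigma = 10, n = 25.
# Evaluating the power over a range of true means traces out a power curve.
for mu_a in (50, 52, 54, 56, 58):
    power = power_right_tailed_z(mu_a, 50, 10, 25)
    print(mu_a, round(power, 4), "beta =", round(1 - power, 4))
```

When the true mean equals $\mu_0$, the power reduces to the significance level $\alpha$, and it rises toward 1 as the true mean moves further into the alternative.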
• Which Procedure Should Be Used?
  - Table 9.18
• Lab Exercise (EXCEL)
  - Redo Case Study
  - Redo examples 9.21 and 9.22
  - Review Problems from Chapter 8
  - Review Problems from Chapter 9
• Sampling Distribution of the Difference
  - Case Study: HRT and Cholesterol
  - Examples: 10.1, 10.2
• Inferences for Two Population Means, Using Independent Samples: Standard Deviations Assumed Equal
  - Examples: 10.3, 10.4, 10.5 (Excel)
• Inferences for Two Population Means, Using Independent Samples: Standard Deviations Not Assumed Equal
  - Examples: 10.6, 10.7, 10.8 (Excel)
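The two procedures differ in how the standard error and the degrees of freedom are computed. The sketch below computes the test statistic and degrees of freedom for both: the pooled version assumes equal standard deviations, and the nonpooled (Welch) version does not. The data are made up, and the Welch degrees of freedom are left unrounded here, whereas some textbooks round them down.

```python
import math
import statistics

def pooled_t(x, y):
    """t statistic and degrees of freedom, standard deviations assumed equal."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)  # pooled variance
    t = (statistics.fmean(x) - statistics.fmean(y)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

def nonpooled_t(x, y):
    """t statistic and approximate degrees of freedom, standard deviations not assumed equal."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    se2 = v1 / n1 + v2 / n2
    t = (statistics.fmean(x) - statistics.fmean(y)) / math.sqrt(se2)
    df = se2**2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df

# Hypothetical independent samples, made up for illustration.
x = [5.1, 4.9, 5.6, 5.8, 6.0]
y = [4.1, 4.5, 4.4, 4.9, 4.0]
print(pooled_t(x, y))
print(nonpooled_t(x, y))
```

With equal sample sizes the two t statistics coincide and only the degrees of freedom differ; with unequal sizes and unequal spreads the two procedures can disagree noticeably.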
• The Mann-Whitney Test
  - Using the Mann-Whitney Table
  - Examples 10.10, 10.11, 10.12, 10.13 (Excel)
• Inferences for Two Population Means, Using Paired Samples
  - The paired t-test
  - Examples: 10.16, 10.17, 10.18 (Excel)
• The Paired Wilcoxon Signed-Rank Test
  - Examples: 10.19, 10.20 (Excel)
• Which Procedure Should Be Used?
• Inferences for One Population Standard Deviation
  - Case Study: Speaker Woofer Driver Manufacturing
  - Examples: 11.1, 11.3, 11.5, 11.7, 11.8 (Excel)
• Inferences for Two Population Standard Deviations, Using Independent Samples
  - Examples: 11.10, 11.12, 11.14, 11.6 (Excel)
• Confidence Intervals for One Population Proportion
  - Case Study: Health Care in the USA
  - Examples: 12.1, 12.3, 12.5
• Hypothesis Tests for One Population Proportion
  - Examples: 12.6, 12.7
• Inferences for Two Population Proportions
  - Examples: 12.8, 12.9, 12.10, 12.11
• Chi-Square Distribution
  - Case Study: Eye and Hair Color
• Chi-Square Goodness-of-Fit Test
  - Examples: 13.3, 13.4
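The goodness-of-fit statistic is $\chi^2 = \sum (O - E)^2 / E$ over the categories, with degrees of freedom equal to the number of categories minus one. The sketch below uses hypothetical counts from 120 rolls of a die tested against the fair-die model. The critical value 11.070 is the standard table entry for 5 degrees of freedom at the 5% level.

```python
def chi_square_gof(observed, expected_probs):
    """Return (chi-square statistic, degrees of freedom) for a goodness-of-fit test."""
    total = sum(observed)
    expected = [total * p for p in expected_probs]  # expected counts under the null model
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1

# Hypothetical counts for faces 1..6 in 120 rolls; the null model is a fair die.
observed = [18, 22, 20, 25, 15, 20]
stat, df = chi_square_gof(observed, [1 / 6] * 6)

critical_value = 11.070  # chi-square table value for df = 5, significance level 0.05
reject = stat > critical_value
print(round(stat, 3), df, reject)
```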
• Contingency Tables and Association
  - Examples: 13.5, 13.6, 13.7 (Excel)
• Chi-Square Independence Test
  - Examples: 13.9, 13.10
• Chi-Square Homogeneity Test
  - Examples: 13.12, 13.13

• Reviews for Quiz (2)
  - Review Problems from Chapter 10
  - Review Problems from Chapter 11
  - Review Problems from Chapter 12
  - Review Problems from Chapter 13