Probability Functions

Summary
- Trend Analysis
  - Dependency between two variables
- Simple Linear Regression Model
  - Scatter plot and bivariate regression model
  - Equation and estimation
- Least Squares Method
  - Understand the process of derivation
  - What is the method of least squares
  - Estimation of unknown a, b
- Coefficient of Determination
  - Meaning of the coefficient
  - How to interpret it
GIS and Spatial Analysis
HYPOTHESIS TESTERS I
Geography, KHU
Jinmu Choi

1. Probability Concept (3)
2. Probability Functions (4)
   - Binomial Distribution
   - Poisson Distribution
   - Normal Distribution
3. Confidence Intervals (2)
4. Hypothesis Testing (2)
Summary and Next…
Probability Issue
- Sampling Error
  - Statistics calculated from the sample are not identical to the parameters calculated from the population
- Statistical Test
  - Asks whether the difference is too large to be considered a sampling error
- Comparison of two samples
  - Do the two samples come from the same population or from different populations?
  - If they come from the same population, the difference is statistically insignificant
Probability Concept
- Question: How common is the event?
  - Event: theft or burglary
  - Sample: four observations
  - Sample space: all possible outcomes, 16 cases
  - Random variable Y: the presence of theft
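The 16 cases can be listed explicitly. The sketch below assumes each of the four observations is recorded as "Y" (theft) or "N" (no theft); the names `OUTCOMES`, `sample_space`, and `y_values` are illustrative and do not come from the slides.

```python
from itertools import product

# Assumption: each of the four observed houses is either "Y" (theft) or "N" (no theft).
OUTCOMES = ("Y", "N")
N_OBSERVATIONS = 4

# Every ordered sequence of four Y/N answers is one point in the sample space.
sample_space = list(product(OUTCOMES, repeat=N_OBSERVATIONS))

# The random variable Y maps each outcome to the number of thefts it contains.
y_values = [outcome.count("Y") for outcome in sample_space]

print(len(sample_space))   # number of cases in the sample space
print(sample_space[:3])    # first few ordered outcomes
print(sorted(set(y_values)))  # values the random variable can take
```

Treating Y as a count (0 to 4 thefts) is one reading of "the presence of theft"; a per-house indicator would work equally well.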
Event Sequence
- Order is important because an event can occur at different locations
- Permutation: to make an order
  - r is the number of choices, n is the number of observations
  - Yes/no case with theft at 4 houses: r = 2, n = 4
  - Number of ordered sequences: $r^{n}$
- Combination: to choose some cases
  - $C(n, r) = \binom{n}{r} = \frac{n!}{r!\,(n - r)!}$
  - Select r objects out of a total of n objects
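A small check of both counts, as a sketch. It assumes the yes/no reading of the example (r = 2 choices, n = 4 houses); the house labels and the helper `n_combinations` are hypothetical names used only for illustration.

```python
from itertools import combinations, product
from math import comb, factorial

r_choices = 2        # yes / no
n_observations = 4   # four houses

# Ordered sequences: each of the n observations takes one of r values.
n_sequences = r_choices ** n_observations
enumerated = list(product("YN", repeat=n_observations))

def n_combinations(n: int, r: int) -> int:
    """Number of ways to select r objects out of n, ignoring order."""
    return factorial(n) // (factorial(r) * factorial(n - r))

# Which 2 of the 4 houses were hit? Order does not matter here.
houses = ["A", "B", "C", "D"]   # hypothetical labels
pairs = list(combinations(houses, 2))

print(n_sequences, len(enumerated))   # 16 16
print(n_combinations(4, 2))           # 6
print(n_combinations(4, 2) == comb(4, 2) == len(pairs))   # True
```

The factorial formula and the standard-library `math.comb` give the same count; the hand-written version is kept only to mirror the formula on the slide.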
Probability Functions
- Frequency distribution
  - Built from the number of occurrences of each case
  - Shows how likely it is to get a low or high number of Y's during the survey
- Probability distribution
  - Frequency / total number of possible outcomes
  - Probability function: a mathematical function used to represent a probability distribution
- Probability distribution with a die?
- Discrete variable vs. continuous variable
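The step from frequency to probability can be sketched for the four-house example. This assumes all 16 yes/no sequences are equally likely (that is, theft and no theft each have probability 0.5), which the slides do not state; the fair die at the end is likewise an assumption.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space of four yes/no observations (assumed equally likely).
sample_space = list(product("YN", repeat=4))

# Frequency distribution: how many cases contain 0, 1, 2, 3, 4 Y's.
frequency = Counter(outcome.count("Y") for outcome in sample_space)

# Probability distribution: frequency / total number of possible outcomes.
total = len(sample_space)
probability = {y: Fraction(frequency[y], total) for y in sorted(frequency)}

for y, p in probability.items():
    print(f"Y = {y}: frequency {frequency[y]}, probability {p}")

# A fair die is the simplest discrete probability distribution: 1/6 per face.
die = {face: Fraction(1, 6) for face in range(1, 7)}
print(sum(die.values()))   # probabilities of a distribution sum to 1
```

Exact fractions are used instead of floats so that the probabilities visibly sum to 1.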
Binomial Distribution
- Characteristics
  - Discrete random variable with only two possible outcomes
  - The probability of success or failure for each trial is known, and is the same for all trials
    - All trials are independent
    - All trials are subject to the same probability distribution
    - The number of trials is fixed or known
- Computation with a given probability p for a trial
  $p(x) = \binom{n}{x} p^{x} (1 - p)^{n - x}$
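The binomial formula translates directly into code. In this sketch the theft probability `p_theft = 0.25` is a made-up value for illustration, and `binomial_pmf` is a hypothetical helper name.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n_houses = 4
p_theft = 0.25   # assumed per-house probability, not taken from the slides

pmf = [binomial_pmf(x, n_houses, p_theft) for x in range(n_houses + 1)]
for x, prob in enumerate(pmf):
    print(f"P(X = {x}) = {prob:.4f}")

print(f"sum = {sum(pmf):.4f}")   # the probabilities over x = 0..n sum to 1
```

With p = 0.5 the same function reproduces the 1, 4, 6, 4, 1 out of 16 pattern of the equally likely sample space.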
Poisson Distribution
- Characteristics
  - Count variable; used for testing spatial point patterns
  - Long-term average: e.g. the number of thefts in the past 10 years
  - Gives the probability of having a given number of occurrences
    - With an average of 4 thefts in a year, calculate the probability of 1 (2, 3, 4, 5, 6, …) thefts per year
  - The events should be random and independent
- Computation of the probability from the average occurrence (λ) and the number of occurrences (x)
  $p(x) = \frac{e^{-\lambda} \lambda^{x}}{x!}$
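The theft example (an average of 4 per year) can be worked through with a few lines. This is a sketch; `poisson_pmf` and `lam` are illustrative names.

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """Probability of exactly x occurrences when the long-term average is lam."""
    return exp(-lam) * lam**x / factorial(x)

lam = 4.0   # average number of thefts per year, from the slide's example

for x in range(1, 7):
    print(f"P(X = {x}) = {poisson_pmf(x, lam):.4f}")

# Summing over a long range of counts should give a total very close to 1.
total = sum(poisson_pmf(x, lam) for x in range(0, 60))
print(f"total = {total:.6f}")
```

Note that for λ = 4 the counts 3 and 4 are equally likely, since moving from x = 3 to x = 4 multiplies the probability by λ/4 = 1.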
Normal Distribution
- Characteristics
  - Continuous variable
  - Computation with mean μ and standard deviation σ
    $p(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}}\, e^{-(x - \mu)^{2} / 2\sigma^{2}}$
- Z-score: to compare normal distributions
  - Standard normal distribution: convert each normal distribution using the z-score
    $z_i = \frac{x_i - \bar{x}}{S}$
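Both formulas can be sketched in a few lines. The sample values are hypothetical, and S is taken here to be the sample standard deviation (n - 1 in the denominator), which is an assumption about the slide's notation.

```python
from math import exp, pi, sqrt
from statistics import NormalDist, mean, stdev

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of the normal distribution with mean mu and std. sigma at x."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / sqrt(2 * pi * sigma**2)

def z_scores(values: list[float]) -> list[float]:
    """Convert each observation to z = (x_i - x_bar) / S."""
    x_bar = mean(values)
    s = stdev(values)   # sample standard deviation (assumed meaning of S)
    return [(x - x_bar) / s for x in values]

sample = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]   # hypothetical observations
z = z_scores(sample)

print(normal_pdf(1.0, 0.0, 2.0))
print(NormalDist(mu=0.0, sigma=2.0).pdf(1.0))   # standard-library cross-check
print([round(v, 3) for v in z])
```

After the conversion the values have mean 0 and standard deviation 1, which is what makes samples from different normal distributions comparable.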
Central Limit Theorem
- How much difference between sample means is reasonable when the samples are supposed to come from the same population?
- iid property of the observations in the population
  - $X_i$ and $X_j$ are independent when $i \neq j$
  - K sets of sample X
  - Independent: the population contains many observations, so the observations are independent of each other
  - Identical distribution: all observations come from the same distribution (population)
- The central limit theorem tells us
  - With enough sample sets, the mean of the sample means approaches the true population mean
  - The sample means follow a normal distribution
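A short simulation can illustrate both statements. Everything here is an assumption made for the sketch: the population is Uniform(0, 1) (mean 0.5, standard deviation sqrt(1/12)), K = 2000 sample sets are drawn, and each set has n = 30 observations.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)   # fixed seed so the sketch is reproducible

K_SAMPLE_SETS = 2000        # assumed number of sample sets
N_PER_SAMPLE = 30           # assumed observations per sample set
POPULATION_MEAN = 0.5       # mean of Uniform(0, 1)
POPULATION_STD = sqrt(1 / 12)

# Draw K iid sample sets and keep only the mean of each one.
sample_means = [
    mean([random.random() for _ in range(N_PER_SAMPLE)])
    for _ in range(K_SAMPLE_SETS)
]

mean_of_means = mean(sample_means)
spread_of_means = stdev(sample_means)
expected_spread = POPULATION_STD / sqrt(N_PER_SAMPLE)

# If the sample means are roughly normal, about 95% lie within 1.96 spreads.
within = sum(abs(m - POPULATION_MEAN) <= 1.96 * expected_spread for m in sample_means)
share_within = within / K_SAMPLE_SETS

print(f"mean of sample means: {mean_of_means:.4f}")
print(f"spread of sample means: {spread_of_means:.4f} (theory {expected_spread:.4f})")
print(f"share within 1.96 spreads: {share_within:.3f}")
```

The population here is flat, not bell-shaped, yet the sample means cluster around 0.5 in an approximately normal way.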
Confidence Intervals
- Standard error: standard deviation of the sample means, $S / \sqrt{n}$
  - Estimated from only one set of sample observations
  - Used to determine the likelihood that the true population mean is within a given range around a sample mean
  - The standardized sample mean follows the standard normal distribution
- Confidence interval
  - Significance level: $\alpha = 1 - \text{confidence level}$
  - The standard error is the unit of the z-score of a given sample mean: $z = \frac{\bar{x} - \mu}{S / \sqrt{n}}$
    - 90%: 1.645, 95%: 1.96, 99%: 2.576
  $\mathrm{prob}\left(\bar{x} - 1.96\,\frac{S}{\sqrt{n}} \le \mu \le \bar{x} + 1.96\,\frac{S}{\sqrt{n}}\right) = 0.95$
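The interval can be computed from a single sample as follows. This is a sketch under stated assumptions: the yearly theft counts are invented, `confidence_interval` is a hypothetical helper, and the z-based interval is used as on the slide (for a sample this small a t-based interval would usually be wider).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def confidence_interval(sample: list[float], confidence: float = 0.95) -> tuple[float, float]:
    """z-based interval: x_bar +/- z * S / sqrt(n)."""
    n = len(sample)
    x_bar = mean(sample)
    standard_error = stdev(sample) / sqrt(n)
    alpha = 1 - confidence                      # significance level
    z = NormalDist().inv_cdf(1 - alpha / 2)     # two-sided critical z-score
    margin = z * standard_error
    return x_bar - margin, x_bar + margin

# Critical z-scores for the usual confidence levels.
for level in (0.90, 0.95, 0.99):
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    print(f"{level:.0%}: z = {z:.3f}")

thefts_per_year = [3, 5, 4, 6, 2, 4, 5, 3, 4, 4]   # hypothetical data
low, high = confidence_interval(thefts_per_year, confidence=0.95)
print(f"95% interval for the mean: ({low:.2f}, {high:.2f})")
```

`statistics.NormalDist` requires Python 3.8 or later.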
Hypothesis Testing
- Tests whether the two samples come from different populations
- Testing procedure
  - State the null hypothesis (H0): the hypothesis we want to reject
    - Rejecting it supports the alternative hypothesis (Ha or H1)
  - Choose and compute a suitable test statistic
    - To compare two sample means with small sample sizes: t-test
    - To compare two frequency distributions: the chi-square statistic
  - Determine the probability for rejecting the null hypothesis
    - Reject when the statistic value > the critical value at the significance level α
  - Conclusion: reject or not
    - Type I error: H0 is true, but it is rejected
    - Type II error: H0 is false, but we fail to reject it
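The procedure can be sketched with a chi-square statistic. The example compares an observed frequency distribution with the one expected under H0 (a fair die), which is a simpler case than comparing two samples; the roll counts are invented, and the critical value 11.070 is the standard table value for 5 degrees of freedom at α = 0.05.

```python
# Hypothetical data: 60 rolls of a die, counts for faces 1..6.
observed = [8, 9, 12, 11, 6, 14]

# H0: the die is fair, so every face is expected equally often.
expected = [sum(observed) / len(observed)] * len(observed)

def chi_square_statistic(observed: list[float], expected: list[float]) -> float:
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Critical value for df = 6 - 1 = 5 at significance level 0.05 (table value).
CRITICAL_VALUE_DF5_ALPHA_05 = 11.070

statistic = chi_square_statistic(observed, expected)
reject_h0 = statistic > CRITICAL_VALUE_DF5_ALPHA_05

print(f"chi-square = {statistic:.2f}, reject H0: {reject_h0}")
# prints: chi-square = 4.20, reject H0: False
```

Failing to reject H0 here does not prove the die is fair; if it were in fact biased, this conclusion would be a Type II error.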
Summary
- Probability concepts
  - How common the event is
  - Permutation or combination
- Probability functions
  - Binomial distribution: binary category data
  - Poisson distribution: count data
  - Normal distribution: continuous variable with real-valued data
- Confidence intervals
  - Central limit theorem: the mean of the sample means approaches the population mean
  - The standard error is the unit of the z-score of a sample mean in the standard normal distribution
  - Confidence level = 1 - significance level
- Hypothesis testing
  - Null hypothesis -> test statistic -> probability at the significance level -> reject or not
Next
- Lab 6: Standard Normal Probability and Test of Variance
- Lecture 7: Hypothesis Testers 2
  - Difference in variance and mean (Ch 4, pp. 146-164)