Summary
* Trend analysis
  * Dependency between two variables
* Simple linear regression model
  * Scatter plot and bivariate regression model
  * Equation and estimation
* Least squares method
  * Understand the process of derivation
  * What is the method of least squares?
  * Estimation of the unknowns a, b
* Coefficient of determination
  * Meaning of the coefficient
  * How to interpret it

GIS and Spatial Analysis

HYPOTHESIS TESTERS I
Geography, KHU, Jinmu Choi
1. Probability Concept (3)
2. Probability Functions (4)
   * Binomial Distribution
   * Poisson Distribution
   * Normal Distribution
3. Confidence Intervals (2)
4. Hypothesis Testing (2)
Summary and Next…

Probability Issue
* Sampling error
  * Statistics calculated from the sample are not identical to the parameters calculated from the population
* Statistical test
  * Checks whether the difference is too large to be considered a sampling error
* Comparison of two samples
  * Do the two samples come from the same population or from different populations?
  * If they come from the same population, the difference is statistically insignificant

Probability Concept
* Question: how common is the event (theft or burglary)?
* Sample: four observations
* Sample space: all possible outcomes (16 cases)
* Random variable: the presence of theft, Y

Event Sequence
* Order is important because each event has a different location
* Permutation: to make an order
  * r is the number of choices, n is the number of observations
  * Yes-or-no case of theft for 4 houses: $r^n = 2^4 = 16$
* Combination: to choose some cases
  * $C(n, r) = \binom{n}{r} = \frac{n!}{r!(n-r)!}$
  * Select r objects out of a total of n objects

Probability Functions
* Frequency distribution
  * From the number of each case
  * How likely is a low or high number of Y's during the survey?
* Probability distribution
  * Frequency / total possible outcomes
* Probability function: a mathematical function used to represent a probability distribution
  * Probability distribution with a die?
  * Discrete variable vs. continuous variable

Binomial Distribution
* Characteristics
  * Discrete random variable with only two possible outcomes
  * The probability of success or failure for each trial is known, and is the same for all trials
  * All trials are independent
  * All trials are subject to the same probability distribution
  * The number of trials is fixed or known
* Computation with a given probability p for a trial:
  $p(x) = \binom{n}{x} p^x (1-p)^{n-x}$

Poisson Distribution
* Characteristics
  * Count variable; used for testing spatial point patterns
  * Long-term average: number of thefts in the past 10 years
  * The probability of having a given number of occurrences
    * With an average of 4 thefts in a year, calculate the probability of 1 (2, 3, 4, 5, 6, …) thefts per year
  * The events should be random and independent
* Computation of the probability from the average occurrence (λ) and the number of occurrences (x):
  $p(x) = \frac{e^{-\lambda} \lambda^x}{x!}$

Normal Distribution
* Characteristics
  * Continuous variable
  * Computation with mean μ and standard deviation σ:
    $p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(x-\mu)^2 / (2\sigma^2)}$
* Z-score: used to compare normal distributions
  * Standard normal distribution: convert each normal distribution using the z-score
    $z_i = \frac{x_i - \bar{x}}{S}$

Central Limit Theorem
How much difference between sample means is reasonable when the samples are supposed to come from the same population?
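One way to get a feel for this question is a small simulation. The Python sketch below is only an illustration: the exponential population with mean 4, the sample size of 30, and the 2000 sample sets are assumptions chosen for the example, not values from the lecture.

```python
import random
import statistics

# Assumed population for illustration: exponential with mean 4.
# For an exponential distribution the standard deviation equals the mean.
POP_MEAN = 4.0
POP_SD = 4.0


def sample_means(k, n, seed=0):
    """Draw k independent samples of size n and return their k means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(k):
        sample = [rng.expovariate(1.0 / POP_MEAN) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


K, N = 2000, 30  # hypothetical number of sample sets and sample size
means = sample_means(K, N)

mean_of_means = statistics.fmean(means)  # expected to be close to POP_MEAN
sd_of_means = statistics.stdev(means)    # expected to be close to POP_SD / sqrt(N)
standard_error = POP_SD / N ** 0.5

# Share of sample means within 1.96 standard errors of the population mean.
share_within = sum(
    abs(m - POP_MEAN) <= 1.96 * standard_error for m in means
) / K

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"sd of sample means:   {sd_of_means:.3f} (S / sqrt(n) = {standard_error:.3f})")
print(f"share within 1.96 SE: {share_within:.3f}")
```

Even though the assumed population is strongly skewed, the sample means cluster around the population mean, and roughly 95% of them should fall within 1.96 standard errors of it. A difference between two sample means that is much larger than this spread is hard to attribute to sampling error alone.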
* iid property of observations in the population: $X_i$ is independent of $X_j$ when $i \neq j$
  * K sets of sample X
  * Independent: the population contains many observations, so they are independent of each other
  * Identically distributed: all observations come from the same distribution (population)
* The central limit theorem tells us:
  * With enough sample sets, the mean of the sample means approaches the true population mean
  * The sample means follow a normal distribution

Confidence Intervals
* Standard error: standard deviation of the sample means, $S / \sqrt{n}$
  * Estimated from only one set of sample observations
  * Used to determine the likelihood that the true population mean is within a given range around a sample mean
  * Follows the standard normal distribution
* Confidence interval
  * Significance level: $\alpha = 1 - \text{confidence level}$
  * The number of standard errors is the z-score of a given sample mean
    * 90%: 1.645, 95%: 1.96, 99%: 2.576
  * $\mathrm{prob}\left(\bar{x} - 1.96 \frac{S}{\sqrt{n}} \le \mu \le \bar{x} + 1.96 \frac{S}{\sqrt{n}}\right) = 0.95$

Hypothesis Testing
* Question: do the two samples come from different populations?
* Testing procedure
  * Null hypothesis (H0): the hypothesis we want to reject
  * Choose and compute the suitable test statistic
    * To compare two sample means with small sample sizes: t-test
    * To compare two frequency distributions: the chi-square statistic
  * Determine the probability of rejecting the null hypothesis
    * Rejecting H0 supports the alternative hypothesis (Ha or H1)
    * Reject if: statistic value > critical value at the significance level
  * Conclusion: reject or not
* Type I error: H0 is true, but we reject it
* Type II error: H0 is false, but we fail to reject it

Summary
* Probability concepts
  * How common the event is
  * Permutation or combination
* Probability functions
  * Binomial distribution: binary category data
  * Poisson distribution: count data
  * Normal distribution: continuous variable with real-valued data
* Confidence intervals
  * Central limit theorem: the sample mean approaches the population mean
  * The standard error is measured as a z-score in the standard normal distribution
  * Confidence level: 1 - significance level
* Hypothesis testing
  * Null hypothesis -> test statistic -> probability at the significance level -> reject or not

Next
* Lab 6: Standard Normal Probability and Test of Variance
* Lecture 7: Hypothesis Testers 2
  * Difference in variance and mean (Ch 4, pp. 146-164)
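As a recap of the probability functions and the confidence-interval formula summarized in this lecture, the following Python sketch evaluates each of them with the standard library. The function names are hypothetical, and the inputs for the binomial and confidence-interval calls (p = 0.5, a sample mean of 10, S = 2, n = 100) are assumptions made up for illustration; only the Poisson average of 4 thefts per year comes from the slides.

```python
import math


def binomial_pmf(x, n, p):
    """Probability of x successes in n independent trials with success probability p."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """Probability of x occurrences when the long-term average is lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)


def z_score(x, mean, s):
    """Standardize x using a mean and a standard deviation."""
    return (x - mean) / s


def confidence_interval(mean, s, n, z=1.96):
    """Interval of z standard errors (S / sqrt(n)) around a sample mean."""
    se = s / math.sqrt(n)
    return mean - z * se, mean + z * se


# Binomial: theft in exactly 2 of 4 houses, assuming p = 0.5 per house.
print("binomial P(2 of 4):", binomial_pmf(2, 4, 0.5))

# Poisson: average of 4 thefts per year, probability of 1 to 6 thefts.
for x in range(1, 7):
    print(f"poisson P({x} thefts):", round(poisson_pmf(x, 4), 4))

# Normal: density at the mean of a standard normal distribution.
print("normal pdf at 0:", round(normal_pdf(0, 0, 1), 4))

# 95% confidence interval for assumed sample values.
low, high = confidence_interval(mean=10, s=2, n=100)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

The default `z=1.96` corresponds to the 95% level; passing 1.645 or 2.576 gives the 90% and 99% intervals.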