How To Conduct Good Experiments?
Ernesto Costa
DEI/CISUC
[email protected]
http://www.dei.uc.pt/~ernesto
© 2003 Ernesto Costa, Evonet Summer School - Parma

Summary
- What is the goal of this talk?
- Background
  - Probabilities
  - Random Variables and Probability Distributions
  - Inferential Statistics
- Applying the Theory
- Conclusions

What is the goal of this talk?
- I don't know! I have been asked to give a talk on that subject…
- I do know!
  - EC is (very much) an experimental discipline
  - Most of our work is to compare things
    - Algorithms
    - Parameter settings
  - What is a fair comparison?

What is the goal of this talk?
Looking at EC papers:
- One problem
- One run
- Several runs: 10, 20, 30?
- Use average values
- Use the average of the bests
- Use the mean
- Use the mean and the standard deviation
- Use confidence levels / intervals

What is the goal of this talk?
What is a good experiment?
- Identify independent and dependent variables
  - Mutation rate → fitness
  - Different crossover operators → fitness
  - Evolution and Learning → # of survivors
- Identify the conditions of the experiment
  - Initial conditions
  - Number of runs
  - Parameter settings
- Identify the kind of statistics you will need
  - Descriptive
  - Inferential
  - Non-parametric

Background
Probabilities
- Experiment: a procedure whose result varies and cannot be predicted ahead of time.
  - Tossing a coin, rolling a die
- Sample Space: the set of possible outcomes of an experiment.
  - {Heads, Tails}, {1,2,3,4,5,6}
- Event: a subset of the sample space.
  - {Heads}, {1,3,5}

Background
Probabilities
- Probability of an event: measures the likelihood that the event will occur
  - Tossing a (fair) coin: probability(outcome = heads) = 1/2
- Axioms
  - $P(E) \ge 0$
  - $P(S) = 1$
  - For mutually exclusive events: $P\left(\bigcup_{i=1}^{\infty} E_i\right) = \sum_{i=1}^{\infty} P(E_i)$

Background
Probabilities
- Example: when rolling two dice, what is the probability that the sum of the two outcomes equals 7? 1/6 (6 of the 36 equally likely outcomes sum to 7)
- Working methodology: Experiment → Sample Space → Event → Probability Assignment
[Figure: "Two Dice Experiment"; x-axis: Sum, y-axis: Number]

Probabilities
- Example: a family has two children. Knowing that one is a boy, what is the probability that they have two boys? 1/3
- Definition: let E and F be two events, with p(F) > 0. The conditional probability of E given F, p(E|F), is defined as:
  $$p(E \mid F) = \frac{p(E \cap F)}{p(F)}$$

Probabilities
- Example: a building has two lifts. One is used by 45% of the residents and the other by 55%. The first has problems 5% of the time, while the second can leave you in trouble 8% of the time. Knowing that one lift had a problem, what is the probability that it was lift number 1? 33.8%
- Bayes' Theorem:
  $$p(A_1 \mid B) = \frac{p(B \mid A_1)\,p(A_1)}{p(B)} = \frac{p(B \mid A_1)\,p(A_1)}{p(B \mid A_1)\,p(A_1) + p(B \mid A_2)\,p(A_2)}$$
- For the lifts: $\frac{0.05 \times 0.45}{0.05 \times 0.45 + 0.08 \times 0.55} = \frac{0.0225}{0.0665} \approx 0.338$

Random Variables and Probability Distributions
Random Variables
- Definition: a random variable, X, is a function from the sample space of an experiment to the set of real numbers.
[Figure: X maps each outcome s in the sample space S to a real number X(s) in S_X, e.g. 0, 1, 2, 3]
- A RV is a function… and is not random!!!

Random Variables and Probability Distributions
Working methodology: Experiment → Sample Space → Event → Probability Assignment → Random Variable → Probability Distribution
Example:
- Experiment: toss a coin (3x)
- Sample space: 8 possibilities
- Event: # heads
- Random variable: X(HHT) = 2
- Probability assignment: $f(x_i) = p(X = x_i)$
- Probability distribution: $X \to f(x_i)$

Random Variables and Probability Distributions
Example: suppose you toss a coin three times. Let X(t) denote the number of heads that appear when t is the result. Then X(t):
- X(HHH) = 3
- X(HHT) = X(HTH) = X(THH) = 2
- X(TTH) = X(THT) = X(HTT) = 1
- X(TTT) = 0
[Figure: "Probability Distribution"; x-axis: X = 0, 1, 2, 3, y-axis: $f(x_i)$]

Random Variables and Probability Distributions
Types of Random Variables
- Discrete: Probability Mass Function
  $$P(X = x) = p(x), \qquad 0 \le p(x) \le 1 \;\; \forall x$$
- Continuous: Probability Density Function (pdf) f(x)
  $$f(x) \ge 0 \;\; \forall x, \qquad \int_{-\infty}^{\infty} f(x)\,dx = 1, \qquad P(a \le X \le b) = \int_a^b f(x)\,dx$$
[Figure: density curve f(x) with the interval from x1 to x2 marked on the x-axis]

Random Variables and Probability Distributions
Measures of Random Variables
- Location: Mean
  $$\mu = E(X) = \sum_x x\,p(x) \quad \text{(discrete)}, \qquad \mu = E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx \quad \text{(continuous)}$$
- Dispersion: Variance
  $$V(X) = \sigma^2 = \sum_x (x - \mu)^2\,p(x) \quad \text{(discrete)}, \qquad V(X) = \sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx \quad \text{(continuous)}$$
- Standard Deviation
  $$\sigma = \sqrt{V(X)}$$

Random Variables and Probability Distributions
Independence of Random Variables
- Two random variables, X and Y, over the same sample space S, are said to be independent iff:
  $$p(X(s) = r_1 \wedge Y(s) = r_2) = p(X(s) = r_1) \cdot p(Y(s) = r_2)$$
- Product theorem (for independent X and Y): $E(X \cdot Y) = E(X) \cdot E(Y)$
- Sum theorem (for independent X and Y): $V(X + Y) = V(X) + V(Y)$

Random Variables and Probability Distributions
Discrete Probability Distributions
Binomial Distribution
- Domain: {0, 1, 2, …, n}
- Probability mass function (with q = 1 − p): $p(X = x_i) = p_i = C_i^n\, p^i q^{n-i}$
- Mean: $\mu = E(X) = \sum_{i=0}^{n} i\, C_i^n\, p^i q^{n-i} = np$
- Variance: $\sigma^2 = V(X) = E(X^2) - E(X)^2 = npq$
[Figure: two panels, "Binomial Distribution P=0.3" and "Binomial Distribution P=0.5"; x-axis: Values x, y-axis: Probability]

Random Variables and Probability Distributions
Discrete Probability Distributions
Poisson Distribution
- Approximates the Binomial Distribution, with $\lambda = np$
- Domain: {0, 1, 2, 3, …}
- Probability mass function: $p(X = i) = p_i = e^{-\lambda}\,\frac{\lambda^i}{i!}$
- Mean: $\lambda$
- Variance: $\lambda$
[Figure: two panels, "Poisson Distribution" with λ=6 and λ=8.4; x-axis: Values, y-axis: Probability]

Random Variables and Probability Distributions
Continuous Probability Distributions
Normal (Gaussian) Distribution
  $$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\; e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$
[Figure: density of N(3, 2)]
Standard Normal Distribution
[Figure: density of N(0, 1)]

Random Variables and Probability Distributions
Continuous Probability Distributions
Converting a normal distribution to a standard normal distribution
- X is a random variable with mean $\mu$ and standard deviation $\sigma$
- Using a translation (and a rescaling), define a new random variable:
  $$Z = \frac{X - \mu}{\sigma}$$

Random Variables and Probability Distributions
Continuous Probability Distributions
Student's t-Distribution
- Approaches the standard normal distribution N(0,1) as $\nu$ grows
  $$f(x) = \frac{1}{\sqrt{\nu}\; B\!\left(\frac{1}{2}, \frac{\nu}{2}\right)} \left(1 + \frac{x^2}{\nu}\right)^{-\frac{\nu + 1}{2}}$$
- Degrees of freedom (df): $\nu$
- Mean: 0, for $\nu > 1$
- Variance: $\nu/(\nu - 2)$, for $\nu > 2$
[Figure: t densities for ν=1, ν=5 and ν=10 compared with N(0,1)]

Background
Statistics
- Goal: to apply probability theory to data analysis
- How?
  - Model the data (population) by means of a probability distribution
  - Use a sample of the data instead of the whole population
  - Estimate the population parameters ($\mu$, $\sigma$, $p$) using the corresponding sample statistics ($\bar{x}$, $s$, $\hat{p}$)

    Population parameter    Sample statistic
    $\mu$                   $\bar{x}$
    $\sigma$                $s$
    $p$                     $\hat{p}$

Background
Statistics
- Unbiased estimator: a statistic whose mean value is equal to the population parameter being estimated
- Point estimators
- Interval estimators

Background
Sampling distribution of the sample mean and the Central Limit Theorem
Consider a population with mean $\mu$ and standard deviation $\sigma$. Let $\bar{X}$ denote the mean of the observations in random samples of size n. Then:
  $$\mu_{\bar{x}} = E(\bar{X}) = \mu, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
- When the population distribution is normal, the sampling distribution of $\bar{X}$ is also normal for any sample size n
- When n is sufficiently large (n > 30), the sampling distribution is well approximated by a normal curve, even if the population distribution is not itself normal (Central Limit Theorem)

Background
Sampling distribution of the sample mean
Unbiased estimators
- Mean: $\mu_{\bar{x}} = E(\bar{X}) = \mu$
- Standard deviation:
  $$\hat{\sigma}_x = s = \sqrt{\frac{\sum_i (x_i - \bar{x})^2}{n - 1}}$$
  where (n−1) is the number of degrees of freedom (df)

Background
Sampling distribution of the sample mean and the Central Limit Theorem
Consequence: for a large sample, or for a population whose distribution is normal,
  $$Z = \frac{\bar{X} - \mu_{\bar{x}}}{\sigma_{\bar{x}}}$$
has (approximately) a standard normal (Z) distribution.

Background
Confidence Intervals – one sample
Estimate the mean $\mu$ when:
- The population standard deviation, $\sigma$, is known;
- The sample mean from a random sample, $\bar{x}$, is known;
- The sample size is large (n > 30).
The one-sample Z confidence interval is
  $$\bar{x} \pm Z_{\text{critical value}} \cdot \frac{\sigma}{\sqrt{n}}$$
Example: for a 95% confidence interval, Z = 1.96.
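
As a minimal sketch of the one-sample Z interval above, the following Python snippet computes $\bar{x} \pm Z \cdot \sigma/\sqrt{n}$ using only the standard library. The function names `z_critical` and `z_confidence_interval`, and the numbers in the usage lines, are illustrative assumptions and not part of the talk.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value of N(0,1) for a given confidence level."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """One-sample Z interval: sample_mean +/- z * sigma / sqrt(n).

    Assumes sigma is the known population standard deviation
    and that the sample is large (n > 30).
    """
    half_width = z_critical(confidence) * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


# Illustrative numbers only (a hypothetical sample).
low, high = z_confidence_interval(sample_mean=105.3, sigma=8.4, n=40)
print(f"z for 95%: {z_critical(0.95):.2f}")
print(f"95% CI: ({low:.2f}, {high:.2f})")
```

Computing the critical value from the inverse normal CDF replaces the table lookup; for a 95% level it gives roughly the 1.96 quoted above.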

Background
Confidence Intervals – one sample
Example: we want a confidence level of 90%
- Look at N(0,1)
- For a CL of 90%, we have to isolate an area of 5% in each tail (left and right) of the bell-shaped normal distribution.
- The confidence interval is given by
  $$\bar{x} \pm Z_{0.1/2} \cdot \frac{\sigma}{\sqrt{n}}$$
- Looking up the value of Z in a table, we obtain Z = 1.65 (more precisely, 1.645)

Background
Confidence Intervals – one sample
What does it mean to have a 95% confidence interval?
- That there is a probability of 95% that the true (population) mean is in the interval? NO!!
- It means that 95% of all possible samples result in an interval that includes the true mean!

Background
Confidence Intervals – one sample
Estimate the mean $\mu$ when:
- The population standard deviation, $\sigma$, is NOT known;
- The sample mean from a random sample, $\bar{x}$, is known;
- The sample size is large (n > 30) OR the population distribution is normal.
The one-sample t confidence interval is
  $$\bar{x} \pm t_{\text{critical value}} \cdot \frac{s}{\sqrt{n}}$$
where the t critical value is based on (n−1) degrees of freedom (df).
Example: for a 95% confidence interval and 19 df, t = 2.09.
The Student t distribution can be used for small samples, assuming that the population distribution is approximately normal.

Background
Hypothesis Testing – one sample
- A hypothesis is a claim about the value of one or more population characteristics.
- A test procedure is a method for using sample data to decide between two competing claims about population characteristics ($\mu = 100$ or $\mu \ne 100$).
- Method by contradiction: we assume a particular hypothesis.
  Using the sample data, we try to find out whether there is convincing evidence to reject this hypothesis in favor of a competing one.

Background
Hypothesis Testing – one sample
- The null hypothesis, H0, is a claim about a population characteristic that is initially assumed to be true.
- Ha is the alternative hypothesis, or competing claim.
- Testing H0 versus Ha can lead to the conclusion that H0 must be rejected, or that we fail to reject H0. In the latter case we cannot say that H0 is accepted!

Background
Hypothesis Testing – one sample
Errors
- Type I error: rejecting H0 when H0 is true
  - The probability of a type I error, $\alpha$, is called the level of significance of the test.
- Type II error: failing to reject H0 when H0 is false
  - The probability of a type II error is denoted by $\beta$.
- There is a tradeoff between $\alpha$ and $\beta$: making the probability of a type I error very small increases the probability of a type II error.

Background
Hypothesis Testing – one sample
- Test statistic (Z, t): a function of the sample data on which the decision to reject or fail to reject H0 is based;
- p-value (observed significance level): the probability, assuming that H0 is true, of obtaining a test statistic at least as inconsistent with H0 as what actually resulted.
- Decision about H0: made by comparing the p-value with the chosen $\alpha$.
  Reject H0 if p-value $\le \alpha$.

Background
Hypothesis Testing – one sample
Hypothesis Testing – principles
- Identify the population parameter (mean, …)
- State H0 and Ha
- Define the significance level $\alpha$
- Check that the assumptions for the test are reasonable (big sample, …)
- Calculate the test statistic (Z, …)
- Calculate the associated p-value
- State the conclusion (reject H0 if p-value $\le \alpha$, …)

Background
Hypothesis Testing – one sample
Example
- Population parameter: the mean, $\mu$
- H0: $\mu = 100$, Ha: $\mu \ne 100$
- Significance level $\alpha = 0.01$
- n = 40 is large
- From the sample: $\bar{x} = 105.3$, $\sigma = 8.4$
  $$z = \frac{105.3 - 100}{8.4/\sqrt{40}} = 3.99$$
- From the z-curve we know that the p-value $\approx 0$
- Therefore the null hypothesis, H0, is rejected at a significance level of 0.01.

Background
Comparing Two Populations based on independent samples
Use the sampling distribution of the difference of the sample means, $\bar{x}_1 - \bar{x}_2$.
Properties
- The mean of the difference is equal to the difference of the means:
  $$\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$$
- The variance of the difference is equal to the sum of the individual variances. Thus, the standard deviation is:
  $$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
- The sampling distribution of the difference of the sample means can be considered approximately normal (when each n is large, or each sample mean comes from an (approximately) normal population).

Background
Confidence interval for the mean of $\bar{x}_1 - \bar{x}_2$, i.e. $\mu_1 - \mu_2$
Assumptions
- The two samples are independent random samples
- Sample sizes are both large (n > 30) OR the population distributions are (approximately) normal.
Formulas
  $$(\bar{x}_1 - \bar{x}_2) \pm t_{\text{critical value}} \cdot \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
  $$df = \frac{(V_1 + V_2)^2}{\frac{V_1^2}{n_1 - 1} + \frac{V_2^2}{n_2 - 1}}, \qquad \text{where } V_1 = \frac{s_1^2}{n_1}, \quad V_2 = \frac{s_2^2}{n_2}$$

Background
Hypothesis Test
Same procedure, only the formulas are different!
Z Test
- Large samples OR population distributions that are (at least approximately) normal
  $$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$

Background
Hypothesis Test
t Test
- Large samples OR normal population distributions, AND the random samples are independent
  $$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
  $$df = \frac{(V_1 + V_2)^2}{\frac{V_1^2}{n_1 - 1} + \frac{V_2^2}{n_2 - 1}}, \qquad \text{where } V_1 = \frac{s_1^2}{n_1}, \quad V_2 = \frac{s_2^2}{n_2}$$

Applying the Theory
The Busy Beaver Problem
- Two algorithms
  - A standard GA
  - A standard GA + local learning (Baldwin Effect)
- Goal: good quality machines
- Who is better? Comparing the means!
  - H0: $\mu_1 = \mu_2$ (no improvement!!!), Ha: $\mu_1 \ne \mu_2$
  - Significance level, $\alpha = 0.01$
  - Assuming that the population distributions are normal
  - Number of (independent) runs = 30 for each case
  - Use the t test

Applying the Theory
The Busy Beaver Problem
- From the samples (# good machines)
  - $\bar{x}_{sga} = 0.1$, $\bar{x}_{be} = 0.23$
  - $s_{sga}^2 = 0.093$, $s_{be}^2 = 0.185$
- From the formulas
  - df = 53
  - t = 1.35
  - p-value $\approx 2 \times 0.1 = 0.2$
- Conclusion
  - With $\alpha = 0.01$ and p-value = 0.2, the null hypothesis H0 cannot be rejected

Applying the Theory
Function Optimization
- Two different GAs applied to function optimization
  - A standard GA using 2-point crossover
  - A modified GA using transformation
- Goal: find the minimum
- The Schwefel Function
  - Minimum = 0
[Figure: surface plot of the Schwefel function for both variables in the range -500 to 500]

Applying the Theory
Function Optimization
- Who is better? Two-point crossover or transformation?
- Comparing the means of the best fit!
  - H0: $\mu_1 = \mu_2$ (no improvement!!!), Ha: $\mu_1 \ne \mu_2$
  - Significance level, $\alpha = 0.05$
  - Assuming the population distributions are normal
  - Number of (independent) runs = 30 for each case
  - Use the t test

Applying the Theory
Function Optimization
- From the samples (fitness of the best individuals)
  - $\bar{x}_{sga} = 5.4838$, $\bar{x}_{tr} = 0.0768$
  - $s_{sga}^2 = 149.788$, $s_{tr}^2 = 0.02958$
- From the formulas
  - df = 29
  - t = 2.42
  - p-value $\approx 2 \times 0.012 = 0.024$
- Conclusion
  - With $\alpha = 0.05$ and p-value = 0.024, the null hypothesis H0 is rejected.

Conclusions
- This is a very simple presentation
  - Assuming normal distributions
    - There are many others
    - In many situations we cannot assume a normal distribution
- Many things left unmentioned
  - More than two populations: Analysis of Variance (ANOVA)
  - Regression and correlation
  - Non-parametric methods

Want to know more?
- Paul Cohen, Empirical Methods for Artificial Intelligence, MIT Press, Cambridge, MA, 1995.
- James Kennedy and Russell Eberhart, Swarm Intelligence (Appendix A), Morgan Kaufmann, 2001.
- Roxy Peck, Chris Olsen and Jay Devore, Introduction to Statistics and Data Analysis, Duxbury, 2001.
- Mark Wineberg and Steffen Christensen, Using Appropriate Statistics, GECCO'2003 Tutorial.
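
The two comparisons in "Applying the Theory" can be re-derived from the summary statistics quoted on the slides. The sketch below is one possible way to do it in plain Python: it computes the two-sample t statistic and the degrees of freedom from the formulas in the talk, and approximates the two-sided p-value by numerically integrating the Student t density with Simpson's rule. The numerical integration is a stand-in for the table lookup used in the talk, and the function names (`welch_t`, `t_pdf`, `two_sided_p`) are hypothetical.

```python
from math import exp, lgamma, log, pi, sqrt


def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Return (t, df) for two independent samples given summary statistics."""
    v1, v2 = var1 / n1, var2 / n2
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    central_mass = 2 * total * h / 3  # P(-|t| < T < |t|)
    return 1 - central_mass


# Summary statistics as quoted on the slides (30 runs per algorithm):
# (mean1, variance1, n1, mean2, variance2, n2)
cases = {
    "Busy Beaver": (0.23, 0.185, 30, 0.1, 0.093, 30),
    "Function Optimization": (5.4838, 149.788, 30, 0.0768, 0.02958, 30),
}
for name, stats in cases.items():
    t, df = welch_t(*stats)
    print(f"{name}: t = {t:.2f}, df = {df:.1f}, p = {two_sided_p(t, df):.3f}")
```

With these rounded inputs the t values should come out at about 1.35 and 2.42, as on the slides, and the p-values should land close to the quoted 0.2 and 0.024. The degrees of freedom computed for the Busy Beaver case may differ slightly from the slide's 53 because the inputs are rounded; the conclusions are unaffected.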