Inferential statistics
PSY 4010

Central concepts in inferential statistics:
- Sampling error
- Sampling distribution
- Standard error
- Null hypothesis and alternative hypothesis
- Level of significance
- Type I and Type II error
- One-tailed and two-tailed tests
- Degrees of freedom
- Parametric and non-parametric statistical tests
- Effect size

Sample and population
[Figure: a sample drawn from a population]

Example: mean IQ score in population and sample
- The population mean IQ score equals 100 (μ = 100) and the standard deviation is 15 (σ = 15)
- You draw three samples, each consisting of 25 randomly selected persons from this population, and estimate the mean IQ score ($\bar{X}$) in each sample:

                      Sample 1        Sample 2        Sample 3
  Mean ($\bar{X}$)    103             101             98
  Sampling error      103 - 100 = 3   101 - 100 = 1   98 - 100 = -2

- Sampling error: chance alone makes the sample mean deviate from the population mean

Sampling distribution and standard error
- Sampling distribution: the distribution of the mean values of an infinite number of samples of the same size drawn from the same population
- It can also be built from measures other than mean values, e.g. correlation coefficients or regression coefficients
- The standard deviation of such a sampling distribution is called the standard error
- A very important measure: an estimate of the variability in mean scores due to chance (sampling error)

Standard error
The standard error is a function of two things:
- σ: how large the standard deviation in the population is
- N: the size of the sample

$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{N}}$

Examples based on samples drawn from a population with a standard deviation (σ) of 15 (and a mean of 100):
- N = 9: $\sigma_{\bar{X}} = \frac{15}{\sqrt{9}} = \frac{15}{3} = 5$
- N = 25: $\sigma_{\bar{X}} = \frac{15}{\sqrt{25}} = \frac{15}{5} = 3$
- N = 100: $\sigma_{\bar{X}} = \frac{15}{\sqrt{100}} = \frac{15}{10} = 1.5$

Sampling distribution at different sample sizes
[Figure: sampling distributions for an infinite number of samples randomly drawn from a population with μ = 100 and standard deviation = 15, shown for N = 100, N = 25 and N = 9. Horizontal axis: sample mean, from 85 to 115. The larger the sample, the narrower the distribution around μ.]

Sampling distribution and standard error
[Figure: normal curve of the sampling distribution, horizontal axis from $-3\sigma_{\bar{X}}$ to $+3\sigma_{\bar{X}}$ around μ. 50% of the sample means are below the population mean and 50% are above. On each side of μ: 34.1% of the sample means lie within one standard error, 13.6% between one and two, 2.2% between two and three, and 0.1% beyond three.]

Example: IQ and breast-feeding
- The population mean IQ score for 12-year-olds is 100 and the standard deviation is 15
- A researcher suspects that breast-feeding can affect IQ
- A sample of 25 12-year-olds who were breast-fed up to six months of age has a mean IQ score of 103
- How probable is it that $\bar{X} = 103$ is due to sampling error?

Hypothesis testing
- Null hypothesis (H0): the population of children breast-fed up to 6 months of age does not have a different mean IQ score from other children. I.e. the difference from the population mean score is due to sampling error
- Alternative hypothesis (H1): the population of children breast-fed up to 6 months of age does have a different mean IQ score from the population of other children
- How probable is it to obtain a difference of 3 points or more in mean score due to sampling error (pure chance)? This probability is referred to as the p-value

Sampling distribution when the standard error equals 3
$\sigma_{\bar{X}} = \frac{15}{\sqrt{25}} = 3$
[Figure: the same normal curve with sample means on the horizontal axis: 91 (-3), 94 (-2), 97 (-1), 100 (μ), 103 (+1), 106 (+2), 109 (+3). The sample mean $\bar{X} = 103$ lies one standard error above μ.]

How probable is it that the result is due to random variation (sampling error)?
In our example, a sample mean of 103 or higher will appear in 15.9% (p = 0.159) of all the N = 25 samples we draw from a population with μ = 100 and σ = 15.
Thus, the probability of obtaining such a result through sampling error alone is 15.9%.

Significance level
The limit we set in order to reject H0 is called the significance level (α):
- Convention: if the probability that the result is due to sampling error is less than 5%, we reject the null hypothesis
- If that probability is 5% or more, we keep H0
- We usually symbolize this as α = 0.05
- We can also set the level to 1% or lower (α = 0.01)
- Based on the results, we ... keep H0

One-tailed and two-tailed tests
- A one-tailed test: the difference is in an expected direction. H1: (the population of) children who are breast-fed up to 6 months of age have a higher mean IQ score than other children
- A two-tailed test: H1: (the population of) children who are breast-fed up to 6 months of age have a different mean IQ score than other children. (Thus, we allow for the possibility that the mean IQ score of breast-fed children can be either lower or higher than in the population of other children)
- It is important to decide on a one-tailed or a two-tailed test before the test is conducted!

Consequences of choosing a one-tailed or a two-tailed test
[Figure: rejection areas under the normal curve. One-tailed test: 5% in one tail, critical value 1.65. Two-tailed test: 2.5% in each tail, critical values -1.96 and 1.96.]

Task 1
We have now increased our sample to 100 children who have been breast-fed up to 6 months of age. The sample's mean IQ score is the same: 103.
If you choose a level of significance of 5% (α = .05), do you reject or keep H0?
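The standard error and p-value calculations from the IQ examples can be reproduced with a short script. This is a minimal sketch using only the Python standard library; the function names `standard_error` and `one_sample_z` are illustrative choices, not part of the lecture, and the test assumes the population standard deviation is known and the sampling distribution is normal.

```python
from math import sqrt
from statistics import NormalDist


def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(N)."""
    return sigma / sqrt(n)


def one_sample_z(sample_mean, mu, sigma, n):
    """Return (z, one-tailed p, two-tailed p) for a sample mean.

    Assumes the population mean (mu) and standard deviation (sigma)
    are known, as in the IQ example.
    """
    z = (sample_mean - mu) / standard_error(sigma, n)
    # Area in the tail beyond |z| in the observed direction
    p_one_tailed = 1.0 - NormalDist().cdf(abs(z))
    # A two-tailed test counts the same area in both tails
    p_two_tailed = 2.0 * p_one_tailed
    return z, p_one_tailed, p_two_tailed


# Standard errors for the three sample sizes used in the slides
for n in (9, 25, 100):
    print(f"N = {n:3d}: standard error = {standard_error(15, n):.2f}")

# Breast-feeding example: sample mean 103, N = 25, mu = 100, sigma = 15
z, p_one, p_two = one_sample_z(103, 100, 15, 25)
print(f"z = {z:.2f}, one-tailed p = {p_one:.3f}, two-tailed p = {p_two:.3f}")
```

Calling `one_sample_z(103, 100, 15, 100)` gives the numbers needed to check an answer to Task 1.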
Type I and Type II error
We can never be 100% sure that we do the right thing when rejecting or keeping H0:

                         In the population (the true world)
                         H0 is true                      H0 is false
  Decision: Keep H0      Correct decision                Type II error
  Decision: Reject H0    Type I error (probability = α)  Correct decision

- H0 is true: the sample value is due to sampling error
- H0 is false: the sample is drawn from a population with a different mean value

THUS: we do not say that H0 is true or false, or that H1 is so. We only say that we keep or reject H0.

What do we do when we do not know the population values?
- We use the sample's standard deviation (s) as an estimate of the population's standard deviation (σ)
- Standard error if we know the population standard deviation: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{N}}$
- Standard error if we do not know the population standard deviation: $s_{\bar{X}} = \frac{s}{\sqrt{N}}$
- In practice, (small) samples often underestimate the standard deviation in the population
- Therefore, this is taken into account in the significance test that is applied
- Most commonly used: the Student t distribution

The Student t distribution
- The Student t distribution is different for different sample sizes
- Sample sizes are represented as degrees of freedom (df)
- df = N - 1
- A sample of 10 has (10 - 1) = 9 degrees of freedom
- We must take this into consideration
- The more degrees of freedom, the more closely the Student t distribution resembles the Z distribution

The t distribution
Different sample sizes (df) have different critical values.

When the population's standard deviation is not known
Example: does drivers' mean speed deviate from the speed limit when it is raised to 100 km/h on a road section?
You measured the speed of 30 cars. These have $\bar{X} = 96$ km/h and $s = 4$.
- H0: µ = 100
- H1: µ ≠ 100

$s_{\bar{X}} = \frac{s}{\sqrt{N}} = \frac{4}{\sqrt{30}} = 0.73$

$t = \frac{\bar{X} - \mu}{s_{\bar{X}}} = \frac{96 - 100}{0.73} = -5.48$

What is the critical value for rejecting H0 at a 5% level?
df = N - 1 = 30 - 1 = 29

The t distribution
For a two-tailed test with a 5% level of significance and 29 df, the critical value is +/- 2.045.
[Figure: two-tailed test with 2.5% rejection areas beyond -2.045 and 2.045. The estimated t-value of -5.48 lies far out in the lower rejection area.]
Our estimated t-value is in the rejection area, and we reject H0.
Thus, we believe that the real driving speed is below 100 km/h.

Difference in mean score between two samples, no information about population values
Is the difference between the experimental group (N = 8) and the control group (N = 8) in mean depression score after treatment statistically significant?
- Experimental group: $\bar{X}_{exp} = 34.125$, $s_{exp} = 4.9$
- Control group: $\bar{X}_{control} = 40.25$, $s_{control} = 4.4$
- Null hypothesis (H0): $\mu_{exp} = \mu_{control}$
- Alternative hypothesis (H1): $\mu_{exp} \neq \mu_{control}$

How probable is it that the difference is due to sampling error?

$t = \frac{\bar{X}_{exp} - \bar{X}_{control}}{\sqrt{\frac{s_{exp}^2}{N_{exp}} + \frac{s_{control}^2}{N_{control}}}} = \frac{34.125 - 40.25}{\sqrt{\frac{4.9^2}{8} + \frac{4.4^2}{8}}} = \frac{-6.125}{2.33} = -2.629$

Degrees of freedom (df): (N_exp - 1) + (N_control - 1) = (8 - 1) + (8 - 1) = 14
Critical value for a two-tailed test with df = 14 and α = 0.05: +/- 2.145
[Figure: two-tailed test with 2.5% rejection areas beyond -2.145 and 2.145. The estimated t-value of -2.629 lies in the lower rejection area.]
H0 is rejected: we believe that the difference between the experimental group and the control group is present in the population, i.e. training seems to work!

Parametric tests
Parametric tests are based upon three main assumptions:
1. The sample(s) is randomly drawn from the population
2. The values are normally distributed in the population
3. If two or more samples are compared to each other, they must be drawn from populations with equal variances
These are very strict assumptions. However, parametric tests are quite robust to violations of assumptions 2 and 3.

Examples of parametric tests
Applied when we know the population values (mean score, standard deviation, percentage etc.):
- Z-test
Applied when we do not know the population values:
- t-test (difference in mean scores between groups, correlation and regression coefficients)
- F-test (analysis of variance)

Non-parametric tests
- Applied when the assumptions of parametric tests are violated
- Or when the dependent variables are on an ordinal or nominal level
- Basically the same logic is applied as for significance testing using parametric tests

Example of a non-parametric test: the chi-square test (χ²)
Is being found guilty or not guilty of violent crimes dependent upon skin color?

                Not guilty   Guilty   Total
  Light skin        70         30      100
  Dark skin         30         70      100
  Total            100        100      200

- Both variables are measured on a nominal level, so a mean and standard deviation cannot be estimated
- In this case we use the chi-square test (χ²) to determine whether the difference is significant or not

Core of the chi-square test
- Calculate the expected values (E), which represent the values we would see if there were no relationship between the two variables:

                Not guilty        Guilty
  Light skin    O = 70, E = 50    O = 30, E = 50
  Dark skin     O = 30, E = 50    O = 70, E = 50

- Compare these to the observed values (O) using this formula:

$\chi^2 = \sum \frac{(O - E)^2}{E}$

$\chi^2 = \frac{(70 - 50)^2 + (30 - 50)^2 + (30 - 50)^2 + (70 - 50)^2}{50} = \frac{1600}{50} = 32$

- We must also determine the degrees of freedom: df = (the number of columns - 1) × (the number of rows - 1), so df = (2 - 1) × (2 - 1) = 1
- Next, find the critical value of χ² at a 5% level of significance
- H0: there is no association between skin color and being found guilty
- H1: there is an association between skin color and being found guilty

The χ² distribution
The critical value of χ² (df = 1) is 3.84.
Our estimated χ² value is 32, which is much larger than 3.84.
Thus, H0 is rejected.

Level of significance and practical importance/significance
- A statistically significant result is not necessarily of large practical importance
- The main reason: statistical significance is strongly influenced by the size of the sample(s)
- Large samples = easy to obtain significant results (i.e. easier to reject H0)
- Small samples = difficult to obtain significant results
- It is useful to also include a measure of effect size
- Effect size focuses on how large the difference is or how strong the association between the variables is

Effect size
Several types exist.
For differences in means:
- d-value (the difference relative to the standard deviation):

$d = \frac{\bar{X}_{exp} - \bar{X}_{control}}{s}$

$d = \frac{34.125 - 40.25}{4.9} = \frac{-6.125}{4.9} = -1.25$

(Here s is the experimental group's standard deviation, 4.9.)

Interpretation of d:
- d = 0: no difference
- +/- 0.20: small difference
- +/- 0.50: moderate difference
- +/- 0.80: large difference

For measures of association and explained variance:
- r, r² and R²
- Eta²

Random sampling
1. Simple random sampling: all members of the population have an equal chance of being drawn
2. Systematic sampling: members are selected using a certain key, e.g. every 50th person over 18 years of age
3. Stratified random sampling: random selection within subgroups of the population
4. Proportionate sampling: drawing certain proportions of the sample from subgroups of the population
5. Cluster sampling: drawing all members of randomly selected groups from the population (e.g. school classes)

Non-random samples
1. Convenience sampling: students attending a lecture, stopping people on the street, voluntary participants
2. Quota sampling: recruit volunteers, but make sure that certain characteristics are represented in certain proportions (e.g. equal numbers of each gender, age group etc.)
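
The test statistics from the worked examples in this transcript (the speed measurements, the depression scores, the guilty/not guilty table and the d-value) can be rechecked with a few lines of code. The following is a minimal sketch using only the Python standard library. All function names are illustrative and not part of the lecture, the critical values are copied from the slides rather than computed, and the two-sample formula is the one shown in the transcript (separate variances under the square root).

```python
from math import sqrt


def one_sample_t(sample_mean, mu, s, n):
    """t statistic and df when the population SD is estimated by s."""
    se = s / sqrt(n)
    return (sample_mean - mu) / se, n - 1


def two_sample_t(mean1, s1, n1, mean2, s2, n2):
    """t statistic and df for two independent groups (slide formula)."""
    se = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (mean1 - mean2) / se, (n1 - 1) + (n2 - 1)


def chi_square(observed):
    """Chi-square statistic and df for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count if the two variables were unrelated
            e = row_totals[i] * col_totals[j] / total
            chi2 += (o - e) ** 2 / e
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df


def effect_size_d(mean1, mean2, s):
    """Difference in means relative to a standard deviation."""
    return (mean1 - mean2) / s


# Speed example: 30 cars, mean 96 km/h, s = 4, H0: mu = 100
t_speed, df_speed = one_sample_t(96, 100, 4, 30)
print(f"speed: t = {t_speed:.2f}, df = {df_speed}, "
      f"reject H0: {abs(t_speed) > 2.045}")  # 2.045 taken from the slides

# Depression example: experimental vs control group, N = 8 each
t_dep, df_dep = two_sample_t(34.125, 4.9, 8, 40.25, 4.4, 8)
print(f"depression: t = {t_dep:.2f}, df = {df_dep}, "
      f"reject H0: {abs(t_dep) > 2.145}")  # 2.145 taken from the slides

# Chi-square example: rows are light/dark skin, columns not guilty/guilty
chi2, df_chi = chi_square([[70, 30], [30, 70]])
print(f"chi-square = {chi2:.1f}, df = {df_chi}, "
      f"reject H0: {chi2 > 3.84}")  # 3.84 taken from the slides

# Effect size for the depression example, using s = 4.9 as in the slides
print(f"d = {effect_size_d(34.125, 40.25, 4.9):.2f}")
```

Small differences in the last digit relative to the slides can arise because the slides round the standard error before dividing.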