Statistics for EES
2. Standard error
Dirk Metzler
http://evol.bio.lme.de/_statgen
May 9, 2011

Contents
1 Histograms: Densities or Numbers?
2 Computing σ with n or n − 1?
3 Mean values are usually nice but sometimes mean
    example: picky wagtails
    example: spider men & spider women
    example: copper-tolerant browntop bent
4 The standard error SE
    example: drought stress in sorghum
    general consideration

1 Histograms: Densities or Numbers?

[Figure: "Number vs. Density". Histograms of the same data, drawn with the y-axis "Number" and with the y-axis "Density", using equal and unequal intervals.]

Histograms with unequal intervals should show densities, not numbers!
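The rule about unequal intervals can be made concrete with a small sketch. The data and the interval edges below are hypothetical (they are not the data behind the slides), and `histogram_counts_and_densities` is an illustrative helper name. Density is taken here as count / (total number of values × interval width), so that the bar areas sum to 1.

```python
def histogram_counts_and_densities(values, edges):
    """Count values per bin and convert the counts to densities.

    Bins are [edges[i], edges[i+1]); the last bin also includes its right edge.
    density = count / (total number of values * bin width)
    """
    n_bins = len(edges) - 1
    counts = [0] * n_bins
    for v in values:
        for i in range(n_bins):
            is_last = i == n_bins - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    total = len(values)
    densities = [
        counts[i] / (total * (edges[i + 1] - edges[i])) for i in range(n_bins)
    ]
    return counts, densities


# Hypothetical data, spread evenly over [0, 8): 0.0, 0.1, ..., 7.9
values = [i / 10 for i in range(80)]
edges = [0, 1, 2, 4, 8]  # unequal interval widths 1, 1, 2, 4

counts, densities = histogram_counts_and_densities(values, edges)
print("numbers:  ", counts)     # grows with the interval width
print("densities:", densities)  # the same in every interval
```

For these evenly spread values the numbers are 10, 10, 20, 40, which wrongly suggests a rising distribution, while the density is 0.125 in every interval.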
2 Computing σ with n or n − 1?

Simulated population (N = 10000 adults), length [cm]:
    mean: 25.13
    standard deviation: 1.36

[Figure: histogram of the simulated population, and dot plots of two samples of size 10 below it; x-axis: length [cm], 20 to 30.]

Sample from the population (n = 10):
    M: 24.43
    SD with (n − 1): 1.15
    SD with n: 1.03

Another sample from the population (n = 10):
    M: 24.92
    SD with (n − 1): 1.61
    SD with n: 1.45

[Figure: 1000 samples, each of size n = 10. Two density histograms, one of the SD computed with n − 1 and one of the SD computed with n; x-axis from 0.5 to 2.5.]

The standard deviation σ of a random variable with n equally probable outcomes x_1, ..., x_n (e.g. rolling a die) is clearly defined by

$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(\bar{x} - x_i)^2}.$$

If x_1, ..., x_n is a sample (the usual case in statistics), you should rather use the formula

$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(\bar{x} - x_i)^2}.$$
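The experiment with 1000 samples of size 10 can be re-created roughly as follows. This is a sketch, not the code behind the slides: the population is assumed to be normally distributed with the mean and SD quoted above, and the seed is arbitrary. Python's `statistics.stdev` divides by n − 1 and `statistics.pstdev` divides by n.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Assumed population: 10000 normally distributed lengths [cm]
population = [rng.gauss(25.13, 1.36) for _ in range(10_000)]
sigma = statistics.pstdev(population)  # population SD (divides by N)

sd_with_n_minus_1 = []
sd_with_n = []
for _ in range(1000):
    sample = rng.sample(population, 10)  # one sample of size n = 10
    sd_with_n_minus_1.append(statistics.stdev(sample))  # divides by n - 1
    sd_with_n.append(statistics.pstdev(sample))         # divides by n

# Compare the average squared SD (the variance estimate) with sigma^2
mean_var_n_minus_1 = statistics.fmean(s * s for s in sd_with_n_minus_1)
mean_var_n = statistics.fmean(s * s for s in sd_with_n)

print(f"population variance:        {sigma ** 2:.3f}")
print(f"mean variance with n - 1:   {mean_var_n_minus_1:.3f}")
print(f"mean variance with n:       {mean_var_n:.3f}")
```

In every sample the SD computed with n is smaller than the one computed with n − 1, and on average the squared SD with n falls short of the population variance by the factor (n − 1)/n, which is 0.9 here.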
3 Mean values are usually nice but sometimes mean

Mean and SD...
- characterize data well if the distribution is bell-shaped
- and must be interpreted with caution in other cases.

We will exemplify this with textbook examples from ecology, see e.g.

M. Begon, C. R. Townsend, and J. L. Harper. Ecology: From Individuals to Ecosystems. Blackwell Publishing, 4th edition, 2008.

When original data were not available, we generated similar data sets by computer simulation. So do not believe all data points.

example: picky wagtails

Wagtails eat dung flies
    Predator: White Wagtail, Motacilla alba alba
    Prey: Dung Fly, Scatophaga stercoraria
    (images (c) by Artur Mikołajewski and by Viatour Luc)

Conjecture
- Size of flies varies.
- Efficiency for wagtail = energy gain / time to capture and eat.
- Lab experiments show that efficiency is maximal when flies have size 7 mm.

N. B. Davies. Prey selection and social behaviour in wagtails (Aves: Motacillidae). J. Anim. Ecol., 46:37–57, 1977.

[Figure: available dung flies. Histogram; x-axis: length [mm], 4 to 11; y-axis: number; mean = 7.99, sd = 0.96.]
[Figure: captured dung flies. Histogram; x-axis: length [mm], 4 to 11; y-axis: number; mean = 6.79, sd = 0.69.]
[Figure: dung flies: available, captured. Both distributions overlaid; x-axis: length [mm]; y-axis: fraction per mm.]

Numerical comparison of size distributions

            captured        available
    mean    6.29      <     7.99
    sd      0.69      <     0.96

Interpretation
The birds prefer dung flies from a relatively narrow range around the predicted optimum of 7 mm.
The distributions in this example were bell-shaped, and the four numbers (means and standard deviations) were appropriate to summarize the data.

example: spider men & spider women

Nephila madagascariensis (image (c) by Bernard Gagnon)

Simulated data: 70 sampled spiders
    mean size: 21.05 mm
    sd of size: 12.94 mm

[Figure: "?????". Histogram of the sizes; x-axis: size [mm], 0 to 50; y-axis: frequency.]
[Figure: Nephila madagascariensis (n = 70). Histogram of size [mm] with the mean marked (mean = 21.06) and the two peaks labelled "males" and "females".]

Conclusion from spider example
If data come from different groups, it may be reasonable to compute mean and sd separately for each group.
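The conclusion can be illustrated with a short sketch. The group means and SDs below are hypothetical values chosen only for illustration (the slides do not give them), with males assumed to form the small-size peak and females the large-size peak; `summarize` is an illustrative helper name.

```python
import random
import statistics

rng = random.Random(7)  # arbitrary fixed seed

# Hypothetical sizes [mm]; group means and SDs are illustrative assumptions
males = [rng.gauss(8, 2) for _ in range(35)]
females = [rng.gauss(34, 5) for _ in range(35)]
all_spiders = males + females


def summarize(label, sizes):
    """Print and return mean and sample SD (n - 1) of one group."""
    m = statistics.fmean(sizes)
    s = statistics.stdev(sizes)
    print(f"{label:8s} n={len(sizes):3d}  mean={m:6.2f}  sd={s:6.2f}")
    return m, s


pooled_mean, pooled_sd = summarize("all", all_spiders)
male_mean, male_sd = summarize("males", males)
female_mean, female_sd = summarize("females", females)
```

The pooled mean falls between the two peaks, where hardly any spider is found, and the pooled sd mostly reflects the distance between the groups rather than the spread within either of them.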
example: copper-tolerant browntop bent

Copper Tolerance in Browntop Bent
    Browntop Bent: Agrostis tenuis
    Copper: Cuprum
    (images: (c) Kristian Peters; Hendrick met de Bles)

A. D. Bradshaw. Population differentiation in Agrostis tenuis Sibth. III. Populations in varied environments. New Phytologist, 59(1):92–103, 1960.

T. McNeilly and A. D. Bradshaw. Evolutionary processes in populations of copper tolerant Agrostis tenuis Sibth. Evolution, 22:108–118, 1968.

Again, we have no access to original data and use simulated data.

Adaptation to copper?
- Root length indicates copper tolerance.
- Measure root lengths of plants near copper mine.
- Take seeds from clean meadow and sow near copper mine.
- Measure root length of these "meadow plants" in copper environment.

[Figure: Browntop Bent (n = 50), copper mine grass. Histogram; x-axis: root length (cm), 0 to 200.]
[Figure: Browntop Bent (n = 50), grass seeds from a meadow. Histogram; x-axis: root length (cm), 0 to 200; annotation: "copper tolerant?".]
[Figure: Browntop Bent (n = 50). Meadow plants and copper mine plants overlaid; x-axis: root length (cm); y-axis: density per cm.]
[Figure: copper mine plants with m − s, m and m + s marked.]
[Figure: meadow plants with m − s, m and m + s marked.]

2/3 of the data within [m − sd, m + sd]? No!
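How far the "2/3 within [m − sd, m + sd]" rule of thumb can fail is easy to check numerically. The two data sets below are hypothetical stand-ins, not the simulated data of the slides: one bell-shaped sample, and one skewed sample in which most values are small and a few are very large.

```python
import random
import statistics


def fraction_within_one_sd(data):
    """Fraction of the values lying in [m - s, m + s]."""
    m = statistics.fmean(data)
    s = statistics.stdev(data)
    inside = sum(1 for x in data if m - s <= x <= m + s)
    return inside / len(data)


rng = random.Random(3)  # arbitrary fixed seed

# Hypothetical bell-shaped data
bell_shaped = [rng.gauss(100, 30) for _ in range(10_000)]

# Hypothetical skewed data: 45 short roots and 5 very long ones
skewed = [rng.gauss(16, 4) for _ in range(45)] + [
    rng.uniform(60, 220) for _ in range(5)
]

frac_bell = fraction_within_one_sd(bell_shaped)
frac_skewed = fraction_within_one_sd(skewed)
print(f"bell-shaped: {frac_bell:.2f} of the data within [m - s, m + s]")
print(f"skewed:      {frac_skewed:.2f} of the data within [m - s, m + s]")
```

For the bell-shaped sample the fraction is close to 2/3. For the skewed sample the few large values inflate the sd so much that the interval covers almost all of the remaining data.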
[Figure: Browntop Bent, n = 50 + 50. Boxplots of root length (cm), 0 to 200, for copper mine plants and meadow plants.]

Quartiles of root length [cm]

                      min     Q1      median   Q3       max
    copper adapted    12.9    80.1    100.8    120.9    188.9
    from meadow       1.1     13.2    16.0     19.6     218.9

Conclusion from browntop bent example
Sometimes the two numbers m and sd do not give enough information. In this example the five numbers min, Q1, median, Q3, max that are shown in the boxplot are more appropriate.

Conclusions from this section
- Always visually inspect the data!
- Never rely on summarising values alone!

4 The standard error SE

The Standard Error

$$SE = \frac{sd}{\sqrt{n}}$$

describes the variability of the sample mean.
    n: sample size
    sd: sample standard deviation

example: drought stress in sorghum

V. Beyel and W. Brüggemann. Differential inhibition of photosynthesis during pre-flowering drought stress in Sorghum bicolor genotypes with different senescence traits. Physiologia Plantarum, 124:249–259, 2005.

- 14 sorghum plants were not watered for 7 days.
- In the last 3 days, transpiration was measured for each plant (mean over 3 days).
- The area of the leaves of each plant was determined.

transpiration rate = (amount of water per day) / (area of leaves), in ml / (cm² · day)

Aim: Determine the mean transpiration rate µ under these conditions.

If we had many plants, we could determine µ quite precisely.
Problem: How accurate is the estimate of µ with such a small sample (n = 14)?

[Figure: drought stressed sorghum (variety B, n = 14). Histogram; x-axis: Transpiration (ml/(day*cm^2)), 0.08 to 0.16; y-axis: frequency; mean = 0.117, standard deviation = 0.026.]

Transpiration data: x_1, x_2, ..., x_14

$$\bar{x} = (x_1 + x_2 + \cdots + x_{14})/14 = \frac{1}{14}\sum_{i=1}^{14} x_i = 0.117$$

Our estimate: µ ≈ 0.117.
How accurate is this estimate? How much does x̄ (our estimate) deviate from µ (the true mean value)?

general consideration

Assume we had made the experiment not just 14 times, but repeated it 100 times, 1000 times, 1000000 times.

We consider our 14 plants as a random sample from a very large population of possible values.

[Figure: population (all rates of transpiration), n = ∞, with mean µ; sample, n = 14, with mean x̄.]

We estimate the population mean µ by the sample mean x̄.
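Before asking how x̄ varies, the definition SE = sd/√n from the start of this section can already be evaluated for the sorghum summary values (sd = 0.026, n = 14). The function names below are illustrative; the raw transpiration data are not reproduced here, so only the summary version is applied to the sorghum example.

```python
import math
import statistics


def standard_error(sample):
    """SE of the mean from raw data: sample SD (n - 1) divided by sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


def standard_error_from_summary(sd, n):
    """SE of the mean from a reported sample SD and sample size."""
    return sd / math.sqrt(n)


# Summary values of the sorghum example: sd = 0.026, n = 14
se_sorghum = standard_error_from_summary(sd=0.026, n=14)
print(f"SE = {se_sorghum:.4f} ml/(day*cm^2)")
```

This gives SE ≈ 0.0069, so the sample mean 0.117 is typically off from µ by an amount of this order.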
Each new sample gives a new value of x̄.
x̄ depends on randomness: it is a random variable.
Problem: How variable is x̄? More precisely: What is the typical deviation of x̄ from µ?

$$\bar{x} = (x_1 + x_2 + \cdots + x_n)/n$$

What does the variability of x̄ depend on?

1. On the variability of the single observations x_1, x_2, ..., x_n.

[Figure: two dot plots on the same axis (0.05 to 0.25), both with mean = 0.117. x varies a lot ⇒ x̄ varies a lot; x varies little ⇒ x̄ varies little.]

2. On the sample size n. The larger n, the smaller is the variability of x̄.

To explore this dependency we perform a (computer) experiment.

Experiment: Take a population, draw samples and examine how x̄ varies.

We assume the distribution of possible transpiration rates looks like this:

[Figure: hypothetical distribution of transpiration rates; x-axis: Transpiration (ml/(day*cm^2)), 0.06 to 0.20; y-axis: density; mean = 0.117, SD = 0.026.]

At first with small sample sizes: n = 4

[Figures: a first, a second and a third sample of size 4 on the same axes; then 10 samples and 50 samples drawn one above the other.]

How variable are the sample means?
[Figures: 10 samples of size 4 and 50 samples of size 4, each with the corresponding sample means; x-axis: Transpiration (ml/(day*cm^2)), 0.06 to 0.20.]
[Figure: distribution of sample means (sample size n = 4). Population: mean = 0.117, SD = 0.026. Sample mean: mean = 0.117, SD = 0.013.]

population: standard deviation = 0.026
sample means (n = 4): standard deviation = 0.013 = 0.026/√4

Increase the sample size from 4 to 16

[Figures: 10 samples of size 16 and 50 samples of size 16, each with the corresponding sample means.]
[Figure: distribution of sample means (sample size n = 16). Population: mean = 0.117, SD = 0.026. Sample mean: mean = 0.117, SD = 0.0064.]

population: standard deviation = 0.026
sample means (n = 16): standard deviation = 0.0065 = 0.026/√16

General Rule
Let x̄ be the mean of a sample of size n from a distribution (e.g. all values in a population) with standard deviation σ. Since x̄ depends on the random sample, it is a random variable. Its standard deviation σ_x̄ fulfills

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}.$$

Problem: σ is unknown.
Idea: Estimate σ by the sample standard deviation s: σ ≈ s.

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \approx \frac{s}{\sqrt{n}} =: SEM$$

SEM stands for Standard Error of the Mean, or Standard Error for short.
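The computer experiment can be repeated along the following lines. This is a sketch under assumptions: the hypothetical population is taken to be normal with mean 0.117 and SD 0.026 (the slides only show its histogram), and the number of samples (5000) and the seed are arbitrary choices.

```python
import math
import random
import statistics

rng = random.Random(42)  # arbitrary fixed seed
MU, SIGMA = 0.117, 0.026  # population mean and SD from the example


def sd_of_sample_means(n, n_samples=5000):
    """Draw n_samples samples of size n and return the SD of their means."""
    means = [
        statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(n))
        for _ in range(n_samples)
    ]
    return statistics.stdev(means)


results = {}
for n in (4, 16):
    results[n] = sd_of_sample_means(n)
    predicted = SIGMA / math.sqrt(n)
    print(f"n={n:2d}: SD of sample means = {results[n]:.4f}, "
          f"sigma/sqrt(n) = {predicted:.4f}")
```

The simulated standard deviations of the sample means should come out close to 0.026/√4 = 0.013 and 0.026/√16 = 0.0065, in line with the general rule.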
The distribution of x̄

Observation
Even if the distribution of x is asymmetric and has multiple peaks, the distribution of x̄ will be bell-shaped (at least for larger sample sizes n).

[Figure: distribution of x with µ − σ, µ and µ + σ marked, and the narrower distribution of x̄ with µ − σ/√n and µ + σ/√n marked.]

$$\Pr\left(\bar{x} \in \left[\mu - \sigma/\sqrt{n},\ \mu + \sigma/\sqrt{n}\right]\right) \approx \frac{2}{3}$$

[Figure: the interval from x̄ − s/√n to x̄ + s/√n around the sample mean x̄, with the position of µ marked.]

The distribution of x̄ is approximately of a certain shape: the normal distribution.

[Figure: density of the normal distribution; y-axis: normal density; µ and µ + σ marked.]

The normal distribution is also called Gauß distribution (after Carl Friedrich Gauß, 1777-1855).
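The observation about the distribution of x̄, and the probability of about 2/3 for the interval µ ± σ/√n, can be checked on a population that is clearly not bell-shaped. The population below is a hypothetical mixture (asymmetric, two peaks), and the sample size n = 30, the number of samples and the seed are arbitrary choices made for this sketch.

```python
import math
import random
import statistics

rng = random.Random(5)  # arbitrary fixed seed

# Hypothetical population: asymmetric with two peaks
population = [rng.expovariate(1.0) for _ in range(5000)] + [
    rng.gauss(6, 0.5) for _ in range(5000)
]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)  # population SD (divides by N)

n = 30
half_width = sigma / math.sqrt(n)  # sigma / sqrt(n)

n_samples = 4000
hits = 0
for _ in range(n_samples):
    sample = rng.choices(population, k=n)  # draw with replacement
    if abs(statistics.fmean(sample) - mu) <= half_width:
        hits += 1

coverage = hits / n_samples
print(f"fraction of sample means within mu +/- sigma/sqrt(n): {coverage:.2f}")
```

Although single values from this population are far from normally distributed, the fraction of sample means inside the interval should come out near 2/3.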