1/3/2017
Statistics and Probability Distributions
Ecological Analyses

Random Variables and Probability Distributions
Certain probability distributions are assumed by many of the common statistical tests.
ANOVA assumes that variables follow a normal distribution (these assumptions must be met to use ANOVA).
Probability distributions fit many real-world data.

Random Variables
Discrete random variables (e.g., 1, 3, 5):
  Presence versus absence
  Number of offspring born to swallows
Continuous random variables can have any value within an interval (e.g., body mass, wing length).

Which sample is 'better'?
[Figure: Accuracy versus Precision, with panels labeled "Precise" and "Accurate"]

Precision, Accuracy and Bias
Accuracy is how close the estimated value is to the true value; this difference is the bias.
Precision is the variation in the measurement.
Your sample indicates precision, but you don't know its accuracy!

Discrete Random Variable Distributions

Bernoulli Random Variables
X ~ Bernoulli(p)
The experiment has only two outcomes (e.g., organism present or absent).
A Bernoulli random variable describes the outcome of such an experiment.
The random variable X is distributed as a Bernoulli random variable with a single parameter p.
The best example is the toss of a 'fair' coin, in which either outcome is equally likely (i.e., p = 0.5).
We might use a Bernoulli random variable to look at the presence or absence of a species in a number of different locations (e.g., habitats, lakes).

Binomial Random Variables
Many Bernoulli trials = binomial random variable.
This is necessary because we also want to involve replication in our experiments.
X ~ Bin(n, p)
A binomial random variable X is the number of successful results in n independent Bernoulli trials (parameters n and p).
If n = 1, the result is equivalent to a Bernoulli trial.
This is one of the most common types of random variables encountered in ecological studies.
The probability of obtaining X successes for a binomial random variable is:
$P(X) = \frac{n!}{X!\,(n-X)!}\, p^{X} (1-p)^{n-X}$
where n is the number of trials, X is the number of successful outcomes (X ≤ n), and n! is n factorial (i.e., n × (n-1) × (n-2) × ... × 1).
Think of $\binom{n}{X} = \frac{n!}{X!\,(n-X)!}$ as "n choose X", which is known as the binomial coefficient.
It is needed because there are many ways to obtain combinations of successes and failures.

Binomial Coefficient
Consider the following set of five small mammals: {(red-backed vole), (meadow vole), (deer mouse), (short-tailed shrew), (jumping mouse)}
How many unique pairs of small mammals can be formed from this set?
1. (red-backed vole), (meadow vole)
2. (red-backed vole), (deer mouse)
3. (red-backed vole), (short-tailed shrew)
4. (red-backed vole), (jumping mouse)
5. (meadow vole), (deer mouse)
6. (meadow vole), (short-tailed shrew)
7. (meadow vole), (jumping mouse)
8. (deer mouse), (short-tailed shrew)
9. (deer mouse), (jumping mouse)
10. (short-tailed shrew), (jumping mouse)
Using the binomial coefficient, we would set n = 5 and X = 2 and get 10 combinations:
$\binom{5}{2} = \frac{5!}{2!\,(5-2)!} = \frac{120}{2 \times 6} = 10$

The Binomial Distribution
By having a predicted (or theoretical) distribution, we can then see if our observed results 'fit' that distribution.
But first we need to know about the available distributions.
The following example (details on pages 31 and 32) illustrates the distribution of the various X values out of 25 trials.

Calculating the Binomial Distribution
[Figure: Binomial Distribution with X ~ Bin(25, 0.8)]
The binomial distribution is a probability distribution.
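As a rough illustration of the binomial slides above, the following Python sketch assumes the standard binomial probability mass function, P(X) = C(n, X) p^X (1 - p)^(n - X), and uses only the standard library. The function name `binomial_pmf` is hypothetical, chosen for this example. It checks the small-mammal count (5 choose 2) and tabulates the distribution for X ~ Bin(25, 0.8).

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent Bernoulli(p) trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Binomial coefficient for the small-mammal example: 5 choose 2.
pairs = comb(5, 2)  # 10 unique pairs

# Full distribution for X ~ Bin(25, 0.8), one probability per X = 0..25.
n, p = 25, 0.8
distribution = [binomial_pmf(x, n, p) for x in range(n + 1)]

# The probabilities of all possible outcomes sum to 1 (up to rounding).
total = sum(distribution)

# The most probable number of successes; for Bin(25, 0.8) this is X = 20.
most_likely_x = max(range(n + 1), key=lambda x: distribution[x])

# With n = 1 the binomial reduces to a single Bernoulli trial.
fair_coin_heads = binomial_pmf(1, 1, 0.5)  # 0.5
```

With p = 0.8 the probabilities pile up near X = 20 and the left tail is longer than the right, which is why symmetry requires p = 0.5.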
The binomial distribution is symmetrical (both tails equal) only when p = (1 - p) = 0.5.

Poisson Random Variables
The binomial distribution is appropriate when there is a fixed number of trials (n) and the probability of success is not too small.
The formula becomes awkward when n becomes large and p becomes small (i.e., for rare occurrences of animals or plants).
We also need to be able to directly count the trials themselves.
Instead, we frequently count the events that occur within a sample.
Suppose that you are using a number of quadrats to sample for the presence of animal damage.
Each occurrence represents the 'success' of an unobserved event.
We can't really determine how many 'trials' have taken place.
The same applies to trials in time: the number of birds visiting a feeder over a period of time.
In these cases we use the Poisson distribution.

$X \sim \mathrm{Poisson}(\lambda)$
X is the number of occurrences of an event recorded in a sample of fixed area or during a fixed time interval.
It is used when occurrences are rare (i.e., the most common number of counts in any sample is 0).
It is described by a single parameter, $\lambda$, the average number of occurrences of the event in each sample.

Suppose that the average number of damaged plants in a 10-m² quadrat is 2.
What are the chances that a single quadrat will contain 3 damaged plants?
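The quadrat question can be checked numerically. This is a minimal Python sketch, assuming the standard Poisson probability mass function P(x) = λ^x e^(-λ) / x!; the function name `poisson_pmf` is hypothetical and only the standard library is used.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """Probability of exactly x occurrences when the average count per sample is lam."""
    return lam**x * exp(-lam) / factorial(x)


# Average of 2 damaged plants per quadrat; chance of exactly 3 in one quadrat.
p_three = poisson_pmf(3, 2.0)
print(f"P(x = 3 | lambda = 2) = {p_three:.3f}")  # prints 0.180

# Chance that a quadrat shows no damage at all.
p_zero = poisson_pmf(0, 2.0)
print(f"P(x = 0 | lambda = 2) = {p_zero:.3f}")  # prints 0.135
```

Note that no number of trials appears anywhere in the calculation; only the average count per sample is needed.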
With $\lambda = 2$ and $x = 3$:
$P(x = 3) = \frac{\lambda^{x} e^{-\lambda}}{x!} = \frac{2^{3} e^{-2}}{3!} = 0.180$
The chances that a plot will contain no damage would be ($\lambda = 2$, $x = 0$):
$P(x = 0) = \frac{2^{0} e^{-2}}{0!} = 0.135$

Poisson Distributions
[Figure: Poisson distributions for $\lambda$ = 0.1, 0.5, 1.0, 2.0, and 4.0]
Later we can test observed frequencies against these theoretical distributions to see if our predictions are met ...

Expected Value of a Discrete Random Variable
The entire distribution can be summarized by determining the average value.
Straight averaging can be misleading with probability distributions, because we need to weight the values by their probabilities.

Variance of a Discrete Random Variable
The variance of a random variable is a measure of how far the actual values of a random variable differ from the expected value.

Discrete Statistical Distributions

[Figure: Female horseshoe crabs with satellite males; axes: Female Carapace Width (mm), Number of Satellite Males]

Continuous Random Variables
Most ecological variables are not discrete:
  Body mass
  Wing length
  Concentrations of chemicals
  Heights and diameters of trees
Within an interval, there are infinitely many possible values for a variable.

Uniform Random Variables
We break up the continuous variable into discrete intervals.
The sum of the probabilities of occurrence of all intervals will be 1.0.
The probability that this uniform random variable X occurs in any subinterval depends only on the length of that subinterval.
f(x) is a probability density function (PDF), assigning the probability P that a continuous variable X occurs within an interval I.

Probability Density and Cumulative Distribution Functions
Cumulative distribution function (CDF): $F(y) = P(X \le y)$
The CDF represents the tail probability: the probability that a random variable X is less than or equal to some value y.
More on this when we look at statistical tests.

Normal (Gaussian) Random Variables
$X \sim N(\mu, \sigma)$
$E(X) = \mu$
$\sigma^2(X) = \sigma^2$
Symmetric around $\mu$
Standard normal: $X \sim N(0, 1)$

Properties of the Normal Distribution
Normal distributions can be added:
The sum of two independent normal random variables is also a normally distributed random variable.
$E(X + Y) = E(X) + E(Y)$
$\sigma^2(X + Y) = \sigma^2(X) + \sigma^2(Y)$
Normal distributions can be transformed:
For $Y = aX + b$, $E(Y) = a\mu + b$ and $\sigma^2(Y) = a^2\sigma^2$.
Normal distributions can be standardized (a special case of a transformation):
If $a = 1/\sigma$ and $b = -\mu/\sigma$, then for $X \sim N(\mu, \sigma)$, $Y = (1/\sigma)X - \mu/\sigma = (X - \mu)/\sigma$, so $E(Y) = 0$ and $\sigma^2(Y) = 1$.
For each X, $\mu$ is subtracted and the result is divided by $\sigma$.

Log-normal and Exponential Distributions

Continuous Statistical Distributions

Central Limit Theorem
A cornerstone of probability and statistical analyses.
Standardizing any random variable that itself is a sum or average of a set of independent random variables results in a new random variable that is "nearly the same as" a standard normal one.
This allows us to use statistics that require a normal distribution even though the underlying data themselves may not be normally distributed ...
... provided the sample size is large enough.
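The Central Limit Theorem can be explored with a small simulation. The sketch below is an illustration, not part of the original slides: it assumes a strongly skewed parent distribution (exponential with rate 1, so mean 1 and standard deviation 1), a sample size of 50, and 2000 replicate samples, all chosen arbitrarily. The function name `standardized_means` is hypothetical.

```python
import random
import statistics


def standardized_means(n_per_sample: int, n_samples: int, rng: random.Random) -> list[float]:
    """Draw repeated samples from Exponential(rate=1) and standardize each sample mean."""
    mu, sigma = 1.0, 1.0  # known mean and standard deviation of Exponential(rate=1)
    z_values = []
    for _ in range(n_samples):
        sample = [rng.expovariate(1.0) for _ in range(n_per_sample)]
        xbar = statistics.fmean(sample)
        # Standardize: subtract the expected value, divide by the sd of the mean.
        z_values.append((xbar - mu) / (sigma / n_per_sample**0.5))
    return z_values


rng = random.Random(42)  # fixed seed so the run is repeatable
z_values = standardized_means(n_per_sample=50, n_samples=2000, rng=rng)

# For a standard normal variable, about 95% of values fall within +/- 1.96.
share_within = sum(abs(z) < 1.96 for z in z_values) / len(z_values)
print(f"mean of z: {statistics.fmean(z_values):.3f}")
print(f"sd of z:   {statistics.stdev(z_values):.3f}")
print(f"share within +/-1.96: {share_within:.3f}")
```

If the theorem holds for this setup, the mean of the standardized values should be near 0, their standard deviation near 1, and the share within ±1.96 near 0.95, even though the individual observations are far from normal. Reducing `n_per_sample` to a very small number shows how the approximation degrades.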
Summary
The distributions of random variables can be characterized by their expected values and variances.
Discrete: Bernoulli, Binomial, Poisson
Continuous: Uniform, Normal, Exponential
The Central Limit Theorem asserts that the sums or averages of large, independent samples will approximately follow a normal distribution once standardized.
For most ecological data, the Central Limit Theorem supports the use of statistical tests that assume normal distributions.
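The first summary point, that a distribution can be characterized by its expected value and variance, can be sketched for a discrete random variable. The example below assumes the usual definitions, E(X) = Σ x·P(x) and σ²(X) = Σ (x - E(X))²·P(x), and applies them to X ~ Bin(25, 0.8); the helper names `expected_value` and `variance` are hypothetical.

```python
from math import comb


def expected_value(values: list[int], probs: list[float]) -> float:
    """Probability-weighted average of the possible values."""
    return sum(x * p for x, p in zip(values, probs))


def variance(values: list[int], probs: list[float]) -> float:
    """Probability-weighted average squared deviation from the expected value."""
    mean = expected_value(values, probs)
    return sum((x - mean) ** 2 * p for x, p in zip(values, probs))


# Possible outcomes and their probabilities for X ~ Bin(25, 0.8).
n, p = 25, 0.8
values = list(range(n + 1))
probs = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in values]

mean_x = expected_value(values, probs)  # close to n * p = 20
var_x = variance(values, probs)         # close to n * p * (1 - p) = 4

# A straight (unweighted) average of the possible values ignores the probabilities.
naive_average = sum(values) / len(values)  # 12.5
```

The gap between the unweighted average (12.5) and the weighted expected value (about 20) shows why straight averaging can be misleading for probability distributions.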