Concepts in Probability

The study of probability mostly deals with combining different events and studying them alongside each other. How these events relate to each other determines the methods and rules to follow when studying their probabilities. Events can be divided into two major categories: dependent and independent events.

Independent Events

When two events are independent of each other, the occurrence of one event in no way affects the probability of the other event occurring. For example, say you rolled a die and flipped a coin. The number that comes up on the die in no way influences the probability of getting a head or a tail on the coin.

Dependent Events

When two events are dependent, the occurrence of one event influences the likelihood of the other event. For example, suppose you draw two cards from a deck of 52 cards. If you draw an ace first and set it aside, the probability of drawing an ace on the second draw is 3/51 rather than the 4/52 that applied to the first draw, because only three aces remain among the 51 cards left.

Conditional Probability

We have already defined dependent and independent events and seen how the probability of one event relates to the probability of the other. With those concepts in mind, we can now look at conditional probability. Conditional probability further defines the dependence of events by looking at the probability of an event given that some other event has occurred first.

Conditional probability is denoted by the following:

$$P(B \mid A)$$

The above is read as the probability that B occurs given that A has already occurred. It is mathematically defined as:

$$P(B \mid A) = \frac{P(A \cap B)}{P(A)}, \qquad P(A) > 0$$

What is a Random Variable?

When the numerical value of a variable is determined by a chance event, that variable is called a random variable.

Discrete vs. Continuous Random Variables

Random variables can be discrete or continuous.

Discrete.
Within a range of numbers, discrete variables can take on only certain values. Suppose, for example, that we flip a coin repeatedly and count the number of heads. The number of heads will be a value between zero and plus infinity. Within that range, though, the number of heads can be only certain values: it can only be a whole number, not a fraction. Therefore, the number of heads is a discrete variable. And because the number of heads results from a random process (flipping a coin), it is a discrete random variable.

Continuous. Continuous variables, in contrast, can take on any value within a range of values. For example, suppose we randomly select an individual from a population and then measure the age of that person. In theory, the person's age can take on any value between zero and plus infinity, so age is a continuous variable. In this example, the age of the person selected is determined by a chance event, so age is a continuous random variable.

Discrete Variables: Finite vs. Infinite

Some references state that continuous variables can take on an infinite number of values, but discrete variables cannot. This is incorrect.

In some cases, discrete variables can take on only a finite number of values. For example, the number of aces dealt in a poker hand can take on only five values: 0, 1, 2, 3, or 4.

In other cases, however, discrete variables can take on an infinite number of values. For example, if there is no limit on the number of coin flips, the number of flips that result in heads has no upper bound.

When comparing discrete and continuous variables, it is more accurate to say that continuous variables can always take on an infinite number of values, whereas some discrete variables can take on an infinite number of values and others cannot.
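Returning to the card-drawing example under Dependent Events, the following is a minimal Python sketch that checks the conditional probability by brute-force enumeration. It assumes the standard definition P(B | A) = P(A and B) / P(A), with A = "the first card is an ace" and B = "the second card is an ace". The deck encoding and the variable names are illustrative choices, not part of the original text.

```python
from fractions import Fraction
from itertools import permutations

# Illustrative deck encoding: only "ace" vs. "other" matters for this example.
deck = ["ace"] * 4 + ["other"] * 48

# Every ordered two-card draw without replacement (52 * 51 equally likely pairs).
draws = list(permutations(range(52), 2))

first_ace = [d for d in draws if deck[d[0]] == "ace"]
both_aces = [d for d in first_ace if deck[d[1]] == "ace"]

p_a = Fraction(len(first_ace), len(draws))        # P(A): first card is an ace
p_a_and_b = Fraction(len(both_aces), len(draws))  # P(A and B): both cards are aces
p_b_given_a = p_a_and_b / p_a                     # P(B | A) = P(A and B) / P(A)

print("P(A) =", p_a)
print("P(A and B) =", p_a_and_b)
print("P(B | A) =", p_b_given_a)
```

Exact fractions are used instead of floats so that the result can be compared directly with 3/51 (which reduces to 1/17).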
Probability Distributions

Uniform distribution (discrete)

In probability theory and statistics, the discrete uniform distribution is a symmetric probability distribution whereby a finite number of values are equally likely to be observed; every one of n values has equal probability 1/n. Another way of saying "discrete uniform distribution" would be "a known, finite number of outcomes equally likely to happen".

A simple example of the discrete uniform distribution is throwing a fair die. The possible values are 1, 2, 3, 4, 5, 6, and each time the die is thrown the probability of a given score is 1/6. If two dice are thrown and their values added, the resulting distribution is no longer uniform since not all sums have equal probability.

Geometric Probability Distribution

Geometric Setting

A geometric setting arises when we perform independent trials of the same chance process and record the number of trials until a particular outcome occurs. The four conditions for a geometric setting are:

- Binary? The possible outcomes of each trial can be classified as “success” or “failure”.
- Independent? Trials must be independent; that is, knowing the result of one trial must not have any effect on the result of any other trial.
- Trials? The goal is to count the number of trials until the first success occurs.
- Success? On each trial, the probability p of success must be the same.

Geometric Random Variable and Geometric Distribution

The number of trials Y that it takes to get the first success in a geometric setting is a geometric random variable. The probability distribution of Y is a geometric distribution with parameter p, the probability of a success on any trial. The possible values of Y are 1, 2, 3, ….
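The geometric setting described above can be illustrated with a small simulation. This is only a sketch under assumed values: "success" is taken to be rolling a six on a fair die (p = 1/6), and the function name, seed, and sample size are arbitrary choices made for the example.

```python
import random

def trials_until_first_success(p, rng):
    """Count independent trials up to and including the first success."""
    count = 1
    while rng.random() >= p:  # each trial fails with probability 1 - p
        count += 1
    return count

rng = random.Random(0)  # fixed seed so the run is repeatable
p = 1 / 6               # assumed example: "success" means rolling a six
samples = [trials_until_first_success(p, rng) for _ in range(100_000)]

# Empirical distribution of Y for the first few possible values.
for k in range(1, 6):
    freq = samples.count(k) / len(samples)
    print(f"estimated P(Y = {k}) = {freq:.3f}")
```

Every simulated value of Y is at least 1, and the estimated probabilities shrink as k grows, since each extra trial requires one more failure before the first success.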
Geometric Probability

If Y has the geometric distribution with probability p of success on each trial, the possible values of Y are 1, 2, 3, …. If k is any one of these values,

$$P(Y = k) = (1 - p)^{k-1} p$$

The Binomial Probability Distribution

A binomial experiment is one that possesses the following properties:

1. The experiment consists of n repeated trials;
2. Each trial results in an outcome that may be classified as a success or a failure (hence the name, binomial);
3. The probability of a success, denoted by p, remains constant from trial to trial and repeated trials are independent.

The number of successes X in n trials of a binomial experiment is called a binomial random variable. The probability distribution of the random variable X is called a binomial distribution, and is given by the formula:

$$P(X = x) = C^n_x \, p^x q^{n-x}$$

where
n = the number of trials
x = 0, 1, 2, ..., n
p = the probability of success in a single trial
q = the probability of failure in a single trial (i.e. q = 1 − p)
$C^n_x$ is a combination, $C^n_x = \frac{n!}{x!\,(n-x)!}$.

P(X = x) gives the probability of exactly x successes in n binomial trials.

Bell Curve, Normal Distribution Defined

Definition: The term bell curve is used to describe the mathematical concept called normal distribution, sometimes referred to as Gaussian distribution. ‘Bell curve’ refers to the shape that is created when a line is plotted using the data points for an item that meets the criteria of ‘normal distribution’. The center contains the greatest number of occurrences of a value and is therefore the highest point on the arc of the line. This point is referred to as the mean; in simple terms, it is the value that occurs most often (in statistical terms, the mode). The important thing to note about a normal distribution is that the curve is concentrated in the center and decreases on either side. This is significant in that the data has less of a tendency to produce unusually extreme values, called outliers, as compared to other distributions.
Also, the bell curve signifies that the data is symmetrical, so we can create reasonable expectations as to the possibility that an outcome will lie within a range to the left or right of the center, once we can measure the amount of deviation contained in the data. This deviation is measured in terms of standard deviations.

A bell curve graph depends on two factors, the mean and the standard deviation. The mean identifies the position of the center, and the standard deviation determines the height and width of the bell. For example, a large standard deviation creates a bell that is short and wide, while a small standard deviation creates a tall and narrow curve.

To understand the probability factors of a normal distribution you need to understand the following ‘rules’:

1. The total area under the curve is equal to 1 (100%).
2. About 68% of the area under the curve falls within 1 standard deviation of the mean.
3. About 95% of the area under the curve falls within 2 standard deviations of the mean.
4. About 99.7% of the area under the curve falls within 3 standard deviations of the mean.

Items 2, 3 and 4 are sometimes referred to as the ‘empirical rule’ or the 68-95-99.7 rule. In terms of probability, once we determine that the data is normally distributed (bell curved) and we calculate the mean and standard deviation, we are able to determine the probability that a single data point will fall within a given range of values.

Expected value

In probability theory, the expected value (or expectation, mathematical expectation, EV, mean, or first moment) refers, intuitively, to the value of a random variable one would "expect" to find if one could repeat the random variable process an infinite number of times and take the average of the values obtained. More formally, the expected value is a weighted average of all possible values.
In other words, each possible value the random variable can assume is multiplied by its assigned weight, and the resulting products are then added together to find the expected value. The weights used in computing this average are the probabilities in the case of a discrete random variable (that is, a random variable that can take on only certain separate values, such as a roll of a pair of dice), or the values of a probability density function in the case of a continuous random variable (that is, a random variable that can assume any value within a range, such as the height of a person).

Expected Value

If the outcomes of an experiment have values E1, E2, E3, E4, . . . , En, then the Expected Value of the experiment is

$$E_1 \cdot P(E_1) + E_2 \cdot P(E_2) + E_3 \cdot P(E_3) + \cdots + E_n \cdot P(E_n)$$

In other words, Expected Value = the sum of all the outcomes multiplied by their respective probabilities. For example, for one roll of a fair die the expected value is 1(1/6) + 2(1/6) + 3(1/6) + 4(1/6) + 5(1/6) + 6(1/6) = 3.5.

What is a Confidence Interval?

Statisticians use a confidence interval to describe the amount of uncertainty associated with a sample estimate of a population parameter.

How to Interpret Confidence Intervals

Suppose that a 90% confidence interval states that the population mean is greater than 100 and less than 200. How would you interpret this statement?

Some people think this means there is a 90% chance that the population mean falls between 100 and 200. This is incorrect. Like any population parameter, the population mean is a constant, not a random variable. It does not change. The probability that a constant falls within any given range is always 0.00 or 1.00.

The confidence level describes the uncertainty associated with a sampling method. Suppose we used the same sampling method to select different samples and to compute a different interval estimate for each sample. Some interval estimates would include the true population parameter and some would not.
A 90% confidence level means that we would expect 90% of the interval estimates to include the population parameter; a 95% confidence level means that 95% of the intervals would include the parameter; and so on.

Hypothesis Testing

Hypothesis testing is the use of statistics to assess how consistent the observed data are with a given hypothesis. The usual process of hypothesis testing consists of four steps.

1. Formulate the null hypothesis (commonly, that the observations are the result of pure chance) and the alternative hypothesis (commonly, that the observations show a real effect combined with a component of chance variation).
2. Identify a test statistic that can be used to assess the truth of the null hypothesis.
3. Compute the P-value, which is the probability that a test statistic at least as extreme as the one observed would be obtained assuming that the null hypothesis were true. The smaller the P-value, the stronger the evidence against the null hypothesis.
4. Compare the P-value to an acceptable significance value $\alpha$ (sometimes called an alpha value). If $P \le \alpha$, the observed effect is statistically significant, and the null hypothesis is rejected in favor of the alternative hypothesis.
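The four steps above can be sketched in Python with a coin-flipping example. Everything specific here is an assumption made for illustration: the data (16 heads in 20 flips), the significance level of 0.05, the one-sided alternative that the coin favors heads, and the function names. The test statistic is the number of heads, and the P-value is computed from the binomial formula given earlier.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable: C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def one_sided_p_value(observed, n, p_null):
    """Probability of at least `observed` successes if the null hypothesis holds."""
    return sum(binomial_pmf(x, n, p_null) for x in range(observed, n + 1))

# Step 1: null hypothesis = the coin is fair (p = 0.5);
#         alternative = the coin favors heads.
# Step 2: test statistic = number of heads in n flips.
n, observed, p_null = 20, 16, 0.5  # hypothetical data, not from the text
alpha = 0.05                       # assumed significance level

# Step 3: compute the P-value under the null hypothesis.
p_value = one_sided_p_value(observed, n, p_null)

# Step 4: compare the P-value with alpha.
print(f"P-value = {p_value:.4f}")
print("statistically significant" if p_value <= alpha else "not statistically significant")
```

With these assumed numbers the P-value is about 0.006, which is below 0.05, so the result would be called statistically significant at that level. A two-sided alternative (the coin is unfair in either direction) would need the probabilities in both tails.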