Random Variables

A random variable is a numerical characteristic of each event in a sample space or, equivalently, of each individual in a population. Examples include the number of correct answers when guessing on a multiple-choice exam, or the amount of money one spends on a weekend. Random variables are classified into two types: discrete and continuous. A discrete random variable has a countable set of distinct possible values, while a continuous random variable can take any value (to any number of decimal places) within some interval. A more readable distinction is that discrete random variables are counted and continuous random variables are measured. For instance, the number of beer bottles in a case of beer is discrete, but the volume in each bottle, measured in ounces, is continuous (examine each bottle: does each hold exactly the same amount?).

Probability Distributions

Consider the 2010 World Cup. The distribution of total goals scored per game, over all games played, was as follows:

X = goals    P(X = x)
0            0.11
1            0.26
2            0.20
3            0.22
4            0.11
5            0.08
6            0.00
7            0.02

Questions:

1. What total number of goals was most likely to occur?
Answer: 1, since it had the highest probability: P(X = 1) = 0.26.

2. Are the total-goals outcomes mutually exclusive?
Answer: Yes. For instance, the total goals for a single game could not be both 4 and 5.

3. Are the total-goals outcomes independent?
Answer: No, they are not. Mutually exclusive events that each have nonzero probability are always dependent. Consider the events "2 goals scored in a game" and "3 goals scored in a game". If you knew that the game ended with 3 goals, what is the probability that it ended with 2 goals? Given that 3 goals were scored, the probability of 2 goals being scored is 0, i.e. P(2|3) = 0. This does not equal P(2) = 0.20, and from the probability rules, for these two events to be independent we would need P(2|3) = P(2), which is not the case!

4. What do all of the probabilities sum to?
Answer: They sum to one.

5. How would we find the probability of 6 goals being scored if it were not given?
Answer: We would add up the known probabilities and then subtract this sum from one.

6. What is the probability that, for a randomly selected game, the total goals scored was 5 or more?
Answer: This asks for P(X >= 5) = P(5 or 6 or 7) = P(5) + P(6) + P(7) = 0.08 + 0.00 + 0.02 = 0.10. Alternatively, we could use the complement rule: 1 - P(X < 5) = 1 - P(0 or 1 or 2 or 3 or 4) = 1 - (0.11 + 0.26 + 0.20 + 0.22 + 0.11) = 1 - 0.90 = 0.10.

7. Looking at this distribution of goals scored, about what number of goals would you expect to see on average?
Answer: Given how the probabilities are weighted, you should expect the mean number of goals to be between 2 and 3.

8. What is the mean or expected value?
Answer: The typical method is to add up all of the observations and divide by the number of observations summed, but applying that method to the list of possible outcomes would assume that all outcomes are equally likely. Here that is not the case. Instead, the mean is found by weighting each outcome by its probability, so that higher-probability outcomes influence the mean more than lower-probability outcomes. The expected value of a discrete probability distribution is found by multiplying each outcome by its respective probability and then summing the results:

$E(X) = \sum_i x_i P(x_i) = (0)(0.11) + (1)(0.26) + (2)(0.20) + (3)(0.22) + (4)(0.11) + (5)(0.08) + (6)(0.00) + (7)(0.02) = 2.3$

which lies between 2 and 3, as expected.

9. Since you would not expect every game to end with the same total (and obviously no game can have a total of 2.3 goals), there is some variability. How do we calculate the standard deviation of a discrete probability distribution?
Answer: It is found by taking the square root of the variance, where the variance is

$\mathrm{Var}(X) = \sum_i x_i^2 P(x_i) - [E(X)]^2$
So the variance here would be found by:

$(0)^2(0.11) + (1)^2(0.26) + (2)^2(0.20) + (3)^2(0.22) + (4)^2(0.11) + (5)^2(0.08) + (6)^2(0.00) + (7)^2(0.02) - (2.3)^2 = 7.78 - 5.29 = 2.49$

So the standard deviation is the square root of 2.49, or about 1.58.

Binomial Random Variable

A specific type of discrete random variable is a binomial random variable. For a binomial random variable to exist, the following conditions MUST be met:

- There are a fixed number of trials (a fixed sample size).
- On each trial, the event of interest either occurs or does not, i.e. there are only two possible outcomes.
- The probability of occurrence (or not) is the same on each trial.
- Trials are independent of one another.

Suppose our interest is simply whether or not a game had no goals scored (i.e. the game was a shutout). This has two outcomes: 0 goals or more than 0 goals. If we randomly select three games and want to find the probability that one of the three games was a shutout, can the number of shutouts among the three games be considered a binomial random variable?

Answer: We would first have to check the four conditions. Is there a fixed number of trials? Yes, we have 3 trials. In each trial are there only two possible outcomes? Yes, either the game had no goals or there were goals scored. Is the probability of the event happening the same for each trial? Yes, the probability of a shutout is 0.11 for each game. Finally, are the trials independent? Yes, whether one game was scoreless would not affect whether the other games were scoreless. Since all conditions are satisfied, we have a binomial situation.

1. What is the probability that exactly one game of the three was scoreless?
Answer: With S = shutout and N = not shutout, the relevant outcomes are SNN, NSN and NNS, as these are the only three possible outcomes in which exactly one game was a shutout.
Since the games are independent, P(S and N and N) = P(S)*P(N)*P(N) = 0.11*0.89*0.89 = 0.087, and the probabilities of the other two outcomes are identical. Therefore, the probability that exactly one of the three games was a shutout is 0.087 + 0.087 + 0.087 = 0.261.

2. Is there an easier way to calculate this, especially if we had a larger fixed number of trials?
Answer: Yes, we can use the binomial formula. If we let n = the number of trials, x = the number of outcomes of interest, and p = the probability of the outcome of interest on a single trial, then:

$P(X = x) = \frac{n!}{x!(n-x)!} p^x (1-p)^{n-x}$

and from our example in number 1 above:

$P(X = 1) = \frac{3!}{1!(3-1)!} (0.11)^1 (1-0.11)^{3-1} = (3)(0.11)(0.89)^2 = 3(0.087) = 0.261$

3. What are the mean and standard deviation of a binomial random variable?
Answer: The mean or expected value is simply n*p, so here the mean is 3*0.11 = 0.33. The standard deviation is found by taking the square root of np(1-p):

$\sqrt{np(1-p)} = \sqrt{3(0.11)(1-0.11)} = 0.542$
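The arithmetic in both sections can be checked with a short script. What follows is a minimal Python sketch, not part of the original notes: the names goal_dist, expected_value, variance and binomial_pmf are illustrative choices, and math.comb assumes Python 3.8 or later.

```python
from math import comb, sqrt

# Goal distribution from the World Cup table: total goals -> probability.
goal_dist = {0: 0.11, 1: 0.26, 2: 0.20, 3: 0.22,
             4: 0.11, 5: 0.08, 6: 0.00, 7: 0.02}

def expected_value(dist):
    """E(X): each outcome times its probability, summed."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Var(X): sum of x^2 * P(x), minus the squared mean."""
    mean = expected_value(dist)
    return sum(x**2 * p for x, p in dist.items()) - mean**2

def binomial_pmf(x, n, p):
    """P(X = x) for n independent trials, each with success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Discrete distribution: mean, standard deviation, and P(X >= 5).
mean_goals = expected_value(goal_dist)
sd_goals = sqrt(variance(goal_dist))
p_five_or_more = sum(p for x, p in goal_dist.items() if x >= 5)

# Binomial example: 3 games, shutout probability 0.11 per game.
n, p = 3, 0.11
p_one_shutout = binomial_pmf(1, n, p)
binom_mean = n * p
binom_sd = sqrt(n * p * (1 - p))

print(round(mean_goals, 2), round(sd_goals, 2), round(p_five_or_more, 2))
print(round(p_one_shutout, 3), round(binom_mean, 2), round(binom_sd, 3))
```

Rounded to the same number of decimal places as the handout, the script should reproduce 2.3, 1.58 and 0.10 for the goal distribution, and 0.261, 0.33 and 0.542 for the binomial example. Changing n makes it easy to see why the formula is preferable to listing outcomes by hand once the number of trials grows.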