Random Variables
A random variable is a numerical characteristic of each event in a sample space or,
equivalently, of each individual in a population. Examples include the number of correct
answers when guessing on a multiple choice exam or the amount of money one spends on
a weekend.
Random variables are classified into two types: discrete or continuous. A
discrete random variable has a countable set of distinct possible values, while a
continuous random variable is such that any value (to any number of decimal places)
within some interval is a possible value. A more practical way to tell them apart is
that discrete random variables are counted and continuous random variables are
measured. For instance, the number of beer bottles in a case of beer is discrete, but the
volume of beer in each bottle, in ounces, is continuous (examine each bottle; does each
contain exactly the same amount?).
Probability Distributions
Consider the 2010 World Cup. The distribution of the total number of goals scored per
game, across all games played, was as follows:
X = goals    P(X = x)
0            0.11
1            0.26
2            0.20
3            0.22
4            0.11
5            0.08
6            0.00
7            0.02
Questions:
1. What total number of goals was most likely to occur?
Answer: 1, since it had the highest probability, i.e. P(X = 1) was 0.26.
2. Are the total goals scored mutually exclusive?
Answer: Yes. For instance, the total goals for a single game could not be both 4
and 5.
3. Are the goals scored independent?
Answer: No, they are not. Mutually exclusive events that each have a nonzero
probability are always dependent. Consider the probability of 2 goals in a game and the
probability of 3 goals in a game. If you knew that the game ended with 3 goals,
what is the probability that the game ended with 2 goals? Since you know, i.e. are given,
that 3 goals were scored, the probability of 2 goals being scored is 0. This P(2|3) = 0
does not equal P(2) = 0.20, and from the probability rules, for these two events to be
independent we would need P(2|3) = P(2), which is not the case!
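The independence check in question 3 can be sketched in a few lines of Python. This is only an illustration: the names `pmf`, `p_2_and_3`, and `p_2_given_3` are assumptions made here, and the probabilities are copied from the table above.

```python
# Distribution of total goals per game, copied from the table above.
pmf = {0: 0.11, 1: 0.26, 2: 0.20, 3: 0.22, 4: 0.11, 5: 0.08, 6: 0.0, 7: 0.02}

# A single game cannot end with both 2 and 3 total goals,
# so the joint probability of the two events is 0.
p_2_and_3 = 0.0

# Conditional probability: P(2 | 3) = P(2 and 3) / P(3).
p_2_given_3 = p_2_and_3 / pmf[3]

# Independence would require P(2 | 3) to equal P(2).
independent = p_2_given_3 == pmf[2]

print(p_2_given_3)   # 0.0
print(pmf[2])        # 0.2
print(independent)   # False
```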
4. What do all of the probabilities sum to?
Answer: They sum to one.
5. How would we find the probability for 6 goals being scored if this was not given?
Answer: We would add up the known probabilities and then subtract this sum from
one.
6. What is the probability that for a randomly selected game the total goals scored was
5 or more?
Answer: This is asking to find P(X >= 5) = P(5 or 6 or 7) = P(5) + P(6) + P(7) =
0.08 + 0.0 + 0.02 = 0.10. Alternatively, we could use the complement rule and find this
from 1 – P(X < 5) = 1 – P(0 or 1 or 2 or 3 or 4) = 1 – (0.11 + 0.26 + 0.20 + 0.22 + 0.11) =
1 – 0.90 = 0.10.
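As a quick check of question 6, the two routes (direct sum and complement rule) can be compared in Python. This is a minimal sketch; the dictionary name `pmf` and the variable names are assumptions for illustration, with values taken from the table above.

```python
# Distribution of total goals per game, copied from the table above.
pmf = {0: 0.11, 1: 0.26, 2: 0.20, 3: 0.22, 4: 0.11, 5: 0.08, 6: 0.0, 7: 0.02}

# Direct route: add the probabilities of 5, 6, and 7 goals.
p_direct = sum(p for x, p in pmf.items() if x >= 5)

# Complement route: 1 - P(X < 5).
p_complement = 1 - sum(p for x, p in pmf.items() if x < 5)

# Round to two decimals to hide floating-point noise.
print(round(p_direct, 2), round(p_complement, 2))  # 0.1 0.1
```

The complement route pays off when the event of interest covers many outcomes and its complement covers only a few.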
7. Looking at this distribution of goals scored, about what number of goals would you
expect to see on average?
Answer: Given how the probabilities are weighted across the goal totals, you should
expect the mean number of goals to be between 2 and 3.
8. What is the mean or expected value?
Answer: The typical method is to add up all of the values and divide by the
number of values summed, but applying that method to the list of possible outcomes
assumes that all outcomes are equally likely. Here that is not the case. Instead, the
average is found by weighting the outcomes, so that higher probability outcomes
influence the mean more than lower probability outcomes. To find the expected value of a
discrete probability distribution, multiply each outcome by its respective probability
and then sum these results:
$E(X) = \sum X_i P(X_i) = (0)(0.11) + (1)(0.26) + (2)(0.20) + (3)(0.22) + (4)(0.11) + (5)(0.08) + (6)(0.0) + (7)(0.02) = 2.3$,
or somewhere between 2 and 3.
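The weighted-average calculation can be sketched in Python as follows. The names `pmf`, `expected_value`, and `unweighted_mean` are illustrative assumptions; the probabilities are those in the table above.

```python
# Distribution of total goals per game, copied from the table above.
pmf = {0: 0.11, 1: 0.26, 2: 0.20, 3: 0.22, 4: 0.11, 5: 0.08, 6: 0.0, 7: 0.02}

# Expected value: each outcome times its probability, then summed.
expected_value = sum(x * p for x, p in pmf.items())

# For contrast: the plain average of the possible outcomes 0..7,
# which wrongly treats every outcome as equally likely.
unweighted_mean = sum(pmf) / len(pmf)

print(round(expected_value, 2))  # 2.3
print(unweighted_mean)           # 3.5
```

The gap between 2.3 and 3.5 shows how much the low-probability high scores would distort the mean if every outcome were treated as equally likely.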
9. Since you would not expect games to end up with the same score (and obviously no
game can have a total of 2.3 goals scored), there is some variability. How do we
calculate this standard deviation for a discrete probability distribution?
Answer: This is found by taking the square root of the variance, where the variance
is $\mathrm{Var}(X) = \sum X_i^2 P(X_i) - [E(X)]^2$. So the variance here would be found by:
$(0)^2(0.11) + (1)^2(0.26) + (2)^2(0.20) + (3)^2(0.22) + (4)^2(0.11) + (5)^2(0.08) + (6)^2(0.0) + (7)^2(0.02) - (2.3)^2 = 7.78 - 5.29 = 2.49$
So the standard deviation is the square root of 2.49, or 1.58.
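The variance shortcut used in question 9 (mean of the squares minus the square of the mean) can be verified with a short Python sketch. Variable names here are assumptions for illustration; the data come from the table above.

```python
import math

# Distribution of total goals per game, copied from the table above.
pmf = {0: 0.11, 1: 0.26, 2: 0.20, 3: 0.22, 4: 0.11, 5: 0.08, 6: 0.0, 7: 0.02}

# E(X): probability-weighted sum of the outcomes.
mean = sum(x * p for x, p in pmf.items())

# E(X^2): probability-weighted sum of the squared outcomes.
mean_of_squares = sum(x**2 * p for x, p in pmf.items())

# Var(X) = E(X^2) - [E(X)]^2, and the standard deviation is its square root.
variance = mean_of_squares - mean**2
std_dev = math.sqrt(variance)

print(round(mean_of_squares, 2))  # 7.78
print(round(variance, 2))         # 2.49
print(round(std_dev, 2))          # 1.58
```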
Binomial Random Variable
A specific type of discrete random variable is a binomial random variable. For a binomial
random variable to exist, the following conditions MUST be met:
• There are a fixed number of trials (a fixed sample size).
• On each trial, the event of interest either occurs or does not, i.e. there are only two
possible outcomes.
• The probability of occurrence (or not) is the same on each trial.
• Trials are independent of one another.
Suppose our interest is simply whether or not a game was a shutout (i.e. no goals were
scored). This has two outcomes: 0 goals or more than 0 goals. If we randomly select
three games and want to find the probability that exactly one of the three was a
shutout, can the number of shutouts among the three games be considered a binomial
random variable?
Answer: We first have to check the four conditions. Is there a fixed
number of trials? Yes, we have 3 trials. On each trial are there only two possible
outcomes? Yes, either the game had no goals or goals were scored. Is the
probability of the event the same on each trial? Yes, the probability of a
shutout is 0.11 for each game. Finally, are the trials independent? Yes, whether one
game was scoreless would not affect whether the other games were scoreless. Since all
conditions are satisfied, we have a binomial situation.
1. What is the probability that exactly one game of three was scoreless?
Answer: Let S = shutout and N = not a shutout. The outcomes in the sample space
with exactly one shutout are SNN, NSN, and NNS. Since the games are independent,
P(S and N and N) = P(S)*P(N)*P(N) = 0.11*0.89*0.89 = 0.087, and the probabilities of
the other two outcomes are identical. Therefore, the probability that exactly one of
these three games was a shutout is 0.087 + 0.087 + 0.087 = 0.261.
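The enumeration in this answer can be reproduced by listing every outcome for three games and keeping those with exactly one shutout. This is a minimal sketch; the names `prob`, `favorable`, and `total` are assumptions made for illustration.

```python
from itertools import product

p_shutout = 0.11  # P(0 goals) from the table above
prob = {"S": p_shutout, "N": 1 - p_shutout}

favorable = []  # outcomes with exactly one shutout
total = 0.0     # running sum of their probabilities

# Enumerate all 2^3 sequences of S (shutout) and N (not a shutout).
for outcome in product("SN", repeat=3):
    if outcome.count("S") == 1:
        # Games are independent, so multiply the per-game probabilities.
        p = prob[outcome[0]] * prob[outcome[1]] * prob[outcome[2]]
        favorable.append("".join(outcome))
        total += p

print(favorable)        # ['SNN', 'NSN', 'NNS']
print(round(total, 3))  # 0.261
```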
2. Is there an easier way to calculate this especially if we had a larger fixed number of
trials?
Answer: Yes, we can use the binomial formula. If we let n = the number of trials,
x = the number of outcomes of interest, and p = the probability of the outcome of
interest on a single trial, then:
$$P(X = x) = \frac{n!}{x!(n - x)!} p^x (1 - p)^{n - x}$$
and from our example in number 1 above:
$$P(X = 1) = \frac{3!}{1!(3 - 1)!} (0.11)^1 (1 - 0.11)^{3 - 1} = (3)(0.11)(0.89)^2 = 3(0.087) = 0.261$$
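The binomial formula translates directly into Python using `math.comb` from the standard library (Python 3.8+) for the n!/(x!(n-x)!) term. The function name `binomial_pmf` is a hypothetical helper introduced here for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x outcomes of interest in n independent trials."""
    # comb(n, x) counts the orderings; p**x * (1-p)**(n-x) is the
    # probability of any single ordering.
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Exactly one shutout in three games, with P(shutout) = 0.11.
p_one_shutout = binomial_pmf(1, 3, 0.11)
print(round(p_one_shutout, 3))  # 0.261

# Sanity check: the probabilities for x = 0, 1, 2, 3 should sum to one.
total = sum(binomial_pmf(x, 3, 0.11) for x in range(4))
print(round(total, 6))  # 1.0
```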
3. What are the mean and standard deviation for a binomial random variable?
Answer: The mean or expected value is found by taking n*p, so the mean
would be 3*0.11 = 0.33. The standard deviation is found by
$$\sqrt{np(1 - p)} = \sqrt{3(0.11)(1 - 0.11)} = 0.542$$
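These shortcut formulas can be cross-checked against the general expected-value and variance formulas for a discrete distribution. The sketch below assumes the same n = 3 and p = 0.11 as the shutout example; the variable names are illustrative.

```python
import math
from math import comb

n, p = 3, 0.11  # three games, P(shutout) = 0.11

# Binomial shortcuts.
mean = n * p
std_dev = math.sqrt(n * p * (1 - p))

# Cross-check using the general discrete formulas on the binomial probabilities.
pmf = {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}
mean_general = sum(x * q for x, q in pmf.items())
var_general = sum(x**2 * q for x, q in pmf.items()) - mean_general**2

print(round(mean, 2), round(std_dev, 3))                          # 0.33 0.542
print(round(mean_general, 2), round(math.sqrt(var_general), 3))   # 0.33 0.542
```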