Download Concepts in Probability - Glasgow Independent Schools

Concepts in Probability
The study of probability mostly deals with combining different events and studying these events
alongside each other. How these different events relate to each other determines the methods and
rules to follow when we're studying their probabilities.
Events can be divided into two major categories: dependent and independent events.
Independent Events
When two events are independent of each other, the probability that one event occurs
in no way affects the probability of the other event occurring. For example, say you
rolled a die and flipped a coin. The probability of getting any number face on the die
in no way influences the probability of getting a head or a tail on the coin.
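The die-and-coin example can be checked by listing every outcome. The sketch below is only an illustration: it assumes a fair six-sided die and a fair coin, and it uses the standard product rule for independent events, P(A and B) = P(A) x P(B), which the text above does not state explicitly. The helper names (`prob`, `is_six`, `is_head`) are made up for this example.

```python
from fractions import Fraction
from itertools import product

# Sample space: every (die face, coin side) pair, all 12 equally likely.
die = [1, 2, 3, 4, 5, 6]
coin = ["H", "T"]
outcomes = list(product(die, coin))

def prob(event):
    """Exact probability of an event given as a true/false test on an outcome."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))

def is_six(o):
    return o[0] == 6

def is_head(o):
    return o[1] == "H"

p_six = prob(is_six)
p_head = prob(is_head)
p_both = prob(lambda o: is_six(o) and is_head(o))

# For independent events the joint probability equals the product.
independent = (p_both == p_six * p_head)
print(p_six, p_head, p_both, independent)
```

Using `Fraction` keeps the arithmetic exact, so the comparison is not affected by floating-point rounding.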
Dependent Events
When two events are dependent, the occurrence of one event changes the probability of
the other event.
For example, suppose you draw two cards from a deck of 52 cards. If your first draw is
an ace and you put it aside, the probability of drawing an ace on the second draw changes
from 4/52 to 3/51, because only three aces remain among the 51 cards that are left.
Conditional Probability
We have already defined dependent and independent events and seen how probability of
one event relates to the probability of the other event.
Having those concepts in mind, we can now look at conditional probability.
Conditional probability deals with further defining dependence of events by looking at
probability of an event given that some other event first occurs.
Conditional probability is denoted by the following:
$P(B \mid A)$
The above is read as the probability that B occurs given that A has already occurred.
The above is mathematically defined as:
$P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)}$, provided that $P(A) > 0$
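As a rough check of the usual definition, P(B | A) = P(A and B) / P(A), the sketch below revisits the two-card example. It takes A to be "the first card is an ace" and B to be "the second card is an ace", and counts every ordered pair of cards. The deck encoding (4 aces marked "A", 48 other cards marked "x") is an assumption made only for this illustration.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical deck encoding: 4 aces and 48 other cards.
deck = ["A"] * 4 + ["x"] * 48

# Every ordered draw of two different cards (52 * 51 pairs), all equally likely.
draws = list(permutations(range(52), 2))

first_ace = [d for d in draws if deck[d[0]] == "A"]
both_aces = [d for d in first_ace if deck[d[1]] == "A"]

p_a = Fraction(len(first_ace), len(draws))        # P(A): first card is an ace
p_a_and_b = Fraction(len(both_aces), len(draws))  # P(A and B): both cards are aces
p_b_given_a = p_a_and_b / p_a                     # P(B | A)

print(p_a, p_a_and_b, p_b_given_a)
```

The result for P(B | A) matches the direct count of 3 aces among the 51 remaining cards.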
What is a Random Variable?
When the numerical value of a variable is determined by a chance event, that variable is
called a random variable.
Discrete vs. Continuous Random Variables
Random variables can be discrete or continuous.


Discrete. Within a range of numbers, discrete variables can take on only certain
values. Suppose, for example, that we flip a coin repeatedly and count the number of heads.
The number of heads will be a value between zero and plus infinity. Within that
range, though, the number of heads can be only certain values. For example, the
number of heads can only be a whole number, not a fraction. Therefore, the
number of heads is a discrete variable. And because the number of heads results
from a random process - flipping a coin - it is a discrete random variable.
Continuous. Continuous variables, in contrast, can take on any value within a
range of values. For example, suppose we randomly select an individual from a
population. Then, we measure the age of that person. In theory, his/her age can
take on any value between zero and plus infinity, so age is a continuous variable.
In this example, the age of the person selected is determined by a chance event;
so, in this example, age is a continuous random variable.
Discrete Variables: Finite vs. Infinite
Some references state that continuous variables can take on an infinite number of values,
but discrete variables cannot. This is incorrect.


In some cases, discrete variables can take on only a finite number of values. For
example, the number of aces dealt in a poker hand can take on only five values: 0,
1, 2, 3, or 4.
In other cases, however, discrete variables can take on an infinite number of
values. For example, the number of coin flips that result in heads could be
infinitely large.
When comparing discrete and continuous variables, it is more correct to say that
continuous variables can always take on an infinite number of values; whereas some
discrete variables can take on an infinite number of values, but others cannot.
Probability Distributions
Uniform distribution (discrete)
In probability theory and statistics, the discrete uniform distribution is a symmetric
probability distribution whereby a finite number of values are equally likely to be
observed; every one of n values has equal probability 1/n. Another way of saying
"discrete uniform distribution" would be "a known, finite number of outcomes equally
likely to happen".
A simple example of the discrete uniform distribution is throwing a fair die. The possible
values are 1, 2, 3, 4, 5, 6, and each time the die is thrown the probability of a given score
is 1/6. If two dice are thrown and their values added, the resulting distribution is no
longer uniform since not all sums have equal probability.
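The claim about two dice can be illustrated by counting. The sketch below assumes fair six-sided dice and tallies how many of the 36 equally likely pairs produce each sum.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# One die: each face has the same probability, 1/6 (discrete uniform).
one_die = {f: Fraction(1, 6) for f in faces}

# Two dice: count how many of the 36 equally likely pairs give each sum.
counts = Counter(a + b for a, b in product(faces, repeat=2))
two_dice = {s: Fraction(c, 36) for s, c in sorted(counts.items())}

for s, p in two_dice.items():
    print(s, p)
```

A sum of 7 can occur in six ways while a sum of 2 can occur in only one, so the probabilities differ and the distribution of the sum is not uniform.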
Geometric Probability Distribution
Geometric Setting
A geometric setting arises when we perform independent trials of the same chance process and
record the number of trials until a particular outcome occurs. The four conditions for a geometric
setting are
-Binary? The possible outcomes of each trial can be classified as “success” or “failure”.
-Independent? Trials must be independent; that is, knowing the result of one trial must not have
any effect on the result of any other trial.
-Trials? The goal is to count the number of trials until the first success occurs.
-Success? On each trial, the probability p of success must be the same.
Geometric Random Variable and Geometric Distribution
The number of trials Y that it takes to get a success in a geometric setting is a geometric random
variable. The probability distribution of Y is a geometric distribution with parameter p, the
probability of a success on any trial. The possible values of Y are 1, 2, 3, ....
Geometric Probability
If Y has the geometric distribution with probability p of success on each trial, the possible values
of Y are 1, 2, 3, .... If k is any one of these values,
$P(Y = k) = (1 - p)^{k-1} \, p$
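The geometric probability formula, (1 - p) raised to the power k - 1 and then multiplied by p, translates directly into a short function. This is a minimal sketch; the function name `geometric_pmf` and the die-rolling example (p = 1/6 for rolling a six) are chosen only for illustration.

```python
def geometric_pmf(k: int, p: float) -> float:
    """P(Y = k): the first success occurs on trial k (k = 1, 2, 3, ...)."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    if not 0 < p <= 1:
        raise ValueError("p must be greater than 0 and at most 1")
    # k - 1 failures, each with probability (1 - p), followed by one success.
    return (1 - p) ** (k - 1) * p

# Example: rolling a fair die until the first six appears (p = 1/6).
p = 1 / 6
for k in range(1, 6):
    print(k, round(geometric_pmf(k, p), 4))

# The probabilities over k = 1, 2, 3, ... add up to 1 (checked on a long prefix).
total = sum(geometric_pmf(k, p) for k in range(1, 500))
```

Each extra trial multiplies the probability by (1 - p), so the most likely value of Y is always 1.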
The Binomial Probability Distribution
A binomial experiment is one that possesses the following properties:
1. The experiment consists of n repeated trials;
2. Each trial results in an outcome that may be classified as a success or a failure
(hence the name, binomial);
3. The probability of a success, denoted by p, remains constant from trial to trial and
repeated trials are independent.
The number of successes X in n trials of a binomial experiment is called a binomial
random variable.
The probability distribution of the random variable X is called a binomial distribution,
and is given by the formula:
$P(X = x) = C^n_x \, p^x q^{n-x}$
where
n = the number of trials
x = 0, 1, 2, ..., n (the number of successes)
p = the probability of success in a single trial
q = the probability of failure in a single trial
(i.e. q = 1 − p)
$C^n_x = \frac{n!}{x!\,(n-x)!}$ is a combination: the number of ways to choose which x of the n trials are successes.
P(X = x) gives the probability of exactly x successes in n binomial trials.
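The binomial formula can be sketched in a few lines, using Python's `math.comb` for the combination. The function name `binomial_pmf` and the example of 10 fair coin flips are assumptions made for illustration only.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    if not 0 <= x <= n:
        raise ValueError("x must be between 0 and n")
    q = 1 - p  # probability of failure in a single trial
    return comb(n, x) * p**x * q ** (n - x)

# Example: number of heads in 10 flips of a fair coin.
n, p = 10, 0.5
dist = [binomial_pmf(x, n, p) for x in range(n + 1)]
for x, prob in enumerate(dist):
    print(x, round(prob, 4))
```

The probabilities for x = 0, 1, ..., n add up to 1, which is a quick sanity check on the formula.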
Bell Curve, Normal Distribution Defined
Definition: The term bell curve is used to describe the mathematical concept called
normal distribution, sometimes referred to as Gaussian distribution. ‘Bell curve’ refers to
the shape that is created when a line is plotted using the data points for an item that meets
the criteria of ‘normal distribution’. The center contains the greatest number of occurrences
of a value and is therefore the highest point on the arc of the line. This point is the
mean; because the curve is symmetrical with a single peak, it is also the most frequently
occurring value (in statistical terms, the mode). The important thing to note about a normal
distribution is that the curve is concentrated in the center and decreases on either side. This
is significant in that the data has less of a tendency to produce unusually extreme values,
called outliers, as compared to many other distributions. Also, the bell curve signifies that
the data is symmetrical, and thus we can create reasonable expectations as to the possibility
that an outcome will lie within a range to the left or right of the center, once we can measure
the amount of deviation contained in the data. This deviation is measured in terms of standard
deviations. A bell curve graph depends on two factors, the mean and the standard deviation.
The mean identifies the position of the center and the standard deviation determines the height
and width of the bell. For example, a large standard deviation creates a bell that is short
and wide while a small standard deviation creates a tall and narrow curve.
To understand the probability factors of a normal distribution you need to understand the
following ‘rules’:
1. The total area under the curve is equal to 1 (100%).
2. About 68% of the area under the curve falls within 1 standard deviation of the mean.
3. About 95% of the area under the curve falls within 2 standard deviations of the mean.
4. About 99.7% of the area under the curve falls within 3 standard deviations of the mean.
Items 2, 3 and 4 are sometimes referred to as the ‘empirical rule’ or the 68-95-99.7 rule.
In terms of probability, once we determine that the data is normally distributed (bell
curved) and we calculate the mean and standard deviation, we are able to determine the
probability that a single data point will fall within a given range of possibilities.
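The 68-95-99.7 figures can be reproduced numerically. The sketch below relies on a standard identity that is not part of the text above: for a normal distribution, the probability of falling within k standard deviations of the mean equals erf(k / sqrt(2)). The mean of 100 and standard deviation of 15 in the second part are hypothetical values picked only to show a range calculation.

```python
from math import erf, sqrt
from statistics import NormalDist

def prob_within(k: float) -> float:
    """Probability that a normal value lies within k standard deviations of the mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))

# Probability that a single data point falls in a given range,
# for a hypothetical normal distribution with mean 100 and sd 15.
dist = NormalDist(mu=100, sigma=15)
p_range = dist.cdf(115) - dist.cdf(85)  # the range is mean +/- 1 sd
print(round(p_range, 4))
```

Because the range 85 to 115 is exactly one standard deviation either side of the mean, `p_range` agrees with `prob_within(1)`.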
Expected value.
In probability theory, the expected value (or expectation, mathematical expectation,
EV, mean, or first moment) refers, intuitively, to the value of a random variable one
would "expect" to find if one could repeat the random process an infinite number
of times and take the average of the values obtained. More formally, the expected value is
a weighted average of all possible values. In other words, each possible value the random
variable can assume is multiplied by its assigned weight, and the resulting products are
then added together to find the expected value. The weights used in computing this
average are the probabilities in the case of a discrete random variable (that is, a random
variable that can take on only certain separate values, such as a roll of a pair of dice),
or the values of a probability density function in the case of a continuous random variable
(that is, a random variable that can assume any value within a range, such
as the height of a person), in which case the sum becomes an integral.
Expected Value
If the outcomes of an experiment have values $E_1, E_2, E_3, \dots, E_n$,
then the Expected Value of the experiment is
$E_1 \cdot P(E_1) + E_2 \cdot P(E_2) + E_3 \cdot P(E_3) + \dots + E_n \cdot P(E_n)$
In other words . . . .
Expected Value = Sum of all the products of the outcomes multiplied by their respective
probabilities.
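The "sum of products" rule is short enough to sketch directly. In the example below, the function name `expected_value`, the fair die, and the simple game (win 5 with probability 1/4, lose 1 with probability 3/4) are all hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def expected_value(outcomes):
    """Expected value of (value, probability) pairs: the sum of value * probability."""
    outcomes = list(outcomes)
    if sum(p for _, p in outcomes) != 1:
        raise ValueError("probabilities must add up to 1")
    return sum(v * p for v, p in outcomes)

# A fair six-sided die: each face 1..6 has probability 1/6.
die = [(face, Fraction(1, 6)) for face in range(1, 7)]
ev_die = expected_value(die)

# A hypothetical game: win 5 with probability 1/4, lose 1 with probability 3/4.
game = [(5, Fraction(1, 4)), (-1, Fraction(3, 4))]
ev_game = expected_value(game)

print(ev_die, ev_game)
```

The die's expected value of 7/2 (3.5) is not a face that can actually be rolled; it is the long-run average over many rolls.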
What is a Confidence Interval?
Statisticians use a confidence interval to describe the amount of uncertainty associated
with a sample estimate of a population parameter.
How to Interpret Confidence Intervals
Suppose that a 90% confidence interval states that the population mean is greater than
100 and less than 200. How would you interpret this statement?
Some people think this means there is a 90% chance that the population mean falls
between 100 and 200. This is incorrect. Like any population parameter, the population
mean is a constant, not a random variable. It does not change. The probability that a
constant falls within any given range is always 0.00 or 1.00.
The confidence level describes the uncertainty associated with a sampling method.
Suppose we used the same sampling method to select different samples and to compute a
different interval estimate for each sample. Some interval estimates would include the
true population parameter and some would not. A 90% confidence level means that we
would expect 90% of the interval estimates to include the population parameter; a 95%
confidence level means that 95% of the intervals would include the parameter; and so on.
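The "repeat the sampling method many times" interpretation can be illustrated with a small simulation. Everything specific in the sketch below is an assumption made for the illustration: the population is taken to be normal with a known standard deviation, the "true" mean of 150 is invented, and the interval used is the standard one for that case (sample mean plus or minus z* x sigma / sqrt(n)), which the text above does not describe.

```python
import random
from statistics import NormalDist, mean

random.seed(0)  # fixed seed so the run is repeatable

MU, SIGMA = 150.0, 40.0   # hypothetical "true" population mean and standard deviation
N, REPEATS = 25, 2000     # sample size and number of repeated samples
LEVEL = 0.90              # confidence level

# Critical value z* that leaves (1 - LEVEL) / 2 in each tail.
z_star = NormalDist().inv_cdf(1 - (1 - LEVEL) / 2)
margin = z_star * SIGMA / N ** 0.5

hits = 0
for _ in range(REPEATS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    centre = mean(sample)
    # Does this sample's interval contain the fixed population mean?
    if centre - margin < MU < centre + margin:
        hits += 1

coverage = hits / REPEATS
print(f"fraction of intervals containing the true mean: {coverage:.3f}")
```

The population mean never changes during the run; only the intervals move from sample to sample, and roughly 90% of them should contain the mean.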
Hypothesis Testing
Hypothesis testing is the use of statistics to assess how strongly the observed data
contradict a given hypothesis. The usual process of hypothesis testing consists of four steps.
1. Formulate the null hypothesis (commonly, that the observations are the result of
pure chance) and the alternative hypothesis (commonly, that the observations show a
real effect combined with a component of chance variation).
2. Identify a test statistic that can be used to assess the truth of the null hypothesis.
3. Compute the P-value, which is the probability that a test statistic at least as extreme
as the one observed would be obtained assuming that the null hypothesis were true. The
smaller the P-value, the stronger the evidence against the null hypothesis.
4. Compare the P-value to an acceptable significance value α (sometimes called an alpha
value). If P ≤ α, the observed effect is statistically significant, the null hypothesis is
rejected, and the alternative hypothesis is supported.
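The four steps can be walked through with a small worked example. The scenario below is hypothetical: a coin is flipped 100 times and lands heads 60 times. The null hypothesis is that the coin is fair, the test statistic is the number of heads, and the P-value is computed exactly from the binomial distribution. The significance value of 0.05 is a common convention, not something fixed by the text above.

```python
from math import comb

def two_sided_p_value(heads: int, flips: int) -> float:
    """Exact P-value under the null hypothesis that the coin is fair (p = 0.5)."""
    expected = flips / 2
    observed_gap = abs(heads - expected)
    # Count every sequence whose head count is at least as far from flips / 2
    # as the observed count; under the null all 2**flips sequences are equally likely.
    favourable = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - expected) >= observed_gap
    )
    return favourable / 2**flips

ALPHA = 0.05                          # step 4: chosen significance value
p_value = two_sided_p_value(60, 100)  # step 3: compute the P-value
reject_null = p_value <= ALPHA

print(round(p_value, 4), reject_null)
```

In this example the P-value comes out slightly above 0.05, so the null hypothesis of a fair coin is not rejected at that level, even though 60 heads may look suspicious.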