STA 291 Summer 2010

Lecture 6
Dustin Lueker

Coefficient of variation (CV): a standardized measure of variation
◦ Idea
 A standard deviation of 10 may indicate great
variability or small variability, depending on the
magnitude of the observations in the data set

CV = standard deviation divided by the mean
◦ Population and sample version
STA 291 Summer 2010 Lecture 6

Which sample has higher relative variability?
(a higher coefficient of variation)
◦ Sample A
 mean = 62
 standard deviation = 12
 CV =
◦ Sample B
 mean = 31
 standard deviation = 7
 CV =
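As a worked check of the comparison above, the following Python sketch computes CV = standard deviation / mean for both samples. The function name `coefficient_of_variation` is illustrative and does not come from the lecture.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the coefficient of variation: standard deviation divided by mean."""
    return std_dev / mean


cv_a = coefficient_of_variation(12, 62)  # Sample A
cv_b = coefficient_of_variation(7, 31)   # Sample B

print(f"CV of Sample A: {cv_a:.3f}")
print(f"CV of Sample B: {cv_b:.3f}")
print("Higher relative variability: Sample", "A" if cv_a > cv_b else "B")
```

With these numbers Sample B has the larger CV even though its standard deviation is smaller, because its mean is also much smaller.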

Experiment
◦ Any activity from which an outcome, measurement, or other
such result is obtained

Random (or Chance) Experiment
◦ An experiment with the property that the outcome cannot
be predicted with certainty

Outcome
◦ Any possible result of an experiment

Sample Space
◦ Collection of all possible outcomes of an experiment

Event
◦ A specific collection of outcomes

Simple Event
◦ An event consisting of exactly one outcome
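Examples: for each experiment, give the sample space and an event
1. Experiment: Flip a coin
   Sample Space:
   Event:
2. Experiment: Flip a coin 3 times
   Sample Space:
   Event:
3. Experiment: Roll a die
   Sample Space:
   Event:
4. Experiment: Draw an SRS of size 50 from a population
   Sample Space:
   Event: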
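One way to make the first three experiments concrete is to enumerate their sample spaces as Python sets. This is only a sketch: the events chosen below (a head, exactly two heads, an even roll) are illustrative assumptions, since the slide leaves the event column blank, and the die is assumed to be six-sided.

```python
from itertools import product

coin = ["H", "T"]

# Experiment 1: flip a coin once
sample_space_1 = set(coin)

# Experiment 2: flip a coin 3 times (each outcome is an ordered triple)
sample_space_2 = {"".join(flips) for flips in product(coin, repeat=3)}

# Experiment 3: roll a six-sided die
sample_space_3 = set(range(1, 7))

# An event is a collection of outcomes, i.e. a subset of the sample space.
# These particular events are illustrative choices, not from the lecture.
event_1 = {"H"}                                              # a head
event_2 = {s for s in sample_space_2 if s.count("H") == 2}   # exactly two heads
event_3 = {x for x in sample_space_3 if x % 2 == 0}          # an even number

print(sorted(sample_space_2))
print(sorted(event_2))
print(sorted(event_3))

# Experiment 4 is not enumerated: its sample space is the collection of all
# possible samples of size 50 from the population, which is far too large to list.
```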


Let A denote an event
Complement of an event A
◦ Denoted by A^C, all the outcomes in the sample
space S that do not belong to the event A
◦ P(A^C) = 1 - P(A)

Example
◦ If someone completes 64% of his passes, then what
percentage is incomplete?
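The complement rule applied to this example can be sketched in Python as follows. The function name `complement_probability` is an illustrative assumption; `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction


def complement_probability(p_a):
    """Return P(A^C) = 1 - P(A)."""
    return 1 - p_a


p_complete = Fraction(64, 100)          # P(pass is completed)
p_incomplete = complement_probability(p_complete)

print(f"Incomplete passes: {float(p_incomplete):.0%}")
```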


Let A and B denote two events
Union of A and B
◦ A∪B
◦ All the outcomes in S that belong to at least one of
A or B

Intersection of A and B
◦ A∩B
◦ All the outcomes in S that belong to both A and B

Let A and B be two events in a sample space S
◦ P(A∪B)=P(A)+P(B)-P(A∩B)
 At State U, all first-year students must take chemistry
and math. Suppose 15% fail chemistry, 12% fail math,
and 5% fail both. If a first-year student is selected
at random, what is the probability that the student
failed at least one course?
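The State U question is a direct application of P(A∪B)=P(A)+P(B)-P(A∩B), with A = fails chemistry and B = fails math. A minimal Python sketch, where the function and variable names are illustrative assumptions:

```python
from fractions import Fraction


def union_probability(p_a, p_b, p_a_and_b):
    """Return P(A∪B) = P(A) + P(B) - P(A∩B)."""
    return p_a + p_b - p_a_and_b


p_fail_chem = Fraction(15, 100)   # P(A)
p_fail_math = Fraction(12, 100)   # P(B)
p_fail_both = Fraction(5, 100)    # P(A∩B)

p_fail_at_least_one = union_probability(p_fail_chem, p_fail_math, p_fail_both)
print(f"P(fails at least one course) = {float(p_fail_at_least_one)}")
```

Subtracting P(A∩B) avoids counting the students who fail both courses twice.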


Let A and B denote two events
A and B are Disjoint (mutually exclusive)
events if there are no outcomes common to
both A and B
◦ A∩B=Ø
 Ø = empty set or null set

Let A and B be two disjoint
events in a sample space S
◦ P(A∪B)=P(A)+P(B)
Conditional probability of A given B
◦ P(A|B) = P(A∩B) / P(B), provided P(B) ≠ 0
◦ Note: P(A|B) is read as “the probability that A
occurs given that B has occurred”
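To illustrate the formula, here is a small Python sketch. The example itself is an assumption and not from this slide: a fair six-sided die with A = "roll a 4" and B = "roll an even number", with probabilities obtained by counting equally likely outcomes.

```python
from fractions import Fraction

sample_space = set(range(1, 7))  # fair six-sided die (assumed example)


def prob(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))


def conditional_probability(a, b):
    """Return P(A|B) = P(A∩B) / P(B); undefined when P(B) = 0."""
    p_b = prob(b)
    if p_b == 0:
        raise ValueError("P(B) must be nonzero")
    return prob(a & b) / p_b


A = {4}          # roll a 4
B = {2, 4, 6}    # roll an even number

print("P(A)   =", prob(A))
print("P(A|B) =", conditional_probability(A, B))
```

Knowing that B occurred shrinks the set of possible outcomes from six to three, which is why P(A|B) is larger than P(A) here.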

The probability of an event occurring is
nothing more than a value between 0 and 1
◦ 0 implies the event will never occur
◦ 1 implies the event will always occur

How do we go about figuring out
probabilities?


Assigning probabilities can be difficult
Different approaches to assigning probabilities to
events
◦ Subjective
◦ Objective
 Equally likely outcomes (classical approach)
 Relative frequency

Subjective probability relies on a person to make a judgment on
how likely an event is to occur
◦ Events of interest are usually events that cannot be
replicated easily or cannot be modeled with the
equally likely outcomes approach
 As such, these values will most likely vary from person
to person

The only rule for a subjective probability is
that the probability of the event must be a
value in the interval [0,1]

The equally likely approach usually relies on
symmetry to assign probabilities to events
◦ As such, previous research or experiments are not
needed to determine the probabilities
 Suppose that an experiment has only n outcomes
 The equally likely approach to probability assigns a
probability of 1/n to each of the outcomes
 Further, if an event A is made up of m outcomes then
P(A) = m/n

Selecting a simple random sample of 2
individuals
◦ Each pair has an equal probability of being selected

Rolling a fair die
◦ Probability of rolling a “4” is 1/6
 This does not mean that whenever you roll the die 6
times, you always get exactly one “4”
◦ Probability of rolling an even number
 2,4, & 6 are all even so we have 3 possible outcomes
in the event we want to examine
 Thus the probability of rolling an even number is
3/6 = 1/2
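The rule P(A) = m/n can be checked for these examples with a short Python sketch. The function name `classical_probability` and the five-person population used for the simple random sample are illustrative assumptions, not part of the lecture.

```python
from fractions import Fraction
from itertools import combinations


def classical_probability(event, sample_space):
    """Return m/n: outcomes in the event over outcomes in the sample space."""
    return Fraction(len(event & sample_space), len(sample_space))


die = set(range(1, 7))  # fair six-sided die

p_four = classical_probability({4}, die)
p_even = classical_probability({x for x in die if x % 2 == 0}, die)
print("P(roll a 4)  =", p_four)
print("P(roll even) =", p_even)

# Simple random sample of 2 from a hypothetical population of 5 individuals:
# every pair is equally likely, so each has probability 1 / (number of pairs).
population = ["a", "b", "c", "d", "e"]
pairs = list(combinations(population, 2))
p_each_pair = Fraction(1, len(pairs))
print("Number of pairs:", len(pairs), " P(each pair) =", p_each_pair)
```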

The relative frequency approach borrows from calculus’ concept of the limit
P(A) = lim(n→∞) a/n
◦ We cannot repeat an experiment infinitely many
times so instead we use a ‘large’ n
 Process
 Repeat an experiment n times
 Record the number of times an event A occurs, denote this
value by a
 Calculate the value of a/n
P(A) ≈ a/n

Why a “large” n?
◦ Law of Large Numbers
 As the number of repetitions of a random experiment
increases, the chance that the relative frequency of
occurrence for an event will differ from the true
probability of the event by more than any small
number approaches zero
 Doing a large number of repetitions allows us to
accurately approximate the true probabilities using the
results of our repetitions
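The relative frequency process and the Law of Large Numbers can be illustrated with a small simulation. This is a sketch under stated assumptions: the experiment is a simulated fair six-sided die, the event A is "roll a 4" (true probability 1/6), and the function name `relative_frequency` and the fixed seed are illustrative choices.

```python
import random


def relative_frequency(n, seed=0):
    """Repeat the experiment n times and return a/n for the event 'roll a 4'."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    a = sum(1 for _ in range(n) if rng.randint(1, 6) == 4)
    return a / n


true_probability = 1 / 6
for n in (10, 100, 1_000, 10_000, 100_000):
    estimate = relative_frequency(n)
    print(f"n = {n:>7}: a/n = {estimate:.4f}, "
          f"difference from 1/6 = {abs(estimate - true_probability):.4f}")
```

For small n the estimate can be far from 1/6; as n grows, large deviations become increasingly unlikely, which is what the Law of Large Numbers describes.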

X is a random variable if the value that X will
assume cannot be predicted with certainty
◦ That’s why it’s called random

Two types of random variables
◦ Discrete
 Can only assume a finite or countably infinite number
of different values
◦ Continuous
 Can assume all the values in some interval

Are the following random variables discrete
or continuous?
◦ X = number of houses sold by a real estate
developer per week
◦ X = weight of a child at birth
◦ X = time required to run 800 meters
◦ X = number of heads in ten tosses of a coin

Probability distribution of a discrete random variable X: a list
of the possible values of X, say xᵢ, and the probability
associated with each, P(X=xᵢ)
◦ All probabilities must be nonnegative: 0 ≤ P(xᵢ) ≤ 1
◦ Probabilities sum to 1: Σ P(xᵢ) = 1

X      0    1    2    3    4    5    6    7
P(X)   .1   .2   .2   .15  .1   .05  .05  .15
The table above gives the proportion of
employees who use X number of sick days in
a year
◦ An employee is to be selected at random
 Let X = # of days of leave




P(X=2) =
P(X≥4) =
P(X<4) =
P(1≤X≤6) =
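The four probabilities above can be read off the table by adding the P(X) entries for the values of X that satisfy each condition. A Python sketch of that calculation, where the helper name `prob` is an illustrative assumption and `Fraction` is used only to avoid floating-point rounding:

```python
from fractions import Fraction

# Sick-day distribution from the table: value of X -> P(X)
values = range(8)
probs = ["0.1", "0.2", "0.2", "0.15", "0.1", "0.05", "0.05", "0.15"]
distribution = {x: Fraction(p) for x, p in zip(values, probs)}

# A valid probability distribution must sum to 1
assert sum(distribution.values()) == 1


def prob(condition):
    """Add P(X = x) over every x that satisfies the condition."""
    return sum(p for x, p in distribution.items() if condition(x))


p_eq_2 = prob(lambda x: x == 2)
p_ge_4 = prob(lambda x: x >= 4)
p_lt_4 = prob(lambda x: x < 4)
p_1_to_6 = prob(lambda x: 1 <= x <= 6)

for label, p in [("P(X=2)", p_eq_2), ("P(X≥4)", p_ge_4),
                 ("P(X<4)", p_lt_4), ("P(1≤X≤6)", p_1_to_6)]:
    print(f"{label} = {float(p)}")
```

Note that P(X<4) and P(X≥4) are complements, so they add to 1.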

Expected Value (or mean) of a random
variable X
◦ Mean = E(X) = μ = Σ xᵢ P(X=xᵢ)

Example
X      2    4    6    8    10   12
P(X)   .1   .05  .4   .25  .1   .1
◦ E(X) =
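The expected value for this table is the probability-weighted sum of the values. A minimal Python sketch, where the function name `expected_value` is an illustrative assumption:

```python
from fractions import Fraction

# Distribution from the example table: value of X -> P(X)
distribution = {
    2: Fraction("0.1"),
    4: Fraction("0.05"),
    6: Fraction("0.4"),
    8: Fraction("0.25"),
    10: Fraction("0.1"),
    12: Fraction("0.1"),
}


def expected_value(dist):
    """Return E(X) = sum of x * P(X = x) over all values x."""
    return sum(x * p for x, p in dist.items())


mean = expected_value(distribution)
print("E(X) =", float(mean))
```

Term by term this is 0.2 + 0.2 + 2.4 + 2.0 + 1.0 + 1.2.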

Variance
◦ Var(X) = E[(X-μ)²] = σ² = Σ (xᵢ-μ)² P(X=xᵢ)

Example
X      2    4    6    8    10   12
P(X)   .1   .05  .4   .25  .1   .1
◦ Var(X) =
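The variance is the probability-weighted sum of squared deviations from the mean, so the mean has to be computed first. A Python sketch for the same table, where the function names are illustrative assumptions:

```python
from fractions import Fraction

# Distribution from the example table: value of X -> P(X)
distribution = {
    2: Fraction("0.1"),
    4: Fraction("0.05"),
    6: Fraction("0.4"),
    8: Fraction("0.25"),
    10: Fraction("0.1"),
    12: Fraction("0.1"),
}


def expected_value(dist):
    """Return mu = sum of x * P(X = x)."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Return Var(X) = sum of (x - mu)^2 * P(X = x)."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


mu = expected_value(distribution)
var = variance(distribution)
print("mu     =", float(mu))
print("Var(X) =", float(var))
```

With μ = 7, the terms are 2.5 + 0.45 + 0.4 + 0.25 + 0.9 + 2.5.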