L5: Intro To Probability CSCI 3022
I). Recap So Far:
A). Descriptive Statistics
B). Visualizing Data:
1. Univariate Data
Histograms
Box plots
2. Bivariate Data
Bivariate data is data for which there are two variables for each observation.
We can plot stacked histograms or side-by-side boxplots to visualize, but this won’t always reveal relationships between the two variables.
Ex). Histograms from a sample of spousal ages for 282 American couples:
We can learn much more by displaying the bivariate data in a graphical form that
maintains the pairing. The figure below shows a scatter plot of the paired ages. The x-axis represents the age of the husband and the y-axis the age of the wife:
II). Why Probability?
Over the last few lessons we explored descriptive statistics.
Descriptive statistics help us understand the collective properties of the elements of a data sample and form the basis for
testing hypotheses and making predictions using inferential statistics.
Inferential statistics is built on the foundation of probability theory, so we turn our attention to probability!
Poll 1:
Inside a bag there is one marble, which is equally likely to be red or purple. You add a red marble, shake the bag, and take out a
marble at random. It’s red. What’s the probability that the remaining marble is red?
Poll 2:
Linda is 31 years old, outspoken, very bright and has a degree in philosophy. As a student, she was deeply concerned with the issues
of discrimination and social justice and has participated in protests.
Which is more probable?
A. Linda is a banker
B. Linda is a banker and donates to the Equal Justice Initiative
III). Defining Probability
A). Review of Set Notation:
B). Probability Terminology
C). Probability Functions
You can think of probability as a numerical measure of uncertainty. Exactly what this means is the subject of considerable
philosophical debate which we will touch on from time to time. For now, it is reassuring to note that almost all sides in the debate
agree on some basic computational principles.
Ex: A standard 52-card deck consists of 13 cards in each of four suits: hearts, diamonds,
spades, and clubs.
Suppose you draw a single card at random from a standard 52-card deck.
a). What is the probability that the card is the Ace of Diamonds?
b). What is the probability that the card is an Ace or a Diamond?
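One way to check answers to questions like these is to enumerate the deck and count, assuming every card is equally likely to be drawn. The sketch below is illustrative only; the names `DECK` and `prob` are made up for this example and are not part of the lecture.

```python
from fractions import Fraction
from itertools import product

# Build the 52-card deck as (rank, suit) pairs.
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "spades", "clubs"]
DECK = list(product(RANKS, SUITS))

def prob(event):
    """Probability of an event (a predicate on cards), assuming equally likely cards."""
    favorable = sum(1 for card in DECK if event(card))
    return Fraction(favorable, len(DECK))

# a). Exactly one card is the Ace of Diamonds.
p_ace_of_diamonds = prob(lambda card: card == ("A", "diamonds"))

# b). "Ace or Diamond" counts 4 aces + 13 diamonds - 1 card counted twice = 16 cards.
p_ace_or_diamond = prob(lambda card: card[0] == "A" or card[1] == "diamonds")

print(p_ace_of_diamonds)  # 1/52
print(p_ace_or_diamond)   # 4/13
```

Counting with a predicate avoids double-counting the Ace of Diamonds, which is the usual mistake in part b).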
DEFINITION:
Comment: It’s a great idea to think of probability as a measure like length, area, or volume. Indeed, advanced classes in
probability are based on an area of mathematics called “measure theory”.
D). The Philosophy of Probability:
What do we mean by “the probability that this coin lands on heads is 0.5”?
Does it mean that the coin has a certain physical property that causes it to land on heads 50% of the time?
Does it mean that the coin exhibits certain behaviors in the long run?
Or does probability refer to something about my subjective degree of belief that the coin will land on heads?
To further complicate things, what does it mean to say that “the probability that Democrats win the US presidency in 2024 is 0.4”?
(What physical property could “probability” refer to here? There is no “long run” for a single event like an election…)
The fact that there doesn’t seem to be a single clear correct answer to the question “what kind of thing is probability?” is philosophically
interesting. But it is also statistically and scientifically interesting, because so many issues in statistics and science arise from the
different, plausible interpretations of probability theory.
Some Interpretations of Probability:
Classical Interpretation: Symmetrical outcomes:
Pros:
• Applies easily to fair coin, die, and card examples
Cons:
• Doesn’t handle situations when outcomes are not equally likely
• Doesn’t handle situations when the number of possible outcomes is infinite
Objective (Frequentist view) Interpretation:
Defines probabilities as relative frequencies. So, what occurs in the long run is the probability.
Ex: Saying that “there’s a 10% probability of it raining today” means if we keep track of all days on which it is forecast to rain with probability 10%, then in the long run the fraction of days on which it actually does rain gets closer and closer to 10%.
Subjective (sometimes called Bayesian) Interpretation:
Defines probabilities as subjective degrees of belief.
Ex: Saying that “there’s a 10% probability of it raining today” means that someone or some group (e.g., the people who constructed the model) believes with 10% confidence that it will rain today. Subjectivists often justify their views based on what people would be willing to bet (and avoiding so-called Dutch Book Arguments).
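The long-run relative frequency idea behind the frequentist reading can be illustrated with a small simulation. This is only a sketch under an idealized assumption: each "forecast day" is modeled as an independent event that occurs with probability 0.10. The function name `relative_frequency` and the fixed seed are choices made for this example.

```python
import random

def relative_frequency(p, n_trials, seed=0):
    """Simulate n_trials independent events of probability p; return the fraction that occur."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(1 for _ in range(n_trials) if rng.random() < p)
    return hits / n_trials

# The observed fraction of "rainy" days should settle near 0.10 as the number of days grows.
for n_days in (100, 10_000, 100_000):
    print(n_days, relative_frequency(0.10, n_days))
```

The simulation only illustrates the frequentist reading. It does not settle which interpretation is correct, since the value 0.10 is built into the model.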
We’ve just scratched the surface! If you’re interested in exploring this more, check out STAT4700: Philosophy of Statistics.
E). More Practice with Probability:
Ex). You take a random survey of 100 CU students and ask them if they use Facebook and/or Twitter.
80 of the students report they use Twitter, 40 of the students report that they use Facebook and 3 students report they don’t
use either. Suppose you choose a name at random from the list you surveyed.
a). What’s the probability that this person does not use Facebook?
b). What’s the probability that this person uses Facebook or Twitter?
c). What’s the probability that this person uses Facebook and Twitter?
d). What’s the probability that this person does not use Facebook given that they do use Twitter?
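The four parts can be worked through with inclusion-exclusion and the definition of conditional probability. The sketch below uses only the counts given in the example; the variable names are invented for illustration.

```python
from fractions import Fraction

# Counts stated in the example.
n_total = 100
n_twitter = 80
n_facebook = 40
n_neither = 3

# Students who use at least one platform, then inclusion-exclusion for both.
n_either = n_total - n_neither              # 97
n_both = n_twitter + n_facebook - n_either  # 23

p_not_facebook = Fraction(n_total - n_facebook, n_total)  # a). complement rule
p_either = Fraction(n_either, n_total)                    # b). Facebook or Twitter
p_both = Fraction(n_both, n_total)                        # c). Facebook and Twitter
# d). P(not Facebook | Twitter) = (Twitter users without Facebook) / (Twitter users)
p_not_fb_given_tw = Fraction(n_twitter - n_both, n_twitter)

print(p_not_facebook)     # 3/5
print(p_either)           # 97/100
print(p_both)             # 23/100
print(p_not_fb_given_tw)  # 57/80
```

Part d) restricts the sample space to the 80 Twitter users, of whom 80 - 23 = 57 do not use Facebook.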
Back to our poll:
Inside a bag there is one marble, which is equally likely to be red or purple. You add a red marble, shake the bag, and take out a
marble at random. It’s red. What’s the probability that the remaining marble is red?
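One way to settle the poll is to list every equally likely scenario and keep only those in which a red marble was drawn. The sketch below assumes the original marble is red or purple with probability 1/2 each and that either marble in the bag is equally likely to be drawn; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Four equally likely scenarios: (colour of original marble) x (which marble is drawn).
outcomes = []
for original, drawn_index in product(["red", "purple"], [0, 1]):
    bag = [original, "red"]           # the added marble is always red
    drawn = bag[drawn_index]
    remaining = bag[1 - drawn_index]
    outcomes.append((drawn, remaining))

# Condition on the observation: the drawn marble is red.
drew_red = [o for o in outcomes if o[0] == "red"]
remaining_red = [o for o in drew_red if o[1] == "red"]

p_remaining_red = Fraction(len(remaining_red), len(drew_red))
print(p_remaining_red)  # 2/3
```

Three of the four scenarios produce a red draw, and in two of those the remaining marble is red, so the answer is 2/3 rather than the tempting 1/2.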
Ex). A bit string of length four is generated at random so that each of the 16 possible bit-strings is equally
likely. What is the probability that it contains at least two consecutive 1s, given that the first bit is a 1?
{1111, 1110, 1101, 1011, 0111, 1100, 1010, 1001, 0110, 0011, 0101, 1000, 0100, 0010, 0001, 0000}
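Because all 16 strings are equally likely, the conditional probability reduces to counting within the strings that start with 1. A minimal sketch of that count, with illustrative variable names:

```python
from fractions import Fraction
from itertools import product

# All 16 bit strings of length four.
strings = ["".join(bits) for bits in product("01", repeat=4)]

# Condition on the first bit being 1, then look for "11" anywhere in the string.
first_is_one = [s for s in strings if s[0] == "1"]
has_consecutive_ones = [s for s in first_is_one if "11" in s]

p = Fraction(len(has_consecutive_ones), len(first_is_one))
print(sorted(has_consecutive_ones))  # ['1011', '1100', '1101', '1110', '1111']
print(p)                             # 5/8
```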
3). Multiplication Rule
The definition of conditional probability, P(A | B) = P(A ∩ B) / P(B), yields the following result: P(A ∩ B) = P(A | B) P(B) = P(B | A) P(A).
Ex). I deal two cards at random without replacement from a standard 52-card deck.
a). What is the probability that both cards are hearts?
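The two-card question can be answered with the multiplication rule, P(both hearts) = P(first is a heart) × P(second is a heart | first is a heart), and cross-checked by counting unordered pairs. The sketch below assumes a standard deck with 13 hearts; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# Multiplication rule: after one heart is removed, 12 hearts remain among 51 cards.
p_first_heart = Fraction(13, 52)
p_second_heart_given_first = Fraction(12, 51)
p_both_hearts = p_first_heart * p_second_heart_given_first

# Cross-check by counting: pairs of hearts divided by all pairs of cards.
p_by_counting = Fraction(comb(13, 2), comb(52, 2))

print(p_both_hearts)  # 1/17
print(p_by_counting)  # 1/17
```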
Ex). Suppose you deal five cards from a well-shuffled deck.
a). What is the chance that all five cards are hearts?
b). What is the probability that all five cards are of the same suit?
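The same multiplication rule extends to five cards by chaining conditional probabilities, one factor per card dealt. The helper `prob_all_from_one_suit` below is a hypothetical name introduced for this sketch, which assumes a standard 52-card deck with 13 cards per suit.

```python
from fractions import Fraction

def prob_all_from_one_suit(n_cards, suit_size=13, deck_size=52):
    """P(the first n_cards dealt all come from one specified suit), via the multiplication rule."""
    p = Fraction(1)
    for k in range(n_cards):
        # Given that k cards of the suit are already dealt, suit_size - k of them
        # remain among deck_size - k cards.
        p *= Fraction(suit_size - k, deck_size - k)
    return p

# a). All five cards are hearts.
p_five_hearts = prob_all_from_one_suit(5)

# b). The four "all one suit" events are mutually exclusive and equally likely,
#     so their probabilities add.
p_same_suit = 4 * p_five_hearts

print(p_five_hearts)  # 33/66640
print(p_same_suit)    # 33/16660
```

With `n_cards=2` the same function reproduces the two-card answer of 1/17.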