SAMPLING – EVALUATING ALGORITHMS
October 3: Lecture 9
Some simple Combinatorics
• In how many ways can we order n objects?
– n! = n · (n-1) · … · 2 · 1
• How do we prove this?
– … induction, anyone?
• This is also called the number of permutations
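A quick sanity check of the counting claim, as a sketch rather than a proof: list every ordering of n objects for small n and compare the count with n!. The helper name count_orderings is made up for this illustration.

```python
from itertools import permutations
from math import factorial

def count_orderings(n):
    """Count the orderings of n distinct objects by listing all of them."""
    return sum(1 for _ in permutations(range(n)))

for n in range(1, 7):
    # The brute-force count should agree with n! for every n tried here.
    assert count_orderings(n) == factorial(n)
    print(n, count_orderings(n))
```

Listing the orderings only works for small n, since the count itself grows as n!.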
Some simple Combinatorics
• In how many ways can we select k objects out of n?
[Figure: three rows of objects numbered 1 to 5]
• The number is C(n, k) = n! / (k! (n-k)!)
• How do we prove this?
Some simple Combinatorics
• In how many ways can we select 2 objects out of 4?
• Number the objects and count distinct pairs
• Take all permutations and count the first pair
1 2 3 4
1 2 4 3
2 1 3 4
2 1 4 3
• Each pair is "overcounted" by a factor of 2! · 2! = 4 (in general, k! (n-k)!)
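The overcounting argument can be checked by brute force. The sketch below walks through all permutations of 4 objects, looks at the first pair of each, and records how often each unordered pair shows up. The function name count_subsets_via_permutations is an assumption made for this illustration.

```python
from itertools import permutations
from math import comb, factorial

def count_subsets_via_permutations(n, k):
    """Map each k-subset to the number of permutations that start with it."""
    seen = {}
    for perm in permutations(range(1, n + 1)):
        first = frozenset(perm[:k])  # order inside the first k entries is ignored
        seen[first] = seen.get(first, 0) + 1
    return seen

counts = count_subsets_via_permutations(4, 2)
print(len(counts))           # number of distinct pairs
print(set(counts.values()))  # how many times each pair was counted
print(factorial(4) // (factorial(2) * factorial(2)), comb(4, 2))
```

Dividing the total number of permutations by the overcounting factor gives the number of distinct pairs.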
Binomial Distribution
Bernoulli Trial
• A rigged coin: head probability p, tail probability 1-p
• Suppose I throw the coin n times.
What is the probability of getting k heads?
[Figure: throws numbered 1 to 5, some marked red and the rest blue]
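• What is the probability that the k red throws are heads
and the n-k blue throws are tails? p^k (1-p)^(n-k)
• Going back to the original question
(consider all k-subsets): P(k heads) = C(n, k) p^k (1-p)^(n-k),
where C(n, k) is the number of ways to select k objects out of n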
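As a sketch of the resulting formula, the function below computes the probability of exactly k heads as (number of k-subsets) times (probability of one particular pattern with k heads). The values n = 5 and p = 0.3 are illustrative and not taken from the lecture.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k heads in n independent throws with head probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 5, 0.3  # illustrative values only
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
print(probs)
print(sum(probs))  # the probabilities over k = 0..n should add up to 1 (up to rounding)
```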
The Bernoullis
Bernoulli
From Wikipedia, the free encyclopedia
Bernoulli can refer to:
any one or more of the Bernoulli family of Swiss mathematicians in the 18th
century, including:
Daniel Bernoulli (1700–1782), developer of Bernoulli's principle
Jacob Bernoulli (1654–1705), also known as James or Jacques, after
whom Bernoulli numbers are named
Johann Bernoulli (1667–1748)
Johann III Bernoulli (1744–1807), also known as Jean, astronomer
Nicolaus I Bernoulli (1687–1759)
Nicolaus II Bernoulli (1695–1726)
Binomial Distribution
by picture
Expected Value
also called mean
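• A die, a random variable X (can be 1…6)
• What is its expected value? E[X] = (1+2+3+4+5+6)/6 = 3.5
• A rigged coin, X_i = 1 with probability p, X_i = 0 with probability 1-p: E[X_i] = p
• In general we sum the products of probabilities * values: E[X] = Σ_x x · P(X = x)
• Binomial: X = X_1 + … + X_n, so E[X] = np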
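The rule "sum the products of probabilities * values" can be sketched directly. The example below uses exact fractions and three distributions: a fair die, a rigged coin, and the number of heads in n throws. The values p = 3/10 and n = 5 are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def expected_value(distribution):
    """Sum of value * probability over a dict {value: probability}."""
    return sum(value * prob for value, prob in distribution.items())

die = {face: Fraction(1, 6) for face in range(1, 7)}
coin = {1: Fraction(3, 10), 0: Fraction(7, 10)}  # p = 3/10 is illustrative

n, p = 5, Fraction(3, 10)
binomial = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

print(expected_value(die))       # 7/2
print(expected_value(coin))      # 3/10
print(expected_value(binomial))  # 3/2, which equals n * p
```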
Variance
measuring the wildness
• OK, a random variable has an expected value. But it can
stay close to it, or it can be spread wildly all over the place!
• This is captured by variance.
• We define another random variable: Y = (X - E[X])^2
• We define Var(X) = E[Y] = E[(X - E[X])^2]
• Standard deviation: σ = sqrt(Var(X))
• Something helpful:
• So: Var(X) = E[X^2] - (E[X])^2
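The shortcut follows from expanding (X - E[X])^2 and using linearity of expectation. As a sketch, the code below computes the variance of a fair die both ways, from the definition E[(X - E[X])^2] and from E[X^2] - (E[X])^2, using exact fractions. The function names are made up for this illustration.

```python
from fractions import Fraction

def expectation(dist, f=lambda x: x):
    """E[f(X)] for a dict {value: probability}."""
    return sum(f(x) * prob for x, prob in dist.items())

def variance_by_definition(dist):
    mean = expectation(dist)
    return expectation(dist, lambda x: (x - mean) ** 2)

def variance_by_shortcut(dist):
    return expectation(dist, lambda x: x * x) - expectation(dist) ** 2

die = {face: Fraction(1, 6) for face in range(1, 7)}
print(variance_by_definition(die))  # 35/12
print(variance_by_shortcut(die))    # 35/12
```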
Variance for Binomial
• A rigged coin, X_i = 1 with probability p, X_i = 0 with probability 1-p
• The variance of X_i: Var(X_i) = E[X_i^2] - (E[X_i])^2 = p - p^2 = p(1-p)
Proof?
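This is not a proof, but a numerical check under assumed values (p = 3/10, n = 5): the variance of a single throw equals p(1-p), and the variance of the number of heads in n independent throws equals n p (1-p), the standard result for the binomial distribution.

```python
from fractions import Fraction
from math import comb

def variance(dist):
    """Var(X) = E[X^2] - (E[X])^2 for a dict {value: probability}."""
    mean = sum(x * q for x, q in dist.items())
    return sum(x * x * q for x, q in dist.items()) - mean ** 2

p, n = Fraction(3, 10), 5  # illustrative values only
single = {1: p, 0: 1 - p}
total = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

print(variance(single) == p * (1 - p))     # single throw
print(variance(total) == n * p * (1 - p))  # n independent throws
```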
The normal distribution
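a continuous probability distribution
• Probability density function: f(x) = (1 / (σ sqrt(2π))) · exp(-(x - μ)^2 / (2σ^2))
• Probabilities make sense only for intervals: any single point has probability 0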
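A minimal sketch of why only intervals carry probability: the density gives a height at each point, while the probability of an interval is the area under the density, computed here with the error function from the standard library. The function names and default parameters (mean 0, standard deviation 1) are assumptions for this illustration.

```python
from math import erf, exp, pi, sqrt

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution at x (a height, not a probability)."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Probability that the variable is at most x."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

def interval_probability(a, b, mu=0.0, sigma=1.0):
    """Probability that the variable falls in [a, b]."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

print(normal_pdf(0.0))                    # height of the density at the mean
print(interval_probability(-1.96, 1.96))  # close to 0.95
print(interval_probability(0.5, 0.5))     # a single point has probability 0
```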
Sampling before polls
• For simplicity, say that everyone is going to vote for either
Democrats or Republicans. Also assume that every voter
in the USA can be reached by phone.
• Problem: Estimate the share of voters supporting each party.
• Solution: Take 1000 people at random and ask them.
• Say 550 say Democrats, 450 say Republicans.
What can we infer?
Sampling before polls
confidence intervals
• With probability 95%, the percentage of Democrats is
in the interval [51.9%, 58.1%] (approximately, using the normal approximation)
• With probability 97.5%, the percentage of Democrats is
at least 51.9%
Confidence Intervals
How do we prove them?
• We have a Binomial Distribution.
• For very large n it looks like a normal distribution
• We can then find the shaded area around the mean, by
calculating an interval
• Concretely we ask what is the interval that covers, say
95% of the total area?
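The steps above can be sketched for the poll with 550 Democrats out of 1000. The code assumes the normal approximation to the binomial and uses z = 1.96, the standard normal value that leaves 95% of the area between -z and +z. Both the helper name normal_approx_interval and the choice of estimating the standard error from the sample are assumptions of this sketch.

```python
from math import sqrt

def normal_approx_interval(successes, n, z=1.96):
    """Interval p_hat ± z * standard error, using the normal approximation."""
    p_hat = successes / n
    se = sqrt(p_hat * (1 - p_hat) / n)  # standard error estimated from the sample
    return p_hat - z * se, p_hat + z * se

low, high = normal_approx_interval(550, 1000)
print(f"{100 * low:.1f}% to {100 * high:.1f}%")  # 51.9% to 58.1%
```

By symmetry, the lower end of the 95% interval is also the value that the true percentage exceeds with probability 97.5%.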
Sampling before polls
• For simplicity, say that everyone is going to vote for either
Democrats or Republicans. Also assume that every voter
in the USA can be reached by phone.
• However, not everyone has a listed phone number. More
women than men keep their numbers unlisted.
• We can do the same experiment as before. But now we
have a biased estimator: the expected value of the estimator
is no longer equal to the true quantity being estimated.
• This is called BIAS.
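A small simulation can make the bias visible. Every number in it is made up for illustration: a hypothetical population of 100,000 voters, half of them women, with women more likely to vote Democrat and less likely to have a listed number. Averaging many polls approximates the expected value of each estimator.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: all probabilities below are invented for this sketch.
voters = []
for _ in range(100_000):
    is_woman = rng.random() < 0.5
    votes_democrat = rng.random() < (0.7 if is_woman else 0.4)
    is_listed = rng.random() < (0.3 if is_woman else 0.9)
    voters.append((votes_democrat, is_listed))

all_votes = [d for d, _ in voters]
listed = [d for d, listed_flag in voters if listed_flag]
true_share = sum(all_votes) / len(all_votes)

def average_poll(pool, sample_size=1000, repeats=200):
    """Mean of many poll estimates, approximating the estimator's expected value."""
    estimates = [sum(rng.sample(pool, sample_size)) / sample_size
                 for _ in range(repeats)]
    return sum(estimates) / repeats

print("true share:        ", true_share)
print("polling everyone:  ", average_poll(all_votes))
print("polling listed only:", average_poll(listed))
```

Polling everyone averages out close to the true share, while polling only listed numbers settles on a different value no matter how many polls are averaged.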
Confidence intervals in general
In general we desire unbiased estimators
with small variance