Probability and Statistics
Joyeeta Dutta-Moscato
May 24, 2016
There are three kinds of lies: lies, damned lies and statistics
- Mark Twain, attributed to Disraeli
Terms and concepts

Statistics
• Sample vs population
• Central tendency: mean, median, mode
• Variance, standard deviation
• Normal distribution
• Descriptive statistics
• Cumulative distribution

Statistical Hypothesis Testing
• Hypothesis
• Null hypothesis (H0)
• Alternate hypothesis (HA)
• Significance
• P-value
• Confidence interval

Statistical Models
• Method of least squares
• Euclidean distance
• Overfitting & generalization
Central tendency and Spread
• Mean
• Median
• Mode
• Variance, standard deviation
• Normal distribution
http://www.mathsisfun.com/data/standard-normal-distribution.html
But do numbers tell the full story?
Anscombe's Quartet
Good graphics reveal data
https://en.wikipedia.org/wiki/Anscombe's_quartet
Building a model from data
Fitting the data to a model: y = f(x)
Objective: minimize the mean square error
Does mean square error = 0 mean this is the best model?
What does this mean about the relationship between x and y?
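The least-squares objective above can be sketched in a few lines. This is a minimal illustration, not code from the slides: a closed-form ordinary-least-squares fit of a line y = a·x + b, with made-up data points.

```python
# Least-squares fit of a line y = a*x + b, minimizing the mean square error.
# Pure-Python sketch; the data points below are invented for illustration.

def fit_line(xs, ys):
    """Return slope a and intercept b minimizing sum((y - (a*x + b))**2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form ordinary least squares solution
    a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    b = mean_y - a * mean_x
    return a, b

def mse(xs, ys, a, b):
    """Mean squared error of the fitted line on the given points."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 8.0, 9.8]   # roughly y = 2x, with noise
a, b = fit_line(xs, ys)
print(a, b, mse(xs, ys, a, b))
```

Note that a model complex enough to drive the MSE all the way to 0 on these five points (e.g. a degree-4 polynomial) is not thereby the best model; that is the overfitting point the slide's question is driving at.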
Correlation
When we say that two genes are correlated, we mean that they vary
together.
But how to quantify the degree of correlation?
Pearson’s r measures the extent to which two random variables are
linearly related.
Perfect linear correlation = 1
No correlation = 0
Anti-correlation = -1
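Pearson's r is the covariance of the two variables divided by the product of their standard deviations. A minimal sketch (the function name and data are illustrative, not from the slides):

```python
# Pearson's r: covariance of x and y divided by the product of their
# standard deviations. +1 / 0 / -1 correspond to perfect linear
# correlation / no linear correlation / perfect anti-correlation.
import math

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

print(pearson_r([1, 2, 3], [2, 4, 6]))   # perfectly linear: 1.0
print(pearson_r([1, 2, 3], [6, 4, 2]))   # anti-correlated: -1.0
```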
[Figures: examples of positive and negative correlations]
What do correlations tell us?
Interesting site:
http://www.tylervigen.com/
So how do we make statements of causality?
We can ask the question: how likely is event X, given an event Y?
Probability: How likely is it?
• How likely is a certain observation?
Possible outcomes of a coin toss: Head, Tail
P(Head) = ? P(Tail) = ?
Possible outcomes of a die roll: 1, 2, 3, 4, 5, 6
P(1) = ? P(2) = ? ... P(6) = ?
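For equally likely outcomes, each probability is 1 divided by the number of outcomes, and the empirical frequency converges to it over many trials. A quick simulation sketch (not from the slides):

```python
# Estimating P(Head) and P(6) by simulation: the observed frequency
# approaches the true probability as the number of trials grows.
import random

random.seed(0)  # fixed seed so the run is reproducible
trials = 100_000
heads = sum(random.choice(["Head", "Tail"]) == "Head" for _ in range(trials))
sixes = sum(random.randint(1, 6) == 6 for _ in range(trials))
print(heads / trials)   # close to 1/2
print(sixes / trials)   # close to 1/6
```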
Probability of Multiple Events
Toss a coin twice. How likely are you to observe 2 Heads?
P(2 Heads) = P(Head) × P(Head)
Key condition: INDEPENDENCE

What is the DISTRIBUTION of outcomes?
P(2 Heads) = 1/4
P(2 Tails) = 1/4
P(1 Head) = P(1 Head, 1 Tail) + P(1 Tail, 1 Head) = 1/4 + 1/4 = 1/2
Key condition: the probabilities must sum to 1
[Figure: histogram of outcomes of 10 tosses]
As the number of independent (random) events grows,
the distribution approaches a NORMAL or GAUSSIAN distribution
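The distribution of Heads counts can be computed exactly (it is binomial). A small sketch, with an illustrative helper function not taken from the slides:

```python
# Exact distribution of the number of Heads in n fair tosses.
# As n grows, the histogram of these probabilities approaches a
# normal (Gaussian) bell shape.
from math import comb

def heads_distribution(n):
    """P(k Heads in n fair tosses) for k = 0..n: C(n, k) / 2^n."""
    return [comb(n, k) / 2 ** n for k in range(n + 1)]

# Two tosses: matches the hand calculation above
print(heads_distribution(2))        # [0.25, 0.5, 0.25]

# Ten tosses: already visibly bell-shaped; probabilities sum to 1
dist10 = heads_distribution(10)
print(round(sum(dist10), 10))       # 1.0
```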
Cumulative Distribution
The probability distribution shows the probability of the value X.
The cumulative distribution shows the probability of a value less than or equal to X.
Wikipedia: http://en.wikipedia.org/wiki/Cumulative_distribution_function
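Python's standard library can evaluate both functions for a normal distribution, which makes the pdf/cdf distinction concrete (a small sketch, not from the slides):

```python
# Probability density vs cumulative distribution for a standard normal,
# using the built-in statistics.NormalDist (no external packages needed).
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

# pdf(x): relative likelihood of the value x
# cdf(x): probability of a value less than or equal to x
print(round(std_normal.cdf(0), 4))     # 0.5: half the mass lies below the mean
print(round(std_normal.cdf(1.96), 4))  # about 0.975
```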
Statistical Hypothesis Testing
You are running experiments to test the effect of a drug on subjects.
How likely is it that the effect would be observed even if no real relation exists?
If this likelihood is sufficiently small (e.g. < 1%), then it can be assumed that a real relation exists. Otherwise, any observed effect may simply be due to chance.
H0 (null hypothesis): no relation exists
HA (alternate hypothesis): there is some sort of relation
Statistical Hypothesis Testing
SIGNIFICANCE LEVEL is decided a priori to decide whether H0 is
accepted or rejected.
(Eg: 0.1, 0.5, 0.01)
If P-VALUE < significance level, then H0 is rejected.
i.e. The result is considered STATISTICALLY SIGNIFICANT
Wikipedia: http://en.wikipedia.org/wiki/P-value
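A worked p-value example (my own illustration, not from the slides): under H0 "the coin is fair", how surprising is 9 or more Heads in 10 tosses?

```python
# Exact one-sided p-value for a coin-fairness test: the probability,
# under the null hypothesis of a fair coin, of an outcome at least as
# extreme as the one observed.
from math import comb

def p_value_at_least(k, n):
    """P(k or more Heads in n fair tosses) under H0."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

p = p_value_at_least(9, 10)
print(p)   # 11/1024, about 0.0107
```

At a significance level of 0.05 this result would be statistically significant (H0 rejected); at 0.01 it would not.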
Error reporting
How reliable is the measurement? (How reliable is the estimate?)
E.g. a 95% CONFIDENCE INTERVAL: we are 95% confident that the true value is within this interval.
The STANDARD ERROR can be used to approximate confidence intervals.
Standard error = standard deviation of the sampling distribution
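For the mean of a sample, the standard error can be estimated as the sample standard deviation divided by the square root of n, giving an approximate 95% interval of mean ± 1.96 standard errors. A sketch with made-up measurements:

```python
# Approximate 95% confidence interval for a mean:
#   mean +/- 1.96 * (sample standard deviation / sqrt(n))
# The data values below are invented for illustration.
from math import sqrt
from statistics import mean, stdev

data = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0]
m = mean(data)
se = stdev(data) / sqrt(len(data))       # standard error of the mean
lo, hi = m - 1.96 * se, m + 1.96 * se    # approximate 95% CI
print(round(m, 3), (round(lo, 3), round(hi, 3)))
```

(For small samples, a t-distribution multiplier would be more accurate than 1.96; the normal approximation keeps the sketch simple.)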
Back to Probability
0 ≤ P(A) ≤ 1
P(A) = 1 − P(A^c), where A^c is the complement of A

If events A and B are independent (event B has no effect on the probability of event A), then:
P(A, B) = P(A) · P(B)
If they are not independent, then:
P(A, B) = P(A|B) · P(B)

P(A, B) = JOINT PROBABILITY of A and B
P(A|B) = CONDITIONAL PROBABILITY of A given B
Example
We are given 2 urns, each containing a collection of colored balls. Urn 1 contains 2 white and 3 blue balls; Urn 2 contains 3 white and 4 blue balls. A ball is drawn at random from Urn 1 and put into Urn 2, and then a ball is picked at random from Urn 2 and examined. What is the probability that the ball is blue?

Scenario 1: the ball picked from Urn 1 is blue, with probability 3/5; Urn 2 then holds 5 blue of 8 balls.
Scenario 2: the ball picked from Urn 1 is white, with probability 2/5; Urn 2 then holds 4 blue of 8 balls.

P(blue) = (3/5) × (5/8) + (2/5) × (4/8) = 15/40 + 8/40 = 23/40 = 0.575
Bayes Theorem
P(A|B) = P(B|A) · P(A) / P(B)

How?
P(A, B) = P(A|B) · P(B)
P(A, B) = P(B, A) = P(B|A) · P(A)
so
P(A|B) = P(A, B) / P(B) = P(B|A) · P(A) / P(B)

Also, this is equivalent to:
P(A|B) = P(B|A) · P(A) / [P(B|A) · P(A) + P(B|A^c) · P(A^c)]
Contingency Table
Courtesy: Rich Tsui, PhD
Contingency Table
You have developed a test to detect a certain disease
What is the True Positive Rate (TPR) and True Negative Rate (TNR) of this test?
Sensitivity = TPR = TP / (TP + FN) = P(Test+ | Disease+)
Specificity = TNR = TN / (TN + FP) = P(Test- | Disease-)
What is the Positive Predictive Value (PPV) and Negative Predictive Value (NPV)?
PPV = TP / (TP + FP) = P(Disease+ | Test+)
NPV = TN / (TN + FN) = P(Disease- | Test-)
Sensitivity (TPR): the probability that sick people are correctly identified as having the condition.
Specificity (TNR): the probability that healthy people are correctly identified as not having the condition.
Positive predictive value (PPV): given that you test positive, the probability that you actually have the condition.
Negative predictive value (NPV): given that you test negative, the probability that you actually do not have the condition.
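All four quantities follow directly from the contingency-table counts. A sketch with invented counts (chosen to match the prevalence-1/10, 90%-accurate test in the next example, scaled to 1000 people):

```python
# Sensitivity, specificity, PPV, and NPV from contingency-table counts.
# tp/fp/fn/tn = true positives, false positives, false negatives,
# true negatives. The counts below are illustrative, not from the slides.
def diagnostic_metrics(tp, fp, fn, tn):
    return {
        "sensitivity": tp / (tp + fn),   # TPR = P(Test+ | Disease+)
        "specificity": tn / (tn + fp),   # TNR = P(Test- | Disease-)
        "ppv":         tp / (tp + fp),   # P(Disease+ | Test+)
        "npv":         tn / (tn + fn),   # P(Disease- | Test-)
    }

# 1000 people, prevalence 1/10, test correct 90% of the time:
# 100 diseased -> 90 TP, 10 FN; 900 healthy -> 810 TN, 90 FP.
m = diagnostic_metrics(tp=90, fp=90, fn=10, tn=810)
print(m)   # note the PPV of 0.5 despite a "90% accurate" test
```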
The Prevalence of a particular disease is 1/10.
A test for this disease provides a correct diagnosis in 90% of cases (i.e. if you
have the disease, 90% of the time you will test positive, and if you do not
have the disease, 90% of the time you will test negative). Given that you test
positive for the disease, what is the probability that you actually have the
disease?
Prevalence = prior probability in the population

Solution:
Notation: T+ = test positive, T- = test negative, D+ = disease present, D- = disease absent

P(D+) = 0.1
P(T+|D+) = 0.9
P(T-|D-) = 0.9, therefore P(T+|D-) = 1 − 0.9 = 0.1

P(D+|T+) = P(T+|D+) · P(D+) / [P(T+|D+) · P(D+) + P(T+|D-) · P(D-)]
= (0.9 × 0.1) / (0.9 × 0.1 + 0.1 × 0.9)
= 0.09 / 0.18
= 0.5
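The calculation above can be packaged as a small Bayes' theorem function (the function and parameter names are my own, for illustration):

```python
# P(D+|T+) via Bayes' theorem:
#   P(T+|D+) P(D+) / [P(T+|D+) P(D+) + P(T+|D-) P(D-)]
def posterior(p_d, p_t_given_d, p_t_given_not_d):
    """Probability of disease given a positive test."""
    num = p_t_given_d * p_d
    return num / (num + p_t_given_not_d * (1 - p_d))

# Prevalence 0.1, test correct 90% of the time, as in the example:
print(posterior(p_d=0.1, p_t_given_d=0.9, p_t_given_not_d=0.1))   # 0.5
```

Despite the "90% accurate" test, the low prevalence means a positive result only raises the probability of disease to 0.5, which is exactly the PPV effect from the contingency-table slide.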