
Probability
Sample space (one roll of a die): 1, 2, 3, 4, 5, 6
P(6) = 1/6 ≈ 0.1667
Probability
• Probability values range from 0 to 1.
• The probabilities of all outcomes in the sample space add up to 1.
• The probability that an event A will not occur is 1 minus the probability of A.
• If two events are mutually exclusive, the probability that one or the other occurs is the sum of their individual probabilities.
• Two events A and B are independent if the occurrence of A does not change the probability of the occurrence of B.
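The first rules in this list can be checked on the die example. The following is a minimal sketch, assuming a fair six-sided die; the names `die` and `prob` are illustrative and not part of the slides. The addition is shown for two outcomes that cannot occur together.

```python
from fractions import Fraction

# Assumption: a fair six-sided die, so every face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

def prob(event):
    """Probability of an event, given as a set of faces."""
    return sum(die[face] for face in event)

total = prob(set(die))               # all outcomes together
not_six = 1 - prob({6})              # complement rule: 1 - P(A)
five_or_six = prob({5}) + prob({6})  # sum for outcomes that exclude each other

print(total, not_six, five_or_six)   # 1 5/6 1/3
```

Exact fractions are used so that the sums are not blurred by floating-point rounding.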
Probability
Law of large numbers
The larger the sample, the closer the sample distribution comes to the theoretical distribution.
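A small simulation can make this tangible. This is a sketch under stated assumptions: a fair die simulated with Python's pseudo-random generator, a fixed seed for repeatability, and a hypothetical helper name `relative_frequency_of_six`. The printed relative frequencies should drift toward 1/6 ≈ 0.1667 as the number of rolls grows.

```python
import random

def relative_frequency_of_six(n_rolls, seed=0):
    """Roll a simulated fair die n_rolls times; return the share of sixes."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same numbers
    sixes = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == 6)
    return sixes / n_rolls

# Theoretical value is 1/6; small samples may be far off, large ones rarely are.
for n in (10, 100, 1_000, 100_000):
    print(n, round(relative_frequency_of_six(n), 4))
```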
Joint Probability
P(A,B) = P(A) × P(B)   (for independent events A and B)
P(5,6) = 0.1666 × 0.1666 = 1/36 ≈ 0.0278
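The product rule can be verified by listing all outcomes. The sketch below assumes that P(5,6) means "a 5 on the first roll and a 6 on the second roll" of a fair die; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first roll, second roll) pairs.
rolls = list(product(range(1, 7), repeat=2))

p_first_5 = Fraction(sum(1 for a, _ in rolls if a == 5), len(rolls))
p_second_6 = Fraction(sum(1 for _, b in rolls if b == 6), len(rolls))
p_5_then_6 = Fraction(sum(1 for a, b in rolls if (a, b) == (5, 6)), len(rolls))

print(p_5_then_6)                            # 1/36
print(p_5_then_6 == p_first_5 * p_second_6)  # True: the two rolls are independent
```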
Conditional Probability
P(A | B) = P(A ∩ B) / P(B)
Conditional Probability
In a corpus including 12,000 nouns and 3,500 adjectives, 2,000 adjectives precede a noun.
(1) What is the likelihood that a noun occurs after an adjective?
(2) What is the likelihood that an adjective precedes a noun?
Conditional Probability
P(ADJ | N) = P(ADJ ∩ N) / P(N) = 2,000 / 12,000 ≈ 0.1667   (share of nouns preceded by an adjective)
P(N | ADJ) = P(ADJ ∩ N) / P(ADJ) = 2,000 / 3,500 ≈ 0.5714   (share of adjectives followed by a noun)
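The corpus calculation can be reproduced directly from the counts, because the (unknown) corpus size cancels when one probability is divided by the other. A minimal sketch; the function name `conditional` is hypothetical.

```python
def conditional(joint_count, condition_count):
    """P(A | B) estimated from counts: cases with both A and B, divided by cases with B."""
    return joint_count / condition_count

nouns = 12_000
adjectives = 3_500
adj_before_noun = 2_000  # adjective immediately followed by a noun

p_adj_given_n = conditional(adj_before_noun, nouns)       # P(ADJ | N)
p_n_given_adj = conditional(adj_before_noun, adjectives)  # P(N | ADJ)

print(round(p_adj_given_n, 4), round(p_n_given_adj, 4))   # 0.1667 0.5714
```

The two conditional probabilities differ because they share a numerator but are divided by different totals.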
Probability
Verb type             Form                 Joint probability
transitive (0.4)      pronominal (0.8)     0.4 × 0.8 = 0.32
transitive (0.4)      lexical (0.2)        0.4 × 0.2 = 0.08
intransitive (0.6)    pronominal (0.6)     0.6 × 0.6 = 0.36
intransitive (0.6)    lexical (0.4)        0.6 × 0.4 = 0.24
Sum = 1
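The tree can be written as two lookup tables and multiplied out. This sketch assumes the reading in which 0.4 / 0.6 are the probabilities of a transitive / intransitive verb and the second-level values are conditional on the verb type; the dictionary names are illustrative.

```python
# First level of the tree: probability of each verb type.
p_verb = {"transitive": 0.4, "intransitive": 0.6}

# Second level: probability of each form, given the verb type.
p_form_given_verb = {
    "transitive": {"pronominal": 0.8, "lexical": 0.2},
    "intransitive": {"pronominal": 0.6, "lexical": 0.4},
}

# Multiply along each path from the root to a leaf.
joint = {
    (verb, form): p_verb[verb] * p_form
    for verb, forms in p_form_given_verb.items()
    for form, p_form in forms.items()
}

for path, p in joint.items():
    print(path, round(p, 2))
print("Sum =", round(sum(joint.values()), 2))  # Sum = 1.0
```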
Probability distribution
First toss: H, T
Second toss: HH, HT, TH, TT
Probability distribution
0 heads = TT
1 head = HT + TH
2 heads = HH
Probability distribution
Sample space    Random variable (number of heads)
HH              2
HT              1
TH              1
TT              0

Number of heads    Cumulative outcome (number of sequences)    Probability
0                  1                                           0.25
1                  2                                           0.50
2                  1                                           0.25
Σ P(x) = 1
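The same distribution can be obtained by listing every sequence and counting heads. A minimal sketch assuming a fair coin, so that all sequences are equally likely; `heads_distribution` is a hypothetical helper name.

```python
from collections import Counter
from itertools import product

def heads_distribution(n_tosses):
    """Map each possible number of heads to its probability for a fair coin."""
    outcomes = ["".join(seq) for seq in product("HT", repeat=n_tosses)]
    counts = Counter(outcome.count("H") for outcome in outcomes)
    return {k: counts[k] / len(outcomes) for k in sorted(counts)}

print(heads_distribution(2))  # {0: 0.25, 1: 0.5, 2: 0.25}
```

Enumeration works for a handful of tosses, but the number of sequences doubles with every additional toss.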
Binomial distribution
Bernoulli trial:
• two possible outcomes on each trial
• the outcomes are independent of each other
• the probability of each outcome is constant across trials
Binomial distribution
• It is based on categorical / nominal data.
• There are exactly two outcomes for each trial.
• All trials are independent.
• The probability of each outcome is the same for each trial.
• A sequence of Bernoulli trials gives us the binomial distribution.
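Under these conditions the probability of exactly k successes in n trials is given by the standard binomial formula, C(n, k) × p^k × (1 − p)^(n − k). The slides do not state the formula, so the following is a sketch of the textbook definition; the function name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Three tosses of a fair coin, exactly two heads.
print(binomial_pmf(2, 3, 0.5))  # 0.375
```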
Example 1
A coin is tossed three times. What is the
probability of obtaining two heads?
Tree: H, T → HH, HT, TH, TT → HHH, HHT, HTH, HTT, THH, THT, TTH, TTT
Sample space: HHH, HHT, HTH, THH, TTT, TTH, THT, HTT
Random variable:
0 heads:    TTT              1/8 = 0.125
1 head:     TTH, THT, HTT    3/8 = 0.375
2 heads:    HHT, HTH, THH    3/8 = 0.375
3 heads:    HHH              1/8 = 0.125
Example 2
If you toss a coin 8 times what is the
probability of obtaining a score of:
0 heads
1 head
2 heads
3 heads
4 heads
5 heads
6 heads
7 heads
8 heads
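With 8 tosses there are 2^8 = 256 sequences, so counting is easier than listing. The sketch below assumes a fair coin and uses the number of ways to place k heads among 8 tosses, C(8, k); the variable names are illustrative.

```python
from math import comb

n = 8
total_sequences = 2**n  # 256 equally likely sequences for a fair coin

# ways[k] is the number of sequences containing exactly k heads.
ways = [comb(n, k) for k in range(n + 1)]
probs = [w / total_sequences for w in ways]

for k in range(n + 1):
    print(f"{k} heads: {ways[k]}/{total_sequences} = {probs[k]:.4f}")
```

The counts run 1, 8, 28, 56, 70, 56, 28, 8, 1, so 4 heads is the most likely score and the distribution is symmetric around it.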
Probability Distribution
Sample: Tossing a coin 100 times yielded 42 heads and 58 tails. Is this a fair coin?
Heads: 42
Tails: 58
Expected: 50% : 50%
Sampling error?
Population: 4 : 4?
Samples: 42 : 58
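One way to approach the question is to ask how often a fair coin would produce a result at least as lopsided as 42 : 58. The slides do not name a test, so the following is only a sketch of one common option, an exact two-sided binomial test; the function name `two_sided_binomial_p` is hypothetical.

```python
from math import comb

def two_sided_binomial_p(k, n, p=0.5):
    """Exact two-sided p-value: total probability of all outcomes
    that are no more likely than the observed count k."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small tolerance so that ties (e.g. 42 and 58 heads) are both included.
    return sum(x for x in pmf if x <= observed * (1 + 1e-9))

p_value = two_sided_binomial_p(42, 100)
print(round(p_value, 3))
```

The result is roughly 0.13: a fair coin gives a split at least this uneven in about one experiment out of eight, so 42 : 58 is well within what sampling error alone can produce.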
Normal distribution
• The center of the curve represents the mean, median, and mode.
• The curve is symmetrical around the mean.
• The tails approach the x-axis but meet it only at infinity.
• The curve is bell-shaped.
• The total area under the curve is equal to 1 (by definition).
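These properties can be checked numerically with the standard library's `statistics.NormalDist` (Python 3.8 or later). The sketch assumes a standard normal curve with mean 0 and standard deviation 1; the choice of test points is arbitrary.

```python
from math import isclose
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)  # assumption: standard normal curve

# Symmetry: the curve has the same height at equal distances from the mean.
print(isclose(z.pdf(-1.5), z.pdf(1.5)))          # True

# Half of the area lies below the mean, so the mean is also the median.
print(z.cdf(0))                                  # 0.5

# Practically all of the area lies within 10 standard deviations of the mean.
print(isclose(z.cdf(10) - z.cdf(-10), 1.0))      # True

# The tails never reach zero at a finite distance; they only get very small.
print(z.pdf(5) > 0)                              # True
```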
[Figures: skewed distribution, bimodal distribution, random distribution, normal distribution]
Example
Boys MLU    Girls MLU
2.7         3.2
2.9         2.9
2.6         3.0
2.3         3.4
3.2         3.2
2.9         3.3
2.6         2.9
Mean: 2.74  Mean: 3.13
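The two summary values are the group means of the MLU scores, which can be recomputed from the listed data. A minimal sketch using the standard library; the list names are illustrative.

```python
from statistics import mean

boys = [2.7, 2.9, 2.6, 2.3, 3.2, 2.9, 2.6]
girls = [3.2, 2.9, 3.0, 3.4, 3.2, 3.3, 2.9]

print(round(mean(boys), 2), round(mean(girls), 2))  # 2.74 3.13
```

The girls' mean is 21.9 / 7 ≈ 3.129, which rounds to 3.13 at two decimals.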
Example
Inspection of data:
1. Frequency – ordinal – interval
2. Normally distributed – not normally distributed
[Figure: boys vs. girls; the values 2.8 and 3.3 appear alongside the group labels]