Probability

Sample space: 1, 2, 3, 4, 5, 6
P(6) = 1/6 ≈ 0.1667

Probability
• Probability values range from 0 to 1.
• Adding the probabilities of all outcomes in the sample space yields 1.
• The probability that an event A will not occur is 1 minus the probability of A.
• If two events are independent, the probability that both occur is the product of their individual probabilities.
• Two events A and B are independent if knowing that A has occurred does not change the probability that B occurs.

Probability: Law of large numbers
The larger the sample, the closer the sample distribution is to the theoretical distribution.

Joint Probability
P(A, B) = P(A) × P(B)   (for independent events)
P(5, 6) = 0.1667 × 0.1667 ≈ 0.0278

Conditional Probability
P(A | B) = P(A, B) / P(B)

Conditional Probability
In a corpus including 12,000 nouns and 3,500 adjectives, 2,000 adjectives precede a noun.
(1) What is the likelihood that a noun occurs after an adjective?
(2) What is the likelihood that an adjective precedes a noun?

Conditional Probability
P(ADJ | N) = P(ADJ, N) / P(N) = 2,000 / 12,000 ≈ 0.1667
P(N | ADJ) = P(ADJ, N) / P(ADJ) = 2,000 / 3,500 ≈ 0.5714

Probability (tree diagram)
transitive (0.4)
    pronominal (0.8): 0.4 × 0.8 = 0.32
    lexical (0.2): 0.4 × 0.2 = 0.08
intransitive (0.6)
    pronominal (0.6): 0.6 × 0.6 = 0.36
    lexical (0.4): 0.6 × 0.4 = 0.24
Sum = 1

Probability distribution
Two coin tosses (H = heads, T = tails): HH, HT, TH, TT

Probability distribution
0 heads = TT
1 head = HT + TH
2 heads = HH

Probability distribution
Sample space: HH, HT, TH, TT
Random variable (number of heads): 0, 1, 2

Probability distribution
Number of heads    Cumulative outcome    Probability
0                  1                     0.25
1                  2                     0.50
2                  1                     0.25
Σ P(x) = 1

Binomial distribution
Bernoulli trial:
• two possible outcomes on each trial
• the outcomes are independent of each other
• the probability of each outcome is constant across trials

Binomial distribution
• It is based on categorical / nominal data.
• There are exactly two outcomes for each trial.
• All trials are independent.
• The probability of the outcomes is the same for each trial.
• A sequence of Bernoulli trials gives us the binomial distribution.

Example 1
A coin is tossed three times. What is the probability of obtaining two heads?

Sample space: HHH, HHT, HTH, THH, HTT, THT, TTH, TTT

Random variable (number of heads):
0 heads: 1/8 = 0.125   (TTT)
1 head:  3/8 = 0.375   (HTT, THT, TTH)
2 heads: 3/8 = 0.375   (HHT, HTH, THH)
3 heads: 1/8 = 0.125   (HHH)

The probability of obtaining two heads is therefore 3/8 = 0.375.

Example 2
If you toss a coin 8 times, what is the probability of obtaining a score of:
0 heads, 1 head, 2 heads, 3 heads, 4 heads, 5 heads, 6 heads, 7 heads, 8 heads?

Probability Distribution
Sample: Tossing a coin 100 times yielded 42 heads and 58 tails. Is this a fair coin?
Heads: 42
Tails: 58
Expected: 50% - 50%
Sampling error?
Population 4 : 4?
Sample 42 : 58

Normal distribution
• The center of the curve represents the mean, median, and mode.
• The curve is symmetrical around the mean.
• The tails approach the x-axis and meet it only at infinity.
• The curve is bell-shaped.
• The total area under the curve is equal to 1 (by definition).

[Figures: skewed distribution, bimodal distribution, skewed distribution, random distribution, normal distribution]

Example
Boys MLU:  2.7, 2.9, 2.6, 2.3, 3.2, 2.9, 2.6   (mean = 2.74)
Girls MLU: 3.2, 2.9, 3.0, 3.4, 3.2, 3.3, 2.9   (mean = 3.13)

Example
Inspection of data:
1. Frequency – ordinal – interval
2. Normally distributed – not normally distributed

[Figure: Boys vs. Girls; labelled values 2.8 (Boys) and 3.3 (Girls)]
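
Example 2 and the fair-coin question above both come down to the binomial formula P(k) = C(n, k) × p^k × (1 − p)^(n − k), where n is the number of tosses and k the number of heads. The following is a minimal sketch in Python, assuming a fair coin (p = 0.5). The function names binomial_pmf, binomial_distribution, and prob_at_most are illustrative and do not come from the slides.

```python
from math import comb


def binomial_pmf(k, n, p=0.5):
    """Probability of exactly k successes in n independent Bernoulli trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_distribution(n, p=0.5):
    """Probabilities for 0, 1, ..., n successes (assumed helper, not from the slides)."""
    return [binomial_pmf(k, n, p) for k in range(n + 1)]


def prob_at_most(k, n, p=0.5):
    """Probability of k or fewer successes in n trials."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))


if __name__ == "__main__":
    # Example 2: eight tosses of a fair coin
    for heads, prob in enumerate(binomial_distribution(8)):
        print(f"{heads} heads: {prob:.4f}")

    # Fair-coin question: how likely are 42 or fewer heads in 100 tosses
    # if the coin really is fair (p = 0.5)?
    print(f"P(at most 42 heads in 100 tosses) = {prob_at_most(42, 100):.4f}")
```

Calling binomial_distribution(3) reproduces the values of Example 1 (0.125, 0.375, 0.375, 0.125). The last printed value indicates how often a fair coin would produce a result at least as low as 42 heads through sampling error alone.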