Introduction

Probability Theory
● Probability theory lets us reason quantitatively despite our lack of knowledge
● It deals with non-deterministic events
● The theory originated in problems of gambling
● It is used in many fields
– Medicine
– Economics
– Computer Science
– Linguistics
– Psychology
– Management

Statistics
● Statistics is concerned with problems of data
– Collection
– Analysis
– Interpretation
– Presentation
– Organization
● A simple example:
– We have two boxes: one red and one blue
– The red box contains 2 apples and 6 oranges
– The blue box contains 3 apples and 1 orange
– We randomly pick one of the boxes and then randomly select a fruit from that box, with replacement
– We assume that the red box is picked 40% of the time and the blue box is picked 60% of the time
● We want to answer questions such as:
– What is the overall probability that the selection procedure will pick an apple?
– Given that we have chosen an orange, what is the probability that the box we chose was the blue one?
● How do we solve these problems?
– We choose a random variable corresponding to the identity of the box
– This random variable (B) takes one of two values: r (for red) or b (for blue)
– Another random variable is chosen for the identity of the fruit
– This random variable (F) takes the values a (for apple) or o (for orange)
– We formulate the problems of interest in terms of the values of these random variables
– We define a real-valued function that assigns a probability in [0, 1] to each value of the random variables

Probability and statistics in computational linguistics
● Language is a complex system
● We need a theory to model the uncertainty involved in language structures
● The uncertainty is rooted in our lack of knowledge about language, its nature, its development, and its variation
● We have access to a large amount of data

Machine Learning
● Machine learning studies how to fit a numerical model to a set of data
● It uses probability theory and statistics to give a machine the ability to learn from data
● In our field we have a lot of unlabeled data and some labeled data
● Machine learning allows us to use these data to solve many problems, such as
– Question answering
– Machine translation
– Text summarization
– Document classification
– ...
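The two questions posed in the two-box fruit example can be worked out numerically. The sketch below is one possible way to do it, not a method given in the slides: it assumes the sum and product rules of probability and Bayes' theorem, and it assumes that every fruit inside a box is equally likely to be drawn. The function names are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

# Prior probability of picking each box (from the example: red 40%, blue 60%).
p_box = {"r": Fraction(4, 10), "b": Fraction(6, 10)}

# Number of apples (a) and oranges (o) in each box (from the example).
counts = {"r": {"a": 2, "o": 6}, "b": {"a": 3, "o": 1}}


def p_fruit_given_box(fruit, box):
    """P(F = fruit | B = box), assuming each fruit in a box is equally likely."""
    return Fraction(counts[box][fruit], sum(counts[box].values()))


def p_fruit(fruit):
    """P(F = fruit), obtained by summing P(F | B) * P(B) over both boxes."""
    return sum(p_fruit_given_box(fruit, box) * p_box[box] for box in p_box)


def p_box_given_fruit(box, fruit):
    """P(B = box | F = fruit), computed with Bayes' theorem (an assumption here)."""
    return p_fruit_given_box(fruit, box) * p_box[box] / p_fruit(fruit)


print(p_fruit("a"))                 # 11/20
print(p_box_given_fruit("b", "o"))  # 1/3
```

Under these assumptions the overall probability of picking an apple is 11/20 = 0.55, and the probability that the box was blue given that an orange was drawn is 1/3. Exact fractions are used so that rounding does not obscure the result.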
Combinatorial Analysis

Sample Space
● The set S consisting of all possible outcomes of an experiment is called the sample space
● Examples:
– Tossing a coin: S = {H, T}
– Tossing two coins: S = {HH, HT, TH, TT}
– Tossing a coin until an H is obtained: S = {H, TH, TTH, TTTH, ...}
– Choosing an English letter: S = {a, A, b, B, c, C, ...}

Statement
● A declarative sentence that can be true for some outcomes of a sample space and false for others
● Example:
– A toss of a given coin results in a head
– A letter selected at random is 'z'
– A word selected at random starts with 'z'
● We are interested in statements whose truth or falsehood is determined for each possible outcome
● A statement is a function from the sample space S to {true, false}: f : S → {true, false}

Experiment
● A statement, or several statements, that are true for some outcomes in a sample space and false for the other outcomes
● Examples:
– A toss of a given coin results in a head (H)
– The letter appearing at the beginning of a word is 'z'

Event
● The set of sample points for which a statement is true
● Every subset of a sample space is an event
● Example:
– The event corresponding to “a head is seen when tossing two coins”: {HH, HT, TH}
– The event corresponding to “the word starts with 'z'”: {zone, zero, zambia, ...}

Compound Statements
● Given two statements p and q, statements of the following forms are compound statements
– p and q
– p or q
– not p
● Example:
– A word selected at random starts with 'z' and ends with 'o'
– A word selected at random is a VERB or a NOUN
– A letter selected at random is not a vowel

Compound Statements and Events
● If P and Q are the events corresponding to statements p and q, then
– the event corresponding to p and q is P∩Q
– the event corresponding to p or q is P∪Q
– the event corresponding to not p is ¬P
● Example:
– p: a head is seen when tossing two coins, P = {HH, HT, TH}
– q: a tail is seen when tossing two coins, Q = {HT, TH, TT}
– p and q: a head and a tail are seen, P∩Q = {HT, TH}
– p or q: a head or a tail is seen, P∪Q = {HH, HT, TH, TT}
– not p: no head is seen, ¬P = {TT}

Counting
● The number of outcomes of an event P is equal to the number of elements of P, written n(P)
● Principles of counting
– Addition principle
– Multiplication principle
– Permutation
– Combination

Addition Principle
● For any two sets A and B, n(A∪B) = n(A) + n(B) − n(A∩B)
● A and B are disjoint sets if A∩B = ∅
● If A and B are two disjoint sets, then n(A∪B) = n(A) + n(B)

Multiplication Principle
● If an experiment is performed in m steps and step i has n_i possible outcomes (i = 1, 2, ..., m), then the total number of possible outcomes of the whole experiment is
$\prod_{i=1}^{m} n_i = n_1 \times n_2 \times n_3 \times \cdots \times n_m$
● Examples
– If we toss a coin three times, the total number of possible outcomes is 2 × 2 × 2 = 8
– If we randomly choose three letters from the English alphabet with replacement, the total number of possible outcomes is 26 × 26 × 26 = 17576
– If we randomly choose three letters from the English alphabet without replacement, the total number of possible ordered outcomes is 26 × 25 × 24 = 15600
– The total number of subsets of a set of m elements is 2^m

Permutation
● Any arrangement of objects in a list is a permutation
● NOTE: the order of the objects in a permutation is important
● Example:
– How many different arrangements of the letters a, b, and c are possible?
– {abc, acb, bac, bca, cba, cab}
– 3 × 2 × 1 = 6
● The total number of permutations of a list consisting of n different objects is
$n! = n (n-1) (n-2) \cdots 1$, with $0! = 1$
● The total number of permutations of r different objects out of n different objects is
$P(n, r) = n (n-1) (n-2) \cdots (n-r+1) = \dfrac{n!}{(n-r)!}$, so that $P(n, n) = n!$
● Example
– In how many ways can we order a deck of 52 cards?
$52! = 52 \, (52-1) \, (52-2) \cdots 1 \approx 8.0658 \times 10^{67}$
– In how many ways can we select an ordered list of four letters out of the 6 letters a, b, c, d, e, and f?
$P(6, 4) = \dfrac{6!}{(6-4)!} = 360$
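The counts quoted for the multiplication principle and for permutations can be checked by brute-force enumeration and by the factorial formula. The following is a minimal sketch, assuming Python 3.8 or later (for `math.perm`); the helper `count_permutations` is a hypothetical name introduced here only for illustration.

```python
import math
from itertools import permutations, product


def count_permutations(n, r):
    """P(n, r) = n! / (n - r)!, the number of ordered lists of r out of n objects."""
    return math.factorial(n) // math.factorial(n - r)


# Multiplication principle: three coin tosses, two outcomes per step.
coin_outcomes = list(product("HT", repeat=3))
print(len(coin_outcomes))        # 8

# Three letters from a 26-letter alphabet.
print(26 ** 3)                   # 17576 (with replacement)
print(count_permutations(26, 3)) # 15600 (without replacement, order matters)

# All arrangements of the letters a, b, c.
arrangements = ["".join(p) for p in permutations("abc")]
print(arrangements)              # ['abc', 'acb', 'bac', 'bca', 'cab', 'cba']

# The formula agrees with the standard-library implementation.
print(count_permutations(6, 4), math.perm(6, 4))  # 360 360

# Number of orderings of a 52-card deck, in scientific notation.
print(f"{math.factorial(52):.4e}")                # 8.0658e+67
```

Enumeration with `itertools` is only practical for small cases such as the coin tosses and the three letters; for larger cases such as the 52-card deck, the formula is the only feasible route.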