Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Machine Learning Tutorial.1 Boolean algebra, probabilities Boolean Algebra Tutorial (Duncan Fyfe Gillies slides) • Computer hardware works with binary numbers, but binary arithmetic is much older than computers – Ancient Chinese Civilisation (3000 BC) – Ancient Greek Civilisation (1000 BC) – Boolean Algebra (1850) Propositional Logic • The Ancient Greek philosophers created a system to formalize arguments called propositional logic. – A proposition is a statement that could be TRUE or FALSE – Propositions could be compounded by means of the operators AND, OR and NOT Propositional Calculus Example • Propositions may be TRUE or FALSE: – it is raining – the weather forecast is bad • A combined proposition: – it is raining OR the weather forecast is bad Propositional Calculus Example • We can assign values to propositions, for example: I will take an umbrella if and only if it is raining OR the weather forecast is bad • In other words the proposition “I will take an umbrella” is the result of the Boolean combination (OR) between two conditions: raining weather forecast being bad. • In fact we could write: I will take an umbrella = it is raining OR the weather forecast is bad Diagrammatic representation • We can think of the umbrella proposition as a result that we calculate from the weather forecast and the fact that it is raining by means of a logical or. Rain Bad forecast OR Take an umbrella Computer Organization class diagrams Truth Tables • Since propositions can only take two values, we can express all possible outcomes of the umbrella proposition by a table: – True = 1 – False = 0 Rain Bad Forecast Umbrella False False False False True True True False True True True True Bad Forecast Umbrella 0 0 0 0 1 1 1 0 1 1 1 1 Rain More complex propositions • We can make our propositions more complex, for example: (Take Umbrella ) = ( NOT (Take Car ) ) AND ( (Bad Forecast ) OR (Raining ) ) Rain Bad forecast Take a car OR AND NOT Take an umbrella Boolean Algebra • We could write propositional statements as above but to perform calculations quickly and efficiently we can use an equivalent, but more succinct notation. • We also need a to have a well-defined semantics for all the “operators”, or connectives that we intend to use. • The system we will employ is called Boolean Algebra (introduced by the English mathematician Boole in1850) and satisfies the criteria above. Fundamentals of Boolean Algebra • The truth values are replaced by 1 and 0: – TRUE = 1 – FALSE = 0 • Propositions are replaced by variables (for example): – it is raining = R – The weather forecast is bad = W • Operators are replaced by symbols – NOT = – OR = – AND = disjunction conjunction Truth Tables All possible outcomes of the operators can be written as truth tables NOR A A OR B A AND B A B R A B R A R 0 0 0 0 0 0 0 1 0 1 0 0 1 1 1 0 1 0 0 1 0 1 1 1 1 1 1 1 For n variables there are 2n possibilities. Theory of Probability Olga Kubassova, School of Computing http://www.comp.leeds.ac.uk/olga 7th February 2007 Probability Basics 0≤P≤1 • Experiment – a process that yields outcome. • Sample space – the set of all possible outcomes. • Event – a subset of the sample space. – E.g. getting heads when tossing a coin. • Mutually exclusive – if events cannot occur simultaneously. Example: • Rolling a fair six-sided die: • Sample Space: S={1,2,3,4,5,6}, |S| = 6 – Independent events • Event: obtaining 4 Probability notation • P(A) – the prob. of event A occurring • P(A∪B) – prob. of A or B – Probability of rolling 1 or 6 • P(A∩B) – prob. of A and B – Probability or rolling 1 and 2 – Probability of rolling an even number grater than 3 • P(¬A) – prob. of A not occurring (the complement of A) Pay attention to the AND/OR notation (set theory) Even numbers on both dice? Even numbers on both dice AND Sum is greater than 5 Sum is greater than 5? Even numbers on both dice OR Sum is greater than 5 1,1 1,2 1,3 1,4 1,5 1,6 2,1 2,2 2,3 2,4 2,5 2,6 3,1 3,2 3,3 3,4 3,5 3,6 4,1 4,2 4,3 4,4 4,5 4,6 5,1 5,2 5,3 5,4 5,5 5,6 6,1 6,2 6,3 6,4 6,5 6,6 Are the dice identical? How many different events? Example: • Rolling a fair six-sided die: • Sample Space: S={1,2,3,4,5,6}, |S| = 6 – Independent events • Event: obtaining 4 |S | • P(S) = p(event ) 1 i 1 i – Sum of probabilities of all (independent!) events – P(eventi)=1/6 (uniform distribution) Finite probability • Experiment that has a finite number of outcomes. • Therefore, each event has a finite probability. • Experiment: Throw a die. A = number > 4. B = even number. Find P(A), P(B), P(A∪B) and P(A∩B) Mutually Exclusive Events • Events are mutually exclusive if they are disjoint, or cannot occur simultaneously A∩B=0 Examples? Even numbers on both dice AND Sum is less than 1 Uniform distributions • Every outcome in the sample space is equally probable. • E.g. tossing coins rolling dice drawing a card from a deck Conditional probability • P(A | B) – the prob. of A happening given that B has occurred. P( A B) P( A | B) P( B) Example of the Conditional Probability Imagine that 5% of people of a given population own at least one dog. 2% of people own at least one dog and at least one cat. What is the probability that someone will own a cat, given that they also have a dog? Let A = “dog owner”, B = “cat owner”, then: P(A) = 0.05; P(A ∩ B) = 0.02 P(B | A) = P(A ∩B)/P(A) Independence • If events A and B do not influence each other then: P(A | B) = P(A) and P(A∩B) = P(A)P(B) Example (Independence) • Let throw the coin 2 times. Probability that the first time it is heads and the second it is tail do not influence each other Chain rule Chain rule allows computing probabilities of sequences of the events P( A1 , A2 ,, An ) P( A1 ) P( A2 | A1 ) P( A3 | A1 , A2 ) P( An | A1 , An1 ) Chain rule (II) • Let us compute sequences of events. • E.g. prob. of sequence of letters, ‘t’, ‘h’ and ‘e’: P(t,h,e) = P(t)P(h | t)P(e | th) Bayes’ theorem Bayes’ theorem (II) • Meningitis causes stiff neck 50% cases • Prob. of meningitis is 1/50000 • Prob. of stiff neck is 1/20 • What is the prob. of having meningitis if a patient has a stiff neck? Additional resources • Andy Roberts homepage: http://andy-roberts.net/teaching/index.html • Google directory: http://directory.google.com/Top/Science/Math/Prob ability • Wikipedia: http://en.wikipedia.org/wiki/Probability Exponentials – What are they? • Simply: a number (base) raised to a power (exponent). • E.g. the area of a square whose sides are of length 3… • = 32 • = 3 squared, or 3 to the power 2 • =3×3 Exponentials – A quick explanation • Easy enough to calculate: b • • • • ab = a × a × a … × a So there are ‘b’ lots of ‘a’ What is 43? 4 × 4 × 4 = 64 Exponentials in Computer Science • Have you ever noticed the following numbers popping up in your computer science studies? • 1, 2, 4, 8, 16, 32, 64, 128…1024… • What do they all have in common? • They are all powers of 2!! Exponentials in Computer Science • The powers of 2 should be (very) familiar to you by now. • All the computers you have been using work in binary bits (at the lowest level)… • 2 values: 0/1, true/false etc • all ‘data sizes’ must be expressed in powers of 2. Exponentials in Computer Science • • • • • • • 1 kilobyte ≠ 1000 bytes (‘standard’ use of kilo) 1 kilobyte = 1024 bytes Because… 29 = 512 210 = 1024 211 = 2048 So 210 is the closest we can get to 1000 Working with Exponentials • First, some easy exponentials to remember: • a0 = 1 • a1 = a Logarithms • This is because log(arithm)s are just ‘reversed’ exponentials • E.g. if 24 = 16 • log216 = 4 – base is 2 • Logarithms ‘map’ large numbers onto smaller numbers Logarithmic Scale 10 9 8 7 6 x 5 2^x 4 log(x,2) 3 2 1 0 1 2 3 4 5 6 7 8 9 10 Logarithms - Bases • There are several common bases: – 10: very common base, Richter scale etc. – e: used by a lot of scientists – 2: very common in computer science. WHY? Logarithms – An example • Imagine a (small) football tournament of 8 teams. • It is a knockout tournament, so every time that 2 teams play the losing team is eliminated from the tournament and the winning team goes on to the next round. • How many rounds must be played to determine an overall winner? Logarithms – An example • Imagine a (small) football tournament of 8 teams. • It is a knockout tournament, so every time that 2 teams play the losing one is removed from the tournament and the winning one goes on to the next round. • How many rounds must be played to determine an overall winner? Logarithms – An example Round 0: 8 teams Round 1: 4 teams Round 2: 2 teams Round 3: 1 team 3 rounds needed! Logarithms – Example explained • 3 rounds are needed to determine the winner of 8 teams, competing 2 at a time (i.e. one-on-one) • This can be easily calculated using logs. • 2 teams play at a time, so the base is 2. (i.e. 2x = 8, so we need to use log2) • So, log2 8 = …. • 3