
Artificial Intelligence
Roman Barták
Department of Theoretical Computer Science and Mathematical Logic

Knowledge in learning

So far we have learned a function input → output. We assumed only that the form of the function (such as a decision tree) is known; it is defined by the hypothesis space.
Can we take advantage of prior knowledge about the world?
In most cases the prior knowledge is represented as general first-order logical theories.
Some methods:
– current-best-hypothesis search
– version space learning
– inductive logic programming

Logical formulation of learning

Hypotheses, example descriptions, and classifications will be represented using logical sentences.
Examples
– attributes become unary predicates
  Alternate(X1) ∧ ¬Bar(X1) ∧ ¬Fri/Sat(X1) ∧ Hungry(X1) ∧ …
– the classification is given by a literal using the goal predicate
  WillWait(X1) or ¬WillWait(X1)
A hypothesis will have the form
  ∀x Goal(x) ⇔ Cj(x)
Cj(x) is a candidate definition of the goal predicate; the set of examples that satisfy it is called the extension of the predicate. For example:
  ∀r WillWait(r) ⇔ Patrons(r,Some)
    ∨ (Patrons(r,Full) ∧ Hungry(r) ∧ Type(r,French))
    ∨ (Patrons(r,Full) ∧ Hungry(r) ∧ Type(r,Thai) ∧ Fri/Sat(r))
    ∨ (Patrons(r,Full) ∧ Hungry(r) ∧ Type(r,Burger))

Hypothesis space

The hypothesis space is the set of all hypotheses.
The learning algorithm believes that one of the hypotheses is correct, that is, it believes the sentence:
  h1 ∨ h2 ∨ h3 ∨ … ∨ hn
Hypotheses that are not consistent with the examples can be ruled out.
There are two possible ways to be inconsistent with an example (the notions originated in medicine to describe erroneous results from lab tests):
– false negative: the hypothesis says the example should be negative but in fact it is positive
– false positive: the hypothesis says the example should be positive but in fact it is negative

Current-best-hypothesis search

The idea is to maintain a single hypothesis and to adjust it as new examples arrive in order to maintain consistency:
– if the example is consistent with the hypothesis, then do not change it
– if the example is a false negative, then generalize the hypothesis
– if the example is a false positive, then specialize the hypothesis

The current-best-hypothesis learning algorithm

Specialization and generalization

How do we implement specialization and generalization of the hypothesis?
• If hypothesis h1 is a generalization of hypothesis h2, then we must have
  ∀x C2(x) ⇒ C1(x)
• Ci is typically a conjunction of predicates
  – generalization can be realized by dropping conditions or by adding disjuncts
  – specialization can be realized by adding extra conditions or by removing disjuncts
A restaurant example:
– the first example is positive and attribute Alternate(X1) is true, so let the initial hypothesis be
  h1: ∀x WillWait(x) ⇔ Alternate(x)
– the second example is negative but the hypothesis predicts it to be positive, so it is a false positive; we need to specialize by adding an extra condition
  h2: ∀x WillWait(x) ⇔ Alternate(x) ∧ Patrons(x,Some)
– the third example is positive but the hypothesis predicts it to be negative, so it is a false negative; we need to generalize by dropping the condition Alternate
  h3: ∀x WillWait(x) ⇔ Patrons(x,Some)
– the fourth example is positive but the hypothesis predicts it to be negative, so it is a false negative; we need to generalize by adding a disjunct (we cannot drop the Patrons condition)
  h4: ∀x WillWait(x) ⇔ Patrons(x,Some) ∨ (Patrons(x,Full) ∧ Fri/Sat(x))

Current-best-hypothesis: properties

After each modification of the hypothesis we need to check all the previous examples.
There are several possible generalizations and specializations, and we may need to backtrack when no simple modification of the hypothesis is consistent with all the data.
The source of problems is strong commitment:
– the algorithm has to choose a particular hypothesis as its best guess even though it does not yet have enough data to be sure of the choice.
A solution could be least-commitment search.

Version space learning

The hypothesis space can be viewed as a disjunctive sentence
  h1 ∨ h2 ∨ h3 ∨ … ∨ hn
A hypothesis inconsistent with a new example is removed from the disjunction.
Assuming the original hypothesis space does in fact contain the right answer, the reduced disjunction must still contain the right answer.
The set of hypotheses remaining is called the version space.
The version space learning algorithm (also called the candidate elimination algorithm).
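The elimination idea can be sketched in Python under strong simplifying assumptions: the hypothesis space is small enough to enumerate explicitly, and every hypothesis is a conjunction that either fixes an attribute to one value or leaves it unconstrained. The two-attribute domain, the wildcard marker `ANY`, and the three training examples are hypothetical and only illustrate the update step; this is not the algorithm figure from the slides.

```python
from itertools import product

# Hypothetical toy domain: two attributes with small value sets.
# "None" is a Patrons value (nobody in the restaurant), not Python's None.
DOMAIN = {
    "Patrons": ["None", "Some", "Full"],
    "Hungry": [True, False],
}
ANY = "*"  # wildcard: the attribute is not constrained


def all_hypotheses():
    """Enumerate every conjunction: each attribute is fixed to a value or left free."""
    names = list(DOMAIN)
    choices = [[ANY] + DOMAIN[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*choices)]


def predicts_positive(h, example):
    """A conjunction predicts 'positive' when every constrained attribute matches."""
    return all(v == ANY or example[a] == v for a, v in h.items())


def update(version_space, example, label):
    """Remove every hypothesis that is a false positive or a false negative."""
    return [h for h in version_space if predicts_positive(h, example) == label]


version_space = all_hypotheses()  # 4 * 3 = 12 candidate hypotheses
examples = [
    ({"Patrons": "Some", "Hungry": True}, True),
    ({"Patrons": "Full", "Hungry": True}, False),
    ({"Patrons": "Some", "Hungry": False}, True),
]
sizes = []
for example, label in examples:
    version_space = update(version_space, example, label)
    sizes.append(len(version_space))

print(sizes)
print(version_space)
```

With these three examples the space shrinks from 12 hypotheses to 4, then 2, then 1, leaving only the conjunction that requires Patrons to be Some. Each example is looked at once, so no earlier example has to be revisited. Explicit enumeration does not scale to realistic hypothesis spaces.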
The version space approach is incremental: one never has to go back and reexamine the old examples.

Representation of version space

The hypothesis space is enormous, so how can we possibly write down this enormous disjunction?
We have an ordering on the hypothesis space (generalization/specialization), so we can specify boundaries, where each boundary is a set of hypotheses (a boundary set).
G = a most general boundary
• consistent with all observations so far
• there are no consistent hypotheses that are more general
• initially True
S = a most specific boundary
• consistent with all observations so far
• there are no consistent hypotheses that are more specific
• initially False
Everything in between the G-set and the S-set is guaranteed to be consistent with the examples, and nothing else is consistent.

Version space update

For each new example we update the sets G and S:
– false positive for Si → throw Si out of the S-set
– false negative for Si → replace Si in the S-set by all its immediate generalizations
– false positive for Gi → replace Gi in the G-set by all its immediate specializations
– false negative for Gi → throw Gi out of the G-set
The algorithm continues until one of three things happens:
– we have exactly one hypothesis left in the version space
– the version space collapses (either S or G becomes empty)
– we run out of examples and have several hypotheses remaining in the version space
  • the version space represents a disjunction of hypotheses
  • if the hypotheses disagree in classification, one possibility is to take the majority vote

Properties of version space learning

If the domain contains noise or insufficient attributes for exact classification, the version space will always collapse.
– to date, no completely successful solution has been found
If we allow unlimited disjunction in the hypothesis space,
– the S-set will always contain a single most-specific hypothesis (the disjunction of the descriptions of the positive examples)
– the G-set will contain just the negation of the disjunction of the descriptions of the negative examples
– this can be addressed by allowing only limited forms of disjunction or by including a generalization hierarchy of more general predicates:
  • instead of WaitEstimate(x,30-60) ∨ WaitEstimate(x,>60) we can use LongWait(x)
The pure version space algorithm was first applied in the Meta-DENDRAL system, which was designed to learn rules for predicting how molecules would break into pieces in a mass spectrometer.

Inductive logic programming

Inductive logic programming (ILP) combines inductive methods with the power of first-order representations (logic programs).
ILP works well with relationships between objects, which are hard to handle for attribute-only approaches.
In principle, the general knowledge-induction problem is to solve the entailment constraint:
  Background ∧ Hypothesis ∧ Descriptions ⊨ Classifications
Two principal approaches to ILP:
– top-down inductive learning methods (system FOIL)
– inductive learning with inverse deduction (system PROGOL)

ILP problem

  Background ∧ Hypothesis ∧ Descriptions ⊨ Classifications
• Examples are typically given as Prolog facts:
  Father(Philip,Charles), Father(Philip,Anne), …
  Mother(Mum,Margaret), Mother(Mum,Elizabeth), …
  Married(Diana,Charles), Married(Elizabeth,Philip), …
  Male(Philip), Male(Charles), …
  Female(Beatrice), Female(Margaret), …
• Similarly, known classifications are given by Prolog facts:
  Grandparent(Mum,Charles), Grandparent(Elizabeth,Beatrice), …
  ¬Grandparent(Mum,Harry), ¬Grandparent(Spencer,Peter), …
• Possible hypothesis:
  Grandparent(x,y) ⇔ [∃z Mother(x,z) ∧ Mother(z,y)]
    ∨ [∃z Mother(x,z) ∧ Father(z,y)]
    ∨ [∃z Father(x,z) ∧ Mother(z,y)]
    ∨ [∃z Father(x,z) ∧ Father(z,y)]
• We can exploit background knowledge:
  Parent(x,y) ⇔ Mother(x,y) ∨ Father(x,y)
• Then we can simplify the hypothesis:
  Grandparent(x,y) ⇔ [∃z Parent(x,z) ∧ Parent(z,y)]

Top-down learning
Start with a clause with an empty body:
  Grandfather(x,y) ←
• This clause classifies every example as positive, so it needs to be specialized
  – by adding literals one at a time to the body
    Grandfather(x,y) ← Father(x,y)
    Grandfather(x,y) ← Parent(x,z)
    Grandfather(x,y) ← Father(x,z)
    …
• We prefer the specialization that classifies more examples correctly
  – specialize this clause further
    Grandfather(x,y) ← Father(x,z) ∧ Parent(z,y)
  – if the background knowledge Parent is not available, we may need to add more clauses
    Grandfather(x,y) ← Father(x,z) ∧ Father(z,y)
    Grandfather(x,y) ← Father(x,z) ∧ Mother(z,y)
• Each clause covers some positive examples and no negative example.

Top-down learning algorithm

The algorithm builds new clauses covering positive examples.
Literals are chosen from known predicates, equality/inequality literals, and arithmetic comparisons:
• they have to include a variable that is already in the clause
• we can exploit types (number, person, …)
• the choice of literal can be based on information gain
System FOIL solved a long sequence of exercises on list-processing functions (for example append, QuickSort).

Inverse resolution

  Background ∧ Hypothesis ∧ Descriptions ⊨ Classifications
• Classical resolution deduces Classifications from Background, Hypothesis, and Descriptions.
• We can run the proof backward and find a Hypothesis such that the proof goes through:
  – for a resolvent C, produce C1 and C2 (if C2 is given, then produce C1)
    ¬Parent(Elizabeth,Anne) ∨ Grandparent(George,Anne)
    ¬Parent(z,Anne) ∨ Grandparent(George,Anne)
    ¬Parent(z,y) ∨ Grandparent(George,y)
    …

© 2016 Roman Barták, Department of Theoretical Computer Science and Mathematical Logic
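The top-down learning steps can be sketched in Python as a greedy covering loop. This is a simplified illustration rather than FOIL itself. The facts and labeled examples are hypothetical, the background contains only Father and Mother (no Parent), and each clause may use one extra variable z. Literals are scored by covered positives minus covered negatives instead of information gain.

```python
# Hypothetical background facts; names follow the family example in the text.
FACTS = {
    "Father": {("George", "Elizabeth"), ("Philip", "Charles"),
               ("Philip", "Anne"), ("Charles", "William")},
    "Mother": {("Elizabeth", "Charles"), ("Elizabeth", "Anne")},
}
PEOPLE = sorted({p for pairs in FACTS.values() for pair in pairs for p in pair})

# Hypothetical labeled examples of the target predicate Grandfather(x, y).
POSITIVE = [("George", "Charles"), ("George", "Anne"), ("Philip", "William")]
NEGATIVE = [("Elizabeth", "William"), ("Philip", "Charles"),
            ("George", "Elizabeth"), ("Charles", "George")]

# Candidate literals pred(a, b) over head variables x, y and one new variable z.
# Because a != b, every literal shares at least one variable with the head.
LITERALS = [(pred, a, b) for pred in FACTS
            for a in "xyz" for b in "xyz" if a != b]


def covers(body, x, y):
    """True if some binding of z makes every literal of the body hold."""
    for z in PEOPLE:
        env = {"x": x, "y": y, "z": z}
        if all((env[a], env[b]) in FACTS[pred] for pred, a, b in body):
            return True
    return False


def count(body, examples):
    """Number of examples covered by the clause body."""
    return sum(covers(body, x, y) for x, y in examples)


def learn_clause(remaining, max_literals=4):
    """Specialize an empty body until it covers no negative example."""
    body = []
    while count(body, NEGATIVE) > 0:
        # Keep only new literals that still cover some remaining positive example.
        useful = [lit for lit in LITERALS
                  if lit not in body and count(body + [lit], remaining) > 0]
        if not useful or len(body) == max_literals:
            raise ValueError("no consistent clause found within the limits")
        # Simple score (an assumption, not FOIL's information gain).
        body.append(max(useful, key=lambda lit: count(body + [lit], remaining)
                        - count(body + [lit], NEGATIVE)))
    return body


def learn(positive):
    """Add clauses until every positive example is covered by some clause."""
    clauses, remaining = [], list(positive)
    while remaining:
        body = learn_clause(remaining)
        clauses.append(body)
        remaining = [ex for ex in remaining if not covers(body, *ex)]
    return clauses


clauses = learn(POSITIVE)
for body in clauses:
    print("Grandfather(x,y) <-", ", ".join(f"{p}({a},{b})" for p, a, b in body))
```

On this toy data the loop needs two clauses, one going through a mother and one through a father. This matches the remark that more clauses are needed when Parent is not part of the background knowledge.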
