Eco 6380 Predictive Analytics For Economists
Spring 2016
Professor Tom Fomby
Department of Economics, SMU

Presentation 13: Naïve Bayes Classifier
Chapter 8 in SPB

Basis of Naïve Bayes Classifier: Bayes Theorem

• The Naïve Bayes classifier is a classification method based on Bayes Theorem. Let $C_j$ denote that an output belongs to the j-th class, $j = 1, 2, \ldots, J$, out of J possible classes. Let $P(C_j \mid X_1, X_2, \ldots, X_p)$ denote the (posterior) probability of belonging to the j-th class given the individual characteristics $X_1, X_2, \ldots, X_p$. Furthermore, let $P(X_1, X_2, \ldots, X_p \mid C_j)$ denote the probability that a case belonging to the j-th class has the individual characteristics $X_1, X_2, \ldots, X_p$, and let $P(C_j)$ denote the unconditional (i.e. without regard to individual characteristics) prior probability of belonging to the j-th class. For a total of J classes, Bayes theorem gives us the following probability rule for calculating the case-specific probability of falling into the j-th class:

$$P(C_j \mid X_1, X_2, \ldots, X_p) = \frac{P(X_1, X_2, \ldots, X_p \mid C_j)\, P(C_j)}{\text{Denom}}$$

where

$$\text{Denom} = P(X_1, X_2, \ldots, X_p \mid C_1)\, P(C_1) + \cdots + P(X_1, X_2, \ldots, X_p \mid C_J)\, P(C_J)$$

The Naïve Independence Assumption

$$P(X_1, X_2, \ldots, X_p \mid C_j) = P(X_1 \mid C_j) \cdot P(X_2 \mid C_j) \cdots P(X_p \mid C_j)$$

This assumption states that the joint probability of a case's characteristics, given that the case is in Class j, is (naively) equal to the product of the probabilities of each individual characteristic given Class j. This simplifies the computation of the class probabilities of cases and helps prevent the class probabilities from being predominantly singular (i.e. either zero or one) across a majority of the cases. Then, in the independent case, the terms on the right-hand side of the above equation can be calculated simply as the relative frequencies of the individual $X_i$'s in Class $C_j$.
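To make the probability rule concrete, the following is a minimal Python sketch of the posterior calculation under the naïve independence assumption. The function name `naive_bayes_posterior` and the numbers in the example are hypothetical, chosen only for illustration; they do not come from the presentation or from SPB.

```python
from math import prod


def naive_bayes_posterior(cond_probs, priors):
    """Return P(C_j | X_1, ..., X_p) for every class j.

    cond_probs[j] holds P(X_i | C_j), i = 1..p, for the observed case.
    priors[j] holds the prior P(C_j).
    """
    # Numerator for class j: P(X_1 | C_j) * ... * P(X_p | C_j) * P(C_j)
    numerators = [prod(cp) * prior for cp, prior in zip(cond_probs, priors)]
    # Denom: the sum of the numerators over all J classes
    denom = sum(numerators)
    if denom == 0:
        raise ValueError("every class has zero probability for this case")
    return [num / denom for num in numerators]


# Hypothetical case with J = 2 classes and p = 2 characteristics.
cond_probs = [[0.5, 0.4],   # P(X_1 | C_1), P(X_2 | C_1)
              [0.25, 0.2]]  # P(X_1 | C_2), P(X_2 | C_2)
priors = [0.5, 0.5]         # uniform prior, assumed for illustration
posterior = naive_bayes_posterior(cond_probs, priors)
print(posterior)  # approximately [0.8, 0.2]
```

The case would be assigned to the class with the largest posterior probability, here the first class.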
For example, the training data set could be used to calculate the relative frequency

$$P(X_i \mid C_j) = \frac{\text{\# of } X_i \text{ in } C_j}{\text{total \# of cases in } C_j}$$

Two Ways of Calculating the Prior Class Probabilities $P(C_j)$

• "Pure" Bayesian Uniform (Uninformative) Prior: each class has an equal probability, $P(C_j) = 1/J$, $j = 1, 2, \ldots, J$.
• Empirical Bayes Prior: the training set relative frequencies of each class are used as the "Empirical" Bayes prior, $P(C_j) = (\text{\# of training cases falling into } C_j) / (\text{total \# of training cases})$.
• For more details on the Naïve Bayes Classifier see the pdf file "Naïve Bayes Classifier.pdf" on the class website.

Classroom Exercise: Exercise 8
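The relative-frequency estimate and the two priors described above can be sketched in Python as follows. This is only an illustration for categorical characteristics: the helper names (`relative_frequency`, `uniform_prior`, `empirical_prior`) and the toy training set are assumptions made here, not part of the course materials.

```python
from collections import Counter


def uniform_prior(classes):
    """'Pure' Bayesian uniform prior: P(C_j) = 1 / J for each class."""
    J = len(classes)
    return {c: 1 / J for c in classes}


def empirical_prior(labels):
    """Empirical Bayes prior: share of training cases falling into each class."""
    counts = Counter(labels)
    n = len(labels)
    return {c: counts[c] / n for c in counts}


def relative_frequency(rows, labels, i, value, cls):
    """P(X_i = value | cls) = (# of cases in cls with that value) / (# of cases in cls)."""
    in_class = [row for row, y in zip(rows, labels) if y == cls]
    if not in_class:
        raise ValueError(f"no training cases in class {cls!r}")
    return sum(1 for row in in_class if row[i] == value) / len(in_class)


# Hypothetical training set: two categorical characteristics per case.
rows = [("yes", "low"), ("yes", "high"), ("no", "low"), ("no", "low")]
labels = ["A", "A", "A", "B"]

print(uniform_prior(["A", "B"]))    # each class gets 1/2
print(empirical_prior(labels))      # A: 3/4, B: 1/4
print(relative_frequency(rows, labels, 0, "yes", "A"))  # 2 of the 3 class-A cases
```

With unbalanced classes, as in this toy set, the two priors differ, so the choice between them can change which class receives the highest posterior probability.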