PPT13 - Naive Bayes Classifier

Eco 6380 Predictive Analytics For Economists
Spring 2016
Professor Tom Fomby
Department of Economics, SMU

Presentation 13: Naïve Bayes Classifier
Chapter 8 in SPB

Basis of Naïve Bayes Classifier: Bayes Theorem

• The Naïve Bayes classifier is a classification method based on Bayes Theorem. Let $C_j$ denote that an output belongs to the j-th class, $j = 1, 2, \ldots, J$, out of J possible classes. Let $P(C_j \mid X_1, X_2, \ldots, X_p)$ denote the (posterior) probability of belonging to the j-th class given the individual characteristics $X_1, X_2, \ldots, X_p$. Furthermore, let $P(X_1, X_2, \ldots, X_p \mid C_j)$ denote the probability of observing the individual characteristics $X_1, X_2, \ldots, X_p$ in a case belonging to the j-th class, and let $P(C_j)$ denote the unconditional (i.e. without regard to individual characteristics) prior probability of belonging to the j-th class. For a total of J classes, Bayes theorem gives us the following probability rule for calculating the case-specific probability of falling into the j-th class:

$$P(C_j \mid X_1, X_2, \ldots, X_p) = \frac{P(X_1, X_2, \ldots, X_p \mid C_j)\, P(C_j)}{\text{Denom}}$$

where

$$\text{Denom} = P(X_1, X_2, \ldots, X_p \mid C_1)\, P(C_1) + \cdots + P(X_1, X_2, \ldots, X_p \mid C_J)\, P(C_J)$$

The Naïve Independence Assumption:

$$P(X_1, X_2, \ldots, X_p \mid C_j) = P(X_1 \mid C_j) \cdot P(X_2 \mid C_j) \cdots P(X_p \mid C_j)$$

This assumption states that the joint probability of a case's characteristics, given that the case is in Class j, is (naively) equal to the product of the individual probabilities of each characteristic given Class j. This simplifies the computation of the Class probabilities of cases and helps prevent class probabilities from predominantly being singular (i.e. either zero or one) across a majority of the cases. Then, in the independent case, the terms on the right-hand side of the above equation can be calculated simply as the relative frequencies of the individual $X_i$'s in Class $C_j$.
For example, the training data set could be used to calculate the relative frequency

$$P(X_i \mid C_j) = \frac{\#\text{ of } X_i \text{ in } C_j}{\text{total } \#\text{ of cases in } C_j}$$

Two Ways of Calculating the Prior Class Probabilities $P(C_j)$

• “Pure” Bayesian Uniform (Uninformative) Prior: each Class has an equal probability, $P(C_j) = 1/J$, $j = 1, 2, \ldots, J$
• Empirical Bayes Prior: the training set relative frequencies of each Class are used as the “Empirical” Bayes Prior, $P(C_j) = (\#\text{ of training cases falling into } C_j) / (\text{total } \#\text{ of training cases})$
• For more details on the Naïve Bayes Classifier see the pdf file “Naïve Bayes Classifier.pdf” on the class website.

Classroom Exercise: Exercise 8
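
The calculations described in this presentation (relative frequencies for $P(X_i \mid C_j)$, the uniform and empirical priors, and Bayes theorem under the naïve independence assumption) can be sketched in a few lines of Python. This is a minimal illustration, not the course's own implementation: the function names, the two categorical characteristics, the class labels "buy" and "skip", and the six training cases are all hypothetical, and no smoothing is applied, so a value never seen in a class gets probability zero.

```python
from collections import Counter


def fit_relative_frequencies(rows, labels):
    """Estimate P(X_i = value | C_j) as training-set relative frequencies."""
    class_counts = Counter(labels)
    # value_counts[(class, feature index, value)] = number of matching cases
    value_counts = Counter()
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(label, i, value)] += 1
    cond_prob = {
        key: count / class_counts[key[0]] for key, count in value_counts.items()
    }
    return class_counts, cond_prob


def class_priors(class_counts, kind="empirical"):
    """Return P(C_j) under the uniform or the empirical Bayes prior."""
    if kind == "uniform":
        return {c: 1 / len(class_counts) for c in class_counts}
    if kind == "empirical":
        total = sum(class_counts.values())
        return {c: n / total for c, n in class_counts.items()}
    raise ValueError("kind must be 'uniform' or 'empirical'")


def posterior(case, priors, cond_prob):
    """Apply Bayes theorem under the naive independence assumption."""
    numerators = {}
    for c, prior in priors.items():
        likelihood = 1.0
        for i, value in enumerate(case):
            # A value never observed in class c has relative frequency 0.
            likelihood *= cond_prob.get((c, i, value), 0.0)
        numerators[c] = likelihood * prior
    denom = sum(numerators.values())
    if denom == 0:
        raise ValueError("case has zero probability under every class")
    return {c: num / denom for c, num in numerators.items()}


# Hypothetical training data: (owns_car, income_level) -> class label
rows = [
    ("yes", "low"), ("yes", "high"), ("no", "low"), ("yes", "low"),
    ("no", "high"), ("yes", "high"),
]
labels = ["buy", "buy", "buy", "buy", "skip", "skip"]

class_counts, cond_prob = fit_relative_frequencies(rows, labels)
new_case = ("yes", "high")

post_empirical = posterior(new_case, class_priors(class_counts, "empirical"), cond_prob)
post_uniform = posterior(new_case, class_priors(class_counts, "uniform"), cond_prob)
print("empirical prior:", post_empirical)
print("uniform prior:  ", post_uniform)
```

With this toy data, the posterior probability of "buy" for the new case works out to 3/7 under the empirical prior and 3/11 under the uniform prior, which shows how the choice of prior shifts the class probabilities even when the relative frequencies are identical.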
