MSDS 420 Module 8 Discussion

When assessing the effectiveness of an information retrieval (IR) system, precision gives the fraction of the returned results that are relevant to the information need. Recall gives the fraction of the relevant documents in the collection that were returned by the system. To illustrate how these two statistics play complementary roles as effectiveness measures, we can start by looking at the confusion matrix shown in Table 1 (Ting, 2011).

                            Actual Class
                            Positive               Negative
Assigned Class   Positive   True Positive (TP)     False Positive (FP)
                 Negative   False Negative (FN)    True Negative (TN)

Table 1: Confusion Matrix

From this confusion matrix, we can mathematically define Precision (positive predictive value) and Recall as follows:

Precision = True Positives/Total number of positives predicted = TP/(TP + FP)

Recall = True Positives/Total number of actual positives = TP/(TP + FN)

In effect, precision tells us how valid the results are, and recall tells us how complete they are. One figure that helped me visualize these metrics is shown in Figure 1.

Figure 1: Precision and Recall. Image created by Walber. https://commons.wikimedia.org/wiki/File:Precisionrecall.svg

Rather than being reported separately, the two measures are often combined into a single measure of retrieval performance called the F-measure, which is their harmonic mean:

F-measure = 2 * Recall * Precision/(Recall + Precision)

The index at the end of a textbook is like the inverted index of an IR system: it maps each term to the places where that term appears. The purpose of an inverted index is to speed up queries for full-text searches, since the system can look up a term directly instead of scanning every document. Textbook indexes are basically printed inverted indexes that required a tremendous amount of effort to produce.

References

Ting, K. M. (2011). Precision and Recall. In C. Sammut & G. I. Webb (Eds.), Encyclopedia of Machine Learning. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-30164-8_652
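
As a supplement to this post, here is a minimal Python sketch of the ideas discussed. It takes precision as TP/(TP + FP), recall as TP/(TP + FN), and the F-measure as 2 * Recall * Precision/(Recall + Precision), and it builds a toy inverted index that maps each term to the documents containing it. The function names, the three sample documents, and the relevance judgments are all hypothetical and chosen only for illustration; returning 0.0 when a denominator is zero is also an assumption, not something the definitions prescribe.

```python
from collections import defaultdict


def precision_recall_f(tp, fp, fn):
    """Return (precision, recall, F-measure) from confusion-matrix counts."""
    # Assumption: a ratio with a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * recall * precision / (recall + precision)
    return precision, recall, f_measure


def build_inverted_index(docs):
    """Map each lowercased term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


# Hypothetical toy collection: document id -> text.
docs = {
    1: "precision measures validity",
    2: "recall measures completeness",
    3: "inverted index speeds search",
}
index = build_inverted_index(docs)

# A one-term query is answered by a single lookup instead of a full scan.
retrieved = index.get("measures", set())
# Hypothetical relevance judgments for this query.
relevant = {1, 3}

tp = len(retrieved & relevant)  # returned and relevant
fp = len(retrieved - relevant)  # returned but not relevant
fn = len(relevant - retrieved)  # relevant but not returned
p, r, f = precision_recall_f(tp, fp, fn)
print(p, r, f)  # prints: 0.5 0.5 0.5
```

In this toy run, the query returns documents 1 and 2, of which only document 1 is relevant, while relevant document 3 is missed. Half of the returned results are valid (precision 0.5) and half of the relevant documents were found (recall 0.5).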