A Brief Summary on Text Retrieval
J. H. Wang
Apr. 15, 2008

The Retrieval Process
[Figure: the retrieval process. Components: User Interface, Text Operations, Query Operations, Indexing, Searching, Ranking, DB Manager Module, Index (inverted file), Text Database. Flow labels: user need, logical view, user feedback, query, retrieved docs, ranked docs.]

Topics in Text Retrieval
• Indexing
  – Data structures: inverted files
    • Different forms (see IIR Chap. 2)
• Searching
  – Full-text search: string matching
  – Keyword search
  – Boolean search
  – Complex searches: phrase, prefix, suffix, substring, …

Topics in Text Retrieval
• Text processing
  – Lexical analysis
  – Term selection
    • Stopword removal, stemming, …
  – Term weighting
    • TFxIDF: various forms (see IIR Chap. 6)
• Ranking: IR models
  – Vector space model
  – Probabilistic model (see IIR Chap. 11)
  – Latent Semantic Indexing (see IIR Chap. 18)
    • Dimension reduction

Topics in Text Retrieval
• Retrieval evaluation
  – Precision
  – Recall
  – F-measure
  – More: (see IIR Chap. 8)
• Relevance feedback
  – Vector-space
  – Probabilistic
• Query expansion
  – Global analysis
  – Local analysis

What's Next?
• More models…
  – Language models
  – Machine learning techniques
    • Naïve Bayes, Support Vector Machine, …
• More topics…
  – Text classification
  – Clustering
  – Filtering

What's Next?
• Applications
  – The Web
    • Crawling, link analysis, …
  – Digital libraries
    • Metadata, content protection, …
  – Multimedia IR
    • Image, spoken document, music, video, …
  – Cross-language information retrieval
  – Natural language processing, information extraction, question answering, …
  – …
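
Example: indexing, term weighting, and ranking
The indexing, TFxIDF, and vector space model topics listed in this summary can be tied together in a small sketch. Everything below is illustrative and not taken from the slides: the toy collection DOCS, the function names tokenize, build_index, and rank, and the particular weighting (raw term frequency times log(N/df), one of the "various forms") are all assumptions. Lexical analysis is reduced to lowercasing and whitespace splitting, with no stopword removal or stemming.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy collection, made up for this example.
DOCS = {
    "d1": "text retrieval with inverted files",
    "d2": "boolean search over an inverted index",
    "d3": "ranking documents with the vector space model",
}


def tokenize(text):
    """Minimal lexical analysis: lowercase, split on whitespace."""
    return text.lower().split()


def build_index(docs):
    """Build an inverted file: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index


def rank(query, index, n_docs):
    """Rank documents by cosine similarity of tf*idf vectors.

    The query norm is the same for every document, so it is left out;
    this changes the scores but not the ordering.
    """
    def idf(term):
        return math.log(n_docs / len(index[term]))

    # Squared length of each document vector, accumulated term by term.
    sq_norm = defaultdict(float)
    for term, postings in index.items():
        for doc_id, tf in postings.items():
            sq_norm[doc_id] += (tf * idf(term)) ** 2

    # Dot product of query and document vectors, using only the
    # postings lists of the query terms.
    scores = defaultdict(float)
    for term, q_tf in Counter(tokenize(query)).items():
        if term not in index:
            continue
        weight = idf(term)
        for doc_id, tf in index[term].items():
            scores[doc_id] += (q_tf * weight) * (tf * weight)

    ranked = [
        (doc_id, score / math.sqrt(sq_norm[doc_id]))
        for doc_id, score in scores.items()
        if sq_norm[doc_id] > 0
    ]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    index = build_index(DOCS)
    for doc_id, score in rank("inverted index", index, len(DOCS)):
        print(doc_id, round(score, 3))
```

With this toy collection, the query "inverted index" retrieves d2 ahead of d1, because "index" occurs in only one document and so carries a higher IDF weight than "inverted"; d3 shares no term with the query and is not returned. A Boolean AND over the same index amounts to intersecting the key sets of two postings dictionaries.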
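
Example: retrieval evaluation
The precision, recall, and F-measure entries under retrieval evaluation can be made concrete with set-based definitions. This is a minimal sketch, assuming unranked result sets and binary relevance judgments; the function name precision_recall_f and the document ids in the usage lines are hypothetical.

```python
def precision_recall_f(retrieved, relevant, beta=1.0):
    """Return (precision, recall, F_beta) for one query.

    retrieved: ids returned by the system
    relevant:  ids judged relevant
    beta:      beta > 1 favors recall, beta < 1 favors precision
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)

    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0

    # F is defined as 0 when both precision and recall are 0.
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0

    b2 = beta ** 2
    f_measure = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_measure


if __name__ == "__main__":
    # Made-up judgments: 4 documents retrieved, 3 relevant, 2 in common.
    p, r, f = precision_recall_f(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"])
    print(p, r, f)
```

In the usage lines, two of the four retrieved documents are relevant (precision 1/2) and two of the three relevant documents are retrieved (recall 2/3), which gives F1 = 4/7.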