A Brief Summary on Text Retrieval
J. H. Wang
Apr. 15, 2008
The Retrieval Process
[Figure: block diagram of the retrieval process. Components: User Interface, Text Operations, Query Operations, Indexing, Searching, Ranking, DB Manager Module, Index (inverted file), Text Database. Flow labels: user need, logical view, user feedback, query, retrieved docs, ranked docs. Numeric labels shown in the diagram: 4, 10; 6, 7; 5; 8; 8; 2.]
Topics in Text Retrieval
• Indexing
– Data structures: Inverted files
• Different forms (see IIR Chap. 2)
• Searching
– Full-text search: string matching
– Keyword search
– Boolean search
– Complex searches: phrase, prefix, suffix, substring, …
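The indexing and searching topics above can be illustrated with a small sketch. It assumes whitespace tokenization and an in-memory dictionary as the inverted file; the function names (`build_inverted_index`, `boolean_and`, `boolean_or`, `prefix_search`) and the sample documents are hypothetical, chosen only for illustration, and the prefix search is a plain scan of the vocabulary rather than an optimized structure.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the sorted list of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():  # assumption: whitespace tokenization
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def boolean_and(index, terms):
    """Ids of documents that contain every query term."""
    postings = [set(index.get(t.lower(), ())) for t in terms]
    if not postings:
        return []
    return sorted(set.intersection(*postings))


def boolean_or(index, terms):
    """Ids of documents that contain at least one query term."""
    result = set()
    for t in terms:
        result.update(index.get(t.lower(), ()))
    return sorted(result)


def prefix_search(index, prefix):
    """Vocabulary terms starting with the prefix (linear scan of the vocabulary)."""
    p = prefix.lower()
    return sorted(t for t in index if t.startswith(p))


# Hypothetical sample collection.
docs = [
    "text retrieval with inverted files",
    "boolean search over an inverted index",
    "string matching for full text search",
]
index = build_inverted_index(docs)
print(boolean_and(index, ["inverted", "search"]))
print(boolean_or(index, ["text", "boolean"]))
print(prefix_search(index, "se"))
```

A real system would typically store postings in compressed sorted lists and merge them instead of building Python sets, but the term-to-documents mapping is the same idea.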
Topics in Text Retrieval
• Text processing
– Lexical analysis
– Term selection
• Stopword removal, stemming, …
– Term weighting
• TF×IDF: various forms (see IIR Chap. 6)
• Ranking -- IR Models
– Vector space model
– Probabilistic model (see IIR Chap. 11)
– Latent Semantic Indexing (see IIR Chap. 18)
• Dimension reduction
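As a rough illustration of the text processing, term weighting, and vector space ranking topics above, the sketch below tokenizes text, removes stopwords, weights terms with one simple TF×IDF variant (raw term frequency times log(N/df)), and ranks documents by cosine similarity. The stopword list, function names, and sample documents are assumptions made for this example; stemming is omitted, and other TF×IDF forms would work equally well.

```python
import math
import re
from collections import Counter

# Illustrative stopword list only; real lists are much longer.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "for", "and", "with", "to"}


def tokenize(text):
    """Lexical analysis: lowercase, split on non-alphanumerics, drop stopwords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def build_idf(token_lists):
    """idf(t) = log(N / df(t)), where df(t) is the number of documents containing t."""
    n = len(token_lists)
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))
    return {t: math.log(n / count) for t, count in df.items()}


def tfidf_vector(tokens, idf):
    """Sparse vector {term: tf * idf}; terms outside the collection vocabulary are dropped."""
    tf = Counter(tokens)
    return {t: c * idf[t] for t, c in tf.items() if t in idf}


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either has zero length."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return (doc_id, score) pairs sorted by decreasing cosine similarity."""
    token_lists = [tokenize(d) for d in docs]
    idf = build_idf(token_lists)
    doc_vectors = [tfidf_vector(tokens, idf) for tokens in token_lists]
    query_vector = tfidf_vector(tokenize(query), idf)
    scores = [(i, cosine(query_vector, dv)) for i, dv in enumerate(doc_vectors)]
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical sample collection and query.
docs = [
    "the inverted index maps terms to documents",
    "ranking documents with the vector space model",
    "stemming and stopword removal in text processing",
]
ranked = rank("vector space ranking", docs)
print(ranked)
```

Documents sharing no weighted term with the query score 0.0, so here only the second document receives a positive score.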
Topics in Text Retrieval
• Retrieval evaluation
– Precision
– Recall
– F-measure
– More: (see IIR Chap. 8)
• Relevance feedback
– Vector-space
– Probabilistic
• Query expansion
– Global analysis
– Local analysis
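The evaluation measures listed above (precision, recall, F-measure) can be computed from a set of retrieved documents and a set of relevant documents. The sketch below is a minimal set-based version; the function name `precision_recall_f`, the `beta` parameter default, and the document ids are assumptions for illustration, and ranked-list measures are not covered.

```python
def precision_recall_f(retrieved, relevant, beta=1.0):
    """Set-based precision, recall, and F-measure.

    precision = |retrieved ∩ relevant| / |retrieved|
    recall    = |retrieved ∩ relevant| / |relevant|
    F_beta    = (1 + beta^2) * P * R / (beta^2 * P + R)
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0  # avoid division by zero
    b2 = beta * beta
    f = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f


# Hypothetical example: 4 documents retrieved, 3 relevant, 2 in common.
p, r, f1 = precision_recall_f(retrieved=[1, 2, 3, 4], relevant=[2, 4, 5])
print(p, r, f1)
```

With `beta=1.0` the F-measure is the harmonic mean of precision and recall; larger `beta` values weight recall more heavily.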
What’s Next?
• More models…
– Language models
– Machine learning techniques
• Naïve Bayes, Support Vector Machine, …
• More topics…
– Text classification
– Clustering
– Filtering
What’s Next?
• Applications
– The Web
• Crawling, link analysis, …
– Digital libraries
• Metadata, content protection, …
– Multimedia IR
• Image, spoken document, music, video, …
– Cross-language information retrieval
– Natural language processing, information extraction, question answering
– …