Knowledge Discovery and Information Retrieval

Knowledge Management Systems
• Knowledge Discovery in Databases
• Information Retrieval
• Formal methods to discover information &
possibly knowledge.
- Data collection
• Documents
• Usage
- Data analysis
• Relationships
• IR measures
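The slide does not say which IR measures are meant. As a minimal sketch, assuming the common set-based measures (precision, recall, F1) and made-up document ids, they could be computed for a single query like this:

```python
def precision_recall_f1(retrieved, relevant):
    """Return (precision, recall, F1) for one query, using set overlap."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Hypothetical document ids, for illustration only.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d7"]

p, r, f = precision_recall_f1(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

Precision penalizes retrieving irrelevant documents, recall penalizes missing relevant ones, and F1 is their harmonic mean.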
KDD Process
• Goal: extracting actionable knowledge from
data
- Understandable patterns
- Rules
• Updated methods to extend beyond statistical
analysis
- Volumes of data collection
- Increased computation power
• Real-time
• Continuous data
- Advances in visualization
KDD in Use
• Data Mining is only one step
- Preprocessing
- Data Transformation
- Pattern Detection
- Interpretation
- Use
• Most development work is in the
preprocessing
• Most intellectual work should be in forming
hypotheses
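The steps on this slide can be sketched as a small pipeline. This is only an illustration under stated assumptions: the toy records, the function names, and the choice of co-occurring item pairs as the "pattern" are all hypothetical, not taken from the slides.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw records: (user, comma-separated items); some are dirty.
RAW = [
    ("u1", "Bread, milk"),
    ("u2", "bread,milk,eggs"),
    ("u3", ""),
    ("u4", "milk, bread"),
    ("u5", "eggs"),
]


def preprocess(raw):
    """Preprocessing: drop empty records and normalize item strings."""
    cleaned = []
    for user, items in raw:
        parts = [p.strip().lower() for p in items.split(",") if p.strip()]
        if parts:
            cleaned.append((user, parts))
    return cleaned


def transform(cleaned):
    """Data transformation: represent each record as a set of items."""
    return [frozenset(items) for _, items in cleaned]


def detect_patterns(baskets, min_support=0.5):
    """Pattern detection: item pairs present in at least min_support of baskets."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    n = len(baskets)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


def interpret(patterns):
    """Interpretation: render the patterns as readable statements."""
    return [f"{a} and {b} appear together in {s:.0%} of baskets"
            for (a, b), s in sorted(patterns.items())]


baskets = transform(preprocess(RAW))
patterns = detect_patterns(baskets)
for line in interpret(patterns):
    print(line)
```

Even in this toy version, most of the code handles cleaning and reshaping the data, while the pattern detection step is a few lines.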
KDD Practices
• Classification
• Regression
• Clustering
• Summarization
• Dependency Modeling
• Link analysis
• Sequence analysis
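To make one of these practices concrete, here is a minimal clustering sketch. The slide names no algorithm; k-means (Lloyd's algorithm) on one-dimensional data is assumed here purely for illustration, and the data and starting centers are made up.

```python
def kmeans_1d(values, centers, iterations=10):
    """Lloyd's algorithm on 1-D data, starting from the given centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        new_centers = [sum(c) / len(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:  # converged: assignments no longer change
            break
        centers = new_centers
    return centers, clusters


# Hypothetical measurements forming two obvious groups.
data = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, clusters = kmeans_1d(data, centers=[0.0, 5.0])
print(centers)
print(clusters)
```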
IR & the Semantic Web
• Rich description of documents enables
additional functionality
- DARPA Agent Markup Language (DAML)
- Ontology Inference Layer (OIL)
• Is this “semantic markup” derived from tacit
or explicit knowledge?
- How can it be generated?
- How can it be used?
• Information Retrieval
• Question answering (simple & complex)
• Faith in XML
Semantic IR
• How systems should work
• Events ontology
• Coordination among individuals
- Groups?
- Interdependencies?
• Processing for Hybrid IR?
- Trust in ML
- Trust in System
Navigating Social Cyberspaces
• Understanding Usenet use
- Postings
• Why
• How
• Information
- Distribution
• Cross postings
• Specific groups & cultures
- Free-riders vs. Contributors
- Usenet readers
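Cross-posting can be measured from message headers, since a Usenet article lists its target groups as a comma-separated value in the Newsgroups header. The sketch below assumes hypothetical messages and helper names; it is not how any particular study computed its figures.

```python
from collections import Counter
from itertools import combinations

# Hypothetical messages: (author, value of the Newsgroups header).
MESSAGES = [
    ("alice", "comp.lang.python"),
    ("bob", "comp.lang.python,comp.programming"),
    ("carol", "comp.programming, comp.lang.python"),
    ("dave", "rec.music"),
]


def parse_groups(header):
    """Split a Newsgroups header value into a sorted list of distinct groups."""
    return sorted({g.strip() for g in header.split(",") if g.strip()})


def crosspost_pairs(messages):
    """Count how often each pair of groups receives the same message."""
    pairs = Counter()
    for _, header in messages:
        for pair in combinations(parse_groups(header), 2):
            pairs[pair] += 1
    return pairs


def crosspost_rate(messages):
    """Fraction of messages sent to more than one group."""
    crossposted = sum(1 for _, h in messages if len(parse_groups(h)) > 1)
    return crossposted / len(messages)


pairs = crosspost_pairs(MESSAGES)
rate = crosspost_rate(MESSAGES)
print(pairs.most_common())
print(f"cross-post rate: {rate:.0%}")
```

Pairs of groups with high counts suggest overlapping audiences or topics.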
Social Cyberspace Dimensions
• Netscan – social accounting metrics
- Size of group
- Culture
- Social cues
• Messaging protocols
- Asynchronous
- Real time (IM)
• Discussion Engagement
- Frequency, Replies
- Date, Time
• Thread and Author Tracker
- Thread Visualization
- New Threads vs. Replying to Old
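A few of these dimensions (group size, replies, new threads vs. replies) can be derived from a message log. The following is a rough sketch under assumed inputs: the post tuples, field names, and metric names are hypothetical and do not reproduce Netscan's actual metrics or implementation.

```python
from collections import defaultdict

# Hypothetical posts: (message_id, group, author, parent_id or None).
POSTS = [
    ("m1", "g.a", "alice", None),
    ("m2", "g.a", "bob", "m1"),
    ("m3", "g.a", "alice", "m2"),
    ("m4", "g.a", "carol", None),
    ("m5", "g.b", "dave", None),
]


def group_metrics(posts):
    """Per-group counts of posts, distinct posters, new threads and replies."""
    stats = defaultdict(
        lambda: {"posts": 0, "authors": set(), "threads": 0, "replies": 0}
    )
    for _, group, author, parent in posts:
        s = stats[group]
        s["posts"] += 1
        s["authors"].add(author)
        if parent is None:
            s["threads"] += 1  # a post with no parent starts a new thread
        else:
            s["replies"] += 1  # a post with a parent replies to an old one
    return {
        g: {
            "posts": s["posts"],
            "posters": len(s["authors"]),
            "new_threads": s["threads"],
            "replies": s["replies"],
            "replies_per_thread": s["replies"] / s["threads"] if s["threads"] else 0.0,
        }
        for g, s in stats.items()
    }


metrics = group_metrics(POSTS)
for group, m in sorted(metrics.items()):
    print(group, m)
```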
Blogs & Social Dimensions
• Are blogs taking the place of newsgroups?
• RSS Readers
• Topic discovery methods
- Blog rolls
- Search engines
- Links
• Issues of Awareness
• Posting technologies vs. Usenet
Answer Garden
• A shared organizational memory system
• Storing, retrieving and viewing information
• What methods worked best?
• What about user participation?
• What’s an optimal size?
PeopleGarden
• Another view of participation
• How does the community work?
- Welcoming
- Volumes of discussion
• Groups found and formed
- Paired relationships
- Arguments and issue development
• Visualizing interaction
- Personal history
- Groups and Threads
Problems in Data Warehousing
• How about problems in understanding users?
• Technical issues are easier than social issues
- Privacy
- Accuracy