Detailed Syllabus
Lecture-wise Breakup

JIIT University, Noida

Subject Code:
Subject Name: Information Retrieval & Data Mining
Semester: Odd 2013
Credits:
Contact Hours: 3
Coordinator: Sangeeta
Instructor: Sangeeta

Module No. 1: Introduction to Information Retrieval
Topics in the module: Theory of information retrieval, information retrieval on data and information retrieval on the web, information retrieval tools and their architecture.
No. of Lectures for the module: 3

Module No. 2: Boolean Retrieval
Topics in the module: An example information retrieval problem, processing Boolean queries, the extended Boolean model versus ranked retrieval.
No. of Lectures for the module: 2

Module No. 3: Dictionary and Tolerant Retrieval
Topics in the module: Wildcard queries, spelling correction, phonetic correction.
No. of Lectures for the module: 2

Module No. 4: Scoring, Term Weighting and the Vector Space Model
Topics in the module: Term frequency and weighting, vector space model, variant tf-idf scoring.
No. of Lectures for the module: 2

Module No. 5: Link Analysis
Topics in the module: Web as graph, PageRank.
No. of Lectures for the module: 2

Module No. 6: Information Retrieval Tools
Topics in the module: Web directory, search engine, meta search engines, web searching and search engine architecture, searching algorithms (Fish, Shark, etc.), and page ranking algorithms.
No. of Lectures for the module: 5

Module No. 7: Web Crawling
Topics in the module: Web crawler architecture and web crawling (parallel, distributed and focused web crawling). Near-duplicates and shingling.
No. of Lectures for the module: 4

Module No. 8: Q&A System
Topics in the module:
  Enhancing Technical Q&A System with Cite History [Paper]
  Design Lessons from the Fastest Q&A Site in the West [Paper]
  Avaaj Otalo: A Field Study of an Interactive Voice Forum for Small Farmers in Rural India [Paper]
No. of Lectures for the module: 4

Module No. 9: Introduction to Data Mining
Topics in the module: Introduction to data mining, data warehouse architecture, metrics and security.
No. of Lectures for the module: 2

Module No. 10: Data Preprocessing
Topics in the module: Data extraction, data cleaning, data integration and transformation, data reduction, loading and post-loading.
No. of Lectures for the module: 2

Module No. 11: Classification Algorithms
Topics in the module: Usability and complexity analysis of Bayesian, nearest neighbor, decision tree based and rule based algorithms.
No. of Lectures for the module: 5

Module No. 12: Clustering Algorithms
Topics in the module: Usability and complexity analysis of agglomerative hierarchical and K-means partitioning algorithms.
No. of Lectures for the module: 4

Module No. 13: Association Algorithms
Topics in the module: Usability and complexity analysis of Apriori, sampling, partitioning, and multiple minimum support algorithms.
No. of Lectures for the module: 5

Total number of Lectures: 42

Recommended Reading material: Author(s), Title, Edition, Publisher, Year of Publication etc. (Text books, Reference Books, Journals, Reports, Websites etc. in the IEEE format)

1. Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques", 2nd edition, Elsevier.
2. Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, "An Introduction to Information Retrieval", Cambridge University Press, 2009.
3. Margaret H. Dunham, "Data Mining: Introductory and Advanced Topics", Pearson Education.
4. Pang-Ning Tan, Michael Steinbach and Vipin Kumar, "Introduction to Data Mining", Pearson Education.
5. Richard O. Duda, Peter E. Hart and David G. Stork, "Pattern Classification", 2nd edition, Wiley, November 2000.
6. C. J. van Rijsbergen, "Information Retrieval", 2nd edition.
7. G. Salton and M. J. McGill, "Introduction to Modern Information Retrieval", Computer Series, McGraw-Hill, New York, NY.
8. ACM Transactions on Internet Technology.
9. ACM Transactions on Database Systems.
10. IEEE Transactions on Knowledge and Data Engineering.
11. ACM Transactions on Information Systems.