Survey
Relevance Feedback
• Limitations
– Must yield results within at most 3-4 iterations
– Users will likely terminate the process sooner
– Users may get irritated at seeing the same documents repeated after every iteration
• It has proven to increase the effectiveness of retrieval

Designing a Relevance Feedback System
• Use positive or negative relevance judgments
• Where to apply relevance judgments (query, profile, document, retrieval algorithm)
• Term weight modification, e.g.,
– Increase the weight for terms that appear in relevant docs
– Add new terms found in relevant docs that are frequently mentioned in connection with query terms

Genetic Algorithms
• Several possible solutions are generated in parallel
• The best few of these solutions are chosen and replicated, while the poor ones are eliminated
• Replicated solutions create a breeding population, from which new solutions arise
• Breeding is accomplished by an exchange of some of the characteristics of the chosen solutions in a crossover operation

Genetic Algorithms (cont.)
• Getting stuck on a low hill, as in hill climbing, is avoided by
– Pursuing multiple solutions in parallel and discarding the low hills
– Introducing new characteristic values at a low rate through a mutation process (random exchange)
• Relevance Feedback
– Relieves the user of the burden of assigning term weights
• Begins with no weights; generates query variants by assigning term weights randomly

Genetic Algorithms (cont.)
• Query variants are vectors of query term weights
• Each query variant is used to search the documents in the database
• Evaluate each variant with the equation on pg. 226
• The variants with the highest values create the most replications
• The resulting breeding population is developed to the same size as the original population

Natural Language Processing
• Focus on structure more than meaning; consequently, problems are
– Syntactic ambiguity, e.g., "they are visiting relatives"
– Deep structure of a sentence, e.g., grace
– May or may not be semantically correct, e.g., "Colorless green ideas sleep furiously"
– Syntactic rules do not apply to, e.g., Boolean queries

Natural Language Processing (cont.)
• Semantic Analysis
– Even more elusive, e.g., "red herring", "carrying coals to Newcastle"
• Techniques for Semantic Analysis
– Latent semantic indexing uses multidimensional scaling methods to identify concepts
– Dialogue analysis involves interaction that each time clarifies further what is to be retrieved

Citation Processing
• Use of cited documents to enhance the description of a primary document
• Some use co-citation as a measure of document similarity, i.e., the number of papers that cite both
• Bibliographic coupling: when two documents cite the same document
• Design problems: locating citations, interpreting them, eliminating duplicate/useless ones

Hypertext Links
• Means of connecting 2 distinct pieces of text
• Consists of an identifier and a pointer
• Possibly aid retrieval by suggesting hyperlinks given in the top-ranked documents retrieved
• Do not follow links from linked documents
• Information Filtering: eliminate large segments of the database from consideration
• Passage Retrieval: identify relevant sections within a large document, e.g., an encyclopedia

Image and Sound Processing
• Techniques for evaluating and manipulating images directly
• Voice recognition
• Animation and sound: compare to those in libraries
• Music can use style and then pattern matching
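
Example: term weight modification (sketch)
The two term-weight rules listed under Designing a Relevance Feedback System might be sketched as follows. This is only an illustration: the function name `reweight_query`, the parameters `boost`, `add_top` and `new_weight`, and the use of document co-occurrence as the measure of "frequently mentioned in connection with query terms" are all assumptions, not something the slides specify.

```python
from collections import Counter

def reweight_query(query, relevant_docs, boost=0.5, add_top=2, new_weight=0.5):
    """Return an updated copy of a {term: weight} query (hypothetical helper).

    relevant_docs is a list of token lists that the user judged relevant.
    """
    query_terms = set(query)

    # Count in how many relevant documents each term occurs.
    doc_freq = Counter()
    for doc in relevant_docs:
        doc_freq.update(set(doc))

    updated = dict(query)
    # Rule 1: raise the weight of query terms that appear in relevant docs.
    for term in query:
        if doc_freq[term] > 0:
            updated[term] += boost * doc_freq[term] / len(relevant_docs)

    # Rule 2: add new terms that co-occur with a query term in relevant docs.
    cooccurrence = Counter()
    for doc in relevant_docs:
        terms = set(doc)
        if terms & query_terms:
            cooccurrence.update(terms - query_terms)
    # Most frequent first; ties broken alphabetically for reproducibility.
    ranked = sorted(cooccurrence.items(), key=lambda pair: (-pair[1], pair[0]))
    for term, _ in ranked[:add_top]:
        updated[term] = new_weight
    return updated
```

A positive-only scheme is assumed here; negative judgments could lower weights in the same way, but the slides do not say how.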
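
Example: genetic algorithm over query term weights (sketch)
The Genetic Algorithms slides describe random initial weights, replication of the best variants, crossover, low-rate mutation, and a constant population size. A minimal sketch of that loop is given below. The evaluation equation on pg. 226 is not reproduced in these notes, so the sketch takes an arbitrary caller-supplied `fitness` function as a stand-in and assumes it returns non-negative scores. The name `evolve`, its parameters, and the single-point crossover are likewise assumptions.

```python
import random

def evolve(fitness, n_terms, pop_size=20, generations=30,
           mutation_rate=0.05, seed=0):
    """Evolve query-weight vectors; `fitness` is a placeholder evaluator."""
    rng = random.Random(seed)
    # Start with no given weights: each variant is a random weight vector.
    population = [[rng.random() for _ in range(n_terms)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(variant) for variant in population]
        # Replication: variants with higher scores are drawn more often.
        # The small constant keeps sampling valid if every score is zero.
        parents = rng.choices(population,
                              weights=[s + 1e-9 for s in scores],
                              k=2 * pop_size)
        next_generation = []
        for a, b in zip(parents[::2], parents[1::2]):
            # Crossover: prefix of one parent, remainder of the other.
            cut = rng.randint(1, n_terms - 1) if n_terms > 1 else 0
            child = a[:cut] + b[cut:]
            # Mutation: occasionally replace a weight with a random value.
            child = [rng.random() if rng.random() < mutation_rate else w
                     for w in child]
            next_generation.append(child)
        # The new population has the same size as the original one.
        population = next_generation
    return max(population, key=fitness)

# Toy stand-in for the evaluation step: the share of relevant documents
# among the two top-ranked ones (documents are term-count vectors).
toy_docs = [([1, 0, 1], True), ([0, 1, 0], False),
            ([1, 1, 0], False), ([0, 0, 1], True)]

def toy_fitness(weights):
    ranked = sorted(toy_docs,
                    key=lambda d: -sum(w * x for w, x in zip(weights, d[0])))
    return sum(1 for _, relevant in ranked[:2] if relevant) / 2
```

Calling `evolve(toy_fitness, n_terms=3)` returns one weight vector. Because parents are drawn in proportion to their scores, the user never has to assign term weights by hand.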
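
Example: co-citation and bibliographic coupling (sketch)
The two citation measures from the Citation Processing slide can be computed from a simple citation map. The sketch below assumes a hypothetical `citations` dictionary that maps each paper to the set of documents it cites. Counting shared references as a coupling "strength" is an assumption, since the slide only says that coupling occurs when two documents cite the same document.

```python
def cocitation(citations, a, b):
    """Number of papers that cite both a and b."""
    return sum(1 for refs in citations.values() if a in refs and b in refs)

def bibliographic_coupling(citations, a, b):
    """Number of documents cited by both a and b."""
    return len(citations.get(a, set()) & citations.get(b, set()))

# Hypothetical citation map: paper -> set of documents it cites.
citations = {
    "P1": {"A", "B"},
    "P2": {"A", "B", "C"},
    "P3": {"A"},
    "A": {"X", "Y"},
    "B": {"Y", "Z"},
}

print(cocitation(citations, "A", "B"))              # 2 (cited together by P1 and P2)
print(bibliographic_coupling(citations, "A", "B"))  # 1 (both cite Y)
```

Both functions assume that citations have already been located and de-duplicated, which the slide lists as design problems in their own right.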