RELEVANCE FEEDBACK ALGORITHMS INSPIRED
BY QUANTUM DETECTION
Abstract
Information Retrieval (IR) is concerned with indexing and retrieving documents
containing information relevant to a user’s information need. Relevance Feedback
(RF) is a class of effective algorithms for improving IR; it consists of
gathering further data representing the user’s information need
and automatically creating a new query. In this paper, we propose a class of RF
algorithms inspired by quantum detection to re-weight the query terms and to
re-rank the documents retrieved by an IR system. These algorithms project the query
vector onto a subspace spanned by the eigenvector which maximizes the distance
between the distribution of quantum probability of relevance and the distribution of
quantum probability of non-relevance. The experiments showed that the RF
algorithms inspired by quantum detection can outperform the state-of-the-art
algorithms.
Existing System
The automatic procedure that modifies the user’s queries is known as RF: some
relevance assessments about the retrieved documents are collected, and the query is
expanded with the terms found in the relevant documents, reduced by the terms
found in the irrelevant documents, or re-weighted using relevant or irrelevant
documents.
RF can be positive, negative or both. Positive RF brings only relevant documents
into play and negative RF makes use only of irrelevant documents; any effective
RF algorithm includes a “positive” component. Positive feedback is a
well-established technique by now, whereas negative feedback is still problematic and
requires further investigation; some proposals have already been made, such as
grouping irrelevant documents before using them for reducing the query.
Proposed System
The proposed system is designed to compute the new query vector using a linear
combination of the original query vector, the relevant document vectors and the
non-relevant document vectors, where the labels of relevance are collected in a
training set.
Quantum probability is the theory of probability developed within Quantum
Mechanics (QM). In QM, a probability space can be represented as vectors,
matrices and operators between them. A tutorial would be out of the scope of this
paper, so we provide only the information needed to understand the rest
of this paper.
Detection consists of identifying the information concealed in the data which are
transmitted by the source placed on one side, through a channel to the detector
placed on the other side. The data are only a representation of the “true”
information that one side wants to transmit.
Implementation
Implementation is the stage of the project when the theoretical design is
turned into a working system. It can therefore be considered the most critical
stage in achieving a successful new system and in giving the user confidence that
the new system will work and be effective.
The implementation stage involves careful planning, investigation of the
existing system and its constraints on implementation, design of methods to
achieve changeover, and evaluation of changeover methods.
Modules
1. Vector Space Model
2. Relevance Feedback
3. Quantum Probability
4. Quantum Detection
Vector Space Model
The VSM for IR represents both documents and queries as vectors of the
k-dimensional real space R^k. This vector space is defined by k basis vectors
corresponding to the terms extracted from a document collection. Each document
vector results from the weighted linear combination of the basis vectors which
represent the terms extracted from the document collection.
The early formulation of the VSM was reported by Salton, who later developed the
model in the 1970s to describe statistical methods for measuring semantic
relationships between words, such as synonymy and polysemy, and for building
networks of terms and documents.
The VSM was later revisited, experimented with, and applied to several tasks
(e.g., cross-language IR, passage retrieval and automatic hypertext generation).
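To make the vector representation concrete, the following is a minimal sketch in Python, not the project's actual code. The function names `to_vector` and `cosine`, the three-term vocabulary, and the term weights are hypothetical. Ranking by cosine similarity is an assumption: it is a common choice with the VSM, but the text above does not name a ranking function.

```python
import math

def to_vector(term_weights, vocabulary):
    """Map a {term: weight} dict to a vector with one coordinate per basis term."""
    return [term_weights.get(term, 0.0) for term in vocabulary]

def cosine(u, v):
    """Cosine of the angle between two vectors (assumed ranking function)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical vocabulary: each term is one basis vector of R^k (here k = 3).
vocabulary = ["quantum", "retrieval", "feedback"]
doc = to_vector({"quantum": 2.0, "feedback": 1.0}, vocabulary)
query = to_vector({"quantum": 1.0}, vocabulary)
score = cosine(query, doc)
```

Here the document and the query live in the same space, so any query modification (as in relevance feedback) amounts to moving the query vector.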
Relevance Feedback
The RF algorithm is also known as Rocchio’s algorithm; it is designed to
compute the new query vector using a linear combination of the original query vector,
the relevant document vectors and the non-relevant document vectors, where the
labels of relevance are collected in a training set.
Relevance feedback is a feature of some information retrieval systems. The idea
behind relevance feedback is to take the results that are initially returned from a
given query and to use information about whether or not those results are relevant
to perform a new query.
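The linear combination used by Rocchio’s algorithm can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the project's implementation: the coefficient names `alpha`, `beta` and `gamma`, their default values, and the clipping of negative weights to zero are assumptions (common conventions, none of them given in the text above), and the helper `centroid` is hypothetical.

```python
def centroid(vectors, k):
    """Average of a list of k-dimensional vectors; zero vector if the list is empty."""
    if not vectors:
        return [0.0] * k
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(k)]

def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """New query = alpha*query + beta*centroid(relevant) - gamma*centroid(non_relevant).

    The coefficient values are illustrative assumptions. Negative weights are
    clipped to zero, which is also an assumption of this sketch.
    """
    k = len(query)
    rel = centroid(relevant, k)
    non = centroid(non_relevant, k)
    return [max(0.0, alpha * q + beta * r - gamma * n)
            for q, r, n in zip(query, rel, non)]

# Hypothetical training set: two relevant documents and one non-relevant document.
new_query = rocchio(
    query=[1.0, 0.0, 0.0],
    relevant=[[2.0, 0.0, 1.0], [0.0, 0.0, 1.0]],
    non_relevant=[[0.0, 4.0, 0.0]],
)
```

The third term gains weight because it occurs in the relevant documents, while the second term, seen only in the non-relevant document, stays at zero.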
Quantum Probability
A probability space is given by some observables and by a probability function of
these observables. Quantum probability is the theory of probability developed
within Quantum Mechanics (QM). In QM, a probability space can be represented
as vectors, matrices and operators between them.
When a probability function is provided, each observable value, and hence each basis
vector, corresponds to a probability measure given by the function, thus yielding a
probability distribution.
Gleason’s theorem explains why the trace rule is used to compute the
probabilities of events represented by vector spaces. The theorem basically states
that a density matrix can encapsulate all the
information about a probability space, that is, it provides a probability distribution
for any conceivable observable.
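As a small worked illustration of the trace rule, the probability of an event represented by a projector E under a density matrix rho is tr(rho E). The sketch below uses plain Python lists; the helper names (`projector`, `probability`) and the 2x2 density matrix are hypothetical and chosen only for illustration.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))

def projector(v):
    """Rank-one projector onto the direction of v (v is normalized first)."""
    norm = math.sqrt(sum(x * x for x in v))
    u = [x / norm for x in v]
    return [[a * b for b in u] for a in u]

def probability(rho, event):
    """Trace rule: probability of the event under the density matrix rho."""
    return trace(matmul(rho, event))

# Hypothetical density matrix: symmetric, non-negative, trace equal to one.
rho = [[0.75, 0.0],
       [0.0, 0.25]]
p_first = probability(rho, projector([1.0, 0.0]))   # event along the first basis vector
p_second = probability(rho, projector([0.0, 1.0]))  # event along the second basis vector
p_diagonal = probability(rho, projector([1.0, 1.0]))  # event along a non-basis direction
```

The same density matrix assigns a probability to the diagonal direction as well as to the basis vectors, which is the sense in which it covers any observable rather than a single fixed one.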
Quantum Detection
Detection consists of identifying the information concealed in the data which are
transmitted by the source placed on one side, through a channel to the detector
placed on the other side. The data are only a representation of the “true”
information that one side wants to transmit.
When the particle arrives at the other side, it is measured by the receiver. This
measurement is accomplished by an observable. The observed values are utilized
to determine the state of the signal (e.g., a document) given by the coder.
In IR, the information that is relevant to the user’s information need is transmitted
by a system to the user by means of a document which is only a representation of
the information fulfilling the user’s need. The values observed from a document
(e.g., term frequency) can serve to decide about the state of the document.
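The detection idea can be sketched as follows, under several loudly stated assumptions that are not taken from the text: each density matrix is built as the average of the projectors of the normalized document vectors, the distance between the two distributions is taken as tr(rho_rel E) - tr(rho_non E), and the maximizing eigenvector is found by power iteration on the shifted difference matrix. All function names are hypothetical; this is an illustration of the principle, not the algorithm evaluated in the experiments.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def density(vectors):
    """Assumed construction: average of the projectors of the normalized vectors."""
    k = len(vectors[0])
    rho = [[0.0] * k for _ in range(k)]
    for v in vectors:
        u = normalize(v)
        for i in range(k):
            for j in range(k):
                rho[i][j] += u[i] * u[j] / len(vectors)
    return rho

def top_eigenvector(matrix, iterations=500):
    """Power iteration on (matrix + I).

    The eigenvalues of a difference of two density matrices lie in [-1, 1],
    so the shift by the identity makes the largest eigenvalue dominant.
    """
    k = len(matrix)
    v = normalize([1.0 + i for i in range(k)])  # arbitrary starting vector
    for _ in range(iterations):
        w = [v[i] + sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        v = normalize(w)
    return v

def project(query, direction):
    """Orthogonal projection of the query onto the line spanned by a unit vector."""
    coefficient = sum(q * d for q, d in zip(query, direction))
    return [coefficient * d for d in direction]

# Hypothetical two-term example: one relevant and one non-relevant document.
rho_rel = density([[1.0, 0.0]])
rho_non = density([[0.0, 1.0]])
difference = [[rho_rel[i][j] - rho_non[i][j] for j in range(2)] for i in range(2)]
detector = top_eigenvector(difference)
new_query = project([0.6, 0.8], detector)
```

In this toy case the detector direction coincides with the term that occurs only in the relevant document, so the projected query keeps its weight on that term and drops the other.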
However, the use of pseudo relevance feedback (PRF) with the algorithms inspired by
quantum detection can be less effective than RF, and less effective than the
algorithms with no expansion, for larger n’s and for long or verbose queries. When
inspecting the results at the level of query size, the situation appears to be more
complex than with explicit RF, because whether the algorithms inspired by quantum
detection were more effective than the baseline algorithms depended on n or on
query size.
System Configuration
H/W System Configuration:
Processor - Pentium IV
RAM - 1 GB
Hard Disk - 80 GB

S/W System Configuration:
Operating System : Windows 95/98/2000/XP
Application Server : Tomcat 5.0/6.X
Front End : HTML, Java, JSP
Scripts : JavaScript
Server side Script : Java Server Pages
Database : MySQL
Database Connectivity : JDBC