Handbook of Research
on Fuzzy Information
Processing in Databases
José Galindo
University of Málaga, Spain
Volume I
Information science reference
Hershey • New York
Acquisitions Editor: Kristin Klinger
Development Editor: Kristin Roth
Senior Managing Editor: Jennifer Neidig
Managing Editor: Jamie Snavely
Assistant Managing Editor: Carole Coulson
Copy Editor: April Schmidt, Shanelle Ramelb
Cover Design: Lisa Tosheff
Printed at: Yurchak Printing Inc.
Published in the United States of America by
Information Science Reference (an imprint of IGI Global)
701 E. Chocolate Avenue, Suite 200
Hershey PA 17033
Tel: 717-533-8845
Fax: 717-533-8661
E-mail: [email protected]
Web site: http://www.igi-global.com
and in the United Kingdom by
Information Science Reference (an imprint of IGI Global)
3 Henrietta Street
Covent Garden
London WC2E 8LU
Tel: 44 20 7240 0856
Fax: 44 20 7379 0609
Web site: http://www.eurospanbookstore.com
Copyright © 2008 by IGI Global. All rights reserved. No part of this publication may be reproduced, stored or distributed in any form or by
any means, electronic or mechanical, including photocopying, without written permission from the publisher.
Product or company names used in this set are for identification purposes only. Inclusion of the names of the products or companies does
not indicate a claim of ownership by IGI Global of the trademark or registered trademark.
Library of Congress Cataloging-in-Publication Data
Handbook of research on fuzzy information processing in databases / Jose Galindo, editor.
p. cm.
Summary: "This book provides comprehensive coverage and definitions of the most important issues, concepts, trends, and technologies in
fuzzy topics applied to databases, discussing current investigation into uncertainty and imprecision management by means of fuzzy sets and
fuzzy logic in the field of databases and data mining. It offers a guide to fuzzy information processing in databases"--Provided by publisher.
Includes bibliographical references and index.
ISBN-13: 978-1-59904-853-6 (hardcover)
ISBN-13: 978-1-59904-854-3 (ebook)
1. Databases--Handbooks, manuals, etc. 2. Data mining--Handbooks, manuals, etc. 3. Fuzzy mathematics--Handbooks, manuals, etc. I.
Galindo, Jose, 1970-
QA76.9.D32H336 2008
005.74--dc22
2007037381
British Cataloguing in Publication Data
A Cataloguing in Publication record for this book is available from the British Library.
All work contributed to this book set is original material. The views expressed in this book are those of the authors, but not necessarily of
the publisher.
If a library purchased a print copy of this publication, please go to http://www.igi-global.com/agreement for information on activating
the library's complimentary electronic access to this publication.
Chapter II
An Overview of
Fuzzy Approaches to
Flexible Database Querying
Sławomir Zadrożny
Polish Academy of Sciences, Poland
Guy de Tré
Ghent University, Belgium
Rita de Caluwe
Ghent University, Belgium
Janusz Kacprzyk
Polish Academy of Sciences, Poland
Abstract
In reality, a lot of information is available only in an imperfect form. This might be due to imprecision,
vagueness, uncertainty, incompleteness, or ambiguities. Traditional database systems can only adequately
cope with perfect data. Among others, fuzzy set theory has been applied to deal with imperfections of
data in a more natural way and to enhance the accessibility of databases. In this chapter, we give an
overview of main trends in the research on flexible querying techniques that are based on fuzzy set theory.
Both querying techniques for traditional databases as well as querying techniques for fuzzy databases
are described. The discussion comprises both the relational and the object-oriented database modeling
approaches.
Introduction
Databases are a very important component in
computer systems. Because of their increasing
number and volume, good and accurate accessibility to a database becomes even more important. A
lot of research has already been done to improve
database access. In this research, many aspects
have been dealt with, among which we mention file
organization, indexing, querying techniques, query
languages, and other data access techniques.
In this chapter, we give an overview of the
main research results on the development of flexible querying techniques that are based on fuzzy
set theory (Zadeh, 1965) and its related possibility
theory (Dubois & Prade, 1988; Zadeh, 1978). The
scope of the chapter is further limited to an overview
of those techniques that aim to enhance database
querying by introducing fuzzy preferences (Bosc,
Kraft, & Petry, 2005). Other techniques not dealt
with in this chapter include:
• Self-correcting querying systems that can correct syntactic and semantic errors in query formulations.
• Navigational querying systems that allow intelligent navigation through the database.
• Cooperative querying systems that support “indirect” answers like summaries, conditional answers, and contextual background information for (empty) results (Gaasterland, Godfrey, & Minker, 1992).
We will assume a simplified view of the database query as a combination of a number of
conditions that are to be met by the data sought.
The introduction of fuzzy preferences in queries
can be done at two levels: inside query conditions
and between query conditions. Fuzzy preferences
are introduced inside query conditions via flexible
search criteria and make it possible to express, in a gradual way, that some values
are more desirable than others.
Fuzzy preferences between query conditions are
expressed via grades of importance assigned to
particular query conditions indicating that the satisfaction of some query conditions is more desirable
than the satisfaction of others. Because of the use
of fuzzy preferences and the central role of fuzzy
set theory, the flexible querying approaches dealt
with in this chapter will be called fuzzy querying
in the remainder of the chapter.
The research on fuzzy querying already has a
long history. It has been inspired by the success of
fuzzy logic in modeling natural language propositions. The use of such propositions in queries, in
turn, seems to be very natural for human users
of any information system, notably the database
management system. Later on, the interest in
fuzzy querying has been reinforced by the omnipresence of network based applications, related
to buzzwords of modern information technology,
such as e-commerce, e-government, and so forth.
These applications evidently call for a flexible
querying capability when users are looking for
some goods, hotel accommodations, and so forth,
that may be best described using natural language
terms like cheap, large, close to the airport,
and so on. Another amplification of the interest in
fuzzy querying comes from developments in the
area of data warehousing and data mining related
applications. For example, a combination of fuzzy
querying and data mining interfaces (Kacprzyk &
Zadrożny, 2000a, 2000b) or fuzzy logic and the
OLAP (Online Analytical Processing) technology
(Laurent, 2003) may lead to new, effective, and
more efficient solutions in this area.
The remainder of the chapter is organized as
follows. In the next section, some preliminaries are
presented. In the Fuzzy Querying of Crisp Relational
Databases section, the results on fuzzy querying in
classical relational databases are presented, while
the Fuzzy Querying of Fuzzy Relational Databases
and Object-Oriented Approaches sections deal
with the same issues for fuzzy and object-oriented
cases, respectively. Finally, some concluding remarks are given.
Other chapters in this volume also deal with
some particular cases of fuzzy querying. Among
the most relevant ones, we want to mention here
the chapters written by:
• Thomopoulos, Buche, and Haemmerlé who describe flexible querying with hierarchical fuzzy sets.
• Dubois and Prade who handle bipolar queries.
• Takači and Škrbić who deal with introducing priorities in fuzzy queries.
• Barranco, Campaña, and Medina who write about a fuzzy object-relational database model and some strategies for fuzzy queries in this model.
• De Tré, Demoor, Callens, and Gosseye who present some flexible querying techniques that are based on case based reasoning.
Preliminaries
In order to review and discuss main contributions
to the research area of fuzzy querying, we have
to introduce the terminology and notation related
to the basics of database management and fuzzy
logic. A relational database may be viewed in an
abstract sense as a collection of relations or, informally, of the tables (Codd, 1970) that represent them,
comprising rows and columns. Each relation R—or
relational variable R (Date, 2004)—is defined via
the relation schema:
$$R(A_1 : \mathrm{Dom}(A_1), A_2 : \mathrm{Dom}(A_2), \ldots, A_n : \mathrm{Dom}(A_n)) \tag{1}$$
where the $A_i$’s are the names of attributes (columns) and the $\mathrm{Dom}(A_i)$’s are their associated domains.
Each relation (table) represents a class of objects
(meant as in common parlance rather than in the
object-oriented paradigm) essential for a part of the
real world modeled by a given database. A tuple
(row) of such a relation represents a particular
object of such a class.
The most interesting operation on a database,
from this chapter’s perspective, is the retrieval of
data satisfying certain conditions. Usually, to retrieve data, a user forms a query specifying these
conditions (criteria). The retrieval process may be
viewed as the calculation of a matching degree for
each tuple of relevant relation(s). Classically, a row
either matches the query or not; that is, the concept
of matching is binary. In the context of flexible
criteria, a degree of matching is considered.
Usually two general formal approaches to the
querying are assumed: the relational algebra and
the relational calculus. The former has a procedural
character: a query consists here of a sequence of
operations on relations that finally yield requested
data. These operations comprise five basic ones:
union, difference, projection, selection, and cross
product that may be combined to obtain some
derived operations such as intersection, division,
and join. The latter approach, known in two flavors
as the tuple relational calculus (TRC) or the
domain relational calculus (DRC), is of a more
declarative nature. Here a query just describes
what information is requested, but how it is to be
retrieved from a database is left to the database
management system.
The exact form of queries is not of utmost
importance for our considerations, as we focus
on the condition part of queries. However, some
reported research in this area directly employs the
de-facto standard querying language for relational
databases, that is, SQL (Structured Query Language) (cf., Melton & Simon, 2002; Ramakrishnan
& Gehrke, 2000). Thus, we will also sometimes
refer to the SELECT instruction of this language
and its WHERE clause, where query conditions
are specified.
We will use the following concepts and notation concerning fuzzy logic. A fuzzy set FS in
the universe U is characterized by a membership
function:
$$\mu_{FS} : U \to [0,1] \tag{2}$$
For each element $x \in U$, $\mu_{FS}(x)$ denotes the
membership grade or extent to which x belongs to
FS. On the one hand, fuzzy sets make it possible
to represent vague concepts like “tall man,” in an
appropriate way, taking into account the graduality
of such a concept. On the other hand, a fuzzy set
that is interpreted as a possibility distribution
can be used to represent the uncertainty about
the value of a variable, for example, representing
the height of a man (Dubois & Prade, 1988; Zadeh,
1978). Possibility distributions are denoted by π.
The notation πX is often used to indicate that the
distribution concerns the variable X:
$$\pi_X : U \to [0,1] \tag{3}$$
where X takes values from a universe U.
Possibility and necessity measures can
provide for the quantification of such an uncertainty. These measures are denoted by Π and N,
respectively, that is:
$$\Pi : \tilde{\wp}(U) \to [0,1] \quad \text{and} \quad N : \tilde{\wp}(U) \to [0,1] \tag{4}$$
where $\tilde{\wp}(U)$ stands for the family of fuzzy sets
defined over U. Assuming that all we know about
the value of a variable X is a possibility distribution $\pi_X$, these measures, for a given fuzzy set FS,
assess to what extent it is possible (Π) or sure (N) that the
value of X belongs to FS. More precisely, if $\pi_X$ is
the underlying possibility distribution, then:
$$\Pi_X(FS) = \sup_{u \in U} \min(\pi_X(u), \mu_{FS}(u)) \tag{5}$$
$$N_X(FS) = \inf_{u \in U} \max(1 - \pi_X(u), \mu_{FS}(u)) \tag{6}$$
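As a minimal sketch of Equations (5) and (6), assume a finite universe U, so that the supremum and infimum reduce to a maximum and a minimum. The universe of heights, the distribution `pi_height`, and the fuzzy set `mu_tall` below are hypothetical values chosen only for illustration.

```python
def possibility(universe, pi_x, mu_fs):
    """Equation (5) on a finite universe: max over u of min(pi_X(u), mu_FS(u))."""
    return max(min(pi_x[u], mu_fs[u]) for u in universe)


def necessity(universe, pi_x, mu_fs):
    """Equation (6) on a finite universe: min over u of max(1 - pi_X(u), mu_FS(u))."""
    return min(max(1.0 - pi_x[u], mu_fs[u]) for u in universe)


# Hypothetical universe of heights (cm).
heights = [160, 170, 180, 190]
# Hypothetical possibility distribution: "the height is around 180".
pi_height = {160: 0.0, 170: 0.5, 180: 1.0, 190: 0.5}
# Hypothetical fuzzy set "tall".
mu_tall = {160: 0.0, 170: 0.25, 180: 0.75, 190: 1.0}

poss = possibility(heights, pi_height, mu_tall)   # 0.75
nec = necessity(heights, pi_height, mu_tall)      # 0.5
print(poss, nec)
```

With these values it is possible to degree 0.75, and certain to degree 0.5, that the height is tall.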
Sometimes, the interval [NX (FS), ΠX (FS)] is
used as an estimate of the possibility that the actual
value of X comes from FS. The possibility (necessity) that two variables X and Y, the values of which
are given by possibility distributions $\pi_X$ and $\pi_Y$, are
in relation θ, for example, equality, is computed
as follows. The joint possibility distribution, $\pi_{XY}$,
of X and Y on U × U (assuming non-interactivity
of the variables) is given by:
$$\pi_{XY}(u, w) = \min(\pi_X(u), \pi_Y(w)) \tag{7}$$
Relation θ may be fuzzy and represented by a fuzzy
set $FS \in \tilde{\wp}(U \times U)$ such that $\mu_{FS}(u, w) = \theta(u, w)$. The
possibility (resp. necessity) measure associated
with $\pi_{XY}$ will be denoted by $\Pi_{XY}$ (resp. $N_{XY}$). Then,
we calculate the measures of the variables in relation θ as follows:
$$\mathrm{Possibility}(X \, \theta \, Y) = \Pi_{XY}(FS) = \sup_{u, w \in U} \min(\pi_X(u), \pi_Y(w), \theta(u, w)) \tag{8}$$
$$\mathrm{Necessity}(X \, \theta \, Y) = N_{XY}(FS) = \inf_{u, w \in U} \max(1 - \pi_X(u), 1 - \pi_Y(w), \theta(u, w)) \tag{9}$$
Knowing the possibility distributions of two
variables X and Y, one may be interested in how
similar these distributions are to each other. Obviously, Equations (8) and (9) may provide some
assessment of this similarity, but other indices of
similarity are also applicable. This leads to a distinction, proposed by Bosc, Duval, and Pivert (2000),
between representation-based and value-based
comparisons of possibility distributions. We will
discuss this later in the Fuzzy Querying of Fuzzy
Relational Databases section.
As an alternative for possibility and necessity measures, extended possibilistic truth values
(EPTVs) can be used to quantify uncertainty (de
Tré, 2002). An EPTV is defined as a possibility
distribution in the universe $I^* = \{T, F, \bot\}$ that
consists of the three truth values T (true), F (false),
and ⊥ (undefined), that is:
$$\tilde{t}^* : P \to \tilde{\wp}(I^*) \tag{10}$$
where P denotes the universe of all propositions. In
general, the EPTV $\tilde{t}^*(p)$ of a proposition $p \in P$
has the following format:
$$\tilde{t}^*(p) = \{(T, \mu_{\tilde{t}^*(p)}(T)), (F, \mu_{\tilde{t}^*(p)}(F)), (\bot, \mu_{\tilde{t}^*(p)}(\bot))\} \tag{11}$$
Hereby $\mu_{\tilde{t}^*(p)}(T)$ denotes the possibility that p
is true, $\mu_{\tilde{t}^*(p)}(F)$ is the possibility that p is false,
and $\mu_{\tilde{t}^*(p)}(\bot)$ is the possibility that some elements
of p are not applicable, undefined, or not supplied.
EPTVs extend the approach of possibility and necessity measures with an explicit facility to deal
with the inapplicability of information as can,
for example, occur with the evaluation of query
conditions.
In Table 1, some special cases of EPTVs are
presented. These cases are verified as follows:
• If it is completely possible that the proposition is true and no other truth values are possible, then it means that the proposition is true.
• If it is completely possible that the proposition is false and no other truth values are possible, then it means that the proposition is false.
• If it is completely possible that the proposition is true, it is completely possible that the proposition is false, and it is not possible that the proposition is inapplicable, then it means that the proposition is applicable, but unknown. This truth value will be called, in short, unknown.
• If it is completely possible that the proposition is inapplicable and no other truth values are possible, then it means that the proposition is inapplicable.
• If all truth values are completely possible, then this means that no information about the truth of the proposition is available. The proposition might be inapplicable, but might also be true or false. This truth value will be called, in short, unavailable.
Table 1. Special EPTVs
$\tilde{t}^*(p)$            Interpretation
{(T,1)}                     p is true
{(F,1)}                     p is false
{(T,1), (F,1)}              p is unknown
{(⊥,1)}                     p is inapplicable
{(T,1), (F,1), (⊥,1)}       Information about p is not available
Assume again that all we know about the
value of a variable X is a possibility distribution
$\pi_X$, defined over a universe U. Then the EPTV of
the proposition “X is FS” that expresses to which
extent the value of X is compatible with the value
represented by a given fuzzy set FS in U can be
calculated by:
$$\mu_{\tilde{t}^*(\text{'X is FS'})}(T) = \sup_{u \in U} \min(\pi_X(u), \mu_{FS}(u)) \tag{12}$$
$$\mu_{\tilde{t}^*(\text{'X is FS'})}(F) = \sup_{u \in U \setminus \{\bot_U\}} \min(\pi_X(u), 1 - \mu_{FS}(u)) \tag{13}$$
$$\mu_{\tilde{t}^*(\text{'X is FS'})}(\bot) = \min(\pi_X(\bot_U), 1 - \mu_{FS}(\bot_U)) \tag{14}$$
where ⊥U represents a special “undefined” element of U that is used to model cases where a
regular element of U is not applicable (cf. Prade
& Testemale, 1984).
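The calculation in Equations (12) through (14) can be sketched for a finite universe as follows. This is an illustration under stated assumptions: the supremum in (12) is taken over the whole of U and the one in (13) over U without the undefined element, the marker `BOTTOM` standing for that undefined element is a hypothetical encoding, and all numeric values are invented for the example.

```python
BOTTOM = "undefined"  # hypothetical marker for the special "undefined" element of U


def eptv(universe, pi_x, mu_fs):
    """Return the EPTV of the proposition 'X is FS' as a dict of possibilities."""
    regular = [u for u in universe if u != BOTTOM]
    # Equation (12): possibility that the proposition is true.
    true_poss = max(min(pi_x[u], mu_fs[u]) for u in universe)
    # Equation (13): possibility that the proposition is false (regular elements only).
    false_poss = max(min(pi_x[u], 1.0 - mu_fs[u]) for u in regular)
    # Equation (14): possibility that the proposition is inapplicable.
    undefined_poss = min(pi_x[BOTTOM], 1.0 - mu_fs[BOTTOM])
    return {"T": true_poss, "F": false_poss, "undefined": undefined_poss}


# Hypothetical data: heights in cm plus the undefined element.
universe = [160, 170, 180, 190, BOTTOM]
pi_height = {160: 0.0, 170: 0.5, 180: 1.0, 190: 0.5, BOTTOM: 0.25}
mu_tall = {160: 0.0, 170: 0.25, 180: 0.75, 190: 1.0, BOTTOM: 0.0}

result = eptv(universe, pi_height, mu_tall)
print(result)  # {'T': 0.75, 'F': 0.5, 'undefined': 0.25}
```

Here the small possibility (0.25) that the height is not applicable at all is carried into the third component of the EPTV instead of being lost.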
Fuzzy Querying of Crisp
Relational Databases
In this case, a classical crisp relational database
is assumed, while queries are allowed to contain
natural language terms in their conditions. The
main lines of research include the study of the
idea of modeling linguistic terms in queries using
elements of fuzzy logic (Tahani, 1977); enhancements of the fuzzy query formalism with flexible
aggregation operators (Bosc & Pivert, 1993; Dubois
& Prade, 1997; Kacprzyk, Zadrożny, & Ziółkowski,
1989; Kacprzyk & Ziółkowski, 1986); and practical problems with embedding fuzzy constructs in
the syntax of the standard SQL (Bosc, 1999; Bosc
& Pivert, 1992a, 1992b, 1995; de Tré, Verstraete,
Hallez, Matthé, & de Caluwe, 2006; Galindo,
Medina, Pons, & Cubero, 1998; Galindo, Urrutia,
& Piattini, 2006; Kacprzyk & Zadrożny, 1995;
Umano & Fukami, 1994).
Fuzzy Preferences Inside Query
Conditions
Tahani (1977) was the first to propose the use of
fuzzy logic to improve the flexibility of crisp database
queries. He proposed a formal approach and
architecture to deal with simple fuzzy queries. His
query language is based on SQL. Tahani proposed
to use vague terms typical for natural language, for
example, “high” and “young” in “WHERE salary
= HIGH AND age = YOUNG.” The semantics
of these vague terms is provided by appropriate
fuzzy sets. The matching degree, γ, for such extended queries is calculated as follows. For a tuple
t and a simple (elementary) condition Q of type
A = l, where A is an attribute (e.g., “age”) and l
is a linguistic (fuzzy) term (e.g., “YOUNG”), the
value of the function γ is:
$$\gamma(Q, t) = \mu_l(x) \tag{15}$$
where x is t[A], that is, the value of tuple t for
attribute A, and $\mu_l$ is the membership function of
the fuzzy set representing the linguistic term l.
The matching function γ for complex conditions,
exemplified by “age = YOUNG AND (salary =
HIGH OR empyear = RECENT),” is obtained
by applying the semantics of the fuzzy logical
connectives; that is:
$$\gamma(P \wedge Q, t) = \min(\gamma(P, t), \gamma(Q, t)) \tag{16}$$
$$\gamma(P \vee Q, t) = \max(\gamma(P, t), \gamma(Q, t)) \tag{17}$$
$$\gamma(\neg Q, t) = 1 - \gamma(Q, t) \tag{18}$$
where P, Q are conditions. The min and max operators may be replaced by, for example, t-norm
and t-conorm operators (Klement, Mesiar, & Pap,
2000) to model the conjunction and disjunction
connectives, respectively.
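The evaluation of such a condition, following Equations (15) through (18), may be sketched as below. The piecewise-linear membership functions and their breakpoints for YOUNG, HIGH, and RECENT are hypothetical; in practice they would be elicited from the user or the application domain. The helper names (`atom`, `conj`, `disj`, `neg`) are likewise introduced only for this illustration.

```python
def falling(full, zero):
    """Membership equal to 1 up to `full`, 0 from `zero` on, linear in between."""
    def mu(x):
        if x <= full:
            return 1.0
        if x >= zero:
            return 0.0
        return (zero - x) / (zero - full)
    return mu


def rising(zero, full):
    """Membership equal to 0 up to `zero`, 1 from `full` on, linear in between."""
    def mu(x):
        if x <= zero:
            return 0.0
        if x >= full:
            return 1.0
        return (x - zero) / (full - zero)
    return mu


# Hypothetical definitions of the linguistic terms.
YOUNG = falling(25, 45)
HIGH = rising(2000, 6000)
RECENT = rising(2000, 2008)


def atom(attribute, mu):
    """Elementary condition 'attribute = term', Equation (15)."""
    return lambda t: mu(t[attribute])


def conj(p, q):
    """Conjunction, Equation (16)."""
    return lambda t: min(p(t), q(t))


def disj(p, q):
    """Disjunction, Equation (17)."""
    return lambda t: max(p(t), q(t))


def neg(q):
    """Negation, Equation (18)."""
    return lambda t: 1.0 - q(t)


# age = YOUNG AND (salary = HIGH OR empyear = RECENT)
query = conj(atom("age", YOUNG),
             disj(atom("salary", HIGH), atom("empyear", RECENT)))

employee = {"age": 35, "salary": 5000, "empyear": 2004}
degree = query(employee)
print(degree)  # 0.5 = min(0.5, max(0.75, 0.5))
```

Replacing `min` and `max` in `conj` and `disj` by another t-norm and t-conorm pair changes only those two helpers.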
The classical querying formalisms of the relational data model were also studied from the
perspective of the fuzzy querying purposes. The
relational algebra may be fairly easily adapted.
However, for some operations, multiple fuzzy
versions have been proposed. One such operation
lacking a clear, widely accepted fuzzy counterpart
is the division of relations which has been studied by many researchers, including Yager (1991),
Dubois and Prade (1996), and Galindo, Medina,
Cubero, and Garcia (2001); see also a chapter by
Bosc et al. in this volume.
The relational calculus attracted much less attention. One of the earliest contributions in this area
is the work of Takahashi (1995) where he proposes
the FQL (Fuzzy Query Language), meant as a fuzzy
extension of the domain relational calculus (DRC).
A more complete approach has been proposed by
Buckles, Petry, and Sachar (1989). Even though it was
developed in the framework of a fuzzy database
model, it covers all aspects relevant for the fuzzy
relational calculus. Zadrożny and Kacprzyk
(2002) also proposed to interpret elements of DRC in
terms of a variant of fuzzy logic. This approach
also makes it possible to account for preferences
between query conditions in a uniform way.
Fuzzy Preferences Between Query
Conditions
The next step is to distinguish simple (fuzzy)
conditions composing a query with respect to their
importance. To model the relative importance of
conditions, weights are associated with them. Usually, a weight wi is represented by a real number
of the unit interval, that is, wi ∈ [0,1]. Hereby, as
extreme cases, wi = 0 models “not important at all”
and wi = 1 represents “fully important.” A weight
wi is associated with each (fuzzy) condition Pi.
Assume that the matching degree of a condition $P_i$
with an importance weight $w_i$ is denoted by $\gamma(P_i^*, t)$.
In order to be meaningful, weights should satisfy
the following requirements (Dubois, Fargier, &
Prade, 1997):
• In order to have an appropriate scaling, it must hold that at least one of the associated weights is 1, that is, $\max_i w_i = 1$.
• If $w_i = 1$ and the matching function equals 0 for $P_i$, that is, $\gamma(P_i, t) = 0$, then the impact of the weight should be 0, or $\gamma(P_i^*, t) = 0$. In other words, if $P_i$ is not satisfied at all and $P_i$ is fully important, then the weight should not modify the matching degree.
• If $w_i = 1$ and the matching function equals 1 for $P_i$, or $\gamma(P_i, t) = 1$, then the impact of the weight should be 1, or $\gamma(P_i^*, t) = 1$. In other words, if $P_i$ is completely satisfied and $P_i$ is fully important, then the weight should not modify the matching degree.
• Lastly, if $w_i = 0$, then the result should be such as if $P_i$ did not exist.
The impact of a weight can be modeled by first
matching the condition as if there is no weight and
then modifying the resulting matching degree in
accordance with the weight. A modification function that strengthens the match of more important
conditions and weakens the match of less important conditions is used for this purpose. Different
interpretations are possible.
From a conceptual point of view, a distinction
can be made between static weights and dynamic
weights. Static weights are fixed, known in advance,
and can be directly derived from the formulation
of the query. These weights are independent of
the values of the record(s) on which the query
criteria act and are not allowed to change during
query processing. A further, orthogonal distinction
can be made between static weight assignments,
where it is also known in advance with which
condition a weight is associated (e.g., in a situation where the user explicitly states preferences)
and dynamic weight assignments, where the associations between weights and conditions depend
on the actual attribute values of the record(s) on
which the query conditions act (e.g., in a situation
where most criteria have to be satisfied, but it is
not important which ones).
Static weights. In most approaches, static
weights are used. As Dubois and Prade (1997)
discovered, some of the most practical interpretations of static weights may be formalized within
a universal scheme. Namely, let us assume that
query condition P is a conjunction of weighted
elementary query conditions $P_i$ (for a disjunction
a similar scheme may be offered). Let us denote by
$\gamma(P_i, t)$ the matching degree for a tuple t of such an
elementary condition $P_i$ without any importance
weight assigned. Then, Dubois and Prade (1997)
propose to use the following formula to compute
the matching degree, $\gamma(P_i^*, t)$, of an elementary
condition $P_i$ with an importance weight $w_i \in [0,1]$
assigned:
$$\gamma(P_i^*, t) = (w_i \Rightarrow \gamma(P_i, t)) \tag{19}$$
where ⇒ is an operator modeling a fuzzy implication connective. The overall matching degree
of the whole query composed of the conjunction
of conditions $P_i$ is calculated using the standard
min-operator.
Depending on the type of the fuzzy implication
operator used, we get various interpretations of
importance weights. For example, using the Dienes
implication, we obtain from Equation (19):
$$\gamma(P_i^*, t) = \max(\gamma(P_i, t), 1 - w_i) \tag{20}$$
For a small importance ($w_i$ close to 0), the
satisfaction of elementary condition $P_i$ does not
bear on the satisfaction of the overall query. On
the other hand, with $w_i$ close to 1, the satisfaction
of the elementary condition is essential for the
matching of the overall query P. Consequently,
the requirements for weights, proposed by Dubois
et al. (1997) and mentioned in the item list above,
are satisfied.
For the Gödel implication, Equation (19) turns
into:
$$\gamma(P_i^*, t) = \begin{cases} 1 & \text{if } \gamma(P_i, t) \geq w_i \\ \gamma(P_i, t) & \text{otherwise} \end{cases} \tag{21}$$
This is the interpretation presumably first
discussed by Yager (for a reference, cf. Dubois &
Prade, 1997). The importance weight $w_i$ is here
treated as a threshold: if condition $P_i$ is satisfied
to a degree greater than or equal to this threshold, then the
weighted condition $P_i^*$ is considered to be fully
satisfied. Otherwise the matching degree for $P_i^*$
equals that for $P_i$.
Finally, another interpretation of importance
is obtained when the Goguen implication is used
in Equation (19):
$$\gamma(P_i^*, t) = \begin{cases} 1 & \text{if } \gamma(P_i, t) \geq w_i \\ \gamma(P_i, t) / w_i & \text{otherwise} \end{cases} \tag{22}$$
In fact, here we still have a threshold-type
interpretation, as in the previous case, but the
undersatisfaction of the condition is treated in a
more continuous way. For still another interpretation of importance, see Zadrożny (2005). The
use of importance weights indirectly leads to an
unconventional aggregation of partial matching
degrees.
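The three interpretations of static importance weights obtained from Equation (19) can be compared side by side in a short sketch. The function names and the sample weights and matching degrees are hypothetical; the implication operators themselves are the standard Dienes, Gödel, and Goguen implications referred to in Equations (20) through (22).

```python
def dienes(w, g):
    """Equation (20): max(g, 1 - w)."""
    return max(g, 1.0 - w)


def goedel(w, g):
    """Equation (21): 1 if g >= w, otherwise g."""
    return 1.0 if g >= w else g


def goguen(w, g):
    """Equation (22): 1 if g >= w, otherwise g / w (w > 0 in that branch)."""
    return 1.0 if g >= w else g / w


def weighted_conjunction(weighted_degrees, implication):
    """Overall matching degree: min over conditions of (w_i => gamma(P_i, t))."""
    return min(implication(w, g) for w, g in weighted_degrees)


# Hypothetical (weight, unweighted matching degree) pairs for two conditions;
# the first condition is fully important, as the scaling requirement demands.
conditions = [(1.0, 0.75), (0.5, 0.25)]

print(weighted_conjunction(conditions, dienes))  # 0.5
print(weighted_conjunction(conditions, goedel))  # 0.25
print(weighted_conjunction(conditions, goguen))  # 0.5
```

In all three cases a weight of 0 yields a weighted degree of 1, so the condition drops out of the min-aggregation, while a weight of 1 leaves the matching degree unchanged.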
Dynamic weights. The approach described for
static weights, based on Equation (19), has been
refined (Dubois & Prade, 1997) to deal with a
variable importance wi ∈ [0,1] depending on the
matching degree of the associated elementary
condition. For example, in a specific context, it
may be useful to assume wi to be constant for a
relatively high satisfaction of the elementary condition, but an extremely low satisfaction should be
more strongly reflected in the overall matching
by automatically increasing the weight wi. For
instance, when we want a car of a moderate price,
if a particular car has a very high price, the price
criterion becomes more important (wi = 1) in order
to reject that car.
More generally, when using dynamic weights
and dynamic weight assignments, neither the
weights nor the associations between weights and
criteria are known in advance. Both the weights
and their assignments then depend on the attribute
values of the record(s) on which the query criteria
act. This kind of flexibility is required to avoid
some unnatural behavior of the query evaluation in
cases where, for example, a condition is of limited
importance only within a given range of values, as
when the condition “high salary” is not important
unless the salary value is extremely high.
Other approaches. Other flexible schemes of
aggregation are also a direct subject of research
in the framework of flexible fuzzy logic based
querying. In Kacprzyk and Ziółkowski (1986) and
Kacprzyk et al. (1989), the aggregation of partial
queries (conditions) guided by a linguistic
quantifier was first described. In such approaches, conditions of the following form are
considered:
P = Ψ out of {P1 , … , Pk }
(23)
where Ψ is a linguistic (fuzzy) quantifier and Pi
is an elementary condition to be aggregated. For
example, in the context of a U.S.-based company,
one may classify an order as troublesome if it meets
most of the following conditions: “comes from
outside of USA,” “its total value is low,” “its shipping costs are high,” “employee responsible for it is
John Doe (known to be not completely reliable),”
“the amount of ordered goods in stock is not much
greater than the ordered amount,” and so forth.
The overall matching degree may be computed
using any of the approaches used to model linguistic
quantifiers. In Kacprzyk and Ziółkowski (1986)
and Kacprzyk et al. (1989), first the linguistic
quantifiers in the sense of Zadeh (1983) and later
the OWA operators (Yager, 1994) are used (cf.
Kacprzyk & Zadrożny, 1997). Such approaches
make it also possible to take into account the importance of conditions to be aggregated. There are
many works on this topic studying various possible
interpretations of linguistic quantifiers for the flexible querying purposes such as Bosc, Pivert, and
Lietard (2001), Bosc, Lietard, and Pivert (2003),
Galindo et al. (2006), Vila, Cubero, Medina, and
Pons (1997).
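Two of the ways of computing the matching degree of a condition of form (23) can be sketched as follows: Zadeh's calculus, where the quantifier is applied to the average of the partial matching degrees, and an OWA operator whose weights are derived from the quantifier. The piecewise-linear definition of "most" and its breakpoints are assumptions made for this illustration, as are the sample matching degrees; importance weights are left out for brevity.

```python
def most(x):
    """Hypothetical quantifier 'most': 0 up to 0.25, 1 from 0.75 on, linear in between."""
    if x <= 0.25:
        return 0.0
    if x >= 0.75:
        return 1.0
    return (x - 0.25) / 0.5


def zadeh_degree(degrees, quantifier):
    """Zadeh's calculus: the quantifier applied to the mean matching degree."""
    return quantifier(sum(degrees) / len(degrees))


def owa_weights(k, quantifier):
    """OWA weights derived from a nondecreasing quantifier: Q(i/k) - Q((i-1)/k)."""
    return [quantifier(i / k) - quantifier((i - 1) / k) for i in range(1, k + 1)]


def owa_degree(degrees, quantifier):
    """OWA aggregation: weights are applied to the degrees sorted in decreasing order."""
    ordered = sorted(degrees, reverse=True)
    weights = owa_weights(len(ordered), quantifier)
    return sum(w * g for w, g in zip(weights, ordered))


# Hypothetical matching degrees of four elementary conditions for one order.
partial = [1.0, 0.5, 0.75, 0.25]

print(zadeh_degree(partial, most))  # 0.75
print(owa_weights(4, most))         # [0.0, 0.5, 0.5, 0.0]
print(owa_degree(partial, most))    # 0.625
```

The two schemes need not agree: here the order is "troublesome" to degree 0.75 under Zadeh's calculus and 0.625 under the OWA-based aggregation.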
Practical Approaches
More practical approaches to flexible fuzzy
querying in crisp databases are well represented
by SQLf (SQLfuzzy) (Bosc & Pivert, 1995) and
FQUERY (FuzzyQUERY) for Access (Kacprzyk
& Zadrożny, 1995). The former is an extension of
SQL introducing linguistic (fuzzy) terms wherever
it makes sense, and the latter is an example of
the implementation of a specific “fuzzy extension” of SQL for Microsoft Access®, a popular
desktop DBMS (database management system).
Also, Galindo et al.’s (1998) FSQL (FuzzySQL)
features the capability of fuzzy querying of a, in
principle, crisp database. However, as it is a more
comprehensive approach, it will be considered
in the section on fuzzy databases. Moreover, in
another chapter by Urrutia, Tineo, and Gonzalez
in this volume, the reader can find a comparison
of SQLf and FSQL.
FQUERY. In Kacprzyk and Zadrożny (1995),
an extension of the Access SQL language, with
the linguistic terms in the spirit of the approaches
discussed earlier, has been presented. The following types of linguistic terms have been considered:
fuzzy values (e.g., “YOUNG”); fuzzy relations
(fuzzy comparison operators) (e.g., “MUCH
GREATER THAN”); and fuzzy quantifiers (e.g.,
“MOST”). The matching degree is calculated
according to the previously discussed semantics
of fuzzy predicates and linguistically quantified
propositions. This extension to SQL has been
implemented as an add-in, FQUERY for Access,
to Microsoft Access, thus extending the native
Access’s querying interface with the capability of
manipulating linguistic terms.
In FQUERY for Access, the user composes
a query using a QBE (query-by-example) type
user interface provided by the host environment,
that is, Microsoft Access. The resulting rows are
ordered decreasingly with respect to the matching
degree. FQUERY was one of the first implementations to demonstrate the usefulness of fuzzy
querying features for a crisp database. In addition
to the syntax and semantics of the extended SQL,
the authors have also proposed a scheme for the
elicitation and manipulation of linguistic terms to
be used in queries.
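As a minimal sketch of how such a query might be evaluated, the following Python fragment scores each row against a fuzzy value and a fuzzy relation and returns the rows in decreasing order of matching degree. The membership functions, the column names, and the use of min for the conjunction are illustrative assumptions; none of them is taken from FQUERY for Access itself.

```python
def young(age):
    """Hypothetical fuzzy value YOUNG: 1 up to 25, falling linearly to 0 at 35."""
    if age <= 25:
        return 1.0
    if age >= 35:
        return 0.0
    return (35 - age) / 10


def much_greater_than(x, y):
    """Hypothetical fuzzy relation: degree grows with x - y and reaches 1 at 1000."""
    return max(0.0, min(1.0, (x - y) / 1000))


def matching_degree(row):
    """Conjunction of two fuzzy conditions, interpreted here with min."""
    return min(young(row["age"]),
               much_greater_than(row["salary"], row["expenses"]))


def fuzzy_select(rows, degree):
    """Keep rows with a positive degree, ordered by decreasing matching degree."""
    scored = [(degree(row), row) for row in rows]
    kept = [pair for pair in scored if pair[0] > 0.0]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


# Made-up crisp table used only for this illustration.
ROWS = [
    {"name": "Bob", "age": 30, "salary": 2500, "expenses": 2100},
    {"name": "Eve", "age": 40, "salary": 5000, "expenses": 1000},
    {"name": "Ann", "age": 24, "salary": 3200, "expenses": 2000},
]

for degree, row in fuzzy_select(ROWS, matching_degree):
    print(f"{row['name']}: {degree:.2f}")
```

The database itself stays crisp in this sketch; only the query carries fuzzy terms, and the degree attached to each answer is what makes the ranked presentation possible.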
The concept of FQUERY has been further developed in two directions. In Zadrożny and Kacprzyk
(1998) and Kacprzyk and Zadrożny (2001), the
very same concept has been applied in the Internet
environment (WWW). Another line of development (Kacprzyk & Zadrożny, 2000a, 2000b)
consists of adding some data
mining capabilities to the existing fuzzy querying
interface. Such a combined interface partially employs the same modules and data structures as the
ones used in FQUERY and seems to be a promising
direction for the development of advanced OLAP
and data analysis tools.
SQLf. So far we have only discussed the “fuzzification” of conditions appearing in the WHERE
clause of SQL’s SELECT instruction. In Bosc
and Pivert (1992b), Bosc and Pivert (1995), and
Bosc and Pivert (1997a), a new language, called
SQLf, has been proposed. This language is a much
more comprehensive and complete “fuzzy” extension of the crisp SQL language. In SQLf, linguistic
terms may appear as fuzzy values, relations, and
quantifiers (associated with aggregation operators) in the WHERE clause and other clauses. The
linguistic quantifiers may be used together with
subqueries. Bosc et al. call this vertical quantification, in contrast to horizontal
quantification, in which a quantifier plays the role of
an aggregation operator and replaces the AND or
OR connectives in a condition, as in (23).
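Horizontal quantification can be illustrated with a short sketch in which the quantifier "most" aggregates the satisfaction degrees of several conditions for one row. The fragment assumes Zadeh's classical interpretation (the quantifier's membership function applied to the mean satisfaction degree) and a hypothetical piecewise-linear definition of "most"; SQLf admits other interpretations, so this is only one possible reading.

```python
def most(proportion):
    """Hypothetical fuzzy quantifier MOST on proportions in [0, 1]."""
    if proportion <= 0.3:
        return 0.0
    if proportion >= 0.8:
        return 1.0
    return (proportion - 0.3) / 0.5


def quantified(quantifier, degrees):
    """Degree to which 'Q of the conditions are satisfied' (Zadeh-style)."""
    return quantifier(sum(degrees) / len(degrees))


# Satisfaction degrees of four elementary conditions for a single row.
degrees = [1.0, 0.8, 0.6, 0.0]

print("AND  (min):", min(degrees))
print("OR   (max):", max(degrees))
print("MOST      :", round(quantified(most, degrees), 3))
```

The quantifier yields a value between the two extremes: a single failed condition no longer rejects the row, as it does under AND, and a single satisfied condition no longer suffices, as it does under OR.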
All the operations of the relational algebra
(implicitly or explicitly used in SQL’s SELECT
instruction) are redefined to properly process fuzzy
relations that appear when parts of a fuzzy query
are processed. Other operations typical for SQL
are also redefined, including the partition of relations (GROUP BY clause) and the operators “IN”
and “NOT IN” used along with subqueries. All the
features of SQL have been redefined in such a way
as to preserve the equivalences that occur in the
“crisp” SQL. A number of pilot implementations
of SQLf have been developed (e.g., Gonçalves &
Tineo, 2001a, 2001b).
Other approaches. Other approaches to, and
implementations of, the flexible querying of crisp
relational databases exist that are based on principles similar to those
explained above. Among these, we should
mention the PRETI-platform that is intended as
an experimental environment for the exchange of
expertise (de Calmès, Dubois, Hüllermeier, Prade,
& Sedes, 2002) and the approach based on EPTVs
(de Tré, de Caluwe, Tourné, & Matthé, 2003; de
Tré et al., 2006).
Fuzzy Querying of Fuzzy
Relational Databases
Fuzzy databases aim to capture imperfect information about a modeled part of the world and represent
it directly in a database. The most straightforward
application of fuzzy logic to the classical relational
data model is by assuming that the relations in a
database themselves are also fuzzy. Each tuple of
a relation (table) is associated with a membership
degree. This approach is often neglected because
the interpretation of the membership degree is
unclear. On the other hand, it is worth noticing
that fuzzy queries, as discussed in the previous
section, in fact produce fuzzy relations.
Two leading approaches to the representation
of imperfect information in relational databases
are the possibilistic model (Prade & Testemale,
1984, 1987) and the similarity relation based
model (Buckles & Petry, 1982; Petry, 1996). More
recently, an extended possibilistic approach, based
on EPTVs has been proposed (de Tré & de Caluwe,
2003). The main idea behind the possibilistic data
model is to represent the imprecisely known value
of an attribute via a possibility distribution on the
domain of this attribute. For example, if all that
is known about the age of a suspect in a criminal
investigation is that he is “young,” then in a corresponding database, this information may be
represented by a suitable possibility distribution
on, for example, the interval [1,100]. This calls for
some special measures both in data representation
and querying, which will be described in the next
section.
The similarity based approach is rooted in the
observation that by specifying the search conditions
of a query, the user actually looks not only
for tuples exactly satisfying them but also for
similar tuples. Thus, a similarity relation on the
attribute domain is assumed. The values taken by
a similarity relation are in the unit interval [0,1],
where 0 corresponds to “totally different” and 1
to “totally similar.” It is a fuzzy binary relation
such that its membership function expresses the
similarity degree between the pairs of the domain
elements. Similarity relations are usually provided
by the user. The extended possibilistic approach is
an extension of the possibilistic approach. It explicitly deals with the inapplicability of information
during the evaluation of the query conditions: if
some of the query conditions are inapplicable,
this will be reflected by the model.
We briefly discuss the main concepts of fuzzy
querying as proposed for both leading models of
fuzzy databases. Next, fuzzy querying in the extended possibilistic approach, as well as in some
hybrid approaches, is briefly described.
The Possibilistic Approach
Prade and Testemale (1984) proposed an algebra for
retrieving information from a fuzzy possibilistic
relational database. The principles of this algebra
can be illustrated by an example of the selection
operator. The syntax of the condition is more or
less the same as previously, but the attributes may
take possibility distributions as values. Two types
of elementary conditions are considered:
(i) A θ a, where A is the name of an attribute, θ
is a comparison operator (fuzzy or not), and
a is a constant (fuzzy or not);
(ii) A θ B, where A and B are names of attributes.
The computed matching degree of an elementary condition against a tuple t is expressed by a
pair: the possibility and necessity measure of some
sets (with respect to the possibility distributions
A(t) and B(t)). In case of (i), it is the set, crisp or
fuzzy, of the elements from the domain of A in
relation θ with a constant a. In the second case
(ii), it is the subset of the Cartesian product of
domains of A and B containing only the pairs of
elements being in relation θ. In this case, a joint
possibility distribution over the Cartesian product
of the domains of A and B is used.
Formally, the matching degree for case (i) is
computed as follows. Let us denote by FS the set
(in general fuzzy) whose possibility and necessity
measures have to be computed. Its membership
function for the elements of the domain of A is
as follows:
\[
\mu_{FS}(d) = \sup_{d' \in \mathrm{Dom}(A)} \min\bigl(\mu_{\theta}(d, d'),\, \mu_{a}(d')\bigr), \quad d \in \mathrm{Dom}(A) \tag{24}
\]
where μ_a is the membership function of the constant
a and μ_θ is that of the comparison operator θ. The possibility and necessity measures of the
set FS with respect to the possibility distribution
π_A(t) (the value of the attribute A for the tuple t) are
computed as in Equations (2)-(3).
For the second form of atomic condition (ii),
the set FS comprises the pairs of elements (d, d′),
d ∈ Dom(A), d′ ∈ Dom(B), such that d θ d′ is satisfied. Thus, its membership function is identical to
that of θ, and the possibility and necessity measures
are computed as in Equations (8) through (9).
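For case (i), the computation can be sketched on a small discrete domain. The fragment first builds the set FS as in Equation (24) and then evaluates its possibility and necessity with respect to the attribute's possibility distribution. The two measures are written in their standard sup-min and inf-max forms, which are assumed here to coincide with Equations (2)-(3); the domain, the fuzzy constant "about 40," and the stored distribution are invented for the illustration.

```python
DOMAIN = [20, 30, 40, 50, 60]            # hypothetical discrete domain of A (ages)

ABOUT_40 = {30: 0.5, 40: 1.0, 50: 0.5}   # hypothetical fuzzy constant a


def mu_a(d):
    """Membership function of the fuzzy constant a."""
    return ABOUT_40.get(d, 0.0)


def geq(d, d_prime):
    """Crisp comparison operator theta, here '>=', as a membership function."""
    return 1.0 if d >= d_prime else 0.0


def fuzzy_set_fs(domain, mu_theta, mu_const):
    """Equation (24): mu_FS(d) = sup over d' of min(mu_theta(d, d'), mu_a(d'))."""
    return {d: max(min(mu_theta(d, dp), mu_const(dp)) for dp in domain)
            for d in domain}


def possibility(pi, mu):
    """Standard possibility measure: sup over d of min(pi(d), mu(d))."""
    return max(min(pi[d], mu[d]) for d in pi)


def necessity(pi, mu):
    """Standard necessity measure: inf over d of max(1 - pi(d), mu(d))."""
    return min(max(1.0 - pi[d], mu[d]) for d in pi)


# Ill-known attribute value A(t), stored as a possibility distribution.
PI_AGE = {20: 0.3, 30: 1.0, 40: 0.6, 50: 0.0, 60: 0.0}

FS = fuzzy_set_fs(DOMAIN, geq, mu_a)
print("FS:", FS)
print("matching degree (Pi, N):", (possibility(PI_AGE, FS), necessity(PI_AGE, FS)))
```

The resulting pair brackets the answer: the tuple possibly satisfies "age ≥ about 40" to degree 0.6 and certainly satisfies it to degree 0.5.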
Baldwin, Coyne, and Martin (1993) have
implemented a system for querying a possibilistic
relational database using semantic unification and
the evidential logic rule. The queries are composed of one or more conditions, the importance
of each condition, a “filtering” function (similar
to the notion of quantifier), and a threshold. The
particularity of their work is the process, semantic
unification, used for matching the fuzzy values
of the criteria with the possibility distributions of
the attributes of a tuple. As a result, one obtains an
interval [n, p] where, similar to the previous case,
n (necessity) is the certain degree of matching and
p (possibility) is the maximum possible degree of
matching. However, this time the calculations are
based on the mass assignments theory developed
by Baldwin et al.
Bosc and Pivert (1997b) have proposed a new
type of query for possibilistic databases. Such queries
directly address the representation of the
attribute’s value (i.e., features of the corresponding
possibility distribution) rather than the value itself.
Examples of basic queries of this new type are:
“Find tuples such that all the values d1, d2, …,
dn are possible for attribute A”;
“Find tuples such that more than n values are
possible to a degree higher than λ for attribute
A.”
The matching degree for such queries is computed using the formula:
\[
\min\bigl(\pi_A(d_1), \pi_A(d_2), \ldots, \pi_A(d_n)\bigr) \tag{25}
\]
where A is an attribute; d1, d2, …,dn are values
from its domain Dom(A); and πA is the possibility
distribution representing the value of A.
The tuples such that more than n values are
possible to a degree higher than λ for attribute A
are retrieved using the condition:
\[
\mathrm{Card\_cut}(A, \lambda) > n \tag{26}
\]
where
\[
\mathrm{Card\_cut}(A, \lambda) = \bigl|\{\, d \mid d \in \mathrm{Dom}(A) \wedge \pi_A(d) \geq \lambda \,\}\bigr| \tag{27}
\]
and λ is a value in the interval [0,1].
These basic queries and the scheme for computing their matching degree may then be used to
process more complex queries like:
“Find the tuples where for attribute A the value
d1 is more possible than the value d2”;
“Find the tuples where for attribute A only one
value is completely possible.”
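The representation-based queries above can be sketched directly on a stored possibility distribution. In the fragment, `all_possible` follows Equation (25) and `card_cut` follows Equation (27); the two helper functions for the more complex queries use crisp comparisons, which is a simplifying assumption, and the attribute values are made up.

```python
# Possibility distribution stored as the value of attribute A in one tuple
# (hypothetical data: the colour of a car seen by a witness).
PI_COLOUR = {"red": 1.0, "orange": 0.8, "yellow": 0.3}


def all_possible(pi_a, values):
    """Equation (25): degree to which all the given values are possible."""
    return min(pi_a.get(d, 0.0) for d in values)


def card_cut(pi_a, lam):
    """Equation (27): number of values possible to a degree of at least lam."""
    return sum(1 for degree in pi_a.values() if degree >= lam)


def more_possible(pi_a, d1, d2):
    """Crisp reading of 'd1 is more possible than d2' (an assumption)."""
    return pi_a.get(d1, 0.0) > pi_a.get(d2, 0.0)


def only_one_completely_possible(pi_a):
    """Crisp reading of 'only one value is completely possible' (an assumption)."""
    return sum(1 for degree in pi_a.values() if degree == 1.0) == 1


print(all_possible(PI_COLOUR, ["red", "orange"]))
print(card_cut(PI_COLOUR, 0.5) > 1)          # condition (26) with n = 1
print(more_possible(PI_COLOUR, "red", "orange"))
print(only_one_completely_possible(PI_COLOUR))
```

Note that none of these functions asks what the colour actually is; each one inspects the shape of the distribution that represents it.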
There are other works on fuzzy querying in the
possibilistic approach (de Caluwe, 2002; Umano,
1982; Umano & Fukami, 1994; Zemankova-Leech
& Kandel, 1984).
The Similarity Relation Based
Approach
The research on querying in similarity relation
based fuzzy databases has been summarized in
Buckles and Petry (1985), Buckles et al. (1989), and
Petry (1996). A complete set of operations of the
relational algebra has been defined for the similarity
relation based model. These operations result from
their classical counterparts by the replacement of
the concept of equality of two domain values with
the concept of similarity of two domain values.
The conditions of queries are composed of crisp
predicates as in a regular query language. Additionally, a set of level thresholds may be submitted as
a part of the query. A threshold may be specified
for each attribute appearing in the query’s condition. Such a threshold indicates what degree of
similarity of two values from the domain of a given
attribute justifies considering them equal. The
concept of threshold level also plays a central role
in the definition of the redundancy concept for this
database model. Two tuples are redundant if the
values of all corresponding attributes are similar
(to a level higher than a selected degree).
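A small sketch may help to fix the role of the thresholds. It assumes atomic attribute values (the original model also allows sets of values), symmetric similarity tables invented for the example, and "at least the threshold" as the comparison; the function names are hypothetical.

```python
# Hypothetical similarity relations, one per attribute domain. Each table
# stores unordered pairs; identical values are totally similar by definition.
SIM = {
    "colour": {frozenset(["red", "crimson"]): 0.9,
               frozenset(["red", "orange"]): 0.6,
               frozenset(["crimson", "orange"]): 0.6},
    "size": {frozenset(["large", "big"]): 0.8},
}


def similarity(attribute, x, y):
    """Similarity degree in [0, 1] of two values from one attribute domain."""
    if x == y:
        return 1.0
    return SIM[attribute].get(frozenset([x, y]), 0.0)


def equal_at_level(attribute, x, y, threshold):
    """Equality replaced by similarity to at least the given level."""
    return similarity(attribute, x, y) >= threshold


def redundant(t1, t2, thresholds):
    """Two tuples are redundant if every attribute is similar enough."""
    return all(equal_at_level(attr, t1[attr], t2[attr], level)
               for attr, level in thresholds.items())


t1 = {"colour": "red", "size": "large"}
t2 = {"colour": "crimson", "size": "big"}

print(redundant(t1, t2, {"colour": 0.8, "size": 0.8}))
print(redundant(t1, t2, {"colour": 0.95, "size": 0.8}))
```

Raising a threshold makes the query (and the redundancy test) stricter; a threshold of 1 recovers ordinary equality.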
There are also a number of hybrid models
proposed in the literature. Takahashi (1993) has
proposed a model for a fuzzy relational database
assuming possibility distributions as attribute
values. Additionally, fuzzy sets are used as tuple
truth-values. For example, a tuple t may express
that “It is quite true that John’s age is nearly 40.”
Medina, Pons, and Vila (1994) propose a fuzzy
database model, GEFRED (Generalized Fuzzy
Relational Database), which aims to integrate the
advantages of both the possibilistic and similarity
based models. The data are stored as generalized
fuzzy relations that extend the relations of the
relational model by allowing imprecise information and a compatibility degree associated with
each attribute value. They also define an algebra,
called a generalized fuzzy relational algebra,
to manipulate information stored in such a fuzzy
database. Galindo, Medina, and Aranda (1999)
have extended the GEFRED model with a fuzzy
domain relational calculus (FDRC). The GEFRED
model has been implemented using the crisp commercial DBMS Oracle (Galindo et al., 1998). The
implementation supports FSQL, a “fuzzy” SQL.
This fuzzy extension to SQL includes the linguistic
labels (terms; fuzzy values) and fuzzy comparison
operators (relations) that have been discussed in
the previous sections. Each condition can be assigned a fulfillment threshold (α ∈ [0,1]) requiring
that this condition be satisfied at least to a
degree α (thus, in some sense, changing a fuzzy
condition to a crisp one). In Galindo, Medina,
Cubero, and Garcia (2000), the fuzzy quantifiers
have been included in their FDRC language.
Some applications of the FSQL are reported (Barranco, Campaña, Medina, & Pons, 2004; Galindo
et al., 2006).
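The effect of a fulfillment threshold can be sketched as follows. The fragment is not FSQL syntax; it only illustrates, with hypothetical membership functions and data, how attaching a threshold α to each fuzzy condition turns it into a crisp test.

```python
def cheap(row):
    """Hypothetical fuzzy label: 1 up to 500, falling linearly to 0 at 1000."""
    price = row["price"]
    if price <= 500:
        return 1.0
    if price >= 1000:
        return 0.0
    return (1000 - price) / 500


def close(row):
    """Hypothetical fuzzy label: 1 up to 1 km, falling linearly to 0 at 5 km."""
    km = row["distance_km"]
    if km <= 1:
        return 1.0
    if km >= 5:
        return 0.0
    return (5 - km) / 4


def satisfies(degree, alpha):
    """A fuzzy condition with threshold alpha becomes a crisp test."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return degree >= alpha


def crisp_select(rows, conditions):
    """Keep the rows meeting every (membership function, alpha) pair."""
    return [row for row in rows
            if all(satisfies(mu(row), alpha) for mu, alpha in conditions)]


FLATS = [
    {"id": "A", "price": 600, "distance_km": 1},
    {"id": "B", "price": 900, "distance_km": 2},
    {"id": "C", "price": 700, "distance_km": 4},
]

selected = crisp_select(FLATS, [(cheap, 0.5), (close, 0.2)])
print([row["id"] for row in selected])
```

Different conditions may carry different thresholds, so the user can be demanding about price and lenient about distance within the same query.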
An Extended Possibilistic Approach
In the extended possibilistic approach, the computed matching degree of an elementary condition
against a tuple t is expressed by an EPTV. This
EPTV represents the extent to which it is (un)certain
that t belongs to the result of a flexible query. Let us
again denote by FS the set (in general, fuzzy) whose
EPTV with respect to the possibility distribution
π_A(t) has to be computed. Then the computation of
the EPTV can be done as in Equations (12) through
(14). In case of composed query conditions, the
resulting EPTV can be obtained by aggregating
the EPTVs of the elementary conditions. Hereby,
generalizations of the logical connectives of the
conjunction (∧), disjunction (∨), negation (¬), implication (⇒), and equivalence (⇔) can be applied
(de Tré, 2002; de Tré & de Baets, 2003).
The extended possibilistic approach is an extension of the possibilistic approach based on possibility and necessity measures presented in Prade
and Testemale (1984). It offers additional facilities
to cope with the inapplicability of information at
the logical level: if some of the query conditions
are inapplicable for a given tuple t, this will be
explicitly reflected in the EPTV representing the
matching degree for the tuple (de Tré, de Caluwe,
& Prade, in press).
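The aggregation step can be sketched by treating an EPTV as a possibility distribution over the three values true (T), false (F), and inapplicable (⊥) and lifting a three-valued connective to such distributions with the extension principle. The truth table used here (a strong-Kleene-style ordering F < ⊥ < T) is an assumption made purely for illustration; the cited papers define the actual generalized connectives.

```python
T, F, U = "T", "F", "⊥"          # true, false, inapplicable

ORDER = {F: 0, U: 1, T: 2}       # assumed ordering for the illustrative tables


def and3(x, y):
    """Assumed three-valued conjunction: the smaller value in F < ⊥ < T."""
    return x if ORDER[x] <= ORDER[y] else y


def or3(x, y):
    """Assumed three-valued disjunction: the larger value in F < ⊥ < T."""
    return x if ORDER[x] >= ORDER[y] else y


def extend(op, a, b):
    """Extension principle: pi(z) = max over op(x, y) = z of min(a(x), b(y))."""
    result = {T: 0.0, F: 0.0, U: 0.0}
    for x, px in a.items():
        for y, py in b.items():
            z = op(x, y)
            result[z] = max(result[z], min(px, py))
    return result


def negate(a):
    """Negation swaps true and false and leaves inapplicability unchanged."""
    return {T: a[F], F: a[T], U: a[U]}


# Condition 1: almost certainly true. Condition 2: most plausibly inapplicable.
c1 = {T: 1.0, F: 0.2, U: 0.0}
c2 = {T: 0.4, F: 0.0, U: 1.0}

print("c1 AND c2:", extend(and3, c1, c2))
print("c1 OR  c2:", extend(or3, c1, c2))
print("NOT c1   :", negate(c1))
```

In the conjunction, the inapplicability of the second condition remains visible in the result instead of being silently mapped to true or false, which is the behaviour the extended approach is designed to provide.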
Object-Oriented Approaches
With object-oriented databases becoming mature,
research on “fuzzy” object-oriented databases has
drawn a lot of attention. Nowadays, several fuzzy
object-oriented database models exist. Based on
some of them, prototypes have already been implemented. The majority of the presented models do
not conform to a single underlying object data
model, which is a logical consequence of the present
lack of (formal) object standards. The most recent
version of the ODMG (Object Data Management
Group) proposal (Cattell & Barry, 2000) offers
the best perspectives, although it still suffers from
some shortcomings such as the absence of formal
semantics and does not have the status of an official
standard (Alagić, 1997; Kim, 1994).
Informally, an object database is a collection of objects that are instances of classes and
typically have their own identity. Each class is
characterized by its structure, usually specified
by a finite number of attributes Ai: Dom(Ai) as in
the relational model and by its behavior specified by a finite number of operations. Classes are
interrelated via association relationships which
allow objects to be associated with other objects, and
via inheritance relationships which allow sharing
characteristics among classes.
Research on fuzzy object-oriented databases
can also be subdivided into two main approaches:
those based on a possibilistic model and those based
on a similarity relation based model. Furthermore,
an extended possibilistic approach and some other
alternative approaches have been proposed. Most
research interest has been in the development
of semantically richer data modeling facilities.
Fuzzy querying of fuzzy object-oriented databases
has in most cases been performed using similar
techniques as described in the previous sections
of this chapter.
Possibilistic Models
Among the possibilistic approaches are the object-centered model of Rossazza (1990) and Rossazza,
Dubois, and Prade (1997); the object-oriented
model of Tanaka, Kobayashi, and Sakanoue (1991);
the FOOD model (Fuzzy Object-Oriented Data
model) of Bordogna, Lucarella, and Pasi (1994) and
Bordogna, Pasi, and Lucarella (1999); the fuzzy
algebra of Rocacher and Connan (1996); the UFO
(Uncertainty and Fuzziness in an Object-oriented
model) model of Van Gyseghem (1998); the fuzzy
association algebra model of Na and Park (1997);
the FIRMS (Fuzzy Information Retrieval and
Management System) model of Mouaddib and
Subtil (1997); and the FOODM (Fuzzy Object-Oriented Database Model) model of Marín, Pons,
and Vila (2000).
In the object-centered model of Rossazza (1990)
and Rossazza et al. (1997), all information is contained in objects that are completely described by
a set of attributes. For these objects, no behavior
is defined. Objects with the same attributes are
collected in classes that are organized in class
hierarchies. A range of allowed values and a range
of typical values have been specified for the attributes. These ranges may be fuzzy. Various kinds of
(graded) inclusion relations between classes have
been defined: in order to find out to which extent
a class is a subclass of another class, the ranges of
their corresponding attributes are compared with
each other, using a “default reasoning” technique
as proposed in Reiter (1980).
In the object-oriented model of Tanaka et al.
(1991), fuzziness is considered with respect to both
the structural and the behavioral aspects of objects.
Attribute values can be fuzzy. Furthermore, fuzziness is considered at the levels of instantiation,
inheritance, and relationships between objects by
introducing some special classes. Special comparison operators, which are obtained by applying
Zadeh’s (1975) extension principle, are provided to
compare instances of fuzzy classes and to support
flexible querying.
The FOOD model of Bordogna et al. (1994,
1999) is based on a visualization paradigm that
supports the representation of data semantics and
the direct browsing of information. It has been defined as an extension of a graph-based object model
in which both the database scheme and instances
are represented as directed labeled graphs. A database manipulation language has been described
in terms of graph transformations. A prototype
of the model has been implemented (Bordogna,
Leporati, Lucarella, & Pasi, 2000).
The fuzzy algebra of Rocacher and Connan
(1996) is an extension of the so-called EQUAL-algebra, which is part of the object-oriented database model ENCORE (Shaw & Zdonik, 1990).
The extension is based on an early version of the
ODMG data model (Cattell & Barry, 2000) and
is aimed at the modeling and manipulation of
fuzzy data. The extended operators are “union,”
“intersection,” “difference,” “select,” “image” (to
invoke functions on objects), “project,” “join,” “flatten,” “nest,” and “unnest.” Additionally, specific
operators have been provided to generate and to
compare fuzzy sets.
The UFO model of Van Gyseghem (1998)
has been an attempt to extend an object-oriented
database model as generally as possible in order
to be able to deal with fuzziness as well as with
uncertainty. Different concepts of the object orientation have been extended (attributes, methods,
objects, classes, inheritance, instantiation, etc.). A
specific feature of this approach is the use of “role”
objects to properly deal with the manipulation of
uncertain data.
In the approach of Na and Park (1997), a fuzzy
object-oriented data model has been built by means
of fuzzy classes and fuzzy associations. A fuzzy
database is represented by a fuzzy schema graph
at schema level and a fuzzy object graph at object
instance level. Data manipulation is handled by
means of a fuzzy association algebra which consists
of operators that can operate on the fuzzy association
patterns of homogeneous and heterogeneous
structures. As the result of these operators, truth
values are returned with the patterns.
The FIRMS model of Mouaddib and Subtil
(1997) can deal with fuzzy, uncertain, and incomplete information. At the base of the model are
the concepts of a “nuanced value” and “nuanced
domain.” Furthermore, a fuzzy thesaurus is used
to restrict the allowed domain values of discrete
attributes. A formal grammar is used to generate
the characteristic membership functions of the
thesaurus terms. In the FIRMS model, no class
hierarchies are supported.
The FOODM model of Marín et al. (2000) illustrates how different sources of vagueness can
be managed over a regular object-oriented database
model. It is founded on the concept of a “fuzzy
type” where properties are ranked in different
levels of precision according to their relationship
with the type. Objects are created using α-cuts
of their fuzzy type. The architecture of a prototype
implementation of the model has been presented
in Berzal, Marín, Pons, and Vila (2003).
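The idea of creating objects from α-cuts of a fuzzy type can be sketched as follows. The fragment represents a fuzzy type as a mapping from property names to the degree to which each property belongs to the type; this encoding, the property names, and the helper functions are assumptions made for illustration and are not taken from the FOODM prototype.

```python
# Hypothetical fuzzy type: each property belongs to the type to some degree.
STUDENT_TYPE = {
    "name": 1.0,
    "enrolment_year": 1.0,
    "thesis_topic": 0.6,
    "scholarship": 0.3,
}


def alpha_cut(fuzzy_set, alpha):
    """Crisp set of the elements whose membership degree is at least alpha."""
    return {x for x, degree in fuzzy_set.items() if degree >= alpha}


def create_object(fuzzy_type, alpha, values):
    """Build an object holding only the properties in the alpha-cut of its type."""
    structure = alpha_cut(fuzzy_type, alpha)
    return {prop: values.get(prop) for prop in sorted(structure)}


values = {"name": "Ada", "enrolment_year": 2006,
          "thesis_topic": "fuzzy queries", "scholarship": "none"}

print(create_object(STUDENT_TYPE, 1.0, values))
print(create_object(STUDENT_TYPE, 0.5, values))
```

Lower values of α admit properties that are more loosely related to the type, so the same fuzzy type can yield objects with different crisp structures.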
Similarity Relation Based Models
George (1992) and George, Yazici, Petry, and
Buckles (1997) have proposed an object-oriented
database model, which facilitates an enhanced
representation of different types of imprecision and
utilizes a similarity relation to generalize equality
to similarity. Similarity makes it possible to represent imprecision in data and imprecision in inheritance.
An object algebra based on extensions of the five
traditional operators (union, difference, product,
projection, and selection) and three operators to
handle nested class data have been provided to
support querying.
Other Approaches
In the “rough” object-oriented database of
Beaubouef and Petry (2002), an indiscernibility
relation and approximation regions of rough set
theory are used to incorporate uncertainty and
vagueness into the database model. As is the
case for fuzzy relational databases, the EPTVs
have also been applied in fuzzy object-oriented
databases. The database model of the constraint
based approach of de Tré, de Caluwe, and Van der
Cruyssen (2000) and de Tré and de Caluwe (2005)
is consistent with the ODMG data model (Cattell
& Barry, 2000). Both the data(base) semantics
and the flexible querying criteria are expressed
by generalized constraints. A many-valued possibilistic logic based on EPTVs is used in order to
be able to explicitly cope with missing information
and to express query satisfaction.
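The approximation regions mentioned for the rough object-oriented model can be illustrated with the standard rough set construction: an indiscernibility relation partitions the objects into classes, and a target set is bracketed by its lower and upper approximations. The objects and the attribute that induces the partition are hypothetical.

```python
def partition(universe, key):
    """Group objects that are indiscernible, that is, share the same key value."""
    classes = {}
    for obj in universe:
        classes.setdefault(key(obj), set()).add(obj)
    return list(classes.values())


def lower_approximation(classes, target):
    """Union of the classes entirely contained in the target (certain members)."""
    return set().union(*(c for c in classes if c <= target))


def upper_approximation(classes, target):
    """Union of the classes that intersect the target (possible members)."""
    return set().union(*(c for c in classes if c & target))


# Hypothetical objects described only by their soil type.
SOIL = {"o1": "sand", "o2": "sand", "o3": "clay", "o4": "rock"}

classes = partition(SOIL, key=SOIL.get)
target = {"o1", "o3"}            # objects the user is actually interested in

print("lower:", sorted(lower_approximation(classes, target)))
print("upper:", sorted(upper_approximation(classes, target)))
```

Because o1 cannot be told apart from o2, it belongs only to the upper approximation; the difference between the two regions is the boundary where membership is uncertain.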
Concluding Remarks
In this chapter, we have presented an overview
of some of the most important contributions in
two main sub-areas of fuzzy querying: in crisp
and fuzzy databases. We have discussed the first
sub-area in more detail because it still seems to
be more promising. Both the relational and the
object-oriented database modeling and querying
approaches have been described.
Acknowledgment
The authors would like to thank the reviewers and
the editor Dr. Jose Galindo (University of Malaga,
projects TIN2006-14285 and TIN2006-07262 by
Ministry of Education and Science of Spain) for
their valuable comments and suggestions regarding the original manuscript of this chapter which
greatly helped to shape its final version.
References
Alagić, S. (1997). The ODMG object model: Does
it make sense? ACM SIGPLAN Notices, 32(10),
253-270.
Baldwin, J. F., Coyne, M. R., & Martin, T. P. (1993).
Querying a database with fuzzy attribute values
by iterative updating of the selection criteria. In A.
L. Ralescu (Ed.), Proceedings of the Workshop on
Fuzzy Logic in Artificial Intelligence (LNCS 847,
pp. 62-76). London: Springer-Verlag.
Barranco, C. D., Campaña, J., Medina, J. M., &
Pons, O. (2004). ImmoSoftWeb: A Web based fuzzy
application for real estate management. In J. Favela,
E. Menasalvas, & E. Chávez (Eds.), Advances in
Web intelligence (pp. 196-206). Berlin: Springer.
Beaubouef, T., & Petry, F. E. (2002). Uncertainty
in OODB modeled by rough sets. In Proceedings
of the 9th International Conference on Information Processing and Management of Uncertainty
in Knowledge-based Systems (IPMU 2002) (pp.
1697-1703), Annecy, France.
Berzal, F., Marín, N., Pons, O., & Vila, M. A.
(2003). FoodBi: Managing fuzzy object-oriented
data on top of the Java platform. In Proceedings of
the 10th International Fuzzy Systems Association
(IFSA) World Congress (pp. 384-387), Istanbul,
Turkey.
Bordogna, G., Leporati, A., Lucarella, D., & Pasi,
G. (2000). The fuzzy object-oriented database management system. In G. Bordogna & G. Pasi (Eds.),
Recent issues on fuzzy databases (pp. 209-236).
Heidelberg, Germany: Physica-Verlag.
Bordogna, G., Lucarella, D., & Pasi, G. (1994).
A fuzzy object oriented data model. In Proceedings of the 3rd IEEE International Conference
on Fuzzy Systems (FUZZ-IEEE’94) (pp. 313-318),
Orlando, FL.
Bordogna, G., Pasi, G., & Lucarella, D. (1999). A
fuzzy object-oriented data model for managing
vague and uncertain information. International
Journal of Intelligent Systems, 14(7), 623-651.
Bosc, P. (1999). Fuzzy databases. In J. Bezdek
(Ed.), Fuzzy sets in approximate reasoning and
information systems (pp. 403-468). Boston: Kluwer
Academic Publishers.
Bosc, P., Duval, L., & Pivert, O. (2000). Value-based
and representation-based querying of possibilistic
databases. In G. Bordogna & G. Pasi (Eds.), Recent
research issues on fuzzy databases (pp. 3-27).
Heidelberg: Physica-Verlag.
Bosc, P., Kraft, D., & Petry, F. E. (2005). Fuzzy
sets in database and information systems: Status
and opportunities. Fuzzy Sets and Systems, 153(3),
418-426.
Bosc, P., Lietard, L., & Pivert, O. (2003). Sugeno
fuzzy integral as a basis for the interpretation of
flexible queries involving monotonic aggregates.
Information Processing and Management, 39(2),
287-306.
Bosc, P., & Pivert, O. (1992a). Some approaches
for relational databases flexible querying. International Journal on Intelligent Information Systems,
1, 323-354.
Bosc, P., & Pivert, O. (1992b). Fuzzy querying
in conventional databases. In L. A. Zadeh & J.
Kacprzyk (Eds.), Fuzzy logic for the management
of uncertainty (pp. 645-671). New York: Wiley.
Bosc, P., & Pivert, O. (1993). An approach for a
hierarchical aggregation of fuzzy predicates. In
Proceedings of the 2nd IEEE International Conference on Fuzzy Systems (FUZZ-IEEE’93) (pp.
1231-1236), San Francisco, CA.
Bosc, P., & Pivert, O. (1995). SQLf: A relational
database language for fuzzy querying. IEEE
Transactions on Fuzzy Systems, 3, 1-17.
Bosc, P., & Pivert, O. (1997a). Fuzzy queries against
regular and fuzzy databases. In T. Andreasen, H.
Christiansen, & H. L. Larsen (Eds.), Flexible query
answering systems. Dordrecht: Kluwer Academic
Publishers.
Bosc, P., & Pivert, O. (1997b). On representation-based querying of databases containing ill-known
values. In Z. W. Ras & A. Skowron (Eds.), Proceedings of the 10th International Symposium on
Foundations of Intelligent Systems (LNCS 1325,
pp. 477-486). London: Springer-Verlag.
Bosc, P., Pivert, O., & Lietard, L. (2001). Aggregate
operators in database flexible querying. In
Proceedings of the IEEE International Conference
on Fuzzy Systems (FUZZ-IEEE 2001) (pp. 1231-1234), Melbourne, Australia.
Buckles, B. P., & Petry, F. E. (1982). A fuzzy representation of data for relational databases. Fuzzy
Sets and Systems, 7, 213-226.
Buckles, B. P., & Petry, F. E. (1985). Query languages for fuzzy databases. In J. Kacprzyk &
R. Yager (Eds.), Management decision support
systems using fuzzy sets and possibility theory
(pp. 241-251). Cologne, Germany: Verlag TÜV
Rheinland.
Buckles, B. P., Petry, F. E., & Sachar, H. S. (1989).
A domain calculus for fuzzy relational databases.
Fuzzy Sets and Systems, 29, 327-340.
Cattell, R. G. G., & Barry, D. (Eds.). (2000). The
object data standard: ODMG 3.0. San Francisco:
Morgan Kaufmann.
Codd, E. F. (1970). A relational model of data for
large shared data banks. Communications of the
ACM, 13(6), 377-387.
Date, C. J. (2004). An introduction to database systems (8th ed.). Boston: Pearson Education Inc.
de Calmès, M., Dubois, D., Hüllermeier, E., Prade,
H., & Sedes, F. (2002). A fuzzy set approach to
flexible case-based querying: Methodology and
experimentation. In Proceedings of the 8th International Conference, Principles of Knowledge
Representation and Reasoning (KR2002) (pp.
449-458), Toulouse, France.
de Caluwe, R. (2002). Principles of fuzzy databases.
In J. Kacprzyk, M. Krawczak, & S. Zadrożny
(Eds.), Issues in information technology (pp. 151-172). Warszawa, Poland: Exit.
de Tré, G. (2002). Extended possibilistic truth values. International Journal of Intelligent Systems,
17, 427-446.
de Tré, G., & de Baets, B. (2003). Aggregating
constraint satisfaction degrees expressed by possibilistic truth values. IEEE Transactions on Fuzzy
Systems, 11(3), 361-368.
de Tré, G., & de Caluwe, R. (2003). Modelling
uncertainty in multimedia database systems: An
extended possibilistic approach. International
Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 11(1), 5-22.
de Tré, G., & de Caluwe, R. (2005). A constraint
based fuzzy object oriented database model. In
Z. Ma (Ed.), Advances in fuzzy object-oriented
databases: Modelling and applications (pp. 1-45).
Hershey, PA: Idea Group Publishing.
de Tré, G., de Caluwe, R., & Prade, H. (in press).
Null values in fuzzy databases. Journal of Intelligent Information Systems.
de Tré, G., de Caluwe, R., Tourné, K., & Matthé,
T. (2003). Theoretical considerations ensuing from
experiments with flexible querying. In T. Bilgiç,
B. De Baets, & O. Kaynak (Eds.), Proceedings of
the IFSA 2003 World Congress (LNCS 2715, pp.
388-391). Springer.
de Tré, G., de Caluwe, R., & Van der Cruyssen,
B. (2000). A generalised object-oriented database
model. In G. Bordogna & G. Pasi (Eds.), Recent
issues on fuzzy databases (pp. 155-182). Heidelberg,
Germany: Physica-Verlag.
de Tré, G., Verstraete, J., Hallez, A., Matthé, T.,
& de Caluwe, R. (2006). The handling of select-project-join operations in a relational framework
supported by possibilistic logic. In Proceedings of
the 11th International Conference on Information
Processing and Management of Uncertainty in
Knowledge-based Systems (IPMU) (pp. 2181-2188),
Paris, France.
Dubois, D., Fargier, H., & Prade, H. (1997). Beyond
min aggregation in multicriteria decision: (ordered)
weighted min, discri-min and leximin. In R. R.
Yager & J. Kacprzyk (Eds.), The ordered weighted
averaging operators: Theory and applications (pp.
181-192). Boston: Kluwer Academic Publishers.
Dubois, D., & Prade, H. (1988). Possibility theory.
New York: Plenum Press.
Dubois, D., & Prade, H. (1996). Semantics of
quotient operators in fuzzy relational databases.
Fuzzy Sets and Systems, 78, 89-93.
Dubois, D., & Prade, H. (1997). Using fuzzy sets in
flexible querying: Why and how? In T. Andreasen,
H. Christiansen, & H. L. Larsen (Eds.), Flexible
query answering systems. Dordrecht: Kluwer
Academic Publishers.
Gaasterland, T., Godfrey, P., & Minker, J. (1992).
An overview of cooperative answering. Journal of
Intelligent Information Systems, 1, 123-157.
Galindo, J., Medina, J. M., & Aranda, M. C. (1999).
Querying fuzzy relational databases through fuzzy
domain calculus. International Journal of Intelligent Systems, 14, 375-411.
Galindo, J., Medina, J. M., Cubero, J. C., & Garcia,
M. T. (2000). Fuzzy quantifiers in fuzzy domain
calculus. In Proceedings of the International Conference on Information Processing and Management of Uncertainty in Knowledge-based Systems
(IPMU´2000) (pp. 1697-1704), Madrid, Spain.
Galindo, J., Medina, J. M., Cubero, J. C., & Garcia,
M. T. (2001). Relaxing the universal quantifier of the
division in fuzzy relational databases. International
Journal of Intelligent Systems, 16(6), 713-742.
Galindo, J., Medina, J. M., Pons, O., & Cubero, J.
C. (1998). A server for fuzzy SQL queries. In T.
Andreasen, H. Christiansen, & H. L. Larsen (Eds.),
Proceedings of the Third International Conference
on Flexible Query Answering Systems (LNAI 1495,
pp. 164-174). London: Springer-Verlag.
Galindo, J., Urrutia, A., & Piattini, M. (2006). Fuzzy
databases: Modeling, design and implementation.
Hershey, PA: Idea Group Publishing.
George, R. (1992). Uncertainty management issues
in the object-oriented database model. PhD Thesis,
Tulane University, New Orleans, LA, USA.
George, R., Yazici, A., Petry, F. E., & Buckles, B.
P. (1997). Modeling impreciseness and uncertainty
in the object-oriented data model: A similarity-based
approach. In R. de Caluwe (Ed.), Fuzzy and
uncertain object-oriented databases: Concepts and
models (pp. 63-95). Singapore: World Scientific.
Gonçalves, M., & Tineo, L. (2001a). SQLf flexible
querying language extension by means of the norm
SQL2. In Proceedings of the IEEE International
Conference on Fuzzy Systems (FUZZ-IEEE’2001)
(pp. 473-476).
Gonçalves, M., & Tineo, L. (2001b). SQLf3: An
extension of SQLf with SQL3 features. In Proceedings of the IEEE International Conference on Fuzzy
Systems (FUZZ-IEEE’2001) (pp. 477-480).
Kacprzyk, J., & Zadrożny, S. (1995). FQUERY
for Access: Fuzzy querying for Windows-based
DBMS. In P. Bosc & J. Kacprzyk (Eds.), Fuzziness
in database management systems (pp. 415-433).
Heidelberg, Germany: Physica-Verlag.
Kacprzyk, J., & Zadrożny, S. (1997). Implementation
of OWA operators in fuzzy querying for
Microsoft Access. In R. R. Yager & J. Kacprzyk
(Eds.), The ordered weighted averaging operators:
Theory and applications (pp. 293-306). Boston:
Kluwer Academic Publishers.
Kacprzyk, J., & Zadrożny, S. (2000a). On a fuzzy
querying and data mining interface. Kybernetika,
36, 657-670.
Kacprzyk, J., & Zadrożny, S. (2000b). On combining intelligent querying and data mining using
fuzzy logic concepts. In G. Bordogna & G. Pasi
(Eds.), Recent research issues on fuzzy databases
(pp. 67-81). Heidelberg: Physica-Verlag.
Kacprzyk, J., & Zadrożny, S. (2001). Using fuzzy
querying over the Internet to browse through
information resources. In B. Reusch & K. H.
Temme (Eds.), Computational intelligence in
theory and practice (pp. 235-262). Heidelberg:
Physica-Verlag.
Kacprzyk, J., Zadrożny, S., & Ziółkowski, A.
(1989). FQUERY III+: A “human-consistent”
database querying system based on fuzzy logic
with linguistic quantifiers. Information Systems,
14, 443-453.
Kacprzyk, J., & Ziółkowski, A. (1986). Database
queries with fuzzy linguistic quantifiers. IEEE
Transactions on Systems, Man and Cybernetics,
16, 474-479.
Kim, W. (1994). Observations on the ODMG-93
proposal for an object-oriented database language.
ACM SIGMOD Record, 23(1), 4-9.
Klement, E. P., Mesiar, R., & Pap, E. (Eds.). (2000).
Triangular norms. Kluwer Academic Publishers.
Laurent, A. (2003). Querying fuzzy multidimensional databases: Unary operators and their
properties. International Journal of Uncertainty,
Fuzziness and Knowledge-Based Systems, 11,
31-46.
Marín, N., Pons, O., & Vila, M. A. (2000). Fuzzy
types: A new concept of type for managing vague
structures. International Journal of Intelligent
Systems, 15(11), 1061-1085.
Medina, J. M., Pons, O., & Vila, M. A. (1994).
GEFRED: A generalized model of fuzzy relational
databases. Information Sciences, 76(1-2), 87-109.
Melton, J., & Simon, A. R. (2002). SQL:1999:
Understanding relational language components.
Morgan Kaufmann.
Mouaddib, N., & Subtil, P. (1997). Management
of uncertainty and vagueness in databases: The
FIRMS point of view. International Journal of
Uncertainty, Fuzziness and Knowledge-Based
Systems, 5(4), 437-457.
Na, S., & Park, S. (1997). Fuzzy object-oriented
data model and fuzzy association algebra. In R.
de Caluwe (Ed.), Fuzzy and uncertain object-oriented databases: Concepts and models. Singapore:
World Scientific.
Petry, F. E. (1996). Fuzzy databases: Principles
and applications. Boston: Kluwer Academic
Publishers.
Prade, H., & Testemale, C. (1984). Generalizing
database relational algebra for the treatment of
incomplete or uncertain information and vague
queries. Information Sciences, 34, 115-143.
Prade, H., & Testemale, C. (1987). Representation
of soft constraints and fuzzy attribute values by
means of possibility distributions in databases. In
J. C. Bezdek (Ed.), Analysis of fuzzy information.
Boca Raton, FL: CRC Press.
Ramakrishnan, R., & Gehrke, J. (2000). Database
management systems. McGraw-Hill.
Reiter, R. (1980). A logic for default reasoning.
Artificial Intelligence, 13(1), 81-132.
Rocacher, D., & Connan, F. (1996). A fuzzy algebra
for object oriented databases. In Proceedings of the
4th European Congress on Intelligent Techniques
and Soft Computing (EUFIT’96) 2 (pp. 871-876),
Aachen, Germany.
Rossazza, J.-P. (1990). Utilisation de hiérarchies
de classes floues pour la représentation de
connaissances imprécises et sujettes à exception:
Le système “SORCIER.” PhD Thesis, Université
Paul Sabatier, Toulouse, France.
Rossazza, J.-P., Dubois, D., & Prade, H. (1997).
A hierarchical model of fuzzy classes. In R. de
Caluwe (Ed.), Fuzzy and uncertain object-oriented
databases: Concepts and models (pp. 21-61). Singapore: World Scientific.
Shaw, G. M., & Zdonik, S. B. (1990). A query
algebra for object-oriented databases. In Proceedings of the 6th International Conference on
Data Engineering (ICDE’90) (pp. 154-162), Los
Angeles, CA.
Tahani, V. (1977). A conceptual framework for
fuzzy query processing: A step toward very intelligent database systems. Information Processing
and Management, 13, 289-303.
Tanaka, K., Kobayashi, S., & Sakanoue, T. (1991).
Uncertainty management in object-oriented database systems. In Proceedings of the International
Conference on Database and Expert System Applications (DEXA’91) (pp. 251-256). Berlin:
Springer-Verlag.
Umano, M. (1982). FREEDOM-0: A fuzzy database
system. In M. Gupta & E. Sanchez (Eds.), Fuzzy
information and decision processes (pp. 339-347).
Amsterdam: North-Holland.
Umano, M., & Fukami, S. (1994). Fuzzy relational
algebra for possibility-distribution-fuzzy relational
model of fuzzy data. Journal of Intelligent Information Systems, 3, 7-27.
Van Gyseghem, N. (1998). Imprecision and uncertainty in the UFO database model. Journal of
the American Society for Information Science,
49(3), 236-252.
Vila, M. A., Cubero, J.-C., Medina, J.-M., & Pons,
O. (1997). Using OWA operator in flexible query
processing. In R. R. Yager & J. Kacprzyk (Eds.),
The ordered weighted averaging operators: Theory
and applications (pp. 258-274). Boston: Kluwer
Academic Publishers.
Yager, R. R. (1991). Fuzzy quotient operators for
fuzzy relational databases. In Proceedings of
the International Fuzzy Engineering Symposium
(IFES’91) (pp. 289-296), Yokohama, Japan.
Yager, R. R. (1994). Interpreting linguistically
quantified propositions. International Journal of
Intelligent Systems, 9, 541-569.
Zadeh, L. A. (1965). Fuzzy sets. Information and
Control, 8(3), 338-353.
Takahashi, Y. (1993). Fuzzy database query languages and their relational completeness theorem.
IEEE Transactions on Knowledge and Data Engineering, 5, 122-125.
Zadeh, L. A. (1975). The concept of a linguistic
variable and its application to approximate reasoning (parts I, II, and III). Information Sciences, 8,
199-251, 301-357; 9, 43-80.
Takahashi, Y. (1995). A fuzzy query language for relational databases. In P. Bosc & J. Kacprzyk (Eds.),
Fuzziness in database management systems (pp.
365-384). Heidelberg, Germany: Physica-Verlag.
Zadeh, L. A. (1978). Fuzzy sets as a basis for a
theory of possibility. Fuzzy Sets and Systems,
1(1), 3-28.
Zadeh, L. A. (1983). A computational approach to
fuzzy quantifiers in natural languages. Computers and Mathematics with Applications, 9, 149-184.
Zadrożny, S. (2005). Bipolar queries revisited. In
V. Torra, Y. Narukawa, & S. Miyamoto (Eds.),
Modelling decisions for artificial intelligence
(MDAI 2005) (LNAI 3558, pp. 387-398). Berlin:
Springer-Verlag.
Zadrożny, S., & Kacprzyk, J. (1998). Implementing
fuzzy querying via the Internet/WWW: Java
applets, ActiveX controls and cookies. In Flexible
query answering systems (pp. 382-392). Heidelberg:
Springer-Verlag.
Zadrożny, S., & Kacprzyk, J. (2002). Fuzzy
querying of relational databases: A fuzzy logic view. In
Proceedings of the EUROFUSE Workshop on Information Systems (pp. 153-158), Varenna, Italy.
Zemankova-Leech, M., & Kandel, A. (1984). Fuzzy
relational databases: A key to expert systems.
Cologne, Germany: Verlag TÜV Rheinland.
Key Terms
Database: A collection of persistent data. In a
database, data are modeled in accordance with a
database model. This model defines the structure of
the data, the constraints for integrity and security,
and the behavior of the data.
Fuzzy Database: In a regular database, only
crisp (perfectly described) data are stored. However, due to imprecision, vagueness, uncertainty,
incompleteness, or ambiguity, much real-world data
is available only in an imperfect form.
Fuzzy databases aim to capture imperfect
information about a modeled part of the world and
represent it directly, as accurately as possible, in a
database. The two leading approaches to the representation of imperfect information in databases
are the possibilistic approach and the similarity
relation based approach.
Fuzzy Preferences Between Query Conditions: The introduction of fuzzy preferences in
fuzzy querying can also be done between query
conditions. These kinds of preferences are expressed via grades of importance, usually called
weights. Different weights are then assigned to
particular conditions indicating that the satisfaction
of some query conditions is more desirable than
the satisfaction of others.
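As an illustration of weighted conditions, the sketch below combines per-condition satisfaction degrees with a weighted minimum, one common aggregation scheme in the fuzzy querying literature. It is not necessarily the scheme intended here; the function name `weighted_min` and the sample degrees and weights are hypothetical.

```python
def weighted_min(degrees, weights):
    """Aggregate satisfaction degrees under importance weights.

    degrees[i] is the degree (in [0, 1]) to which a record satisfies
    condition i; weights[i] is the importance (in [0, 1]) of condition i.
    A weight of 1 makes the condition fully binding, a weight of 0 makes
    it irrelevant. This weighted-minimum scheme is an assumption chosen
    for illustration.
    """
    if len(degrees) != len(weights):
        raise ValueError("degrees and weights must have the same length")
    return min(max(1.0 - w, d) for d, w in zip(degrees, weights))


# Hypothetical record: satisfies "cheap" to degree 0.2, "close" to degree 0.9.
both_important = weighted_min([0.2, 0.9], [1.0, 1.0])
price_less_important = weighted_min([0.2, 0.9], [0.5, 1.0])
print(both_important, price_less_important)
```

Lowering the weight of a poorly satisfied condition raises the overall degree, which matches the idea that satisfying some conditions matters more than satisfying others.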
Fuzzy Preferences Inside Query Conditions:
In fuzzy querying, fuzzy preferences can be introduced inside the query
conditions via flexible search criteria, which make it possible to
express, in a gradual way, that some values are more desirable than
others.
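A flexible search criterion of this kind is often modeled by a membership function that maps each attribute value to a degree in [0, 1]. The following is a minimal sketch using a trapezoidal shape; the helper `trapezoid`, the criterion "affordable", and its breakpoints are assumptions made for illustration only.

```python
def trapezoid(a, b, c, d):
    """Return a membership function that is 1 on [b, c], 0 outside (a, d),
    and linear on the two slopes. Requires a <= b <= c <= d."""
    def mu(x):
        if b <= x <= c:
            return 1.0
        if x <= a or x >= d:
            return 0.0
        if x < b:
            return (x - a) / (b - a)  # rising edge
        return (d - x) / (d - c)      # falling edge
    return mu


# Hypothetical criterion "price is affordable": fully satisfied up to 100,
# gradually less satisfied up to 150, not satisfied beyond that.
affordable = trapezoid(0, 0, 100, 150)
for price in (80, 125, 200):
    print(price, affordable(price))
```

Records can then be ranked by the returned degree instead of being simply accepted or rejected.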
Fuzzy Querying: Searching for data in a
database is called querying. Modern database
systems provide a query language to support querying. Relational databases are usually
queried using SQL (Structured Query Language),
and object-oriented ODMG databases are queried
using OQL (Object Query Language). Traditional
database querying can be enhanced by introducing
fuzzy preferences and/or fuzzy conditions in the
queries. This is called fuzzy querying.
Object-Oriented Database: An object-oriented database is a database that is modeled in
accordance with an object-oriented database
model. In an object-oriented database model, the
data are structured in classes, which also embody
the behavior of the data. Classes are constructed
in the spirit of the object-oriented programming
paradigm and are as such closely connected to
an object-oriented programming language. The
best known object-oriented database model is the
ODMG model.
Relational Database: A relational database is
a database that is modeled in accordance with the
relational database model. In the relational database
model, the data are structured in relations that are
represented by tables. The behavior of the data is
defined in terms of the relational algebra, which
originally consists of eight operators (union, intersection, difference, cross product, join, selection,
projection, and division), or in terms of the relational
calculus, which is of a declarative nature.
Possibilistic Fuzzy Database Approach: In
the possibilistic fuzzy database approach, imprecision in the value of an attribute is modeled via
a possibility distribution on the domain of this
attribute. This calls for the use of necessity and
possibility measures in database querying.
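For a finite domain, the two measures can be sketched as follows, using the standard definitions Π = max over x of min(π(x), μ(x)) and N = min over x of max(1 − π(x), μ(x)), where π is the possibility distribution stored for the attribute and μ is the membership function of the query condition. The attribute, the distribution, and the condition "young" below are invented for illustration.

```python
def possibility(pi, mu):
    """Degree to which it is possible that the stored value satisfies mu.

    pi maps candidate domain values to possibility degrees in [0, 1].
    """
    return max(min(p, mu(x)) for x, p in pi.items())


def necessity(pi, mu):
    """Degree to which it is certain that the stored value satisfies mu.

    Values absent from pi have possibility 0 and cannot lower the result,
    so iterating over the listed values is sufficient.
    """
    return min(max(1.0 - p, mu(x)) for x, p in pi.items())


def young(age):
    """Hypothetical condition: fully young up to 30, not young from 40 on."""
    if age <= 30:
        return 1.0
    if age >= 40:
        return 0.0
    return (40 - age) / 10


# Hypothetical imprecise age: 25 is fully possible, 35 is possible to degree 0.5.
age_distribution = {25: 1.0, 35: 0.5}
print(possibility(age_distribution, young), necessity(age_distribution, young))
```

With a normalized distribution (at least one value with possibility 1), the necessity degree never exceeds the possibility degree, so the pair brackets how well the record matches.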
Similarity Relation Based Fuzzy Database
Approach: In the similarity relation based fuzzy
database approach, query results are allowed to
contain not only data that exactly satisfy the search
conditions but also data that are similar to these
data. For this reason, the attribute domains have to
be equipped with a similarity relation. (A similarity
relation is a fuzzy binary relation whose membership function expresses the similarity degree
between the pairs of the domain elements.)
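The idea can be sketched with a small lookup table that plays the role of the similarity relation on one attribute domain. The domain (colors), the similarity degrees, the threshold, and the function names are all hypothetical.

```python
# Hypothetical similarity relation on a color domain (symmetric; every
# value is similar to itself to degree 1; unlisted pairs have degree 0).
SIMILARITY = {
    frozenset(["red", "orange"]): 0.8,
    frozenset(["red", "pink"]): 0.6,
    frozenset(["orange", "pink"]): 0.6,
}


def similarity(a, b):
    """Return the similarity degree between two domain elements."""
    if a == b:
        return 1.0
    return SIMILARITY.get(frozenset((a, b)), 0.0)


def similar_rows(rows, attribute, target, threshold):
    """Return (degree, row) pairs whose attribute value is similar to
    target to at least the given threshold, best matches first."""
    scored = [(similarity(row[attribute], target), row) for row in rows]
    kept = [pair for pair in scored if pair[0] >= threshold]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


rows = [
    {"id": 1, "color": "red"},
    {"id": 2, "color": "orange"},
    {"id": 3, "color": "blue"},
    {"id": 4, "color": "pink"},
]
for degree, row in similar_rows(rows, "color", "red", 0.5):
    print(row["id"], degree)
```

An exact-match query for "red" would return only the first row; with the similarity relation and a threshold of 0.5, the orange and pink rows are also returned, each with its similarity degree.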