The Natural Order-Generic Collapse
for ω-Representable Databases over the Rational
and the Real Ordered Group
Nicole Schweikardt
Institut für Informatik / FB 17
Johannes Gutenberg-Universität, D-55099 Mainz
[email protected]
http://www.informatik.uni-mainz.de/˜nisch/homepage.html
Abstract. We consider order-generic queries, i.e., queries which commute with every order-preserving automorphism of a structure’s universe. It is well-known that first-order logic has the natural order-generic
collapse over the rational and the real ordered group for the class of
dense order constraint databases (also known as finitely representable
databases). I.e., on this class of databases over ⟨Q, <⟩ or ⟨R, <⟩, addition
does not add to the expressive power of first-order logic for defining order-generic queries. In the present paper we develop a natural generalization
of the notion of finitely representable databases, where an arbitrary (i.e.
possibly infinite) number of regions is allowed. We call these databases
ω-representable, and we prove the natural order-generic collapse over the
rational and the real ordered group for this larger class of databases.
Keywords: Logic in Computer Science, Database Theory, Constructive
Mathematics
1 Introduction and Main Results
In relational database theory a database is modelled as a relational structure
over a fixed, possibly infinite universe U. A k-ary query is a mapping Q which
assigns to each database A a k-ary relation Q(A) ⊆ U^k. In many applications the
elements in U only serve as identifiers which are exchangeable. If this is the case,
one demands that queries commute with every permutation of U. Such queries
are called generic. If U is linearly ordered, a query may refer to the ordering. In
this setting it is more appropriate to consider queries which commute with every
order-preserving (i.e. strictly increasing) mapping of U. Such queries are called
order-generic.
A basic way of expressing order-generic queries is by first-order formulas that
make use of the linear ordering and of the database relations. Database theorists
distinguish between two different semantics: active semantics, where quantifiers
only range over database elements, and the (possibly) stronger natural semantics,
where quantifiers range over all of U. In the present paper we always consider
natural semantics.
L. Fribourg (Ed.): CSL 2001, LNCS 2142, pp. 130–144, 2001.
© Springer-Verlag Berlin Heidelberg 2001
It is a reasonable question whether the use of additional, e.g. arithmetical,
predicates on U allows first-order logic to express more order-generic queries than
with linear ordering alone. In some situations this question can be answered “yes”
(e.g. if U is the set of natural numbers with + and × as additional predicates,
cf. [3]). In other situations the question must be answered “no” (e.g. if U is
the set of natural numbers with + alone, cf. [8]) — such results are then called
collapse results, because first-order logic with the additional predicates collapses
to first-order logic with linear ordering alone. A recent overview of this area of
research is given in [3].
In classical database theory, attention usually is restricted to finite databases.
In this setting Benedikt et al. [2] have obtained a strong collapse result: First-order logic has the natural order-generic collapse for finite databases over o-minimal structures. This means that if the universe U, together with the additional predicates, has a certain property called o-minimality, then for every
order-generic first-order formula ϕ which uses the additional predicates, there
is a formula with linear ordering alone which is equivalent to ϕ on all finite
databases.
Belegradek et al. [1] have extended this result: Instead of o-minimality they
consider quasi o-minimality, and instead of finite databases they consider finitely
representable databases (also known as dense order constraint databases). Many
structures interesting to database theory, including ⟨N, <, +⟩, ⟨Q, <, +⟩,
⟨R, <, +⟩, and ⟨R, <, +, ×, e^x⟩, are indeed o-minimal or at least quasi o-minimal.
A database is called finitely representable if each of its relations can be explicitly
defined by a first-order formula which makes use of the linear ordering and of
finitely many constants in U. For U ∈ {Q, R}, finitely representable databases
are exactly those databases where every relation is defined by a Boolean combination of order-constraints over U. I.e., those database relations essentially
consist of a finite number of multidimensional rectangles in U.
A reasonable question is whether such collapse results hold for even larger
classes of databases. In [8] it was shown that over ⟨N, <, +⟩ the natural order-generic collapse does indeed hold for arbitrary databases. However, this result
cannot be carried over to dense linear orders: Belegradek et al. have shown (cf.
[1, Theorem 3.2]) that e.g. over ⟨Q, <, +⟩ the natural order-generic collapse does
not hold for arbitrary databases. This result draws a borderline between finite
and finitely representable databases on the one side and arbitrary databases on
the other. In the present paper we extend that borderline. We develop a natural
generalization of the notion of finitely representable databases. We call these
databases ω-representable, and we obtain the following
Main Theorem 1. First-order logic has the natural order-generic collapse for
ω-representable databases over ⟨Q, <, +⟩ and ⟨R, <, +⟩.
We call a database ω-representable if each of its relations can be explicitly defined
by a formula in infinitary logic which makes use of the linear ordering and of a
countable, unbounded sequence of constants s1 < s2 < · · · in U. For U ∈ {Q, R},
ω-representable databases turn out to be exactly those databases where every
relation is defined by an infinitary Boolean combination of order-constraints
over ⟨U, (s_n)_{n≥1}⟩. I.e., those database relations essentially consist of a finite or
a countable number of multidimensional rectangles in U.
In particular, the theorem above shows that there is a natural class that
contains “essentially infinite” databases, to which the collapse results of Benedikt
et al. and Belegradek et al. can be generalized, for the special case of ⟨Q, <, +⟩
or ⟨R, <, +⟩ as underlying structures.
The two main tools for proving Main Theorem 1 are
(1.) a result of [8] that implies, for U ∈ {Q, R}, the natural order-generic collapse
over ⟨U, <, +⟩ for the class of ω-databases (these are the databases whose
active domain is either finite or consists of an unbounded sequence s1 <
s2 < · · · of elements in U), and
(2.) the following Main Theorem 2, which allows us to lift collapse results for
ω-databases to collapse results for ω-representable databases.
Main Theorem 2. Let ⟨U, <, ⋯⟩ be an extension of ⟨U, <⟩ with arbitrary
additional predicates. If first-order logic has the natural order-generic collapse
over ⟨U, <, ⋯⟩ for the class of ω-databases, then it also has the natural order-generic collapse over ⟨U, <, ⋯⟩ for the class of ω-representable databases.
Structure of the Paper. In section 2 we provide the notation used throughout
the paper. In section 3 we give an outline of the proof and we point out analogies
and differences compared with related papers which use a similar proof method.
In section 4 we explain the collapse result of [8] which gives us the collapse for
ω-databases. In section 5 we examine infinitary logic and give a characterization of ω-representable relations. In section 6 we explain how an ω-representable
database can be represented by an ω-database. In section 7 we show that there
are first-order interpretations that map an ω-representable database to an ω-database, and vice versa. In section 8 we prove the two main theorems. In section 9 we conclude the paper by pointing out further questions and a potential
application.
2 Preliminaries
We use Q for the set of rationals, R for the set of reals, and ω for the set of
non-negative integers. For r, s ∈ R we write int [r, s] to denote the closed interval
{x ∈ R : r ≤ x ≤ s}. Analogously, we write int [r, s) for the half-open interval
int [r, s] \ {s}, and int (r, s) for the open interval int [r, s] \ {r, s}.
Depending on the particular context, we use x̄ as abbreviation for a sequence
x_1, …, x_m or a tuple (x_1, …, x_m). Accordingly, if q is a mapping defined on all
elements in x̄, we write q(x̄) to denote the sequence q(x_1), …, q(x_m) or the tuple
(q(x_1), …, q(x_m)). If R is an m-ary relation on the domain of q, we write q(R)
to denote the relation {q(x̄) : x̄ ∈ R}. Instead of x̄ ∈ R we often write R(x̄).
For two disjoint sets A and B we write A ⊔ B to denote the disjoint union of A
and B.
First-Order Logic FO(τ ). A signature τ consists of finitely many relation
and constant symbols. Each relation symbol R ∈ τ has a fixed arity ar(R) ∈ ω.
Whenever we refer to some “c ∈ τ ”, we implicitly assume that c is a constant
symbol in τ . Analogously, “R ∈ τ ” always means that R is a relation symbol
in τ. We use x_1, x_2, … as variable symbols. Atomic τ-formulas are y_1 = y_2 and
R(y_1, …, y_m), where R ∈ τ is of arity, say, m and y_1, …, y_m are constant symbols
in τ or variable symbols. FO(τ)-formulas are built up as usual from the atomic
τ-formulas and the logical connectives ∧, ∨, ¬, the variable symbols x_1, x_2, …,
the existential quantifier ∃, and the universal quantifier ∀. We write qd(ϕ) to
denote the quantifier depth of a formula ϕ, i.e., the maximum number of nested
quantifiers that occurs in ϕ. We sometimes write ϕ(x_1, …, x_k) to indicate that
x_1, …, x_k are the free variables of ϕ, i.e., those variables that are not bound by
a quantifier. We say that ϕ is a sentence if it has no free variables. If we insert
additional constant or relation symbols, e.g. < and +, into a signature τ , then
we simply write FO(τ, <, +) instead of FO(τ ∪ {<, +}).
Structures. Let τ be a signature. A τ-structure A = ⟨U, τ^A⟩ consists of an
arbitrary set U which is called the universe of A, and a set τ^A that contains
– an interpretation R^A ⊆ U^{ar(R)}, for each R ∈ τ, and
– an interpretation c^A ∈ U, for each c ∈ τ.
The active domain of A is the set of all constants of A, together with the set of
all elements in U that belong to one of A’s relations.
Sometimes we explicitly want to specify the universe U of a τ -structure A.
In these cases we say that A is a ⟨U, τ⟩-structure. In the present paper, we only
consider structures with universe U ∈ {R, Q}.
For a FO(τ)-sentence ϕ we say that A models ϕ and write A |= ϕ to indicate that ϕ is satisfied when interpreting each symbol in τ by its interpretation
in τ^A. We write A ⊭ ϕ to indicate that A does not model ϕ. For a FO(τ)-formula ϕ(x_1, …, x_k) and for elements a_1, …, a_k in the universe of A we write
A |= ϕ(a_1, …, a_k) to indicate that the (τ ∪ {x_1, …, x_k})-structure ⟨A, a_1, …, a_k⟩ models the FO(τ ∪ {x_1, …, x_k})-sentence ϕ.
Since it is more convenient for our proof, we will talk about structures instead
of databases. A structure can be viewed as a database whose database schema
may contain not only relation symbols but also constant symbols. This allows
us to restrict ourselves to boolean queries (which are formulated by sentences)
instead of considering the general case of k-ary queries for arbitrary k (which
are formulated by formulas with k free variables).
Order-Generic Collapse. Let U ∈ {R, Q}. A mapping α : U → U is called an
order-automorphism of U if it is bijective and strictly increasing. For a ⟨U, τ⟩-structure A we write α(A) to denote the ⟨α(U), τ⟩-structure with R^{α(A)} = α(R^A)
for all R ∈ τ and c^{α(A)} = α(c^A) for all c ∈ τ.
Let ⟨U, <, ⋯⟩ be an extension of ⟨U, <⟩ with arbitrary additional predicates. A FO(τ, <, ⋯)-sentence ϕ is called order-generic on A iff for every order-automorphism α of U it is true that “⟨A, <, ⋯⟩ |= ϕ iff ⟨α(A), <, ⋯⟩ |= ϕ”.
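The definition of order-genericity can be made concrete with a small sketch. The following Python fragment is only an illustration under stated assumptions: the automorphism `alpha`, the unary relation `R`, and the two queries are hypothetical examples and do not appear in the paper. It compares a sentence that uses only the order with one that uses addition.

```python
from fractions import Fraction

def alpha(x):
    """Hypothetical order-automorphism of Q: identity up to 1, slope 2 above 1.

    The map is bijective and strictly increasing on the rationals.
    """
    x = Fraction(x)
    return x if x <= 1 else 2 * x - 1

def has_two_elements(rel):
    """Sentence using only '<': there are a < b in the unary relation."""
    return any(a < b for a in rel for b in rel)

def has_double(rel):
    """Sentence using '+': there are a, b in the relation with a + a = b."""
    return any(a + a == b for a in rel for b in rel)

# A finite structure with one unary relation (illustrative data).
R = {Fraction(1), Fraction(2)}
alpha_R = {alpha(a) for a in R}  # the relation of the structure alpha(A)

# The order-only sentence has the same truth value on A and alpha(A) ...
print(has_two_elements(R), has_two_elements(alpha_R))
# ... whereas the sentence with addition changes its truth value.
print(has_double(R), has_double(alpha_R))
```

On this particular structure the second sentence is therefore not order-generic, so the collapse statement makes no claim about it; the collapse only concerns sentences that happen to be order-generic on the structures under consideration.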
Let C be a class of structures. We say “first-order logic has the natural order-generic collapse over ⟨U, <, ⋯⟩ on structures in C” to express that the following
is valid for every signature τ: Let ϕ be a FO(τ, <, ⋯)-sentence, and let K be
the class of all ⟨U, τ⟩-structures in C on which ϕ is order-generic. There exists
a FO(τ, <)-sentence ψ which is equivalent to ϕ on K, i.e., “⟨A, <, ⋯⟩ |= ϕ iff
⟨A, <⟩ |= ψ” is true for all A ∈ K.
Infinitary Logic L∞ω (<, S). Infinitary logic is defined in the same way as
first-order logic, except that arbitrary (i.e. possibly infinite) disjunctions and
conjunctions are allowed. Only in the context of infinitary logic we allow a signature to contain infinitely many symbols. What we need in the present paper
is the following: Let S be a possibly infinite set of constant symbols. The logic
L∞ω (<, S) is given by the following clauses: It contains all atomic formulas x=y
and x<y, where x and y are variable symbols or elements in S. If it contains ϕ,
then it contains also ¬ϕ. If it contains ϕ and if x is a variable symbol, then it contains also ∃x ϕ and ∀x ϕ. If Φ is a (possibly infinite) set of L∞ω(<, S)-formulas,
then ⋁Φ and ⋀Φ are formulas in L∞ω(<, S).
The semantics is a direct extension of the semantics of first-order logic, where
⋁Φ is true if there is some ϕ ∈ Φ which is true; and ⋀Φ is true if every ϕ ∈ Φ
is true.
In the present paper we use infinitary logic only for the universe U = R or U =
Q, where the constant symbols are interpreted by numbers in U. Consequently,
we identify the set S of constant symbols with a set S ⊆ U.
Sets of Type at Most ω, ω-Structures, and ω-Representable Structures. Let U ∈ {R, Q}. We say that S ⊆ U is of type ω if ⟨U, <, S⟩ is isomorphic to ⟨U, <, ω⟩. One can easily see that S is of type ω if and only if
S = {s_1 < s_2 < ⋯}, where the sequence (s_n)_{n≥1} is strictly increasing and unbounded. Accordingly, we say that S is of type at most ω if S is finite or of type
ω.
We say that a ⟨U, τ⟩-structure A is an ω-structure if the active domain of A
is of type at most ω.
A relation R ⊆ U^m is called ω-representable if there is a set S ⊆ U of type at
most ω such that R is definable in L∞ω(<, S), i.e. there is a L∞ω(<, S)-formula
ϕ_R(x_1, …, x_m) with R = {ā ∈ U^m : U |= ϕ_R(ā)}. Accordingly, a ⟨U, τ⟩-structure
A is called ω-representable if each of A’s relations is ω-representable.
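As an illustration (not taken from the paper), consider the universe of the reals and the set S = {1, 2, 3, …}, which is of type ω. The following sketch writes down an infinitary formula, built only from atoms of L∞ω(<, S), that defines a unary relation with infinitely many components. The choice of S and the names R_ex and ϕ_ex are assumptions made only for this example.

```latex
% Hypothetical example: S = { s_n : n >= 1 } with s_n = n.
% R_ex is the union of the half-open intervals [s_{2n-1}, s_{2n}).
\[
  R_{\mathrm{ex}} \;=\; \bigcup_{n \ge 1} \mathrm{int}[s_{2n-1}, s_{2n})
  \;=\; \{\, a \in \mathbb{R} : \mathbb{R} \models \varphi_{\mathrm{ex}}(a) \,\},
  \qquad
  \varphi_{\mathrm{ex}}(x) \;:=\; \bigvee_{n \ge 1}
     \bigl( \neg\,(x < s_{2n-1}) \;\wedge\; x < s_{2n} \bigr).
\]
```

The relation R_ex is ω-representable by definition, but it is not finitely representable: a finite Boolean combination of order-constraints over finitely many constants can only define a unary relation with finitely many connected components.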
For better readability, we formulate the rest of the paper only for the case
U = R. However, all statements remain correct if one replaces R by Q.
3 Outline of the Proof – The Lifting Method
It is by now quite a common method in database theory to lift results from one
class of databases to another. This lifting method can be described as follows:
Known: A result for a class of “easy” databases.
Wanted: The analogous result for a class of “complicated” databases.
Method:
(1.) Show that all the relevant information about a “complicated”
database can be represented by an “easy” database.
(2.) Show that the translation from the “complicated” to the “easy”
database (and vice versa) can be performed in an appropriate way
(e.g. via an efficient algorithm or via FO-formulas).
(3.) Use this to translate the known result for the “easy” databases
into the desired result for the “complicated” databases.
In the literature the “easy” database which represents a “complicated” database
is usually called the invariant of the “complicated” database. Table 1 gives a
listing of recent papers in which the lifting method has been used.
Table 1. Some papers using the lifting method.
Paper  | “compl.” dbs       | “easy” dbs | result (“easy” dbs)                                    | result (“compl.” dbs)
[9]    | planar spatial dbs | finite dbs | evaluation of fixpoint+counting queries                | evaluation of top. FO(R, <)-queries
[7]    | region dbs         | finite dbs | order-generic collapse over ⟨R, <, +, ×⟩ (cf. [2])     | collapse from top. FO(R, <, +, ×)-queries to top. FO(R, <)-queries
[5]    | finitely rep. dbs  | finite dbs | logical characterization of complexity classes         | complexity of query evaluation
[1]    | finitely rep. dbs  | finite dbs | order-generic collapse over quasi o-minimal structures | order-generic collapse over quasi o-minimal structures
[here] | ω-rep. dbs         | ω-dbs      | order-generic collapse over ⟨R, <, +⟩                  | order-generic collapse over ⟨R, <, +⟩
In particular, Belegradek, Stolboushkin, and Taitslin [1] and Grädel and
Kreutzer [5] show that all the relevant information about a finitely representable
database (i.e. a database defined by a finite Boolean combination of order-constraints) can be represented by a finite database, and that the translation
from finitely representable to finite (and vice versa, in [1]) can be done by a
first-order interpretation.
Grädel and Kreutzer use this translation to carry over logical characterizations of complexity classes to results on the data complexity of query evaluation.
They lift, e.g., the well-known logical characterization “PTIME = FO+LFP on
ordered finite structures” to the result stating that the polynomial time computable queries against finitely representable databases are exactly the FO+LFP-definable queries.
Belegradek, Stolboushkin, and Taitslin use their FO-translations from finitely
representable databases to finite databases (and vice versa) to lift collapse results
for finite databases to collapse results for finitely representable databases.
In the present paper the same is done for ω-representable databases and
ω-databases (instead of finitely representable databases and finite databases,
respectively). I.e.:
(1.) We show how all the relevant information about an ω-representable database
can be represented by an ω-database (cf. sections 5 and 6).
The representation here is considerably different from the representations of
[1] and [5]. It is, as the author feels, more natural for the context considered
in the present paper.
(2.) We show that the translation from the ω-representable to the ω-database
(and vice versa) can be done by a first-order interpretation (cf. section 7).
(3.) We use this translation to carry over a collapse result for ω-databases from
[8] to a collapse result for ω-representable databases (cf. section 8).
4 The Collapse Result for ω-Structures
In [8] a structure A is called nicely representable if it satisfies the following
conditions:
(1) There is an infinite sequence (I_n)_{n∈ω} of intervals I_n = int [l_n, r_n], such that
l_n ≤ r_n < l_{n+1}, and the sequence (r_n)_{n∈ω} is unbounded,
(2) ⋃_{n∈ω} I_n is the active domain of A,
(3) every relation R^A of A is constant on the multi-dimensional rectangles
I_{n_1} × ⋯ × I_{n_{ar(R)}} (for all n_1, …, n_{ar(R)} ∈ ω). I.e., either all elements in
I_{n_1} × ⋯ × I_{n_{ar(R)}} belong to R^A, or no element in I_{n_1} × ⋯ × I_{n_{ar(R)}} belongs
to R^A.
Theorem 1 ([8], Theorem 4). First-order logic has the natural order-generic
collapse over ⟨R_{≥0}, <, +⟩ for nicely representable structures.
Let us mention that the class of ω-representable structures (considered in the
present paper) properly contains both, the class of finitely representable and the
class of nicely representable structures, whereas the class of nicely representable
structures does not contain the class of finitely representable structures.
The proof of Theorem 1 presented in [8] even shows the slightly stronger
result which states that first-order logic has the natural order-generic collapse
over ⟨R, <, +⟩ for structures that satisfy the conditions (1), (2’), and (3), where
the condition (2’) says that there is a set N ⊆ ω such that ⋃_{n∈N} I_n is the active
domain of A. In particular ω-structures, i.e. structures whose active domain is
of type at most ω, do satisfy the conditions (1), (2’), and (3). This gives us the
following
Corollary 1. First-order logic has the natural order-generic collapse over
⟨R, <, +⟩ for ω-structures.
5 Infinitary Logic and ω-Representable Relations
It is well-known that FO(<, S) allows quantifier elimination over R, for every
set of constants S ⊆ R. In this section we show that also L∞ω (<, S) allows
quantifier elimination over R, provided that S is of type at most ω. Recall from
section 2 that S ⊆ R is of type ω if and only if S = {s_1 < s_2 < ⋯}, where the
sequence (s_n)_{n≥1} is strictly increasing and unbounded. Accordingly, S is of type
at most ω if S is finite or of type ω.
However, our aim is not only to show that L∞ω (<, S) allows quantifier elimination, but to give an explicit characterization of the quantifier free formulas.
This characterization will give us full understanding of what ω-representable
relations look like.
Before giving the formalization of the quantifier elimination let us fix some
notation. For the rest of this paper let S ⊆ R always be of type at most ω.
We write S(i) to denote the i-th smallest element in S. For infinite S we define
S(0) := −∞ and N(S) := ω, and we obtain R = ⋃_{i∈N(S)} int [S(i), S(i+1)).
For finite S we define S(0) := −∞, N(S) := {0, …, |S|}, and S(|S|+1) :=
+∞; and, as before, we obtain R = ⋃_{i∈N(S)} int [S(i), S(i+1)).
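For m ≥ 1 and ī = (i_1, …, i_m) ∈ N(S)^m we define S(ī) := (S(i_1), …, S(i_m)),
and
\[ \mathrm{Cube}_{S;\bar{\imath}} \;:=\; \mathrm{int}[S(i_1), S(i_1{+}1)) \times \cdots \times \mathrm{int}[S(i_m), S(i_m{+}1)). \]
We say that S(ī) are the coordinates of the cube Cube_{S;ī}. Obviously,
\[ \mathbb{R}^m \;=\; \bigcup_{\bar{\imath} \in N(S)^m} \mathrm{Cube}_{S;\bar{\imath}}. \]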
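The decomposition of the m-dimensional space into cubes can be sketched for a finite set S as follows. This is a minimal illustration, assuming S is given as a sorted Python list; the function name `cube_index` is hypothetical. It returns the index tuple of the unique cube that contains a given point, using the conventions S(0) = −∞ and S(|S|+1) = +∞ from above.

```python
from bisect import bisect_right
from fractions import Fraction

def cube_index(S, point):
    """Return the index tuple (i_1, ..., i_m) of the cube containing `point`.

    S is a sorted list representing a finite set; S(i) corresponds to S[i-1]
    for i >= 1. A coordinate a lies in int[S(i), S(i+1)) exactly when i
    elements of S are less than or equal to a, which is what bisect_right
    counts.
    """
    return tuple(bisect_right(S, a) for a in point)

# Illustrative data: S = {1, 2, 3} and a point in three dimensions.
S = [1, 2, 3]
print(cube_index(S, (Fraction(1, 2), 2, 7)))  # (0, 2, 3)
```

Since every coordinate falls into exactly one half-open interval, every point receives exactly one index tuple, which mirrors the statement that the cubes cover the whole space.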
Let ā = (a_1, …, a_m) ∈ ℝ^m. The type type_{ā;S;ī} of ā with respect to Cube_{S;ī} is the
conjunction of all atoms in {y_i = x_i, y_i < x_i, x_i = x_j, x_i < x_j : i, j ∈ {1, …, m}, i ≠
j} which are satisfied if one interprets the variables x_1, …, x_m, y_1, …, y_m by the
numbers a_1, …, a_m, S(i_1), …, S(i_m).
We define types_m to be the set of all complete conjunctions of atoms in
{y_i = x_i, y_i < x_i, x_i = x_j, x_i < x_j : i, j ∈ {1, …, m}, i ≠ j}, i.e., the set of all
conjunctions t where, for all i, j ∈ {1, …, m} with i ≠ j, either y_i = x_i or y_i < x_i
occurs in t, and either x_i = x_j or x_i < x_j or x_j < x_i occurs in t. Of course, types_m
is finite, and type_{ā;S;ī} ∈ types_m. Analogously, we define Types_m to be the set
of all subsets of types_m, i.e., Types_m = {T : T ⊆ types_m}. Of course, Types_m
is finite.
For a relation R ⊆ ℝ^m we define Type_{R;S;ī} := {type_{ā;S;ī} : ā ∈ R ∩ Cube_{S;ī}}
to be the set of all types occurring in the restriction of R to Cube_{S;ī}. We say
that Type_{R;S;ī} is the type of Cube_{S;ī} in R. Of course, Type_{R;S;ī} ∈ Types_m.
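The type of a point with respect to its cube can be computed directly from the definition. The sketch below is an illustration under assumptions: S is a finite sorted list, atoms are encoded as triples of strings, and the function name `type_of` is hypothetical.

```python
from bisect import bisect_right
from fractions import Fraction

def type_of(S, point):
    """Return (index tuple, type) of `point` with respect to its cube.

    The type is a frozenset of atoms (left, relation, right), where y_j
    stands for the cube coordinate S(i_j) and x_j for the j-th component
    of the point. For i_j = 0 the coordinate is -infinity, so y_j < x_j.
    """
    idx = tuple(bisect_right(S, a) for a in point)
    atoms = set()
    for j, (a, i) in enumerate(zip(point, idx), start=1):
        on_corner = i >= 1 and S[i - 1] == a
        atoms.add((f"y{j}", "=" if on_corner else "<", f"x{j}"))
    m = len(point)
    for j in range(1, m + 1):
        for k in range(j + 1, m + 1):
            a, b = point[j - 1], point[k - 1]
            if a == b:
                atoms.add((f"x{j}", "=", f"x{k}"))
            elif a < b:
                atoms.add((f"x{j}", "<", f"x{k}"))
            else:
                atoms.add((f"x{k}", "<", f"x{j}"))
    return idx, frozenset(atoms)

# Illustrative data: the point (1, 3/2) lies in the cube with index (1, 1).
idx, t = type_of([1, 2, 3], (1, Fraction(3, 2)))
print(idx)
print(sorted(t))
```

For a pair of coordinates inside the same interval, the order atoms between x_1 and x_2 are what distinguishes the "triangular" parts of a cube from one another.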
In the formalization of the quantifier elimination we further use the following
notation: If ϕ is a L∞ω(<, S)-formula with free variables x̄ := x_1, …, x_k and
ȳ := y_1, …, y_m, we write ϕ(ȳ/S(ī)) to denote the formula one obtains by replacing
the variables y_1, …, y_m by the real numbers S(i_1), …, S(i_m).
Proposition 1 (Quantifier Elimination). Let S ⊆ ℝ be of type at most ω
and let m ≥ 1. Every formula ϕ(x_1, …, x_m) in L∞ω(<, S) is equivalent over ℝ
to the formula
\[ \tilde{\varphi}(\bar{x}) \;:=\; \bigvee_{\bar{\imath} \in N(S)^m} \; \bigvee_{t \in \mathrm{Type}_{R;S;\bar{\imath}}} \Big( t(\bar{y}/S(\bar{\imath})) \;\wedge\; \bigwedge_{j=1}^{m} S(i_j) \le x_j < S(i_j{+}1) \Big) \]
where R ⊆ ℝ^m is the relation defined by ϕ(x̄).
I.e., R = {ā ∈ ℝ^m : ℝ |= ϕ(ā)} = {ā ∈ ℝ^m : ℝ |= ϕ̃(ā)}.
The proof is similar to the quantifier elimination for FO(<, S) over R. Due to
space limitations it is omitted here.
Recall from section 2 that a relation R ⊆ ℝ^m is called ω-representable iff there
is a set S ⊆ ℝ of type at most ω such that R is definable in L∞ω(<, S). From
Proposition 1 we know what R looks like: It is defined by an infinitary boolean
combination of order-constraints over S, and it essentially consists of a finite or
a countable number of multidimensional rectangles. (Note, however, that also
certain triangles are allowed, e.g. via the constraint S(i) ≤ x_1 < x_2 < S(i+1)).
An ω-representable binary relation is illustrated in Figure 1.
Fig. 1. An ω-rep. binary relation R. The grey regions are those that belong to R.
6 ω-Representations of Relations and Structures
Definition 1. Let R ⊆ ℝ^m. A set S ⊆ ℝ is called sufficient for defining R if S
is of type at most ω and R is definable in L∞ω(<, S).
Remark 1. We say that a relation R ⊆ ℝ^m is constant on a set M ⊆ ℝ^m if either
all elements of M belong to R or no element of M belongs to R.
From Proposition 1 we obtain that a set S ⊆ ℝ of type at most ω is sufficient
for defining R if and only if R is constant on the sets
\[ \mathrm{Cube}_{S;\bar{\imath};t} \;:=\; \{\, \bar{b} \in \mathrm{Cube}_{S;\bar{\imath}} : \mathrm{type}_{\bar{b};S;\bar{\imath}} = t \,\}, \]
for all ī ∈ N(S)^m and all t ∈ types_m.
Let R ⊆ ℝ^m be ω-representable and let S ⊆ ℝ be sufficient for defining R.
From Remark 1 we know, for all ī ∈ N(S)^m and all t ∈ types_m, that either
R ∩ Cube_{S;ī;t} = ∅ or R ⊇ Cube_{S;ī;t}. This means that if we know, for each ī ∈
N(S)^m and each t ∈ types_m, whether or not R contains an element of Cube_{S;ī;t},
then we can reconstruct the entire relation R.
For i_j ≠ 0 we represent the interval int [S(i_j), S(i_j+1)) ⊆ ℝ by the number
S(i_j). Consequently, for ī ∈ (N(S) \ {0})^m, we can represent Cube_{S;ī;t} ⊆ ℝ^m
by the tuple S(ī) ∈ S^m. The information whether or not R contains an element
of Cube_{S;ī;t} can be represented by the relation R_{S;t} := {S(ī) : ī ∈ (N(S) \
{0})^m and R ∩ Cube_{S;ī;t} ≠ ∅}.
In general, we would like to represent every Cube_{S;ī;t}, for every ī ∈ N(S)^m,
by a tuple in S^m. Unfortunately, the case where i_j = 0 must be treated separately, because S(0) = −∞ ∉ S. There are various possibilities for solving this
technical problem. Here we propose the following solution: Use S(1) to represent
the interval int [S(0), S(1)). With every tuple ī ∈ N(S)^m we associate a characteristic tuple char(ī) := (c_1, …, c_m) ∈ {0, 1}^m and a tuple ī′ ∈ (N(S) \ {0})^m
via c_j := 0 and i′_j := 1 if i_j = 0, and c_j := 1 and i′_j := i_j if i_j ≠ 0. Now
Cube_{S;ī;t} can be represented by the tuple S(ī′) ∈ S^m. The information whether
or not R contains an element of Cube_{S;ī;t} can be represented by the relations
R_{S;t;ū} := {S(ī′) : ī ∈ N(S)^m, char(ī) = ū, and R ∩ Cube_{S;ī;t} ≠ ∅} (for all
ū ∈ {0, 1}^m). This leads to
Definition 2 (ω-Representation of a Relation).
Let R ⊆ ℝ^m be ω-representable, and let S ⊆ ℝ be sufficient for defining R.
(a) We represent the m-ary relation R over ℝ by a finite number of m-ary
relations over S as follows: The ω-representation of R with respect to S is
the collection
\[ \mathrm{rep}_S(R) \;:=\; \big( R_{S;t;\bar{u}} \big)_{t \in \mathrm{types}_m,\; \bar{u} \in \{0,1\}^m}, \]
where R_{S;t;ū} := {S(ī′) : ī ∈ N(S)^m, char(ī) = ū, and R ∩ Cube_{S;ī;t} ≠ ∅}.
Here, for ī ∈ N(S)^m we define ī′ and char(ī) via i′_j := 1 and (char(ī))_j := 0
if i_j = 0, and i′_j := i_j and (char(ī))_j := 1 if i_j ≠ 0.
(b) For x̄ ∈ Cube_{S;ī;t} we say that ū := char(ī) is the characteristic tuple of x̄
w.r.t. S, ȳ := S(ī′) is the representative of x̄ w.r.t. S, and t is the type of
x̄ w.r.t. S. From Remark 1 we obtain that x̄ ∈ R iff ȳ ∈ R_{S;t;ū}.
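For the unary case m = 1 the ω-representation and the reconstruction of R from it can be sketched in a few lines. The code below is an illustration under assumptions, not the paper's construction in full generality: S is a finite, non-empty sorted list that is sufficient for defining R, the relation is given as a membership predicate, the two unary types are encoded as the strings "eq" (y = x) and "lt" (y < x), and all function names are hypothetical.

```python
from bisect import bisect_right
from fractions import Fraction

def cells(S):
    """Yield (i, t, sample) for each non-empty set Cube_{S;i;t}, case m = 1."""
    for i in range(len(S) + 1):
        lo = S[i - 1] if i >= 1 else None      # S(i), or None for -infinity
        hi = S[i] if i < len(S) else None      # S(i+1), or None for +infinity
        if lo is not None:
            yield i, "eq", Fraction(lo)        # the single point S(i)
        if lo is None:
            inner = Fraction(hi) - 1           # a point below S(1)
        elif hi is None:
            inner = Fraction(lo) + 1           # a point above the last element
        else:
            inner = (Fraction(lo) + Fraction(hi)) / 2
        yield i, "lt", inner                   # a point strictly above S(i)

def rep(S, member):
    """Return rep_S(R) as a dict (t, u) -> set of representatives in S."""
    table = {(t, u): set() for t in ("eq", "lt") for u in (0, 1)}
    for i, t, sample in cells(S):
        if member(sample):                     # R meets, hence contains, the cell
            u = 0 if i == 0 else 1             # characteristic value char(i)
            table[(t, u)].add(S[0] if i == 0 else S[i - 1])
    return table

def decode(S, table, x):
    """Decide x in R from the representation only (Definition 2(b))."""
    i = bisect_right(S, x)
    u = 0 if i == 0 else 1
    y = S[0] if i == 0 else S[i - 1]           # representative of x
    t = "eq" if (i >= 1 and S[i - 1] == x) else "lt"
    return y in table[(t, u)]

# Illustrative relation R = [1, 2) union [3, +infinity), with S = {1, 2, 3}.
S = [1, 2, 3]
member = lambda x: 1 <= x < 2 or x >= 3
table = rep(S, member)
print(table)
print(decode(S, table, Fraction(7, 2)), decode(S, table, Fraction(5, 2)))
```

One sample per cell suffices only because S is assumed to be sufficient for defining R, so that R is constant on every cell; for an arbitrary predicate the sketch would give no guarantee.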
We will now transfer the notion of “ω-representation” from relations to τ-structures.
Recall from section 2 that a ⟨ℝ, τ⟩-structure A is called ω-representable iff
each of A’s relations is ω-representable.
Definition 3. Let A be a ⟨ℝ, τ⟩-structure. A set S ⊆ ℝ is called sufficient for
defining A if
– S is of type at most ω,
– c^A ∈ S, for every constant symbol c ∈ τ, and
– S is sufficient for defining R^A, for every relation symbol R ∈ τ.
Let A be a ⟨ℝ, τ⟩-structure, and let S be a set sufficient for defining A. According
to Definition 2, each of A’s relations R^A of arity, say, m can be represented by
a finite collection rep_S(R^A) = (R^A_{S;t;ū})_{t∈types_m, ū∈{0,1}^m} of relations over S. I.e.
A can be represented by a structure rep_S(A) with active domain S as follows:
Definition 4 (ω-Representation of a Structure). Let τ be a signature.
(a) The type extension τ′ of τ is the signature which consists of
– the same constant symbols as τ,
– a unary relation symbol S, and
– a relation symbol R_{t;ū} of arity, say, m, for every relation symbol R ∈ τ
of arity m, every t ∈ types_m, and every ū ∈ {0, 1}^m.
(b) Let A be an ω-representable ⟨ℝ, τ⟩-structure and let S be a set sufficient for
defining A. We represent A by the ⟨ℝ, τ′⟩-structure rep_S(A) which satisfies
– c^{rep_S(A)} = c^A (for each c ∈ τ),
– S^{rep_S(A)} = S (for the unary relation symbol S ∈ τ′), and
– R_{t;ū}^{rep_S(A)} = R^A_{S;t;ū} (for each R ∈ τ, each t ∈ types_{ar(R)}, and each ū ∈ {0, 1}^{ar(R)}).
7 FO-Interpretations
The concept of first-order interpretations (or, reductions) is well-known in mathematical logic (cf., e.g. [4]). In the present paper we consider the following easy
version:
Definition 5 (FO-Interpretation of σ in τ). Let σ and τ be signatures. A
FO-interpretation of σ in τ is a collection
\[ \Phi \;=\; \big\langle\, (\varphi_c(x))_{c \in \sigma},\; (\varphi_R(x_1, \ldots, x_{\mathrm{ar}(R)}))_{R \in \sigma} \,\big\rangle \]
of FO(τ)-formulas. For every ⟨U, τ⟩-structure A, the ⟨U, σ⟩-structure Φ(A) is
given via
– {c^{Φ(A)}} = {a ∈ U : A |= ϕ_c(a)}, for each constant symbol c ∈ σ,
– R^{Φ(A)} = {ā ∈ U^{ar(R)} : A |= ϕ_R(ā)}, for each relation symbol R ∈ σ.
Making use of a FO-interpretation of σ in τ , one can translate FO(σ)-formulas
into FO(τ )-formulas (cf., [4, Exercise 11.2.4]):
Lemma 1. Let σ and τ be signatures, let Φ be a FO-interpretation of σ in τ ,
and let d be the maximum quantifier depth of the formulas in Φ.
For every FO(σ)-sentence χ there is a FO(τ)-sentence χ′ with qd(χ′) ≤ qd(χ)+d, such that “A |= χ′ iff Φ(A) |= χ” is true for every ⟨U, τ⟩-structure
A.
Proof. χ′ is obtained from χ by replacing every atomic formula R(x̄) (resp. x=c)
by the formula ϕ_R(x̄) (resp. ϕ_c(x)).
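The substitution in the proof can be sketched on a small formula representation. The code below is an illustration under assumptions: formulas are nested tuples, only relation atoms are translated (the case of constants is omitted), the defining formulas are given as Python functions, and their bound variables are assumed to be fresh, i.e. not to occur in the sentence being translated. All names are hypothetical.

```python
def qd(f):
    """Quantifier depth of a formula given as a nested tuple."""
    op = f[0]
    if op == "atom":
        return 0
    if op == "not":
        return qd(f[1])
    if op in ("and", "or"):
        return max(qd(f[1]), qd(f[2]))
    if op in ("exists", "forall"):
        return 1 + qd(f[2])
    raise ValueError(f"unknown connective: {op}")

def translate(f, phi):
    """Replace every atom R(args) with R in `phi` by the formula phi[R](*args).

    Atoms whose relation symbol is not in `phi` (e.g. equality) are kept.
    """
    op = f[0]
    if op == "atom":
        _, rel, args = f
        return phi[rel](*args) if rel in phi else f
    if op == "not":
        return ("not", translate(f[1], phi))
    if op in ("and", "or"):
        return (op, translate(f[1], phi), translate(f[2], phi))
    if op in ("exists", "forall"):
        return (op, f[1], translate(f[2], phi))
    raise ValueError(f"unknown connective: {op}")

# Illustrative interpretation: the unary symbol P is defined by
# "exists z0 E(x, z0)"; z0 is assumed not to occur in the input sentence.
phi = {"P": lambda x: ("exists", "z0", ("atom", "E", (x, "z0")))}
chi = ("forall", "x", ("atom", "P", ("x",)))
chi_prime = translate(chi, phi)
print(chi_prime)
print(qd(chi), qd(chi_prime))
```

The quantifier depth grows by at most the depth d of the defining formulas because each atom sits below at most qd(χ) quantifiers and is replaced by a formula of depth at most d.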
The following lemma shows that A is first-order definable in rep_S(A), i.e.: all
relevant information about A can be reconstructed from rep_S(A) (if A is ω-representable and if S is sufficient for defining A).
Lemma 2. There is a FO-interpretation Φ of τ in τ′ ∪ {<} such that Φ(⟨rep_S(A),
<⟩) = A, for every ω-representable ⟨ℝ, τ⟩-structure A and every set S which is
sufficient for defining A.
Proof (sketch). For every constant symbol c ∈ τ we define ϕ_c(x) := x=c.
For every relation symbol R ∈ τ of arity, say, m we construct a formula ϕ_R(x̄)
which expresses that x̄ ∈ R. From Definition 2(b) we know that x̄ ∈ R iff ȳ ∈
R_{S;t;ū}, where ȳ, t, and ū are the representative, the type, and the characteristic
tuple, respectively, of x̄ w.r.t. S.
It is straightforward to construct, for fixed t ∈ types_m and ū ∈ {0, 1}^m, a
FO(τ′, <)-formula ψ_{t;ū}(x̄) which expresses that
– x̄ has type t w.r.t. S,
– ū is the characteristic tuple of x̄ w.r.t. S, and
– for the representative ȳ of x̄ w.r.t. S it holds that R_{t;ū}(ȳ).
The disjunction of the formulas ψ_{t;ū}(x̄), for all t ∈ types_m and all ū ∈ {0, 1}^m,
gives us the desired formula ϕ_R(x̄) which expresses that x̄ ∈ R.
We now want to show the converse of Lemma 2, i.e., we want to show
that the ω-representation of A is first-order definable in A. Up to now the
ω-representation repS (A) was parameterized by a set S which is sufficient for
defining A. For the current step we need the existence of a canonical, first-order
definable set S. For this canonization we can use the following result of Grädel
and Kreutzer [5, Lemma 8]:
Lemma 3 (Canonical set sufficient for defining R). Let R ⊆ ℝ^m be
ω-representable and let S_R be the set of all elements s ∈ ℝ which satisfy the
following condition (∗):
There are a_1, …, a_m, ε ∈ ℝ, ε > 0, such that one of the following holds:
– For all s′ ∈ int(s−ε, s) and for no s′ ∈ int(s, s+ε) we have R(a⃗[s/s′]).
  Here a⃗[s/s′] means that all components a_j = s are replaced by s′.
– For no s′ ∈ int(s−ε, s) and for all s′ ∈ int(s, s+ε) we have R(a⃗[s/s′]).
– R(a⃗[s/s′]) holds for all s′ ∈ int(s−ε, s+ε) \ {s}, but not for s′ = s.
– R(a⃗[s/s′]) holds for s′ = s, but not for any s′ ∈ int(s−ε, s+ε) \ {s}.
The following holds true:
(1.) S_R is included in every set S ⊆ ℝ which is sufficient for defining R.
(2.) S_R is sufficient for defining R.
The set S_R is called the canonical set sufficient for defining R. It is
straightforward to formulate a FO(R, <)-formula ζ_R(x) which expresses condition (∗),
such that S_R = {s ∈ ℝ : ⟨ℝ, R, <⟩ ⊨ ζ_R(s)} for every ω-representable m-ary
relation R.
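For intuition, condition (∗) can be checked by hand in a very restricted setting. The following is a minimal sketch, assuming m = 1 and a relation given as a finite union of intervals; it is not the general formula ζ_R, and the interval encoding and all function names are hypothetical. For m = 1 only the choice a_1 = s matters, so the four cases reduce to comparing membership just left of s, at s, and just right of s.

```python
# A unary relation R is a finite list of intervals
# (lo, hi, lo_closed, hi_closed); a single point p is (p, p, True, True).
# This encoding is an assumption made for illustration only.


def holds_at(R, s):
    """True iff s itself belongs to R."""
    return any(lo < s < hi or (s == lo and lc) or (s == hi and hc)
               for lo, hi, lc, hc in R)


def holds_left_of(R, s):
    """True iff R holds on all of (s - eps, s) for small eps > 0.

    With finitely many intervals, R otherwise fails on all of (s - eps, s).
    """
    return any(lo < s <= hi for lo, hi, _, _ in R)


def holds_right_of(R, s):
    """True iff R holds on all of (s, s + eps) for small eps > 0."""
    return any(lo <= s < hi for lo, hi, _, _ in R)


def satisfies_star(R, s):
    """The four cases of condition (*) for m = 1 and a_1 = s."""
    left, at, right = holds_left_of(R, s), holds_at(R, s), holds_right_of(R, s)
    return ((left and not right)                 # case 1
            or (not left and right)              # case 2
            or (left and right and not at)       # case 3
            or (at and not left and not right))  # case 4


def canonical_set(R):
    """Only interval endpoints can satisfy (*), so test exactly those."""
    endpoints = {p for lo, hi, _, _ in R for p in (lo, hi)}
    return sorted(s for s in endpoints if satisfies_star(R, s))


# (0,1) ∪ (1,2] ∪ {3} ∪ [4,5) ∪ [5,6]
R_example = [(0, 1, False, False), (1, 2, False, True), (3, 3, True, True),
             (4, 5, True, False), (5, 6, True, True)]
print(canonical_set(R_example))
```

In this example the point 5 is an endpoint of two intervals but does not satisfy (∗), because membership does not change in any neighbourhood of 5; the point 1 does satisfy it, since it is a single missing point between (0,1) and (1,2].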
Definition 6 (Canonical Representation of a Structure). Let τ be a signature
and let A be an ω-representable ⟨ℝ, τ⟩-structure. The set
S_A := {c^A : c ∈ τ} ∪ ⋃_{R∈τ} S_{R^A}
is called the canonical set sufficient for defining A. Similarly, the representation
canrep(A) := rep_{S_A}(A) is called the canonical representation of A.
Remark 2. It is straightforward to see that α(canrep(A)) = canrep(α(A)) is
true for every ω-representable ⟨ℝ, τ⟩-structure A and every order-automorphism
α of ℝ.
We are now ready to prove the converse of Lemma 2.
Lemma 4. There is a FO-interpretation Φ′ of τ′ in τ ∪ {<}, where τ′ denotes the
type extension of τ, such that Φ′(⟨A, <⟩) = canrep(A), for every ω-representable
⟨ℝ, τ⟩-structure A.
Proof (sketch). For every constant symbol c ∈ τ we define ϕ_c(x) := x=c.
For every relation symbol R ∈ τ let ζ_R(x) be the formula from Lemma 3
describing the canonical set sufficient for defining R^A. Obviously, the formula
ϕ_S(x) := ⋁_{c∈τ} x=c ∨ ⋁_{R∈τ} ζ_R(x)
describes the canonical set sufficient for defining A.
For every relation symbol R_{t;u} of the type extension τ′ of τ of arity, say, m we
construct a formula ϕ_{R_{t;u}}(y⃗) which expresses that y⃗ ∈ R_{t;u}. We make use of
Definition 2(b). I.e., ϕ_{R_{t;u}} states that y_1, …, y_m satisfy ϕ_S and that there
is some x⃗ such that
– y⃗ is the representative of x⃗ w.r.t. S_A,
– R(x⃗),
– x⃗ has type t w.r.t. S_A, and
– u is the characteristic tuple of x⃗ w.r.t. S_A.
It is straightforward to formalize this in first-order logic.
8 The Main Theorems and Their Proofs
We first show the
Main Theorem 2. Let ⟨ℝ, <, ···⟩ be an extension of ⟨ℝ, <⟩ with arbitrary
additional predicates. If first-order logic has the natural order-generic collapse
over ⟨ℝ, <, ···⟩ for the class of ω-structures, then it has the natural order-generic
collapse over ⟨ℝ, <, ···⟩ for the class of ω-representable structures.
Proof. Let τ be a signature, let ϕ be a FO(τ, <, ···)-sentence, and let K be
the class of all ω-representable ⟨ℝ, τ⟩-structures on which ϕ is order-generic. We
need to find a FO(τ, <)-sentence ψ such that “⟨A, <, ···⟩ ⊨ ϕ iff ⟨A, <⟩ ⊨ ψ”
is valid for all A ∈ K.
Let τ′ be the type extension of τ. We first make use of Lemma 2: Let Φ
be the FO-interpretation of τ in τ′ ∪ {<} which is obtained in Lemma 2. In
particular, we have Φ(⟨canrep(A), <⟩) = A, for all A ∈ K. From Lemma 1
we obtain a FO(τ′, <, ···)-sentence ϕ′ such that “⟨canrep(A), <, ···⟩ ⊨ ϕ′ iff
⟨Φ(⟨canrep(A), <⟩), <, ···⟩ ⊨ ϕ iff ⟨A, <, ···⟩ ⊨ ϕ” is true for all A ∈ K.
From our assumption we know that first-order logic has the natural order-generic
collapse over ⟨ℝ, <, ···⟩ for the class of ω-structures. Of course canrep(A)
is an ω-structure. Furthermore, with Remark 2 we obtain that ϕ′ is order-generic
on canrep(A) for all A ∈ K.
Hence there must be a FO(τ′, <)-sentence ψ′ such that “⟨canrep(A),
<, ···⟩ ⊨ ϕ′ iff ⟨canrep(A), <⟩ ⊨ ψ′” is true for all A ∈ K.
We now make use of Lemma 4: Let Φ′ be the FO-interpretation of τ′ in τ ∪ {<}
which is obtained in Lemma 4. In particular, we have Φ′(⟨A, <⟩) = canrep(A), for
all A ∈ K. According to Lemma 1, we can transform ψ′ into a FO(τ, <)-sentence
ψ such that “⟨A, <⟩ ⊨ ψ iff ⟨Φ′(⟨A, <⟩), <⟩ ⊨ ψ′ iff ⟨canrep(A), <⟩ ⊨ ψ′”
is true for all A ∈ K. Obviously, ψ is the desired sentence, and hence our proof
is complete.
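The three steps of this proof can be collected into one chain of equivalences, valid for every A ∈ K. The display below is only a summary of the argument above; the primes on ϕ′ and ψ′ (the intermediate sentences over the type extension of τ) are notation assumed here for readability.

```latex
\begin{align*}
\langle \mathcal{A}, <, \cdots \rangle \models \varphi
  &\iff \langle \mathrm{canrep}(\mathcal{A}), <, \cdots \rangle \models \varphi'
  && \text{(Lemma 2 and Lemma 1)} \\
  &\iff \langle \mathrm{canrep}(\mathcal{A}), < \rangle \models \psi'
  && \text{(collapse for $\omega$-structures, using Remark 2)} \\
  &\iff \langle \mathcal{A}, < \rangle \models \psi
  && \text{(Lemma 4 and Lemma 1)}
\end{align*}
```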
Main Theorem 2 and Corollary 1 directly give us the following
Main Theorem 1. First-order logic has the natural order-generic collapse over
⟨ℝ, <, +⟩ for the class of ω-representable structures.
9 Conclusion
We have developed the notion of ω-representable databases, which is a natural
generalization of the notion of finitely representable (i.e. dense order constraint)
databases. We have shown that any collapse result for ω-databases can be lifted
to the analogous collapse result for ω-representable databases. In particular, this
implies that first-order logic has the natural order-generic collapse over ⟨ℝ, <, +⟩
and ⟨ℚ, <, +⟩ for ω-representable databases.
Recursive Databases. In theoretical computer science one is often interested
in things that can be represented in the finite. This is not a priori true for
ω-representable databases. However, there is a line of research considering recursive structures (cf. [6]). In this setting a database is called recursive if there
is, for each of its relations, an algorithm which effectively decides whether or not
an input tuple belongs to that relation. The results of the present paper are, in
particular, true for the class of ω-representable recursive databases, which still is
a rather natural extension of the class of finitely representable (i.e. dense order
constraint) databases.
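As a small illustration of this notion, here is a minimal sketch of a membership decider for one unary relation, R = ⋃_{n∈ℕ} (2n, 2n+1). The relation, the function name in_relation, and the restriction to rational inputs are assumptions made for this example only. R consists of infinitely many intervals whose endpoints are the natural numbers, which appears to fit the intended notion of ω-representability, yet membership is decidable by a few arithmetic steps.

```python
import math
from fractions import Fraction


def in_relation(x: Fraction) -> bool:
    """Decide whether x lies in R = union of the open intervals (2n, 2n+1).

    Inputs are exact rationals; this restriction is an assumption of the
    sketch, since arbitrary reals cannot be passed to an algorithm exactly.
    """
    if x <= 0:
        return False
    n = math.floor(x)  # largest integer n with n <= x
    # x is in some (2n, 2n+1) iff its floor is even and x is not an integer.
    return n % 2 == 0 and x != n


print(in_relation(Fraction(5, 2)))  # 2.5 lies in (2, 3)
print(in_relation(Fraction(3, 2)))  # 1.5 lies in no interval (2n, 2n+1)
```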
Open Questions. An obvious question is whether the collapse results discussed in
the present paper also hold for ℤ-databases (i.e. databases whose active domain
is of type at most ℤ) and for ℤ-representable databases. It should be
straightforward to transform the proof of Main Theorem 2 in such a way that it is valid
for these databases. However, we do not know whether the corresponding analogue of
Corollary 1 is valid.
Another question is whether such a collapse result for ω-representable
databases is valid also over structures other than ⟨ℝ, <, +⟩ and ⟨ℚ, <, +⟩. E.g.:
Is it valid over ⟨ℝ, <, +, ×⟩, or even over all (quasi) o-minimal structures? (This
would then fully generalize the results of Belegradek et al. [1].)
We also want to mention a potential application concerning topological
queries: Kuijpers and Van den Bussche [7] used the theorem of Benedikt et
al. [2] to obtain a collapse result for topological first-order definable queries. One
step of their proof was to encode spatial databases (of a certain kind) by finite
databases, to which the result of [2] can be applied. Here the question arises
whether there is an interesting class of spatial databases that can be encoded
by ω-representable (but not by finite) databases in such a way that our main
theorem helps to obtain some collapse result for topological queries.
Acknowledgements
I want to thank Luc Segoufin for pointing out to me the connection to topological
queries. Furthermore, I thank Clemens Lautemann for helpful discussions on the
topics of this paper.
References
1. O.V. Belegradek, A.P. Stolboushkin, and M.A. Taitslin. Extended order-generic
queries. Annals of Pure and Applied Logic, 97:85–125, 1999.
2. M. Benedikt, G. Dong, L. Libkin, and L. Wong. Relational expressive power of
constraint query languages. Journal of the ACM, 45:1–34, 1998.
3. M. Benedikt and L. Libkin. Expressive power: The finite case. In G. Kuper,
L. Libkin, and J. Paredaens, editors, Constraint Databases, pages 55–87. Springer,
2000.
4. H.D. Ebbinghaus and J. Flum. Finite Model Theory. Springer, 1999.
5. E. Grädel and S. Kreutzer. Descriptive complexity theory for constraint databases.
In Proc. CSL 1999, volume 1683 of Lecture Notes in Computer Science, pages 67–81.
Springer, 1999.
6. D. Harel. Towards a theory of recursive structures. In Proc. MFCS 1998, volume
1450 of Lecture Notes in Computer Science, pages 36–53. Springer, 1998.
7. B. Kuijpers and J. Van den Bussche. On capturing first-order topological properties
of planar spatial databases. In Proc. ICDT 1999, volume 1540 of Lecture Notes in
Computer Science, pages 187–198. Springer, 1999.
8. C. Lautemann and N. Schweikardt. An Ehrenfeucht-Fraïssé approach to collapse
results for first-order queries over embedded databases. In Proc. STACS 2001,
volume 2010 of Lecture Notes in Computer Science, pages 455–466. Springer, 2001.
9. L. Segoufin and V. Vianu. Querying spatial databases via topological invariants.
Journal of Computer and System Sciences, 61(2):270–301, 2000.