IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 49, NO. 5, MAY 2011
Entropy-Balanced Bitmap Tree for Shape-Based
Object Retrieval From Large-Scale Satellite
Imagery Databases
Grant J. Scott, Member, IEEE, Matthew N. Klaric, Student Member, IEEE,
Curt H. Davis, Fellow, IEEE, and Chi-Ren Shyu, Senior Member, IEEE
Abstract—In this paper, we present a novel indexing structure that was developed to efficiently and accurately perform
content-based shape retrieval of objects from a large-scale satellite
imagery database. Our geospatial information retrieval and indexing system, GeoIRIS, contains 45 GB of high-resolution satellite
imagery. Objects of multiple scales are automatically extracted
from satellite imagery and then encoded into a bitmap shape
representation. This shape encoding compresses the total size of
the shape descriptors to approximately 0.34% of the imagery
database size. We have developed the entropy-balanced bitmap
(EBB) tree, which exploits the probabilistic nature of bit values
in automatically derived shape classes. The efficiency of the shape
representation coupled with the EBB tree allows us to index
approximately 1.3 million objects for fast content-based retrieval
of objects by shape.
Index Terms—Content-based retrieval, image databases,
knowledge-based indexing, object indexing, remote sensing.
I. INTRODUCTION
AS THE volume of remote-sensing earth imagery continues
to increase, automated processes must be developed and
refined, which can eliminate the requirement of a human-in-the-loop for creating large-scale searchable image repositories.
Content-based image retrieval (CBIR) is an increasingly popular retrieval method for large-scale image databases. CBIR
queries are not performed in a traditional relational database
management system (RDBMS) of image metadata, e.g., sensor, location, or time, but instead use features extracted from
image content to search. Traditionally, descriptive features are
extracted to represent various discriminating properties of the
image content. These features may represent global properties,
e.g., color and texture, or collective localized features, e.g.,
the shape and color of segmented objects or the texture of
partitioned image regions. Numerous CBIR systems have been
reported in the literature, e.g., Query by Image Content (QBIC)
[1], VisualSeek [2], Photobook [3], and PicToSeek [4]. In
[5], Gevers and Smeulders offer a comprehensive overview of
CBIR. In [6], Lew et al. provide a review of the state of the art
in CBIR.
Manuscript received July 7, 2009; revised November 30, 2009 and
May 18, 2010; accepted August 22, 2010. Date of publication December 17,
2010; date of current version April 22, 2011. This work was supported in part
by the National Geospatial-Intelligence Agency University Research Initiatives
(NURI) under Grant HM1582-04-1-2028 and by the U.S. National Science
Foundation under Grant IIS-0812515.
G. J. Scott, M. N. Klaric, and C. H. Davis are with the Center for Geospatial
Intelligence, University of Missouri, Columbia, MO 65211-0001 USA.
C.-R. Shyu is with the Informatics Institute, University of Missouri,
Columbia, MO 65211 USA.
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TGRS.2010.2088404
In the remote-sensing domain, there are relevant contributions that focus on content-based retrieval and, oftentimes, image information mining. A notable contribution that
has explored content-based retrieval of satellite imagery is
the knowledge-based information mining (KIM) system by
Datcu et al. [7]. With regard to CBIR, KIM exploits Landsat
Thematic Mapper (TM), as reported in [8]. Li and Narayanan,
in [9], used Land Cover and Land Use thematic maps as
supervised training of support vector machines over the spectral
information of an image. They also exploit Gabor wavelets for
textural feature extraction to capture spatial information from
an image.
A necessity for developing a successful CBIR system is the
extraction of discriminant features to describe the images in
the database. As such, the development of feature extraction
algorithms has dominated the literature in the field. These
fundamental features are often assembled to model higher
level human visual perception for CBIR, where the ultimate
goal is to retrieve visually similar images. In addition, there
exists a solid foundation of literature on the extraction and
modeling of spatial components of imagery, e.g., objects or
natural divisions (e.g., the horizon of a landscape photo or a
foreground person). Shape analysis and retrieval have emerged
as particularly important topics in CBIR, because visual knowledge is often related to shape characteristics of objects. In this
paper, we are primarily concerned with object shape retrieval
from large-scale remote-sensing imagery databases. Hence, we
will focus on a shape feature set generated using objects that
were automatically extracted from a large collection of satellite imagery. Two promising approaches for automatic shape
extraction from large-scale satellite imagery databases include
techniques based on transforms (e.g., Fourier [10] or wavelet
[11]) and morphology [12]. For the reported research herein,
we employ the latter approach, as described in Section II.
There exists a plethora of research with regard to object
shape feature extraction. Traditionally, shapes are conceptualized in the literature using a few broad categories as follows:
1) contours; 2) regions; or 3) skeletons. Contour representations
of shapes are typically outlines found with edge detectors and
other similar image processing techniques. Recently, scale-space methods [13] have become very popular as shape descriptors. Using this approach, an object contour is continually
smoothed by increasing Gaussian filters, building a hierarchy
of salient inflection points [14]. In [15], Avrithis et al. utilize
Fourier transforms and curve moments to build invariant curve
representations for further feature extraction. Kunttu et al.
present a method for encoding both color and shape information
using intensity Fourier [10]. Other methods have been developed for nonrigid shapes, e.g., multiscale convexity concavity
by Adamek and O’Connor [16]. Comparative literature, such
as [17], provides a retrieval performance review of various
contour-based descriptors using a standard data set, including
curvature scale space, wavelet encoded contours, and visual
correspondence. Other comparative literature includes Zhang
and Lu’s [18] extensive review of Fourier, scale space, Zernike
moments, and grid descriptors.
Object skeleton-based features are somewhat rare in the
literature. Skeletons can be derived using morphological image processing techniques or variations such as medial axis
[19] or shock graphs [20]. In the CBIR of skeleton-encoded
shape features, searching is equivalent to graph matching or
computing transformation steps to achieve the second graph
from the first graph. The complexity of ranking skeletons with
these methods limits the efficiency of retrieval performance,
because the similarity must be computed between the query and
numerous candidates. Despite the computational cost, skeletal
methods have been shown to be particularly robust with regard
to object occlusion.
Other shape descriptors include edge histograms combined
with Fourier transforms as in [21], which exploit statistical
information of the shape. Minimum bounding circle [22] and
convex hull approaches rely on finding a circular or convex
region to encompass the shape prior to feature extraction. The
feature extraction then processes the object shape by also examining the regions not populated by the intersection of shape
and bounding object.
Along with the breadth of object shape feature spaces,
there exists a healthy quantity of the literature focused on
measuring similarity of shapes in the aforementioned feature
spaces. Popular approaches depend, to a degree, on the feature
space. In scale-space methods, the predominant methods involve finding inflection point correspondence between objects.
Some approaches measure the similarity through deformation/
transformation steps to achieve the second shape from the first
(e.g., [23]), or the second skeleton from the first [20]. Other
approaches combine local and global invariants for computing
similarity [24]. Utilizing local invariants is key to maintaining
adequate retrieval of objects that are subject to occlusion.
In CBIR, it is desirable to provide the results of the query
in a similarity-ordered set. For this reason, CBIR is often
cast as a problem of finding nearest neighbors in the feature
space defined by the chosen object descriptor. Although there
exist methods such as pruning to eliminate segments of the
database, the most efficient approaches use indexing schemes
to access the feature space [25]. Various indexing schemes have
been reported in the literature, including the containment tree
for topological image structure [26], EBS k-D tree for high-dimensional feature spaces [27], and sparse distributed memory
structures for properties generated from principal component
analysis (PCA) in [28]. In [29], Liu et al. construct a separate
1-D index for each feature in the feature set. Through their
search algorithm, this method has the benefit of quickly returning empty sets if no objects are within the desired similarity
radius. However, the algorithm generates a candidate point
set in each dimension, requiring results to be merged into a
final result set. In high-dimensional feature spaces, the number
of candidate point sets may inhibit performance. In addition,
dense feature spaces will compound this problem, because the
candidate point lists will substantially increase.
Note that the CBIR literature rarely has feature descriptors
tightly coupled with indexing structures. To create a truly
scalable system for CBIR, one should substantially increase
the database size without equivalent retrieval performance decreases. This paper directly addresses this problem. We have
designed an indexing structure, the entropy-balanced bitmap
(EBB) tree, which is particularly suited to our chosen shape
descriptor. Existing RDBMS indexing mechanisms are not
suitable for our shape encoding data. In addition, common
space/data partitioning indexing extensions for RDBMS are ill
suited for this high-dimensional data. We explored the suitability of metric index approaches and found them inadequate for
our data collection. By using a shape descriptor that provides
a small fixed encoding size and developing a tightly coupled
indexing and retrieval structure, we have developed a scalable
approach for content-based retrieval of objects using shape. Our
object shape database consists of 1.3 million objects, yet we
can return thousands of the most similar ranked shapes in a few
seconds.
The remainder of this paper is organized as follows. In
Section II, we explain our automatic object extraction, shape
encoding, and data clustering as the index preprocessing steps.
Section III describes the theoretical basis of the EBB tree, along
with relevant algorithm details. Our experimental methods and
results are detailed in Section IV. We conclude with discussion
in Section V.
II. OBJECT EXTRACTION AND PREPROCESSING
We have developed an extensive geospatial imagery retrieval
system, GeoIRIS [30], which employs numerous retrieval techniques. Currently, our image database contains 45 GB of high-resolution orthorectified, georeferenced commercial satellite
imagery. This imagery is five banded—0.6–1.0 m panchromatic and 2.4–4.0 m multispectral—with each band having an
11-b effective range. One of the latest extensions to GeoIRIS
employs a scale-invariant shape descriptor to retrieve objects
from the database. This paper is focused on retrieval applications with geospatial awareness; as such, a typical interaction
for our system might be to submit an object as the query,
along with geospatial constraints. For example, “Given a query
image containing a baseball diamond, find all similar baseball
diamonds in the database that are within 2 km of a radio
broadcast tower.” With this goal in mind, we must efficiently
perform object-based retrievals from our database as the first
step to incorporate other geospatial knowledge. The extraction
and encoding of object shapes is described in the following
sections. Note that we also store object spectral information and
principal axis length for use in complex object queries.
A. Multiscale Object Extraction and Shape
Representation in Bitmaps
Our automatic object extraction algorithms for high-resolution satellite imagery [31] exploit the differential morphological profile (DMP) [12] to facilitate the processing of
large quantities of imagery and efficiently discover objects.
One of the current challenges in any automatic object indexing
process is to extract the relevant objects from the imagery. In
small image databases, edge detectors and segmentation are viable object location strategies. Our database of high-resolution
satellite imagery contains numerous large scenes, with a total
coverage of 3994 km$^2$. For a collection of satellite imagery
of substantial scale, traditional object extraction methods are
inefficient.
Manual extraction of objects from an imagery collection
of this scale is infeasible; as such, automated processes are
necessary. To accomplish this difficult task, we process the
scenes using the DMP on the panchromatic channel of the imagery. The DMP is a multiscale segmentation algorithm, which
exploits contrast edges in imagery. Using geodesic morphology
by reconstruction, objects that are lighter or darker than their
surrounding image content generate a response in the DMP. The
intensity of the DMP is correlated to the difference in
contrast between the object and its surroundings. The resulting extractions are homogeneous regions, each representing an object. The
interested reader should refer to [12] for a detailed presentation
of the DMP. The DMP produces a set of scaled contrast
responses, referred to as DMP levels. Level m represents the
possible objects detected using a geodesic disk of size $r_m$,
which were not detected with radius $r_{m-1}$. Each level in the
DMP represents objects extracted after the transition from one
geodesic scale to the next. During the processing of the DMP,
we utilized a normalized difference vegetation index (NDVI) to
filter out nonanthropogenic objects.
Because the resulting objects are anthropogenic structures
extracted from imagery using DMP responses, we use region-based (solid) objects instead of applying additional processing
to generate contours or skeletons. Therefore, we focused our
approach on the region-based subset of all available shape
descriptors. Seeking the most efficient method to represent
region-based shapes and still have adequate descriptive power
to identify general shapes, we chose grid descriptors [32].
Grid descriptors are effectively a sampling of an object shape
into a matrix of fixed size. These grids provide natural scale
invariance. In addition, immediately prior to sampling the object into the grid, we align the principal axis of the extracted
object to the middle horizontal grid axis. Given a fixed-size
grid that represents a shape, it is natural to represent the grid
as a simple bitmap. In GeoIRIS, we used $32^2$ b, representing a
1024-D bitmap space, for an encoding size of 128 B per
shape. Early empirical analysis revealed that this size of bitmap
provided a good balance of retrieval performance and shape discrimination. Fig. 1 shows two example chips from our database
imagery, followed by a relevant level of the DMP, and finally
the resulting bitmap-encoded shape.
Fig. 1. Grid descriptor of extracted objects. The object exists in the original
image and can be extracted using the DMP. (a) and (b) Two regions of original
imagery. (c) and (d) DMP-extracted objects from (a) and (b), respectively.
(e) and (f) Encoding of the lighter extracted objects in each of (c) and (d); the
ones represent bits set on.
This object-encoding scheme has some substantial benefits
that we exploit. First, for our large-scale image database with
1.3 million extracted objects, all objects are encoded in less than
160 MB or 0.34% of the original 45 GB. Therefore, continuing
to scale our database will not be limited by an increase in
the number of encoded shapes. Second, an indexing scheme
has been developed, which allows efficient ranked retrievals
and exploits bit operations, instead of floating-point operations,
during search and ranking. This indexing structure is developed
in Section III, including algorithms for induction and search.
Finally, revisiting Fig. 1(e) and (f), we can see that the encoding
is very intuitive. In our GeoIRIS database, we have a mixture of
scenes from urban, suburban, and rural areas of the world. On
average, we have 325 encoded objects per square kilometer.
Because the balance of land cover type varies in the imagery,
the number of extracted objects will necessarily vary. For
example, as the proportion of urban versus rural area increases,
the number of objects can be expected to increase per square
kilometer. This condition will have an effect on the portion
of the original database size that is needed for object shape
representation.
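The storage figures quoted above can be checked with a short calculation. The following is a minimal sketch under stated assumptions: the object count is taken as exactly 1.3 million, MB and GB are read as binary units, and the row-major bit layout in the hypothetical helper `pack_grid` is our own choice rather than the GeoIRIS on-disk format.

```python
SIDE = 32                             # grid edge length reported for GeoIRIS
BITS_PER_SHAPE = SIDE * SIDE          # 1024 bits per shape
BYTES_PER_SHAPE = BITS_PER_SHAPE // 8


def pack_grid(grid):
    """Pack a SIDE x SIDE grid of 0/1 cells into bytes.

    Bit k = row * SIDE + col, counted from the top left across rows.
    This layout is an assumption made for illustration.
    """
    value = 0
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if cell:
                value |= 1 << (r * SIDE + c)
    return value.to_bytes(BYTES_PER_SHAPE, "little")


# Assumed exact figures: 1.3 million objects and 45 GiB of imagery.
n_objects = 1_300_000
total_mib = n_objects * BYTES_PER_SHAPE / 2**20
fraction = total_mib / (45 * 1024)

print(BYTES_PER_SHAPE, round(total_mib, 1), round(100 * fraction, 2))
```

Under these assumptions the descriptors occupy just under 160 MiB, which agrees with the reported share of roughly 0.34% of the imagery database.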
B. Bitmap Dissimilarity Measure
One important factor of the indexing and retrieval scheme
is the choice of an appropriate dissimilarity metric. We utilize
one dissimilarity metric for initially clustering data, ranking the
bitmap results, and building the priority queues for leaf traversal. We experimented with numerous dissimilarity measures,
which rely on measuring the number of bits that vary between
two bitmaps. We have
$$d(B_1, B_2) = |B_1 \;\mathrm{XOR}\; B_2| \tag{1}$$
with XOR representing the bitwise exclusive OR operation. In
(1) and the following discussions, we borrow from mathematical set notation and use |B| to represent the count of bits on
in a bitmap for equations (as well as the more general count of
elements in a set) and refer to this value as the cardinality of a
bitmap in the text.
When measuring the dissimilarity of two shape-representation bitmaps, not all bits need to be treated equally. As
expected, various bits in different objects can have significantly
different relevance to the object shape. We therefore evaluated
weighted bit dissimilarities using
$$d_{wt}(B_1, B_2) = \sum_{k=1}^{K} (B_1[k] \;\mathrm{XOR}\; B_2[k]) \cdot wt[k] \tag{2}$$
where wt[k] is the weight assigned to bit k. In these dissimilarity measures, each bit that differs contributes to the dissimilarity
by the amount of its weight. In (2), the significant step is
assigning weights that accentuate shape differences.
As described in Section II, all objects are aligned to the
x-axis and centered at the y-dimension of the bitmap. This
alignment, coupled with the scaling of all objects into a fixed
bitmap size, implies that every object will horizontally span
the center of the bitmap. Therefore, the bits along this center
axis contribute less to the shape information than the top and
bottom regions. With this condition in mind, we assume that
bits farther from the center horizontal are more important in
describing the object shape when they are set on. Our initial
experimental bit-weighting approach used the square root of the
absolute y-offset from the center x-axis as
$$\mathrm{y\_off}[k] = \sqrt{|\mathrm{y\_offset}|}. \tag{3}$$
Because we use an even number of bits for the edges, the two
rows that form the center horizontal axis are counted as an offset
of one. Therefore, with our chosen bitmap size of 32 × 32, (3) has
a range of [1, 4].
Empirical analysis revealed that bit-weighting schemes
based on the y-offset alone were not sufficient. Therefore,
we borrowed ideas from established document retrieval techniques, i.e., the concept of inverse document frequency [33].
We defined the inverse bit frequency, ibf, of bit[k] to be
$$ibf[k] = \log \frac{\text{DB population size}}{\text{DB objects with } bit[k] = 1}. \tag{4}$$
However, ibf alone proved insufficient for some key objects
in our satellite imagery domain. This measure of bit relevance
drove the weights near the center horizontal axis bits to effectively zero. Our final bit-weighting scheme is to use the
following combination of (3) and (4):
$$wt[k] = \max\{ibf[k],\; \mathrm{y\_off}[k]\}. \tag{5}$$
During our analysis, to determine an appropriate bit weighting for dissimilarity measures, we examined the effects on
different types of shapes. Fig. 6 has a sample of the variety
of shapes that may be found in our database. Airplanes are
representative of complex shapes, L-shaped buildings are representative of shapes with a few significant concavities, and
baseball fields are representative of shapes that combine linear
and curvature to form shapes that are difficult to distinguish
from purely curve-based shapes without bit weighting.
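The dissimilarity measures (1) and (2) and the weights (3)-(5) can be sketched with bitmaps held as Python integers. This is an illustrative sketch under stated assumptions, not the GeoIRIS implementation: bit k is taken as row-major from the top left, the logarithm in (4) is assumed to be natural, and a bit that is never on is given an ibf of 0, a case the text does not address. All function names are ours.

```python
import math
from typing import List, Sequence

SIDE = 32  # bitmap edge length; K = SIDE * SIDE bits


def y_off(k: int, side: int = SIDE) -> float:
    """Equation (3): square root of the row offset from the center axis."""
    row = k // side
    half = side // 2
    # The two center rows (half - 1 and half) both count as offset 1.
    offset = half - row if row < half else row - half + 1
    return math.sqrt(offset)


def ibf(bitmaps: Sequence[int], k: int) -> float:
    """Equation (4): log(population size / objects with bit k on)."""
    on = sum((b >> k) & 1 for b in bitmaps)
    # Assumption: a bit that is never on gets 0.0 instead of an undefined log.
    return math.log(len(bitmaps) / on) if on else 0.0


def weights(bitmaps: Sequence[int], side: int = SIDE) -> List[float]:
    """Equation (5): per-bit maximum of ibf and y_off."""
    return [max(ibf(bitmaps, k), y_off(k, side)) for k in range(side * side)]


def d(b1: int, b2: int) -> int:
    """Equation (1): number of differing bits."""
    return bin(b1 ^ b2).count("1")


def d_wt(b1: int, b2: int, wt: Sequence[float]) -> float:
    """Equation (2): sum of the weights of the differing bits."""
    diff = b1 ^ b2
    return sum(w for k, w in enumerate(wt) if (diff >> k) & 1)
```

Because both measures reduce to an XOR followed by a count or a table lookup, ranking involves no floating-point work beyond summing precomputed weights.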
C. Clustering Multiscale Objects in Bitmap Space
In CBIR applications, it is generally expected that similar
objects are close together in the feature space. In addition,
we do not expect the feature space to be uniformly saturated
with objects. Instead, we expect that similar database objects
will tend to form high-dimensional clouds in the feature space.
In effect, this is a necessary requirement of successful
feature extraction algorithms: low intraclass variance and high
interclass spread. With this requirement in mind, it is beneficial to apply clustering algorithms to automatically discover
and label these dense regions. Clustering techniques are well
established in pattern recognition and data mining. The complexity of clustering algorithms is heavily influenced by the
size of the database, both in the dimensionality and number of
objects. In-depth discussion of feature extraction philosophies
and clustering techniques can be found in [34] and [35]. Once
clusters are discovered in the database, the statistical properties
of these clusters can be exploited to create efficient and accurate
indexing of the feature space. The EBB induction algorithms
rely on these clusters.
To prepare our database for indexing, we adapted the density-based spatial clustering of applications with noise (DBSCAN)
[36] clustering algorithm for use with a large collection of
bitmaps. We developed a sampling-based clustering approach
that typically uses two passes through the database. This property makes the approach particularly attractive for large-scale feature
sets, e.g., the objects extracted from our satellite imagery. To
generate clusters, we measure dissimilarity between any two
bitmaps using (2). After clustering, each object belongs to a
cluster of objects that are similar in bitmap space.
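For readers unfamiliar with DBSCAN, the basic algorithm can be sketched over integer-encoded bitmaps. This is a textbook version and not the sampling-based variant described above, whose details are not given here. For brevity it uses the unweighted bit count of (1) instead of (2), and the names `hamming`, `dbscan`, `eps`, and `min_pts` are our own.

```python
from collections import deque
from typing import List, Optional, Sequence

NOISE = -1


def hamming(b1: int, b2: int) -> int:
    """Unweighted dissimilarity: number of differing bits."""
    return bin(b1 ^ b2).count("1")


def dbscan(bitmaps: Sequence[int], eps: int, min_pts: int) -> List[Optional[int]]:
    """Label each bitmap with a cluster id, or NOISE if it is an outlier."""
    n = len(bitmaps)
    labels: List[Optional[int]] = [None] * n

    def neighbors(i: int) -> List[int]:
        # Brute-force neighborhood query; includes the point itself.
        return [j for j in range(n) if hamming(bitmaps[i], bitmaps[j]) <= eps]

    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:     # border point: join, but do not expand
                labels[j] = cluster
                continue
            if labels[j] is not None:  # already assigned to a cluster
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:   # core point: keep expanding
                queue.extend(reachable)
        cluster += 1
    return labels


shapes = [0b0000, 0b0001, 0b0011,
          0b111100000000, 0b111100000001, 0b111100000011,
          0b1010101010101010]
labels = dbscan(shapes, eps=1, min_pts=2)
```

The brute-force neighborhood query is quadratic in the number of bitmaps, which is one reason a sampling-based adaptation is attractive at the scale reported here.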
Fig. 2. Example EBB tree. Circular nodes are decisions in the search path, determined by maximizing (6), and square nodes are the leaves that contain the
bitmap population, which exists in the nodes at the various stages of induction, concluding with the leaves. The leaf nodes are labeled with their class population
for clarity.
III. EBB INDEXING
Bitmap indexing has many uses in retrieval and databases.
In traditional RDBMS, bitmap indexes are utilized to partition
relations into a relatively small number of disjoint sets using
single attributes (e.g., gender). Another common usage is the
bitmap index for term–document correlation in information
retrieval (IR) schemes [33]. In IR, the bitmaps are typically
documents of a collection, and individual bits represent the
presence of a term in the document. In our current context, we
are dealing with bitmaps that represent object shapes as a binary
grid, using 32 × 32 b. If we attempt to grow a full bitmap index
that covers all of the bitmap space, we would need $2^{1024} - 1$
internal nodes to accommodate $2^{1024}$ leaves. What is required
is a significantly smaller index. In the current discussion, we are
dealing with a large collection of bitmaps, clustered in sets of
naturally occurring groups of similar bitmaps. Our approach,
the EBB tree, exploits these groupings found in the bitmap
space to efficiently index the object shapes with a much smaller
tree than would be necessary to cover the entire space. For
retrieval efficiency, these clusters are further divided into a large
number of leaves that contain a small group of very similar
bitmaps. Furthermore, to accommodate large results sets, the
leaves are linked together in priority queues for leaf navigation.
For GeoIRIS, our bitmap index has 27 005 leaves, with an
average leaf population of 47.49 and an average search depth
of 14.72.
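The reported index statistics can be cross-checked with a few lines of arithmetic. The comparison with the depth of a perfectly balanced binary tree is our own observation, not a claim made in the text.

```python
import math

leaves = 27_005              # reported number of leaves
avg_leaf_population = 47.49  # reported average bitmaps per leaf
avg_search_depth = 14.72     # reported average search depth

# Leaves times average population should recover the object count.
indexed_objects = leaves * avg_leaf_population

# Depth of a perfectly balanced binary tree with the same number of leaves.
balanced_depth = math.log2(leaves)

print(round(indexed_objects), round(balanced_depth, 2))
```

The product is close to the 1.3 million objects quoted earlier, and the reported average depth is very close to the balanced-tree depth, which suggests that the induced tree is nearly balanced.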
A. EBB Tree Induction
In previous work, the entropy-balanced statistical k-D tree
[27], [37] was used to exploit knowledge about classes or
groupings in a feature space when indexing continuous multidimensional feature sets. The motivation is to increase retrieval precision by lowering the entropy while simultaneously
reducing the imbalance of the tree. Using statistical analysis of
clustered or ground-truth labeled data, we exploit the statistical
properties of clusters to induce an entropy-balanced tree that
decreases the entropy from parent to child nodes. Statistical
entropy, as defined by Shannon, is a measure of the randomness
or variability of data [38]. Therefore, induction should seek
to minimize leaf entropy, ensuring that leaf contents have a
high degree of similarity. One desirable trait is to not greedily
sacrifice the entropy of one node to lower the entropy of its
sibling. The result is an efficient indexing structure, where
searches reach leaves of low entropy, implying more certainty
that the leaf contents are similar to the query.
Fig. 2 illustrates the general concepts of EBB induction.
Note that the bitmaps represented are 16 b in size, and in the
following discussion, the bit positions start at zero in the top
left and count across the bitmap rows. A traditional bitmap index
that covers this bitmap space would require 65 536 leaf nodes
and 65 535 internal nodes, a total tree size of $2^{17} - 1$. At the
root level, there exists a large collection of bitmaps organized
into five classes, represented by the five grids. If the induction
algorithm determines the decision bit as k = 12, then the three
bitmaps with that bit on will be in the right child (A,C,D). The
two bitmaps with that bit off (B,E) will be pushed into the left
child. At the second level, the initial left child may determine
the best decision bit as k = 8, thereby splitting the two classes
into the left (E) and right (B) child nodes. The root node’s right
child determines the next split to be k = 1, creating a single
class (D) leaf as its right child and a two-class node as its left
child (A,C). Finally, the internal node at the third level will use
bit k = 0 to separate its two classes into the final leaves. This
resulting EBB tree will allow navigation to the various classes
using two to three bit comparisons.
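The navigation just described can be sketched as a small binary decision tree over 16-b bitmaps. The tree shape follows the walk-through of Fig. 2, but the class bitmaps are hypothetical, and placing class A on the bit-on side of the k = 0 decision is an assumption, since the text does not say which side each class takes.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass
class Leaf:
    label: str


@dataclass
class Decision:
    bit: int
    left: "Node"    # followed when the decision bit is off
    right: "Node"   # followed when the decision bit is on


Node = Union[Leaf, Decision]


def bitmap(*on_bits: int) -> int:
    """Build an integer bitmap with the given bit positions set on."""
    value = 0
    for k in on_bits:
        value |= 1 << k
    return value


def search(node: Node, query: int) -> Tuple[str, int]:
    """Walk to a leaf; return its label and the number of bit comparisons."""
    comparisons = 0
    while isinstance(node, Decision):
        comparisons += 1
        node = node.right if (query >> node.bit) & 1 else node.left
    return node.label, comparisons


# Decision bits 12, 8, 1, and 0 as in the Fig. 2 walk-through.
FIG2_TREE = Decision(
    12,
    left=Decision(8, left=Leaf("E"), right=Leaf("B")),
    right=Decision(
        1,
        left=Decision(0, left=Leaf("C"), right=Leaf("A")),  # A on bit-on side: assumed
        right=Leaf("D"),
    ),
)

# Hypothetical class prototypes consistent with those decisions.
PROTOTYPES = {
    "A": bitmap(0, 12),
    "B": bitmap(8),
    "C": bitmap(5, 12),
    "D": bitmap(1, 12),
    "E": bitmap(5),
}

results = {name: search(FIG2_TREE, bits) for name, bits in PROTOTYPES.items()}
```

Every prototype reaches its own leaf in two or three comparisons, matching the search cost stated above.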
If we used a greedy maximum entropy reduction approach,
without the balancing effect, we would have a tendency to make
decisions that position leaves at higher levels in the tree. For
example, again considering the root node in Fig. 2, a greedy
decision may use bit k = 0 as the root node decision, creating a
class-A leaf at level 1. This step would be followed by a series
of similar decisions, resulting in a tree with leaves at levels 1, 2,
3, and two leaves at level 4. The resulting searches would take
from one to four comparisons. Projecting this behavior out to
much larger databases, with larger bitmaps, more classes, and
less crisp class-bit probabilities accentuates the variability in
search efficiencies. Another issue may arise when examining
why a greedy decision will split off leaves higher in the tree,
i.e., it sacrifices the entropy of one subtree for the gain in the
other subtree. Another downside to this type of split decision
is the effect of creating numerous low-probability high-entropy
leaves. These leaves would be sufficient in a classifier, but in an
indexing system that expects retrievals to require leaf traversals,
this approach can result in traversing through numerous high-entropy leaves. Fig. 2 represents a simplification of the data for
illustration, where the classes have only 0 and 1 as bit probabilities. Actual data are significantly more complex, including
more classes, more bits, and bit probabilities in classes that
range between 0.0 and 1.0.
The a priori class-based bit probabilities are a key part of the
exploitation of database knowledge during index induction. In
any particular leaf, during the induction of an index, we must
estimate the probability of some $Class_i$ within the portion of
the bitmap space that the leaf occupies. These classes represent
our discovered clusters of bitmaps. The clustered bitmaps could
be represented as prototype vectors of floating-point numbers,
but in reality, statistics developed from these vectors have
little significance in the binary space. Therefore, we use the
probabilities of bits being on or off for a given $Class_i$ in
some bitmap space covered by some $Leaf_j$. We approximate
this value by examining the members of each $Class_i$ in the
leaf and tracking the occurrence of on and off bits for each
bit position. The use of these approximations is discussed as
follows in the context of computing the conditional probability
of $Leaf_j$, given $Class_i$.
The EBB tree is designed for very large collections of
bitmaps, which lend themselves to exploiting the probabilistic
tendencies of data. One critical design issue is the development
of the split decision objective function, which can properly
exploit these class-based probabilities. Our desire is to induce
an index with a collection of low-entropy leaves. The result is
then a collection of leaves of objects in the bitmap space, where
each leaf represents a group of similar bitmaps. Therefore, we
desire a decision criterion that allows the recursive induction
algorithm to balance the reduction of entropy between each set
of sibling subtrees whenever a split decision is made. This condition ensures that the entropy of one subtree is not sacrificed
for the sake of the other subtree. The decision criterion is the
bit k that maximizes
$$\gamma = H_{parent} - \sigma_{H_R} - \sigma_{H_L} - |\sigma_{H_R} - \sigma_{H_L}|. \tag{6}$$
$H_{parent}$ is the entropy of the parent node, and $\sigma_{H_R}$ and $\sigma_{H_L}$
are the weighted sum components of the right and left children,
respectively, where $\sigma_H = P(Leaf_j) H(Leaf_j)$. The first three
terms on the right-hand side of (6) represent the reduction of
entropy. The terms in the absolute value represent the balancing
factor. We also constrain the node splitting to require at least a
minimal entropy reduction from parent to children, such as a
percentile decrease. We calculate the entropy of any $Leaf_j$ as
$$H(Leaf_j) = -\sum_{i=1}^{L} P(Class_i | Leaf_j) \log P(Class_i | Leaf_j) \tag{7}$$
where L is the number of classes that exist in $Leaf_j$.
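The effect of the balancing term in (6) can be seen in a small numeric sketch. The function names are ours, the logarithm in (7) is assumed to be base 2 (the text does not state the base), and the entropy and probability values in the two candidate splits are invented for illustration.

```python
import math
from typing import Sequence


def entropy(posteriors: Sequence[float]) -> float:
    """Equation (7): entropy of a leaf from its class posteriors."""
    # Assumption: base-2 logarithm; zero-probability classes contribute nothing.
    return -sum(p * math.log2(p) for p in posteriors if p > 0)


def split_score(h_parent: float,
                p_left: float, h_left: float,
                p_right: float, h_right: float) -> float:
    """Equation (6): entropy reduction minus the balancing penalty."""
    s_left = p_left * h_left      # weighted entropy of the left child
    s_right = p_right * h_right   # weighted entropy of the right child
    return h_parent - s_right - s_left - abs(s_right - s_left)


# Two hypothetical candidate bits for a parent with entropy 2.0.
balanced = split_score(2.0, p_left=0.5, h_left=1.0, p_right=0.5, h_right=1.0)
lopsided = split_score(2.0, p_left=0.1, h_left=0.0, p_right=0.9, h_right=1.0)
```

The lopsided candidate leaves less total weighted entropy in the children (0.9 versus 1.0), yet it scores lower because the absolute-value term penalizes placing nearly all of the remaining entropy in one subtree. Induction would evaluate such a score for each candidate bit k and keep the maximum.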
To calculate the entropy of a leaf in a high-dimensional
bitmap space, we define basic probabilities over a bitmap
database as a foundation. Given a database D composed of
a set of disjoint classes, we define the a priori probability of
$Class_i$ as
$$P(Class_i) = \frac{|Class_i|}{|D|} \tag{8}$$
where $|Class_i|$ and $|D|$ are the size of $Class_i$ and the database,
respectively.
To capture the aforementioned point-based probabilities,
we calculate the a priori probabilities of each bit k in each
class i as
$$P(\mathrm{Class}_{i,k=0}) = \frac{\#\,\text{off bits}}{|\mathrm{Class}_i|} \tag{9}$$
$$P(\mathrm{Class}_{i,k=1}) = \frac{\#\,\text{on bits}}{|\mathrm{Class}_i|} \tag{10}$$
which represent the probability of Classi with bit k set off or
on, respectively. To maintain an approximation of the class bit
probabilities during tree induction, we calculate the probability
of Classi’s bit k in Nodej using
$$P(\mathrm{Class}_{i,k,j}) = \begin{cases} 1.0 & \text{if on and off bits} \\ P(\mathrm{Class}_{i,k=0}) & \text{if off bits} \\ P(\mathrm{Class}_{i,k=1}) & \text{if on bits} \end{cases} \tag{11}$$
Equation (11) examines Classi’s bit variations in some particular Leafj. If the class has bitmaps with k both off and on, then
the probability of reaching Leafj by a search with a Classi
bitmap is approximated as 1.0. If Classi has only off or on
bits, this approximation is taken from (9) or (10), respectively.
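The probabilities in (8)-(11) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: bitmaps are stored as Python integers with bit k read as `(b >> k) & 1`, the function names are hypothetical, and the set of class members in a node is assumed to be nonempty.

```python
def class_prior(class_size, db_size):
    """Equation (8): a priori probability of a class."""
    return class_size / db_size

def bit_priors(class_bitmaps, k):
    """Equations (9) and (10): fraction of class members with bit k off and on."""
    on = sum((b >> k) & 1 for b in class_bitmaps)
    n = len(class_bitmaps)
    return (n - on) / n, on / n

def node_bit_probability(class_bitmaps, members_in_node, k):
    """Equation (11): approximate P(Class_{i,k,j}) for the class members in a node."""
    p_off, p_on = bit_priors(class_bitmaps, k)
    values = {(b >> k) & 1 for b in members_in_node}  # assumes a nonempty node
    if values == {0, 1}:
        return 1.0            # both bit values occur in the node
    return p_on if values == {1} else p_off

# Toy class of four 3-bit bitmaps in a database of eight objects.
toy_class = [0b011, 0b001, 0b101, 0b001]
prior = class_prior(len(toy_class), 8)
off_on_bit1 = bit_priors(toy_class, 1)
only_off = node_bit_probability(toy_class, [0b001, 0b001], 1)
mixed = node_bit_probability(toy_class, [0b011, 0b001], 1)
```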
The probability of Leafj , given Classi in a bitmap of size K,
is calculated as
$$P(\mathrm{Leaf}_j \mid \mathrm{Class}_i) = \prod_{k=1}^{K} P(\mathrm{Class}_{i,k,j}) \tag{12}$$
This approach allows us to calculate the probability of Classi ,
given Leafj , using Bayes’ theorem as
$$P(\mathrm{Class}_i \mid \mathrm{Leaf}_j) = \frac{P(\mathrm{Leaf}_j \mid \mathrm{Class}_i)\,P(\mathrm{Class}_i)}{P(\mathrm{Leaf}_j)} \tag{13}$$
where the probability of Leafj is the number of database objects in Leafj over the size of the database. Using (13), we can
calculate the entropy (7) and thereby make the desired induction
decisions to maximize (6). The EBB has a notable property
related to the maximum height of the index. In particular, the
maximum search depth of any path to an EBB leaf is K + 1,
where K is the size of the bitmaps. As can be observed from
Algorithm 1, a specific bit k can only be used a single time in
one path. This condition limits the number of decision nodes to
K and, therefore, the maximum search depth.
Fig. 3. Two split decisions based on a three class example, with the result tree shown on the right. The grids shown are populated with the probability of on bits
in each class, as defined by (10). Bit numbers start at 0 and fill rows to the right, ending at 8 in the bottom right corner. L and R are the left and right children of a
possible split, respectively.
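To make the chain from (12) to (13) concrete, the short Python sketch below computes the class posterior for one class in one leaf. The function names and the toy numbers are assumptions chosen for illustration; the per-bit probabilities are taken as already computed with (11).

```python
import math

def leaf_given_class(bit_probs):
    """Equation (12): product over the K bits of P(Class_{i,k,j})."""
    return math.prod(bit_probs)

def class_given_leaf(bit_probs, class_prior, leaf_size, db_size):
    """Equation (13): Bayes' theorem with P(Leaf_j) = |Leaf_j| / |D|."""
    p_leaf = leaf_size / db_size
    return leaf_given_class(bit_probs) * class_prior / p_leaf

# Toy case: a class of 4 objects in a database of 8 (prior 0.5). The leaf holds
# the 3 class members whose bit 1 is off, so the per-bit values are 1.0, 0.75, 1.0.
posterior = class_given_leaf([1.0, 0.75, 1.0], 0.5, leaf_size=3, db_size=8)
```

Here the leaf contains only members of the one class, and the approximation returns a posterior of 1.0, which gives zero entropy in (7).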
The EBB tree is built with a recursive decision tree induction
algorithm as detailed in Algorithm 1, SplitNode. Initially, the
entire database D is evaluated as a root node R, and a decision
bitmap is created, dcsn, with all bits set off. SplitNode is
then called with the root node and the blank decision bitmap.
For each off bit in the dcsn bitmap, the current node is divided
into two candidate child nodes. Each database object with the
current bit off is assigned to the left child; if the bit is set on, objects
are assigned to the right child. Then, the split objective function
(6) is evaluated, possibly updating the current best decision
γmax and setting the decision bit kmax. For a node to split, the
entropy must be reduced from the parent to its children by some
threshold. When a split bit is determined, that bit is set on in
dcsn. After storing kmax in the current node as the decision
bit, SplitNode is called for both the left and right children,
each using the new dcsn bitmap. The dcsn bitmap is passed
to the recursive calls of the children to allow the induction to
accelerate, because those bits must never be evaluated again. If
some Nodej uses bit k as the decision bit, then all bitmaps in
the left subtree will have the bit k off, and all bitmaps in the
right subtree will have the bit set on. Therefore, the maximum
height of the EBB is the number of bits in the bitmap, as is the
case in traditional bitmap index schemes.
Fig. 3 provides a three-class example to illustrate the behavior of Algorithm 1 for making split decisions. On the left side
are class representations of the probabilities of bit k being on,
i.e., (10). In the middle table are two decision levels, the root
node, Decision 1, and the root node’s right child as Decision
2. The bit index is the first column, followed by the probability
and entropy of each child for a possible split at that bit. Finally,
the decision value of (6) is shown in the last table column. The
last part in Fig. 3 is the resulting tree produced from the table.
For the computations in Fig. 3, P (Class1 ) = P (Class2 ) =
P (Class3 ) = 0.333. Note that bits 0, 2, 6, and 8 never need
evaluation, because the entire population has those bits set
off. The root decision has four bits that need to be evaluated:
1) bit 1; 2) bit 4; 3) bit 5; and 4) bit 7. Bits 5 and 7 are equivalent
but have different permutations of the class-to-child distribution
generated by bit 1. Bit 1 effectively partitions off class 3 into the
left child, leaving classes 1 and 2 in the right. Bit 4 partitions
class 1 between the left and right nodes, raising the entropy of
the right node. The root node’s right child decision bits that
must be evaluated are 4 and 5. Bit 4 will split class 1 between
the left and right children, increasing the entropy of the right,
because it will also have all of class 2. Bit 5, as a decision bit,
partitions the node into class-homogenous leaves, which is the
optimal split.
Algorithm 1. SplitNode(N, dcsn): Recursive node-splitting
algorithm for inducing the EBB tree. Parameters include the
node N and previous decisions dcsn.
1: Calculate and store (7) for N;
2: Initialize decision parameter γmax;
3: for all bits k such that (k AND dcsn) ≠ k do
4:   Partition N into leftN and rightN, using bit k;
5:   Calculate σHR and σHL;
6:   Calculate γk using (6);
7:   if γk > γmax then
8:     γmax ← γk;
9:     kmax ← k;
10:  end if
11: end for
12: if Suitable Decision Found then
13:   Create LChild and RChild with appropriate data;
14:   dcsn ← dcsn OR kmax;
15:   Store kmax as the decision bit of this node, N.k;
16:   SplitNode(LChild, dcsn);
17:   SplitNode(RChild, dcsn);
18: end if
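The recursion in Algorithm 1 can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: it assumes every object carries a class label and uses the empirical label proportions in a node as a stand-in for the approximations (11)-(13), stores bitmaps as integers, and uses a hypothetical `min_gain` parameter for the entropy-reduction threshold.

```python
import math
from collections import Counter

def label_entropy(items):
    """Entropy of the class labels in a node (empirical stand-in for (7))."""
    counts = Counter(label for _, label in items)
    n = len(items)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def split_node(items, dcsn, num_bits, db_size, min_gain=1e-9):
    """Recursively induce a tree; items are (bitmap_int, class_label) pairs."""
    node = {"items": items, "bit": None, "left": None, "right": None}
    h_parent = label_entropy(items)
    best_gamma, best = -math.inf, None
    for k in range(num_bits):
        if (dcsn >> k) & 1:            # bit already used on this path
            continue
        left = [it for it in items if not (it[0] >> k) & 1]
        right = [it for it in items if (it[0] >> k) & 1]
        if not left or not right:      # bit does not separate anything
            continue
        s_l = len(left) / db_size * label_entropy(left)
        s_r = len(right) / db_size * label_entropy(right)
        gamma = h_parent - s_r - s_l - abs(s_r - s_l)      # objective (6)
        if gamma > best_gamma and h_parent - s_l - s_r >= min_gain:
            best_gamma, best = gamma, (k, left, right)
    if best is not None:
        k, left, right = best
        node["bit"] = k
        used = dcsn | (1 << k)         # mark bit k as decided for both subtrees
        node["left"] = split_node(left, used, num_bits, db_size, min_gain)
        node["right"] = split_node(right, used, num_bits, db_size, min_gain)
    return node

# Toy database: three classes of 3-bit bitmaps; bit 2 varies inside every class.
data = [(0b011, "a"), (0b111, "a"), (0b010, "b"),
        (0b110, "b"), (0b000, "c"), (0b100, "c")]
tree = split_node(data, dcsn=0, num_bits=3, db_size=len(data))
```

On this toy input, the root splits on bit 0 and its left child splits on bit 1, which leaves three class-pure leaves; bit 2 is never chosen because it yields no entropy reduction.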
B. EBB Tree Search and Retrieval
Searching the EBB tree is performed in the following two
steps: 1) search into the bitmap space index and 2) generate ranked results of bitmaps in the leaves using a chosen
metric. During the induction of the index, each node stores
the decision bit k. Once the tree is induced, the leaves are
analyzed to provide efficient nonlinear leaf traversal during
searches. To accommodate the need to traverse the leaves, we
build a neighbor priority queue for each leaf by calculating the
probabilistic prototypes of the leaves (i.e., groups of bitmaps).
The probabilistic prototype is calculated from the probability
that each bit k is on in the current leaf. When the probability of
bit k being on is greater than or equal to 0.5, the prototype bit
is set on. These prototypes are then used to compute neighbor
priority queues based on leaf prototype similarity on a per-leaf
basis. The generation of the leaf priority queues is $O(m^2 q)$,
where m is the number of leaves generated in the tree, and q is
the desired priority queue size.
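A minimal Python sketch of the prototype and neighbor-queue construction is given below. It assumes integer bitmaps and, for simplicity, ranks neighbor leaves by the Hamming distance between prototypes; this distance is an assumption for illustration and stands in for whichever similarity metric is chosen. The function names are hypothetical.

```python
def leaf_prototype(bitmaps, num_bits):
    """Set prototype bit k when at least half of the leaf's bitmaps have it on."""
    proto = 0
    for k in range(num_bits):
        on = sum((b >> k) & 1 for b in bitmaps)
        if on / len(bitmaps) >= 0.5:
            proto |= 1 << k
    return proto

def neighbor_queues(prototypes, q):
    """For each leaf, list the q other leaves with the closest prototypes."""
    queues = []
    for i, p in enumerate(prototypes):
        others = [j for j in range(len(prototypes)) if j != i]
        # Hamming distance between prototypes (assumed metric for this sketch).
        others.sort(key=lambda j: bin(p ^ prototypes[j]).count("1"))
        queues.append(others[:q])
    return queues

leaves = [[0b011, 0b111], [0b010, 0b110], [0b000, 0b100]]
prototypes = [leaf_prototype(leaf, 3) for leaf in leaves]
queues = neighbor_queues(prototypes, q=2)
```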
Given a query bitmap B, the navigation down the index
is a series of simple bitwise operations. A search into the
feature space for S results is performed following the recursive
Algorithm 2. The search with bitmap B starts in the root
node, specifying the desired result size S. In each internal node N,
steps 1–6 facilitate the recursive tree navigation. At each internal node, a decision bit k was stored during induction. If the
query bitmap has that bit set on, the search continues in the
right subtree of the current node N ; otherwise, it continues in
the left subtree.
Algorithm 2. Search(N, B, S): EBB tree searching in node N
for S results from population D partitioned into leaves L using
query bitmap B.
1: if N is not a leaf then
2:   if (B AND k) = k then
3:     Return Search(N.RightChild, B, S);
4:   else
5:     Return Search(N.LeftChild, B, S);
6:   end if
7: else
8:   Rank destination leaf's bitmaps in order of similarity into result R;
9:   if |R| < S then
10:    for L ∈ PriorityQueue do
11:      Rank L bitmaps into R;
12:      if |R| ≥ S then
13:        Break;
14:      end if
15:    end for
16:  end if
17: end if
18: Return R;
When a leaf of the tree is reached, the leaf population is
added to the ranked result set R. Oftentimes, a leaf may not
have an adequate amount of data to satisfy the desired result set
size, in which case the search must continue collecting results
from additional leaves. At this point, searches must continue
outward in the bitmap space from the initial leaf. The traversal
of the leaves is codified starting at step 9 in Algorithm 2 when
the current result set size is less than the desired size S. We first
check that more results are needed, which is the expected case.
We then examine the first neighbor in the priority queue, which
is the most similar leaf in the tree, as measured by the chosen
similarity metric. The bitmaps of this neighbor leaf L are added
into the result set R. In an iterative fashion, the size of R is
again checked, and the next neighbor leaf of the priority queue
is processed, if needed. We currently build priority queues for
each leaf that cover a portion of the database population.
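The search procedure described above can be sketched in Python as follows. The dictionary-based tree layout, the Hamming distance used for ranking, and the truncation of the result to S entries are assumptions made for this example; the hand-built tree and queues are toy data.

```python
def hamming(a, b):
    """Number of differing bits (assumed ranking metric for this sketch)."""
    return bin(a ^ b).count("1")

def find_leaf(node, query):
    """Steps 1-6 of Algorithm 2: follow the stored decision bits to a leaf."""
    while node["bit"] is not None:
        on = (query >> node["bit"]) & 1
        node = node["right"] if on else node["left"]
    return node

def search(root, leaves, queues, query, s):
    """Collect the destination leaf, then queued neighbor leaves until s results."""
    leaf = find_leaf(root, query)
    results = list(leaf["bitmaps"])
    for j in queues[leaf["id"]]:
        if len(results) >= s:          # enough results, stop visiting neighbors
            break
        results.extend(leaves[j]["bitmaps"])
    results.sort(key=lambda b: hamming(b, query))
    return results[:s]

# Hand-built toy index: three leaves and their neighbor priority queues.
leaves = [
    {"id": 0, "bit": None, "bitmaps": [0b011, 0b111]},
    {"id": 1, "bit": None, "bitmaps": [0b010, 0b110]},
    {"id": 2, "bit": None, "bitmaps": [0b000, 0b100]},
]
root = {"bit": 0,
        "left": {"bit": 1, "left": leaves[2], "right": leaves[1]},
        "right": leaves[0]}
queues = [[1, 2], [0, 2], [1, 0]]

top3 = search(root, leaves, queues, query=0b111, s=3)
top1 = search(root, leaves, queues, query=0b000, s=1)
```

In the first query, the destination leaf holds only two bitmaps, so the search pulls in the first neighbor leaf from the priority queue before ranking.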
Retrievals are conducted using the EBB tree, which partitions
the feature space into relatively small groups of similar bitmaps.
The extracted bitmap data resides in a dedicated data store and
is used during similarity ranking, whereas the index resides in
search agents. The search agents have a small memory footprint
and can be distributed (i.e., replicated) across a network. The
search agents hold just the navigation portion of the EBB, accessing the priority queues and data inside the data store. When
traversal through the leaf population is required, a leaf’s priority
queue is used for navigation through the data store. From a
practical standpoint, we build priority queues to accommodate
search result sizes that are a portion of the database size |D|.
IV. RESULTS
The evaluation of a large-scale content-based retrieval system is often subjective. Different users may consider the various
visual–perceptual characteristics to be of different importance.
This case is a driving reason for concepts such as relevance
feedback and customizable queries. In large-scale databases
of satellite imagery with automatically extracted objects, it is
infeasible to have the database ground truth labeled. Due to
the subjective nature of content-based retrieval and the large
scale of our database, we provide some example retrievals from
the system, experiments using deformed shapes, and efficiency
evaluations in the remainder of this section.
A. Shape Deformation Effects
We performed experiments to evaluate the effects of increasing levels of bit differences on the retrieval of objects. For these
experiments, we eroded the encoded bitmap shapes and then
used the eroded bitmap as a query into the database. Fig. 4
shows three example imagery objects, followed by the bitmap
encoded shape, and the bitmap eroded by 2% and 5% of the
pixels. The bitmaps were simultaneously eroded from the top
and bottom. As noted in Section III, bits farther along the y-offset from the center x-axis are more heavily weighted than
bits near the center. For this reason, we chose to first erode
the bitmaps for this experiment in the significant weighted bits.
This approach accentuates the effect of the shape change in the
rankings. The first row in Fig. 4 shows our first airplane shape,
and the effect of the erosion is the loss of the wings on the
plane shape. The second row is the L-shaped building, where the
erosion appears to effectively round off the corners while still
preserving the general appearance of the L shape. The last row
is the water treatment pool, which we basically begin to flatten.
Note that the water treatment pools, followed by the baseball
diamonds, had the largest bitmap cardinality. Therefore, the
percentage-based erosion has the most drastic impact on these
shapes. Given that all our objects are automatically extracted
from the imagery, these experiments help demonstrate to what
degree our retrieval ability is affected by bit differences that
result from imperfect extraction algorithms.
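One way to generate such eroded queries is sketched below in Python. The exact erosion procedure is not specified here, so this sketch rests on assumptions: the bitmap is a row-major grid of 0/1 values, on bits are cleared one at a time, alternating between the outermost nonempty top and bottom rows, and the function name is hypothetical.

```python
def erode(grid, fraction):
    """Clear about `fraction` of the on bits, alternating from the top and bottom rows."""
    rows = [row[:] for row in grid]            # work on a copy
    total_on = sum(sum(r) for r in rows)
    to_clear = round(total_on * fraction)
    top, bottom, from_top = 0, len(rows) - 1, True
    while to_clear > 0 and top <= bottom:
        r = top if from_top else bottom
        cols = [c for c, v in enumerate(rows[r]) if v]
        if cols:
            rows[r][cols[0]] = 0               # clear one on bit in the outer row
            to_clear -= 1
            from_top = not from_top            # switch sides after each cleared bit
        elif from_top:
            top += 1                           # top row exhausted, move inward
        else:
            bottom -= 1                        # bottom row exhausted, move inward
    return rows

shape = [[0, 1, 1, 0],
         [1, 1, 1, 1],
         [1, 1, 1, 1],
         [0, 1, 1, 0]]
eroded = erode(shape, 0.25)   # clears 3 of the 12 on bits
```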
Fig. 4. Shape erosion examples: imagery object, extracted and aligned shape, and then erosion at 2% and 5% for the test of shape change on retrieval ranking.
We evaluated the ranking position of the original encoded
shape in the query results, as well as the dissimilarity trend
with increasing erosion. Fig. 6 represents a subset of our
test data, which includes 10 airplanes, 20 baseball diamonds,
10 L-shaped buildings, 10 water treatment pools, and numerous
other shapes. We used 500 test shapes to erode for this experiment, but retrievals were conducted against the full database
of 1.3 million objects. Fig. 5(a) provides a summary of the
erosion of our test data with regard to bit cardinality, related to
percentage. The baseball diamond and airplane averages provide
the bounds of the bit erosion as the highest and lowest bit cardinalities, respectively. The average of all test objects is shown by
the middle trend (All). Fig. 5(b) plots the average dissimilarity
of the eroded shape versus the original encoded shape as the
erosion increases. Note that this trend lacks the linearity of the
bits versus percent erosion in Fig. 5(a) due to the bit-weighting
scheme employed. As expected, the higher cardinality baseball
diamond average increases significantly faster under percentage
erosion due to both the increased number of bits that have
changed and the propensity of those eroded bits to be weighted
higher. Fig. 5(c) shows the average rank of an evaluation object
when the system is queried with an eroded version of the object.
It is observed that the airplanes retain the first rank up
through 5% erosion due to the uniqueness of the shape and, as
can be observed in Fig. 4, the airplane shape is still observable
in the 5% eroded shapes. Baseball diamonds, on the other
hand, quickly drop in average rank by 5% erosion. This result
can be attributed to the fact that a given percentage erosion
in baseball diamonds is more than three times the bits eroded
from an airplane. In addition, as previously emphasized, the bits
eroded are the highest weighted bits, thereby more significantly
affecting the rank of the original object. We also expect that
our database has more objects that loosely resemble baseball
diamonds, e.g., buildings and water treatment pools.
B. Content-Based Object Retrievals
Currently, we support customizable object searches in our
system by separately weighting the effect of the object’s shape
and spectral features, as well as offering size constraints on
the objects. Fig. 6 shows six example objects (top row) used
as queries into GeoIRIS, followed in the next eight rows with
ordered search results. In the clips of the satellite imagery
shown, the relevant objects are bounded by yellow rectangles.
Fig. 5. Shape erosion effects. (a) Trends of the average quantity of bits
eroded from shapes relative to the percentage of erosion. (b) Trend of average
dissimilarity between test objects and their eroded counterpart. (c) Trend in
ranking of the original object when queried with its eroded version.
The first column is a query using an airplane object. The search
was customized to designate a shape-only search, without using
object spectral characteristics and constraining search result objects to a size of 25–115 m. These results demonstrate rotational
insensitivity. In this example, one of the challenging issues of
using the DMP for automatic object extraction emerges. Results
3 and 4 are the same plane, extracted from neighboring levels
of the DMP, but with shape encodings that vary enough to not
be detected as duplicates by our existing algorithms (due to
variations in the extracted shape at different levels). This case
happens also with results 6 and 7. In each of these cases, it is
shown that the extracted shape has variations just enough to
change the dimensions and position of the bounding box.
The second column is a similar query with a different plane.
Note that the query plane from column 1 appears as the second
to the last of the results for column 2. Column 3 is a search
with a single baseball diamond among a complex of diamonds.
In this search, we equally weight the shape with the spectral
characteristics of the object and, furthermore, constrain the size
to a range of 30–80 m. We want to ensure that we get large
areas of dirt that are neither too large nor too small to be a
baseball diamond. Note that, in these results, we have found
four distinct baseball diamond complexes. In addition, in lower
ranked results, grass infield baseball diamonds are discovered
once we exhaust the database of dirt infield diamonds. Column
4 is an L-shaped building that was queried with no spectral
characteristics and a size constraint of larger than 30 m. By
not utilizing spectral characteristics, we can find objects of
various colors. Column 5 represents a water treatment pool.
These dark objects are not perfectly round but, instead, have
a deep and narrow concavity from the light-colored catwalk
(see Fig. 4, row 3). This query is performed with the shape
weighted 75% and the spectral characteristics weighted at 25%,
with the size constrained to 45–60 m. In this query, some of our
results are not our conceptual water pool yet closely match in
terms of shape and spectral characteristics. The final column
results from querying the system with the dark circle on top of
a cooling tower, with 60/40 shape/spectral weighting and the
size constrained to 30–150 m. Of our imagery collection, we
have three cooling towers, two of which are shown as the top
two results. The third cooling tower, which is not returned, has
large amounts of steam occluding the top opening of the tower
in our imagery. The remaining results are similar in extracted
shape and spectral characteristics.
We conducted an additional experiment using a collection
of ground-truth labeled baseball diamonds, L-shaped buildings,
and water treatment pools. Our nonlabeled data consisted of
31 000 objects from a 2-km spatial proximity of 20 baseball
diamonds, 10 L-shaped buildings, and 10 water treatment pools.
For each test, we withheld 10% of the ground-truth objects
for tenfold cross-validation. After building EBB for each test
collection, we queried the remaining data to measure the recall
in the top 50 of the desired test class. Our baseball diamonds
had an average recall of 77.87% in the top 50 results. In addition,
four new baseball diamonds, which were not part of the ground
truth, were discovered in the results. Our L-shaped buildings
exhibited an average recall of 80.48%, and the water treatment
pools exhibited an average recall of 70.04%. In all of these tests,
neither object size nor spectral characteristics were used
to restrict the result set, as they are in typical queries in our GeoIRIS
system. The use of object size would allow for improved recall
in a smaller result set, because the unlabeled data averaged
36.25 m, whereas the minimum size of the ground-truth objects
was 37 m.
Fig. 6. Shape retrieval with EBB. Top row images are query objects (in bounding box), and top ranked results follow.
Fig. 7. Retrieval efficiency. (a) Average number of bitmap dissimilarities
calculated using (2) to retrieve thousands of the results. (b) Average seconds
to retrieve a result set of increasing result set size (thousands). (c) Timing
comparison for 10-nearest neighbor searches of increasing database size for
database versus EBB.
Fig. 8. Content-based retrieval interface using an airplane (left-hand side) that
was automatically extracted from the image database. The large center region
is the fourth ranked object retrieved from the database, shown in context and
marked with the bounding box. The top of the retrieval results are shown on the
right.
C. Efficiency
With regard to query efficiency, the search time of shape-based queries is primarily dependent on the number of results desired and the associated cost of computing the bitmap
dissimilarities. Recall from Section III that we build leaf priority queues to enable the nonlinear navigation in the high-dimensional bitmap space. Our system utilizes priority queues
that have a target object coverage for retrievals of approximately 20 000. We typically retrieve 6000–12 000 results from
the search agent for the client interface. We performed retrieval
efficiency experiments using 800 randomly selected objects in
our database. Each object was used to retrieve result sets of
size 1000 and increasing to 25 000, in steps of 1000. Fig. 7
summarizes the results of these experiments. For each retrieval,
we recorded the time (in seconds) and the number of dissimilarity calculations used. For example, the average number of
bitmap dissimilarities computed, with (2), to retrieve the top
1000 results was less than 1053. This result is 0.081% of the
comparisons that a brute-force search would require in our
database of 1.3 million objects. Fig. 7(a) shows that the number
of dissimilarity computations (solid line) closely follows the
desired result size (plus marked trend). The number of comparisons needed for brute force is 1.3 million for any number of
desired results. As aforementioned, the target object coverage
of our priority queues is 20 000, which causes the dissimilarity
computations to begin to level off after 20 000 desired results
as we exhaust the navigation priority queues. Fig. 7(b) shows
the timing trend (in seconds) for the experiment. The average
retrieval time for our typical query of 6000 shapes was 1.87 s,
and for 12 000, it was 4.57 s. As previously discussed, the
priority queue sizes that we use begin to limit the number of
retrievals possible, and therefore, the timing begins to level
off after 20 000. Note that this priority queue size is simply a
system parameter that can be adjusted to balance the expected
search result sizes against the resources dedicated to managing
priority queues. This level of efficiency for shape retrievals
facilitates integration into complex systems, such as GeoIRIS.
In addition, Fig. 7(c) provides a comparison in timing between
10-nearest neighbor searches against increasing database sizes
for both a traditional RDBMS and the EBB. These results are
average times for 300 test queries as the database size increases.
The notable trend is that search times without the EBB increase
linearly with the size of the database. In contrast, searches with
the EBB are significantly faster, and their times increase logarithmically,
as expected. When considering the two types of efficiency
experiments together, we see that query times are linear with
respect to the desired size of a result set S and are logarithmic
with respect to the number of leaves L. Overall, the efficiency
of the retrieval is $O(S \log_2 L)$. For all these experiments, we
used a dual quad-core (1.6-GHz) server with 8 GB of RAM,
which was running the back-end PostgreSQL database, as well
as the EBB index agents and search clients.
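As a quick check of the 0.081% figure reported in this section, using only the quoted counts of fewer than 1053 dissimilarity computations against a brute-force cost of 1.3 million comparisons:

```latex
\frac{1053}{1.3 \times 10^{6}} \approx 8.1 \times 10^{-4} = 0.081\%
```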
V. CONCLUSION
In our GeoIRIS retrievals, we have not expected to retrieve
objects using shape alone; instead, we have combined the results of shape retrieval with object spectral signatures and, possibly, size constraints. In the future, we expect to include object
textural analysis or other relevant algorithms, with shape-based
object retrieval being a component of a larger solution. Fig. 8
shows the results from our GeoIRIS system when searching
with an automatically extracted airplane. The upper left corner
is a region with the query shape noted by a bounding box. The
larger center region shows the currently selected result in its
larger context, with the result object in a bounding box. This
example shows the fourth ranked result to the query object.
Note that the airplanes are facing in two directions. Given the
alignment steps, we can expect objects, e.g., airplanes, to be
aligned pointing either left or right. To facilitate retrievals with
rotational insensitivity, we augment the retrieval by repeating
the search with a mirrored bitmap of the query. Given our
retrieval efficiency, the search time for 6000 shapes increases
to less than 6 s for a second result set and the subsequent merge
of the result sets.
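The mirrored-query augmentation can be sketched in Python as follows. The sketch assumes a row-major 0/1 grid, a left-to-right flip as the mirroring operation, and result lists of (dissimilarity, object identifier) pairs; the function names and the rule of keeping each object's lower dissimilarity during the merge are illustrative assumptions.

```python
def mirror(grid):
    """Flip a row-major bitmap left to right."""
    return [row[::-1] for row in grid]

def merge_ranked(first, second):
    """Merge two (dissimilarity, object_id) lists, keeping each object's best score."""
    best = {}
    for score, obj in first + second:
        if obj not in best or score < best[obj]:
            best[obj] = score
    return sorted((score, obj) for obj, score in best.items())

query = [[1, 0, 0],
         [1, 1, 0]]
mirrored_query = mirror(query)

# Toy result lists standing in for the two searches (original and mirrored query).
merged = merge_ranked([(0.1, "a"), (0.4, "b")], [(0.2, "b"), (0.3, "c")])
```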
Our future work will continue toward increasing the scalability of the EBB. Research is needed to determine the practical
limits on the scalability of the EBB for today’s hardware
platforms and algorithms to overcome these limitations. We are
also developing algorithms to provide dynamic manipulation of
the EBB with data inserts, deletes, and updates. One of the challenges of dynamic manipulation is maintaining the efficiency
of the tree after periods of manipulation. As the number of
changes to the tree increases, relative to the original database
size, the statistics of the data need to be reevaluated and
the tree possibly entirely rebuilt. In addition, we will explore
applications of the EBB to other domains, possibly domain-specific text retrieval with a limited index term set. Another
possible extension of this paper is the generalization of the EBB
from bitmap space to arbitrary discrete feature spaces.
ACKNOWLEDGMENT
The authors would like to thank DigitalGlobe for providing
QuickBird imagery from the RADII development data set
for use in this paper and the reviewers for their constructive comments, which have significantly helped improve this
manuscript.
REFERENCES
[1] M. Flickner, H. Sawhney, W. Niblack, J. Ashley, Q. Huang, B. Dom,
M. Gorkani, J. Hafner, D. Lee, D. Petkovic, D. Steele, and P. Yanker,
“Query by image and video content: The QBIC system,” Computer,
vol. 28, no. 9, pp. 23–32, Sep. 1995.
[2] J. R. Smith and S.-F. Chang, “Visualseek: A fully automated content-based image query system,” in Proc. 4th ACM Int. Conf. Multimedia,
1996, pp. 87–98.
[3] A. Pentland, R. W. Picard, and S. Sclaroff, “Photobook: Content-based
manipulation of image databases,” Int. J. Comput. Vis., vol. 18, no. 3,
pp. 233–254, Jun. 1996.
[4] T. Gevers and A. W. Smeulders, “PicToSeek: Combining color and shape
invariant features for image retrieval,” IEEE Trans. Image Process., vol. 9,
no. 1, pp. 102–119, Jan. 2000.
[5] T. Gevers and A. W. Smeulders, “Content-based image retrieval: An
overview,” in Emerging Topics in Computer Vision, G. M. S. B. Kang, Ed.
Upper Saddle River, NJ: Prentice-Hall, 2004, ch. 8, pp. 333–384.
[6] M. Lew, N. Sebe, C. Lifi, and R. Jain, “Content-based multimedia information retrieval: State of the art and challenges,” ACM Trans. Multimedia
Comput., Commun., Appl., vol. 2, no. 1, pp. 1–19, Feb. 2006.
[7] M. Datcu, H. Daschiel, A. Pelizzari, M. Quartulli, A. Galoppo,
A. Colapicchioni, M. Pastori, K. Seidel, P. G. Marchetti, and S. D’Elia,
“Information mining in remote sensing image archives: System concepts,”
IEEE Trans. Geosci. Remote Sens., vol. 41, no. 12, pp. 2923–2936,
Dec. 2003.
[8] H. Daschiel and M. Datcu, “Information mining in remote sensing image
archives: System evaluation,” IEEE Trans. Geosci. Remote Sens., vol. 43,
no. 1, pp. 188–199, Jan. 2005.
[9] J. Li and R. M. Narayanan, “Integrated spectral and spatial information
mining in remote sensing imagery,” IEEE Trans. Geosci. Remote Sens.,
vol. 42, no. 3, pp. 673–685, Mar. 2004.
[10] I. Kunttu, L. Lepisto, and J. Rauhamaa, “Fourier-based object description
in defect image retrieval,” Mach. Vis. Appl., vol. 17, no. 4, pp. 211–218,
Sep. 2006.
[11] V. P. Shah, N. H. Younan, S. S. Durbha, and R. L. King, “A systematic
approach to wavelet-decomposition-level selection for image information
mining from geospatial data archives,” IEEE Trans. Geosci. Remote Sens.,
vol. 45, no. 4, pp. 875–878, Apr. 2007.
[12] M. Pesaresi and J. A. Benediktsson, “A new approach for the morphological segmentation of high-resolution satellite imagery,” IEEE Trans.
Geosci. Remote Sens., vol. 39, no. 2, pp. 309–320, Feb. 2001.
[13] J. Weickert, S. Ishikawa, and A. Imiya, “Linear scale-space has first been
proposed in Japan,” J. Math. Imaging Vis., vol. 10, no. 3, pp. 237–252,
May 1999.
[14] F. Mokhtarian and A. K. Mackworth, “A theory of multiscale, curvature-based shape representation for planar curves,” IEEE Trans. Pattern Anal.
Mach. Intell., vol. 14, no. 8, pp. 789–805, Aug. 1992.
[15] Y. Avrithis, Y. Xirouhakis, and S. Kollias, “Affine-invariant curve normalization for object shape representation,” Mach. Vis. Appl., vol. 13, no. 2,
pp. 80–94, Nov. 2001.
[16] T. Adamek and N. E. O’Connor, “A multiscale representation method for
nonrigid shapes with a single closed contour,” IEEE Trans. Circuits Syst.
Video Technol., vol. 14, no. 5, pp. 742–753, May 2004.
[17] L. J. Latecki, R. Lakamper, and U. Eckhardt, “Shape descriptors for
nonrigid shapes with a single closed contour,” in Proc. IEEE Comput.
Soc. Conf. Comput. Vis. Pattern Recog., 2000, vol. 1, pp. 1424–1429.
[18] D. Zhang and G. Lu, “Review of shape representation and description
techniques,” Pattern Recognit., vol. 37, no. 1, pp. 1–19, Jan. 2004.
[19] H. Blum, “A transformation for extracting new descriptors for
shape,” in Models for the Perception of Speech and Visual Forms,
W. Whaten-Dunn, Ed. Cambridge, MA: MIT Press, 1967, pp. 362–380.
[20] T. B. Sebastian, P. N. Klein, and B. B. Kimia, “Recognition of shapes by
editing shock graphs,” in Proc. ICCV, 2001, pp. 755–762.
[21] S. Brandt, J. Laaksonen, and E. Oja, “Statistical shape features for
content-based image retrieval,” J. Math. Imaging Vis., vol. 17, no. 2,
pp. 187–198, Sep. 2002.
[22] M. Safar and C. Shahabi, “MBC-based shape retrieval: Basics, optimizations and open problems,” Multimedia Tools Appl., vol. 29, no. 2, pp. 189–
206, Jun. 2006.
[23] S. Belongie, J. Malik, and J. Puzicha, “Shape matching and object recognition using shape contexts,” IEEE Trans. Pattern Anal. Mach. Intell.,
vol. 24, no. 4, pp. 509–522, Apr. 2002.
[24] E. Rivlin and I. Weiss, “Local invariants for recognition,” IEEE Trans.
Pattern Anal. Mach. Intell., vol. 17, no. 3, pp. 226–238, Mar. 1995.
[25] R. Mehrotra and J. Gary, “Similar-shape retrieval in shape data management,” Computer, vol. 28, no. 9, pp. 57–62, Sep. 1995.
[26] M. Kliot and E. Rivlin, “Invariant-based shape retrieval in pictorial
databases,” Comput. Vis. Image Understanding, vol. 71, no. 2, pp. 182–
197, Aug. 1998.
[27] G. Scott and C.-R. Shyu, “Knowledge-driven multidimensional indexing
structure for biomedical media database retrieval,” IEEE Trans. Inf. Technol. Biomed., vol. 11, no. 3, pp. 320–331, May 2007.
[28] R. Rao and D. Ballard, “Object indexing using an iconic sparse distributed
memory,” in Proc. IEEE Int. Conf. Comput. Vis., 1995, pp. 24–31.
[29] C.-C. Liu, J.-L. Hsu, and A. L. Chen, “Efficient near neighbor searching
using multi-indexes for content-based multimedia data retrieval,” Multimedia Tools Appl., vol. 13, no. 3, pp. 235–254, Mar. 2001.
[30] C.-R. Shyu, M. Klaric, G. J. Scott, A. S. Barb, C. H. Davis, and
K. Palaniappan, “GeoIRIS: Geospatial information retrieval and indexing
system—Content mining, semantics modeling, and complex queries,”
IEEE Trans. Geosci. Remote Sens., vol. 45, no. 4, pp. 839–852, Apr. 2007.
[31] M. Klaric, G. Scott, C.-R. Shyu, and C. Davis, “Automated object extraction through simplification of the differential morphological profile
for high-resolution satellite imagery,” in Proc. IGARSS, 2005, pp. 1265–
1268.
[32] G. Lu and A. Sajjanhar, “Region-based shape representation and similarity measure suitable for content-based image retrieval,” Multimedia Syst.,
vol. 7, no. 2, pp. 165–174, Mar. 1999.
[33] R. A. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval.
Reading, MA: Addison-Wesley, 1999.
[34] K. Fukunaga, Introduction to Statistical Pattern Recognition. New York:
Academic, 1990.
[35] S. Theodoridis and K. Koutroumbas, Pattern Recognition. New York:
Academic, 1999.
[36] M. Ester, H. Kriegel, J. Sander, and X. Xu, “A density-based algorithm
for discovering clusters in large spatial databases with noise,” in Proc.
Int. Conf. Knowl. Discov. Data Mining, 1996, pp. 226–231.
[37] G. Scott and C.-R. Shyu, “EBS k-d tree: An entropy-balanced statistical
k-d tree for image databases with ground-truth labels,” in Proc. Int. Conf.
Image Video Retrieval, vol. 2728, Lecture Notes in Computer Science,
2003, pp. 467–476.
[38] C. E. Shannon, “A mathematical theory of communication,” Bell Syst.
Tech. J., vol. 27, pp. 379–423, Jul.–Oct. 1948.
Grant J. Scott (S’02–M’09) received the B.S. and
M.S. degrees in computer science and the Ph.D. degree in computer engineering and computer science
from the University of Missouri, Columbia, in 2001,
2003, and 2008, respectively.
He currently serves as an Assistant Research
Professor with the Department of Electrical and
Computer Engineering, University of Missouri. He
conducts research as part of the Satellite and Remote
Sensing Group, Center for Geospatial Intelligence
(CGI). During his Ph.D. studies, he was a member
of the Medical and Biological Digital Library Research Laboratory and the
Center for Geospatial Intelligence, University of Missouri, conducting research in high-performance multimedia retrieval systems (databases), hybrid
retrieval systems and protein structural retrieval/comparison engines, and high-resolution satellite image processing. During the course of his M.S. degree, he
was a member of the Computational Intelligence Research Laboratory, with
research emphasis on computational intelligence, pattern recognition, neural
networks, fuzzy systems, image processing/machine vision, and biomedical
image databases. His current research is focused on the automated exploitation
of high-resolution satellite imagery, in particular geospatial database development, imagery feature-extraction algorithm development, and distributed
automatic imagery processing orchestration architectures. His research interests also include high-dimensional indexing and content-based retrieval
in biomedical and geospatial databases, as well as computer vision, pattern
recognition, computational intelligence, databases, parallel/distributed systems,
and information theory in support of media database systems.
Matthew N. Klaric (S’06) received the B.S. (summa
cum laude) degree in computer science from Saint
Louis University, St. Louis, MO, in 2003. He is currently working toward the Ph.D. degree in computer
science at the University of Missouri, Columbia.
In 2004, he was a Research Assistant with the
Medical and Biological Digital Library Research
Laboratory and the Center for Geospatial Intelligence, University of Missouri. In addition, he has
served as an Instructor for several undergraduate
computer science classes. His research interests include geospatial content-based information retrieval, data mining, computer
vision, and pattern recognition.
Mr. Klaric annually reviews papers for the IEEE International Geoscience
and Remote Sensing Symposium (IGARSS).
Curt H. Davis (S’90–M’92–SM’98–F’08) was born
in Kansas City, MO, on October 16, 1964. He received the B.S. and Ph.D. degrees in electrical engineering from the University of Kansas, Lawrence, in
1988 and 1992, respectively.
He is currently the Naka Endowed Professor of
electrical and computer engineering with the University of Missouri, Columbia (MU) and the Director of
the Center for Geospatial Intelligence. His primary
research involves the use of satellite microwave and
optical remote sensing systems for applications in
the areas of earth observation and science, ice sheet mapping and change
detection, and urban area geospatial information processing. His ice sheet
mapping and change detection research has been funded by the National
Aeronautics and Space Administration (NASA) for more than a decade, and
he is an internationally recognized expert in the measurement of polar ice sheet
change using precision satellite altimeters, the influence of climate on these
changes, and the impact of these changes on global sea levels. His urban-area research focuses on the automated processing and development of high-resolution geospatial information products. Examples include high-resolution
digital elevation models, urban land cover maps, automated feature extraction
of anthropogenic features, and automated change detection. His research results
have been documented in more than 45 refereed journal publications and
70 symposia presentations and proceedings. His most significant scientific
results have been published in top scientific journals such as Science, Nature,
and the Journal of Geophysical Research.
Dr. Davis has recently been named an IEEE Fellow for his “contributions
to satellite remote sensing.” He has received numerous awards throughout his
career, including the National Science Foundation (NSF) Antarctica Service
Medal (1988 and 1989), the International Union of Radio Science (URSI)
Young Scientist Award (1996), and the NASA New Investigator Program
(1996–1999). He served as the Technical Program Cochair of the 2004 IEEE
Geoscience and Remote Sensing Symposium held in Anchorage, AK. He is
currently an Associate Editor for the IEEE TRANSACTIONS ON GEOSCIENCE
AND REMOTE SENSING, in which the majority of his technical contributions to
remote sensing have been published.
Chi-Ren Shyu (S’89–M’99–SM’07) received the
M.S.E.E. and Ph.D. degrees in electrical and computer engineering from Purdue University, West
Lafayette, IN, in 1994 and 1999, respectively.
Upon completing one year of postdoctoral training
with Purdue, he joined the Department of Computer
Engineering and Computer Science, University of
Missouri (MU), Columbia, in October 2000. He
is currently the Paul K. and Diane Shumaker Endowed Professor of engineering and heads the MU
Informatics Institute. His research interests include
geospatial image information mining, visual knowledge understanding and
retrievals, and biomedical informatics.
Dr. Shyu is the recipient of the National Science Foundation Faculty Early
Career Development (NSF CAREER) Award, MU College of Engineering
Faculty Research Award, and various teaching awards. He is a member of
the American Association for the Advancement of Science (AAAS) and the
American Medical Informatics Association (AMIA).