Data structures and algorithms
for geometric models 2
Jussi Nikander
9.3.2016
MAA-C2005 Geometric models in engineering
Contents of this lecture
• Topological Queries and Data Structures
– Polygon networks and polygon meshes
– Triangular Irregular Networks
• Windowing Queries and Data Structures
The Purpose of This Lecture
• This lecture gives an overview of some of the theory
associated with data structures and algorithms, as well
as why these matters are important
• We will go through a lot of things quite quickly
• The learning goal of this lecture is to gain a rudimentary
understanding of topological queries and data
structures, as well as windowing
Very Quick Recap of the Previous
Lecture
• External memory, main memory, processor
• Primitive data items, data collections
• Key values and indices
• An algorithm is a sequence of simple commands
• Big Oh and algorithm efficiency
• Basic data structures: array, linked list, tree, graph
• Complex data structures are built using basic data
structures that are modified to fit a specific purpose
• Data structure management algorithms are hidden,
functionality is accessed using public API
Topological Queries and Topological
Data Structures
What is Topology?
• In mathematics, topology is the study of the properties of
space that are preserved under continuous
deformations (topological transformations)
– Connectivity, for example
• This discussion is limited to topologies for surfaces in 2D
and 3D spaces
– And more precisely polygons as well as surfaces built from (or
approximated by) connected polygons
• Connectivity and adjacency relations are extremely
important in many geometric modeling situations
• Here, the structures are called polygon networks
The Polygon Network
• A polygon network consists of a set of polygons (subdivisions
of 2D plane(s) in two or three dimensions) that are connected
through shared edges (line segments delimiting the
subdivisions)
• Polygon networks consist of
– Points that depict line segment endpoints
• Each point is unique, no two points may overlap
– Line segments that are defined by two points
• In some cases some line segments may be half-lines
• Line segments may touch only at line segment endpoints
– Faces defined by two or more line segments
• Minimum of three finite line segments are required to define a finite
face
• Faces may touch only at bounding line segments
Data Structures for Topology
• Polygon network topology can be stored in a data
structure on several levels of detail
• Storage as distinct line segments contains no topology
– So-called line spaghetti
• Storage as distinct polygons contains no polygon
topology
– Shapefiles, for example, store the data in this way (it is faster to
draw on screen)
• For analysis explicit topology is often required
– Analysis is faster
– No errors due to, for example, floating-point imprecision or
source data errors
The Polygon Mesh Model
• Polygon mesh is a 3D model of a surface that consists
of connected polygons
– Often simple polygons, such as in a triangle mesh
• The mesh is typically assumed to model a solid object
– Polygon mesh explicitly includes only the surface of the object
– Volumetric meshes explicitly include the volume
• A mesh is a closed, topological surface
• Meshes – or more complex models generalized from
meshes – are used in many 3D modeling and analysis
tools
Polygon Mesh Example
Source: shapeways.com
The Voronoi and The Delaunay
• Voronoi diagram is a subdivision of a space into subspaces
according to a set of points
– Each subdivision consists of the part of the space closest to one
point in the set
• A Delaunay triangulation divides a surface into triangles whose
vertices are the points of a set, such that the smallest angle in
any triangle is the largest possible for that point set
– It is the ”best” triangulation for a point set for many applications
• These are used in many point set analysis applications
• Delaunay and Voronoi also happen to be each other’s dual
structures
– Each Voronoi face corresponds to Delaunay point, and each
Delaunay triangle to a Voronoi vertex; edges correspond to
neighboring faces / triangles
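The definition of a Voronoi subdivision can be illustrated without building the diagram at all: the cell containing a query point is simply the cell of the nearest site. The sketch below is a brute-force illustration only; the function name `voronoi_cell` and the sample sites are assumptions made for this example, not part of the lecture.

```python
import math

def voronoi_cell(sites, q):
    """Return the index of the site whose Voronoi cell contains point q.

    Brute force: by definition the cell of a site is the part of the
    space that is closer to that site than to any other site.
    """
    return min(range(len(sites)), key=lambda i: math.dist(sites[i], q))

# Hypothetical sample data
sites = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
print(voronoi_cell(sites, (3.0, 1.0)))  # nearest site is (4.0, 0.0), index 1
```

This takes linear time per query, which is why an explicit diagram or a search structure is preferred for large point sets.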
Voronoi and Delaunay Example
Source: wikipedia
Data Structures for Polygon Networks
Doubly Connected Edge List
• A graph-theoretical approach to polygon networks
– Each polygon can be viewed as a face in a graph
• The data structure consists of vertices, faces and edges
– Each edge is divided into two directed half-edges
• compare to directed graph
• The half-edges contain most of the relevant information in the
model
– Half-edges go around a face counterclockwise
– Each half-edge has a twin that goes to the opposite direction
• Vertices know their own position and one of the half-edges
incident to the vertex
• Faces know one of the half-edges surrounding it (as well as
one half-edge for each component surrounded by the face)
Structure of a DCEL
Source: Computational Geometry
Structure of DCEL
• DCEL can be implemented using three 2-dimensional
arrays (matrices)
• One array stores the vertices, one the faces, and one
the half-edges
• The array storing faces needs to be able to store a
theoretically unbounded number of links to half-edges
• Half-edge array contains most of the information
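A minimal sketch of the three DCEL record types, using linked Python objects instead of arrays. All class, field and function names here are assumptions chosen for illustration. As a simplification, the unbounded outer face also stores a single boundary half-edge and holes inside faces are not modeled.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(eq=False)
class Vertex:
    x: float
    y: float
    incident_edge: Optional["HalfEdge"] = None  # one half-edge leaving the vertex

@dataclass(eq=False)
class Face:
    edge: Optional["HalfEdge"] = None  # one half-edge on the face boundary

@dataclass(eq=False)
class HalfEdge:
    origin: Vertex
    twin: Optional["HalfEdge"] = None
    next: Optional["HalfEdge"] = None
    prev: Optional["HalfEdge"] = None
    face: Optional[Face] = None

def build_triangle(a, b, c):
    """Build a DCEL for one counterclockwise triangle; return (inner, outer) faces."""
    verts = [Vertex(*a), Vertex(*b), Vertex(*c)]
    inner, outer = Face(), Face()
    # inside[i] runs v_i -> v_(i+1); its twin outside[i] runs v_(i+1) -> v_i
    inside = [HalfEdge(origin=v, face=inner) for v in verts]
    outside = [HalfEdge(origin=verts[(i + 1) % 3], face=outer) for i in range(3)]
    for i in range(3):
        inside[i].twin, outside[i].twin = outside[i], inside[i]
        inside[i].next, inside[i].prev = inside[(i + 1) % 3], inside[(i - 1) % 3]
        # the outer boundary is traversed in the opposite direction
        outside[i].next, outside[i].prev = outside[(i - 1) % 3], outside[(i + 1) % 3]
        verts[i].incident_edge = inside[i]
    inner.edge, outer.edge = inside[0], outside[0]
    return inner, outer

def face_vertices(face):
    """Walk the next pointers once around a face and collect vertex coordinates."""
    result, e = [], face.edge
    while True:
        result.append((e.origin.x, e.origin.y))
        e = e.next
        if e is face.edge:
            break
    return result

inner, outer = build_triangle((0, 0), (1, 0), (0, 1))
print(face_vertices(inner))
```

Adjacency queries become pointer walks: the face on the other side of an edge is `e.twin.face`, and the boundary of a face is found by following `next` until the starting half-edge comes around again.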
Winged Edge
• Like DCEL, consists of edges, faces (polygons) and
vertices
– Edges are directed, and contain most of the information
– Polygons and vertices just know one edge next to them
• Edges contain most of the important information
– Edges go in one direction
– Edges know incident faces and nodes
– Edges know the previous and next edges around the face they
limit
• Faces and vertices each know one incident edge
– More information may also be included
Structure of a Winged Edge
Source: Okabe et al: Spatial Tessellations
Structure of a Winged Edge
Source: wikipedia
Quad-Edge
• A further development from Winged Edge
– A bit harder to grasp, however
• Contains two important improvements over Winged Edge
– Explicitly models both Voronoi diagram and the Delaunay
Triangulation at the same time
– Very simple operations
• Only two operations required for modifications
Structure of a Quad-Edge
• Depicts an edge
• Implicit links to adjacent
faces and vertices
• Explicit links to
neighboring Quad-edges
Source: www.voronoi.com
Quad-Edge Network
Source: www.voronoi.com
Quad-Edge Operations
• In addition to splice,
makeEdge is used to
create new edges to the
structure
Source: www.voronoi.com
Delaunay Triangulation example: TIN
elevation model
This is also technically a triangle mesh
3D Modeling example: Building on
different levels of detail
Source: Liukkonen: Kuntien paikkatiedon
polku kartasta 3D-mallinnukseen
Computing a TIN (Delaunay
triangulation)
The MAX-MIN Angle Criterion
• A Delaunay triangulation maximises the minimum
angles of the triangles
• The optimality of a triangulation can be measured
using an angle vector
• Angle vector is the list of all angles of the triangulation
sorted in increasing order
– Therefore, angle vectors can be compared lexicographically
• The angle vector of a triangulation can be increased
by flipping the diagonal of a convex quadrilateral when
the other diagonal gives a larger minimum angle
• For a Delaunay triangulation, the angle vector of the
triangulation is optimal
– That is, the angle vectors of all other triangulations of the
same point set are equal or smaller
• Therefore, a Delaunay triangulation can be calculated
by gradually increasing the angle vector by flipping
edges
• Starting from a random triangulation would, however,
be too inefficient
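The angle vector comparison can be tried out on a single convex quadrilateral, which has exactly two triangulations (one per diagonal). The sketch below is illustrative only: the helper names and the sample kite-shaped quadrilateral are assumptions, not taken from the lecture.

```python
import math

def angles(tri):
    """Interior angles (radians) of a triangle given as three (x, y) points."""
    out = []
    for i in range(3):
        p, q, r = tri[i], tri[(i + 1) % 3], tri[(i + 2) % 3]
        v1 = (q[0] - p[0], q[1] - p[1])
        v2 = (r[0] - p[0], r[1] - p[1])
        cos_a = (v1[0] * v2[0] + v1[1] * v2[1]) / (math.hypot(*v1) * math.hypot(*v2))
        out.append(math.acos(max(-1.0, min(1.0, cos_a))))  # clamp rounding noise
    return out

def angle_vector(triangles):
    """All angles of a triangulation, sorted in increasing order."""
    return sorted(a for t in triangles for a in angles(t))

# Hypothetical convex quadrilateral a, b, c, d in counterclockwise order
a, b, c, d = (0, 0), (2, -1), (6, 0), (2, 1)
with_ac = [(a, b, c), (a, c, d)]  # triangulation using diagonal a-c
with_bd = [(a, b, d), (b, c, d)]  # triangulation using diagonal b-d

# Python compares lists lexicographically, which is exactly the ordering needed
print(angle_vector(with_bd) > angle_vector(with_ac))
```

For this quadrilateral the long diagonal a-c produces thin triangles, so flipping it to b-d increases the angle vector.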
The Incremental Algorithm
• Instead of starting with a random triangulation, we can
incrementally add points to the triangulation
• Using this method, we need to do only local
modifications to the triangulation after each insertion
• In order to always have a legal triangulation, we use an
auxiliary “super triangle” that covers the whole point
set
• At each iteration of the algorithm
– We add a new randomly selected point to the triangulation
• This will cause local modifications to the structure
– Ensure that the triangulation stays legal by checking what
modifications the insertion caused
• Expected time efficiency is O(n log n)
• Space efficiency is O(n)
– During algorithm execution a search structure is required in
order to quickly find the triangle inside which the new point is
added
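The lecture does not say how the legality check after an insertion is carried out. One commonly used test, shown here as an assumption rather than as the lecture's method, is the empty circumcircle test: an edge shared by triangles (a, b, c) and (a, c, d) is illegal if d lies inside the circle through a, b and c. The function name and sample points are hypothetical.

```python
def in_circle(a, b, c, d):
    """True if d lies strictly inside the circle through a, b, c.

    Assumes a, b, c are given in counterclockwise order. Uses the sign
    of the standard 3x3 in-circle determinant, with d moved to the origin.
    """
    ax, ay = a[0] - d[0], a[1] - d[1]
    bx, by = b[0] - d[0], b[1] - d[1]
    cx, cy = c[0] - d[0], c[1] - d[1]
    det = ((ax * ax + ay * ay) * (bx * cy - by * cx)
           - (bx * bx + by * by) * (ax * cy - ay * cx)
           + (cx * cx + cy * cy) * (ax * by - ay * bx))
    return det > 0

# Hypothetical quadrilateral a, b, c, d (counterclockwise) split by diagonal a-c
a, b, c, d = (0, 0), (2, -1), (6, 0), (2, 1)
if in_circle(a, b, c, d):
    print("edge a-c is illegal: flip it to b-d")
```

With floating-point coordinates the sign of the determinant can be unreliable for nearly cocircular points, so robust implementations typically use exact or adaptive arithmetic.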
Windowing Queries and Data Structures
Windowing Queries
• A windowing query asks ”what data items overlap with a
given subdivision of the space they reside in”
• Point query is a windowing query with degenerate query
window
– The nature of the point query does, however, allow us to use an
approach different from the windowing query to solve it
• Windowing queries are answered using dictionaries
• The best data structure depends on the type of data, and the
type of queries
– Points vs lines and polygons vs continuous surfaces
– Main memory versus external storage
– Point queries, rectangular windowing queries, arbitrary polygonal
windowing queries, etc.
Subdivisions of the Search Space
• Dictionary structures work by partitioning the search
space into smaller segments intelligently
• There are two main approaches to this
– Partition-by-space: the search space itself is divided into
segments of approximately the same size
– Partition-by-data: data items to be searched are divided into
subsets of approximately the same size
• Both approaches have their advantages and
disadvantages
Partition by Space
• The goal is to divide the space into subsets that are
approximately of the same size
• Since the subsets do not overlap, searches over subsets
never need to cover the same space twice (efficient)
• Line or polygon data items may overlap with several
subdivisions and thus may need to be stored multiple times
(inefficient)
– Points can always be defined to belong to only one subdivision
• If the data distribution in the space is very uneven, dense
areas may require many more subdivisions than sparse
areas (inefficient)
– Locations where dense and sparse subdivisions meet may become
problematic
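A uniform grid is perhaps the simplest partition-by-space structure. The sketch below is an illustration under assumed names (`GridIndex`, `overlaps`, and the sample items are all hypothetical) and approximates each data item by an axis-aligned rectangle `(xmin, ymin, xmax, ymax)`. It shows the duplication described above: an item is stored once per cell it touches.

```python
from collections import defaultdict

def overlaps(a, b):
    """True if two rectangles (xmin, ymin, xmax, ymax) share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

class GridIndex:
    """Partition-by-space index: a uniform grid of square cells."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # (column, row) -> [(name, rect), ...]

    def _cells_for(self, rect):
        xmin, ymin, xmax, ymax = rect
        s = self.cell_size
        for cx in range(int(xmin // s), int(xmax // s) + 1):
            for cy in range(int(ymin // s), int(ymax // s) + 1):
                yield (cx, cy)

    def insert(self, name, rect):
        # An item spanning several cells is stored in every one of them
        for cell in self._cells_for(rect):
            self.cells[cell].append((name, rect))

    def window_query(self, window):
        found = set()  # a set removes duplicates found through several cells
        for cell in self._cells_for(window):
            for name, rect in self.cells.get(cell, []):
                if overlaps(rect, window):
                    found.add(name)
        return found

grid = GridIndex(cell_size=10)
grid.insert("road", (8, 2, 23, 4))    # spans three cells
grid.insert("house", (1, 1, 3, 3))    # fits in one cell
stored = sum(len(items) for items in grid.cells.values())
print(stored, grid.window_query((20, 0, 30, 10)))
```

Two items end up as four stored entries, which is the storage cost of never having to search the same space twice.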
Partition by Space Example: Duplication
Notice the duplication
Partition by Space Example: Uneven distribution
Partition by Data
• The goal is to divide the data items into subsets of
approximately the same size
• Each data item thus needs to be stored only once (efficient)
• Empty parts of the space may not be covered by any
subdivision (efficient)
• The spaces covered by the subdivisions may overlap, and
thus require searches to cover the same area multiple times
(inefficient)
• If the division of data is bad, the empty space covered by
subdivisions may be significant (inefficient)
– Creating good divisions can, however, be a bit complicated
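The trade-off can be seen with a very naive partition-by-data rule: sort the items by the x-coordinate of their centre, cut the sorted list in half, and describe each half by its minimum bounding rectangle (MBR). The splitting rule, function names and sample rectangles are assumptions made for this sketch; real structures use more careful heuristics.

```python
def mbr(rects):
    """Minimum bounding rectangle of a list of (xmin, ymin, xmax, ymax) rectangles."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

def split_by_data(rects):
    """Divide the items into two subsets of (nearly) equal size."""
    ordered = sorted(rects, key=lambda r: (r[0] + r[2]) / 2)
    half = len(ordered) // 2
    return ordered[:half], ordered[half:]

def overlap_area(a, b):
    """Area shared by two rectangles, 0 if they are disjoint."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0) * max(h, 0)

# Hypothetical data items
items = [(0, 0, 2, 2), (1, 3, 7, 4), (5, 0, 6, 1), (8, 2, 9, 3)]
left, right = split_by_data(items)
print(mbr(left), mbr(right), overlap_area(mbr(left), mbr(right)))
```

Every item is stored exactly once, but the two bounding rectangles share an area of 6 units, so a query window falling in that region has to search both subsets.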
Partition by Data Example: Overlap
Notice the overlap
Partition by Data Example: Significant Empty Space
Bad partition of data into two subsets
There are no subdivisions covering this space
Better partition methods could have produced better subdivisions
Subdivisions and Data Structures
• Each subdivision corresponds to some element in a
data structure
• In a tree, the root node covers the whole space
– Each child of the root covers one subdivision
• In general: each tree node covers specific part of the
space, and each child of the node covers a specific
subdivision of that part
Root covers the whole space
Children cover overlapping subdivisions
Finally, there are the data items (in each node)
The Point Query and Trapezoidal Map
• The point query asks inside which polygon(s) the query
point is
– Compare this to the windowing query which asks what polygons
overlap the query area
• These kinds of queries are constantly being evaluated
when the cursor is being moved over a graphical UI
• The trapezoidal map approach is to divide the polygon
network into subdivisions that are easy to search
through
– In essence, instead of storing data items into an index, an index
is built from data item subdivisions
Searchable Subdivisions
• A polygon network is a subdivision of an area
– Unfortunately, since polygons are arbitrary, it is not a very good
subdivision for searching
– Testing whether a point is inside one arbitrary polygon takes time
linear in the number of its edges, so testing every polygon in
turn takes time proportional to the size of the whole network
• For searching the polygons need to be split into trapezoids
– A trapezoid is a convex quadrilateral with at least two parallel sides
– Using trapezoid sides, it is possible to do comparisons such as ”is
the point to the left/right, or above/below certain line”
• A division of a polygon network into trapezoids is called a
trapezoidal map
Trapezoid Map Example
Source: Computational Geometry
Search Structure Using Trapezoidal Map
• A trapezoidal map consists of vertical lines (created for the
map) and non-vertical line segments (part of the original
polygon network)
• For searching, these edges are arranged into a tree-like
search structure
• The structure consists of inner and leaf nodes
• Each inner node represents a vertical line or non-vertical line
segment
– Each node has two child nodes
– Vertical lines divide the area left and right (x-nodes)
– Non-vertical line segments divide the area above and below (y-nodes)
– When a y-node is encountered, the corresponding line segment
spans the possible x-values left in the search space
Search Structure Using Trapezoidal Map
• Leaf nodes represent trapezoids
• There is a root node from which all searches start
• There may be several paths from the root node to a leaf
node
• Thus, the resulting structure is a rooted, directed, acyclic
graph
• In 3D we must also take into account which polygon(s)
are visible from the current viewpoint, and which are
covered by other data items
– Different approach is thus required
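The search through x-nodes and y-nodes can be sketched for the smallest possible case: a single non-vertical segment from p to q, which yields four trapezoids (A to the left of p, B above the segment, C below it, D to the right of q). The class names, the `locate` function and the hand-built structure are assumptions for illustration; constructing the structure for a whole polygon network is not shown. In this tiny case the structure happens to be a tree, whereas with more segments leaves are shared between several paths.

```python
from dataclasses import dataclass

@dataclass
class Leaf:
    trapezoid: str     # label of the trapezoid this leaf represents

@dataclass
class XNode:
    x: float           # x-coordinate of a vertical line
    left: object
    right: object

@dataclass
class YNode:
    p: tuple           # left endpoint of a non-vertical segment
    q: tuple           # right endpoint
    above: object
    below: object

def locate(node, point):
    """Follow the search structure from the root to the trapezoid containing point."""
    while not isinstance(node, Leaf):
        if isinstance(node, XNode):
            node = node.left if point[0] < node.x else node.right
        else:
            (px, py), (qx, qy) = node.p, node.q
            # A positive cross product means the point is above the segment p -> q
            cross = (qx - px) * (point[1] - py) - (qy - py) * (point[0] - px)
            node = node.above if cross > 0 else node.below
    return node.trapezoid

p, q = (1, 1), (5, 3)
root = XNode(p[0], Leaf("A"),
             XNode(q[0], YNode(p, q, Leaf("B"), Leaf("C")), Leaf("D")))

for point in [(0, 2), (3, 3), (3, 1), (6, 0)]:
    print(point, locate(root, point))
```

In this sketch, points lying exactly on the segment are arbitrarily assigned to the trapezoid below it; a full implementation needs an explicit rule for such degenerate cases.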
Search Structure Example
Source: Computational Geometry
Quad-tree
• Divides by area
– Each node covers (part of) the area covered by the
whole tree
– If the area covered by a node has only one data
element in it, the node is a leaf
– If there are more elements, the node has four
children
• Each child covers one quarter of the node’s area
• Basic quad-tree is used for either point data
(point QT) or raster data (area QT)
– Different raster values take the role of objects
Quad-tree management
• Insertion to a quad-tree works by recursively selecting
the leaf covering the data and creating new levels as
required until free leaf is encountered
• Deleting removes nodes when children of a particular
node are all leaves and collectively contain at most one
data item
• Worst case efficiency depends on the tree depth, which
grows with the side length of the covered area relative to
the smallest distance between data items
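Insertion into a point quad-tree can be sketched as follows. The class and method names are assumptions for this example, duplicates are silently ignored, and deletion is left out. Each leaf holds at most one point and splits into four quarters when a second point arrives.

```python
class QuadTree:
    """Point quad-tree node covering the square [x, x+size) x [y, y+size)."""

    def __init__(self, x, y, size):
        self.x, self.y, self.size = x, y, size
        self.point = None      # the single data item of a leaf
        self.children = None   # four sub-squares once the node has been split

    def _child_for(self, p):
        half = self.size / 2
        col = 1 if p[0] >= self.x + half else 0
        row = 1 if p[1] >= self.y + half else 0
        return self.children[2 * row + col]

    def insert(self, p):
        if self.children is None:
            if self.point is None:      # free leaf found
                self.point = p
                return
            if self.point == p:         # identical points could never be separated
                return
            # Occupied leaf: split into four quarters and push the old point down
            half = self.size / 2
            self.children = [QuadTree(self.x + cx * half, self.y + cy * half, half)
                             for cy in (0, 1) for cx in (0, 1)]
            old, self.point = self.point, None
            self._child_for(old).insert(old)
        self._child_for(p).insert(p)

    def depth(self):
        if self.children is None:
            return 0
        return 1 + max(child.depth() for child in self.children)

tree = QuadTree(0, 0, 16)
tree.insert((1, 1))
tree.insert((13, 13))
shallow = tree.depth()
tree.insert((1.5, 1))      # very close to (1, 1)
deep = tree.depth()
print(shallow, deep)
```

Two well separated points need one split, but a third point only half a unit away from an existing one forces five levels, because the squares must shrink from side 16 down to side 0.5 before the two points fall into different leaves.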
Point quad-tree example
Region Quad-tree example
Quad-Tree for Polygons
• The basic quad-tree contains elements that always fit
completely inside one tree node
• Polygons may span the area covered by several quad-tree nodes
• Polygons are approximated by their MBRs
• Each data item is stored in the quad-tree node
corresponding to the smallest subdivision that covers
the MBR of the data item
– Every data item is stored only once!
• One node may contain several data items
Polygon Quad-Tree Example
Source: Design and analysis of spatial data
structures
Quad-Tree for Polygons
• As the number of data items in a node is unbounded, an
index is needed for the items in the node
– Each node needs an auxiliary data structure to index the data
• Any windowing query that overlaps a given node must
search the auxiliary structure to see which data items in
the node overlap the query window
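Finding the node in which a polygon is stored amounts to descending from the root for as long as one quarter still contains the whole MBR. The sketch below assumes hypothetical names, cells written as `(x, y, size)` with the lower-left corner first, and an MBR that lies inside the root cell.

```python
def smallest_covering_cell(mbr, root, max_depth=16):
    """Return ((x, y, size), depth) of the smallest quad-tree cell containing mbr.

    mbr is (xmin, ymin, xmax, ymax); root is (x, y, size).
    """
    x, y, size = root
    xmin, ymin, xmax, ymax = mbr
    depth = 0
    while depth < max_depth:
        half = size / 2
        cx, cy = x + half, y + half
        if xmax < cx:
            nx = x
        elif xmin >= cx:
            nx = cx
        else:
            break              # MBR straddles the vertical split line
        if ymax < cy:
            ny = y
        elif ymin >= cy:
            ny = cy
        else:
            break              # MBR straddles the horizontal split line
        x, y, size, depth = nx, ny, half, depth + 1
    return (x, y, size), depth

root = (0, 0, 16)
print(smallest_covering_cell((1, 1, 3, 3), root))   # fits two levels down
print(smallest_covering_cell((7, 7, 9, 9), root))   # stays at the root
```

The second call shows a known weakness of this scheme: a small polygon that happens to straddle the centre of the root square is stored in the root, so it has to be checked by every query.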
Generalizing Quad-Tree To 3D: Oct-Tree
• Quad-tree is two-dimensional: square area is divided
into four smaller squares
• The 3D generalization of a square is a cube
• To divide cube into smaller cubes of equal size, 8 are
required
• Thus the 3D version of quad-tree is called oct-tree
• As with Quad-tree, different oct-tree variants can be
used to store voxels (3D pixels), points, or more
complex 3D data items
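The step from four to eight children follows from having one binary choice per axis. A minimal sketch, with hypothetical function names and cubes written as `(x, y, z, size)` with the minimum corner first:

```python
def octant(center, p):
    """Index 0..7 of the child cube containing p: one bit per axis."""
    ix = 1 if p[0] >= center[0] else 0
    iy = 1 if p[1] >= center[1] else 0
    iz = 1 if p[2] >= center[2] else 0
    return ix | (iy << 1) | (iz << 2)

def child_cubes(origin, size):
    """Split the cube with minimum corner origin into 8 equal sub-cubes."""
    half = size / 2
    return [(origin[0] + (i & 1) * half,
             origin[1] + ((i >> 1) & 1) * half,
             origin[2] + ((i >> 2) & 1) * half,
             half)
            for i in range(8)]

cubes = child_cubes((0, 0, 0), 2)
index = octant((1, 1, 1), (1.5, 0.5, 1.5))
print(index, cubes[index])
```

The same bit layout is used by both functions, so the index returned by `octant` can be used directly to pick the matching entry from `child_cubes`.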
Oct-Tree Example
Source: Design and analysis of spatial data
structures
R-Tree
• R-Tree is a balanced tree for multidimensional key values
– Data items are approximated using MBRs
• R-Tree has been designed to be used with external storage
devices
• It is a common indexing structure in databases that store
multidimensional data (e.g. spatial databases)
• R-Tree is a balanced non-binary search tree
– Each node (except root) has between n and 2n data elements
– Each internal node (except root) has between n and 2n child nodes
– Root has 0 to 2n elements and 0 to 2n children
– Each leaf is on the same level
• Since a single node contains a large number of elements, the
tree is wide, but short (only a few links from root to leaf)
R-Tree
• R-Tree nodes – with the exception of the root – are
typically in external storage
– Since the tree is short, only a few I/O operations are required
per tree operation
• Each node has a large number of elements, and thus
powerful means of searching through a node is required
• R-tree efficiency depends on data distribution and
insertion order, which has large effect on MBRs
– Coverage: The total area covered by nodes on a certain level
– Overlap: The amount of area covered by more than one node
– Minimizing both will make tree more efficient, especially overlap
minimization is important
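Why overlap matters can be seen from how a windowing query descends the tree: it must visit every child whose MBR overlaps the window. The sketch below uses a hand-built two-level tree with assumed names (`Node`, `search`) and hypothetical data; insertion, node splitting and balancing are not shown, and real R-tree nodes hold many more entries.

```python
from dataclasses import dataclass, field

def overlaps(a, b):
    """True if two rectangles (xmin, ymin, xmax, ymax) share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

@dataclass
class Node:
    mbr: tuple
    children: list = field(default_factory=list)  # child nodes (internal node)
    items: list = field(default_factory=list)     # (name, rect) pairs (leaf)

def search(node, window, visited):
    """Collect the names of items overlapping window; record every node read."""
    visited.append(node)
    if not node.children:
        return [name for name, rect in node.items if overlaps(rect, window)]
    hits = []
    for child in node.children:
        if overlaps(child.mbr, window):   # prune subtrees that cannot match
            hits += search(child, window, visited)
    return hits

leaf1 = Node(mbr=(0, 0, 7, 4), items=[("a", (0, 0, 2, 2)), ("b", (1, 3, 7, 4))])
leaf2 = Node(mbr=(5, 0, 9, 3), items=[("c", (5, 0, 6, 1)), ("d", (8, 2, 9, 3))])
root = Node(mbr=(0, 0, 9, 4), children=[leaf1, leaf2])

visited = []
print(search(root, (8, 2, 9, 3), visited), len(visited))

visited = []
print(search(root, (5.5, 1.5, 6.5, 2.5), visited), len(visited))
```

The first window lies in only one leaf MBR, so two nodes are read. The second window falls in the region where both leaf MBRs overlap: all three nodes are read and no item is found. With nodes in external storage, each such extra visit is an extra I/O operation.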
R*-Tree
• Basic R-Tree is pretty bad at minimizing coverage and
overlap due to how elements are inserted and deleted
• R*-Tree is an R-Tree variation that improves the basic version
by
– Minimizing the area of node MBRs
– Minimizing overlap
– Minimizing the perimeter of node MBRs
– Maximizing storage utilization (elements per node)
• Naturally, these cannot all be optimal at the same time, and
thus a good balance needs to be found
• This is achieved by a more sophisticated insertion method
– In some cases a number of elements are removed and reinserted
R*-Tree Example
Source: R-Trees: theory and applications
Quad-Tree and R-Tree uses
• Neo4j graph database can use quad-tree indexing
• The PostGIS Spatial Database can use R-tree as spatial
indexing structure
– The R-tree is implemented using a structure called Generalized
Search Tree (GiST)
• The SQLite database includes an R*-tree index
• There are Python implementations of both structures
Constructive Solid Geometry Trees
• CSG is a method of constructing complex objects from
simpler ones
• Each element is considered a set, and boolean set
operations are used to construct models
– Union, intersection, complement (difference)
• The construction is done using a structure called CSG
Tree
• CSG tree is a binary tree, where each node is
associated with an object, and an operation
– The object is constructed from the objects of the child nodes
using the associated operation
CSG Tree Example
Source: wikipedia
CSG Tree
• The CSG tree is an example of a tree structure that is not
used as a dictionary
• The purpose of the tree is to decompose a complex object
into smaller ones / construct a complex object from smaller
ones
• Child order can be important, depending on the operation
– Union and intersection are commutative
– Complement (difference) is not
• The tree can also be used to check the relations the object
has with other objects
– e.g. overlap
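A CSG tree can be evaluated without ever computing the resulting shape: to test whether a point belongs to the object, ask the children and combine their answers with the node's operation. The class names, the `inside` function and the two primitives below are assumptions chosen for this sketch.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Primitive:
    contains: Callable   # function taking a point (x, y, z), returning bool

@dataclass
class Op:
    kind: str            # "union", "intersection" or "difference"
    left: object
    right: object

def inside(node, p):
    """True if point p belongs to the object described by the CSG tree."""
    if isinstance(node, Primitive):
        return node.contains(p)
    a, b = inside(node.left, p), inside(node.right, p)
    if node.kind == "union":
        return a or b
    if node.kind == "intersection":
        return a and b
    if node.kind == "difference":
        return a and not b
    raise ValueError(f"unknown operation: {node.kind}")

# Hypothetical primitives: a cube with half-side 1 and a sphere of radius 1.2
cube = Primitive(lambda p: all(abs(c) <= 1 for c in p))
sphere = Primitive(lambda p: sum(c * c for c in p) <= 1.2 ** 2)

cube_minus_sphere = Op("difference", cube, sphere)
sphere_minus_cube = Op("difference", sphere, cube)

corner = (1, 1, 1)
print(inside(cube_minus_sphere, corner), inside(sphere_minus_cube, corner))
```

Swapping the children of the difference node gives a different object: the cube corner belongs to "cube minus sphere" but not to "sphere minus cube", whereas swapping the children of a union or intersection node changes nothing.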
Questions?