Data structures and algorithms for geometric models 2
Jussi Nikander
9.3.2016
MAA-C2005 Geometric models in engineering

Contents of this lecture
• Topological Queries and Data Structures
  – Polygon networks and polygon meshes
  – Triangular Irregular Networks
• Windowing Queries and Data Structures

The Purpose of This Lecture
• This lecture gives an overview of some of the theory associated with data structures and algorithms, and explains why these matters are important
• We will go through a lot of things quite quickly
• The learning goal of this lecture is to gain a rudimentary understanding of topological queries and data structures, as well as windowing

Very Quick Recap of the Previous Lecture
• External memory, main memory, processor
• Primitive data items, data collections
• Key values and indices
• An algorithm is a sequence of simple commands
• Big Oh and algorithm efficiency
• Basic data structures: array, linked list, tree, graph
• Complex data structures are built using basic data structures that are modified to fit a specific purpose
• Data structure management algorithms are hidden; functionality is accessed using a public API

Topological Queries and Topological Data Structures

What is Topology?
• In mathematics, topology is the study of the properties of space that are preserved under continuous deformations (topological transformations)
  – Connectivity, for example
• This discussion is limited to topologies for surfaces in 2D and 3D spaces
  – More precisely, to polygons and to surfaces built from (or approximated by) connected polygons
• Connectivity and adjacency relations are extremely important in many geometric modeling situations
• Here, the structures are called polygon networks

The Polygon Network
• A polygon network consists of a set of polygons (subdivisions of 2D plane(s) in two or three dimensions) that are connected through shared edges (line segments delimiting the subdivisions)
• Polygon networks consist of
  – Points that depict line segment endpoints
    • Each point is unique; no two points may overlap
  – Line segments that are defined by two points
    • In some cases some line segments may be half-lines
    • Line segments may touch only at line segment endpoints
  – Faces defined by two or more line segments
    • A minimum of three finite line segments is required to define a finite face
    • Faces may touch only at bounding line segments

Data Structures for Topology
• Polygon network topology can be stored in a data structure on several levels of detail
• Storage as distinct line segments contains no topology
  – So-called line spaghetti
• Storage as distinct polygons contains no polygon topology
  – Shapefiles, for example, store the data in this way (it is faster to draw on screen)
• For analysis, explicit topology is often required
  – Analysis is faster
  – No errors due to, for example, floating-point imprecision or source data errors

The Polygon Mesh Model
• A polygon mesh is a 3D model of a surface that consists of connected polygons
  – Often simple polygons, such as in a triangle mesh
• The mesh is typically assumed to model a solid object
  – A polygon mesh explicitly includes only the surface of the object
  – Volumetric meshes explicitly include the volume
• A
mesh is a closed, topological surface
• Meshes, or more complex models generalized from meshes, are used in many 3D modeling and analysis tools

Polygon Mesh Example
(Figure. Source: shapeways.com)

The Voronoi and The Delaunay
• A Voronoi diagram is a subdivision of a space into subspaces according to a set of points
  – Each subdivision consists of the part of the space closest to one point in the set
• A Delaunay triangulation divides a surface into triangles according to a set of points so that the smallest angle in any triangle is as large as possible for that point set
  – It is the "best" triangulation of a point set for many applications
• These are used in many point set analysis applications
• Delaunay and Voronoi also happen to be each other's dual structures
  – Each Voronoi face corresponds to a Delaunay point, and each Delaunay triangle to a Voronoi vertex; edges correspond to neighboring faces / triangles

Voronoi and Delaunay Example
(Figure. Source: wikipedia)

Data Structures for Polygon Networks

Doubly Connected Edge List
• A graph-theoretical approach to polygon networks
  – Each polygon can be viewed as a face in a graph
• The data structure consists of vertices, faces and edges
  – Each edge is divided into two directed half-edges
    • Compare to a directed graph
• The half-edges contain most of the relevant information in the model
  – Half-edges go around a face counterclockwise
  – Each half-edge has a twin that goes in the opposite direction
• Vertices know their own position and one of the half-edges incident to the vertex
• A face knows one of the half-edges surrounding it (as well as one half-edge for each component surrounded by the face)

Structure of a DCEL
(Figure. Source: Computational Geometry)

Structure of DCEL
• A DCEL can be implemented using three 2-dimensional arrays (matrices)
• One array stores the vertices, one the faces, and one the half-edges
• The array storing faces needs to be able to store a theoretically unbounded number of links to half-edges
• Half-edge array contains most of the
information

Winged Edge
• Like DCEL, consists of edges, faces (polygons) and vertices
  – Edges are directed, and contain most of the information
  – Polygons and vertices just know one edge next to them
• Edges contain most of the important information
  – Edges go in one direction
  – Edges know their incident faces and vertices
  – Edges know the previous and next edges around the faces they bound
• Faces and vertices each know one incident edge
  – More information may also be included

Structure of a Winged Edge
(Figure. Source: Okabe et al: Spatial Tessellations)

Structure of a Winged Edge
(Figure. Source: wikipedia)

Quad-Edge
• A further development from Winged Edge
  – A bit harder to grasp, however
• Contains two important improvements over Winged Edge
  – Explicitly models both the Voronoi diagram and the Delaunay triangulation at the same time
  – Very simple operations
    • Only two operations are required for modifications

Structure of a Quad-Edge
• Depicts an edge
• Implicit links to adjacent faces and vertices
• Explicit links to neighboring Quad-edges
(Figure. Source: www.voronoi.com)

Quad-Edge Network
(Figure. Source: www.voronoi.com)

Quad-Edge Operations
• In addition to splice, makeEdge is used to create new edges in the structure
(Figure. Source: www.voronoi.com)

Delaunay Triangulation example: TIN elevation model
(Figure.) This is also technically a triangle mesh

3D Modeling example: Building on different levels of detail
(Figure. Source: Liukkonen: Kuntien paikkatiedon polku kartasta 3D-mallinnukseen)

Computing a TIN (Delaunay triangulation)

The MAX-MIN Angle Criterion
• A Delaunay triangulation maximises the minimum angle of the triangles
• The optimality of a triangulation can be measured using an angle vector
• The angle vector is the list of all angles of the triangulation sorted in increasing order
  – Therefore, angle vectors can be compared lexicographically
• The angle vector of a triangulation can be increased by flipping the diagonal of a quadrilateral formed by two adjacent triangles, whenever the other diagonal gives a larger minimum angle
• For a Delaunay triangulation, the angle
vector of the triangulation is optimal
  – That is, the angle vectors of all other triangulations of the same point set are equal or smaller
• Therefore, a Delaunay triangulation can be calculated by gradually increasing the angle vector by flipping edges
• Starting from a random triangulation would, however, be too inefficient

The Incremental Algorithm
• Instead of starting with a random triangulation, we can incrementally add points to the triangulation
• Using this method, we need to make only local modifications to the triangulation after each insertion
• In order to always have a legal triangulation, we use an auxiliary "super triangle" that covers the whole point set
• At each iteration of the algorithm
  – We add a new randomly selected point to the triangulation
    • This will cause local modifications to the structure
  – We ensure that the triangulation stays legal by checking what modifications the insertion caused
• Expected time efficiency is O(n log n)
• Space efficiency is O(n)
  – During algorithm execution a search structure is required in order to quickly find the triangle inside which the new point is added

Windowing Queries and Data Structures

Windowing Queries
• A windowing query asks "what data items overlap with a given subdivision of the space they reside in"
• A point query is a windowing query with a degenerate query window
  – The nature of the point query does, however, allow us to solve it with an approach different from the one used for windowing queries
• Windowing queries are answered using dictionaries
• The best data structure depends on the type of data, and the type of queries
  – Points vs lines and polygons vs continuous surfaces
  – Main memory versus external storage
  – Point queries, rectangular windowing queries, arbitrary polygonal windowing queries, etc.
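
To make the query definition concrete, here is a minimal Python sketch of a rectangular windowing query answered by a plain linear scan, with no index at all. It is only a baseline for comparison with dictionary structures, not something taken from the lecture. The names `Rect`, `window_query` and `point_query` are hypothetical, and each data item is assumed to be represented by an axis-aligned bounding rectangle.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle (hypothetical helper type for this sketch)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def overlaps(self, other: "Rect") -> bool:
        # Two rectangles overlap unless one lies entirely to one side of the other.
        return (self.xmin <= other.xmax and other.xmin <= self.xmax and
                self.ymin <= other.ymax and other.ymin <= self.ymax)


def window_query(items: List[Tuple[str, Rect]], window: Rect) -> List[str]:
    """Return the names of all items whose rectangle overlaps the window.

    Every item is tested, so the cost is linear in the number of items.
    """
    return [name for name, rect in items if rect.overlaps(window)]


def point_query(items: List[Tuple[str, Rect]], x: float, y: float) -> List[str]:
    """A point query expressed as a windowing query with a degenerate window."""
    return window_query(items, Rect(x, y, x, y))


# Assumed example data: three items given by their bounding rectangles.
items = [
    ("a", Rect(0, 0, 2, 2)),
    ("b", Rect(3, 3, 5, 5)),
    ("c", Rect(1, 1, 4, 4)),
]

print(window_query(items, Rect(1.5, 1.5, 2.5, 2.5)))  # ['a', 'c']
print(point_query(items, 3.5, 3.5))                   # ['b', 'c']
```

The scan touches every item on every query, and this is the cost that an index over the search space is meant to avoid.
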
Subdivisions of the Search Space
• Dictionary structures work by partitioning the search space into smaller segments intelligently
• There are two main approaches to this
  – Partition-by-space: the search space itself is divided into segments of approximately the same size
  – Partition-by-data: data items to be searched are divided into subsets of approximately the same size
• Both approaches have their advantages and disadvantages

Partition by Space
• The goal is to divide the space into subsets that are approximately of the same size
• Since the subsets do not overlap, searches over subsets never need to cover the same space twice (efficient)
• Line or polygon data items may overlap with several subdivisions and thus may need to be stored multiple times (inefficient)
  – Points can always be defined to belong to only one subdivision
• If the data distribution in the space is very uneven, dense areas may require many more subdivisions than sparse areas (inefficient)
  – Locations where dense and sparse subdivisions meet may become problematic

Partition by Space Example: Duplication
(Figure.) Notice the duplication

Partition by Space Example: Uneven distribution
(Figure.)

Partition by Data
• The goal is to divide the data items into subsets of approximately the same size
• Each data item thus needs to be stored only once (efficient)
• Empty parts of the space may not be covered by any subdivision (efficient)
• The spaces covered by the subdivisions may overlap, and thus require searches to cover the same area multiple times (inefficient)
• If the division of data is bad, the empty space covered by subdivisions may be significant (inefficient)
  – Creating good divisions can, however, be a bit complicated

Partition by Data Example: Overlap
(Figure.) Notice the overlap

Partition by Data Example: Significant Empty Space
(Figure.) Bad partition of
data into two subsets

Partition by Data Example: Significant Empty Space
(Figure.) There are no subdivisions covering this space. Better partition methods could have produced better subdivisions

Subdivisions and Data Structures
• Each subdivision corresponds to some element in a data structure
• In a tree, the root node covers the whole space
  – Each child of the root covers one subdivision
• In general: each tree node covers a specific part of the space, and each child of the node covers a specific subdivision of that part
(Figure sequence.) Root covers the whole space. Children cover overlapping subdivisions. Finally, there are the data items (in each node)

The Point Query and Trapezoidal Map
• The point query asks inside which polygon(s) the query point is
  – Compare this to the windowing query, which asks what polygons overlap the query area
• These kinds of queries are constantly being evaluated when the cursor is moved over a graphical UI
• The trapezoidal map approach is to divide the polygon network into subdivisions that are easy to search through
  – In essence, instead of storing data items in an index, an index is built from data item subdivisions

Searchable Subdivisions
• A polygon network is a subdivision of an area
  – Unfortunately, since polygons are arbitrary, it is not a very good subdivision for searching
  – Testing whether a point is inside one polygon takes logarithmic time, thus testing N polygons takes O(N log N) time
• For searching, the polygons need to be split into trapezoids
  – A trapezoid is a convex quadrilateral with at least two parallel sides
  – Using trapezoid sides, it is possible to do comparisons such as "is the point to the left/right of, or above/below, a certain line"
• A division of a polygon network into trapezoids is called a trapezoidal map

Trapezoid Map Example
(Figure. Source: Computational Geometry)

Search Structure Using Trapezoidal Map
• A trapezoidal map consists of vertical lines
(created for the map) and non-vertical line segments (part of the original polygon network)
• For searching, these edges are arranged into a tree-like search structure
• The structure consists of inner and leaf nodes
• Each inner node represents a vertical line or a non-vertical line segment
  – Each node has two child nodes
  – Vertical lines divide the area left and right (x-nodes)
  – Non-vertical line segments divide the area above and below (y-nodes)
  – When a y-node is encountered, the corresponding line segment spans the possible x-values left in the search space

Search Structure Using Trapezoidal Map (continued)
• Leaf nodes represent trapezoids
• There is a root node from which all searches start
• There may be several paths from the root node to a leaf node
• Thus, the resulting structure is a rooted, directed, acyclic graph
• In 3D we must also take into account which polygon(s) are visible from the current viewpoint, and which are covered by other data items
  – A different approach is thus required

Search Structure Example
(Figure. Source: Computational Geometry)

Quad-tree
• Divides by area
  – Each node covers (part of) the area covered by the whole tree
  – If the area covered by a node has at most one data element in it, the node is a leaf
  – If there are more elements, the node has four children
    • Each child covers one quarter of the node's area
• The basic quad-tree is used for either point data (point QT) or raster data (area QT)
  – Different raster values take the role of objects

Quad-tree management
• Insertion into a quad-tree works by recursively selecting the leaf covering the data and creating new levels as required until a free leaf is encountered
• Deletion removes nodes when the children of a particular node are all leaves and collectively contain at most one data item
• Worst case efficiency depends on the length of the area edge

Point quad-tree example
(Figure.)

Region Quad-tree example
(Figure.)

Quad-Tree for Polygons
• The basic quad-tree contains elements that always fit completely inside one tree node
• Polygons may span the area covered by several quad-tree nodes
• Polygons are approximated by their minimum bounding rectangles (MBRs)
• Each data item is stored in the quad-tree node corresponding to the smallest subdivision that covers the MBR of the data item
  – Every data item is stored only once!
• One node may contain several data items

Polygon Quad-Tree Example
(Figure. Source: Design and analysis of spatial data structures)

Quad-Tree for Polygons (continued)
• As the number of data items in a node is unbounded, an index is needed for the items in the node
  – Each node needs an auxiliary data structure to index the data
• Any windowing query that overlaps a given node must search the auxiliary structure to see which data items in the node overlap the query window

Generalizing Quad-Tree To 3D: Oct-Tree
• The quad-tree is two-dimensional: a square area is divided into four smaller squares
• The 3D generalization of a square is a cube
• To divide a cube into smaller cubes of equal size, 8 are required
• Thus the 3D version of the quad-tree is called the oct-tree
• As with the quad-tree, different oct-tree variants can be used to store voxels (3D pixels), points, or more complex 3D data items

Oct-Tree Example
(Figure. Source: Design and analysis of spatial data structures)

R-Tree
• The R-Tree is a balanced tree for multidimensional key values
  – Data items are approximated using MBRs
• The R-Tree has been designed to be used with external storage devices
• It is a common indexing structure in databases that store multidimensional data (e.g.
spatial databases)
• The R-Tree is a balanced non-binary search tree
  – Each node (except root) has between n and 2n data elements
  – Each internal node (except root) has between n and 2n child nodes
  – Root has 0 to 2n elements and 0 to 2n children
  – Each leaf is on the same level
• Since a single node contains a large number of elements, the tree is wide, but short (only a few links from root to leaf)

R-Tree (continued)
• R-Tree nodes, with the exception of the root, are typically in external storage
  – Since the tree is short, only a few I/O operations are required per tree operation
• Each node has a large number of elements, and thus a powerful means of searching through a node is required
• R-Tree efficiency depends on data distribution and insertion order, which have a large effect on MBRs
  – Coverage: the total area covered by nodes on a certain level
  – Overlap: the amount of area covered by more than one node
  – Minimizing both will make the tree more efficient; overlap minimization is especially important

R*-Tree
• The basic R-Tree is pretty bad at minimizing coverage and overlap due to how elements are inserted and deleted
• The R*-Tree is an R-Tree variation that improves the basic version by
  – Minimizing the area of node MBRs
  – Minimizing overlap
  – Minimizing the perimeter of node MBRs
  – Maximizing storage utilization (elements per node)
• Naturally, these cannot all be optimal at the same time, and thus a good balance needs to be found
• This is achieved by a more sophisticated insertion method
  – In some cases a number of elements are removed and reinserted

R*-Tree Example
(Figure. Source: R-Trees: theory and applications)

Quad-Tree and R-Tree uses
• The Neo4j graph database can use quad-tree indexing
• The PostGIS spatial database can use an R-tree as its spatial indexing structure
  – The R-tree is implemented using a structure called Generalized Search Tree (GiST)
• The SQLite database includes an R*-tree index
• There is a Python implementation for both structures

Constructive Solid Geometry Trees
• CSG is a
method of constructing complex objects from simpler ones
• Each element is considered a set, and boolean set operations are used to construct models
  – Union, intersection, complement (difference)
• The construction is done using a structure called a CSG tree
• A CSG tree is a binary tree, where each node is associated with an object and an operation
  – The object is constructed from the objects of the child nodes using the associated operation

CSG Tree Example
(Figure. Source: wikipedia)

CSG Tree
• The CSG tree is an example of a tree structure that is not used as a dictionary
• The purpose of the tree is to decompose a complex object into smaller ones / construct a complex object from smaller ones
• Child order can be important, depending on the operation
  – Union and intersection are commutative
  – Complement is not
• The tree can also be used to check the relations the object has with other objects
  – e.g. overlap

Questions?
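
As a follow-up to the CSG tree slides, the sketch below shows one possible way to evaluate such a tree in Python. It is an illustration under simplifying assumptions, not the lecture's own implementation: objects are modelled as sets of integer grid cells so that the boolean operations map directly onto Python set operators, and the names `CSGNode` and `box` are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

Cell = Tuple[int, int]  # assumed object model: a set of integer grid cells


@dataclass
class CSGNode:
    """Binary CSG tree node (hypothetical type for this sketch).

    A leaf has op=None and carries a primitive object in `cells`;
    an inner node combines the objects of its two children using `op`.
    """
    op: Optional[str] = None
    left: Optional["CSGNode"] = None
    right: Optional["CSGNode"] = None
    cells: FrozenSet[Cell] = frozenset()

    def evaluate(self) -> FrozenSet[Cell]:
        if self.op is None:
            return self.cells
        a = self.left.evaluate()
        b = self.right.evaluate()
        if self.op == "union":
            return a | b
        if self.op == "intersection":
            return a & b
        if self.op == "difference":
            return a - b  # child order matters here
        raise ValueError(f"unknown operation: {self.op}")


def box(x0: int, y0: int, x1: int, y1: int) -> CSGNode:
    """Leaf node covering the cells with x0 <= x < x1 and y0 <= y < y1."""
    return CSGNode(cells=frozenset((x, y)
                                   for x in range(x0, x1)
                                   for y in range(y0, y1)))


a = box(0, 0, 4, 4)   # 16 cells
b = box(2, 2, 6, 6)   # 16 cells, sharing 4 cells with a

print(len(CSGNode("union", a, b).evaluate()))         # 28
print(len(CSGNode("intersection", a, b).evaluate()))  # 4
# Swapping the children changes the result of a difference, but not of a union.
print(CSGNode("difference", a, b).evaluate() ==
      CSGNode("difference", b, a).evaluate())         # False
```

The last line illustrates the point about child order: union and intersection give the same object whichever child comes first, while difference does not.
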