Parallel Model Simplification
of Very Large Polygonal
Meshes
by
Dmitry Brodsky
and
Jan Bækgaard Pedersen
What did we do?
 Parallelized an existing mesh simplification algorithm
• Showed that R-Simp [Brodsky & Watson] is well suited for parallel environments
 Able to simplify large models
 Achieved good speedup
 Retained good output quality
[Images: a 30M-polygon input model and its 20K-polygon simplified output]
Computer graphics
 Scenes are created from models
[Image: the Stanford Bunny, captioned "I am the Stanford Bunny"]
 Models are created from polygons
 The more polygons, the more realistic the model
 Triangles are most often used
• Consist of 3 vertices specifying a face
• Hardware is optimized for triangles
Why simplify?
 Graphics hardware is too slow
• Can render only ~10k polygons in real time
 Models are too large
• 100k polygons or more
 Highly detailed models are not always required
 Trade quality for rendering speed
What is simplification?
 Reduce the number of polygons
 Maintain shape
[Images: the same model at 70,000 and at 5,000 polygons]
What is simplification?
 The desired number of polygons depends on the scene
[Images: the same model at 70,000 and at 5,000 polygons]
So what’s the problem?
 Models are becoming very large
• Model acquisition is getting better
 Simplification is time consuming
• Time is traded off for quality
• On the order of hours or days
 Models do not fit into core memory
• Algorithms require 10's of gigabytes
 32-bit address spaces are not enough
What can we do?
 Partition the simplification process into smaller tasks
 Execute the tasks in parallel or sequentially
• Reduces contention for core memory (page faults)
 Not applicable to all algorithms
Surface simplification
 Flat surface patches can be represented with a few polygons
 Remove excess polygons by removing edges or vertices
Removing primitives
 Remove the primitive that causes the least amount of distortion
 Preserve significant features
• E.g. corners
 Avoid primitives that form corners
 Choose primitives on flat patches
Conventional algorithms
 Edge collapse
• Iteratively remove edges [Garland & Heckbert, Hoppe, Lindstrom, Turk]
 Decimation
• Combine polygons and remove vertices to create large planar patches [Hanson, Schroeder]
 Clustering
• Spatially cluster vertices or faces
• Poor quality output [Rossignac & Borrel]
Edge collapse
 High quality output
 Access is in distortion order
[Figure: edges labeled 1-4 in the order they are collapsed]
• Edges are sorted by distortion
• Can't exploit access locality
 Data cannot be partitioned
 O(n log n), where n is the input size
 Large models are problematic
• Take a long time to simplify
• Have to fit into core memory
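To make the access pattern concrete, here is a minimal Python sketch of a distortion-ordered collapse loop; the edge representation, the distortion() scoring, and the collapse() helper are hypothetical, and re-scoring the edges around each collapse is omitted:

import heapq

def collapse_in_distortion_order(edges, distortion, collapse, target):
    # One global priority queue keyed by distortion: consecutive collapses
    # can land anywhere in the model, which defeats access locality and
    # prevents partitioning the data.
    heap = [(distortion(e), i, e) for i, e in enumerate(edges)]  # i breaks ties
    heapq.heapify(heap)
    remaining = len(edges)
    while heap and remaining > target:
        _, _, edge = heapq.heappop(heap)  # least-distortion edge in the whole mesh
        collapse(edge)
        remaining -= 1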
Decimation
 Good quality output
 Access is in spatial order
[Figure: surface regions labeled 1-4 in the spatial order they are processed]
• Models are usually polygon soups
• Data reorganization is necessary to exploit access locality
 Topology information is needed
 Surface partitioning is unintuitive
• Data has to be sorted first
• Should not split planar regions
Memory efficient algorithms
 Edge collapse [Lindstrom & Turk]
 Cluster refinement [Garland]
 Modified R-Simp
• Re-organizes and clusters vertices and faces to improve memory access locality [Salamon et al.]
What do we do?
 Simplify in reverse - "R"-Simp
• Start with a coarse approximation and refine by adding vertices
 Access in model order
• Can exploit access locality
• Less reorganization necessary
 Data intuitively partitions
 Runtime is linear in the input size for a fixed output size: O(n_i log n_o)
 Produces good quality output
[Figure: indexed mesh layout - a vertex array of coordinates (x_i, y_i, z_i) and a face array in which each face lists three vertex indices (v_i, v_j, v_k)]
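A minimal sketch of the indexed layout in the figure, with made-up coordinates; "model order" access is just a front-to-back sweep of these arrays:

# Vertex array: one (x, y, z) triple per vertex.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
# Face array: each face is three indices into `vertices`.
faces = [(0, 1, 2), (1, 3, 2)]

# Model-order traversal: faces are visited in array order, so nearby
# iterations touch nearby memory instead of jumping across the model.
for vi, vj, vk in faces:
    triangle = (vertices[vi], vertices[vj], vertices[vk])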
The algorithm
 Partition the model
Initial clustering
 Spatially partition into 8 clusters
• Each cluster becomes a vertex in the output model
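A minimal sketch of the initial partition, assuming the 8 clusters are the octants of the model's bounding box (the slides do not spell out the split, so the octant choice is an assumption):

import numpy as np

def initial_clusters(vertices):
    # Split the bounding box at its center; a vertex's cluster id packs
    # one bit per axis, giving the 8 octants 0..7.
    v = np.asarray(vertices, dtype=float)
    center = (v.min(axis=0) + v.max(axis=0)) / 2.0
    bits = (v >= center).astype(int)          # shape (n, 3)
    return bits[:, 0] * 4 + bits[:, 1] * 2 + bits[:, 2]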
The algorithm
 Partition the model
 Main loop
• Choose a cluster to split
Choosing a cluster
 Select the cluster with the largest surface variation (curvature)
Surface variation
 Computed using face normals and face areas
• curvedness = Σ_i normal_i · area_i
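A minimal sketch of the slide's formula, assuming unit face normals and face areas are at hand; the sum is a vector, and reducing it to a comparable scalar via the flatness gap below is an illustrative assumption rather than the paper's exact definition:

import numpy as np

def curvedness(normals, areas):
    # Slide formula: curvedness = sum_i normal_i * area_i (a vector sum).
    # On a flat patch all normals agree, so the sum's length equals the
    # total area; the shortfall is used here as a scalar curvature measure.
    weighted = (np.asarray(normals) * np.asarray(areas)[:, None]).sum(axis=0)
    return np.sum(areas) - np.linalg.norm(weighted)

The cluster maximizing this value is the one split next.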
The algorithm
 Partition the model
 Loop
• Choose a cluster to split
• Partition the cluster
Splitting a cluster
 Split into 2, 4, or 8 subclusters
How to split?
 Split based on surface curvature
• Compute the mean normal and the directions of maximum and minimum curvature
• The directions guide the partitioning
[Figure: a surface patch showing its mean normal and the directions of minimum and maximum curvature]
Surface types
 Goal: create large planar patches
• Cylindrical: partitioned into 2
• Hemispherical: partitioned into 4
• Everything else is partitioned into 8
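A minimal sketch combining the last two slides: the curvature directions are approximated by the eigenvectors of the area-weighted normal covariance, and the cylindrical/hemispherical tests use hypothetical eigenvalue-ratio thresholds (the paper's exact tests may differ):

import numpy as np

def split_plan(normals, areas):
    n = np.asarray(normals, dtype=float)
    a = np.asarray(areas, dtype=float)[:, None]
    mean_normal = (n * a).sum(axis=0)
    mean_normal /= np.linalg.norm(mean_normal)
    # Covariance of area-weighted normal deviations; its eigenvectors
    # approximate the directions of maximum and minimum curvature.
    d = (n - mean_normal) * a
    evals, evecs = np.linalg.eigh(d.T @ d)     # eigenvalues in ascending order
    lo, mid, hi = evals
    if hi > 0 and mid / hi < 0.1:              # variation in one direction: cylindrical
        return 2, evecs[:, [2]]                # split across the max-curvature direction
    if hi > 0 and mid / hi > 0.5:              # two comparable directions: hemispherical
        return 4, evecs[:, 1:]
    return 8, evecs                            # everything else: split along all three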
The algorithm
 Partition the model
 Loop
• Choose a cluster to split
• Partition the cluster
• Compute surface variation for the subclusters
• Repeat
 Re-triangulate the new surface
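Putting the loop together, a minimal sketch of the refinement driven by a max-priority queue on curvedness; initial_clusters_of, split, and retriangulate stand in for the steps above and are hypothetical names:

import heapq
import itertools

def r_simp(model, output_size):
    tie = itertools.count()                    # tiebreaker: clusters aren't comparable
    heap = [(-curvedness(c.normals, c.areas), next(tie), c)
            for c in initial_clusters_of(model)]
    heapq.heapify(heap)                        # min-heap, so curvedness is negated
    while len(heap) < output_size:             # one cluster = one output vertex
        _, _, cluster = heapq.heappop(heap)    # most curved cluster first
        for sub in split(cluster):             # into 2, 4, or 8 subclusters
            heapq.heappush(heap, (-curvedness(sub.normals, sub.areas), next(tie), sub))
    return retriangulate([c for _, _, c in heap])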
Moving to PR-Simp
 Clusters naturally partition the data
 Assign the initial clusters to processors
 Each processor refines to a specified limit
 Results are reduced and the surfaces are stitched together
PR-Simp
 Master-slave configuration
 The dataset is available to all processors
 The current implementation uses MPI
 Scales to any number of processors
Master: initialization
 Determine the bounding box of the model
 Determine the initial clusters:
• axis-aligned planes
• # of procs = f_x × f_y × f_z
 Slaves receive:
• the bounding box, f_x × f_y × f_z, and the output size
 Each processor ID corresponds to a unique cluster
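A minimal sketch of mapping a processor ID to its cell of the f_x × f_y × f_z grid; the row-major decoding of the rank is an assumption, since the slides only say the ID corresponds to a unique cluster:

import numpy as np

def cluster_bounds(rank, factors, box_min, box_max):
    fx, fy, fz = factors                       # number of procs = fx * fy * fz
    ix = rank // (fy * fz)                     # row-major decode of the rank
    iy = (rank // fz) % fy
    iz = rank % fz
    lo = np.asarray(box_min, dtype=float)
    size = (np.asarray(box_max, dtype=float) - lo) / (fx, fy, fz)
    cell_lo = lo + size * (ix, iy, iz)         # this processor's axis-aligned sub-box
    return cell_lo, cell_lo + size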
Slave: simplification
 Determine the output size for the cluster:
• P_out = P_in × (Full_out / Full_in), i.e. each cluster gets a share of the output budget proportional to its share of the input polygons
 Read in the cluster
 Store faces that span processor boundaries
 Run the standard R-Simp algorithm
 Re-triangulate the assigned portion of the simplified surface
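A minimal sketch of the per-cluster output budget (names are illustrative):

def cluster_output_size(p_in, full_in, full_out):
    # P_out = P_in * (Full_out / Full_in): the cluster's output budget is
    # proportional to its share of the input polygons.
    return round(p_in * full_out / full_in)

For example, a cluster holding 3M of a 30M-polygon input, with a 20K-polygon output target, receives a budget of 2,000.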
Building the output model
 Reduce the results
 Slaves propagate:
• the new triangulated surface
• faces that span processor boundaries
 Surfaces are stitched together at each reduction step
 The master outputs the simplified model
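A minimal sketch of the reduction with mpi4py (an assumption; the slides only say the implementation uses MPI), where the hypothetical stitch() merges two partial surfaces along the stored boundary faces:

from mpi4py import MPI

def reduce_surfaces(comm, surface, stitch):
    # Tree reduction: each round, half the remaining ranks send their
    # partial surface to a partner, which stitches it onto its own.
    rank, size, step = comm.Get_rank(), comm.Get_size(), 1
    while step < size:
        if rank % (2 * step) == step:          # this rank sends, then is done
            comm.send(surface, dest=rank - step)
            return None
        if rank % (2 * step) == 0 and rank + step < size:
            surface = stitch(surface, comm.recv(source=rank + step))
        step *= 2
    return surface                             # rank 0 holds the stitched model

# Usage: model = reduce_surfaces(MPI.COMM_WORLD, my_surface, stitch)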
Evaluation
 Ability to simplify
• Some models needed more than 4 GB of core
 Speedup
• Reduced page faulting (memory thrashing)
 Little or no loss of output quality
 Test bed:
• 20 Pentium III 550 MHz machines with 512 MB of RAM
• connected by a 100 Mbps network
Test subjects
Model            Polygons
Dragon           871,306
David            8,253,996
St. Matthews     6,755,412
Blade            1,765,388
Stanford Bunny   69,451
Happy Buddha     1,087,474
Lucy             28,045,920
Output quality at 20K
[Images: Dragon (871,306), David (8,253,996), and St. Matthews (6,755,412) simplified to 20K polygons]
Output quality at 20K
[Images: Blade (1,765,388), Stanford Bunny (69,451), Happy Buddha (1,087,474), and Lucy (28,045,920) simplified to 20K polygons]
Sequential vs parallel quality
[Images: parallel and sequential output compared side by side at 5K, 10K, and 20K polygons]
Quantitative results
 Simplified a 30M polygon model
 No increase in surface error [Metro]
[Plot: total mean error (0 to 0.08) vs. number of output polygons (5,000, 10,000, 20,000) for the Bunny and Dragon, simplified with R-Simp and PR-Simp]
 Obtained significant speedup for large models
Model          Speedup   # of Proc.
Bunny          4.70      12
Dragon         5.61      12
Buddha         8.09      12
Blade          8.90      12
St. Matthews   7.89      12
David          8.17      12
Lucy           6.40      16
 Output quality is mostly unaffected by the number of processors
 Efficiency (speedup divided by processor count) is approximately 59%
Conclusions
 Large models can be simplified using common desktop resources
 The R-Simp algorithm is well suited for parallelization
• Data can easily be partitioned
• Quality does not significantly degrade as more processors are added
 Use a two-step simplification if quality is very important
Thanks
Thanks to: Mike Feeley, Norm Hutchinson, Alan Wagner, and the other characters in the DSG Lab.
Questions??