* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Introduction to GPU - Movement Research Lab
Color vision wikipedia , lookup
Tektronix 4010 wikipedia , lookup
Molecular graphics wikipedia , lookup
List of 8-bit computer hardware palettes wikipedia , lookup
MOS Technology VIC-II wikipedia , lookup
Mesa (computer graphics) wikipedia , lookup
Color Graphics Adapter wikipedia , lookup
Waveform graphics wikipedia , lookup
Rendering (computer graphics) wikipedia , lookup
Indexed color wikipedia , lookup
Spatial anti-aliasing wikipedia , lookup
Free and open-source graphics device driver wikipedia , lookup
BSAVE (bitmap format) wikipedia , lookup
Apple II graphics wikipedia , lookup
Hold-And-Modify wikipedia , lookup
Original Chip Set wikipedia , lookup
GeForce 7 series wikipedia , lookup
InfiniteReality wikipedia , lookup
Stream processing wikipedia , lookup
Framebuffer wikipedia , lookup
Graphics processing unit wikipedia , lookup
General-purpose computing on graphics processing units wikipedia , lookup
GPU Tutorial 이윤진 Computer Game 2007 가을 2007년 11월 다섯째 주, 12월 첫째 주 Contents Introduction to GPU High-level shading languages GPU applications Introduction to GPU 이윤진 Computer Game 2007 가을 2007년 11월 26일 Slide Credits Marc Olano (UMBC) ◦ SIGGRAPH 2006 Course notes David Luebke (University of Virginia) ◦ SIGGRAPH 2005, 2007 Course notes Mark Kilgard (NVIDIA Corporation) ◦ SIGGRAPH 2006 Course notes Rudolph Balaz and Sam Glassenberg (Microsoft Corporation) ◦ PDC 05 Randy Fernando and Cyril Zeller (NVIDIA Corporation) ◦ I3D 2005 GPU GPU: Graphics Processing Unit ◦ Designed for real-time graphics ◦ Present in almost every PC ◦ Increasing realism and complexity Americas Army Growth of GPU (NVIDIA) Growth of GPU (NVIDIA) Performance matrices ◦ since 2000, the amount of horsepower applied to processing 3D vertices and fragments has been growing at a staggering rate Computational Power GPUs are fast… ◦ 3.0 GHz Intel Core2 Duo (Woodcrest Xeon 5160): Computation: 48 GFLOPS peak Memory bandwidth: 21 GB/s peak Price: $874 (chip) ◦ NVIDIA GeForce 8800 GTX: Computation: 330 GFLOPS observed • Memory bandwidth: 55.2 GB/s observed • Price: $599 (board) GPUs are getting faster, faster ◦ CPUs: 1.4× annual growth ◦ GPUs: 1.7×(pixels) to 2.3× (vertices) annual growth Computational Power Computational Power Why are GPUs getting faster so fast? ◦ Arithmetic intensity the specialized nature of GPUs makes it easier to use additional transistors for computation ◦ Economics multi-billion dollar video game market is a pressure cooker that drives innovation to exploit this property Flexible and Precise Modern GPUs are deeply programmable ◦ Programmable pixel, vertex, and geometry engines ◦ Solid high-level language support Modern GPUs support “real” precision ◦ 32 bit floating point throughout the pipeline High enough for many (not all) applications Vendors committed to double precision soon ◦ DX10-class GPUs add 32-bit integers GPU Fundamentals: Graphics Pipeline Graphics State Shade GPU Final Pixels (Color, Depth) Rasterize Fragments (pre-pixels) Assemble Primitives Screenspace triangles (2D) CPU Transform & Light Xformed, Lit Vertices (2D) Vertices (3D) Application Video Memory (Textures) Render-to-texture A simplified graphics pipeline ◦ Note that pipe widths vary ◦ Many caches, FIFOs, and so on not shown GPU Fundamentals: Modern Graphics Pipeline Graphics State GPU Programmable vertex processor! Fragment Shade Processor Final Pixels (Color, Depth) CPU Rasterize Fragments (pre-pixels) Assemble Primitives Screenspace triangles (2D) Xformed, Lit Vertices (2D) Vertices (3D) Application Vertex Transform Processor & Light Video Memory (Textures) Render-to-texture Programmable pixel processor! GPU Fundamentals: Modern Graphics Pipeline Graphics State GPU Programmable primitive assembly! Fragment Processor Final Pixels (Color, Depth) CPU Rasterize Fragments (pre-pixels) Geometry Assemble Processor Primitives Screenspace triangles (2D) Xformed, Lit Vertices (2D) Vertices (3D) Application Vertex Processor Video Memory (Textures) Render-to-texture More flexible memory access! GPU Pipeline: Transform Vertex processor (multiple in parallel) ◦ Transform from “world space” to “image space” ◦ Compute per-vertex lighting GPU Pipeline: Assemble Primitives Geometry processor ◦ How the vertices connect to form a primitive ◦ Per-Primitive Operations GPU Pipeline: Rasterize Rasterizer ◦ Convert geometric rep. (vertex) to image rep. (fragment) Pixel + associated data: color, depth, stencil, etc. ◦ Interpolate per-vertex quantities across pixels GPU Pipeline: Shade Fragment processors (multiple in parallel) ◦ Compute a color for each pixel ◦ Optionally read colors from textures (images) GPU Parallelism GeForce 7900 GTX GPU Programming Simplified computational model ◦ consistent as hardware changes All stages SIMD Fixed conversion / remapping between each stage Vertex (stream) Geometry (stream) Fragment (array) Buffer Example Vertex shader void main() { gl_FrontColor = gl_Color; gl_Position = gl_ProjectionMatrix * gl_ModelViewMatrix * gl_Vertex; } Pixel shader void main() { gl_FragColor = gl_Color; } Vertex (stream) Geometry (stream) Fragment (array) Buffer Vertex Shader One element in / one out No communication Can select fragment address Input: ◦ Vertex data (position, normal, color, …) ◦ Shader constants, Texture data Output: ◦ Required: Transformed clip-space position ◦ Optional: Colors, texture coordinates, normals (data you want passed on to the pixel shader) Restrictions: ◦ Can’t create new vertices Pixel Shader Biggest computational resource One element in / 0 – 1 out Cannot change destination address No communication Input: ◦ Interpolated data from vertex shader ◦ Shader constants, Texture data Output: ◦ Required: Pixel color (with alpha) ◦ Optional: Can write additional colors to multiple render targets Restrictions: ◦ Can’t read and write the same texture simultaneously Example http://www.lighthouse3d.com/opengl/glsl/ Vertex shader void main() { vec4 v = vec4(gl_Vertex); v.z = 0.0; gl_Position = gl_ProjectionMatrix * gl_ModelViewMatrix * gl_Vertex; } Pixel shader void main() { gl_FragColor = vec4(0.8,0.4,0.4,1.0); } Geometry Shader One element in / 0 to ~100 out ◦ Limited by hardware buffer sizes Like vertex: ◦ No communication ◦ Can select fragment address Input: ◦ Entire primitive (point, line, or triangle) ◦ Optional: Adjacency Output: ◦ Zero or more primitives (a homogenous list of points/lines or triangles) Restrictions: ◦ Allow parallel processing but preserve serial order Geometry Shader Applications ◦ Fur/fins, procedural geometry/detailing, ◦ Data visualization techniques, ◦ Wide lines and strokes, … Multiple Passes Communication ◦ None in one pass ◦ Arbitrary read addresses between passes Vertex (stream) Geometry (stream) Fragment (array) Buffer Example Depth buffer Normal buffer Final result Silhouettes Creases Image Space Silhouette Extraction Using Graphics Hardware [Wang 2005] GPU Applications Bump/Displacement mapping Diffuse light without bump Height map Diffuse light with bump GPU Applications Volume texture mapping GPU Applications Cloth simulation GPU Applications GPU Applications Real-time rendering Image processing General purpose GPU (GPGPU) … Contents Introduction to GPU High level shading languages GPU applications GPU Applications Soft Shadows Percentage-closer soft shadows [Fernando 2005]