Nvidia GeForce GTX 200 GPU Technical Brief

Architectural overview

Technical Brief

NVIDIA GeForce GTX 200 GPU Architectural Overview

Second-Generation Unified GPU Architecture for Visual Computing

Summary of Contents for Nvidia GeForce GTX 200 GPU

Page 1 Technical Brief NVIDIA GeForce GTX 200 GPU Architectural Overview Second-Generation Unified GPU Architecture for Visual Computing...
Page 2: Table Of Contents
Gaming Beyond: Dynamic 3D Realism .... 6
Gaming Beyond: Extreme HD .... 7
Gaming Beyond: SLI .... 7
Beyond Gaming: High-Performance Visual Computing and Professional Computation .... 8
GeForce GTX 200 GPU Architecture .... 9
More Processor Cores .... 9
Graphics Processing Architecture .... 10
Parallel Computing Architecture .... 12
SIMT Architecture ...
Page 3 Figures
Figure 1: Realistic warrior from NVIDIA “Medusa” demo .... 6
Figure 2: Far Cry 2 – Extreme HD Dynamic Beauty! (Ubisoft) .... 7
Figure 3: Significant Speedup Using GPU .... 8
Figure 4: GeForce GTX 280 GPU Graphics Processing Architecture .... 10
Figure 5: GeForce GTX 280 GPU Parallel Computing Architecture ...
Page 4: Introduction
The high-end, enthusiast-class GeForce GTX 280 GPU and performance-oriented GeForce GTX 260 GPU are the first members of the GeForce GTX 200 GPU family and deliver the ultimate visual computing and extreme high-definition (HD) gaming experience. We’ll begin by describing architectural design goals and key features, and then dive into the technical implementation of the GeForce GTX 200 GPUs.
Page 5: GeForce GTX 200 Architectural Design Goals and Key Capabilities
The GeForce GTX 200 GPUs are designed to be fully compliant with Microsoft DirectX 10 and OpenGL 2.1.
Architectural Design Goals
NVIDIA engineers specified the following design goals for the GeForce GTX 200 GPUs: Design a processor with up to twice the performance of GeForce 8800...
Page 6: Gaming Beyond: Dynamic 3D Realism
Better lighting for dramatic and spectacular effect, including ambient occlusion, global illumination, soft shadows, color bleeding, indirect lighting, and accurate reflections.
Figure 1: Realistic warrior from NVIDIA “Medusa” demo
May, 2008 | TB-04044-001_v01...
Page 7: Gaming Beyond: Extreme HD
you an easy, low-cost, high-impact performance upgrade. PC gaming simply doesn’t get any faster or more realistic than running GeForce GTX 200 GPU-based boards in SLI mode on the latest nForce motherboards.
Page 8: Beyond Gaming: High-Performance Visual Computing And Professional Computation
Appendix B lists references and details for these applications.
Figure 3: Significant Speedup Using GPU
With an understanding of the GeForce GTX 200 GPU design goals and key objectives, let’s delve deeper into its internal architecture, looking at both the graphics and parallel processing capabilities.
Page 9: GeForce GTX 200 GPU Architecture
GeForce GTX 200 GPUs are the first to implement NVIDIA’s second-generation unified shader and compute architecture. The GeForce GTX 200 GPUs include significantly enhanced features and deliver, on average, 1.5× the performance of GeForce 8 or 9 Series GPUs.
Page 10: Graphics Processing Architecture
Based on traditional processing core designs that can perform integer and floating-point math, memory operations, and logic operations, each processing core is a hardware-multithreaded processor with multiple pipeline stages that execute an instruction for each thread every clock. Various types of threads exist, including pixel, vertex, geometry, and compute. For graphics processing, threads execute a shader program, and many related threads often simultaneously execute the same shader program for greater efficiency.
Page 11: Table 2: GeForce 8800 GTX vs GeForce GTX 280
Although not apparent in the above diagram, the architectural efficiency of the GeForce GTX 200 GPUs is substantially enhanced over the prior generation. We’ll be discussing many areas that were improved in more detail, such as texture processing, geometry shading, dual issue, and stream out. In directed tests, GeForce GTX 200 GPUs can attain efficiencies closer to the theoretical performance limits than could prior generations.
Page 12: Parallel Computing Architecture
Figure 5 depicts a high-level view of the GeForce GTX 280 GPU parallel computing architecture. A hardware-based thread scheduler at the top schedules threads across the TPCs. You’ll also notice the compute mode includes texture caches and memory interface units. The texture caches are used to combine memory accesses for more efficient and higher-bandwidth memory read/write operations.
Page 13: SIMT Architecture
Figure 6: TPC (Thread Processing Cluster)
NVIDIA’s unified shading and compute architecture uses two different processing models. For execution across the TPCs, the architecture is MIMD (multiple instruction, multiple data). For execution across each SM, the architecture is SIMT (single instruction, multiple thread).
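The SIMT model described above can be illustrated with a small sketch. This is a toy simulation, not NVIDIA's hardware behavior: the group size of 32 threads, the helper names `simt_step` and `run_branch`, and the way a branch is handled with a per-thread mask are all assumptions made for illustration. It shows one instruction being applied to many threads at once, with threads that did not take a branch simply sitting out that instruction.

```python
from typing import Callable, List

WARP_SIZE = 32  # assumption: number of threads sharing one instruction stream


def simt_step(op: Callable[[int], int], values: List[int], active: List[bool]) -> List[int]:
    """Apply one instruction to every active thread; inactive threads keep their value."""
    return [op(v) if on else v for v, on in zip(values, active)]


def run_branch(values: List[int]) -> List[int]:
    """Run `if v is even: v //= 2 else: v = 3 * v + 1` in SIMT style.

    Both sides of the branch are issued one after the other; a mask
    computed up front decides which threads take part in each side.
    """
    taken = [v % 2 == 0 for v in values]
    values = simt_step(lambda v: v // 2, values, taken)
    not_taken = [not t for t in taken]
    values = simt_step(lambda v: 3 * v + 1, values, not_taken)
    return values


lanes = list(range(WARP_SIZE))  # hypothetical per-thread input data
result = run_branch(lanes)
print(result[:6])  # [0, 4, 1, 10, 2, 16]
```

In this sketch every thread sees the same two instructions, which is the "single instruction" half of SIMT; the per-thread data and mask supply the "multiple thread" half.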
Page 14: Larger Register File
Table 3: Maximum Number of Threads (columns: Chip, TPCs, SMs per TPC, Threads per SM, Total Threads per Chip)
GeForce 8 & 9 Series: 12,288 total threads per chip
GeForce GTX 200 GPUs: 1,024 threads per SM; 30,720 total threads per chip
Doing the math results in 32 × 32, or 1,024 maximum concurrent threads that can be managed by each SM.
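The thread-count arithmetic can be checked with a short sketch. The 32 × 32 product and the 30,720 total come from the text; the TPC and SM counts below (10 TPCs, 3 SMs per TPC) do not appear in this excerpt and are assumptions chosen because they reproduce that total. Reading the two factors of 32 as "groups per SM" and "threads per group" is likewise an assumption.

```python
# Sketch of the thread-count arithmetic behind Table 3.
GROUPS_PER_SM = 32      # first factor of "32 x 32" (assumed meaning)
THREADS_PER_GROUP = 32  # second factor of "32 x 32" (assumed meaning)
TPCS_PER_CHIP = 10      # assumption: not stated in this excerpt
SMS_PER_TPC = 3         # assumption: not stated in this excerpt


def threads_per_sm(groups: int = GROUPS_PER_SM, width: int = THREADS_PER_GROUP) -> int:
    """Maximum concurrent threads one SM can manage."""
    return groups * width


def threads_per_chip(tpcs: int = TPCS_PER_CHIP, sms_per_tpc: int = SMS_PER_TPC) -> int:
    """Maximum concurrent threads across every SM on the chip."""
    return tpcs * sms_per_tpc * threads_per_sm()


print(threads_per_sm())    # 1024
print(threads_per_chip())  # 30720
```

Under these assumptions the chip has 30 SMs, and 30 × 1,024 matches the 30,720 figure quoted for GeForce GTX 200 GPUs.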
Page 15: Improved Dual Issue
2:1 anisotropic filtered pixels/clock, or four FP16 bilinear-filtered pixels/clock. Total bilinear texture addressing and filtering capability for an entire high-end GeForce GTX 200 GPU is 80 pixels per clock. GeForce GTX 200 GPUs employ a more efficient scheduler, allowing the chips to attain close to theoretical peak performance in texture filtering.
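To put the 80 pixels per clock figure in more familiar units, the sketch below converts a per-clock rate into pixels per second. The clock value of 602 MHz is an illustrative assumption, since this excerpt does not state a clock speed, and the function name is hypothetical.

```python
PIXELS_PER_CLOCK = 80           # from the text: bilinear addressing/filtering per clock
EXAMPLE_CORE_CLOCK_MHZ = 602.0  # assumption: illustrative clock, not given in this excerpt


def bilinear_fill_rate_gpixels(pixels_per_clock: int, clock_mhz: float) -> float:
    """Peak bilinear-filtered pixels per second, in billions (Gpixels/s)."""
    return pixels_per_clock * clock_mhz * 1e6 / 1e9


rate = bilinear_fill_rate_gpixels(PIXELS_PER_CLOCK, EXAMPLE_CORE_CLOCK_MHZ)
print(f"{rate:.2f} Gpixels/s")  # 48.16 Gpixels/s
```

This is a theoretical peak; the text only claims that the more efficient scheduler lets the chips get close to it.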
Page 16: Higher Shader To Texture Ratio
Because games and other visual applications employ increasingly complex shaders, the GeForce GTX 200 GPU design shifts the balance to a higher shader-to-texture ratio. By adding one more SM to each TPC, and keeping texturing hardware constant, the shader-to-texture ratio is increased by 50%.
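The 50% figure follows directly from going from two SMs per TPC to three while the texture hardware stays fixed. The sketch below works this through; the counts of 8 shader processors per SM and 8 texture units per TPC are assumptions for illustration (the percentage does not depend on them), as is the starting point of two SMs per TPC.

```python
SPS_PER_SM = 8         # assumption: shader processors per SM
TEX_UNITS_PER_TPC = 8  # assumption: texture units per TPC, held constant


def shader_to_texture_ratio(sms_per_tpc: int) -> float:
    """Shader processors per texture unit within one TPC."""
    return sms_per_tpc * SPS_PER_SM / TEX_UNITS_PER_TPC


prior = shader_to_texture_ratio(2)    # assumed prior generation: 2 SMs per TPC
current = shader_to_texture_ratio(3)  # one more SM added to each TPC
increase = (current - prior) / prior

print(prior, current)     # 2.0 3.0
print(f"{increase:.0%}")  # 50%
```

Because the texture unit count cancels out of the relative change, any fixed texturing budget gives the same 50% increase when the SM count goes from two to three.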
Page 17: Geometry Shading And Stream Out
GeForce GTX 200 GPUs compared to the prior generation, providing much faster geometry shading and stream out performance. Figure 8 shows the latest RightMark 3D 2.0 benchmark results, including geometry shading tests. The GeForce GTX 280 GPU is significantly faster than prior generation NVIDIA GPUs and competitive products.
Chart title: Geometry Shader Performance, RightMark 3D 2.0 - Hyperlight Heavy...
Page 18: Power Management Enhancements
HybridPower™ mode (effectively 0 W)
Using a HybridPower-capable nForce motherboard, such as those based on the nForce 780a chipset, a GeForce GTX 200 GPU can be fully powered off when not performing intensive graphics operations, and graphics output can be handled by the motherboard GPU (mGPU).
Page 19 been modified to improve efficiency of data transfer between the driver and the front end. The memory crossbar between the data assembler and the frame buffer units has been optimized, allowing the GeForce GTX 200 GPUs to run at full speed when performing indexed primitive fetches (unlike the prior generation which suffered some contention between the front end and data assembler).
Page 20: Summary
3D gaming experiences and teraflop performance for demanding high-end compute-intensive applications. NVIDIA SLI technology is taken to new levels with GeForce GTX 200 GPUs, and NVIDIA PhysX technology will add amazing new graphical effects to upcoming game titles. CUDA applications will benefit from additional cores, far more threads, double-precision math, and increased register file size.
Page 21: Appendix A: Retrospective
GeForce 9800 GX2 graphics boards to be built more efficiently, while offering up to twice the performance of the GeForce 8800 GTX. As of May 2008, over 70 million NVIDIA GeForce 8 and 9 Series GPUs have shipped and each supports CUDA technology, allowing greatly accelerated performance for mainstream visual computing applications like audio and video encoding and transcoding, image processing, and photo editing.
Page 22: Appendix B: Figure 1 References
GeForce 8800 GTS 512 were run on Asus P5K-V motherboard (Intel G33 based) with 2 GB DDR2 system memory.
Based on an extrapolation of 1 min 50 sec 1280 × 720 high-definition movie clip.
http://developer.nvidia.com/object/matlab_cuda.html
“High performance direct gravitational N-body simulations on graphics processing units paper,” communicated by E.P.J. van den Heuvel
“LIBOR,”...
Page 23 No license is granted by implication or otherwise under any patent or patent rights of NVIDIA Corporation. Specifications mentioned in this publication are subject to change without notice. This publication supersedes and replaces all information previously supplied. NVIDIA Corporation products are not authorized for use as critical components in life support devices or systems without express written approval of NVIDIA Corporation.
