Larger Register File; Figure 7: Local Register File 2× Versus 1; Table 3: Maximum Number Of Threads - Nvidia GeForce GTX 200 GPU Technical Brief

Table 3: Maximum Number of Threads

Chip                    TPCs   SM per TPC   Threads per SM   Total Threads per Chip
GeForce 8 & 9 Series    8      2            768              12,288
GeForce GTX 200 GPUs    10     3            1,024            30,720

Doing the math results in 32 warps × 32 threads per warp, or 1,024 maximum
concurrent threads that can be managed by each SM. With 30 SMs in total
(3 SMs/TPC × 10 TPCs), the GeForce GTX 280 supports up to 30,720 concurrent
threads in hardware (versus 768 threads/SM × 2 SMs/TPC × 8 TPCs = 12,288
maximum concurrent threads in GeForce 8800 GTX).
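
The arithmetic above can be reproduced with a short script. This is only an illustrative sketch: the `GpuConfig` type and its field names are hypothetical helpers, and the TPC, SM, and thread counts are the ones quoted in Table 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuConfig:
    """Hypothetical container for the per-chip figures listed in Table 3."""
    name: str
    tpcs: int             # texture processing clusters per chip
    sms_per_tpc: int      # streaming multiprocessors per TPC
    threads_per_sm: int   # maximum concurrent threads managed by one SM

    @property
    def total_sms(self) -> int:
        return self.tpcs * self.sms_per_tpc

    @property
    def max_threads(self) -> int:
        return self.total_sms * self.threads_per_sm


GEFORCE_8800_GTX = GpuConfig("GeForce 8800 GTX", tpcs=8, sms_per_tpc=2,
                             threads_per_sm=768)
# 32 x 32 = 1,024 threads per SM, as stated in the text.
GEFORCE_GTX_280 = GpuConfig("GeForce GTX 280", tpcs=10, sms_per_tpc=3,
                            threads_per_sm=32 * 32)

if __name__ == "__main__":
    for gpu in (GEFORCE_8800_GTX, GEFORCE_GTX_280):
        print(f"{gpu.name}: {gpu.total_sms} SMs, "
              f"{gpu.max_threads:,} concurrent threads")
```

Keeping the three factors separate makes it easy to see that the gain comes both from more SMs (30 versus 16) and from more threads per SM (1,024 versus 768).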

Larger Register File

The local register file size has doubled per SM in GeForce GTX 200 GPUs
compared to GeForce 8 & 9 Series GPUs. With long shaders, the older GPUs could
exhaust their registers and be forced to spill to memory. The much larger
register file lets larger and more complex shaders run faster and more
efficiently on GeForce GTX 200 GPUs. In terms of die size, the additional
register file takes up only a small fraction of SM die area.
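
One way to see why a larger register file helps is to estimate how many threads fit in it at a given register cost per thread. The sketch below is a deliberately simplified model, not the hardware's actual allocation scheme: the per-SM register counts (8,192 and 16,384 32-bit registers) are assumptions taken from CUDA compute-capability tables rather than from this brief, the function name is hypothetical, and warp and block allocation granularity is ignored.

```python
# Assumed register-file sizes per SM (not stated in this brief).
REGS_PER_SM_1X = 8_192    # assumed for GeForce 8 & 9 Series
REGS_PER_SM_2X = 16_384   # assumed for GeForce GTX 200 (2x)


def resident_threads(regs_per_sm: int, regs_per_thread: int,
                     max_threads_per_sm: int) -> int:
    """Estimate how many threads fit in the register file of one SM.

    The result is capped by the SM's thread limit. Allocation granularity
    is ignored, so real hardware may fit fewer threads than this estimate.
    """
    if regs_per_thread <= 0:
        raise ValueError("regs_per_thread must be positive")
    return min(max_threads_per_sm, regs_per_sm // regs_per_thread)


if __name__ == "__main__":
    for regs in (10, 16, 32):
        older = resident_threads(REGS_PER_SM_1X, regs, 768)
        newer = resident_threads(REGS_PER_SM_2X, regs, 1024)
        print(f"{regs} regs/thread: {older} (1x) vs {newer} (2x) threads per SM")
```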

Games are employing more and more complex shaders that require more register
space. Figure 7 below highlights the performance improvement from the 2×
register file size in 3D Mark Vantage.
Figure 7: Local Register File 2× versus 1×
[Chart: "2x vs 1x Register File Size"; 3D Mark Vantage, Extreme Preset; y-axis: Overall Score, 3600 to 4800; series: Normal LRF (2x) and Decreased LRF (1x)]

May, 2008 | TB-04044-001_v01
