Nvidia GeForce GTX 980 White Paper

Whitepaper
NVIDIA GeForce GTX 980
Featuring Maxwell, The Most Advanced GPU Ever Made.
V1.1

Summary of Contents for Nvidia GeForce GTX 980

  • Page 1 Whitepaper NVIDIA GeForce GTX 980 Featuring Maxwell, The Most Advanced GPU Ever Made. V1.1...
  • Page 2: Table Of Contents

    GeForce GTX 980 Whitepaper Table of Contents: Introduction (3); Extraordinary Gaming Performance for the Latest Displays (3); Incredible Energy Efficiency (4); Dramatic Leap Forward In Lighting with VXGI (5); GM204 Hardware Architecture In-Depth (6); Maxwell Streaming Multiprocessor (8); PolyMorph Engine 3.0 ...
  • Page 3: Introduction

    In our 21-year quest to bring the most realistic 3D graphics to gamers, NVIDIA has introduced a number of innovations. With its hardware transform and lighting engine, NVIDIA’s GeForce 256 ushered in the era of the GPU in 1999, bringing T&L support to consumer graphics for the first time. In late 2006, with the G80 GPU and the GeForce 8800 GTX graphics board, CUDA and the formalization of GPGPU computing as we know it today were first brought to the world.
  • Page 4: Incredible Energy Efficiency

    All GeForce GPUs are made to deliver performance leadership in their respective segments. But PC gamers also expect efficiency rather than noisy fans, excessive heat, and over-taxed power supplies. The desktop PC gaming market is seeing increasing popularity of smaller and thinner form factors, and the laptop PC gaming market has grown explosively in the past few years.
  • Page 5: Dramatic Leap Forward In Lighting With VXGI

    Maxwell GPUs. The first graphics cards to ship with our new GM204 GPU are the GeForce GTX 980 and GeForce GTX 970. For simplicity, we’ll focus solely on the GeForce GTX 980 in this document. With 2048 CUDA Cores and 4GB of GDDR5 memory, GM204 is the fastest GPU in the world, yet with a TDP of just 165W, it’s also the most efficient.
  • Page 6: GM204 Hardware Architecture In-Depth

    Multiprocessors (SMs), and memory controllers. GM204 consists of four GPCs, 16 Maxwell SMs (SMM), and four memory controllers. GeForce GTX 980 uses the full complement of these architectural components (if you are not well versed in these structures, we suggest you first read the...
  • Page 7 In GeForce GTX 980, each GPC ships with a dedicated raster engine and four SMMs. Each SMM has 128 CUDA cores, a PolyMorph Engine, and eight texture units. With 16 SMMs, the GeForce GTX 980 ships with a total of 2048 CUDA cores and 128 texture units.
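
The totals quoted on this page follow directly from the per-unit counts. A minimal sketch of that arithmetic, using only the figures stated above (the function and constant names are illustrative, not NVIDIA terminology):

```python
# Per-unit counts for GM204 as stated in the whitepaper excerpt.
GPCS = 4                    # Graphics Processing Clusters
SMMS_PER_GPC = 4            # Maxwell SMs per GPC
CUDA_CORES_PER_SMM = 128
TEXTURE_UNITS_PER_SMM = 8


def gm204_totals():
    """Return (SMM count, CUDA core count, texture unit count) for the full chip."""
    smms = GPCS * SMMS_PER_GPC
    cuda_cores = smms * CUDA_CORES_PER_SMM
    texture_units = smms * TEXTURE_UNITS_PER_SMM
    return smms, cuda_cores, texture_units


if __name__ == "__main__":
    smms, cores, tex = gm204_totals()
    print(f"{smms} SMMs, {cores} CUDA cores, {tex} texture units")
```
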
  • Page 8: Maxwell Streaming Multiprocessor

    from 32 to 64. Again, thanks to the added benefit of higher clocks, pixel fill-rate is actually more than double that of GTX 680: 72 Gpixels/sec for GTX 980 versus 32.2 Gpixels/sec for GTX 680.
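
The fill-rate figures can be approximated as ROP count times clock rate. The sketch below assumes one pixel per ROP per clock and uses the published base clocks of the two boards (1126 MHz for GTX 980, 1006 MHz for GTX 680); neither the clocks nor the one-pixel-per-clock model appear in this excerpt, so treat them as assumptions.

```python
def pixel_fill_rate_gpix(rops, clock_mhz):
    """Peak pixel fill rate in Gpixels/sec, assuming one pixel per ROP per clock."""
    return rops * clock_mhz / 1000.0


# Assumed base clocks (not stated in this excerpt).
GTX_980 = pixel_fill_rate_gpix(rops=64, clock_mhz=1126)
GTX_680 = pixel_fill_rate_gpix(rops=32, clock_mhz=1006)

if __name__ == "__main__":
    print(f"GTX 980: {GTX_980:.1f} Gpixels/sec")
    print(f"GTX 680: {GTX_680:.1f} Gpixels/sec")
    print(f"ratio:   {GTX_980 / GTX_680:.2f}x")
```

Doubling the ROP count alone would give exactly 2x; the remainder of the gain in this model comes from the higher clock.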
  • Page 9: PolyMorph Engine 3.0

    and power that had to be spent to manage data transfer in the more complex datapath organization used by Kepler. Compared to Kepler, the SMM’s memory hierarchy has also changed. Rather than implementing a...
  • Page 10: GM204 Memory Subsystem

    2048KB L2 cache that is shared across the GPU. In addition, GM204 has made significant enhancements to our memory compression implementation. To reduce DRAM bandwidth demands, NVIDIA GPUs make use of lossless compression techniques as data is written out to memory. The bandwidth savings from this compression are realized a second time when clients such as the Texture Unit later read the data.
  • Page 11: New Display And Video Engines

    4K MST displays in Kepler. GeForce GTX 980 is also the world’s first GPU to support HDMI 2.0. Whereas the previous generation, HDMI 1.4, can support 4K display at 30Hz for “444” RGB pixels, and 60Hz for “420” YUV pixels, with...
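
The 30Hz and 60Hz limits for HDMI 1.4 can be checked with a rough link-budget calculation. This sketch assumes the standard 4K timing of 4400×2250 total pixels per frame (including blanking), 8 bits per component, and maximum TMDS clocks of 340 MHz for HDMI 1.4 and 600 MHz for HDMI 2.0; none of these values come from this excerpt. With 4:2:0 subsampling at 8 bits, the link carries half as much data per pixel, modelled here as halving the required clock.

```python
# Assumed total frame size for 3840x2160 including blanking intervals.
H_TOTAL, V_TOTAL = 4400, 2250

# Assumed maximum TMDS clock per HDMI generation, in MHz.
TMDS_LIMIT_MHZ = {"HDMI 1.4": 340.0, "HDMI 2.0": 600.0}


def tmds_clock_mhz(refresh_hz, chroma):
    """Required TMDS clock in MHz for 4K at 8 bits per component.

    chroma is "444" (full-resolution colour) or "420" (subsampled chroma).
    """
    pixel_clock = H_TOTAL * V_TOTAL * refresh_hz / 1e6
    return pixel_clock / 2 if chroma == "420" else pixel_clock


def fits(link, refresh_hz, chroma):
    """True if the mode fits within the link's assumed TMDS clock limit."""
    return tmds_clock_mhz(refresh_hz, chroma) <= TMDS_LIMIT_MHZ[link]


if __name__ == "__main__":
    for link in TMDS_LIMIT_MHZ:
        for refresh, chroma in [(30, "444"), (60, "420"), (60, "444")]:
            verdict = "fits" if fits(link, refresh, chroma) else "does not fit"
            print(f"{link}: 4K {refresh}Hz {chroma} {verdict}")
```
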
  • Page 12 DP 1.2, eDP 1.4 DP 1.2 When combined with a G-SYNC display, the GeForce GTX 980 delivers a gaming experience that’s free of the distracting screen tearing that currently plagues gaming when Vsync is disabled. G-SYNC also eliminates display-subsystem-generated stutter and reduces input lag that gamers put up with today.
  • Page 13: Maxwell: Enabling The Next Frontier In PC Graphics

    Correct modelling of lighting is the most computationally difficult problem in graphics, and with Maxwell, our objective was to make an enormous leap forward in the capability of the GPU to perform near-photo-real lighting calculations in real time.
  • Page 14 Figure 4: Scene rendered with direct lighting only. Figure 5: Same scene with GI enabled; note the indirect light and specular reflections on the floor. Because it’s a computationally expensive lighting technique (particularly in highly detailed scenes), GI has been primarily used to render complex CG scenes in movies using offline GPU rendering farms.
  • Page 15 GPU. This new GI technology uses a voxel grid to store scene and lighting information, and a novel voxel cone tracing process to gather indirect lighting from the voxel grid. NVIDIA’s Cyril Crassin describes the technique in his paper on the topic...
  • Page 16 The following picture illustrates this approach applied to a single triangle. In the picture, each cube that contains a portion of the purple triangle is highlighted in blue.
  • Page 17 Figure 8: Voxelization results. Once the voxel coverage stage is complete, we store information into each voxel describing how the physical geometry will respond to light. This includes encoding the material’s opacity and emissive/reflective properties.
  • Page 18 Figure 9: Results of light injection. After these steps, we now have a complete voxel data structure representing the lighting information in the scene. The next and final step is to rasterize the scene. This step is largely the same as it would be for a scene rendered with other lighting approaches;...
  • Page 19 Rendering a real-time reflection from a glossy curved surface has been difficult in the past. In order to render glossy reflections, you would traditionally need to launch hundreds or thousands of scattered secondary rays for each ray that bounces from the original reflector.
  • Page 20 We also use the same approach to quickly compute diffuse or specular lighting with only a few scattered cones. As a result, we’re able to compute approximate GI at high frame rates in real time, allowing us to realistically render glossy and metallic surfaces.
  • Page 21: Hardware Acceleration For VXGI - Multi-Projection And Conservative Raster

    One exciting property of VXGI is that it is very scalable: by changing the density of the voxel grid, and the amount of tracing of that voxel grid that is performed per pixel, it is possible for VXGI to run across a wide range of hardware, including Kepler GPUs, console hardware, etc.
  • Page 22 Figure 11: Illustration of conservative raster coverage rules. Hardware support for conservative raster is very helpful for the coverage phase of voxelization. In this phase, fractional coverage of each voxel needs to be determined with high accuracy to ensure the voxelized 3D grid represents the original 3D triangle data properly.
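
To make the coverage rule concrete, here is a minimal 2D software sketch contrasting standard rasterization (a pixel is covered only if its center lies inside the triangle) with conservative rasterization (a pixel is covered if the triangle touches any part of it). It uses a separating-axis overlap test on unit-square pixels and assumes a non-degenerate triangle; it illustrates the rule only and says nothing about how Maxwell implements it in hardware. All function names are hypothetical.

```python
def edge(a, b, p):
    """Signed area term: positive if p is to the left of the directed edge a->b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def point_in_triangle(tri, p):
    """True if p is inside or on the boundary of tri (either winding order)."""
    a, b, c = tri
    w = (edge(a, b, p), edge(b, c, p), edge(c, a, p))
    return all(v >= 0 for v in w) or all(v <= 0 for v in w)


def triangle_overlaps_pixel(tri, x, y):
    """Separating-axis test between tri and the unit square [x, x+1] x [y, y+1]."""
    xs = [v[0] for v in tri]
    ys = [v[1] for v in tri]
    # Axes 1 and 2: the pixel's own x and y axes (bounding-box rejection).
    if max(xs) < x or min(xs) > x + 1 or max(ys) < y or min(ys) > y + 1:
        return False
    corners = [(x, y), (x + 1, y), (x, y + 1), (x + 1, y + 1)]
    # Axes 3 to 5: the normals of the three triangle edges.
    for i in range(3):
        a, b, c = tri[i], tri[(i + 1) % 3], tri[(i + 2) % 3]
        side = edge(a, b, c)            # triangle spans [0, side] on this axis
        ds = [edge(a, b, p) for p in corners]
        lo, hi = min(0.0, side), max(0.0, side)
        if max(ds) < lo or min(ds) > hi:
            return False                # pixel lies entirely outside the span
    return True


def coverage(tri, width, height, conservative):
    """Return the set of (x, y) pixels covered under the chosen rule."""
    covered = set()
    for y in range(height):
        for x in range(width):
            if conservative:
                hit = triangle_overlaps_pixel(tri, x, y)
            else:
                hit = point_in_triangle(tri, (x + 0.5, y + 0.5))
            if hit:
                covered.add((x, y))
    return covered


if __name__ == "__main__":
    tri = ((0.2, 0.2), (3.8, 0.6), (1.0, 3.7))
    standard = coverage(tri, 5, 5, conservative=False)
    conservative = coverage(tri, 5, 5, conservative=True)
    print("standard:    ", len(standard), "pixels")
    print("conservative:", len(conservative), "pixels")
```

The conservative set is always a superset of the standard one, which is why it suits voxelization: no voxel that the triangle touches is missed.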
  • Page 23: Tiled Resources

    DirectX 11.2 introduced a feature called Tiled Resources that could be accelerated with an NVIDIA Kepler and Maxwell hardware feature called Sparse Texture. With Tiled Resources, only the portions of the textures required for rendering are stored in the GPU’s memory. Tiled Resources works by breaking textures down into tiles (pages), and the application determines which tiles might be needed and loads them into video memory.
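
The tile bookkeeping described here can be sketched in a few lines. The example below assumes 64 KB tiles, which for a 4-byte-per-texel 2D texture corresponds to 128×128 texels; the class and method names are hypothetical and do not correspond to any DirectX or driver API.

```python
TILE_TEXELS = 128        # assumed tile edge length for a 32-bit 2D texture
BYTES_PER_TEXEL = 4
TILE_BYTES = TILE_TEXELS * TILE_TEXELS * BYTES_PER_TEXEL   # 64 KB


class TiledTexture:
    """Tracks which tiles of a large texture are resident in video memory."""

    def __init__(self, width, height):
        self.tiles_x = -(-width // TILE_TEXELS)    # ceiling division
        self.tiles_y = -(-height // TILE_TEXELS)
        self.resident = set()

    def tiles_for_region(self, x0, y0, x1, y1):
        """Tiles touched by the texel rectangle [x0, x1) x [y0, y1)."""
        return {
            (tx, ty)
            for ty in range(y0 // TILE_TEXELS, (y1 - 1) // TILE_TEXELS + 1)
            for tx in range(x0 // TILE_TEXELS, (x1 - 1) // TILE_TEXELS + 1)
        }

    def request_region(self, x0, y0, x1, y1):
        """Mark the tiles needed for a region as resident; return the newly loaded ones."""
        needed = self.tiles_for_region(x0, y0, x1, y1)
        newly_loaded = needed - self.resident
        self.resident |= needed
        return newly_loaded

    def resident_bytes(self):
        return len(self.resident) * TILE_BYTES


if __name__ == "__main__":
    tex = TiledTexture(4096, 4096)
    tex.request_region(100, 100, 300, 200)
    full = tex.tiles_x * tex.tiles_y * TILE_BYTES
    print(f"resident: {tex.resident_bytes()} bytes of {full} total")
```
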
  • Page 24: Raster Ordered View

    Figure 13: Fixed resolution vs multi resolution shadow map quality. This is another application of multi-projection that will benefit from the hardware acceleration in Maxwell. In the future, we also believe that tiled resources can be leveraged within VXGI, to save voxel memory footprint.
  • Page 25: DirectX 12

    GPU and CPU functions. While the NVIDIA driver very efficiently manages resource allocation and synchronization under DX11, under DX12 it is the game developer’s responsibility to manage the GPU.
  • Page 26 In addition, the DX12 release of DirectX will introduce a number of new features for graphics rendering. Microsoft has disclosed some of these features at GDC and during NVIDIA’s Editor’s conference. Conservative Raster, discussed earlier in the GI section of this paper, is one such DX graphics feature.
  • Page 27: Advancing The State-Of-The-Art In Image Quality

    Game developers and GPU vendors are increasingly implementing more advanced forms of anti-aliasing (AA) to enhance image quality. GM2xx GPUs have a number of new features for much more flexible sampling, enabling further advancements in AA quality and efficiency.
  • Page 28 NVIDIA engineers have recently developed new AA algorithms that vary, in interleaved fashion, the sample patterns used per pixel either spatially in a single frame (where, for example, each successive pixel uses one of four different 4xAA sample patterns) or interleaved across multiple frames in time.
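
The two interleaving modes can be pictured as two ways of choosing a pattern index per pixel. The sketch below is hypothetical: the 2×2 spatial tiling and the modulo-based temporal alternation are assumptions made for illustration, and the actual sample positions and selection logic are not given in this excerpt.

```python
NUM_SPATIAL_PATTERNS = 4   # "one of four different 4xAA sample patterns"


def spatial_pattern_index(x, y):
    """Pick a pattern per pixel so each 2x2 block uses all four patterns (assumed tiling)."""
    return (x % 2) + 2 * (y % 2)


def temporal_pattern_index(frame, num_patterns=2):
    """Pick a pattern per frame, cycling through num_patterns (assumed scheme)."""
    return frame % num_patterns


if __name__ == "__main__":
    for y in range(2):
        print([spatial_pattern_index(x, y) for x in range(4)])
    print([temporal_pattern_index(f) for f in range(6)])
```
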
  • Page 29: Dynamic Super Resolution

    To address the usability and quality issues, NVIDIA has developed a method called Dynamic Super Resolution. In principle, Dynamic Super Resolution works like traditional downsampling, but it has a simple on/off user control, and it uses a 13-tap Gaussian filter during the conversion to display resolution.
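
As a rough illustration of Gaussian-filtered downsampling, the sketch below applies a 13-tap kernel to a single row of pixel values while reducing its resolution. The layout of the 13 taps, the sigma value, and the edge handling used by Dynamic Super Resolution are not described in this excerpt; the 1D kernel, `sigma=2.0`, and clamp-to-edge behaviour here are assumptions chosen to keep the example small.

```python
import math


def gaussian_taps(n=13, sigma=2.0):
    """Return n normalized Gaussian weights centred on the middle tap."""
    half = n // 2
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def downsample_row(row, factor=2, taps=None):
    """Downsample a 1D row by an integer factor using a Gaussian-weighted average."""
    taps = taps if taps is not None else gaussian_taps()
    half = len(taps) // 2
    out = []
    for o in range(len(row) // factor):
        center = o * factor + factor // 2
        acc = 0.0
        for k, w in enumerate(taps):
            # Clamp source indices at the row edges (assumed edge handling).
            src = min(max(center + k - half, 0), len(row) - 1)
            acc += w * row[src]
        out.append(acc)
    return out


if __name__ == "__main__":
    high_res = [0.0] * 8 + [1.0] * 8       # a hard edge at 2x display resolution
    print([round(v, 3) for v in downsample_row(high_res)])
```

Because the weights sum to one, flat regions keep their value while hard edges are spread over several output pixels.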
  • Page 30 While it’s compatible with all GeForce GPUs, the best performance can be seen when using a GeForce GTX 980. Going forward we could potentially use Maxwell’s more advanced sampling control features, like programmable sample positions and interleaved sampling, to further...
  • Page 31: Conclusion

    ROPs, and enhanced compression algorithms, the GeForce GTX 980 delivers exceptional performance with MSAA. When it comes to AA, the GeForce GTX 980 has been tailored to offer the crispest visuals available on the PC. The GeForce GTX 980 supports new features for sampling control that will enable new AA techniques like MFAA, allowing lower level AA sample patterns to be perceived as higher quality AA, but with the faster performance of lower AA levels.
  • Page 32: Legal Notice

    NVIDIA, the NVIDIA logo, CUDA, FERMI, KEPLER, MAXWELL and GeForce are trademarks or registered trademarks of NVIDIA Corporation in the United States and other countries. Other company and product names may be trademarks of the respective companies with which they are associated.