Parallel Computing Architecture - NVIDIA GeForce GTX 200 GPU Technical Brief

Architectural overview

Parallel Computing Architecture

Figure 5 depicts a high-level view of the GeForce GTX 280 GPU parallel
computing architecture. A hardware-based thread scheduler at the top schedules
threads across the TPCs. Note that in compute mode the architecture also
includes texture caches and memory interface units. The texture caches combine
memory accesses for more efficient, higher-bandwidth memory read/write
operations. The elements labeled "atomic" refer to the ability to perform
atomic read-modify-write operations to memory. Atomic access provides
fine-grained access to shared memory locations and facilitates parallel
reductions and parallel data structure management.
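As an illustration of the atomic capability described above, the following is a minimal CUDA sketch (not taken from the brief; the kernel and names are hypothetical) of a histogram built with atomic read-modify-write operations, one of the parallel data-structure patterns these units enable:

```cuda
#include <cuda_runtime.h>

#define NUM_BINS 256

// Hypothetical kernel: each thread classifies one input element and
// atomically increments the matching bin. atomicAdd performs an atomic
// read-modify-write on the memory location, so concurrent threads
// updating the same bin cannot lose counts.
__global__ void histogramKernel(const unsigned int *data,
                                unsigned int *bins, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // data[i] is assumed to be a bin index in [0, NUM_BINS)
        atomicAdd(&bins[data[i]], 1u);
    }
}
```

Without the atomic operation, two threads reading, incrementing, and writing the same bin could interleave and drop an update; the hardware atomic units serialize only the conflicting accesses.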

Figure 5: GeForce GTX 280 GPU Parallel Computing Architecture

A TPC in compute mode is shown in Figure 6 below. Note that local shared
memory is included in each of the three SMs. Each processing core in an SM can
share data with the other cores in its SM through this shared memory, without
reading from or writing to an external memory subsystem. This contributes
greatly to increased computational speed and efficiency for a variety of algorithms.
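The on-chip sharing described above can be sketched with a per-block sum reduction in CUDA (an illustrative example, not code from the brief): threads in a block exchange partial sums entirely through the SM's shared memory, and only the final result touches external device memory.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: each block reduces blockDim.x elements to one sum.
// Assumes the block size is 256 threads and a power of two.
__global__ void blockSumKernel(const float *in, float *out, int n)
{
    __shared__ float partial[256];          // on-chip shared memory, one slot per thread

    int tid = threadIdx.x;
    int i   = blockIdx.x * blockDim.x + tid;

    partial[tid] = (i < n) ? in[i] : 0.0f;  // stage one element per thread
    __syncthreads();                        // all stores visible block-wide

    // Tree reduction in shared memory: each step halves the active threads,
    // with no traffic to the external memory subsystem.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride)
            partial[tid] += partial[tid + stride];
        __syncthreads();
    }

    if (tid == 0)
        out[blockIdx.x] = partial[0];       // single write per block to device memory
}
```

The design point is the one the text makes: every intermediate exchange stays in the SM's shared memory, so the kernel performs only one global read per thread and one global write per block.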
May, 2008 | TB-04044-001_v01
