ATI CTM Guide
Technical Reference Manual
Version 1.01

Summary of Contents for AMD ATI CTM

  • Page 2 AMD’s products are not designed, intended, authorized or warranted for use as components in systems intended for surgical implant into the body, or in other applications intended to support or sustain life, or in any other application in which the failure of AMD’s product could create a situation where personal injury, death, or severe property or environmental damage may occur.
  • Page 3: Table Of Contents

    3.4 Texture Instructions..........32 3.4.1 Operations 3.4.2 Semaphore © 2006 Advanced Micro Devices, Inc. ATI CTM Guide v. 1.01...
  • Page 4 5.1 Functions: 5.1.1 amCloseManagedConnection, 5.1.2 amCommandBufferConsumed, 5.1.3 amOpenManagedConnection, 5.1.4 amSubmitCommandBuffer
  • Page 5: Chapter 1: Introduction

    This manual provides a programmatic overview of the CTM. Audience: This manual is intended for experienced design engineers. Related Documents: • ATI CTM Device Interface, included with the CTM distribution. • Assembler/Disassembler Guide, included with the CTM distribution.
  • Page 7: Chapter 2: Specifications

    These commands reside in memory (see ATI CTM Device Interface for further information). This chapter specifies how CTM reads these commands and its behavior upon processing each.
  • Page 8: The ATI Data Parallel Processor Array

    The result of the entire computation is as if the program were executed in SIMD across all index pairs.
  • Page 9: Conditional Operation Unit

    32-bit address range. The address mapping is system-dependent and is described in the ATI CTM Device Interface. The MC computes the memory address as a function of an index pair, (x, y), the number of elements in each row of data (pitch), a base address offset (offset), a tiling format (linear or tiled plus an optional 2x2 superfine tiling on single-channel input data reads), and the bytes per element (bpe) derived from the data format (format).
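The excerpt names the inputs to the address computation but not the formula itself. As a minimal sketch for the LINEAR case only, assuming a row-major layout, a pitch counted in elements, and an offset counted in bytes (none of which this excerpt confirms), the calculation could look like the following. The function name `linear_address` is hypothetical, and the tiled formats are not covered.

```python
def linear_address(x, y, pitch, offset, bpe):
    """Byte address of element (x, y) under an ASSUMED row-major linear layout.

    x, y   -- index pair
    pitch  -- elements per row (assumed unit)
    offset -- base address offset in bytes (assumed unit)
    bpe    -- bytes per element, derived from the data format
    """
    return offset + (y * pitch + x) * bpe


# Element (3, 2) of a 64-element-wide buffer of 4-byte values at 0x1000.
example = linear_address(3, 2, pitch=64, offset=0x1000, bpe=4)
```

The actual mapping is system-dependent and is governed by the ATI CTM Device Interface.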
  • Page 10 16 program inputs, and must be invalidated to guarantee correct reading of data that has changed in memory. The input read cache is invalidated with the inv_inp_cache command (see page 20).
  • Page 11 The constant read caches must be invalidated to guarantee that data that has changed in memory is properly read. The constant read caches can be invalidated with the inv_constf_cache, inv_consti_cache, and inv_constb_cache commands.
  • Page 12: CTM Commands

    (i0, j0) 1: j0 - (i1, j1) inclusive. 2: i1 3: j1 start_program x'C0000800 'yes 0: Reserved Instruct the Processor Execution Unit to start the program.
  • Page 13 Flush the conditional output cache. set_out_mask x'C0001900 'yes 0: Mask Set the write mask for the output channels. set_cond_out_mask x'C0001A00 'yes 0: Mask Set the write mask for the conditional output buffer.
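The command table lists each command as an opcode plus numbered parameters. The sketch below shows one way such a stream could be assembled as 32-bit words. It assumes each opcode word is followed directly by its parameters and that words are little-endian; this excerpt confirms neither, and the ATI CTM Device Interface is the authority on the real layout. `encode_commands` is a hypothetical helper, and the mask value `0xF` is an arbitrary illustration.

```python
import struct

# Opcodes quoted in the command table of this manual.
START_PROGRAM = 0xC0000800
SET_OUT_MASK = 0xC0001900


def encode_commands(commands):
    """Flatten (opcode, params) pairs into a list of 32-bit words.

    ASSUMPTION: each opcode word is immediately followed by its parameters.
    """
    words = []
    for opcode, params in commands:
        words.append(opcode & 0xFFFFFFFF)
        words.extend(p & 0xFFFFFFFF for p in params)
    return words


# Hypothetical stream: set an output mask, then start the program
# (start_program takes one reserved parameter, written here as 0).
words = encode_commands([(SET_OUT_MASK, [0xF]), (START_PROGRAM, [0])])

# ASSUMPTION: little-endian packing of the 32-bit words.
buffer_bytes = struct.pack("<%dI" % len(words), *words)
```

Counting the buffer in 32-bit words matches the size unit that amSubmitCommandBuffer expects, as described in Chapter 5.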
  • Page 14: Processor Execution Unit Commands

    Section 3.2. These are unsigned integers, of a range given by the bits in use. Parameter 0: i0 Bits Field Name Description 11:0 The i0 domain parameter. 31:12 Reserved Reserved
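Because the i0 parameter occupies bits 11:0 and bits 31:12 are reserved, a domain value must fit in 12 bits. The following sketch validates and packs such a value. The helper name `pack_domain_param` is hypothetical, and writing the reserved bits as zero is an assumption.

```python
DOMAIN_FIELD_BITS = 12                       # bits 11:0 per the table above
DOMAIN_FIELD_MASK = (1 << DOMAIN_FIELD_BITS) - 1


def pack_domain_param(value):
    """Return a 32-bit parameter word holding `value` in bits 11:0.

    Reserved bits 31:12 are left as zero (an assumption).
    """
    if not 0 <= value <= DOMAIN_FIELD_MASK:
        raise ValueError("domain parameter must fit in 12 bits: %r" % value)
    return value & DOMAIN_FIELD_MASK


largest = pack_domain_param(4095)
```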
  • Page 15 The wait_for_idle command takes one reserved parameter. After receiving this command, the PE blocks all further command processing until all processors in the DPP are idle. Once all processors are idle, processing of subsequent commands resumes.
  • Page 16 The read_perf_counters command takes one parameter. This command transfers the performance counters to an area in memory beginning at the GPU address supplied in the parameter. The counters are written as 32-bit unsigned integers in the order given in Section 2.1.2.
  • Page 17: Memory Controller Unit Commands

    The pitch in multiples of 4 15:13 Reserved Reserved 17:16 tiling Tiling format (possible values): • 0 - LINEAR • 1 - TILED • 2 - LINEAR_INP_2X2 • 3 - TILED_INP_2X2
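The tiling field occupies bits 17:16 of the format parameter. As a small illustration, the sketch below extracts that field and maps it to the names listed above. The function name `tiling_mode` is hypothetical. The other fields of the word are left alone because their bit positions are truncated in this excerpt.

```python
TILING_SHIFT = 16
TILING_MASK = 0x3  # two bits: 17:16

TILING_NAMES = {
    0: "LINEAR",
    1: "TILED",
    2: "LINEAR_INP_2X2",
    3: "TILED_INP_2X2",
}


def tiling_mode(format_word):
    """Return the tiling format name encoded in bits 17:16 of format_word."""
    return TILING_NAMES[(format_word >> TILING_SHIFT) & TILING_MASK]


mode = tiling_mode(2 << TILING_SHIFT)
```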
  • Page 18 The second contains pitch, tiling, and format information for the program data. Parameter 0: base address base address Bits Field Name Description 10:0 Reserved Reserved 31:11 base address The 2K-aligned address at which the first program instruction is located.
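Since the base address lives in bits 31:11 and bits 10:0 are reserved, the address must be a multiple of 2048 bytes. The sketch below checks that constraint before forming the parameter word. The helper name `pack_base_address` is hypothetical, and it assumes the field holds the address in place (not shifted down), which the excerpt suggests but does not state outright.

```python
ALIGNMENT = 1 << 11          # 2K alignment: bits 10:0 must be zero
BASE_ADDRESS_MASK = 0xFFFFF800


def pack_base_address(address):
    """Return the parameter word for a 2K-aligned 32-bit base address."""
    if not 0 <= address <= 0xFFFFFFFF:
        raise ValueError("address must fit in 32 bits")
    if address % ALIGNMENT != 0:
        raise ValueError("address 0x%08X is not 2K-aligned" % address)
    return address & BASE_ADDRESS_MASK


param = pack_base_address(0x00012800)
```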
  • Page 19 Parameter 0: input Bits Field Name Description input The input to which this command applies. 31:4 Reserved Reserved
  • Page 21 Data format (possible values): • 0 - UINT16_1 • 1 - UINT8_4 • 2 - FLOAT32_1 • 3 - FLOAT32_2 • 4 - FLOAT32_4 • > 5 - Reserved 31:27 Reserved Reserved
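The memory controller derives the bytes per element (bpe) from the data format. The excerpt does not list the bpe values, so the table in this sketch is inferred from the format names (element type width times channel count) and should be read as an assumption. `bytes_per_element` is a hypothetical helper.

```python
# Format codes from the table above; bpe values are INFERRED from the names
# (bits per channel / 8 * number of channels), not quoted from the manual.
DATA_FORMATS = {
    0: ("UINT16_1", 2),
    1: ("UINT8_4", 4),
    2: ("FLOAT32_1", 4),
    3: ("FLOAT32_2", 8),
    4: ("FLOAT32_4", 16),
}


def bytes_per_element(format_code):
    """Return the assumed bytes per element for a data format code."""
    try:
        return DATA_FORMATS[format_code][1]
    except KeyError:
        # Codes outside 0..4 are treated as reserved or unknown here.
        raise ValueError("reserved or unknown format code: %r" % format_code)


widest = bytes_per_element(4)
```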
  • Page 22 Data format (possible values): • 0 - UINT16_1 • 1 - UINT8_4 • 2 - FLOAT32_1 • 3 - FLOAT32_2 • 4 - FLOAT32_4 • > 5 - Reserved 31:27 Reserved Reserved
  • Page 23 The set_constf_fmt, set_consti_fmt, and set_constb_fmt commands all have the same form. They take two parameters. The first parameter is the base address for the corresponding constants in memory. The second contains the corresponding pitch, tiling, and format information.
  • Page 24 Flush Cache Commands The flush_out_cache and flush_cond_out_cache commands each take one reserved parameter. Upon receiving one of these commands, the MC flushes the corresponding cache as described in Section 2.1.4.
  • Page 25: Conditional Output Unit Commands

    Set Conditional Test Command The set_cond_test command takes a single parameter, which specifies the test condition that the CO will perform. This test is active until the next set_cond_test command is received.
  • Page 26 Parameter 0: loc Bits Field Name Description Sets the location of the conditional test: • 0 - DPP • 1 - PE 31:1 Reserved Reserved
  • Page 27: DPP Array Instruction Set Architecture

    Instruction Words INST_TYPE_ALU / INST_TYPE_OUTPUT (6 words): • CMN_INST_* • ALU_RGB_ADDR_* • ALU_ALPHA_ADDR_* • ALU_RGB_INST_* • ALU_ALPHA_INST_* • ALU_RGBA_INST_*
  • Page 28: Synchronization Of Instruction Streams

    • TEX instruction dependent on ALU for source register or predicate. Synchronized with the ALU_WAIT bit. • FC instruction dependent on ALU for predicate or ALU result. Synchronized with the ALU_WAIT bit.
  • Page 29: ALU Instructions

    Also, the bit pattern 0x0 represents 2^-10, rather than zero. Example values are shown below: EXPONENT MANTISSA 2^-10 2^-9 2^-8 2^-7 2^-6 You can obtain negative inline constants and the value zero using the input modifiers and swizzles, described below.
  • Page 30: Presubtract

    There are fields to configure 6 inputs per instruction: 3 for RGB and 3 for Alpha. An instruction can read in at most 12 independent colour components (9 RGB components and 3 alpha components).
  • Page 31: The Operation

    • IMOD_NAB - Take negative of absolute value 3.3.4 The Operation Following are the possible math operations the ALU can perform. The three inputs are denoted by A, B, and C.
  • Page 32 • Use OP_ALPHA_DP to get result into Alpha. OP_ALPHA_EX2 • 2 ^ A • Use OP_RGB_SOP to get result into RGB. OP_ALPHA_LN2 • log2(A) • Use OP_RGB_SOP to get result into RGB.
  • Page 33: Instruction Modifiers

    This allows a MOV to be implemented using any of the following instructions, with OMOD_DISABLED set: • MIN(src, src) • MAX(src, src) • CND(src, src, 0) • CMP(src, src, 0) OMOD_DISABLED is not valid with any other ALU operation.
  • Page 34: Writemasks

    (the default mode), the following options are available: • RNDR_TGT_A - Write to render target A register • RNDR_TGT_B - Write to render target B register • RNDR_TGT_C - Write to render target C register
  • Page 35: Setting Predicate Bits

    RGB components to the temporary register, but you will write to all 4 predicate bits. If the instruction result is clamped, the comparison happens on the post-clamped result. If the output modifier is disabled, denormals may be compared; denormals are equivalent to zero.
  • Page 36: ALU Result

    [0.0, 1.0], and set for texture lookups which supply coordinates that are prescaled to the texture dimensions. Uncached reads (used for UMRT) should set the UNSCALED bit: the coordinates supplied to the program are prescaled, and the index coordinate should always be an integer value.
  • Page 37: Operations

    TEX_SEM_WAIT on the first instruction that uses any of the results. This example illustrates the usage: INSTRUCTION TEX_SEM_WAIT TEX_SEM_ACQUIRE r4 = TEXLD(s0, r1) r5 = TEXLD(s1, r2) r6 = TEXLD(s2, r3) r1 = r1 + 1
  • Page 38: Flow Control

    This orthogonality allows for more creative control of the program behavior, and provides opportunity for optimizations in programs that use a lot of flow control.
  • Page 39: Stacks And Branch Counters

    The branch counter is ignored in hardware while the active bit is set. Processors disabled by looping statements (BREAKLOOP, BREAKREP, and CONTINUE) are also tracked with "loop inactive" counters; however, unlike the branch counter, the loop counters cannot be manipulated directly.
  • Page 40: Fields

    Fields Controlling Optional Stack Operation OP - Loop Stack Operations. • FC_OP_JUMP = None • FC_OP_LOOP = Initialize counter and aL, and push loop stack if stay
  • Page 41 Processors deactivated by other flow control are indifferent to the decision to jump by a BREAK or CONTINUE statement. Address Fields BOOL_ADDR - Which of 32 constant booleans to use for jump condition.
  • Page 42: Common Flow Control Statements

    JUMP_ADDR - Which instruction to jump to if conditions pass. JUMP_GLOBAL - Whether JUMP_ADDR is global, or if OFFSET_ADDR should be added to JUMP_ADDR. 3.5.4 Common Flow Control Statements
  • Page 43 • n indicates how many branch stack frames the BREAK is inside within the current loop. • Lines with no fields filled out indicate no FC instruction is necessary in that spot.
  • Page 44: Optimizations

    The most pervasive caveat is that denormals are flushed to an appropriately signed zero throughout X1K FP. There is no gradual underflow, and identities are not preserved for denormal values. This will be apparent in comparison operations where a denormal is treated as equivalent to zero.
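To make the flush-to-zero behavior concrete, the sketch below emulates it on the host for single-precision values: any denormal (zero exponent, nonzero mantissa) is replaced by a zero with the same sign. This is a host-side model for reasoning about results, not a description of how the hardware implements it. `flush_denormal_f32` is a hypothetical helper.

```python
import math
import struct


def flush_denormal_f32(value):
    """Return value rounded to float32, with denormals flushed to signed zero."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    if exponent == 0 and mantissa != 0:
        bits &= 0x80000000  # keep only the sign bit
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# 1e-40 is below the smallest normal float32 (about 1.18e-38), so it is a
# denormal in single precision and compares equal to zero after flushing.
flushed_pos = flush_denormal_f32(1e-40)
flushed_neg = flush_denormal_f32(-1e-40)
```

Under this model a comparison such as `flush_denormal_f32(1e-40) == 0.0` holds, which mirrors the caveat that a denormal is treated as equivalent to zero.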
  • Page 45: ALU Non-Transcendental Floating Point

    OPERATION RESULT NOTES x * NaN x is any value. 0.0 * Inf Inf * Inf Inf * -Inf -Inf 0.0 * -0.0 -0.0
  • Page 46: ALU Transcendental Floating Point

    +Inf +Inf +0.0 +0.0 -Inf +0.0 -0.0 * For RSQ, recall that the square root occurs first. Note: IEEE specifies sqrt(-0.0) -> -0.0; the X1K FP deviates from this.
  • Page 47: Texture Floating Point

    (unless you are in full-flow control mode) or you risk waiting on a younger thread, which shouldn't cause a deadlock but could be bad for performance. In full-flow control mode, you can leave the tex sem wait bit set for those other instructions.
  • Page 49: DPP Application Binary Interface

    EM_ATI_R5XX Specifies the ATI X1k series encoding (current value is 122). e_entry No defined entry point virtual address. e_flags EF_ATI_DPP Required processor specific flag (current value is 1).
  • Page 50: Program Code Sections

    A 1 in bit 0 specifies that output writes are uncached. US_PIXSIZE Elf32_Word per-program Largest register used in the program US_FC_CTRL Elf32_Word per-program A 1 in bit 31 specifies full flow control functionality
  • Page 51 4 * ninputs type ELF_NOTE_ATI_INPUTS (current value is 2) name ELF_NOTE_ATI (current value is "ATI DPP") desc 4 * ninputs List of the input indices used in the program.
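The inputs note follows the usual ELF note shape: three 32-bit header words (namesz, descsz, type), then the name, then the descriptor. The sketch below builds such a note from the values quoted above. It assumes little-endian encoding, the standard ELF conventions that namesz counts the terminating NUL and that name and desc are padded to 4-byte boundaries, and one 32-bit word per input index (consistent with descsz being 4 * ninputs). The namesz value itself is cut off in this excerpt, so treat the 8 computed here as an assumption. `build_inputs_note` is a hypothetical helper.

```python
import struct

ELF_NOTE_ATI = b"ATI DPP"      # note name quoted in the manual
ELF_NOTE_ATI_INPUTS = 2        # note type quoted in the manual


def pad4(data):
    """Pad a byte string with NULs to a multiple of 4 bytes."""
    return data + b"\x00" * (-len(data) % 4)


def build_inputs_note(input_indices):
    """Build an inputs note: header, NUL-terminated name, index list."""
    name = ELF_NOTE_ATI + b"\x00"   # ASSUMPTION: namesz includes the NUL
    desc = struct.pack("<%dI" % len(input_indices), *input_indices)
    header = struct.pack("<III", len(name), len(desc), ELF_NOTE_ATI_INPUTS)
    return header + pad4(name) + pad4(desc)


# Hypothetical program that reads inputs 0, 1 and 5.
note = build_inputs_note([0, 1, 5])
namesz, descsz, note_type = struct.unpack("<III", note[:12])
```

Reading the header back gives descsz equal to 4 * ninputs, which is the relationship the table states.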
  • Page 52 The Int32 Constants Note is summarized in the following table. If an object file does not contain an Int32 constants note, then the program loader acts as if the program has no int32 constant references: Field Size in Bytes Value namesz descsz 4 * nint32consts
  • Page 53 Size in Bytes Value namesz descsz type ELF_NOTE_ATI_EARLYEXIT (current value is 7) name ELF_NOTE_ATI (current value is "ATI DPP") desc 1 if the program includes an early exit; otherwise 0.
  • Page 55: Chapter 5: Device Interface

    Chapter 5 Device Interface This chapter describes the interfaces exported by the ATI CTM Library, as detailed in the amDeviceManaged.h file. For more information, refer to the CTM Device Interface HTML files. Functions 5.1.1 amCloseManagedConnection amCloseManagedConnection ( AMmanagedDevice dev ) Close a managed device.
  • Page 56: amSubmitCommandBuffer

    (in) Size of the command buffer in units of 32-bit unsigned integers. Returns: a unique 32-bit unsigned integer identifying this command buffer submission request, or zero if the submission has failed.
