Sun Microsystems UltraSPARC-I User Manual

Summary of Contents for Sun Microsystems UltraSPARC-I

  • Page 2 UltraSPARC™ User’s Manual: UltraSPARC-I, UltraSPARC-II. July 1997. Sun Microelectronics, 901 San Antonio Road, Palo Alto, CA 94303. Part No: 802-7220-02. This July 1997 -02 Revision is only available online. The only changes made were to support hypertext links in the pdf file.
  • Page 3 Sun Microsystems, Inc. Sun, Sun Microsystems, and the Sun logo are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc.
  • Page 4: Table Of Contents

    Contents. Preface: Overview; A Brief History of SPARC; How to Use This Book. Section I — Introducing UltraSPARC. 1. UltraSPARC Basics: Overview; Design Philosophy; Component Overview; UltraSPARC Subsystem. 2. Processor Pipeline: Introductions; Pipeline Stages...
  • Page 5 Cache Flushing; Memory Accesses and Cacheability; Load Buffer; Store Buffer. MMU Internal Architecture: Introduction; Translation Table Entry (TTE); Translation Storage Buffer (TSB); MMU-Related Faults and Traps; MMU Operation Summary; ASI Value, Context, and Endianness Selection for Translation...
  • Page 6 Ancillary State Registers; Other UltraSPARC Registers; Supported Traps. 9. Interrupt Handling: Interrupt Vectors; Interrupt Global Registers; Interrupt ASI Registers; Software Interrupt (SOFTINT) Register. 10. Reset and RED_state: 10.1 Overview; 10.2 RED_state Trap Vector; 10.3 Machine State after Reset and in RED_state...
  • Page 7 15.2 Supported Memory Models. Section IV — Producing Optimized Code. Code Generation Guidelines: 16.1 Hardware / Software Synergy; 16.2 Instruction Stream Issues; 16.3 Data Stream Issues. Grouping Rules and Stalls: 17.1 Introduction; 17.2 General Grouping Rules...
  • Page 8 Power-Up. D. IEEE 1149.1 Scan Interface: Introduction; Interface; Test Access Port (TAP) Controller; Instruction Register; Instructions; Public Test Data Registers. E. Pin and Signal Descriptions: Introduction; Pin Descriptions; Signal Descriptions. ASI Names: Introduction. G....
  • Page 9 UltraSPARC User’s Manual Sun Microelectronics viii
  • Page 10 Preface Overview Welcome to the UltraSPARC User’s Manual. This book contains information about the architecture and programming of UltraSPARC™, Sun Microsystems’ family of SPARC-V9-compliant processors. It describes the UltraSPARC-I and UltraSPARC-II processor implementations. This book contains information on: • The UltraSPARC system architecture •...
  • Page 11 Architecture Manual, Version 9; they are numbered throughout the body of the text, and are cross referenced in Appendix C of that book. This book, the UltraSPARC User’s Manual, describes the UltraSPARC-I and UltraSPARC-II implementations of the SPARC-V9 architecture. It provides specific information about UltraSPARC processors, including how each SPARC-V9 implementation dependency was resolved.
  • Page 12 Preface Textual Conventions This book uses the same textual conventions as The SPARC Architecture Manual, Version 9. They are summarized here for convenience. Fonts are used as follows: • Italic font is used for register names, instruction fields, and read-only register fields.
  • Page 13 • Chapter 4, “Overview of the MMU,” describes the UltraSPARC MMU, its architecture, how it performs virtual address translation, and how it is programmed. Section II, “Going Deeper,” presents detailed information about UltraSPARC architecture and programming. Section II contains the following chapters: •...
  • Page 14 Preface • Chapter 15, “SPARC-V9 Memory Models,” describes the supported memory models (which are documented fully in The SPARC Architecture Manual, Version 9). Low-level programmers and operating system implementors should study this chapter to understand how their code will interact with the UltraSPARC cache and memory systems.
  • Page 16: Section I - Introducing Ultrasparc

    Section I — Introducing UltraSPARC: UltraSPARC Basics; Processor Pipeline; Cache Organization; Overview of the MMU.
  • Page 18: Ultrasparc Basics

    UltraSPARC Basics 1.1 Overview UltraSPARC is a high-performance, highly integrated superscalar processor implementing the 64-bit SPARC-V9 RISC architecture. UltraSPARC is capable of sustaining the execution of up to four instructions per cycle, even in the presence of conditional branches and cache misses. This is due mainly to the asynchronous aspect of the units feeding instructions and data to the rest of the pipeline.
  • Page 19 (four), short latencies, and multiple bypasses do not affect the cycle time significantly. Table 1-1 Implementation Technologies and Cycle Times: UltraSPARC-I, 0.5 µ CMOS; UltraSPARC-II, 0.35 µ CMOS...
  • Page 20: Component Overview

    1. UltraSPARC Basics 1.3 Component Overview Figure 1-1 shows a block diagram of the UltraSPARC processor. (Figure 1-1 block labels: Memory Management Unit (MMU), iTLB, dTLB; Prefetch and Dispatch Unit (PDU), Instruction Cache and Buffer, Grouping Logic; Integer Reg and Annex; Load / Store Unit (LSU); Integer Execution Unit (IEU); Data Load...)
  • Page 21 UltraSPARC User’s Manual • Integer Execution Unit (IEU) with two Arithmetic and Logic Units (ALUs) • Load/Store Unit (LSU) with a separate address generation adder • Load buffer and store buffer, decoupling data accesses from the pipeline • A 16Kb Data Cache (D-Cache) •...
  • Page 22 Four sets of global registers (normal, alternate, MMU, and interrupt globals) • The trap registers (See Table 1-2 for supported trap levels) Table 1-2 Supported Trap Levels UltraSPARC-I UltraSPARC-II MAXTL Trap Levels 1.3.4 Floating-Point Unit (FPU) The FPU is partitioned into separate execution units, which allows the UltraSPARC processor to issue and execute two floating-point instructions per...
  • Page 23 1.3.6 Memory Management Unit (MMU) The MMU provides mapping between a 44-bit virtual address and a 41-bit physical address. This is accomplished through a 64-entry iTLB for instructions and a 64-entry dTLB for data; both TLBs are fully associative. UltraSPARC provides hardware support for a software-based TLB miss strategy.
  • Page 24: Preface

    1. UltraSPARC Basics Table 1-3 Supported E-Cache Sizes E-Cache Size UltraSPARC-I UltraSPARC-II 512 Kb 1 Mb 2 Mb 4 Mb 8 Mb 16 Mb The ECU provides overlap processing during load and store misses. For instance, stores that hit the E-Cache can proceed while a load miss is being processed. The ECU can process reads and writes indiscriminately, without a costly turn-around penalty (only 2 cycles).
  • Page 25: How To Use This Book

    Table 1-5 shows the possible ratios between the processor and system clock frequencies for each UltraSPARC model. Table 1-5 Model-Dependent Processor : System Clock Frequency Ratios Frequency Ratio UltraSPARC-I UltraSPARC-II 2 : 1 3 : 1 4 : 1...
  • Page 26: Processor Pipeline

    Processor Pipeline 2.1 Introductions UltraSPARC contains a 9-stage pipeline. Most instructions go through the pipeline in exactly 9 stages. The instructions are considered terminated after they go through the last stage (W), after which changes to the processor state are irreversible.
  • Page 27: Pipeline Stages

    2.2 Pipeline Stages This section describes each pipeline stage in detail. Figure 2-2 illustrates the pipeline stages. (Figure 2-2 labels: Results in Annex; IST_data; Tag Check; D-Cache; LDQ/STQ; D-Cache Data; FPST_data; FP add; FP mul; G ALU; G mul; address bus; data bus; instruction bus...)
  • Page 28 2. Processor Pipeline 2.2.1 Stage 1: Fetch (F) Stage Prior to their execution, instructions are fetched from the Instruction Cache (I-Cache) and placed in the Instruction Buffer, where eventually they will be selected to be executed. Accessing the I-Cache is done during the F Stage. Up to four instructions are fetched along with branch prediction information, the predicted target address of a branch, and the predicted set of the target.
  • Page 29 2.2.4 Stage 4: Execution (E) Stage Data from the integer register file is processed by the two integer ALUs during this cycle (if the instruction group includes ALU operations). Results are computed and are available for other instructions (through bypasses) in the very next cycle.
  • Page 30 2. Processor Pipeline The physical address of a store is sent to the Store Buffer during this stage. To avoid pipeline stalls when store data is not immediately available, the store address and data parts are decoupled and sent to the Store Buffer separately. : The X stage of the FGU.
  • Page 32: Cache Organization

    Cache Organization 3.1 Introduction 3.1.1 Level-1 Caches UltraSPARC’s Level-1 D-Cache is virtually indexed, physically tagged (VIPT). Virtual addresses are used to index into the D-Cache tag and data arrays while accessing the D-MMU (that is, the dTLB). The resulting tag is compared against the translated physical address to determine D-Cache hits.
  • Page 33 ficient to use block commit stores in the loop, followed by a single FLUSH instruction to flush the pipeline. Note: The size of each I-Cache set is the same as the page size in UltraSPARC-I and UltraSPARC-II; thus, the virtual index bits equal the physical index bits. 3.1.1.2 Data Cache (D-Cache) The D-Cache is a write-through, nonallocating-on-write-miss 16-Kb direct mapped cache with two 16-byte sub-blocks per line.
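The D-Cache geometry stated above can be turned into index arithmetic. The following is a minimal C sketch, assuming that "16-Kb" means 16 kilobytes and that the base page is 8 kilobytes; all constant and function names are illustrative and do not come from the manual. Two 16-byte sub-blocks give a 32-byte line, so a direct-mapped 16-kilobyte cache holds 512 lines, and one index bit lies above the 8-kilobyte page offset (the "virtual color" of a page).

```c
#include <stdint.h>

/* Geometry derived from the text: 16 KB, direct mapped,
 * two 16-byte sub-blocks per line. Names are hypothetical. */
enum {
    DCACHE_BYTES    = 16 * 1024,
    DCACHE_SUBBLOCK = 16,
    DCACHE_LINE     = 2 * DCACHE_SUBBLOCK,          /* 32 bytes  */
    DCACHE_LINES    = DCACHE_BYTES / DCACHE_LINE,   /* 512 lines */
    PAGE_8K_BYTES   = 8 * 1024,                     /* assumed base page size */
    DCACHE_COLORS   = DCACHE_BYTES / PAGE_8K_BYTES  /* 2 colors  */
};

/* Line selected by a virtual address in a direct-mapped cache. */
static unsigned dcache_line_index(uint64_t va)
{
    return (unsigned)((va / DCACHE_LINE) % DCACHE_LINES);
}

/* Which of the two 16-byte sub-blocks within the line is addressed. */
static unsigned dcache_subblock(uint64_t va)
{
    return (unsigned)((va / DCACHE_SUBBLOCK) % 2);
}

/* Index bits that lie above the page offset: the page's virtual color. */
static unsigned dcache_virtual_color(uint64_t va)
{
    return (unsigned)((va / PAGE_8K_BYTES) % DCACHE_COLORS);
}
```

Two virtual addresses that map to the same physical page but differ in virtual color select different cache lines, which is why the manual discusses virtual color when pages are remapped.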
  • Page 34 3. Cache Organization Instruction fetches bypass the E-Cache when: • The I-MMU is disabled, or • The processor is in RED_state, or • The access is mapped by the I-MMU as physically noncacheable Data accesses bypass the E-Cache when: • The D-MMU enable bit (DM) in the LSU_Control_Register is clear, or •...
  • Page 36: Overview Of The Mmu

    Overview of the MMU 4.1 Introduction This chapter describes the UltraSPARC Memory Management Unit as it is seen by the operating system software. The UltraSPARC MMU conforms to the requirements set forth in The SPARC Architecture Manual, Version 9. Note: The UltraSPARC MMU does not conform to the SPARC-V8 Reference MMU Specification.
  • Page 37 (Figure labels, virtual-to-physical address translation by page size: 8 Kb, 64 Kb, 512 Kb, and 4 Mb pages; each virtual address is split into a Virtual Page Number and a Page Offset, and each physical address into a Physical Page Number and a Page Offset...)
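The four page sizes above imply four page-offset widths (13, 16, 19, and 22 bits). A minimal C sketch of the split follows; the enum and function names are illustrative assumptions, and the code models only the address arithmetic, not the MMU itself.

```c
#include <stdint.h>

/* Page-offset widths implied by the page sizes:
 * 8 KB = 2^13, 64 KB = 2^16, 512 KB = 2^19, 4 MB = 2^22. */
enum page_size {
    PAGE_8K   = 13,
    PAGE_64K  = 16,
    PAGE_512K = 19,
    PAGE_4M   = 22
};

/* Low bits of the address, passed through translation unchanged. */
static uint64_t page_offset(uint64_t va, enum page_size shift)
{
    return va & ((UINT64_C(1) << shift) - 1);
}

/* High bits of the address, the part that is translated. */
static uint64_t virtual_page_number(uint64_t va, enum page_size shift)
{
    return va >> shift;
}

/* Recombine a physical page number with the untranslated offset. */
static uint64_t physical_address(uint64_t ppn, uint64_t offset,
                                 enum page_size shift)
{
    return (ppn << shift) | offset;
}
```

For example, with 8-kilobyte pages the address 0x12345 has virtual page number 9 and offset 0x345.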
  • Page 38 4. Overview of the MMU Figure 4-2 UltraSPARC’s 44-bit Virtual Address Space, with Hole (Same as Figure 14-2): the in-range regions are 0000 0000 0000 0000 through 0000 07FF FFFF FFFF and FFFF F800 0000 0000 through FFFF FFFF FFFF FFFF; the Out of Range VA (VA “Hole”) spans 0000 0800 0000 0000 through FFFF F7FF FFFF FFFF. Note: Throughout this document, when virtual address fields are specified as 64-bit quantities, they are assumed to be sign-extended based on VA<43>.
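The address-space hole and the sign-extension rule can be checked with a few lines of C. This is a sketch only; the function names are hypothetical, and the boundary constants are taken directly from Figure 4-2.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sign-extend a 44-bit virtual address from VA<43> to 64 bits. */
static uint64_t sign_extend_va44(uint64_t va)
{
    const uint64_t low44 = va & UINT64_C(0x00000FFFFFFFFFFF);
    const uint64_t sign  = UINT64_C(1) << 43;
    /* Flipping and subtracting the sign bit propagates it upward. */
    return (low44 ^ sign) - sign;
}

/* True when va lies outside the hole
 * 0000 0800 0000 0000 .. FFFF F7FF FFFF FFFF. */
static bool va_in_range(uint64_t va)
{
    return va < UINT64_C(0x0000080000000000) ||
           va > UINT64_C(0xFFFFF7FFFFFFFFFF);
}
```

Any value produced by `sign_extend_va44` is in range by construction, because bits 63 through 44 then all equal bit 43.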
  • Page 39 Figure 4-3 Software View of the UltraSPARC MMU (labels: Translation Look-aside Buffers; Translation Storage Buffer, in memory; Software Translation Table, an O/S data structure). Aliasing between pages of different size (when multiple VAs map to the same PA) may take place, as with the SPARC-V8 Reference MMU. The reverse case, when multiple mappings from one VA/context to multiple PAs produce a multiple TLB match, is not detected in hardware;...
  • Page 40: Section Ii - Going Deeper

    Section II — Going Deeper: Cache and Memory Interactions; MMU Internal Architecture; UltraSPARC External Interfaces; Address Spaces, ASIs, ASRs, and Traps; Interrupt Handling; 10. Reset and RED_state; 11. Error Handling.
  • Page 42: Cache And Memory Interactions

    Cache and Memory Interactions 5.1 Introduction This chapter describes various interactions between the caches and memory, and the management processes that an operating system must perform to maintain data integrity in these cases. In particular, it discusses: • When and how to invalidate one or more cache entries •...
  • Page 43 Cache flushing is required in the following cases: I-Cache: Flush is needed before executing code that is modified by a local store instruction other than block commit store; see Section 3.1.1.1, “Instruction Cache (I-Cache).” This is done with the FLUSH instruction or using ASI accesses. See Section A.7, “I-Cache Diagnostic Accesses,”...
  • Page 44: Memory Accesses And Cacheability

    5. Cache and Memory Interactions Note: A change in virtual color when allocating a free page does not require a D-Cache flush, because the D-Cache is write-through. 5.2.2 Committing Block Store Flushing In UltraSPARC, stable storage must be implemented by software cache flush. Data that is present and modified in the E-Cache must be written back to the stable storage.
  • Page 45 5.3.1 Coherence Domains Two types of memory operations are supported in UltraSPARC: cacheable and noncacheable accesses, as indicated by the page translation. Cacheable accesses are inside the coherence domain; noncacheable accesses are outside the coherence domain. SPARC-V9 does not specify memory ordering between cacheable and noncacheable accesses.
  • Page 46 5. Cache and Memory Interactions Noncacheable accesses with the E-bit set (that is, those having side-effects) are all strongly ordered with respect to other noncacheable accesses with the E-bit set. In addition, store buffer compression is disabled for these accesses. Speculative loads with the E-bit set cause a data_access_exception trap (with SFSR.FT=2, spec-...
  • Page 47 Note: A MEMBAR #MemIssue or MEMBAR #Sync is needed if ordering of cacheable accesses following noncacheable accesses must be maintained in PSO or RMO. Due to load and store buffers implemented in UltraSPARC, the above example may not work in PSO and RMO modes without the MEMBARs shown in the program segment.
  • Page 48 5. Cache and Memory Interactions 5.3.2.4 MEMBAR #StoreStore and STBAR Forces all stores after the MEMBAR to wait until all stores before the MEMBAR have reached global visibility. Note: STBAR has the same semantics as MEMBAR #StoreStore; it is included for SPARC-V8 compatibility.
  • Page 49 Note: MEMBAR #Sync is a costly instruction; unnecessary usage may result in substantial performance degradation. 5.3.2.8 Self-Modifying Code (FLUSH) The SPARC-V9 instruction set architecture does not guarantee consistency between code and data spaces. A problem arises when code space is dynamically modified by a program writing to memory locations containing instructions.
  • Page 50 5. Cache and Memory Interactions Note: Atomic accesses with non-faulting ASIs are not allowed, because these ASIs have the load-only attribute. 5.3.3.1 SWAP Instruction SWAP atomically exchanges the lower 32 bits in an integer register with a word in memory. This instruction is issued only after store buffers are empty. Subsequent loads interlock on earlier SWAPs.
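The exchange semantics described for SWAP can be illustrated in portable C11 using `atomic_exchange` from `<stdatomic.h>`. This is a sketch of the semantics only: the helper names are hypothetical, and nothing here claims that a compiler emits the SWAP instruction for this code or reproduces the store-buffer behavior described above.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Exchange a 32-bit register value with a word in memory and
 * return the word that was previously in memory. */
static uint32_t swap_word(_Atomic uint32_t *mem, uint32_t reg)
{
    return atomic_exchange(mem, reg);
}

/* A test-and-set style lock built on the exchange:
 * the caller owns the lock if the old value was 0. */
static int try_lock(_Atomic uint32_t *lock)
{
    return swap_word(lock, 1) == 0;
}

static void unlock(_Atomic uint32_t *lock)
{
    atomic_store(lock, 0);
}
```

Because the read and the write happen as one indivisible step, two callers of `try_lock` can never both observe the old value 0.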
  • Page 51 5.3.5 PREFETCH Instructions Table 5-2 shows which UltraSPARC models support the PREFETCH{A} instructions. Table 5-2 PREFETCH{A} Instruction Support UltraSPARC-I UltraSPARC-II PREFETCH{A} UltraSPARC models that do not support PREFETCH treat it as a NOP. 5.3.5.1 PREFETCH Behavior and Limitations UltraSPARC processors that do support PREFETCH behave in the following ways: •...
  • Page 52 5. Cache and Memory Interactions • Some conditions, noted below, cause an otherwise supported PREFETCH to be treated as a NOP and removed from the load buffer when it reaches the front of the queue. • No PREFETCH will cause a trap except: •...
  • Page 53 5.3.6 Block Loads and Stores Block load and store instructions work like normal floating-point load and store instructions, except that the data size (granularity) is 64 bytes per transfer. See Section 13.6.4, “Block Load and Store Instructions,” on page 230 for a full description of the instructions.
  • Page 54: Load Buffer

    5. Cache and Memory Interactions CALL, or JMPL instruction. Instructions should not be placed within 256 bytes of locations with side effects. See Section 16.2.10, “Return Address Stack (RAS),” on page 272 for other information about JMPLs and RETURNs. 5.3.9 Instruction Prefetch When Exiting RED_state Exiting RED_state by writing 0 to PSTATE.RED in the delay slot of a JMPL is not recommended.
  • Page 55: Store Buffer

    long as they do not require the register that is being loaded. An instruction that attempts to use the data that is being loaded by an instruction in the load buffer is called a ‘use’ instruction. The pipelines are not fully decoupled, because UltraSPARC still supports the notion of precise traps, and loads that are younger than a trapping instruction must not execute, except in the case of deferred traps.
  • Page 56: Mmu Internal Architecture

    MMU Internal Architecture 6.1 Introduction This chapter provides detailed information about the UltraSPARC Memory Management Unit. It describes the internal architecture of the MMU and how to program it. 6.2 Translation Table Entry (TTE) The Translation Table Entry, illustrated in Figure 6-1, is the UltraSPARC equivalent of a SPARC-V8 page table entry;...
  • Page 57 VA_tag<63:22>: Virtual Address Tag. The virtual page number. Bits 21 through 13 are not maintained in the tag, since these nine bits are used to index the smallest direct-mapped TSB of 512 entries. Note: Software must sign-extend bits VA_tag<63:44> to form an in-range VA. Valid: If the Valid bit is set, the remaining fields of the TTE are meaningful.
  • Page 58 6. MMU Internal Architecture Soft<5:0>, Soft2<8:0>: Software-defined fields, provided for use by the operating system. The Soft and Soft2 fields may be written with any value; they read as zero. Diag: Used by diagnostics to access the redundant information held in the TLB structure.
  • Page 59: Translation Storage Buffer (Tsb)

    Note: The E-bit does not force an uncacheable access. It is expected, but not required, that the CP and CV bits will be set to zero when the E-bit is set. Privileged. If the P bit is set, only the supervisor can access the page mapped by the TTE.
  • Page 60 6. MMU Internal Architecture No hardware TSB indexing support is provided for the 512 Kb and 4 Mb page TTEs. Since the TSB is entirely software managed, however, the operating system may choose to place these larger page TTEs in the TSB by forming the appropriate pointers.
  • Page 61 A typical TLB miss and refill sequence is as follows: A TLB miss causes either an instruction_access_MMU_miss or a data_access_MMU_miss exception. The appropriate TLB miss handler loads the TSB Pointers and the TTE Tag Target with loads from the MMU alternate space. Using this information, the TLB miss handler checks to see if the desired TTE exists in the TSB.
  • Page 62: Mmu-Related Faults And Traps

    6. MMU Internal Architecture The TSB Tag Target (described in Section 6.9, “MMU Internal Registers and ASI Operations,” on page 55) is formed by aligning the missing access VA (from the Tag Access register) and the current context to positions found in the description of the TTE tag.
  • Page 63 Note: The fast_instruction_access_MMU_miss, fast_data_access_MMU_miss, and fast_data_access_protection traps are generated instead of instruction_access_MMU_miss, data_access_MMU_miss, and data_access_protection traps, respectively. 6.4.1 Instruction_access_MMU_miss Trap This trap occurs when the I-MMU is unable to find a translation for an instruction access;...
  • Page 64 6. MMU Internal Architecture • An invalid LDA/STA ASI value, invalid virtual address, read to write-only register, or write to read-only register, but not for an attempted user access to a restricted ASI (see the privileged_action trap described below). • An access (including FLUSH) with an ASI other than ASI_{PRIMARY,SECONDARY}_NO_FAULT{_LITTLE} to a page marked with the NFO (no-fault-only) bit.
  • Page 65: Mmu Operation Summary

    6.5 MMU Operation Summary Table 6-4 on page 51 summarizes the behavior of the D-MMU; Table 6-5 on page 51 summarizes the behavior of the I-MMU for normal (non-UltraSPARC-internal) ASIs. In each case, for all conditions the behavior of the MMU is given by one of the following abbreviations: Abbrev Meaning...
  • Page 66 6. MMU Internal Architecture • Attempted access using a restricted ASI in non-privileged mode. The MMU signals a privileged_action exception for this case. • An atomic instruction (including 128-bit atomic load) issued to a memory address marked uncacheable in a physical cache (that is, with CP=0), including cases in which the D-MMU is disabled.
  • Page 67: Asi Value, Context, And Endianness Selection For Translation

    See Section 8.3, “Alternate Address Spaces,” on page 146 for a summary of the UltraSPARC ASI map. 6.6 ASI Value, Context, and Endianness Selection for Translation The MMU uses a two-step process to select the context for a translation: The ASI is determined (conceptually by the Integer Unit) from the instruction, trap level, and the processor endian mode. The context register is determined directly from the ASI.
  • Page 68 6. MMU Internal Architecture Table 6-6 ASI Mapping for Instruction Accesses Condition for Instruction Access Resulting Action PSTATE.TL Endianness ASI Value (in SFSR) ASI_PRIMARY > 0 ASI_NUCLEUS Table 6-7 ASI Mapping for Data Accesses Condition for Data Access Access Processed with: PSTATE.
  • Page 69: Mmu Behavior During Reset, Mmu Disable, And Red_State

    6.7 MMU Behavior During Reset, MMU Disable, and RED_state During global reset of the UltraSPARC CPU, the following actions occur: • No change occurs in any block of the D-MMU. • No change occurs in the datapath or TLB blocks of the I-MMU. •...
  • Page 70: Compliance With The Sparc-V9 Annex F

    6. MMU Internal Architecture Note: No reset of the TLB is performed by a chip reset or by entering RED_state. Before the MMUs are enabled, the operating system software must explicitly write each entry with either a valid TLB entry or an entry with the valid bit set to zero.
  • Page 71 Warning – STXA to an MMU register requires either a MEMBAR #Sync, FLUSH, DONE, or RETRY before the point that the effect must be visible to load / store / atomic accesses. Either a FLUSH, DONE, or RETRY is needed before the point that the effect must be visible to instruction accesses: MEMBAR #Sync is not sufficient.
  • Page 72 6. MMU Internal Architecture 6.9.2 I-/D-TSB Tag Target Registers The I- and D-TSB Tag Target registers are simply bit-shifted versions of the data stored in the I- and D-Tag Access registers, respectively. Since the I- or D-Tag Access register is updated on an I- or D-TLB miss, respectively, the I- and D-Tag Target registers appear to software to be updated on an I or D TLB miss.
  • Page 73 Compatibility Note The single context register of the SPARC-V8 Reference MMU has been replaced in UltraSPARC by the three context registers shown in Figures 6-4, 6-5, and 6-6. Note: A STXA to the context registers requires either a MEMBAR #Sync, FLUSH, DONE, or RETRY before the point that the effect must be visible to data accesses.
  • Page 74 6. MMU Internal Architecture Table 6-11 MMU Synchronous Fault Status Register FT (Fault Type) Field FT<6:0> Fault Type Privilege violation Speculative Load or Flush instruction to page marked with E-bit. This bit is zero for internal ASI accesses. Atomic (including 128-bit atomic load) to page marked uncacheable. This bit is zero for internal ASI accesses, except for atomics to DTLB_DATA_ACCESS_REG (5D ), which update according to the TLB entry accessed.
  • Page 75 Fault Valid. Set when the MMU detects a fault; it is cleared only on an explicit ASI write of 0 to the SFSR register. When FV is not set, the values of the remaining fields in the SFSR and SFAR are undefined. The SFSR and the Tag Access registers both maintain state concerning a previous translation causing an exception.
  • Page 76 6. MMU Internal Architecture 6.9.5.2 D-MMU Fault Address The Synchronous Fault Address register contains the virtual memory address of the fault recorded in the D-MMU Synchronous Fault Status register. There is no I-SFAR, since the instruction fault address is found in the trap program counter (TPC).
  • Page 77 Split: When Split=1, the TSB 64 Kb Pointer address is calculated assuming separate (but abutting and equally-sized) TSB regions for the 8 Kb and the 64 Kb TTEs. In this case, TSB_Size refers to the size of each TSB, and therefore the TSB 8Kb Pointer address calculation is not affected by the value of the Split bit.
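The effect of the Split bit on pointer formation can be sketched in C. This is not the manual's Code Example 6-1; it is an independent illustration built on several labeled assumptions: each TTE occupies 16 bytes, the smallest TSB holds 512 entries (nine index bits, VA<21:13>), the entry count doubles with each increment of the size field, the 64 Kb region immediately follows the 8 Kb region when Split=1, and the base address is aligned to the total TSB size. All names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

enum pointer_type { POINTER_8K, POINTER_64K };

enum {
    TTE_BYTES       = 16,   /* assumed: 8-byte tag plus 8-byte data */
    TSB_MIN_ENTRIES = 512   /* assumed: nine index bits, VA<21:13>  */
};

/* Address of the TSB entry that would hold the TTE for va. */
static uint64_t tsb_pointer_sketch(uint64_t va, enum pointer_type type,
                                   uint64_t tsb_base, bool split,
                                   unsigned tsb_size)
{
    const unsigned page_shift = (type == POINTER_8K) ? 13 : 16;
    const uint64_t entries = (uint64_t)TSB_MIN_ENTRIES << tsb_size;
    const uint64_t index = (va >> page_shift) & (entries - 1);
    uint64_t offset = index * TTE_BYTES;

    if (split && type == POINTER_64K) {
        /* Separate, abutting, equally sized region for 64 Kb TTEs. */
        offset += entries * TTE_BYTES;
    }
    return tsb_base + offset;
}
```

Under these assumptions the 8 Kb pointer is the same whether or not Split is set, which matches the statement above that the 8 Kb calculation is unaffected by the Split bit.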
  • Page 78 6. MMU Internal Architecture TLB Data In register for automatic replacement also uses the Tag Access register, but typically the value written into the Tag Access register by the MMU hardware is appropriate. Note: Any update to the Tag Access registers immediately affects the data that is returned from subsequent reads of the Tag Target and TSB Pointer registers.
  • Page 79 The I-/D-TSB 8 Kb/64 Kb Pointer registers are defined as follows: VA<63:0> Figure 6-11 I-/D-MMU TSB 8 Kb/64 Kb Pointer and D-MMU Direct Pointer Register VA<63:0>: The full virtual address of the TTE in the TSB, as determined by the MMU hardware.
  • Page 80 6. MMU Internal Architecture The Data In and Data Access registers are the means of reading and writing the TLB for all operations. The TLB Data In register is used for TLB-miss and TSB-miss handler automatic replacement writes; the TLB Data Access register is used for operating system and diagnostic directed writes (writes to a specific TLB entry).
  • Page 81 An ASI store to the TLB Data In register initiates an automatic atomic replacement of the TLB Entry pointed to by the current contents of the TLB Replacement register “Replace” field. The TLB data and tag are formed as in the case of an ASI store to the TLB Data Access register described above.
  • Page 82 6. MMU Internal Architecture VA<63:13>: The virtual page number of the TTE to be removed from the TLB. This field is not used by the MMU for the Demap Context operation, but must be in-range. The virtual address for demap is checked for out-of-range violations, in the same manner as any normal MMU access.
  • Page 83: Mmu Bypass Mode

    6.9.11 I-/D-Demap Page (Type=0) Demap Page removes the TTE (from the specified TLB) matching the specified virtual page number and context register. The match condition with regard to the global bit is the same as a normal TLB access; that is, if the global bit is set, the contexts need not match.
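The match rule for Demap Page can be modeled in a few lines of C. This is a simplified software model under stated assumptions: the entry layout is hypothetical, a single page size is assumed so that virtual page numbers compare directly, and the hardware's real tag format is not represented.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of one TLB entry. */
struct tlb_entry {
    bool     valid;
    bool     global;   /* global bit: entry matches any context */
    uint16_t context;
    uint64_t vpn;      /* virtual page number */
};

/* Same condition as a normal TLB access: page numbers must match,
 * and contexts must match unless the entry is global. */
static bool tlb_entry_matches(const struct tlb_entry *e,
                              uint64_t vpn, uint16_t context)
{
    return e->valid && e->vpn == vpn &&
           (e->global || e->context == context);
}

/* Demap Page sketch: invalidate every matching entry and
 * return how many entries were removed. */
static unsigned demap_page(struct tlb_entry *tlb, unsigned n,
                           uint64_t vpn, uint16_t context)
{
    unsigned removed = 0;
    for (unsigned i = 0; i < n; i++) {
        if (tlb_entry_matches(&tlb[i], vpn, context)) {
            tlb[i].valid = false;
            removed++;
        }
    }
    return removed;
}
```

In this model a global entry is removed by a demap for any context, while a non-global entry survives a demap issued for a different context.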
  • Page 84: Tlb Hardware

    6. MMU Internal Architecture 6.11 TLB Hardware 6.11.1 TLB Operations The TLB supports exactly one of the following operations per clock cycle: • Normal translation. The TLB receives a virtual address and a context identifier as input and produces a physical address and page attributes as output. •...
  • Page 85 Due to the implementation of the UltraSPARC pipeline, the MMU can and will set a TLB entry’s used bit as if the entry were hit when the load or store is an annulled or mispredicted instruction. This can be considered to cause a very slight performance degradation in the replacement algorithm, although it may also be argued that it is desirable to keep these extra entries in the TLB.
  • Page 86 6. MMU Internal Architecture UltraSPARC Code Example 6-1 Pseudo-code for D-MMU Pointer Logic int64 GenerateTSBPointer( int64 va, // Missing virtual address PointerType type, // 8K_POINTER or 64K_POINTER int64 TSBBase, // TSB Register<63:13> << 13 Boolean split, // TSB Register<12> int TSBSize) // TSB Register<2:0>...
  • Page 88: Ultrasparc External Interfaces

    See Appendix E, “Pin and Signal Descriptions,” for a description of the external interface pins and signals (including buses, control signals, clock inputs, etc.) See the UltraSPARC-I Data Sheet for information about the electrical and mechan- ical characteristics of the processor, including pin and pad assignments. The Bib- liography on page 363 describes how to obtain the data sheet.
  • Page 89 raSPARC User’s Manual The UltraSPARC Data Buffer isolates UltraSPARC and its E-Cache from the main system data bus, so the interface can operate at processor speed (reduced load- ing). The UDB also provides overlapping between system transactions and local E-Cache transactions, even when the latter needs to use part of the data buffer. UltraSPARC includes the logic to control the UDB;...
  • Page 90 7. UltraSPARC External Interfaces • As an interconnect slave, UltraSPARC responds to noncached reads of its interconnect port ID, which are generated by other UltraSPARCs on the interconnect. Slave Writes to UltraSPARC are not supported. UltraSPARC is both an interrupter and an interrupt receiver. It can generate interrupt requests to other interrupt receivers, and it can receive interrupt requests from other interrupters...
  • Page 91: Interaction Between E-Cache And Udb

    raSPARC User’s Manual Figure 7-2 illustrates how data and ECC bytes are arranged and addressed within a quadword (for big-endian accesses). ad Lo Bytes Byte 0 Byte 1 Byte 2 Byte 3 Byte 4 Byte 5 Byte 6 Byte 7 ad Hi Bytes Byte 8 Byte 9...
  • Page 92 16 parity bits for da- ta. Table 7-3 lists the E-Cache sizes that each UltraSPARC model supports. Table 7-3 Supported E-Cache Sizes (Same as Table 1-5) E-Cache Size UltraSPARC-I UltraSPARC-II 512 Kb 1 Mb 2 Mb...
  • Page 93 E-Cache read misses or noncacheable reads. Table 7-4 shows the supported buffer depth for each UltraSPARC model. Table 7-4 Supported Read Buffer Depth UltraSPARC-I UltraSPARC-II # of Entries • A model-dependent number of 64-byte buffers to hold writebacks, block stores, and outgoing interrupt vectors...
  • Page 94 7. UltraSPARC External Interfaces 7.3.2.1 Coherent Read Hit (1–1–1 and 2–2 Modes) Figure 7-3 shows the 1–1–1 Mode timing for coherent reads that hit the E-Cache. UltraSPARC makes no distinction between burst reads (which are supported by some RAMs) and two consecutive reads; the signals used for a single read are duplicated for each subsequent read...
  • Page 95 UltraSPARC User’s Manual Figure 7-4 Timing for Coherent Read Hit (2–2 Mode) (signals shown: CPU CLK, SRAM CLK, SRAM CYCLE, TSYN_WR_L, TOE_L, ECAT, TDATA, DSYN_WR_L, DOE_L, ECAD, EDATA) 7.3.2.2 Coherent Write Hits (1–1–1 and 2–2 Modes) Writes to the E-Cache are processed through independent tag and data transactions.
  • Page 96 7. UltraSPARC External Interfaces data address is presented on the ECAD pins in the cycle after the request (cycle 4 for W0) and the data is sent in the following cycle (cycle 5). Systems running in 2–2 Mode incur no read-to-write bus turnaround penalty. CYCLE TSYN_WR_L TOE_L...
  • Page 97 UltraSPARC User’s Manual Figure 7-7 Timing for Coherent Writes with E-to-M State Transition (1–1–1 Mode) (signals shown: CYCLE, TSYN_WR_L, TOE_L, ECAT, TDATA, DSYN_WR_L, DOE_L, ECAD, EDATA) Otherwise, the tag port is available for a tag check of a younger store during the data write.
  • Page 98 7. UltraSPARC External Interfaces 7.3.2.3 Coherent Write Misses If a coherent write misses in the E-Cache, the corresponding cache line is victimized. When the victimized line is dirty, a writeback transaction is scheduled. In any case, a read-to-own transaction is scheduled for the required write address. When the read completes, the new data overwrites the victimized line in the cache...
  • Page 99: Sysaddr Bus Arbitration Protocol

    UltraSPARC User’s Manual 7.4 SYSADDR Bus Arbitration Protocol This section specifies the distributed arbitration protocol for driving a request packet on the SYSADDR bus. 7.4.1 SYSADDR Bus Interconnection Topology SYSADDR accommodates a maximum of four bus masters (which can be either UltraSPARCs or I/O ports), as well as a System Controller (SC).
  • Page 100 7. UltraSPARC External Interfaces 7.4.2 Distributed Arbitration The SYSADDR bus uses a distributed arbitration protocol to provide the lowest possible latency for bus ownership, at the same time meeting the minimum cycle time requirements of the interconnect. The arbitration protocol has the following features: •...
  • Page 101 UltraSPARC User’s Manual Addr_Valid is driven following the same rules as SYSADDR signals. Addr_Valid must be deasserted in the last cycle it is driven. The SC must contain a holding amplifier to maintain the previously asserted state of each Addr_Valid signal when it is undriven. 7.4.3.1 Arbitration Rules The interface that is currently driving (or allowed to drive) SYSADDR and Addr_Valid is called the C...
  • Page 102 7. UltraSPARC External Interfaces The CURRENT DRIVER may drive SYSADDR at any time up to and including the cycle in which it deasserts its request. If the CURRENT DRIVER’s request was deasserted during the last cycle and one or more other requests were asserted, arbitration occurs during this cycle to decide who can drive during the next cycle...
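The rule above states when arbitration takes place, but this excerpt does not state how the winner is chosen. The C sketch below illustrates one arbitration step under explicit assumptions: at most four masters (as the topology section says), requests sampled as a 4-bit mask, and a round-robin pick that starts with the port after the current driver. The round-robin policy, the name `next_driver`, and the omission of the SC's own request are all illustrative assumptions, not the manual's specification.

```c
#include <stdint.h>

#define NUM_MASTERS 4  /* SYSADDR accommodates at most four bus masters */

/* Hypothetical helper: returns the port allowed to drive SYSADDR next cycle.
 *   current  - port number of the current driver (0..3)
 *   req_prev - request lines sampled in the previous cycle (bit i = port i)
 */
static int next_driver(int current, uint8_t req_prev)
{
    /* The current driver keeps the bus while its request stays asserted. */
    if (req_prev & (1u << current))
        return current;

    /* Its request was deasserted last cycle: arbitrate among the others.
     * ASSUMPTION: round-robin order, starting with the port after current. */
    for (int i = 1; i < NUM_MASTERS; i++) {
        int candidate = (current + i) % NUM_MASTERS;
        if (req_prev & (1u << candidate))
            return candidate;
    }

    /* No other request was asserted: ownership does not change. */
    return current;
}
```

For example, with port 0 driving and only ports 1 and 2 requesting (`req_prev = 0x6`), this sketch hands the bus to port 1.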
  • Page 103 UltraSPARC User’s Manual UltraSPARC has a mode that keeps its request asserted on the bus until it sees another request on the bus, even if it has no more pending requests. This eliminates one cycle of arbitration latency. This mode is enabled by hard-wiring any of the unused Node_RQ<N>...
  • Page 104 7. UltraSPARC External Interfaces 7.4.3.4 Arbitration Timing Figures 7-12 through 7-18 illustrate the arbitration protocol timing. They also show how SYSADDR ownership changes from requestor to requestor. The figures show the minimum arbitration latencies, which are as follows: • 0 cycles if UltraSPARC or SC is CURRENT DRIVER (Figure 7-11)...
  • Page 105 raSPARC User’s Manual Figure 7-13 shows the timing when the ownership changes between two UltraSPARCs. In this case, Port does not assert a request after its current one. RIVER Req<0> Req<1> SYSADDR Cycle 0 Cycle 1 Cycle 0 Cycle 1 Addr_Valid<0>...
  • Page 106 7. UltraSPARC External Interfaces Figure 7-15 Arbitration: SC Arbitrates and Sends a Packet to Port (signals shown: Req<0>, SC Request, SYSADDR, Addr_Valid<0>) Figure 7-16 shows the timing when the SC relinquishes ownership after it has driven a request packet...
  • Page 107: Ultrasparc Interconnect Transaction Overview

    UltraSPARC User’s Manual In Figure 7-18, the SC becomes CURRENT DRIVER. Figure 7-18 Arbitration: SC Becomes CURRENT DRIVER (signals shown: Req<0>, SC Request, SYSADDR; Cycle 1: request asserted; Cycle 2: arbitration occurs; then first cycle of packet) 7.5 UltraSPARC Interconnect Transaction Overview There are four interconnect transaction categories: P_REQ transaction request from UltraSPARC to the system on the SYSADDR bus.
  • Page 108 7. UltraSPARC External Interfaces S_REPLY acknowledgment is generated by the system to the processor on point-to-point unidirectional wires, which initiates transfer of data. It is generated in response to a P_REQ or P_REPLY from that processor. Any UltraSPARC event (such as a load or store miss) that causes an interconnect transaction completes before any snoop activity can result in the invalidation or copyback of that line.
  • Page 109: Cache Coherence Protocol

    UltraSPARC User’s Manual If UltraSPARC receives the S_REQ for the dirty cache block in the Writeback Buffer after the S_WAB/S_WBCAN reply for the Writeback transaction and before the S_RBU/S_RBS reply for the read transaction, the S_REQ completes atomically and can either result in P_SACK or P_SNACK.
  • Page 110 7. UltraSPARC External Interfaces 7.6.1 State Transitions Figure 7-20 on page 95 shows the cache coherency state diagram. Table 7-9 on page 97 describes these transitions. It also shows the transactions that are initiated by either UltraSPARC or the SC, along with the expected acknowledgment following each transaction...
  • Page 111 PREFETCH{A} instructions, which are not supported by all UltraSPARC models. Table 7-8 shows which UltraSPARC models support the PREFETCH{A} instructions. Table 7-8 PREFETCH{A} Instruction Support UltraSPARC-I UltraSPARC-II PREFETCH{A}...
  • Page 112 7. UltraSPARC External Interfaces Table 7-9 Transitions Allowed for Cache Coherence Protocol Transaction Req Transition Description Acknowledgment to/from Port Load miss; data coming from memory to an invalid P_RDS_REQ S_RBU line (no other cache has the data). Load miss; data provided by another cache or memory P_RDS_REQ S_RBS to an invalid line (another cache has the data)
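The two load-miss rows quoted above can be read as a small function of the S_REPLY type. The C sketch below assumes the usual MOESI reading of those replies: S_RBU (no other cache has the data) installs the line Exclusive, and S_RBS (another cache has the data) installs it Shared. The enum and function names are hypothetical, and the remaining rows of Table 7-9 are not modeled.

```c
/* Etag coherence states (MOESI). Names are illustrative only. */
typedef enum { ETAG_M, ETAG_O, ETAG_E, ETAG_S, ETAG_I } EtagState;

/* The two S_REPLY types that can answer a P_RDS_REQ with data. */
typedef enum { S_RBU, S_RBS } ReadReply;

/* State installed in a previously invalid line after a load miss.
 * ASSUMPTION: S_RBU -> Exclusive, S_RBS -> Shared. */
static EtagState etag_after_load_miss(ReadReply reply)
{
    return (reply == S_RBU) ? ETAG_E : ETAG_S;
}
```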
  • Page 113 UltraSPARC User’s Manual 7.6.2 Cache Coherence Model UltraSPARC supports a variety of cache coherent system implementations. UltraSPARC can be used in a system that keeps a non-uniform copy of the E-Cache tags. Non-uniform means that it does not maintain all five of the MOESI states.
  • Page 114 7. UltraSPARC External Interfaces Figure 7-21 Cache Coherence Model Using Centralized Duplicate Tags (Dtags) (diagram elements: UltraSPARCs with Etag 1 through Etag k and WB Buffers, Main Memory, and a System Controller with Dtag 1 through Dtag k and DtagTB 1 through DtagTB k) In the example shown in Figure 7-21, two UltraSPARCs cache the same data...
  • Page 115 UltraSPARC User’s Manual SC decodes the request packet and determines the transaction type and physical address. If it is a coherent read or write transaction, the SC takes the full address and interrogates the Dtags and any valid DtagTBs. If Dtag reads can occur every cycle, there may need to be some bypassing of Dtag updates;...
  • Page 116 7. UltraSPARC External Interfaces 7.6.4 Cache Coherence Sequence in Systems without Dtags The following is an example sequence of events for the coherence model shown in Figure 7-21 on page 99, except that there are no duplicate tags. Typically, this is a system with a single UltraSPARC and a cache-coherent I/O interface.
  • Page 117: Cache Coherent Transactions

    Table 7-10 shows the number of outstanding ReadToShare transactions that each UltraSPARC model supports. Table 7-10 Supported Number of Outstanding ReadToShare Transactions UltraSPARC-I UltraSPARC-II Number 7.7.1.1 Error Handling The system can reply with S_RTO (time-out, typically if the address is for unimplemented memory), or S_ERR (bus error, typically if the access is illegal)...
  • Page 118 O), UltraSPARC sets the Dirty Victim Pending (DVP) bit in the request packet. Table 7-11 shows the number of outstanding ReadToOwn transactions that each UltraSPARC model supports. Table 7-11 Supported Number of Outstanding ReadToOwn Transactions UltraSPARC-I UltraSPARC-II...
  • Page 119 Table 7-12 shows the number of outstanding ReadToDiscard transactions that each UltraSPARC model supports. Table 7-12 Supported Number of Outstanding ReadToDiscard Transactions UltraSPARC-I UltraSPARC-II Number 7.7.4.1 Error Handling The system can reply with S_RTO (time-out, typically if the address is for unimplemented memory), or S_ERR (bus error, typically if the access is illegal)...
  • Page 120 7. UltraSPARC External Interfaces If the Writeback is to be cancelled because of an intervening invalidation (S_CPI_REQ or S_INV_REQ) for the victimized datum (due to a P_RDO_REQ or P_WRI_REQ from another UltraSPARC), SC cancels the Writeback with S_WBCAN and no data is written. If the Writeback is not cancelled, SC issues S_WAB and UltraSPARC drives the 64-byte block of data aligned on a 64-byte boundary (A<5:4>=0) onto SYSDATA.
  • Page 121 UltraSPARC User’s Manual 7.7.7 Invalidate (S_INV_REQ) Invalidate request from SC to UltraSPARC. SC generates S_INV_REQs to service a ReadToOwn (P_RDO_REQ) or WriteInvalidate (P_WRI_REQ) request from another processor. Etag transitions to I. UltraSPARC issues its P_REPLY depending on the state of the E-Cache line and the setting of the No Dual tag Present (NDP) bit in the S_INV_REQ...
  • Page 122 7. UltraSPARC External Interfaces If NDP=0, UltraSPARC replies with: • P_SACK or P_SACKD if the block is in the E-Cache or has been victimized from the E-Cache but not yet written back Note that UltraSPARC can reply with P_SACK even if the block has been victimized from the E-Cache. UltraSPARC also asserts P_SACK if the block is not in the cache, but this is an error condition in systems that support Dtags (NDP=0).
  • Page 123 Dtags. See Section 7.10, “S_REQ,” on page 111 for more timing information. SC can buffer the P_SACKD reply and cancel the P_WRB_REQ when it appears. UltraSPARC-I supports one outstanding coherent system request. SC can send its next coherent request on the cycle after the S_CRAB reply. 7.7.10 CopybackToDiscard (S_CPD_REQ) Non-destructive copyback request from SC to UltraSPARC...
  • Page 124: Non-Cached Data Transactions

    7. UltraSPARC External Interfaces • P_SNACK if the block is not present in the E-Cache or the writeback buffer. The P_SACK or P_SACKD reply indicates that UltraSPARC is ready to transfer the requested data. SC initiates the data transfer by sending S_CRAB. If NDP=0 and the block was not present in the cache, UltraSPARC drives undefined data in response to the S_CRAB.
  • Page 125 SYSDATA Table 7-13 shows the number of outstanding NonCachedRead transactions that each UltraSPARC model supports. Table 7-13 Supported Number of Outstanding NonCachedRead Transactions UltraSPARC-I UltraSPARC-II Number 7.8.2 NonCachedBlockRead (P_NCBRD_REQ) Noncached Block Read Request. UltraSPARC reads 64 bytes of noncached data with this transaction...
  • Page 126: S_Rto/S_Err

    S_RTO or S_ERR, the state of the line is not changed (tag or data) and the store is not completed. 7.10 S_REQ UltraSPARC-I can support at most one outstanding S_REQ transaction for copyback/invalidate from SC. SC must block subsequent S_REQs to the same UltraSPARC-I, even when the requests are from different UltraSPARCs and for data at different addresses...
  • Page 127: Writeback Issues

    Table 7-15 Worst-Case Delay Between S_REQ and P_REPLY when NDP=1 UltraSPARC Model Cycles UltraSPARC-I UltraSPARC-II ~50–60 An S_REQ operates on the E-Cache atomically with respect to other cache events. Invalidates do not necessarily propagate to the D-Cache until software completes a store and a MEMBAR #StoreLoad.
  • Page 128 UltraSPARC-I UltraSPARC-II Number UltraSPARC-I issues only one Writeback transaction at a time. The Writeback and its associated read transaction (with DVP=1) both must complete (receive their respective S_REPLYs) before UltraSPARC-I issues a second read with DVP=1. UltraSPARC-I can issue a subsequent read transaction with DVP=0 while there is a previous Writeback pending.
  • Page 129 UltraSPARC User’s Manual 7.11.1 Clean Victim Handling When the victimized line is clean (E, S, or I state), the read request for the new line is issued with DVP=0, and the following rules apply: UltraSPARC inhibits reading and writing the victimized line by blocking any activity to the same E-Cache index, except for loads and stores of the first level caches...
  • Page 130 S_CPI_REQ or S_INV_REQ. SC must remember that there is a pending Writeback Cancellation and treat all subsequent P_SACKDs like P_SNACKs. UltraSPARC-I supports only one outstanding Writeback, so it is clear which Writeback the P_SACKD causes to be cancelled. For UltraSPARC-II, SC must buffer the address from the S_REQ to determine which Writeback to cancel...
  • Page 131: Interrupts (P_Int_Req)

    UltraSPARC User’s Manual tions proceed asynchronously and may complete in any order. As long as either the read or the Writeback is outstanding, UltraSPARC maintains the victimized block in the coherence domain. While the victimized block is in the coherence domain, UltraSPARC must honor Copyback requests for the block from SC...
  • Page 132: P_Reply And S_Reply

    7. UltraSPARC External Interfaces After software clears BUSY in the Interrupt Vector Receive register, UltraSPARC sends a P_IAK reply. UltraSPARC supports only one outstanding P_INT_REQ transaction; SC can send the next P_INT_REQ request on the cycle after the P_IAK reply. When UltraSPARC sends an interrupt: If SC can deliver the interrupt transaction to the target (that is, if the target UltraSPARC does not have another outstanding interrupt), SC issues an...
  • Page 133 UltraSPARC User’s Manual Figure 7-22 P_REPLY Packet Format (Cycle 2 not present in all P_REPLYs) (fields shown: Type and Class in Cycle 1, Master ID (MID) in Cycle 2) P_REPLYs take either one or two interconnect clock cycles. The first cycle contains the P_REPLY type and the Class bit. The second cycle, if present, contains the Master ID (MID) of the UltraSPARC that generated the original request...
  • Page 134 7. UltraSPARC External Interfaces Table 7-18 specifies the P_REPLY types. Table 7-18 P_REPLY Type Definitions Type Definition P_IDLE Idle. The default state when no reply is asserted. UltraSPARC drives P_IDLE after Power-On Reset. P_RERR Read Error. Returned by UltraSPARC in response to a noncached block read request from SC. No data is transferred.
  • Page 135 raSPARC User’s Manual S_REPLY takes a single interconnect clock cycle. SC asserts S_REPLY to initiate data transfer to/from UltraSPARC and to acknowledge P_REQs from UltraSPARC. Table 7-19 specifies the S_REPLY encodings. Table 7-19 S_REPLY Encoding REPLY Name Reply to Transaction Type Idle Default State...
  • Page 136 7. UltraSPARC External Interfaces SC can pipeline some S_REPLYs that do not have an accompanying data transfer (S_OAK, S_RTO, S_ERR), even while data is being transferred on SYSDATA due to a previous S_REPLY. See Figure 7-28 on page 124. Even though S_WBCAN or S_INAK do not have an accompanying data transfer, SC cannot pipeline these S_REPLYs;...
  • Page 137 UltraSPARC User’s Manual Table 7-20 S_REPLY Type Definitions Type Definition S_IDLE Idle. Default state; no reply is asserted. SC should drive S_IDLE after Power-On Reset. S_RTO Read Time-out. No data is transferred. SC uses S_RTO to indicate time-outs on read transactions. UltraSPARC generates an exception and logs time out status instruction_access_error...
  • Page 138 7. UltraSPARC External Interfaces 7.13.3 P_REPLY and S_REPLY Timing The following figures show the data flow on SYSDATA due to S_REPLY and P_REPLY with no data stalls. Figure 7-25 also shows the timing of the interconnect_ECC_Valid signal with respect to the S_REPLY. Section 7.13.4 dis- cusses data flow timing with data stalls.
  • Page 139 UltraSPARC User’s Manual S_REQ S_REQ S_REQ2 P_REPLY P_SACK S_REPLY to Get Data S_CRAB Earliest S_REQ2 Figure 7-27 Back-to-Back Coherent S_REQs to UltraSPARC S_REPLY to UltraSPARC S_WAS S_WAS2 S_RBU3 Data on Bus D[1] D[2] D[3] P_REQ from UltraSPARC NCWR1 NCWR1 NCWR2 NCWR2 RDS3 RDS3...
  • Page 140 7. UltraSPARC External Interfaces Thus, the sourcing of the first quadword is always with respect to the S_REPLY. Data_Stall determines the number of clock cycles that the quadword stays on SYSDATA (that is, the number of stalls). Figure 7-29 shows the data stall timing to UltraSPARC sourcing data. When UltraSPARC is sinking data, SC can assert Data_Stall in the same system clock cycle that the S_REPLY is asserted.
  • Page 141: Multiple Outstanding Transactions

    UltraSPARC-I supports only one outstanding 64-byte read (P_RD*_REQ or P_NCBRD_REQ in Class 0). In addition, since a single read buffer is used for all reads, UltraSPARC-I supports only one outstanding read of any type. Thus, P_RD*_REQ or P_NCBRD_REQ in Class 0 and P_NCRD_REQ in Class 1 cannot be outstanding simultaneously.
  • Page 142 7. UltraSPARC External Interfaces 7.14.2 Minimal Ordering Requirements An SC can be less strict about the ordering requirements for asserting S_REPLYs in Class 0 and 1, with respect to the original address packet. This may allow simpler SCs to be built. The details also may be useful for understanding how to generate useful test cases and which test cases are not possible...
  • Page 143 (DVP=0). 7.14.5 Limiting the Number of Transactions in a Class UltraSPARC-I limits the number of transactions in Class 1 and also limits the number of outstanding 16-byte noncacheable stores and block stores. UltraSPARC-II also has the ability to limit the number of outstanding Class 0 64-byte reads, and the number of Writebacks in Class 1...
  • Page 144: Transaction Set Summary

    7. UltraSPARC External Interfaces Even though S_WBCAN and S_INAK have no data transfer, they must be scheduled as if they used SYSDATA; that is, they can be issued only when an S_WAB or S_WAS would have been allowed. They do not add any SYSDATA use cycles, however, for deciding when and which S_REPLYs can be issued after them...
  • Page 145 P_INT_REQ S_WAB or S_INAK UltraSPARC-I supports only one outstanding writeback transaction. The writeback and its concomitant dirty victim read transaction must both complete before a second writeback or a second dirty victim read is issued. UltraSPARC-II supports two outstanding writeback transactions.
  • Page 146: Transaction Sequences

    7. UltraSPARC External Interfaces 7.16 Transaction Sequences This section describes the basic coherent transaction sequences, illustrating the sequence of events that transpire as a function of cache states and transaction type. The transaction sequences are described in separate tables for each interesting combination of transaction and initial state.
  • Page 147 UltraSPARC User’s Manual 7.16.3 ReadToShare Block Condition: Load miss on Processor 1; another processor (P2) has the data exclu- sively. Table 7-27 ReadToShare One Processor Has it Exclusively Processor 1 System Processor 2 Processor 3 Initial state: Etag{I} Initial state: Etag{E} Initial state: Etag{I} P_RDS_REQ to System S_CPB_REQ to P2...
  • Page 148 7. UltraSPARC External Interfaces When Processor 2’s initial state is Etag{M} the sequence is the same, except that Processor 2 transitions to Etag{O}. Processor 3 initial state is Etag{I} by definition in this case, and no transaction is generated to it by SC. When Processor 2’s initial state is Etag{S} the sequence is the same.
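The copyback cases described above (the owner starts in E, M, or S) can be collected into one next-state function for the processor that receives S_CPB_REQ. The text states the M-to-O transition and that a Shared line follows the same sequence; the E-to-S transition and the unchanged O state in this C sketch are assumptions based on ordinary MOESI behavior, and all names are hypothetical.

```c
/* Etag coherence states (MOESI). Names are illustrative only. */
typedef enum { ETAG_M, ETAG_O, ETAG_E, ETAG_S, ETAG_I } EtagState;

/* Etag state of the responding processor after it services a copyback
 * for another processor's ReadToShare. */
static EtagState etag_after_copyback(EtagState before)
{
    switch (before) {
    case ETAG_M:
        return ETAG_O;   /* stated in the text: M transitions to O */
    case ETAG_E:
        return ETAG_S;   /* ASSUMPTION: exclusive copy becomes shared */
    default:
        return before;   /* S stays S (per text); O and I assumed unchanged */
    }
}
```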
  • Page 149 UltraSPARC User’s Manual Table 7-30 ReadToOwn for Write Permission Processor 1 System Processor 2 Processor 3 Initial state: Etag{S} Initial state: Etag{O} Initial state:Etag{S} P_RDO_REQ to System S_INV_REQ to P2 S_INV_REQ to P3 P2 updates Etag{O P3 updates Etag{S P_SACK to System P_SACK to System S_OAK to P1 (no data is transferred)
  • Page 150 7. UltraSPARC External Interfaces The following transaction sequence is the same as for Section 7.16.1, “ReadToShare Block,” except that the miss generates a dirty victim block. UltraSPARC always issues the read request before the Writeback request, but the requests can be completed in any order...
  • Page 151 raSPARC User’s Manual Table 7-33 Victim Writeback: Writeback Serviced Before Read Miss Processor 1 System Processor 2 Processor 3 Start read from memory S_RBU reply to P1 P1 reads the data Final state: Final state: updates Etag2{I No change No change 16.10 ReadToShare Dirty Victimized Block Condition: Load miss by another processor (P2) on a dirty line for which Proces- sor 1’s Writeback transaction has not yet completed.
  • Page 152 7. UltraSPARC External Interfaces 7.16.11 ReadToOwn Dirty Victimized Block Condition: Store miss by another processor (P2). The transaction sequence shown in Table 7-35 is the same as in Section 7.16.8, “Victim Writeback,” except that another processor P2 makes a ReadToOwn request for the victimized block in P1 before the Writeback transaction from P1 has been acknowledged by System...
  • Page 153: Interconnect Packet Formats

    UltraSPARC User’s Manual 7.16.12 ReadToOwn Dirty Victimized Block Condition: Store hit by another processor (P2). The following transaction sequence is the same as for Section 7.16.5, “ReadToOwn Block,” except that P2 already has the block in the Shared state (store hit), and P1 has the victimized block in the Owned state (due to the previous ReadToShare request from P2)...
  • Page 154 7. UltraSPARC External Interfaces 7.17.1 Request Packets The SYSADDR bus is a 36-bit transaction request bus with one odd-parity bit (SYSADDR<35>). The request packet comprises 72 bits and is carried on SYSADDR in two successive interconnect clock cycles. Figure 7-31 shows the P_REQ and S_REQ types. Packet Type Initiated by UltraSPARC Initiated by SC...
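To make the odd-parity statement concrete, the C sketch below computes the parity bit for one 36-bit SYSADDR cycle. It assumes that the parity bit in position 35 covers bits <34:0> of the same cycle, which this excerpt does not spell out; the function names are hypothetical.

```c
#include <stdint.h>

/* Returns the value for bit 35 so that the whole 36-bit word carries an
 * odd number of ones.
 * ASSUMPTION: parity covers bits <34:0> of the same bus cycle. */
static unsigned sysaddr_odd_parity(uint64_t bits34_0)
{
    uint64_t v = bits34_0 & ((1ULL << 35) - 1);  /* keep bits <34:0> only */
    unsigned ones = 0;

    while (v != 0) {
        ones += (unsigned)(v & 1u);
        v >>= 1;
    }
    /* Even count of ones in the payload: set parity to make the total odd. */
    return (ones % 2u == 0u) ? 1u : 0u;
}

/* Builds the full 36-bit value driven in one cycle. */
static uint64_t sysaddr_with_parity(uint64_t bits34_0)
{
    uint64_t payload = bits34_0 & ((1ULL << 35) - 1);
    return payload | ((uint64_t)sysaddr_odd_parity(payload) << 35);
}
```

A 72-bit request packet would then be two such words, one per interconnect clock cycle, each with its own parity bit under the same assumption.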
  • Page 155 UltraSPARC User’s Manual First Cycle Second Cycle Parity Parity Class Class Master ID Physical Address<8:6> Physical Address<40:39> Reserved Transaction Type Reserved 22-13 Physical Address<38:14> Physical Address<16:4> Figure 7-32 Packet Format: Coherent P_REQ and S_REQ Transactions First Cycle Second Cycle Parity Parity Class Class...
  • Page 156 7. UltraSPARC External Interfaces 7.17.2 Packet Description 7.17.2.1 Master ID (MID) MID is a 5-bit field. It identifies the source Interconnect master port that made this request. MasterID is the same as the port_ID bits. SC can use MID to maintain ordering for transactions with the same MID, and to parallelize requests with different MIDs.
  • Page 157 UltraSPARC User’s Manual 7.17.2.4 Physical Address PA<40:4> Bits PA<40:4> of the 41-bit physical address space accessible to UltraSPARC. The low order 4 bits PA<3:0> of the physical address are implied in the bytemask in P_NCRD_REQ and P_NCWR_REQ transactions. All other transactions transfer 64-byte blocks and thus, PA<3:0>=0.
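The address rules above reduce to a few bit operations. The C sketch below extracts the PA<40:4> field that travels in the packet and checks the 64-byte block alignment that block transfers require. The helper names are hypothetical, and how the field is split across the two packet cycles is not modeled here.

```c
#include <stdbool.h>
#include <stdint.h>

#define PA_WIDTH   41u    /* 41-bit physical address space */
#define BLOCK_SIZE 64u    /* block transfers move 64 bytes */

/* Returns PA<40:4>, the part of the address carried in the packet. */
static uint64_t pa_field_40_4(uint64_t pa)
{
    return (pa & ((1ULL << PA_WIDTH) - 1)) >> 4;
}

/* True if pa sits on a 64-byte boundary (PA<5:0> all zero). */
static bool pa_is_block_aligned(uint64_t pa)
{
    return (pa & (BLOCK_SIZE - 1)) == 0;
}

/* Rounds pa down to the start of its 64-byte block. */
static uint64_t pa_block_base(uint64_t pa)
{
    return pa & ~(uint64_t)(BLOCK_SIZE - 1);
}
```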
  • Page 158: Writeinvalidate

    7. UltraSPARC External Interfaces perform any tag match on its Etag for S_CPD_REQ, in order to accelerate its P_REPLY. In this case, the SC’s copyback request is itself an error, indicating that the Dtags do not accurately reflect the state of the processor’s E-Cache. 7.17.2.9 Target ID<4:0>...
  • Page 159 UltraSPARC User’s Manual • Requiring that software include MEMBARs around loads and stores that can cause misses and block stores to the same line. UltraSPARC blocks the issue of instruction fetch miss requests (P_RDSA_REQ) while there are outstanding block stores; it also inhibits issuing block stores while there are outstanding instruction fetch miss requests.
  • Page 160: Address Spaces, Asis, Asrs, And Traps

    Address Spaces, ASIs, ASRs, and Traps 8.1 Overview A SPARC-V9 processor provides an Address Space Identifier (ASI) with every address sent to memory. The ASI is used to distinguish between different address spaces, provide an attribute that is unique to an address space, and to map internal control and diagnostics registers within a processor...
  • Page 161 UltraSPARC User’s Manual 8.3 Alternate Address Spaces The SPARC-V9 Address Space Identifier (ASI) is evenly divided into restricted and nonrestricted halves. ASIs in the range 00₁₆..7F₁₆ are restricted; ASIs in the range 80₁₆..FF₁₆ are nonrestricted. An attempt by non-privileged software to access a restricted ASI causes a trap...
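The split into restricted and nonrestricted halves is a test on the top bit of the 8-bit ASI. A minimal C sketch follows; the function names are hypothetical, the range bounds are read as hexadecimal, and the specific trap type raised is not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* ASIs 0x00..0x7F are restricted; 0x80..0xFF are nonrestricted. */
static bool asi_is_restricted(uint8_t asi)
{
    return asi <= 0x7F;
}

/* True if an access with this ASI would trap: non-privileged software
 * touching a restricted ASI. */
static bool asi_access_traps(uint8_t asi, bool privileged)
{
    return asi_is_restricted(asi) && !privileged;
}
```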
  • Page 162 8. Address Spaces, ASIs, ASRs, and Traps Table 8-1 Mandatory SPARC-V9 ASIs ASI Name (Suggested Macro Syntax) Access Description Section Value ASI_NUCLEUS (ASI_N) Implicit address space, nucleus privilege, TL > 0, ASI_NUCLEUS_LITTLE (ASI_NL) Implicit address space, nucleus privilege, TL > 0, little endian ASI_AS_IF_USER_PRIMARY (ASI_AIUP) Primary address space, user privilege ASI_AS_IF_USER_SECONDARY...
  • Page 163 raSPARC User’s Manual Table 8-2 UltraSPARC Extended (non-SPARC-V9) ASIs ASI Name (Suggested Macro Syntax) Access Description Section ASI_PHYS_USE_EC — Physical address, external cache- 6.10 (ASI_PHYS_USE_EC) able only ASI_PHYS_BYPASS_EC_WITH_EBIT — Physical address, non-cacheable, 6.10 (ASI_PHYS_BYPASS_EC_WITH_EBIT) with side-effect ASI_PHYS_USE_EC_LITTLE — Physical address, external cache- 6.10 (ASI_PHYS_USE_EC_L) able only, little endian...
  • Page 164 8. Address Spaces, ASIs, ASRs, and Traps Table 8-2 UltraSPARC Extended (non-SPARC-V9) ASIs (Continued) ASI Name (Suggested Macro Syntax) Access Description Section Value ASI_ITLB_DATA_ACCESS_REG ..1F8 I-MMU TLB Data Access Regis- 6.9.9 (ASI_ITLB_DATA_ACCESS_REG) ASI_ITLB_TAG_READ_REG ..1F8 I-MMU TLB Tag Read Register 6.9.9 (ASI_ITLB_TAG_READ_REG) ASI_IMMU_DEMAP I-MMU TLB demap...
  • Page 165 raSPARC User’s Manual Table 8-2 UltraSPARC Extended (non-SPARC-V9) ASIs (Continued) ASI Name (Suggested Macro Syntax) Access Description Section ASI_BLOCK_AS_IF_USER_PRIMARY — Primary address space, block 13.6.4 (ASI_BLK_AIUP) load/store, user privilege ASI_BLOCK_AS_IF_USER_SECONDAR — Secondary address space, block 13.6.4 Y (ASI_BLK_AIUS) load/store, user privilege ASI_ECACHE_W (ASI_EC_W) <40:39>=1 E-Cache data RAM diagnostic...
  • Page 166 8. Address Spaces, ASIs, ASRs, and Traps Table 8-2 UltraSPARC Extended (non-SPARC-V9) ASIs (Continued) ASI Name (Suggested Macro Syntax) Access Description Section Value ASI_UDB_INTR_R Incoming interrupt vector data 9.3.1 register 0 ASI_UDB_INTR_R Incoming interrupt vector data 9.3.1 register 1 ASI_UDB_INTR_R Incoming interrupt vector data 9.3.1 register 2...
  • Page 167 State After Reset and in RED_state,” on page 172 for the state of this register after reset. Consult the UltraSPARC-I Data Sheet for the contents of this register’s ID field...
  • Page 168 8. Address Spaces, ASIs, ASRs, and Traps Note: Accesses to the UPA Port ID Register from the local processor return undefined data. Similar state information can be accessed from the UPA Configuration Register, described in Section 8.3.3.2, “UPA Configuration Register,” on page 154. —...
  • Page 169 Table 10-1, “Machine State After Reset and in RED_state,” on page 172 for the state of this register after reset. Figure 8-2 shows the UPA_CONFIG register for UltraSPARC-I. Figure 8-3 shows the UPA_CONFIG register for UltraSPARC-II. — PCON PCAP...
  • Page 170 Note: UltraSPARC-II supports only two combinations of values for the WB and SCIQ0 subfields: WB=0 and SCIQ0=0, which is identical to UltraSPARC-I’s configuration, or WB=1 and SCIQ0=2, which is UltraSPARC-II’s “natural” configuration...
  • Page 171: Ancillary State Registers

    UltraSPARC User’s Manual MID<4:0>: Module (processor) ID register. Identifies the slot in which the module resides; hardwired to the slot number from the connector pins. PCAP<16:0>: Processor Capabilities. Shadows the following fields in the UPA_PORT_ID Register. • PINT_RDQ<16:15> • PREQ_DQ<14:9> •...
  • Page 172 8. Address Spaces, ASIs, ASRs, and Traps Suggested Assembly Language Syntax %y, reg ,reg_or_imm, %y %ccr, reg ,reg_or_imm, %ccr %asi, reg ,reg_or_imm, %asi %tick, reg %pc reg %fprs, reg ,reg_or_imm, %fprs 8.4.3 Non-SPARC-V9 ASRs Non-SPARC-V9 ASRs are listed in Table 8-4 on page 157. Table 8-4 Non-SPARC-V9 ASRs ASR Name/Syntax...
  • Page 173: Other Ultrasparc Registers

    raSPARC User’s Manual Suggested Assembly Language Syntax %pcr, reg ,%pcr %pic, reg ,%pic %gsr, reg ,%gsr ,%clear_softint ,%set_softint %softint, reg ,%softint %tick_cmpr, reg ,%tick_cmpr %dcr, reg ,%dcr 5 Other UltraSPARC Registers Table 8-5 lists additional sets of 64-bit global registers supported by UltraSPARC. Table 8-5 Other UltraSPARC Registers Register Name...
  • Page 174 8. Address Spaces, ASIs, ASRs, and Traps Table 8-6 Traps Supported in UltraSPARC (Continued) Exception or Interrupt Request Globals Priority illegal_instruction privileged_opcode fp_disabled fp_exception_ieee_754 fp_exception_other tag_overflow clean_window ..027 division_by_zero data_access_exception data_access_error 4, 10 mem_address_not_aligned LDDF_mem_address_not_aligned STDF_mem_address_not_aligned privileged_action interrupt_level_n (n=1..15) ..04F 32–n interrupt_vector PA_watchpoint...
  • Page 175 UltraSPARC User’s Manual Some ASIs must be used with specific types of loads and stores; for example, block ASIs can be used only with LDDFA/STDFA. When these ASIs are used with incorrect opcodes, they do not take mem_address_not_aligned traps for memory and register alignment required by the ASI. For example, block ASIs require illegal_instruction 64-byte alignment, but an LDFA opcode with a block ASI checks only for 4-byte alignment.
  • Page 176: Interrupt Handling

    Interrupt Handling 9.1 Interrupt Vectors Processors and I/O devices can interrupt a selected processor by assembling and sending an interrupt packet consisting of three 64-bit words of interrupt data. The contents of this data are defined by software convention. This allows hardware interrupts and cross calls to have the same hardware mechanism for interrupt delivery and to share a common software interface for processing.
  • Page 177 UltraSPARC User’s Manual Note: The processor may not send an interrupt vector to itself. This will cause undefined interrupt vector data to be returned. Code Example 9-1 Code Sequence For Interrupt Dispatch Read state of ASI_INTR_DISPATCH_STATUS; Error if BUSY <no pending interrupt dispatch packet> Repeat Begin atomic sequence (PSTATE.IE ←...
  • Page 178: Interrupt Global Registers

    9. Interrupt Handling dler. All of the external interrupt packets are processed at the highest interrupt priority level; they are then re-prioritized as lower priority interrupts in the software handler. The following pseudo-code sequence illustrates interrupt receive handling. Code Example 9-2 Code Sequence for an Interrupt Receive Read state of ASI_INTR_RECEIVE;...
  • Page 179 UltraSPARC User’s Manual 9.3.1 Outgoing Interrupt Vector Data<2:0> Name: Outgoing Interrupt Vector Data Registers (Privileged) ASI_UDB_INTR_W (data 0): ASI=77 , VA<63:0>=40 ASI_UDB_INTR_W (data 1): ASI=77 , VA<63:0>=50 ASI_UDB_INTR_W (data 2): ASI=77 , VA<63:0>=60 Table 9-1 Outgoing Interrupt Vector Data Register Format Bits Field <63:0>...
  • Page 180 9. Interrupt Handling NACK: Cleared at the start of every interrupt dispatch attempt; set when a dispatch has failed. BUSY: Set if there is an outstanding dispatch. The status of the outgoing interrupt can be read from ASI_INTR_DISPATCH_STATUS. Writes to this ASI cause a trap.
  • Page 181: Software Interrupt (SOFTINT) Register

    UltraSPARC User’s Manual Table 9-4 Interrupt Receive Register Format Bits Field <63:6> Reserved — <5> BUSY Set when an interrupt vector is received <4:0> MID<4:0> MID of interrupter BUSY: This bit is set when an interrupt vector is received. MID<4:0>: Module ID of interrupter. Note: The BUSY bit must be cleared by software writing zero.
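The field layout in Table 9-4 can be illustrated with a small software model. The following is only a sketch in Python, not code from the manual; the function name `decode_intr_receive` is hypothetical, and the bit positions are taken from the table above.

```python
def decode_intr_receive(value: int) -> dict:
    """Split an Interrupt Receive Register value into its documented fields.

    Hypothetical helper for illustration; bit positions follow Table 9-4.
    """
    return {
        "busy": bool((value >> 5) & 0x1),  # bit <5>: BUSY
        "mid": value & 0x1F,               # bits <4:0>: MID of interrupter
    }
```

For example, a value with bit 5 set and 3 in the low five bits decodes as a pending interrupt from module 3.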
  • Page 182 9. Interrupt Handling write to the SET_SOFTINT register (ASR 14 ) with bit <n> corresponding to the interrupt level set. Note that the value written to the SET_SOFTINT register is effectively ORed into the SOFTINT register. This allows the interrupt handler to set one or more bits in the SOFTINT register with a single instruction.
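The OR behavior of SET_SOFTINT can be sketched as a small Python model. This is an illustration only: the class name `SoftintModel`, the 16-bit register width, and the CLEAR_SOFTINT behavior (clearing every bit that is written as one) are assumptions that this excerpt does not state.

```python
SOFTINT_MASK = 0xFFFF  # assumed register width for this sketch


class SoftintModel:
    """Toy model of the SOFTINT register and its set/clear pseudo-registers."""

    def __init__(self) -> None:
        self.softint = 0

    def write_set_softint(self, value: int) -> None:
        # Per the text: the written value is effectively ORed into SOFTINT.
        self.softint |= value & SOFTINT_MASK

    def write_clear_softint(self, value: int) -> None:
        # Assumption: ones in the written value clear the matching bits.
        self.softint &= ~value & SOFTINT_MASK
```

Because the write is an OR, several interrupt levels can be posted with one write, and earlier pending levels are left untouched.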
  • Page 184: Reset and RED_state

    Reset and RED_state 10.1 Overview A reset or trap that sets PSTATE.RED (including a trap in RED_state) will clear the LSU_Control_Register, including the enable bits for the I-Cache, D-Cache, I-MMU, D-MMU, and virtual and physical watchpoints. • The default access in RED_state is noncacheable, so the system must contain some noncacheable scratch memory.
  • Page 185 UltraSPARC User’s Manual Note: Exiting RED_state by writing 0 to PSTATE.RED in the delay slot of a JMPL is not recommended. A noncacheable instruction prefetch may be made to the JMPL target, which may be in a cacheable memory area. This may result in a bus error on some systems, which will cause an instruction_access_error trap.
  • Page 186: RED_state Trap Vector

    10. Reset and RED_state Note: Each register must be initialized before it is used. For example, CWP must be initialized before accessing any windowed registers, since the CWP register selects which register window to access. Failure to properly initialize registers or state prior to use may result in unpredictable or incorrect results. 10.1.2 Externally Initiated Reset (XIR) An Externally Initiated Reset is sent to the CPU via the XIR pin;...
  • Page 187 Unknown Unchanged CLEANWIN Unknown Unchanged WSTATE OTHER Unknown Unchanged NORMAL Unknown Unchanged MANUF 0017 IMPL UltraSPARC-I=0010 UltraSPARC-II=0011 MASK mask-dependent MAXTL MAXWIN Unchanged FPRS Unknown Unchanged
  • Page 188 10. Reset and RED_state Table 10-1 Machine State After Reset and in RED_state (Continued) ‡ Name Fields RED_state Non-SPARC-V9 ASRs SOFTINT Unknown Unchanged TICK_COMPARE INT_DIS 1 (off) Unchanged TICK_CMPR Unknown Unchanged PERF_CONTROL Unknown Unchanged Unknown Unchanged UT (trace user) Unknown Unchanged ST (trace system) Unknown...
  • Page 189 † If power has been cycled, the state of AFSR is unknown; otherwise, it is unchanged. This field or register is not present in UltraSPARC-I.
  • Page 190: Error Handling

    Error Handling 11.1 Overview UltraSPARC provides error checking for all memory access paths between the CPU, E-Cache, UltraSPARC Data Buffer (UDB), and system bus. Errors are reported as system fatal errors, deferred traps, or disrupting traps. System fatal errors are reported when the system must be reset before continuing.
  • Page 191 UltraSPARC User’s Manual Since the AFSR is not reset by power-on reset, error logging information is preserved. Software can examine system registers to determine that reset was due to a P_FERR, and which node generated it. The appropriate AFSR can be read to determine the cause of the P_FERR.
  • Page 192 11. Error Handling destroyed, but no other state will be corrupted. If TPC is pointing to the MEMBAR #Sync following the access, then the data_access_error trap handler knows that a recoverable error has occurred and resumes execution after setting a status flag. The trap handler must set TNPC to TPC + 4 before resuming, because the contents of TNPC are otherwise undefined.
  • Page 193: Memory Errors

    UltraSPARC User’s Manual 11.1.3 Disrupting Errors Disrupting errors are due to Single-Bit ECC Errors (which are corrected by the hardware) and E-Cache data parity errors during write back. Disrupting errors should be handled by logging the error and resuming execution. Recoverable ECC errors result from detection of a single-bit ECC error during a system transaction.
  • Page 194: Memory Error Registers

    11. Error Handling If an E-Cache data parity error occurs while snooping, a bad ECC error is gener- ated and sent to the requester. This causes an instruction_access_error trap at the master that requested the data. The slave processor data_access_error logs error information that can be read by the master during error handling.
  • Page 195 UltraSPARC User’s Manual Table 11-1 E-Cache Error Enable Register Format Bits Field <63:3> Reserved — <2> ISAPEN Trap on system address parity error <1> NCEEN Trap on TO, BERR, LDP, ETP, EDP, WP, UE, IVUE <0> CEEN Trap on correctable memory read error ISAPEN: If set, an address parity error on an incoming UPA transaction causes a system fatal error;...
  • Page 196 11. Error Handling • Bits <19:16> and <15:0> contain the tag and data parity syndromes respectively. Syndrome bits are endian-neutral, that is, bit 0 corresponds to bits<7:0> of the E-Cache data bus (that is, bytes whose least significant four address bits are F ).
  • Page 197 UltraSPARC User’s Manual Table 11-3 E-Cache Data Parity Syndrome Bit Orderings Byte E-Cache Data Syndrome Bit Address Bus Bits <7:0> <15:8> <23:16> <31:24> <39:32> <47:40> <55:48> <63:56> <71:64> <79:72> <87:80> <95:88> <103:96> <111:104> <119:112> <127:120> Table 11-4 E-Cache Tag Parity Syndrome Bit Orderings E-Cache Tag Syndrome Bit Bus Bits...
  • Page 198 11. Error Handling Refer to Table 10-1, “Machine State After Reset and in RED_state,” on page 172 for the state of this register after reset. Name: ASI_ASYNC_FAULT_ADDRESS ASI=4D , VA<63:0>=0 Table 11-5 Asynchronous Fault Address Register Bits Field <63:41> Reserved —...
  • Page 199 UltraSPARC User’s Manual 11.3.4 UltraSPARC Data Buffer (UDB) Error Register For implementation efficiency, the UltraSPARC Data Buffer (UDB) error and control registers are physically separated into upper half and lower half registers. Separate ASIs are used for reading (7F ) and writing (77 ) the UDB registers.
  • Page 200: UltraSPARC Data Buffer (UDB) Control Register

    11. Error Handling 11.4 UltraSPARC Data Buffer (UDB) Control Register Name: ASI_UDBH_CONTROL_REG_WRITE ASI=77 , VA<63:0>=20 Name: ASI_UDBH_CONTROL_REG_READ ASI=7F , VA<63:0>=20 Name: ASI_UDBL_CONTROL_REG_WRITE ASI=77 , VA<63:0>=38 Name: ASI_UDBL_CONTROL_REG_READ ASI=7F , VA<63:0>=38 Table 11-8 UDB Error Register Format Bits Field <63:13> Reserved —...
  • Page 201 UltraSPARC User’s Manual The physical address of the first error within a class (UE, CE, {TO, BE}) is captured in the AFAR until the associated error status bit is cleared in AFSR, or an error from a higher priority class occurs. A CE error overwrites prior TO or BE errors.
  • Page 202: Section III - UltraSPARC and SPARC-V9

    Section III — UltraSPARC and SPARC-V9 12. Instruction Set Summary ..............189 13. UltraSPARC Extended Instructions ..........195 14. Implementation Dependencies ............235 15. SPARC-V9 Memory Models .............. 255
  • Page 204: Instruction Set Summary

    Instruction Set Summary The UltraSPARC CPU implements both the standard SPARC-V9 instruction set and a number of implementation-dependent extended instructions. Standard SPARC-V9 instructions are documented in The SPARC Architecture Manual, Version 9. UltraSPARC extended instructions are documented in Chapter 13, “UltraSPARC Extended Instructions.”...
  • Page 205 UltraSPARC User’s Manual Table 12-1 Complete UltraSPARC Instruction Set Opcode Description ADD (ADDcc) Add (and modify condition codes) ADDC (ADDCcc) Add with carry (and modify condition codes) ALIGNADDRESS Calculate address for misaligned data access 13.5.5 ALIGNADDRESSL Calculate address for misaligned data access (little-endian) 13.5.5 AND (ANDcc) And (and modify condition codes)...
  • Page 206 12. Instruction Set Summary Table 12-1 Complete UltraSPARC Instruction Set (Continued) Opcode Description FMUL(s,d,q) Floating-point multiply A.18 Signed upper 8- × 16-bit partitioned product of corresponding components FMUL8SUx16 13.5.4 Unsigned lower 8- × 16-bit partitioned product of corresponding components FMUL8ULx16 13.5.4 8- ×...
  • Page 207 UltraSPARC User’s Manual Table 12-1 Complete UltraSPARC Instruction Set (Continued) Opcode Description Load double floating-point from alternate space A.26 Zero-extended 8-/16-bit load to a double precision FP register 13.6.2 Load floating-point A.25 Load floating-point from alternate space A.26 Load floating-point state register lower A.25 Load quad floating-point A.25...
  • Page 208 12. Instruction Set Summary Table 12-1 Complete UltraSPARC Instruction Set (Continued) Opcode Description RDPR Read privileged register A.42 RDTICK Read TICK register A.43 Read Y register A.43 RESTORE Restore caller’s window A.45 RESTORED Window has been restored A.46 RETRY Return from trap and retry A.11 RETURN Return...
  • Page 209 UltraSPARC User’s Manual Table 12-1 Complete UltraSPARC Instruction Set (Continued) Opcode Description SUBC (SUBCcc) Subtract with carry (and modify condition codes) A.55 Swap integer register with memory A.56 Swap integer register with memory in alternate space A.57 TADDcc (TADDccTV) Tagged add and modify condition codes (trap on overflow) A.58 TSUBcc...
  • Page 210: UltraSPARC Extended Instructions

    UltraSPARC Extended Instructions 13.1 Introduction UltraSPARC extends the standard SPARC-V9 instruction set with three new classes of instructions designed to support power-down mode (see Section 13.2, “SHUTDOWN”), enhance graphics functionality (see Section 13.5, “Graphics Instructions”), and improve the efficiency of memory accesses (see Section 13.6, “Memory Access Instructions”).
  • Page 211: Graphics Data Formats

    PLL. If desired, the external clock can be stopped after the EPD signal is asserted, in order to allow reset processing to complete. Consult the UltraSPARC-I Data Sheet for electrical and timing related specifications. (See the Bibliography for information about how to obtain the data sheet.)
  • Page 212: Graphics Status Register (GSR)

    13. UltraSPARC Extended Instructions 13.3.2 Fixed Data Formats The fixed 16-bit data format consists of four 16-bit signed fixed-point values contained in a 64-bit word. The fixed 32-bit format consists of two 32-bit signed fixed-point values contained in a 64-bit word. Fixed data values provide an intermediate format with enough precision and dynamic range for filtering and simple image computations on pixel values.
  • Page 213: Graphics Instructions

    UltraSPARC User’s Manual RDASR format: — 30 29 WRASR format: — simm13 30 29 Suggested Assembly Language Syntax %gsr, reg , reg_or_imm , %gsr Accesses to this register cause an fp_disabled trap if either PSTATE.PEF or FPRS.FEF is zero. Figure 13-2 shows the format of the GSR. scale_factor alignaddr_offset —...
  • Page 214 13. UltraSPARC Extended Instructions floating-point/graphics code only). Pixel values are stored in single-precision floating-point registers and fixed values are stored in double-precision floating-point registers, unless otherwise specified. 13.5.1 Opcode Format The graphics instruction set maps to the opcode space reserved for the Implementation-Dependent Instruction 1 (IMPDEP1) instructions.
  • Page 215 UltraSPARC User’s Manual Description: The standard versions of these instructions perform four 16-bit or two 32-bit partitioned adds or subtracts between the corresponding fixed-point values contained in the source operands (rs1, rs2). For subtraction, rs2 is subtracted from rs1. The result is placed in the destination register (rd).
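The partitioned add and subtract can be modeled in a few lines of Python. This is a behavioral sketch, not the hardware algorithm; the function names are hypothetical, and the assumption that each lane wraps around on overflow (no saturation, no carry between lanes) is not stated in this excerpt.

```python
def _lanes(value: int, width: int) -> list:
    """Split a 64-bit value into lanes of `width` bits, least significant first."""
    mask = (1 << width) - 1
    return [(value >> (width * i)) & mask for i in range(64 // width)]


def partitioned_op(rs1: int, rs2: int, width: int = 16, subtract: bool = False) -> int:
    """Model four 16-bit (or two 32-bit) independent adds or subtracts.

    Assumption: each lane wraps modulo 2**width.
    """
    mask = (1 << width) - 1
    rd = 0
    for i, (a, b) in enumerate(zip(_lanes(rs1, width), _lanes(rs2, width))):
        lane = (a - b if subtract else a + b) & mask  # rs2 is subtracted from rs1
        rd |= lane << (width * i)
    return rd
```

Calling `partitioned_op(rs1, rs2, width=32)` models the two-lane 32-bit form; the default models the four-lane 16-bit form.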
  • Page 216 13. UltraSPARC Extended Instructions Description: The PACK instructions convert to a lower precision fixed or pixel format. Input values are clipped to the dynamic range of the output format. Packing applies a scale factor from GSR.scale_factor to allow flexible positioning of the binary point. Note: For good performance, do not use the result of an FPACK as part of a 64-bit graphics instruction source operand in the next three instruction groups.
  • Page 217 UltraSPARC User’s Manual GSR.scale_factor 1010 GSR.scale_factor 0100 implicit binary pt implicit binary pt Figure 13-3 FPACK16 Operation This operation, illustrated in Figure 13-3, is carried out as follows: Left shift the value in rs2 by the number of bits in the GSR.scale_factor, while maintaining clipping information.
  • Page 218 13. UltraSPARC Extended Instructions 13.5.3.2 FPACK32 FPACK32 takes two 32-bit fixed values in rs2, scales, truncates and clips them into two 8-bit unsigned integers. The two 8-bit integers are merged at the corresponding least significant byte positions of each 32-bit word in rs1 left shifted by 8 bits.
  • Page 219 UltraSPARC User’s Manual GSR.scale_factor 0110 implicit binary pt Figure 13-4 FPACK32 Operation 13.5.3.3 FPACKFIX FPACKFIX takes two 32-bit fixed values in rs2, scales, truncates and clips them into two 16-bit signed integers, then stores the result in the 32-bit rd register. This operation, illustrated in Figure 13-5, is carried out as follows:
  • Page 220 13. UltraSPARC Extended Instructions For each 32-bit value, truncate and clip to a 16-bit signed integer starting at the bit immediately to the left of the implicit binary point (i.e. between bits 16 and 15 of each 32-bit word). Truncation is performed to convert the scaled value into a signed integer (i.e.
  • Page 221 UltraSPARC User’s Manual 13.5.3.4 FEXPAND FEXPAND takes four 8-bit unsigned integers in rs2, converts each integer to a 16-bit fixed value, and stores the four 16-bit results in the rd register. This operation, illustrated in Figure 13-6, is carried out as follows: Left shift each 8-bit value by 4 and zero-extend the results to a 16-bit fixed value.
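The FEXPAND step described above (shift each byte left by 4, zero-extend to 16 bits) can be sketched in Python. The function name `fexpand` is hypothetical, and treating rs2 as a 32-bit integer whose byte i maps to 16-bit field i of rd is an assumption of this model.

```python
def fexpand(rs2: int) -> int:
    """Expand four 8-bit unsigned values into four 16-bit fixed values."""
    rd = 0
    for i in range(4):                       # i = 0 is the least significant byte
        pixel = (rs2 >> (8 * i)) & 0xFF
        rd |= ((pixel << 4) & 0xFFFF) << (16 * i)  # shift by 4, zero-extended
    return rd
```

Since an 8-bit value shifted left by 4 never exceeds 12 bits, the upper 4 bits of every 16-bit result are always zero.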
  • Page 222 13. UltraSPARC Extended Instructions FPMERGE also converts from planar to packed when it is applied twice in succession; for example: R1R2R3R4, B1B2B3B4 → R1B1R2B2R3B3R4B4 → R1G1B1A1R2G2B2A2 Figure 13-7 FPMERGE Operation
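The first stage of the example above (R1R2R3R4, B1B2B3B4 → R1B1R2B2R3B3R4B4) is a byte interleave, which can be sketched in Python. The function name `fpmerge` and the representation of the two sources as 32-bit integers are assumptions of this model.

```python
def fpmerge(rs1: int, rs2: int) -> int:
    """Interleave the four bytes of rs1 with the four bytes of rs2.

    Each rs1 byte lands in the more significant half of a 16-bit pair,
    matching the R1B1R2B2R3B3R4B4 ordering in the example.
    """
    rd = 0
    for i in range(4):                   # i = 0 is the least significant byte
        a = (rs1 >> (8 * i)) & 0xFF
        b = (rs2 >> (8 * i)) & 0xFF
        rd |= ((a << 8) | b) << (16 * i)
    return rd
```

Reading the result from the most significant byte down gives the rs1 and rs2 bytes in alternation.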
  • Page 223 UltraSPARC User’s Manual 13.5.4 Partitioned Multiply Instructions opcode operation 8- × 16-bit partitioned product 0 0011 0001 FMUL8x16 8- × 16-bit upper α partitioned product 0 0011 0011 FMUL8x16AU 8- × 16-bit lower α partitioned product 0 0011 0101 FMUL8x16AL upper 8- ×...
  • Page 224 13. UltraSPARC Extended Instructions 13.5.4.1 FMUL8x16 FMUL8x16 multiplies each unsigned 8-bit value (i.e., a pixel) in rs1 by the corresponding (signed) 16-bit fixed-point integer in rs2; it rounds the 24-bit product (assuming a binary point between bits 7 and 8) and stores the upper 16 bits of the result into the corresponding 16-bit field in the rd register.
  • Page 225 UltraSPARC User’s Manual Figure 13-9 FMUL8x16AU Operation 13.5.4.3 FMUL8x16AL FMUL8x16AL is the same as FMUL8x16AU, except that the least significant 16 bits of the 32-bit rs2 register are used for the α value. Figure 13-10 FMUL8x16AL Operation
  • Page 226 13. UltraSPARC Extended Instructions 13.5.4.4 FMUL8SUx16 FMUL8SUx16 multiplies the upper 8 bits of each 16-bit signed value in rs1 by the corresponding signed 16-bit fixed-point integer in rs2. It rounds the 24-bit product (to nearest) and then stores the upper 16 bits of the result into the corresponding 16-bit field of the rd register.
  • Page 227 UltraSPARC User’s Manual sign-extended sign-extended sign-extended sign-extended 8 msb 8 msb 8 msb 8 msb Figure 13-12 FMUL8ULx16 Operation 13.5.4.6 FMULD8SUx16 FMULD8SUx16 multiplies the upper 8 bits of each 16-bit signed value in rs1 by the corresponding signed 16-bit fixed-point integer in rs2. The 24-bit product is shifted left by 8 bits to make up a 32-bit result.
  • Page 228 13. UltraSPARC Extended Instructions 13.5.4.7 FMULD8ULx16 FMULD8ULx16 multiplies the unsigned lower 8 bits of each 16-bit value in rs1 by the corresponding fixed-point signed integer in rs2. Each 24-bit product is sign-extended to 32 bits and stored in the rd register. The operation is illustrated in Figure 13-14.
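The FMUL8x16 description (unsigned 8-bit times signed 16-bit, round the 24-bit product, keep the upper 16 bits) can be modeled as follows. This Python sketch uses hypothetical function names, and the rounding rule (add one half of the discarded range, 0x80, then shift right by 8) is an assumption about how ties are resolved.

```python
def _to_signed16(x: int) -> int:
    """Interpret a 16-bit pattern as a two's-complement signed value."""
    return x - 0x10000 if x & 0x8000 else x


def fmul8x16(rs1: int, rs2: int) -> int:
    """Model FMUL8x16: four 8-bit pixels in rs1 times four 16-bit values in rs2."""
    rd = 0
    for i in range(4):
        pixel = (rs1 >> (8 * i)) & 0xFF                    # unsigned 8-bit
        fixed = _to_signed16((rs2 >> (16 * i)) & 0xFFFF)   # signed 16-bit
        product = pixel * fixed                            # fits in 24 bits
        rounded = (product + 0x80) >> 8                    # assumed rounding rule
        rd |= (rounded & 0xFFFF) << (16 * i)
    return rd
```

With a pixel of 3 and a fixed value of 0x0080, the exact quotient 1.5 rounds to 2 under this rule.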
  • Page 229 UltraSPARC User’s Manual 13.5.5 Alignment Instructions opcode operation 0 0001 1000 ALIGNADDRESS Calculate address for misaligned data access 0 0001 1010 ALIGNADDRESS_LITTLE Calculate address for misaligned data access, little-endian 0 0100 1000 FALIGNDATA Perform data alignment for misaligned data Format (3): 110110 30 29 Suggested Assembly Language Syntax...
  • Page 230 13. UltraSPARC Extended Instructions faligndata %f0, %f4, %f8 Traps fp_disabled Note: For good performance, do not use the result of FALIGN as a 32-bit graphics instruction source operand in the next instruction group. 13.5.6 Logical Operate Instructions opcode operation 0 0110 0000 FZERO Zero fill 0 0110 0001...
  • Page 231 UltraSPARC User’s Manual Format (3): 11 0110 30 29 Suggested Assembly Language Syntax fzero freg fzeros freg fone freg fones freg fsrc1 freg , freg fsrc1s freg , freg fsrc2 freg , freg fsrc2s freg , freg fnot1 freg , freg fnot1s freg , freg...
  • Page 232 13. UltraSPARC Extended Instructions Description: The standard 64-bit version of these instructions performs one of sixteen 64-bit logical operations between rs1 and rs2. The result is stored in rd. The 32-bit (single-precision) version of these instructions performs 32-bit logical operations. Note: For good performance, do not use the result of a single logical as part of a 64-bit graphics instruction source operand in the next instruction group.
  • Page 233 UltraSPARC User’s Manual Suggested Assembly Language Syntax fcmple32 freg , freg , reg fcmpne16 freg , freg , reg fcmpne32 freg , freg , reg fcmpeq16 freg , freg , reg fcmpeq32 freg , freg , reg Description: Four 16-bit or two 32-bit fixed-point values in rs1 and rs2 are compared. The 4-bit or 2-bit results are stored in the corresponding least significant bits of the integer rd register.
  • Page 234 13. UltraSPARC Extended Instructions 13.5.8 Edge Handling Instructions opcode operation 0 0000 0000 EDGE8 Eight 8-bit edge boundary processing 0 0000 0010 EDGE8L Eight 8-bit edge boundary processing, little-endian 0 0000 0100 EDGE16 Four 16-bit edge boundary processing 0 0000 0110 EDGE16L Four 16-bit edge boundary processing, little-endian...
  • Page 235 UltraSPARC User’s Manual If 32-bit address masking is disabled (PSTATE.AM = 0, 64-bit addressing) and the upper 61 bits of rs1 are equal to the corresponding bits in rs2, rd is set equal to the right edge mask ANDed with the left edge mask. If 32-bit address masking is enabled (PSTATE.AM = 1, 32-bit addressing) and the bits <31:3>...
  • Page 236 13. UltraSPARC Extended Instructions Table 13-2 Edge Mask Specification (Little-Endian) Edge Size A2..A0 Left Edge Right Edge 1111 1111 0000 0001 1111 1110 0000 0011 1111 1100 0000 0111 1111 1000 0000 1111 1111 0000 0001 1111 1110 0000 0011 1111 1100 0000 0111 1111 1000 0000...
  • Page 237 UltraSPARC User’s Manual Traps: fp_disabled 13.5.10 Three-Dimensional Array Addressing Instructions opcode operation 0 0001 0000 ARRAY8 Convert 8-bit 3-D address to blocked byte address 0 0001 0010 ARRAY16 Convert 16-bit 3-D address to blocked byte address 0 0001 0100 ARRAY32 Convert 32-bit 3-D address to blocked byte address Format (3): 11 0110...
  • Page 238 13. UltraSPARC Extended Instructions Figure 13-15 shows the format of rs1. Z integer Z fraction Y integer Y fraction X integer X fraction Figure 13-15 Three Dimensional Array Fixed-Point Address Format The integer parts of X, Y, and Z are converted to the following blocked-address formats: Middle Upper...
  • Page 239 UltraSPARC User’s Manual Note: To maximize reuse of E-Cache and TLB data, software should block array references for large images to the 64 KB level. This means processing elements within a 32 x 64 x 64 block. The following code fragment shows assembly of components along an interpolated line at the rate of one component per clock on UltraSPARC: Code Example 13-4 Assembly of Components Along an Interpolated Line Addr, DeltaAddr, Addr...
  • Page 240: Memory Access Instructions

    13. UltraSPARC Extended Instructions 13.6 Memory Access Instructions 13.6.1 Partial Store Instructions Opcode imm_asi ASI Value Operation ASI_PST8_P STDFA Eight 8-bit conditional stores to primary address space ASI_PST8_S STDFA Eight 8-bit conditional stores to secondary address space ASI_PST8_PL STDFA Eight 8-bit conditional stores to primary address space, little-endian ASI_PST8_SL STDFA...
  • Page 241 UltraSPARC User’s Manual most significant bit of the mask (not the entire register) corresponds to the most significant part of the rs1 register. The data is stored in little-endian form in memory if the ASI name has a “_LITTLE” suffix; otherwise, it is big-endian. Note: If the byte ordering is little-endian, the byte enables generated by this instruction are swapped with respect to big-endian.
  • Page 242 13. UltraSPARC Extended Instructions 13.6.2 Short Floating-Point Load and Store Instructions Opcode imm_asi ASI Value Operation LDDFA ASI_FL8_P 8-bit load/store from/to primary address space STDFA LDDFA ASI_FL8_S 8-bit load/store from/to secondary address space STDFA LDDFA ASI_FL8_PL 8-bit load/store from/to primary address space, little-endian STDFA...
  • Page 243 UltraSPARC User’s Manual These ASIs allow 8- and 16-bit loads or stores to be performed to the floating-point registers. Eight-bit loads can be performed to arbitrary byte addresses. For sixteen-bit loads, the least significant bit of the address must be zero, or a mem_address_not_aligned trap is taken.
  • Page 244 13. UltraSPARC Extended Instructions 13.6.3 Atomic Quad Load Opcode imm_asi ASI Value Operation ASI_NUCLEUS_QUAD_LDD LDDA 128-bit atomic load ASI_NUCLEUS_QUAD_LDD_L LDDA 128-bit atomic load, little endian Format (3) LDDA: 01 0011 imm_asi 01 0011 simm_13 30 29 Suggested Assembly Language Syntax ldda [ reg_addr ] imm_asi , reg ldda...
  • Page 245 UltraSPARC User’s Manual 13.6.4 Block Load and Store Instructions Opcode imm_asi ASI Value Operation LDDFA 64-byte block load/store from/to primary ASI_BLK_AIUP STDFA address space, user privilege LDDFA 64-byte block load/store from/to secondary ASI_BLK_AIUS STDFA address space, user privilege 64-byte block load/store from/to primary LDDFA ASI_BLK_AIUPL...
  • Page 246 13. UltraSPARC Extended Instructions Description: Block load and store instructions are selected by using one of the block transfer ASIs with the LDDFA and STDFA instructions. These ASIs allow block loads or stores to be performed to the same address spaces as normal loads and stores. Little-endian ASIs access data in little-endian format; otherwise the access is assumed to be big-endian.
  • Page 247 UltraSPARC User’s Manual Note: These instructions are used for transferring large blocks of data (more than 256 bytes); for example, BCOPY and BFILL. On UltraSPARC they do not allocate in the D-Cache or E-Cache on a miss. UltraSPARC updates the E-Cache on a hit.
  • Page 248 13. UltraSPARC Extended Instructions taken, so the trap handler need not consider pending block loads. If the BLD overlaps a previous or later store and there is no intervening MEMBAR, trap, or data reference, the BLD may return data from before or after the store. BST does not follow memory model ordering with respect to loads, stores or flushes.
  • Page 249 UltraSPARC User’s Manual Code Example 13-5 Byte-Aligned Block Copy Inner Loop Note that the loop must be unrolled two times to achieve maximum performance. All FP registers are double-precision. Eight versions of this loop are needed to handle all the cases of double word misalignment between the source and destination.
  • Page 250: Implementation Dependencies

    Implementation Dependencies 14.1 SPARC-V9 General Information 14.1.1 Level-2 Compliance (Impdep #1) UltraSPARC is designed to meet Level-2 SPARC-V9 compliance. It • Correctly interprets all non-privileged operations, and • Correctly interprets all privileged elements of the architecture. Note: System emulation routines (for example, quad-precision floating-point operations) shipped with UltraSPARC also must be Level-2 compliant.
  • Page 251 UltraSPARC User’s Manual 14.1.3 Trap Levels (Impdep #37, 38, 39, 40, 114, 115) UltraSPARC supports five trap levels; that is, MAXTL=5. Normal execution is at TL0. Traps at MAXTL–1 cause the CPU to enter RED_state. If a trap is generated while the CPU is operating at TL = MAXTL, the CPU will enter error_state and generate a Watchdog Reset (WDR).
  • Page 252 14. Implementation Dependencies multiple nested traps, promoting processor efficiency while dramatically reducing the system overhead needed for trap handling. Three sets of alternate globals are selected for different kinds of traps: • MMU globals for memory faults • Interrupt globals, and •...
  • Page 253 UltraSPARC User’s Manual and FFFF F7FF FFFF FFFF inclusive are termed “out-of-range” and are illegal. Address translation and MMU related descriptions can be found in Section 4.2, “Virtual Address Translation,” on page 21. FFFF FFFF FFFF FFFF FFFF F800 0000 0000 FFFF F7FF FFFF FFFF Out of Range VA (VA “Hole”)...
  • Page 254 14. Implementation Dependencies ing address by XORing ones into the upper 20 bits. See also Section 6.9.4, “I-/D-MMU Synchronous Fault Status Registers (SFSR),” on page 58 and Section 6.9.5, “I-/D-MMU Synchronous Fault Address Registers (SFAR),” on page 60. When a trap occurs on the delay slot of a taken branch or call whose target is out-of-range, or the last instruction below the VA hole, UltraSPARC records the fact that nPC points to an out-of-range instruction.
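The trap-level rules above can be summarized in a short Python model. This is a sketch only: the function name `take_trap` is hypothetical, the label "execute_state" for the ordinary case is borrowed from SPARC-V9 terminology rather than from this excerpt, and the Watchdog Reset itself is not modeled.

```python
MAXTL = 5  # from the text: UltraSPARC supports five trap levels


def take_trap(tl: int) -> tuple:
    """Return (new_tl, resulting_state) for a trap taken at trap level `tl`."""
    if tl == MAXTL:
        # Trap while already at MAXTL: error_state, followed by a WDR.
        # TL is reported unchanged here purely for illustration.
        return (tl, "error_state")
    if tl == MAXTL - 1:
        # Trap at MAXTL-1: the CPU enters RED_state.
        return (tl + 1, "RED_state")
    return (tl + 1, "execute_state")
```

For example, a trap taken during normal execution (TL = 0) moves to TL = 1, while a trap at TL = 4 enters RED_state.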
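A range check for the VA hole can be sketched in Python. The upper bound of the hole (FFFF F7FF FFFF FFFF) comes from the text above; the lower bound used here (0000 0800 0000 0000) is an assumption, chosen as the mirror image of the upper bound because the start of the sentence is cut off in this excerpt. The function name is hypothetical.

```python
VA_HOLE_START = 0x0000_0800_0000_0000  # assumed lower bound (not in this excerpt)
VA_HOLE_END = 0xFFFF_F7FF_FFFF_FFFF    # upper bound, from the text


def is_out_of_range(va: int) -> bool:
    """Return True if a 64-bit virtual address falls inside the VA hole."""
    va &= (1 << 64) - 1  # treat the address as an unsigned 64-bit value
    return VA_HOLE_START <= va <= VA_HOLE_END
```

Under this assumption, a legal address is one whose upper bits are all copies of bit 43, so the check is equivalent to a sign-extension test.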
  • Page 255: Sparc-V9 Integer Operations

    UltraSPARC User’s Manual 14.1.8 Population Count Instruction (POPC) The population count instruction is not directly executed in hardware; it is emulated in software. 14.1.9 Secure Software To establish an enhanced security environment, it may be necessary to initialize certain processor states between contexts. Examples of such states are the contents of integer and floating-point register files, condition codes, and state registers.
  • Page 256 , that uniquely identifies an UltraSPARC-class CPU. Table 14-3 shows the VER.impl values for each UltraSPARC model. Table 14-3 VER.impl Values by UltraSPARC Model UltraSPARC-I UltraSPARC-II VER.impl 0010 0011 mask: 8-bit mask set revision number that identifies the mask set revision of this...
  • Page 257: Sparc-V9 Floating-Point Operations

    UltraSPARC User’s Manual and is incremented for each all-layer mask revision. The minor number starts at zero for each major revision, and is incremented for each less-than-all-layer mask revision. maxtl: Maximum number of supported trap levels beyond level 0. This is the same as the largest possible value for the TL register.
  • Page 258 14. Implementation Dependencies Subnormal Operand Trapping Cases (NS=0) Table 14-4 Two Subnormal Operations One Subnormal Operand Operands F(sd)TO(ix) Unfinished trap always — F(sd)TO(ds) FSQRT(sd) FADD/SUB(sd) Unfinished trap always Unfinished trap always FSMULD FMUL(sd) Unfinished trap if no overflow and: Unfinished trap always FDIV(sd) -25 <...
  • Page 259 UltraSPARC User’s Manual enabled, an fp_exception_other (with FSR.ftt=2, unfinished_FPop) trap is generated. System software will properly handle these cases and resume execution. If the exception is not enabled, the actual result status is used to update the aexc bits of the FSR.
  • Page 260 14. Implementation Dependencies The FPRS.DU and FPRS.DL may be set pessimistically, even though the instruction that modified the floating-point register file is nullified. 14.3.5 Floating-Point Status Register (FSR) (Impdep #13, 19, 22, 23, 24) UltraSPARC supports precise traps and implements all three exception fields (TEM, cexc, and aexc) conforming to IEEE Std 754-1985.
  • Page 261 UltraSPARC User’s Manual RD: IEEE Std 754-1985 Rounding Direction. Table 14-8 Floating-Point Rounding Modes Round Toward Nearest (even if tie) +∞ –∞ TEM: 5-bit trap enable mask for the IEEE-754 floating-point exceptions. If a floating-point operate instruction produces one or more exceptions, the corresponding cexc/aexc bits are set and an fp_exception_ieee_754 (with...
  • Page 262: Sparc-V9 Memory-Related Operations

    14. Implementation Dependencies Note: UltraSPARC does not contain an FQ. An attempt to read the FQ with a RDPR instruction causes an illegal_instruction trap. Note: SPARC-V8-compatible programs should set the least significant bit of the floating-point register number to zero for all double-precision instructions. Violation of this SPARC-V8 architectural constraint may result in unexpected program behavior.
  • Page 263 UltraSPARC guarantees that earlier code modifications will be visible across the whole system. 14.4.5 PREFETCH{A} (Impdep #103, 117) For UltraSPARC-I, PREFETCH{A} instructions with fcn=0..4 are treated as NOPs. For UltraSPARC-II, PREFETCH{A} instructions with fcn=0..4 have the following meanings:...
  • Page 264: Non-Sparc-V9 Extensions

    14. Implementation Dependencies 14.4.7 LDD/STD Handling (Impdep #107, 108) LDD and STD instructions are directly executed in hardware. Note: LDD/STD are deprecated in SPARC-V9. In UltraSPARC it is more efficient to use LDX/STX for accessing 64-bit data. LDD/STD take longer to execute than two 32-/64-bit loads/stores.
  • Page 265 UltraSPARC User’s Manual Table 14-11 TICK_compare Register Format Bits Field <63> INT_DIS TICK_INT interrupt enable <62:0> TICK_CMPR Compare value for TICK interrupts INT_DIS: If set, TICK_INT interrupt generation is disabled. TICK_CMPR: Writes to the TICK_Compare Register load a value for comparison to the TICK register bits <62:0>.
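As an illustration of the register layout in Table 14-11, the following Python sketch packs and unpacks a 64-bit TICK_compare value. The function and constant names are hypothetical and chosen only for this example; the sketch models the bit fields and does not access any hardware register.

```python
# Illustrative model of the TICK_compare layout (names are assumptions):
#   bit  <63>    INT_DIS   - when set, TICK_INT generation is disabled
#   bits <62:0>  TICK_CMPR - value compared against TICK<62:0>
INT_DIS_BIT = 1 << 63
TICK_CMPR_MASK = (1 << 63) - 1


def pack_tick_compare(int_dis, tick_cmpr):
    """Build the 64-bit register image from its two fields."""
    if not 0 <= tick_cmpr <= TICK_CMPR_MASK:
        raise ValueError("TICK_CMPR must fit in 63 bits")
    return (INT_DIS_BIT if int_dis else 0) | tick_cmpr


def unpack_tick_compare(value):
    """Split a 64-bit register image into (INT_DIS, TICK_CMPR)."""
    return bool(value & INT_DIS_BIT), value & TICK_CMPR_MASK
```

Keeping the mask and the flag as named constants makes it easy to check that a compare value never spills into bit 63.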
  • Page 266 14. Implementation Dependencies 14.5.6 Partial Stores UltraSPARC supports 8-/16-/32-bit partial stores to memory. See Section 13.6.1, “Partial Store Instructions,” on page 225. 14.5.7 Short Floating-Point Loads and Stores UltraSPARC supports 8-/16-bit loads and stores to the floating-point registers. See Section 13.6.2, “Short Floating-Point Load and Store Instructions,” on page 227.
  • Page 267 UltraSPARC User’s Manual Note: Exiting RED_state by writing 0 to PSTATE.RED in the delay slot of a JMPL instruction is not recommended. A noncacheable instruction prefetch may be made to the JMPL target, which may be in a cacheable memory area. This may result in a bus error on some systems, which causes an instruction_access_error trap.
  • Page 268 14. Implementation Dependencies Note: The AG, IG, and MG bits are mutually exclusive. Attempting to set a reserved encoding using a WRPR to PSTATE will generate an illegal_instruction trap. UltraSPARC does not check for a reserved encoding in TSTATE. This will cause undefined results when a DONE or RETRY is executed.
  • Page 269 UltraSPARC User’s Manual 14.5.14 Debug and Diagnostics Support UltraSPARC support for debug and diagnostics is described in Appendix A, “Debug and Diagnostics Support,” on page 303.
  • Page 270: Sparc-V9 Memory Models

    SPARC-V9 Memory Models 15.1 Overview SPARC-V9 defines the semantics of memory operations for three memory models. From strongest to weakest, they are Total Store Order (TSO), Partial Store Order (PSO), and Relaxed Memory Order (RMO). The differences in these models lie in the freedom an implementation is allowed in order to obtain higher performance during program execution.
  • Page 271: Supported Memory Models

    UltraSPARC User’s Manual data registers, and for access protection. Attempts by non-privileged software (PSTATE.PRIV=0) to access restricted ASIs (ASI<7>=0) cause a privileged_action trap. Memory is logically divided into real memory (cached) and I/O memory (noncached with and without side-effects) spaces. Real memory spaces can be accessed without side-effects.
  • Page 272 15. SPARC-V9 Memory Models • A MEMBAR #StoreLoad must be used to prevent a load from bypassing a prior store, if Strong Sequential Order is desired. • Stores are processed in program order. • Stores cannot bypass earlier loads. • Accesses with the E-bit set (that is, those having side-effects) are all strongly ordered with respect to each other.
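The first rule above (a load may bypass a prior store unless a MEMBAR #StoreLoad intervenes) can be illustrated with a toy model. The Python sketch below is a simplification under stated assumptions: each CPU has a private store buffer, and MEMBAR #StoreLoad is modeled as draining that buffer before the next load. The function `run` and its parameter are hypothetical names, and the model does not describe the actual UltraSPARC implementation.

```python
def run(use_membar):
    """Two CPUs each store to one flag, then load the other CPU's flag."""
    memory = {"x": 0, "y": 0}
    buffers = [[], []]  # one store buffer per CPU (modeling assumption)

    def store(cpu, addr, value):
        buffers[cpu].append((addr, value))  # buffered, not yet visible

    def drain(cpu):
        for addr, value in buffers[cpu]:    # stores leave in program order
            memory[addr] = value
        buffers[cpu].clear()

    def load(cpu, addr):
        # A load sees its own CPU's youngest buffered store, else memory.
        for a, v in reversed(buffers[cpu]):
            if a == addr:
                return v
        return memory[addr]

    store(0, "x", 1)
    store(1, "y", 1)
    if use_membar:          # MEMBAR #StoreLoad modeled as a buffer drain
        drain(0)
        drain(1)
    r0 = load(0, "y")
    r1 = load(1, "x")
    drain(0)
    drain(1)
    return r0, r1
```

Without the barrier, both loads can observe the old value 0 because they bypass the still-buffered stores; with the barrier, that outcome is excluded in this model.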
  • Page 273 UltraSPARC User’s Manual 15.2.3 RMO UltraSPARC implements the following programmer-visible properties in Relaxed Memory Order (RMO) mode: • There is no implicit order between any two memory references, either cacheable or non-cacheable, except that non-cacheable accesses with the E-bit set (that is, those having side-effects) are all strongly ordered with respect to each other.
  • Page 274: Section Iv - Producing Optimized Code

    Section IV — Producing Optimized Code 16. Code Generation Guidelines ............. 261 17. Grouping Rules and Stalls ..............281
  • Page 276: Code Generation Guidelines

    Code Generation Guidelines 16.1 Hardware / Software Synergy One of the goals set for UltraSPARC was for the processor to execute SPARC-V8 binaries efficiently, providing around three times the performance of existing machines running the same code. A significantly larger performance gain can be obtained if the code is re-compiled using a compiler specifically designed for UltraSPARC.
  • Page 277 UltraSPARC User’s Manual 16.2.2 Instruction Alignment 16.2.2.1 I-Cache Organization The 16 KB I-Cache is organized as a 2-way set associative cache, with each set containing 256 eight-instruction lines (Figure 16-1). The 14 bits required to access any location in the I-Cache are composed of the 13 least significant address bits (since the minimum page size is 8 KB, these 13 bits are always part of the page offset and need not be translated) and 1 bit used to predict the associativity number (way) in which instructions reside.
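The 13 untranslated address bits described above can be split into a line index and an instruction slot within the line. The Python sketch below works this out from the stated organization (256 lines per set, eight 4-byte instructions per line); the function name and the exact bit split are derived for illustration and are not quoted from the manual. The way-prediction bit is not modeled.

```python
INSTR_BYTES = 4                      # SPARC instructions are 4 bytes
LINE_BYTES = 8 * INSTR_BYTES         # eight-instruction line = 32 bytes
LINES_PER_SET = 256
WAYS = 2


def icache_fields(addr):
    """Return (line index, instruction slot) from the low 13 address bits."""
    low13 = addr & 0x1FFF            # bits <12:0>, always within the page offset
    line = low13 >> 5                # bits <12:5> select one of 256 lines
    slot = (low13 >> 2) & 0x7        # bits <4:2> select one of 8 instructions
    return line, slot


# Sanity check on the stated capacity: 2 ways x 256 lines x 32 bytes.
CAPACITY_BYTES = WAYS * LINES_PER_SET * LINE_BYTES
```

A branch target whose slot is 0 starts a line, which is the case that allows the largest group to be fetched from that line.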
  • Page 278 16. Code Generation Guidelines struction would be fetched in that case. If the target is accessed from more than one place, it should be aligned so that it accommodates the largest possible group. If accesses to the I-Cache are expected to miss, it may be desirable to align targets on a 16-byte (even 32-byte) boundary so that 4 instructions are forwarded to the next stage.
  • Page 279 UltraSPARC User’s Manual • Breaking the group and scheduling the ALU instruction with the next group. Notice that this may not lengthen the critical path (in terms of number of cycles executed) if the next group can accommodate this extra instruction without adding any new group.
  • Page 280 16. Code Generation Guidelines Since there is one set of prediction bits for every two instructions, it is possible to have two branches (a CTI couple) sharing prediction bits. Under normal circumstances, the bits are maintained correctly; however, the bits may be updated based on the wrong branch if the second branch in the CTI couple is the target of another branch (Figure 16-4).
  • Page 281 UltraSPARC User’s Manual PDU is somewhat separated from the rest of the pipeline, the I-Cache miss may have occurred when the pipeline was already stalled (for example, due to a multi-cycle integer divide, floating-point divide dependency, dependency on load data that missed the D-Cache, etc.). This means that the miss (or part of it) may be transparent to the pipeline.
  • Page 282 16. Code Generation Guidelines • The instruction buffer almost always contains several instructions when an I-Cache miss occurs (an average of about 6.6). • The instruction buffer is filled faster (up to 4 instructions per cycle) than it is emptied. All these factors contribute to reducing the apparent I-Cache miss latency from 6 cycles (assuming an E-Cache hit) to 0.14 cycles on average for fpppp;...
  • Page 283 UltraSPARC User’s Manual The static bit provided by BPcc and FBPfcc instructions is used to set the state machine in either the likely taken state or the likely not taken state (Figure 16-6). For branches without prediction (Bicc, FBfcc), UltraSPARC initializes the state machine to likely not taken.
  • Page 284 16. Code Generation Guidelines Avoid scheduling long latency instructions such as FDIV if the branch is predicted to be not-taken a significant portion of the time (since they affect the timing of the non-taken stream). Avoid scheduling an instruction that would stall dispatching due to a load- use dependency.
  • Page 285 UltraSPARC User’s Manual Assuming that a specific branch can only be predicted with 50% accuracy (basically, it is not predicted), the compiler must balance the two cycle penalty on average for the mispredicted branch case vs. the ability to schedule other instructions around MOVcc (the SETcc cycle and the two groups after MOVcc, since MOVcc is a single instruction group).
  • Page 286 16. Code Generation Guidelines bicc delay F instr1F instr2F grp1 grp2 grp3 grp4 instr1 (correct) Figure 16-9 Cost of a Mispredicted Branch (Shaded Area) It should be obvious from Figure 16-9 how expensive badly behaved branches are for UltraSPARC. Special consideration should be given to moving hard to predict branches after highly predictable branches based on profiling, and to combining conditions to make branches more predictable.
  • Page 287: Data Stream Issues

    UltraSPARC User’s Manual The technique shown in Figure 16-10 can be generalized to N levels, where N branches are correlated and become more predictable. The above technique may lead to unrolling of loops that were previously identified as bad candidates, because of the unpredictable behavior of their conditional branches.
  • Page 288 16. Code Generation Guidelines 16.3.2 D-Cache Timing The latency of a load to the D-Cache depends on the opcode. For unsigned loads, data can be used two cycles after the load. For instance, if the first two instructions in the instruction buffer are a load and an instruction dependent on that load, the grouping logic will break the group after the load and a bubble will be inserted in the pipeline the following cycle.
  • Page 289 UltraSPARC User’s Manual see later, this is desirable not only for improving the D-Cache hit rate (by increasing its utilization density), but also for D-Cache misses where, for sequential accesses, one out of two requests to the E-Cache can be eliminated. Grouping load data beyond a D-Cache sub-block is also desirable, since an E-Cache line contains four D-Cache sub-blocks (for a total of 64 bytes).
  • Page 290 16. Code Generation Guidelines If such a load (D-Cache miss, E-Cache hit) is immediately followed by a use, the group is broken and an (N+1)-cycle stall occurs; Figure 16-12 illustrates this situation. (The figure shows a 7-cycle stall, which is consistent with 1–1–1 mode; 2–2 mode incurs an 8-cycle stall.) load r use r...
  • Page 291 UltraSPARC User’s Manual Figure 16-13 Pipelined Loads to the E-Cache (1–1–1 mode shown) Thus, the load buffer must be at least seven entries deep to accommodate all pipelined loads in the steady state.
  • Page 292 16. Code Generation Guidelines Code Example 16-1 Load Hit Bypassing Load Miss (Not Supported on UltraSPARC) [%l1+%g0],%l6 (D-Cache miss) [%l2+%g0],%l7 (D-Cache hit) %l7,%g1,%g2 (use of D-Cache hit) %l6,%g1,%g3 (use of D-Cache miss) In Code Example 16-1, the first ADD will stall the pipeline until both the load miss and the load hit are handled.
  • Page 293 UltraSPARC User’s Manual 16.3.6.4 Mixing Independent Loads and Stores Note: The bus turnaround penalty is two cycles for systems running in 1–1–1 mode only; systems running in 2–2 mode incur no turnaround penalty. Mixing reads and writes from and to the E-Cache results in a penalty, caused by the difference in timing between reads and writes and also the bus turnaround time.
  • Page 294 16. Code Generation Guidelines In order to increase the throughput to the E-Cache, which results in decreasing the frequency of the store buffer full condition, UltraSPARC collapses two stores to the same 16 bytes of memory into one store. Since compression only occurs among two adjacent entries in the store buffer, the code should be organized so that multiple stores to the same “region”...
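The effect of store ordering on compression can be sketched with a small model. The Python code below assumes, for illustration only, that a new store merges with the youngest store buffer entry when both fall in the same aligned 16-byte region, and it ignores draining, store sizes, and any limit on how many stores one entry can absorb. The function names are hypothetical.

```python
REGION_SHIFT = 4  # 16-byte region, as stated for store compression


def enqueue_store(store_buffer, addr):
    """Append a store; return False if it merged with the adjacent entry."""
    region = addr >> REGION_SHIFT
    if store_buffer and store_buffer[-1] == region:
        return False                 # collapsed into the youngest entry
    store_buffer.append(region)
    return True


def entries_used(addresses):
    """Count store buffer entries consumed by a sequence of store addresses."""
    buf = []
    for addr in addresses:
        enqueue_store(buf, addr)
    return len(buf)
```

In this model, issuing the two stores to each region back to back uses half as many entries as interleaving stores to different regions, which is the reason for grouping stores to the same region together.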
  • Page 295 UltraSPARC User’s Manual Code Example 16-4 RAW Hazard Penalty %l1,[addr1] RAW Hazard [addr1],%l2 %l2,%l3,%l4 Under the Relaxed Memory Order (RMO) mode, stores can pass younger loads if a MEMBAR instruction has not been issued to prevent it. UltraSPARC provides hardware detection of Write-After-Read (WAR) hazards so that a store to the same memory address as an older outstanding load does not pass that load.
  • Page 296: Grouping Rules And Stalls

    Grouping Rules and Stalls 17.1 Introduction This chapter explains in detail how to group instructions to obtain maximum throughput in UltraSPARC. The following subsections explain the formatting conventions that make it easier to understand this information. 17.1.1 Textual Conventions Rules are presented that consider instructions in three different ways: Instructions: Actual SPARC-V9 and UltraSPARC machine instructions.
  • Page 297: General Grouping Rules

    UltraSPARC User’s Manual • FMOVcc (Move Floating-Point Register on Condition) consists of the following instructions: FMOV{s,d,q}A, FMOV{s,d,q}CC, FMOV{s,d,q}CS, FMOV{s,d,q}E, FMOV{s,d,q}G, FMOV{s,d,q}GE, FMOV{s,d,q}GU, FMOV{s,d,q}L, FMOV{s,d,q}LE, FMOV{s,d,q}LEU, FMOV{s,d,q}N, FMOV{s,d,q}NE, FMOV{s,d,q}NEG, FMOV{s,d,q}POS, FMOV{s,d,q}VC, and FMOV{s,d,q}VS. Instruction Classes: Groups of SPARC-V9 and UltraSPARC instructions that have similar effects. Instruction classes are always written in lower case italic body font.
  • Page 298: Instruction Availability

    17. Grouping Rules and Stalls • Floating-point/graphics Note: CALL, RETURN, JMPL, and FCMP{LE,NE,GT,EQ}{16,32} belong to multiple categories. 17.3 Instruction Availability Instruction dispatch is limited to the number of instructions available in the instruction buffer. Several factors limit instruction availability. UltraSPARC fetches up to four instructions per clock from an aligned group of eight instructions.
  • Page 299: Integer Execution Unit (Ieu) Instructions

    UltraSPARC User’s Manual 17.5 Integer Execution Unit (IEU) Instructions IEU instructions can be dispatched only if they are in the first three instruction slots. A maximum of two IEU instructions can be executed in one cycle. There are two IEU pipelines: IEU0 and IEU1.
  • Page 300 17. Grouping Rules and Stalls , and delay dispatching subsequent instructions for a variable MULX {U,S}MUL{cc} number of clocks, depending on the value of the rs1 operand. Four bubbles are inserted when the upper 60 bits of rs1 are zero, or for signed multiplies when the upper 60 bits of rs1 are one.
  • Page 301 UltraSPARC User’s Manual Instructions that read the result of a MOVcc or MOVr cannot be in the same group or the following group. For example: MOVcc %xcc, 0, i6 [i6+i1], i8 Instructions that read the result of an FCMP{LE,NE,GT,EQ}{16,32} (including stores) cannot be in the same group or in the two following groups.
  • Page 302: Control Transfer Instructions

    17. Grouping Rules and Stalls FCMPLE16 FMOVr i5 17.6 Control Transfer Instructions One Control Transfer Instruction (CTI) can be dispatched per group. The follow- ing control transfer instructions are not single group instructions: CALL BPcc , and are always dispatched as the oldest JMPL CALL JMPL...
  • Page 303 UltraSPARC User’s Manual If the delay slot of a DCTI is aligned on a 32-byte address boundary (that is, the DCTI is the last instruction in a cache line and the delay slot contains the first in- struction in the next cache line), then the DCTI cannot be grouped with instruc- tions from the predicted stream.
  • Page 304 17. Grouping Rules and Stalls the W Stage . If the branch in the previous example was predicted not taken but actually was taken: setcc BPcc (mispredicted) FADD (delay slot) f0 (sequential) FMUL FMUL f0,f0,f0 (branch target) If an annulling branch is predicted not taken, the delay slot is still dispatched. Multicycle instructions (except load instructions) run to completion, even if the delay slot instruction is annulled.
  • Page 305: Load / Store Instructions

    UltraSPARC User’s Manual An annulled load use or floating-point use will be treated as a dependent instruction until the N Stage of the branch. For example: FADD f7,f7,f6 Bcc, a (not taken) FADD f6,f7,f8 flushed FADD f6,f7,f8 If the annulling branch is grouped with a delay slot containing a load use, the group will pay the full load use penalty even if the load use is annulled.
  • Page 306 17. Grouping Rules and Stalls Stores are not stalled on a cache miss. Stores are enqueued in the store buffer un- til data can be written to the E-Cache SRAM for cacheable accesses, the UDB for noncacheable accesses, or the internal register for internal ASIs. Store data is written in the order that stores are issued, so a cache miss forces subsequent store hits to remain enqueued until the older store miss data is written out.
  • Page 307 UltraSPARC User’s Manual 17.7.1.2 Cache Timing The following example illustrates D-Cache hit timing. The first load causes UltraSPARC to enter delayed return mode, returning data in the N Stage. The second load is also in delayed return mode, returning data in its N Stage; otherwise it would collide with the first load data.
  • Page 308 17. Grouping Rules and Stalls 17.7.1.4 Read-After-Write and Interaction with Store Buffer If a load hits the D-Cache and overlaps a store in the store buffer, the load will not return data until two clocks after the store updates the D-Cache. The overlap check is pessimistic, because only the lower 14 bits of the effective memory address are checked.
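The pessimism of the overlap check can be shown with a short sketch. The Python function below compares only bits <13:0> of the two addresses, so two accesses that differ above bit 13 are still reported as overlapping. The function name, the explicit byte-size parameters, and the neglect of wrap-around at the 16 KB boundary are assumptions made for this illustration, not a description of the hardware comparator.

```python
LOW14_MASK = (1 << 14) - 1  # only address bits <13:0> take part in the check


def may_overlap(load_addr, load_size, store_addr, store_size):
    """Pessimistic overlap test on the low 14 address bits only."""
    load_lo = load_addr & LOW14_MASK
    store_lo = store_addr & LOW14_MASK
    # Byte ranges [lo, lo + size) intersect when each starts before the other ends.
    return load_lo < store_lo + store_size and store_lo < load_lo + load_size
```

A load at 0x4100 and a store at 0x0100 touch different memory, yet the check reports an overlap because the two addresses agree in their low 14 bits; placing unrelated data at such aliasing offsets can therefore delay loads.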
  • Page 309 UltraSPARC User’s Manual LDD{A} instructions are held in the G Stage until three clocks after the N Stage, or until older loads have returned data. If LDD{A} is dispatched and a miss occurs on an N Stage or earlier load, the instruction will be canceled in the W Stage and fetched again.
  • Page 310: Floating-Point And Graphic Instructions

    17. Grouping Rules and Stalls MEMBAR #LoadStore or #MemIssue will force younger stores to remain outstanding until four clocks after all older loads are not outstanding. In PSO or TSO, stores remain outstanding until four clocks after all older loads are not outstanding.
  • Page 311 UltraSPARC User’s Manual A Class: F{i,x}TO{s,d} F{s,d}TO{d,s} F{s,d}TO{i,x} FABS{s,d} FADD{s,d} FALIGNDATA FAND{s} FANDNOT1{s} FANDNOT2{s} FCMP{E}{s,d} FEXPAND FMOVr{s,d} FMOV{s,d}cc FNAND{s} FNEG{s,d} FNOR{s} FNOT1{s} FNOT2{s} FONE{s} FOR{s} FORNOT1{s} FORNOT2{s} FPADD{16,32}{s} FPMERGE FPSUB{16,32}{s} , and FSRC1{s} FSRC2{s} FSUB{s,d} FXNOR{s} FXOR{s} FZERO{s} M Class: FCMP{LE,NE,GT,EQ}{16,32} FDIST FDIV{s,d}...
  • Page 312 17. Grouping Rules and Stalls MOVcc based on a floating-point condition code can be in the same group as an FCMP{E}{s,d}, however, if they reference different condition codes. For example: FCMP fcc0, f2, f4 MOVcc fcc1, f6, f8 Latencies between dependent floating-point and graphics instructions are shown in Table 17-1, “Latencies for Floating-Point and Graphics Instructions,”...
  • Page 313 UltraSPARC User’s Manual Floating-point stores other than can store the result of a floating-point or ST{X}FSR graphics instruction other than and be in the same group. For ex- FDIV FSQRT ample: FADDs f2, f5, f6 f6, [address] Floating-point stores of the result of an are treated the same as a FDIV FSQRT...
  • Page 314 17. Grouping Rules and Stalls For the preceding two rules, all graphics instructions, FDIVs FSQRTs FdTOi , and are considered to be double, even FsTOx FiTOd FxTOs FsTOd FdTOs FsMULd though a single-precision register is referenced. For example, the following in- structions can be grouped together: FORs f2, f4, f0...
  • Page 315 raSPARC User’s Manual Table 17-1 Latencies for Floating-Point and Graphics Instructions → Result used by FPA or FPM FADD{s,d} FMOVr{s,d} FPACK{16,32,FIX} PDIST {rd} FSUB{s,d} FMOVcc{s,d} FMUL8x16{AL,AU} F{s,d}TO{i,x} FMOV{s,d} FMUL{d}8ULx16 F{i,x}TO{d,s} FABS{s,d} FMUL{d}8SUx16 F{s,d}TO{d,s} FNEG{s,d} PDIST{rs1, rs2} FCMP{s,d} FPADD{16,32}{s} FCMPLE{16,32} FCMPE{s,d} FPSUB{16,32}{s} FCMPNE{16,32} Result...
  • Page 316: Appendixes

    Appendixes A. Debug and Diagnostics Support ............303 B. Performance Instrumentation ............319 C. Power Management................327 D. IEEE 1149.1 Scan Interface ..............329 E. Pin and Signal Descriptions ............... 337 F. ASI Names .................... 345
  • Page 318: Debug And Diagnostics Support

    Debug and Diagnostics Support A.1 Overview All debug and diagnostics accesses are double-word aligned, 64-bit accesses. Non-aligned accesses cause a mem_address_not_aligned trap. Accesses must use LDXA/STXA/LDFA/STDFA instructions, except for the instruction cache ASIs which must use LDDA/STDA/STDFA instructions. Using another type of load or store will cause a data_access_exception trap (with SFSR.FT = 8, Illegal ASI size).
  • Page 319: Floating-Point Control

    UltraSPARC User’s Manual This control register is accessed through ASR 18. Nonprivileged accesses to this register cause a privileged_opcode trap. See also Table 10-1, “Machine State After Reset and in RED_state,” on page 172 for the state of this register after reset. —...
  • Page 320 A. Debug and Diagnostics Support Table A-1 ASIs Affected by Watchpoint Traps Watchpoint if Watchpoint if ASI Type ASI Range D-MMU Matching VA Matching PA ..11 ..19 ..2C Translating ASIs ..71 ..79 ..FF ..15 — Bypass ASIs ..1D ..6F — Nontranslating ASIs ..77 ..7F...
  • Page 321: Lsu_Control_Register

    UltraSPARC User’s Manual DB_VA: The 64-bit virtual data watchpoint address. Note: UltraSPARC-I and UltraSPARC-II support a 44-bit virtual address space. Software is responsible for writing a sign-extended 64-bit address into the VA watchpoint register. The watchpoint address is sign-extended to 64 bits from bit 43 when read.
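The sign extension from bit 43 that software must apply before writing the VA watchpoint register can be sketched as follows. The Python function name is hypothetical, and the sketch only computes the 64-bit value; it does not write any register.

```python
VA_BITS = 44                          # implemented virtual address width
VA_MASK = (1 << VA_BITS) - 1
SIGN_BIT = 1 << (VA_BITS - 1)         # bit 43
UPPER_ONES = ((1 << 64) - 1) ^ VA_MASK  # bits <63:44> all set


def sign_extend_va44(va):
    """Return the 64-bit value whose bits <63:44> replicate bit 43 of va."""
    va &= VA_MASK
    if va & SIGN_BIT:
        va |= UPPER_ONES
    return va
```

For example, a watchpoint on the lowest address of the upper half of the 44-bit space, 0x80000000000, would be written as 0xFFFFF80000000000.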
  • Page 322 A. Debug and Diagnostics Support LSU.D-Cache_enable. If cleared, misses are forced on D-Cache accesses with no cache fill. A FLUSH, DONE, or RETRY instruction is needed after software changes this bit to ensure the new information is used. A.6.2 MMU Control LSU.enable_I-MMU.
  • Page 323 UltraSPARC User’s Manual A.6.4.1 Virtual Address Data Watchpoint Enable VR, VW: LSU.virtual_address_data_watchpoint_enable. If VR/VW is set, a data read/write that matches the (range of) addresses in the virtual watchpoint register cause a watchpoint trap. Both VR and VW may be set to place a watchpoint for either a read or write access.
  • Page 324: I-Cache Diagnostic Accesses

    A. Debug and Diagnostics Support watchpoint is disabled. If the watchpoint is enabled and a data reference overlaps any of the watched bytes in the watchpoint mask, a physical watchpoint trap is generated. A.7 I-Cache Diagnostic Accesses The instruction cache (I-Cache) utilizes the Dynamic Set Prediction technique to realize a set-associative cache with a direct-mapped physical RAM design.
  • Page 325 UltraSPARC User’s Manual Note: To simplify the implementation, read access to the instruction cache fields (ASIs 60..6F) must use the LDDA instruction instead of LDXA or LDDFA. Using another type of load causes a data_access_exception trap (with SFSR.FT = 8, Illegal ASI size).
  • Page 326 A. Debug and Diagnostics Support Undefined IC_valid Undefined IC_tag Figure A-9 I-Cache Tag/Valid Field Data Format (ASI 67 Undefined: The value of these bits are undefined on reads and must be masked off by software. IC_valid: The 1-bit valid field IC_tag: The 28-bit physical tag field (PA<40:13>...
  • Page 327 UltraSPARC User’s Manual Undefined: The value of these bits is undefined on reads and must be masked off by software. IC_pdec: The two 4-bit pre-decode fields. The encodings are: • Bits<3:2> = 00 CALL, BPA, FBA, FBPA or BA • Bits<3:2> = 01 Not a CALL, JMPL, BPA, FBA, FBPA or BA •...
  • Page 328 A. Debug and Diagnostics Support Undefined, und: The value of these bits are undefined on reads and must be masked off by software. IC_lru: Selects the least recently accessed set of the line corresponding to IC_addr. There is only one physical lru bit per IC_addr value (i.e. cache line).
  • Page 329: D-Cache Diagnostic Accesses

    UltraSPARC User’s Manual Note: The branch prediction, set prediction and next field address fields are not updated when instructions are loaded into the cache with ASI_ICACHE_INSTR. When a cache line is brought into the I-Cache, the corresponding IC_sp fields are initialized to the same set as the currently missed line.
  • Page 330: E-Cache Diagnostics Accesses

    A. Debug and Diagnostics Support DC_addr: This 9-bit index <13:5> selects a tag/valid field (512 tags). — DC_tag DC_valid Figure A-19 D-Cache Tag/Valid Access Data Format (ASI 47 DC_tag: The 28-bit physical tag (PA<40:13> of the associated data). DC_valid: The 2-bit valid field, one bit for each sub-block (32-byte block, 16-byte sub-block). Bit<1>...
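The D-Cache fields named above can be extracted from an address with simple shifts and masks. The Python sketch below uses hypothetical function names and takes the sub-block number from address bit 4, which follows from the 32-byte block and 16-byte sub-block sizes; the assignment of sub-blocks to individual DC_valid bits is deliberately not modeled.

```python
def dcache_index(addr):
    """DC_addr: 9-bit index from address bits <13:5> (512 tags)."""
    return (addr >> 5) & 0x1FF


def dcache_tag(pa):
    """DC_tag: 28-bit physical tag, PA<40:13>."""
    return (pa >> 13) & ((1 << 28) - 1)


def dcache_subblock(addr):
    """Which 16-byte sub-block of the 32-byte block an address falls in."""
    return (addr >> 4) & 1
```

Diagnostic software could use helpers of this kind to compute which tag entry to read back for a given address.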
  • Page 331 UltraSPARC User’s Manual EC_addr: A 16-bit index <18:3> selects a 64-bit data field from a 0.5 MB E-Cache. A 17-bit index <19:3> selects a 64-bit data field from a 1 MB E-Cache. An 18-bit index <20:3> selects a 64-bit data field from a 2 MB E-Cache. A 19-bit index <21:3>...
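The index ranges listed for EC_addr follow one pattern: the low bit is always 3 (each data field is 64 bits, or 8 bytes) and the high bit grows with the cache size. The Python sketch below derives the range from the size in bytes; the function name is hypothetical and the sketch assumes a power-of-two cache size.

```python
def ecache_index_range(size_bytes):
    """Return (high bit, low bit, width) of the 64-bit data field index."""
    if size_bytes < 8 or size_bytes & (size_bytes - 1):
        raise ValueError("size must be a power of two of at least 8 bytes")
    low = 3                               # 8-byte (64-bit) data fields
    high = size_bytes.bit_length() - 2    # log2(size) - 1
    return high, low, high - low + 1
```

For a 0.5 MB E-Cache this yields bits <18:3>, a 16-bit index, matching the first case above.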
  • Page 332 A. Debug and Diagnostics Support If written, the content of the E-Cache_tag_data_register is written to the selected E-Cache tag/state/parity fields. The contents of the E-Cache_tag_data_register are previously updated with STA at ASI_ECACHE_TAG_DATA. Note: Software must ensure that the two-step operations are done atomically; e.g., LDXA ASI_ECACHE (TAG) and LDXA ASI_ECACHE_TAG_DATA, STXA ASI_ECACHE_TAG_DATA and STXA ASI_ECACHE (TAG).
  • Page 334: Performance Instrumentation

    Performance Instrumentation B.1 Overview Up to two performance events can be measured simultaneously in UltraSPARC. The Performance Control Register (PCR) controls event selection and filtering (that is, counting user and/or system level events) for a pair of 32-bit Performance Instrumentation Counters (PICs). B.2 Performance Control and Counters The 64-bit PCR and PIC are accessed through read/write Ancillary State Register instructions (RDASR/WRASR).
  • Page 335 UltraSPARC User’s Manual PIC for accurate timing and not on write-to-read counts. See also Table 10-1, “Machine State After Reset and in RED_state,” on page 172 for the state of these registers after reset. Figure B-1 Performance Control Register (PCR)...
  • Page 336: Pcr/Pic Accesses

    B. Performance Instrumentation B.3 PCR/PIC Accesses An example of the operational flow in using the performance instrumentation is shown in Figure B-3. start set up PCR context switch to B accumulate stat in PIC PCR.sel [saveA1] [0,1] PCR.UT/ST [saveA2] [0,1] PCR.PRIV PIC[PCR.sel] PIC[PCR.sel]...
  • Page 337 UltraSPARC User’s Manual Using the two counters to measure instruction completion and cycles allows calculation of the average number of instructions completed per cycle. B.4.2 Grouping (G) Stage Stall Counts These are the major cause of pipeline stalls (bubbles) from the G Stage of the pipeline.
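The instructions-per-cycle calculation mentioned above can be sketched in a few lines. The Python code below assumes that software has sampled the two 32-bit counters before and after the region of interest and that each counter wrapped at most once in between; the function names are hypothetical, and reading the PIC itself is outside the scope of the sketch.

```python
COUNTER_MASK = 0xFFFFFFFF  # each PIC is a 32-bit counter


def pic_delta(start, end):
    """Events counted between two samples, tolerating one 32-bit wrap."""
    return (end - start) & COUNTER_MASK


def instructions_per_cycle(cycles_start, cycles_end, instr_start, instr_end):
    """Average number of instructions completed per cycle over the interval."""
    cycles = pic_delta(cycles_start, cycles_end)
    if cycles == 0:
        raise ValueError("no cycles elapsed between samples")
    return pic_delta(instr_start, instr_end) / cycles
```

Taking the difference modulo 2^32 keeps the result correct when a counter overflows once during the measured interval.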
  • Page 338 B. Performance Instrumentation There are also overcounts due to, for example, mispredicted CTIs and dispatched instructions that are invalidated by traps. Load_use_RAW [PIC1] There is a load use in the execute stage and there is a read-after-write hazard on the oldest outstanding load. This indicates that load data is being delayed by completion of an earlier store.
  • Page 339 UltraSPARC User’s Manual Loads that hit the D-Cache may be placed in the load buffer for a number of reasons; for example, the load buffer was not empty. Such loads may be turned into misses if a snoop occurs during their stay in the load buffer (due to an external request or to an E-Cache miss).
  • Page 340 B. Performance Instrumentation Note: A block memory access is counted as a single reference. Atomics count the read and write individually. B.4.5 PCR.S0 and PCR.S1 Encoding Table B-1 PIC.S0 Selection Bit Field Encoding S0 Value PIC0 Selection 0000 Cycle_cnt 0001 Instr_cnt 0010 Dispatch0_IC_miss...
  • Page 342: Power Management

    Power Management C.1 Overview Power-down mode is intended to support Energy Star compliance for UltraSPARC based systems. Energy Star specifies a system power dissipation of 30 watts in the standby mode. To support this, the goal is one-half watt for the UltraSPARC CPU and one-half watt for the remainder of the module when in the power-down mode.
  • Page 343: Power-Up

    UltraSPARC User’s Manual C.3 Power-Up Restart from power-down mode uses the power-on reset (POR) pin. The system must activate the reset pin with a stable external clock for the same time as a normal power-on reset. This reset will shut off the external power-down (EPD) signal (asynchronously if the module clock generator has been disabled), and enable the clock generator and PLL, like a normal power-up sequence.
  • Page 344: Introduction

    IEEE 1149.1 Scan Interface D.1 Introduction UltraSPARC provides an IEEE Std 1149.1-1990 compliant test access port (TAP) and boundary scan architecture. The primary use of 1149.1 scan interface is for board-level interconnect testing and diagnosis. The IEEE 1149.1 test access port and boundary scan architecture consists of three major parts: •...
  • Page 345: Test Access Port (Tap) Controller

    UltraSPARC User’s Manual Table D-1 IEEE 1149.1 Signals Signal Description Test data out. This is the scan shift output signal from either the instruction register or one of the test data registers. Test data input. This forms the scan shift in signal for the instruction and various test data registers.
  • Page 346: D. Ieee 1149.1 Scan Interface

    D. IEEE 1149.1 Scan Interface Figure D-1 TAP Controller State Diagram (states: TEST-LOGIC-RESET, RUN-TEST/IDLE, SELECT-DR-SCAN, SELECT-IR-SCAN, CAPTURE-DR, CAPTURE-IR, SHIFT-DR, SHIFT-IR, EXIT-1-DR, EXIT-1-IR, PAUSE-DR, PAUSE-IR, EXIT-2-DR, EXIT-2-IR, UPDATE-DR, UPDATE-IR) D.3.3 SELECT-DR-SCAN A temporary state in which all test data registers retain their previous state.
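The sixteen TAP controller states and their transitions can be written as a lookup table. The Python sketch below encodes the standard IEEE 1149.1 state machine, using this manual's spelling of the state names; each entry gives the next state for TMS=0 and TMS=1 on a rising TCK edge. The names `TAP_NEXT`, `step`, and `walk` are hypothetical helpers for illustration.

```python
# next state for (TMS=0, TMS=1), per the IEEE 1149.1 TAP state machine
TAP_NEXT = {
    "TEST-LOGIC-RESET": ("RUN-TEST/IDLE", "TEST-LOGIC-RESET"),
    "RUN-TEST/IDLE":    ("RUN-TEST/IDLE", "SELECT-DR-SCAN"),
    "SELECT-DR-SCAN":   ("CAPTURE-DR", "SELECT-IR-SCAN"),
    "CAPTURE-DR":       ("SHIFT-DR", "EXIT-1-DR"),
    "SHIFT-DR":         ("SHIFT-DR", "EXIT-1-DR"),
    "EXIT-1-DR":        ("PAUSE-DR", "UPDATE-DR"),
    "PAUSE-DR":         ("PAUSE-DR", "EXIT-2-DR"),
    "EXIT-2-DR":        ("SHIFT-DR", "UPDATE-DR"),
    "UPDATE-DR":        ("RUN-TEST/IDLE", "SELECT-DR-SCAN"),
    "SELECT-IR-SCAN":   ("CAPTURE-IR", "TEST-LOGIC-RESET"),
    "CAPTURE-IR":       ("SHIFT-IR", "EXIT-1-IR"),
    "SHIFT-IR":         ("SHIFT-IR", "EXIT-1-IR"),
    "EXIT-1-IR":        ("PAUSE-IR", "UPDATE-IR"),
    "PAUSE-IR":         ("PAUSE-IR", "EXIT-2-IR"),
    "EXIT-2-IR":        ("SHIFT-IR", "UPDATE-IR"),
    "UPDATE-IR":        ("RUN-TEST/IDLE", "SELECT-DR-SCAN"),
}


def step(state, tms):
    """Advance one TCK cycle with the given TMS value (0 or 1)."""
    return TAP_NEXT[state][tms]


def walk(state, tms_bits):
    """Apply a sequence of TMS values and return the final state."""
    for tms in tms_bits:
        state = step(state, tms)
    return state
```

One useful property that can be checked against the table is that holding TMS high for five TCK cycles returns the controller to TEST-LOGIC-RESET from any state.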
  • Page 347 UltraSPARC User’s Manual D.3.4 SELECT-IR-SCAN A temporary state in which all test data registers retain their previous state. D.3.5 CAPTURE IR/DR In this state, the selected register (either instruction register or data register) loads data into its parallel input. For the instruction register, this corresponds to sampling the 8 bits of status information and the loading of the constant ‘01’...
  • Page 348: Instruction Register

    D. IEEE 1149.1 Scan Interface D.4 Instruction Register The instruction register is used to select the test to be performed and/or the test data register to be accessed. The instruction register is 8 bits wide and consists of a shift-register (with parallel inputs) and a parallel output stage.
  • Page 349 UltraSPARC User’s Manual Table D-3 IEEE 1149.1 Instruction Encodings Instruction IR encoding Scan Chain BYPASS bypass IDCODE id register EXTEST boundary SAMPLE boundary INTEST boundary PLLMODE pll mode CLKCTRL clock control RAMWCP ram control POWERCUT HIGHZ bypass INTEST2 boundary FULLSCAN ..7F internal D.5.1 Public Instructions...
  • Page 350: Public Test Data Registers

    D. IEEE 1149.1 Scan Interface D.5.1.4 INTEST Selects the boundary scan register as the active test data register. This instruction allows the boundary scan register to be used as a virtual low-speed functional tester. The on-chip clock is derived from TCK and is issued in the Run-Test/Idle state of the TAP controller.
  • Page 351 UltraSPARC User’s Manual D.6.3 Boundary Scan Register Allows for the testing of circuitry external to the device; for example, the interconnect (EXTEST), setting defined values at the device periphery (EXTEST), the sampling and examination of the values at the pins without disturbing the system (SAMPLE/PRELOAD), and the functional testing of the device itself (INTEST).
  • Page 352: Introduction

    Pin and Signal Descriptions E.1 Introduction This Appendix describes the UltraSPARC pins and signals in a general way. Consult the relevant data sheets for detailed information about the electrical and mechanical characteristics of the processor, including pin and pad assignments. The “Bibliography”...
  • Page 353 UltraSPARC User’s Manual E.2.2 UltraSPARC Data Buffer (UDB) Pins Table E-2 UltraSPARC Data Buffer (UDB) Pins Symbol Type Name and Function SYSDATA<63:0> Connects the UDB chip to the system data interconnect. Two UDB chips are required. Each UDB chip handles half of the 128-bit system data interconnect. SYSECC<7:0>...
  • Page 354: E. Pin And Signal Descriptions

    E. Pin and Signal Descriptions E.2.3 System Interface Pins Table E-3 System Interface Pins Symbol Type Name and Function SYSADDR<35:0> I/O 36-bit bidirectional packet-switched request bus, which includes 1-bit odd-parity. It carries address bits PA<40:4> of a 41-bit physical address space in the P_REQ and S_REQ transactions described in Chapter 7, “UltraSPARC External Interfaces.”...
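    To make the "1-bit odd-parity" wording concrete, here is a generic sketch of how an odd parity bit is computed and checked. It is not a model of the SYSADDR packet format: which bus bits the parity covers is not stated in this excerpt, so the functions below simply operate on an arbitrary integer, and both function names are hypothetical.

```python
def odd_parity_bit(value: int) -> int:
    """Return the parity bit that makes the total number of 1 bits odd."""
    ones = bin(value).count("1")
    return 0 if ones % 2 == 1 else 1


def check_odd_parity(value: int, parity: int) -> bool:
    """True if value plus its parity bit together contain an odd number of 1s."""
    return (bin(value).count("1") + (parity & 1)) % 2 == 1


payload = 0b1001                      # made-up data with two 1 bits
p = odd_parity_bit(payload)           # parity bit must be 1 to make the count odd
print(p, check_odd_parity(payload, p))
```

    With odd parity, an all-zero payload still produces a parity bit of 1, so a bus that is stuck at zero is detected as an error.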
  • Page 355 MCAP<3:0> Implementation-dependent module capability bits. May be used to indicate speed range of the module. Hardwired externally. SCLK_MODE is present only on UltraSPARC-I. LOOP_CAP is present only on UltraSPARC-I. PHASE_DET_CLK is present only on UltraSPARC-II. ECACHE_22_MODE is present only on UltraSPARC-II.
  • Page 356: Signal Descriptions

    E. Pin and Signal Descriptions E.2.6 IEEE 1149.1 (JTAG) Interface Pins Table E-6 IEEE 1149.1 (JTAG) Interface Pins Symbol Type Name and Function IEEE 1149.1 test data output. A three-state signal driven only when the Test Access Port (TAP) controller is in the shift-DR state. IEEE 1149.1 test data input.
  • Page 357 Clock Stopper (debug) EXT_EVENT Initialization Reset RESET_L XIR Reset (NMI) XIR_L Power Down Mode ECAD<19:0> for UltraSPARC-II ECAT<17:0> for UltraSPARC-II LOOP_CAP present in UltraSPARC-I only
  • Page 358 E. Pin and Signal Descriptions E.3.2 UltraSPARC Data Buffer (UDB) Signals Table E-9 UltraSPARC Data Buffer (UDB) Signals Function Name Count Data Transfer E-Cache Data Bus EDATA<63:0> E-Cache Data Bus Parity EDPAR<7:0> System Data Bus SYSDATA<63:0> System Data Bus ECC SYSECC<7:0>...
  • Page 360: Asi Names

    ASI Names F.1 Introduction This Appendix lists the names and suggested macro syntax for all supported Address Space Identifiers. Table F-1 ASI Names (Alphabetical) ASI Name or Macro Syntax Description Value ASI_AFAR Asynchronous fault address register ASI_AFSR Asynchronous fault status register ASI_AIUP Primary address space, user privilege ASI_AIUPL...
  • Page 361 UltraSPARC User’s Manual Table F-1 ASI Names (Alphabetical) (Continued) ASI Name or Macro Syntax Description Value ASI_BLK_PL Primary address space, block load/store, little endian ASI_BLK_S Secondary address space, block load/store ASI_BLK_SL Secondary address space, block load/store, little endian ASI_BLOCK_AS_IF_USER_PRIMARY Primary address space, block load/store, user privilege ASI_BLOCK_AS_IF_USER_PRIMARY_LI Primary address space, block load/store, user privilege, lit-...
  • Page 362: F. Asi Names

    F. ASI Names Table F-1 ASI Names (Alphabetical) (Continued) ASI Name or Macro Syntax Description Value ASI_EC_R E-Cache data RAM diagnostic read access ASI_EC_R E-Cache tag/valid RAM diagnostic read access ASI_EC_TAG_DATA E-Cache tag/valid RAM data diagnostic access ASI_EC_W E-Cache data RAM diagnostic write access ASI_EC_W E-Cache tag/valid RAM diagnostic write access ASI_ESTATE_ERROR_EN_REG...
  • Page 363 UltraSPARC User’s Manual Table F-1 ASI Names (Alphabetical) (Continued) ASI Name or Macro Syntax Description Value ASI_IC_TAG I-Cache tag/valid RAM diagnostic access ASI_IMMU I-MMU Synchronous Fault Status Register ASI_IMMU I-MMU Tag Target Register ASI_IMMU I-MMU TLB Tag Access Register ASI_IMMU I-MMU TSB Register ASI_IMMU_DEMAP I-MMU TLB demap...
  • Page 364 F. ASI Names Table F-1 ASI Names (Alphabetical) (Continued) ASI Name or Macro Syntax Description Value ASI_PRIMARY_NO_FAULT_LITTLE Primary address space, no fault, little endian ASI_PST16_PL Primary address space, 4 16-bit partial store, little endian ASI_PST16_PRIMARY Primary address space, 4 16-bit partial store ASI_PST16_PRIMARY_LITTLE Primary address space, 4 16-bit partial store, little endian ASI_PST16_S...
  • Page 365 UltraSPARC User’s Manual Table F-1 ASI Names (Alphabetical) (Continued) ASI Name or Macro Syntax Description Value ASI_UDBH_ERROR_REG_READ External UDB Error Register, read high ASI_UDBH_ERROR_REG_WRITE External UDB Error Register, write high ASI_UDBL_CONTROL_REG_READ External UDB Control Register, read low ASI_UDBL_CONTROL_REG_WRITE External UDB Control Register, write low ASI_UDBL_ERROR_R External UDB Error Register, read low ASI_UDBL_ERROR_REG_READ...
  • Page 366: Introduction

    These models are: • UltraSPARC-I • UltraSPARC-II G.2 Summary UltraSPARC-I is the base processor model. UltraSPARC-II supports the following enhancements: • Reduced gate dimensions (0.35 µ) and faster cycle times (4 ns) • 8 Mb and 16 Mb E-Cache sizes •...
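    As a quick worked example of what the 4 ns cycle time quoted above implies, the clock frequency is the reciprocal of the cycle time. The helper name below is hypothetical; only the 4 ns figure comes from the text.

```python
def cycle_time_to_mhz(cycle_ns: float) -> float:
    """Convert a clock cycle time in nanoseconds to a frequency in MHz.

    f = 1 / T, and 1 / (1 ns) = 1000 MHz, so f_MHz = 1000 / T_ns.
    """
    if cycle_ns <= 0:
        raise ValueError("cycle time must be positive")
    return 1000.0 / cycle_ns


print(cycle_time_to_mhz(4.0))  # 4 ns cycle time
```

    A 4 ns cycle therefore corresponds to a 250 MHz clock.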
  • Page 367: References To Model-Specific Information

    UltraSPARC User’s Manual G.3 References to Model-Specific Information Table G-1 lists the pages within the UltraSPARC User’s Manual that contain model-specific information. Table G-1 UltraSPARC Model-Specific Information Page Description Implementation technologies and cycle times Number of trap levels E-Cache sizes E-Cache SRAM modes System : Processor clock frequency ratios Support for the PREFETCH{A} instructions...
  • Page 368: G. Differences Between Ultrasparc Models

    VA encoding to access 8 and 16 Mb E-Cache tag/state/parity fields Number of bits in ECAT interface Number of bits in ECAD interface SCLK_MODE pin is present only in UltraSPARC-I LOOP_CAP pin present only in UltraSPARC-I PHASE_DET_CLK pin present only in UltraSPARC-II...
  • Page 370: Back Matter

    Back Matter Glossary ....................... 357 Bibliography ....................363 Index ......................367
  • Page 372: Glossary

    Glossary This glossary defines some important words and acronyms used throughout this manual. Italicized words within definitions are further defined elsewhere in the list. aliases: Two virtual addresses are aliases of each other if they refer to the same physical address.
  • Page 373 UltraSPARC User’s Manual CPI: Cycles per instruction. The number of clock cycles it takes to execute one instruction. cross call: An interprocessor call in a multi-processor system. current window: The block of 24 r registers to which the Current Window Pointer (CWP) register points.
  • Page 374 Glossary may: A key word indicating flexibility of choice with no implied preference. Memory Management Unit (MMU): An MMU is a mechanism that implements a policy for address translation and protection among contexts. See also virtual address, physical address, and context.
  • Page 375 UltraSPARC User’s Manual privileged: An adjective that describes (1) the state of the processor when PSTATE.PRIV=1, that is, privileged mode; (2) processor state that is only accessible to software while the processor is in privileged mode; e.g., privileged registers, privileged ASRs, or, in general, privileged state; (3) an instruction that can be executed only when the processor is in privileged mode.
  • Page 376 Glossary should: A key word indicating flexibility of choice with a strongly preferred implementation. The phrase “it is recommended” is used interchangeably with the key word should. side effect: A memory location is deemed to have side effects if additional actions beyond the reading or writing of data may occur when a memory operation on that location is allowed to succeed.
  • Page 377 UltraSPARC User’s Manual unassigned: A value (for example, an ASI number), the semantics of which are not architecturally mandated and which may be determined independently by each implementation (preferably within any guidelines given). undefined: An aspect of the architecture that has deliberately been left unspecified. Software should have no expectation of, nor make any assumptions about, an undefined feature or behavior.
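    The CPI definition above amounts to a simple ratio, which can be sketched as follows. The function name and the cycle and instruction counts are made-up illustrative values, not measurements from the manual.

```python
def cycles_per_instruction(cycles: int, instructions: int) -> float:
    """Average CPI: total clock cycles divided by instructions executed."""
    if instructions <= 0:
        raise ValueError("instruction count must be positive")
    return cycles / instructions


# Hypothetical run: 1500 cycles to retire 1000 instructions.
print(cycles_per_instruction(1500, 1000))
```

    On a superscalar processor that can issue several instructions per cycle, the average CPI measured this way can fall below 1.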
  • Page 378: Bibliography

    Bibliography General References Books [Weaver, David L., editor.] The SPARC Architecture Manual, Version 8, Prentice-Hall, Inc., 1992. Weaver, David L., and Tom Germond, eds. The SPARC Architecture Manual, Version 9, Prentice-Hall, Inc., 1994. IEEE Standard for Binary Floating-Point Arithmetic, IEEE Std 754-1985, IEEE, New York, NY, 1985.
  • Page 379: Sun Microelectronics (Sme) Publications

    World Wide Web. See “On Line Resources” below for information about the SME WWW pages. Data Sheets UltraSPARC-I Data Sheet (STP1030). UltraSPARC-I Data Buffer (UDB) Data Sheet (STP1080). UltraSPARC-I Crossbar Switch (XBI) Data Sheet (STP2230SOP). UltraSPARC-I UPA-To-SBUS Interface Data Sheet (STP2220BGA). UltraSPARC-I Reset/Interrupt/Clock Controller Data Sheet (STP2210QFP).
  • Page 380: How To Contact Sme

    The Sun Microelectronics WWW page is located at: http://www.sun.com/sparc It contains the latest information about the entire UltraSPARC product line, including HTML and PostScript copies of the UltraSPARC-I and UltraSPARC-II data sheets.
  • Page 382 Alternate Global Registers 252 deasserted for second cycle of two-cycle AM, see Address Mask (AM) field of PSTATE packet 88 register driven by UltraSPARC-I 88 Ancillary State Register (ASR) 156 during reset 88 last state 84 annex register file 14 maintained by holding amplifiers 88...
  • Page 383 raSPARC User’s Manual RAY8 instruction 222 ASI_SDBL_ERROR_REG 184 field of SFSR register 58 ASI_SECONDARY 34 , see Alternate Space Identifier (ASI) field of ASI_SECONDARY_LITTLE 34 SFSR register ASI_SECONDARY_NO_FAULT 36, 42, 49 to 51 _AS_IF_USER_PRIMARY 34, 50 ASI_SECONDARY_NO_FAULT_LITTLE 36, 42, _AS_IF_USER_PRIMARY_LITTLE 34 49, 51 _AS_IF_USER_SECONDARY 34, 50 ASIs that support atomic accesses 34...
  • Page 384 Index board-level interconnect testing and diagnosis 329 C Stage 276, 290, 292 boiundary scan register 336 C stage 269 boundary scan 329 cache boundary scan chain 334 direct mapped 274 boundary scan register 334 to 335 external 18 flushing 28 branch inclusion 28 mispredicted 14...
  • Page 385 raSPARC User’s Manual heable store misses modified, own, exclusive, shared, invalid (MOESI) 8 back-to-back 295 coherency transactions hing in power-down mode 327 TSB 45 coherent P_REQ 92 NRESTORE Register 240, 285 Coherent P_REQ transaction NSAVE Register 240, 285 packet format illustrated 140 acity misses 275 coherent read hit S instruction 35...
  • Page 386 Index copybacks data alignment 7, 273 cache line 77, 357 data byte addresses within quadword CopybackToDiscard transaction 108, 141 illustrated 76 Copy-Out Parity Error (CP) field of AFSR 181 Data Cache (D-Cache) 8, 14 hiding misses 8 Correctable ECC Error (CE) field of AFSR 181 illustrated 5 correctable error 179 miss 8...
  • Page 387 raSPARC User’s Manual hit rate 274 Demap Context operation 67 hit timing 292 dependency latency (pin-to-pin) 275 load use 269 line 273 to 274 dependency checking 289 load hit 292 to 293 destination register 360 load miss 292 Diag, see Diagnostics (Diag) field of TTE logical organization illustrated 272 Diagnostic (Diag) field of TTE 43 miss 291, 324...
  • Page 388 Index cacheable and noncacheable 33 E-Cache coherence states defined 94 DONE instruction 39, 252, 307 E-Cache coherency DSYN_WR_L pin 340 system responsibility 94 DSYN_WR_L signal 341 E-Cache Data Access Address Dtags 98 illustrated 315 Dtags (coherence sequence without them) 101 E-Cache Data Access Data Dtags (coherence sequence) 99 illustrated 316...
  • Page 389 raSPARC User’s Manual ATA pins 338 to 339 extended floating-point pipeline 11 ATA signals 341, 343 extended instructions 3, 253 e handling instructions 219 Extended Interrupt Target ID 117 external cache 4, 18 e mask encoding 220 little-endian 221 External Cache (E-Cache) 8, 14 GE16 instruction 219 External Cache Unit (ECU) 8 illustrated 5...
  • Page 390 Index fcc3, see Floating-Point Condition Code 3 (fcc3) field floating-point IEEE-754 exception 358 of FSR register floating-point multiplier 297 fccN 358 floating-point pipeline 7, 11 FCMPEQ instruction 218 floating-point queue 11 FCMPEQ16 instruction 217 floating-point register file 14 to 15, 19 FCMPEQ32 instruction 217 Floating-Point Registers State (FPRS) Register 244...
  • Page 391 raSPARC User’s Manual OT1S instruction 215 FPSUB32 instruction 199 OT2 instruction 215 FPSUB32S instruction 199 to 200 OT2S instruction 215 FPU Enabled (FEF) field of FPRS register 198, NE instruction 215 FQ, see floating-point deferred trap queue (FQ) 247 NES instruction 215 frame buffer 278 FSRC1 instruction 215 textual conventions 11...
  • Page 392 Index graphics instructions 293 illustrated 310 Graphics Status Register (GSR) 197, 304 I-Cache Instruction Access Data 310 illustrated 310 Graphics Unit (GRU) 7 I-Cache miss processing 265 illustrated 5 I-Cache organization 262 Group (G) Stage illustrated 262, 309 illustrated 11 I-Cache Predecode Field Access Address 311 group break 287 illustrated 311...
  • Page 393 7, 11 ruction grouping integer register file 15, 240, 284 anti-dependency constraints 282 interconnect master 102 input dependency constraints 282 UltraSPARC-I 74 output dependency constraints 282 interconnect packet formats 138 read-after-write dependency interconnect packet types constraints 282 illustrated 139...
  • Page 394 39 trap 116, 159, 162 to 163, 252 interrupt_vector internal ASIs 39 interrupter internal cache coherency UltraSPARC-I as 75 UltraSPARC-I responsibility 94 invalid_fp_register floating-point trap type 246, interprocessor call 358 Interrupt (P_INT_REQ) 116 Invalidate transaction 106, 141 Interrupt Disable (INT_DIS) field of TICK...
  • Page 395 359 d Data Parity Error (LDP) field of AFSR 181 MCAP pin 340 d hit bypassing load miss not support on UltraSPARC-I 277 trap 47, 49, 56, 58, 154, mem_address_not_aligned 159, 226, 228 to 229, 231, 238, 273, 303...
  • Page 396 Index MEMBAR examples MMU demap 66 and memory ordering 31 MMU demap context operation 66, 68 MEMBAR instruction 31 to 32, 38, 258 MMU demap operation format memory access instructions 225 illustrated 66 memory accesses MMU demap page operation 66, 68 global visibility 31 MMU dTLB Tag Access Register memory ECC error 182...
  • Page 397 raSPARC User’s Manual LD8SUx16 instruction 212 next program counter 359 LD8ULx16 instruction 213 NFO bit in MMU 36 lticycle instructions 289 NFO page attribute bit 280 ltiflow TRACE and Cydrome Cydra-5 280 NFO, see No-Fault Only (NFO) field of TTE ltiple bit ECC error 176 No Dual Tag Present (NDP) option 93 ltiple Error (ME) field of AFSR 181...
  • Page 398 Index and TLB miss 36 Number of Slave Reads (ONEREAD) field of UPA_PORT_ID register 153 Non-faulting loads 248 Number of Writebacks (WB) field of UPA_ non-faulting loads 36, 280 CONFIG register 155 non-privileged 359 NWINDOWS 240, 242, 359 non-privileged mode 359 Non-privileged Trap (NPT) field of TICK register 239 nonrestricted ASI 146...
  • Page 399 raSPARC User’s Manual NCBRD_REQ 110, 118, 122, 126, 141 P_SNACK transaction 93 NCBWR_REQ 111, 122, 127, 141 P_WRB_REQ 95 to 97, 101, 104, 113, 115, 120, 122, 128, 135, 138, 141 NCRD_REQ 109, 118 to 120, 122, 126 to 127, 141 to 142 P_WRI_REQ 95 to 96, 101, 105 to 106, 122, 127, 141 to 144...
  • Page 400 Index PCON, see Processor Configuration (PCON) field of Physical Address Data Watchpoint Write Enable UPA_CONFIG register (PW) field of LSU_Control_Register 308 physical address space PContext field 57 accessing 145 PCR Cycle_cnt function 321 size 3 PCR DC_hit function 323 physical memory 362 PCR DC_ref function 323 physical page attribute bits PCR Dispatch0_dyn_use function 323...
  • Page 401 raSPARC User’s Manual el distance 7 Primary Context Register 57 el orderings 197 PRIV, see Privileged (PRIV) field of PCR register Privilege (PRIV) field of AFSR 177 L_BYPASSS signal 343 privilege (PRIV) field of PSTATE register 180 LBYPASS signal 342 , see Physical Address Data Watchpoint Mask privilege violation 60 (PM) field of LSU_Control_Register...
  • Page 402 Index Register (R) Stage 14 register file qne, see Queue Not Empty (qne) field of FSR register annex 14 quad-precision floating-point instructions 244 floating-point 14 to 15, 19 quadword ordering 76 integer 15 queue Register Stage floating-point 11 illustrated 11 Queue Not Empty (qne) field of FSR register 247 register window 7 Relaxed Memory Order (RMO) 280...
  • Page 403 raSPARC User’s Manual S_SRS 120 TVaddr 171, 236 S_SWIB 116, 120, 122 S_WAB 97, 105, 113, 115, 117, 120, 122, 129, 135 S_WAS 110 to 111, 120, 122, 129 S_WBCAN 97, 101, 105, 113, 115, 120 to 122, 125, ERR 111 129, 137 to 138 BP_REQ 122 S0, see Select Code 0 (S0) field of PCR register...
  • Page 404 ECC error 178 speculative load to page marked with E-bit 31 Size, see Page Size (Size) field of TTE speculative loads slave support for 4 UltraSPARC-I as 75 trap 159 spill_n_normal Slave Interface (valid S_REPLY & P_REPLY trap 159 spill_n_other types) 130 Split field of TSB register 62...
  • Page 405 raSPARC User’s Manual see System Trace (ST) field of PCR register in E-Cache 77 ble storage 28 to 29 SYSADDR pins 339 e transition invariants 95 SYSADDR bus 85, 87, 92, 116, 119, 138 to 139, 143 arbitration protocol 84 AR (SPARC-V8) 32 current driver 84 equivalent to MEMBAR...
  • Page 406 Index reserved fields 235 TICK_CMPR, see Tick Compare (TICK_CMPR) field of TICK_compare register TCK IEEE 1149.1 signal 330 TICK_CMPR_REG register 157 TCK pin 338, 341 TICK_INT 167, 250 TCK signal 342 to 343 TICK_REG Ancillary State Register (ASR) 156 TDATA pins 339 Timeout 122 TDATA signals 341 TL Register 285...
  • Page 407 _Base field of TSB Register 61 UltraSPARC-I block diagram 5 _Base, see Base Address (TSB_Base) field of TSB UltraSPARC-I Data Buffer (UDB) 10, 74, 127, 175, register...
  • Page 408 156 interaction with E-Cache 76 UPA_Slave_Int_L signal interface pins defined 337 unused in UltraSPARC-I 153 UltraSPARC-I Data Buffer (UDB) Error UPACAP, see UPA Capabilities (UPACAP) field of Register 186 UPA_PORT_ID register UltraSPARC-I extended instructions 253 UPACAP, see UPA Capabilities (UPACAP) subfield...
  • Page 409 raSPARC User’s Manual ual color 28 to 29 Writeback transaction 104, 114, 119, 136 to 137, ual noncacheable accesses 18 cancellation 114 to 115 ual page number 21 WritebackInvalidate transaction 141 ual_address_data_watchpoint_mask 308 writebacks ually cacheable 28 cache line 77 ually indexed, physically tagged (VIPT) 272 write-invalidate cache coherency protocol 98 cache 8...

This manual is also suitable for:

Ultrasparc-ii
