Summary of Contents for Digital Equipment Alpha 21164PC
Digital Semiconductor Alpha 21164PC Microprocessor
Hardware Reference Manual
Order Number: EC–R2W0A–TE
Revision/Update Information: This is a preliminary document.
Digital Equipment Corporation
Maynard, Massachusetts
http://www.digital.com/semiconductor...
Digital Equipment Corporation makes no representations that the use of its products in the manner described in this publication will not infringe on existing or future patent rights, nor do the descriptions contained in this publication imply the granting of licenses to make, use, or sell equipment or software in accordance with the description.
6–4 HW_MFPR and HW_MTPR Instruction Format
9–1 osc_clk_in_h,l Input Network and Terminations
9–2 Impedance vs Clock Input Frequency
Tables
2–1 Effect of Branching Instructions on the Branch-Prediction Stack
2–2 Pipeline Examples—All Cases
2–3 Pipeline Examples—Integer Add
5–22 Dcache Test Tag Control Register Fields
5–23 Dcache Test Tag Register Fields
A–4 Opcodes Reserved for PALcode
A–5 IEEE Floating-Point Instruction Function Codes
Preface
This manual provides information about the architecture, internal design, external interface, and specifications of the Digital Semiconductor Alpha 21164PC microprocessor (referred to as the 21164PC) and its associated software.
Audience
This reference manual is for system designers and programmers who use the 21164PC.
• Chapter 7, Initialization and Configuration, describes the initialization and configuration sequence.
• Chapter 8, Error Detection and Error Handling, describes error detection and error handling.
• Chapter 9, Electrical Data, provides electrical data and describes signal integrity issues.
• Chapter 10, Thermal Management, provides information about thermal management...
Conventions
This section defines product-specific terminology, abbreviations, and other conventions used throughout this manual.
Abbreviations
• Binary Multiples
The abbreviations K, M, and G (kilo, mega, and giga) represent binary multiples and have the following values.
K = 2^10 (1024)
M = 2^20 (1,048,576)
G = 2^30 (1,073,741,824)
For example: = 2 ×...
RC — Read To Clear A register field specified as RC is written by hardware and remains unchanged until read. The value may be read by software, at which point, hardware may write a new value into the field. RES — Reserved Bits and fields specified as RES are reserved by Digital Semiconductor and should not be used;...
Bit Notation Multiple-bit fields can include contiguous and noncontiguous bits contained in angle brackets (<>). Multiple contiguous bits are indicated by a pair of numbers separated by a colon (:). For example, <9:7,5,2:0> specifies bits 9,8,7,5,2,1, and 0. Similarly, single bits are frequently indicated with angle brackets. For example, <27> specifies bit 27.
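The bit notation described above can be expanded mechanically. The following is a minimal sketch, not part of the manual: the function names parse_bit_field and field_mask are hypothetical helpers chosen for illustration.

```python
from typing import List


def parse_bit_field(spec: str) -> List[int]:
    """Expand a field such as '<9:7,5,2:0>' into bit numbers, high to low."""
    body = spec.strip()
    if not (body.startswith("<") and body.endswith(">")):
        raise ValueError(f"expected angle brackets around field: {spec!r}")
    bits: List[int] = []
    for part in body[1:-1].split(","):
        if ":" in part:
            # A colon-separated pair names a contiguous range, high bit first.
            high, low = (int(x) for x in part.split(":"))
            if high < low:
                raise ValueError(f"range must be written high:low: {part!r}")
            bits.extend(range(high, low - 1, -1))
        else:
            # A bare number names a single bit.
            bits.append(int(part))
    return bits


def field_mask(spec: str) -> int:
    """Return an integer with a 1 in every bit position named by the field."""
    mask = 0
    for bit in parse_bit_field(spec):
        mask |= 1 << bit
    return mask
```

With the manual's own example, parse_bit_field("<9:7,5,2:0>") yields bits 9, 8, 7, 5, 2, 1, and 0, and parse_bit_field("<27>") yields bit 27 alone.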
Security Holes
Security holes exist when unprivileged software (that is, software that is running outside of kernel mode) can:
• Affect the operation of another process without authorization from the operating system.
• Amplify its privilege without authorization from the operating system.
•...
• An UNPREDICTABLE result may acquire an arbitrary value subject to a few constraints. Such a result may be an arbitrary function of the input operands or of any state information that is accessible to the process in its current access mode. UNPREDICTABLE results may be unchanged from their previous values.
Equipment Corporation’s RISC (reduced instruction set computing) architecture designed for high performance. The chapter then summarizes the specific features of the Digital Semiconductor Alpha 21164PC microprocessor (hereafter called the 21164PC) that implements the Alpha architecture. Appendix A provides a list of Alpha instructions.
The Architecture
The 21164PC uses a set of subroutines, called privileged architecture library code (PALcode), that is specific to a particular Alpha operating system implementation and hardware platform. These subroutines provide operating system primitives for context switching, interrupts, exceptions, and memory management. These subroutines can be invoked by hardware or CALL_PAL instructions.
1.1.2 Integer Data Types
Alpha architecture supports four integer data types.
Data Type  Description
Byte  A byte is eight contiguous bits that start at an addressable byte boundary. A byte is an 8-bit value. A byte is supported in Alpha architecture by the EXTRACT, INSERT, LDBU, MASK, SEXTB, STB, ZAP, PACK, UNPACK, MIN, MAX, and PERR instructions.
• VAX floating-point formats
  – F_floating
  – G_floating
  – D_floating (limited support)
1.2 21164PC Microprocessor Features
The 21164PC is a superscalar pipelined processor manufactured using 0.35-µm CMOS technology. It is packaged in a 413-pin IPGA carrier and has removable application-specific heat sinks.
• An onchip, dual-read-ported, 8KB data cache.
• An onchip write buffer with six 32-byte entries.
• A 128-bit data bus with onchip parity and offchip longword parity.
• Support for an external second-level cache. The size and access time of the external second-level cache are programmable.
Internal Architecture
This chapter provides both an overview of the 21164PC microarchitecture and a system designer’s view of the 21164PC implementation of the Alpha architecture. The combination of the 21164PC microarchitecture and privileged architecture library code (PALcode) defines the chip’s implementation of the Alpha architecture. If a certain piece of hardware seems to be “architecturally incomplete,”...
2.1 21164PC Microarchitecture
The 21164PC microprocessor is a high-performance implementation of Digital Equipment Corporation’s Alpha architecture. Figure 2–1 is a block diagram of the 21164PC that shows the major functional blocks relative to pipeline stage flow. The following paragraphs provide an overview of the chip’s architecture and major functional units.
  – Instruction translation buffer
  – Branch prediction
  – Instruction slotting/issue
  – Interrupt support
• Integer execution unit (IEU) (Section 2.1.2)
• Floating-point execution unit (FPU) (Section 2.1.3)
• Memory address translation unit (MTU) (Section 2.1.4), which includes:
  – Data translation buffer (DTB)
  –...
2.1.1.1 Instruction Decode and Issue
The IDU decodes up to four instructions in parallel and checks that the required resources are available for each instruction. The IDU issues only the instructions for which all required resources are available. The IDU does not issue instructions out of order, even if the resources are available for a later instruction and not for an earlier one.
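The in-order issue rule can be illustrated with a short sketch. This is a simplification under stated assumptions: each instruction in the decoded group is reduced to a single boolean meaning "all required resources are available", and the hypothetical function issue_in_order ignores slotting and every other issue rule in this chapter.

```python
from typing import Sequence

MAX_GROUP = 4  # the IDU decodes up to four instructions in parallel


def issue_in_order(resources_ready: Sequence[bool]) -> int:
    """Count how many instructions issue this cycle, in program order.

    resources_ready[i] is True when instruction i has everything it needs.
    Issue stops at the first instruction that is not ready, because a later
    instruction may not pass an earlier one.
    """
    if len(resources_ready) > MAX_GROUP:
        raise ValueError("at most four instructions are decoded per group")
    issued = 0
    for ready in resources_ready:
        if not ready:
            break  # everything behind the blocked instruction waits too
        issued += 1
    return issued
```

For a group where only the third instruction lacks a resource, two instructions issue and the fourth waits even though its own resources are available.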
Prefetching does not begin until there is a “true” miss. A true miss is a reference that misses in the Icache and then also misses in the refill buffer. If an Icache miss results in a refill buffer hit, prefetching is not started until all the data has been moved from the refill buffer entry into the pipeline.
on the top two count values and is predicted not-taken on the bottom two count values. The history status is not initialized on Icache fill; therefore, it may “remember” a branch that was evicted from the Icache and subsequently reloaded. The 21164PC does not limit the number of branch predictions outstanding to one.
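The prediction rule above (taken on the top two count values, not-taken on the bottom two) matches a 2-bit counter. The sketch below is illustrative only. The update rule, a saturating increment on taken and decrement on not-taken, is an assumption that the surviving text does not state, and the class name BranchHistory is hypothetical.

```python
class BranchHistory:
    """A 2-bit branch history counter with values 0 through 3."""

    def __init__(self, count: int = 0) -> None:
        if not 0 <= count <= 3:
            raise ValueError("a 2-bit counter holds values 0 through 3")
        self.count = count

    def predict_taken(self) -> bool:
        # Top two values (2, 3) predict taken; bottom two (0, 1) predict not-taken.
        return self.count >= 2

    def update(self, taken: bool) -> None:
        # Assumed update rule: saturate at 3 going up and at 0 going down.
        if taken:
            self.count = min(3, self.count + 1)
        else:
            self.count = max(0, self.count - 1)
```

Because the history is not initialized on Icache fill, a counter like this keeps whatever value it last held when a block is evicted and reloaded.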
The RET, JSR_COROUTINE, and HW_REI instructions predict the next PC by using the index from the subroutine return stack. The upper bits of the PC are formed from the data in the Icache tag at that index. These predictions are checked against the actual PC in exactly the same way that JMP and JSR predictions are checked.
• One superpage maps virtual address bits <39:13> to physical address bits <39:13>, on a one-to-one basis, when virtual address bits <42:41> equal 2. This maps the entire physical address space four times over to the quadrant of the virtual address space.
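The superpage rule for virtual address bits <42:41> equal to 2 can be sketched as follows. This is a minimal model under assumptions: the page-offset bits <12:0> are assumed to pass through unchanged, virtual address bit <40> is assumed to be ignored, and the ICSR<SPE> enable is not modeled. The function name superpage_translate is hypothetical.

```python
from typing import Optional

PA_MASK = (1 << 40) - 1  # keeps address bits <39:0>


def superpage_translate(va: int) -> Optional[int]:
    """Return the physical address for a superpage hit, or None otherwise."""
    if (va >> 41) & 0x3 != 2:
        return None  # VA<42:41> is not 2, so this superpage does not apply
    # VA<39:13> maps one-to-one onto PA<39:13>; the offset <12:0> is
    # assumed to be carried through unchanged.
    return va & PA_MASK
```

A return value of None here only means that this particular superpage does not match; the second superpage region and the normal translation buffer lookup are outside the sketch.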
Each interrupt source, or group of sources, is assigned an interrupt priority level (IPL), as shown in Table 4–11. The current IPL is set using the IPLR register (see Section 5.1.18). Any interrupts that have an equal or lower IPL are masked. When an interrupt occurs that has an IPL greater than the value in the IPLR register, program control passes to the INTERRUPT PALcode entry point.
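The masking rule reduces to a strict comparison against the current IPL. The sketch below is illustrative: the function names are hypothetical, the IPL values in the usage note are arbitrary numbers rather than entries from Table 4–11, and choosing the highest deliverable level among several pending requests is an assumption.

```python
from typing import Iterable, Optional


def interrupt_taken(request_ipl: int, current_ipl: int) -> bool:
    """An interrupt is delivered only if its IPL exceeds the IPLR value."""
    return request_ipl > current_ipl  # equal or lower levels are masked


def highest_deliverable(pending_ipls: Iterable[int], current_ipl: int) -> Optional[int]:
    """Pick the highest unmasked pending IPL, or None if all are masked."""
    deliverable = [ipl for ipl in pending_ipls if interrupt_taken(ipl, current_ipl)]
    return max(deliverable) if deliverable else None
```

For example, with the current IPL at 22, pending requests at levels 20, 22, and 23 leave only level 23 deliverable.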
The floating-point divide unit is associated with the floating-point add pipeline but is not pipelined. The FPU can accept two instructions every cycle, with the exception of floating-point divide instructions. The result latency for nondivide, floating-point instructions is four cycles.
The DTB also supports the optional superpage extensions that are enabled using ICSR<SPE>. The DTB superpage maps provide virtual-to-physical address translation for two regions of the virtual address space, as described in Section 2.1.1.4. PALcode fills and maintains the DTB. The operating system, using PALcode, must ensure that virtual addresses are mapped either through a single DTB entry or through superpage mapping.
A load instruction that is issued one cycle after a store instruction in the pipeline creates a conflict if both the load and store operations access the same memory location. (The store instruction has not yet updated the location when the load instruction reads it.) This conflict is handled by forcing the load instruction to take a replay trap;...
Pipeline Organization 2.1.6.1 Data Cache The data cache (Dcache) is a dual-read-ported, single-write-ported, 8KB cache. It is a write-through, read-allocate, direct-mapped, byte-accessible, physical cache with 32-byte blocks and data parity at the byte level. 2.1.6.2 Instruction Cache The instruction cache (Icache) is a 16KB, virtual, direct-mapped cache with 64-byte blocks and 32-byte fills.
Pipeline Organization Table 2–2 Pipeline Examples—All Cases Pipeline Stage Events Access Icache tag and data. Buffer four instructions, check for branches, calculate branch displace- ments, and check for Icache hit. Slot-swap instructions around so they are headed for pipelines capable of executing them.
Pipeline Organization Table 2–5 Pipeline Examples—Load (Dcache Hit) Pipeline Stage Events Calculate the effective address. Begin the Dcache data and tag store access. Finish the Dcache data and tag store access. Detect Dcache hit. Format the data as required. Bcache arbitration defaults to pipe E0 in anticipation of a possible miss.
Pipeline Organization Table 2–7 Pipeline Examples—Store (Dcache Hit) Pipeline Stage Events Calculate the effective address. Begin the Dcache tag store access. Finish the Dcache tag store access. Detect Dcache hit. Send store to the write buffer simultaneously. Write the Dcache data store if hit (write begins this cycle). 2.2.1 Pipeline Stages and Instruction Issue The 21164PC pipeline divides instruction processing into four static and a number of dynamic stages of execution.
The nonexception case does not need to drain the pipeline of all outstanding instructions ahead of the aborting instruction. The pipeline can be restarted immediately at a redirected address. Examples of nonexception abort conditions are branch mispredictions, subroutine call/return mispredictions, and replay traps.
Scheduling and Issuing Rules
2.2.3 Nonissue Conditions
There are two reasons for nonissue conditions. The first is a pipeline stall wherein a valid instruction or set of instructions is prepared to issue but cannot due to a resource conflict (register conflict or function unit conflict). These types of nonissue cycles can be minimized through code scheduling.
Scheduling and Issuing Rules Table 2–8 Instruction Classes and Slotting (Sheet 2 of 3) Class Name Pipeline Instruction List MXPR E0 or E1 HW_MFPR, HW_MTPR (depends on the IPR) Integer conditional branches Floating-point conditional branches Jump-to-subroutine instructions: JMP, JSR, RET, or JSR_COROUTINE, BSR, BR, HW_REI, CALLPAL IADD E0 or E1...
Scheduling and Issuing Rules Table 2–8 Instruction Classes and Slotting (Sheet 3 of 3) Class Name Pipeline Instruction List FCPYS FM or FA CPYS, not including CPYSN or CPYSE MISC RPCC, TRAPB UNOP None UNOP IEU pipeline 0. IEU pipeline 1. FEU add pipeline.
• An instruction of class LD cannot be issued simultaneously with an instruction of class ST.
• All instructions are discarded at the slotting stage after a predicted-taken IBR or FBR class instruction, or a JSR class instruction.
•...
Scheduling and Issuing Rules Instructions [a] (the LDL) and [b] (the first ADDL) in the following example are slotted together. Instruction [b] stalls (split-issue), thus preventing instruction [c] from advancing to the issue stage: Code example showing Code example showing incorrect ordering correct ordering (1) [a] LDL...
Scheduling and Issuing Rules Table 2–9 Instruction Latencies (Sheet 1 of 2) Additional Time Before Result Available to Class Latency Integer Multiply Unit Dcache hits, latency=2. 1 cycle Dcache miss/Bcache hit, latency=10 or longer. Store operations produce no result. — LDx_L Dcache hits, latency=2.
Scheduling and Issuing Rules Table 2–9 Instruction Latencies (Sheet 2 of 2) Additional Time Before Result Available to Class Latency Integer Multiply Unit IMULQ Latency=12, plus up to 2 cycles of added latency, depending on 1 cycle the source of the data. Latency until next IMULL, IMULQ, or IMULH instruction can issue (if there are no data dependencies) is 8 cycles plus the number of cycles added to the latency.
2.3.3.1 Producer–Producer Latency
Producer–producer latency, also known as a write-after-write conflict, causes issue stalls to preserve write order. If two instructions write the same register, they are forced to do so in different cycles by the IDU. This is necessary to ensure that the correct result is left in the register file after both instructions have executed.
2.3.4 Issue Rules
The following is a list of conditions that prevent the 21164PC from issuing an instruction:
• No instruction can be issued until all of its source and destination registers are clean; that is, all outstanding write operations to the destination register are guaranteed to complete in issue order and there are no outstanding write operations to the source registers, or those write operations can be bypassed.
Replay Traps
• No instruction can be issued to pipe E0 or E1 exactly two cycles before an integer register fill is requested (speculatively) by the CBU, except IMULL, IMULQ, and IMULH instructions and instructions that do not produce any result.
Miss Address File and Load-Merging Rules • Load-after-store trap: A replay trap occurs if a load instruction is issued in the cycle immediately following a store instruction that hits in the Dcache, and both access the same location. The address match is exact for address bits <12:2> (longword granularity), but ignores address bits <42:13>.
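The address comparison used by the load-after-store trap can be written as a mask over bits <12:2>. The sketch below covers only that comparison. It does not model the other conditions (the store hitting in the Dcache and the load issuing in the immediately following cycle), and the names MATCH_MASK and load_after_store_match are hypothetical.

```python
# Bits <12:2>: everything below bit 13, with the two low byte-offset bits cleared.
MATCH_MASK = ((1 << 13) - 1) & ~0x3


def load_after_store_match(store_addr: int, load_addr: int) -> bool:
    """True when the two addresses agree in bits <12:2> (longword granularity).

    Bits <42:13> are ignored, so addresses that differ only in those upper
    bits still compare as a match.
    """
    return (store_addr & MATCH_MASK) == (load_addr & MATCH_MASK)
```

One consequence of ignoring bits <42:13> is that two accesses to different pages with the same offset within the low 8KB compare as equal.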
Miss Address File and Load-Merging Rules • Merging is prevented for the MAF entry after the first data fill (to that MAF entry) from the Bcache, regardless of whether the Bcache access hits or not. • Load misses that match any MAF address down to the INT32 boundary, but could not merge (for any reason), are replay trapped.
Miss Address File and Load-Merging Rules A bypass is provided so that if the load instruction issues in IEU pipe E0, and no MAF requests are pending, the load instruction’s read request is sent to the CBU immediately, provided the CBU is ready for such an access. Similarly, if a load instruction from IEU pipe E1 misses, and there was no load instruction in pipe E0 to begin with, the E1 load miss is sent to the CBU immediately.
MTU Store Instruction Execution
Up to two floating or integer registers may be written for each CBU fill cycle. Fills deliver 32 bytes in two cycles: two INT8s per cycle. The MAF merging rules ensure that there is no more than one register to write for each INT8, so that there is a register file write port available for each INT8.
Write Buffer and the WMB Instruction
A load instruction that is issued one cycle after a store instruction in the pipeline creates a conflict if both access exactly the same memory location. This occurs because the store instruction has not yet updated the location when the load instruction reads it.
Write Buffer and the WMB Instruction 2.7.1 The Write Buffer The write buffer contains six fully associative 32-byte entries. The purpose of the write buffer is to minimize the number of CPU stall cycles by providing a finite, high-bandwidth resource for receiving store data. This is required because the 21164PC can generate store data at the peak rate of one INT8 every CPU cycle.
Write Buffer and the WMB Instruction Each time the write buffer is presented with a store instruction, the physical address generated by the instruction is compared to the address in each valid write buffer entry that is open for merging. If the address is in the same INT32 as an address in a valid write buffer entry (that also contains a store instruction), and the entry is open for merging, then the new store data is merged into that entry and the entry’s byte mask bits are updated.
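The merge check described above can be modeled in a few lines. This is a sketch under assumptions, not a description of the hardware: allocation order, the "full" result when all six entries are in use, and the rule that a store must not cross a 32-byte boundary are illustrative choices, and the names Entry and WriteBuffer are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

ENTRY_BYTES = 32   # each write buffer entry covers one INT32 (32 bytes)
NUM_ENTRIES = 6    # the write buffer has six entries


@dataclass
class Entry:
    block: int                  # physical address divided by 32
    byte_mask: int = 0          # one bit per valid byte within the entry
    open_for_merge: bool = True


@dataclass
class WriteBuffer:
    entries: List[Entry] = field(default_factory=list)

    def store(self, paddr: int, size: int) -> str:
        block, offset = divmod(paddr, ENTRY_BYTES)
        if size <= 0 or offset + size > ENTRY_BYTES:
            raise ValueError("store must fit within one 32-byte entry")
        mask = ((1 << size) - 1) << offset
        # Compare against every valid entry that is still open for merging.
        for entry in self.entries:
            if entry.open_for_merge and entry.block == block:
                entry.byte_mask |= mask   # merge data, update byte mask bits
                return "merged"
        if len(self.entries) >= NUM_ENTRIES:
            return "full"                 # assumed outcome; a real store would wait
        self.entries.append(Entry(block, mask))
        return "allocated"
```

Two 8-byte stores to addresses 0x1000 and 0x1008 fall in the same INT32, so the second merges into the entry created by the first and the entry's byte mask then covers 16 bytes.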
Performance Measurement Support–Performance Counters • The number of entries in the write buffer exceeds the number programmed in MAF_MODE<WB_CLR_LO_THRESH>. This ensures that these instructions complete as quickly as possible. The MTU requests that a write buffer entry be processed every 264 cycles (provided there is a valid entry in the write buffer), even if the write buffer is not arbitrating.
Performance Measurement Support–Performance Counters therefore, the exception PC might not reflect the exact instruction causing counter overflow. Three counters are provided to allow accurate comparison of two variables under a potentially nonrepeatable experimental condition. The three counters are designated counter 0 (16 bits), counter 1 (16 bits), and counter 2 (14 bits). Counter inputs include: •...
Performance Measurement Support–Performance Counters Read and write requests can be to either cacheable or I/O space addresses, but the CBU performance counters only count requests to cacheable address space. The total number of read requests is equal to the sum of the Dstream read requests and the Istream read requests.
Performance Measurement Support–Performance Counters Misses in the onchip caches can merge in the MTU before being issued to the CBU. Therefore, MTU read or write requests are not the same as onchip cache misses. Also, two Bcache misses can merge in the CBU and appear on the system bus as a single READ MISS request.
2.9 Floating-Point Control Register
Figure 2–3 shows the format of the floating-point control register (FPCR) and Table 2–10 describes the fields.
Figure 2–3 Floating-Point Control Register (FPCR) Format
[Figure: FPCR bit layout. Labeled fields include INVD, DZED, OVFD, DYN_RM, UNDZ, UNFD, and INED, together with RAZ/IGN regions.]
Table 2–10 Floating-Point Control Register Bit Descriptions (Sheet 1 of 2)
Table 2–10 Floating-Point Control Register Bit Descriptions (Sheet 2 of 2)
Name Extent Description (Meaning When Set)
DYN_RM <59:58> Dynamic rounding mode. Indicates the rounding mode to be used by an IEEE floating-point operate instruction when the instruction’s function field specifies dynamic mode (/D).
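Extracting the DYN_RM field from a 64-bit FPCR value is a shift and a mask. The sketch below shows only the field position given in the table, bits <59:58>. The meaning of each 2-bit encoding is not reproduced in this excerpt and is deliberately left out, and the function names are hypothetical.

```python
DYN_RM_SHIFT = 58   # DYN_RM occupies FPCR<59:58>
DYN_RM_MASK = 0x3   # two bits wide


def fpcr_dyn_rm(fpcr: int) -> int:
    """Return the 2-bit DYN_RM field of a 64-bit FPCR value."""
    return (fpcr >> DYN_RM_SHIFT) & DYN_RM_MASK


def fpcr_with_dyn_rm(fpcr: int, mode: int) -> int:
    """Return a copy of fpcr with DYN_RM replaced by mode (0 through 3)."""
    if not 0 <= mode <= DYN_RM_MASK:
        raise ValueError("DYN_RM is a 2-bit field")
    cleared = fpcr & ~(DYN_RM_MASK << DYN_RM_SHIFT)
    return cleared | (mode << DYN_RM_SHIFT)
```

The setter leaves every other FPCR bit untouched, which matters because the same register also holds the trap-disable and status bits shown in Figure 2–3.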
2.10 Design Examples
The 21164PC can be designed into many different uniprocessor system configurations. Figure 2–4 illustrates one possible configuration. This configuration employs additional system/memory controller chipsets.
Figure 2–4 shows a typical uniprocessor system with a board-level cache. This system configuration could be used in standalone or networked workstations.
Hardware Interface This chapter contains the 21164PC microprocessor logic symbol and provides a list of signal names and their functions. 3.1 21164PC Microprocessor Logic Symbol Figure 3–1 shows the logic symbol for the 21164PC chip. 29 September 1997 – Subject To Change Hardware Interface 3–1...
3.2 21164PC Signal Names and Functions
The 21164PC is contained in a 413-pin interstitial pin grid array (IPGA) package. There are 264 functional signal pins, 2 spare signal pins (unused), 5 voltage reference pins (unused), 46 external power (Vdd) pins, 22 internal power (Vddi) pins, and 74 ground (Vss) pins.
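As a quick consistency check, the pin groups listed above should account for every pin on the package. The sketch below simply adds them up; the dictionary keys are descriptive labels chosen for illustration.

```python
PACKAGE_PINS = 413  # pins on the IPGA package

PIN_GROUPS = {
    "functional signal": 264,
    "spare signal (unused)": 2,
    "voltage reference (unused)": 5,
    "external power (Vdd)": 46,
    "internal power (Vddi)": 22,
    "ground (Vss)": 74,
}

# Sum of all listed groups; this should equal the package pin count.
total_pins = sum(PIN_GROUPS.values())
```

The groups sum to 413, so the breakdown is complete.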
The remaining two tables describe the function of each 21164PC external signal. Table 3–1 lists all signals in alphanumeric order. This table provides full signal descriptions. Table 3–2 lists signals by function and provides an abbreviated description.
Table 3–1 21164PC Signal Descriptions (Sheet 2 of 10)
Signal Type Count Description
clk_mode_h<1:0> Clock test mode. These signals specify a relationship between osc_clk_in_h,l, the CPU cycle time, and the duty-cycle equalizer. These signals should be deasserted in normal operation mode.
Table 3–1 21164PC Signal Descriptions (Sheet 3 of 10)
Signal Type Count Description
cmd_h<3:0> Command bus. These signals drive and receive the commands from the command bus. The following tables define the commands that can be driven on the cmd_h<3:0> bus by the 21164PC or the system.
21164PC Signal Names and Functions Table 3–1 21164PC Signal Descriptions (Sheet 4 of 10) Signal Type Count Description System Commands to 21164PC: cmd_h <3:0> Command Meaning 0000 Nothing. 0001 FLUSH Removes block from caches; return dirty data. 0010 INVALIDATE Invalidates the block from caches.
21164PC Signal Names and Functions Table 3–1 21164PC Signal Descriptions (Sheet 5 of 10) Signal Type Count Description data_ram_we_l<3:0> Data RAM write-enable. These signals are asserted for any Bcache write operation. Refer to Section 5.3.1 for timing details. dc_ok_h dc voltage OK. Must be deasserted until dc voltage reaches proper operating level.
21164PC Signal Names and Functions Table 3–1 21164PC Signal Descriptions (Sheet 6 of 10) Signal Type Count Description int4_valid_h<3:0> INT4 data valid. During write operations to noncached space, these signals are used to indicate which INT4 bytes of data are valid.
21164PC Signal Names and Functions Table 3–1 21164PC Signal Descriptions (Sheet 7 of 10) Signal Type Count Description When addr_h<39> is asserted, the int4_valid_h<3:0> signals are considered the addr_h<3:0> bits required for byte/word transactions. The functionality of these bits is tied to the value stored in addr_h<38:37>.
21164PC Signal Names and Functions Table 3–1 21164PC Signal Descriptions (Sheet 8 of 10) Signal Type Count Description irq_h<3:0> System interrupt requests. These signals have multiple modes of operation. During normal operation, these level-sensitive signals are used to signal interrupt requests. During initializa- tion, these signals are used to set up the CPU cycle time divi- sor for sys_clk_out1_h as follows: irq_h<3>...
Table 3–1 21164PC Signal Descriptions (Sheet 9 of 10)
Signal Type Count Description
pwr_fail_irq_h Power failure interrupt request. This signal has multiple modes of operation. During initialization, this signal is used to set up sys_clk_out2_h delay (see Table 4–3). During normal operation, this signal is used to signal a power failure.
21164PC Signal Names and Functions Table 3–1 21164PC Signal Descriptions (Sheet 10 of 10) Signal Type Count Description tag_data_h<32:19> Bcache tag data bits. This bit range supports .5MB to 4MB Bcaches. tag_data_par_h Tag data parity bit. This signal indicates odd parity for tag_data_h<32:19>.
21164PC Signal Names and Functions Table 3–2 lists signals by function and provides an abbreviated description. Table 3–2 21164PC Signal Descriptions by Function (Sheet 1 of 3) Signal Type Count Description Clocks clk_mode_h<1:0> Clock test mode. cpu_clk_out_h CPU clock output. osc_clk_in_h,l Oscillator clock inputs.
21164PC Signal Names and Functions Table 3–2 21164PC Signal Descriptions by Function (Sheet 2 of 3) Signal Type Count Description System Interface addr_h<39:4> Address bus. addr_bus_req_h Address bus request. addr_res_h<1:0> Address response. cack_h Command acknowledge. cmd_h<3:0> Command bus. dack_h Data acknowledge. data_bus_req_h Data bus request.
21164PC Signal Names and Functions Table 3–2 21164PC Signal Descriptions by Function (Sheet 3 of 3) Signal Type Count Description srom_oe_l Serial ROM output enable. srom_present_l Serial ROM present. tck_h JTAG boundary-scan clock. tdi_h JTAG serial boundary-scan data in. tdo_h JTAG serial boundary-scan data out.
Clocks, Cache, and External Interface This chapter describes the 21164PC microprocessor external interface, which includes the backup cache (Bcache) and system interfaces. It also describes the clock circuitry, interrupt signals, and parity generation. It is organized as follows: • Introduction to the external interface •...
4.1 Introduction to the External Interface
A 21164PC-based system can be divided into three major sections:
• 21164PC microprocessor
• External Bcache
• System interface logic
The 21164PC external interface is optimized for uniprocessor-based systems and mandates few design rules.
Introduction to the External Interface The BIU contains a three-entry BIU command/address buffer (BAF) capable of queueing up to three Bcache misses or I/O references. These buffers are capable of merging both read and write miss references, to reduce external system bus traffic. 4.1.2 Bcache Interface The 21164PC includes an interface and control for a required backup cache (Bcache).
Figure 4–2 Merits of a Multiprobes In Flight – Pipelined Cache
Pipelining allows 100% utilization of the data bus.
[Figure: timing comparison of a nonpipelined cache and a pipelined cache, with rows for index, latency, and data.]
Clocks
Figure 4–3 Tag/Data Store Interleaving
Interleaving tag write probes with data write operations allows 100% utilization of the data bus.
[Figure: data writes interleaved with tag probes, with rows for index, latency 1 through latency 5, and hits 1 through 4.]
4.2.1 CPU Clock
The 21164PC uses the differential input clock lines osc_clk_in_h,l as a source to generate its CPU clock. The input signals clk_mode_h<1:0> control generation of the CPU clock, as listed in Table 4–1 and as shown in Figure 4–4. The 21164PC uses clk_mode_h<0>...
Figure 4–4 Clock Signals and Functions
[Figure: block diagram of the 21164PC clock logic. Labels: osc_clk_in_h,l; CPU clock divider (/1 or /4); symmetrator; clk_mode_h<1:0>; cpu_clk_out_h; system clock divider (/4 through /15); irq_h<3:0>; sys_clk_out1_h; system clock delay (0 through 7); mch_hlt_irq_h; pwr_fail_irq_h; sys_mch_chk_irq_h; sys_clk_out2_h; sys_reset_l; dc_ok_h.]
4.2.2 System Clock
The CPU clock is the source clock used to generate the system clock sys_clk_out1_h.
Table 4–2 System Clock Divisor (Sheet 2 of 2)
Columns: irq_h<3>, irq_h<2>, irq_h<1>, irq_h<0>, Ratio
[Table: each row gives a High/Low setting for irq_h<3:0> and the resulting ratio.]
Figure 4–5 shows the 21164PC driving the system clock on a uniprocessor system.
Figure 4–5 21164PC Uniprocessor Clock
[Figure: block diagram; labels include Memory ASIC...]
Physical Address Considerations even. The output is asymmetric if the divisor is odd. When the divisor is odd, the clock is high for an extra cycle. Refer to Section 7.2 for information on sysclk behavior during reset. Table 4–3 System Clock Delay sys_mch_chk_irq_h pwr_fail_irq_h mch_hlt_irq_h Delay Cycles...
Physical Address Considerations system environment as to which INT8s are accessed. Write merging is permitted. Write accesses are INT32 requests with a mask indicating which INT4s are actually modified. The 21164PC never writes more than 32 bytes at a time in noncached space. The 21164PC does not broadcast accesses to the CBU IPR region if they map to a CBU IPR.
Bcache Structure 4.3.3 Noncached Read Operations Read operations to physical addresses that have addr_h<39> asserted are not cached in the Dcache or Bcache. They are merged like other read operations in the miss address file (MAF). To prevent several read operations to noncached memory from being merged into a single 32-byte bus request, software must insert memory barrier (MB) instructions or set MAF_MODE IPR bit (IO_NMERGE).
Cache Coherency
The 21164PC partitions the physical address (addr_h<32:4>) into an index field and a tag field. The 21164PC presents index_h<21:4> and tag_data_h<32:19> to the Bcache interface. The tag size required is Bcache_size/block_size. The system designer uses the signal lines needed for a particular size Bcache. For example, the 1MB Bcache needs index_h<19:4>...
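The split between index and tag lines for a given Bcache size can be worked out from the size alone. The sketch below is illustrative and rests on stated assumptions: the Bcache size is a power of two between .5MB and 4MB, the index uses address bits from bit 4 up to just below log2(size), and the tag uses the remaining bits up to bit 32. The index range for 1MB matches the manual's example; the tag ranges are an inference, and the function names are hypothetical.

```python
from typing import Dict, Tuple

MIN_SIZE = 512 * 1024        # .5MB
MAX_SIZE = 4 * 1024 * 1024   # 4MB


def _log2_size(size_bytes: int) -> int:
    if size_bytes & (size_bytes - 1) or not MIN_SIZE <= size_bytes <= MAX_SIZE:
        raise ValueError("Bcache size must be a power of two from .5MB to 4MB")
    return size_bytes.bit_length() - 1


def bcache_fields(size_bytes: int) -> Dict[str, Tuple[int, int]]:
    """Return (high, low) bit ranges for index_h and tag_data_h."""
    n = _log2_size(size_bytes)
    return {"index": (n - 1, 4), "tag": (32, n)}


def split_address(paddr: int, size_bytes: int) -> Tuple[int, int]:
    """Split a physical address into (index, tag) values for this size."""
    n = _log2_size(size_bytes)
    index = (paddr & ((1 << n) - 1)) >> 4          # bits <n-1:4>
    tag = (paddr >> n) & ((1 << (33 - n)) - 1)     # bits <32:n>
    return index, tag
```

Under these assumptions a 4MB Bcache uses all of index_h<21:4>, and a .5MB Bcache uses all of tag_data_h<32:19>, which agrees with the full signal ranges the 21164PC presents.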
Cache Coherency The system hardware designer need not be concerned about Icache and Dcache coherency. Coherency of the Icache is a software concern—it is flushed with an IMB (PALcode) instruction. The 21164PC requires the system to allow only one change to a block at a time. This means that if the 21164PC gains the bus to read or write a block, I/O devices on the system bus should not be allowed to access that block until the data has been moved.
Cache Coherency System logic notifies the 21164PC of all DMA write operations that occur on the system bus by using the interface FLUSH command. If the block is dirty, the 21164PC provides the data to the system and invalidates the block in the Bcache. If the block is not dirty (clean), data is not returned, and the block is invalidated.
Figure 4–7 Flush-Based Protocol System/Bus States
[Figure: state diagram. Transitions are labeled FLUSH (DMA write operation), INVAL, and READ (DMA read operation), each marked either "data returned to system" or "no data returned to system".]
4.6 21164PC-to-Bcache Transactions...
For every Bcache access, the 21164PC drives the index, address strobe (data_adsc_l), and the SSRAM clock (st_clk) to the SSRAMs to load the initial address. The st_clk may be delayed a programmable number of CPU cycles to facilitate better control over module timing.
21164PC-to-Bcache Transactions Bcache timing is configured using the CBOX_CONFIG and CBOX_CONFIG2 IPRs. Figures 5–48 and 5–51 show the layout of these registers. These registers are normally configured by 21164PC initialization code. Both the 21164PC and system require access to the Bcache through a shared 128-bit data bus.
21164PC-to-Bcache Transactions latency and repetition rate are programmable using the CBOX_CONFIG register fields <11:08> (BC_LATENCY_OFF<3:0>) and <07:04> (BC_CLK_RATIO<3:0>). For private Bcache writes, the 21164PC uses the early write SSRAM protocol controlled by the ADSC# pin. The repetition rate for data writes is programmable through the (BC_CLK_RATIO<3:0>).
21164PC-to-Bcache Transactions 4.6.5 Bcache Private Write Transactions CPU-initiated write operations are broken into two suboperations, namely a write- probe operation and a subsequent data-write operation. The write-probe operation performs the tag store lookup to determine hit or miss status as well as to determine the tag state, clean (V */D) or dirty (V*D).
21164PC-to-Bcache Transactions fore, the data-write operation after the fill operation completes does not update the tag store. The Bcache is nonblocking and allows other transactions to use the Bcache while waiting for outstanding Bcache misses. Figure 4–11 shows an example of the timing for a data-write operation that hits clean to the Bcache during the write probe.
data_adv_l one bc_clk cycle after the launch of the index. It is deasserted in the following bc_clk cycle. The longword write enables, data_ram_we_l<3:0>, are driven for each 16 bytes of write data at index launch time and at the subsequent bc_clk cycle.
4.6.5.3 Interleaving Write-Probes
The 21164PC is able to interleave data-write operations that hit dirty with write-probe operations, because the two operations access different stores (tag and data). This technique is used to fully saturate the data bus during write-hit streams, as shown in Figure 4–13.
21164PC-Initiated System Transactions
4.6.6 Selecting Bcache Options
Table 4–7 lists the variables to consider when designing and implementing a Bcache.
Table 4–7 Bcache Options
Parameter                                  Selection
sysclk ratio (4-15)                        ____ CPU cycles
Cache protocol, flush or flush invalidate  ____
Longword parity or no parity               ____
Bcache size (.5MB to 4MB)                  ____ MB...
21164PC-Initiated System Transactions • If there is a tag mismatch or the valid bit is clear, a Bcache miss has been detected. If the block to be replaced is clean, the Bcache continues operation while the READ MISS request is sent to the system. If the block to be replaced is dirty, the 21164PC waits for all outstanding probes in flight to complete, and then starts an external READ MISS with VICTIM PENDING transaction that instructs the system logic to access and return data.
Page 117
Table 4–8 21164PC-Initiated Interface Commands (Sheet 2 of 2)
Command      cmd_h<3:0>  Description
WRITE BLOCK  0110        Request to write a block. When the 21164PC wants to write a 32-byte block of data to noncached memory, it drives the command, address, and first INT16 of data on a sysclk edge.
21164PC-Initiated System Transactions 4.7.1 READ MISS Clean - No Victim A READ MISS command is launched to the system interface when: 1. The Bcache probe for a CPU-initiated READ command detects a miss. 2. The Bcache probe for a CPU-initiated WRITE command detects a miss. 3.
Figure 4–14 READ MISS Clean – Bcache Timing Diagram
21164PC-Initiated System Transactions 4.7.2 FILL The 21164PC provides an st_clkx_h pulse a certain number of cycles after the rising edge of the system clock, determined by the sum of the BC_CLK_DELAY<1:0> and the FILL_OFFSET<2:0> values in the CBOX_CONFIG register (see Section 5.3.1). The value must be from 1 to 7 and cannot be greater than the sysclk ratio.
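The constraint on the fill st_clk delay can be checked before the configuration is written. The following is a minimal sketch, assuming that "the value" in the paragraph above refers to the sum of BC_CLK_DELAY<1:0> and FILL_OFFSET<2:0>; the function name and the error handling are illustrative and are not part of the 21164PC documentation.

```python
def fill_st_clk_delay(bc_clk_delay: int, fill_offset: int, sysclk_ratio: int) -> int:
    """Return BC_CLK_DELAY + FILL_OFFSET in CPU cycles after range checks.

    Hypothetical helper: assumes the 1-to-7 limit applies to the sum.
    """
    if not 0 <= bc_clk_delay <= 3:   # BC_CLK_DELAY<1:0> is a 2-bit field
        raise ValueError("BC_CLK_DELAY must fit in 2 bits")
    if not 0 <= fill_offset <= 7:    # FILL_OFFSET<2:0> is a 3-bit field
        raise ValueError("FILL_OFFSET must fit in 3 bits")
    delay = bc_clk_delay + fill_offset
    if not 1 <= delay <= 7:
        raise ValueError("delay must be from 1 to 7 CPU cycles")
    if delay > sysclk_ratio:
        raise ValueError("delay cannot be greater than the sysclk ratio")
    return delay
```

For example, with a sysclk ratio of 4, a BC_CLK_DELAY of 1 and a FILL_OFFSET of 2 give a delay of 3 CPU cycles, while a BC_CLK_DELAY of 2 and a FILL_OFFSET of 3 would be rejected because 5 exceeds the ratio.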
21164PC-Initiated System Transactions At the end of the fill transaction, the 21164PC does not assert data_ram_oe_l or begin to drive the data bus until the fifth cpu_clk cycle after the sysclk that loads the last dack_h. If systems require more time to turn off their drivers, they must use idle_bc_h in combination with data_bus_req_h to stop 21164PC requests and not send any system requests.
Page 122
21164PC-Initiated System Transactions The use of dack_h for a system Bcache read command (Bcache victim or system command with data movement) is very dependent on the SSRAM style, either pipe- lined or flow-through. The assertion of dack_h is responsible for the assertion of the data_adv_l pin, and is not to be confused with the sampling of data.
Figure 4–15 READ MISS with Victim Timing Diagram, Pipelined Mode
Figure 4–16 READ MISS with Victim Timing Diagram, Flow-Through Mode
4.7.4 WRITE BLOCK
The WRITE BLOCK command is used to complete write operations to noncached memory. The 21164PC asserts the WRITE BLOCK command, along with the address, at the start of a sysclk cycle. The first 16 bytes of data and the int4_valid signals are driven one cpu_clk cycle later, so that the system interface is assured a minimum hold time of one cpu_clk cycle when sampling data on the next sysclk edge.
Figure 4–17 WRITE BLOCK Timing Diagram (signals shown: sys_clk(4:1), addr_h<39:4>, cmd_h<3:0> carrying WRBLK, victim_pending_h, cack_h, fill_h, fill_id_h, idle_bc_h, data_h<127:0>, dack_h)
4.8 System-Initiated Transactions
System commands to the 21164PC are driven on the cmd_h<3:0> signal lines. Before driving these signals, the system must gain control of the command and address buses by using addr_bus_req_h, as described in Section 4.9.1.
The 21164PC can hold two outstanding commands from the system at any time. The algorithm used by the system to send commands to the 21164PC without overflowing the two CBU BIU command buffers is shown in Figure 4–18.
Figure 4–18 Algorithm for System Sending Commands to the 21164PC
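One way a system could respect the two-command limit is to keep a count of commands that the 21164PC has not yet answered. The sketch below is a simple model of that idea and is not a transcription of Figure 4–18; the class name, the method names, and the assumption that a command is retired when the 21164PC responds to it are all hypothetical.

```python
class CommandThrottle:
    """Hypothetical model: never exceed two outstanding system commands."""

    MAX_OUTSTANDING = 2  # the 21164PC holds two commands from the system

    def __init__(self) -> None:
        self.outstanding = 0  # commands sent but not yet answered

    def can_send(self) -> bool:
        return self.outstanding < self.MAX_OUTSTANDING

    def send(self) -> None:
        # Called when the system drives a command on cmd_h<3:0>.
        if not self.can_send():
            raise RuntimeError("sending now would overflow the command buffers")
        self.outstanding += 1

    def response_received(self) -> None:
        # Assumption: one response from the 21164PC frees one buffer.
        if self.outstanding == 0:
            raise RuntimeError("response without an outstanding command")
        self.outstanding -= 1
```

A system model would call can_send() before each command and wait for a response when it returns False.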
System-Initiated Transactions 4.8.2 Write Invalidate Protocol Commands All 21164PC-based systems that use the write invalidate protocol are expected to use the READ, FLUSH, and INVALIDATE commands to maintain cache coherency. These commands are defined in Table 4–9. Table 4–9 System-Initiated Interface Commands (Write Invalidate Protocol) cmd_h Command...
4.8.2.1 21164PC Responses to Flush-Based Protocol Commands
The 21164PC responds to flush-based protocol commands on addr_res_h<1:0>, as shown in Table 4–10.
Table 4–10 21164PC Responses to Flush-Based Protocol Commands
READ and FLUSH Commands
Bcache                 21164PC Response
Bcache Miss            NOACK
Bcache Hit, Not Dirty  NOACK...
4.8.2.3 INVALIDATE
The INVALIDATE command can be used to remove a block from the cache system. Unlike the FLUSH command, INVALIDATE does not read any modified data. The Bcache is probed, and the block is invalidated if it is found. Figure 4–20 shows the timing of an INVALIDATE transaction.
When using the pipelined SSRAMs, the data output register delays the data an additional sysclk cycle. When the CBOX_CONFIG<BC_REG_REG> bit is set, the data_ram_oe_l deassertion is delayed an additional sysclk cycle to allow the system ample time to sample the delayed Bcache read data.
Figure 4–21 READ Timing Diagram (Bcache Hit) Flow-Through SSRAM
4.9 Data Bus and Command/Address Bus Contention
The data bus is composed of data_h<127:0> and lw_parity_h<3:0>. The command/address bus is composed of cmd_h<3:0> and addr_h<39:4>. The following sections describe situations that have contention for use of the data bus or contention for use of the command/address bus.
4.9.2 Read/Write Spacing—Data Bus Contention
The data bus, data_h<127:0>, can be driven by the 21164PC, the Bcache array, or the system. In the case of private Bcache write operations followed by private Bcache read operations, the 21164PC stops driving the data bus well in advance of the Bcache turning on. For private Bcache read operations followed by private Bcache write operations, the 21164PC inserts a programmable number of cpu_clk cycles between the read and...
Data Bus and Command/Address Bus Contention For example, if the sysclk ratio is 7, the Bcache read latency is 5, the bc_clk ratio is 3, and two cycles are necessary for tristate turnoff, then the equations would work out to: cpu_clk sysclk ratio sys_clk...
Data Bus and Command/Address Bus Contention To gain control of the data bus, the system must ensure that the Bcache is idle by asserting idle_bc_h for the required time. It can then assert data_bus_req_h. If data_bus_req_h is received asserted at the rising edge of sysclk N, the 21164PC stops driving the bus on the rising edge of sysclk N+1.
Data Bus and Command/Address Bus Contention 4.9.5.2 System READ to FILL (System WRITE) Spacing The time to turn off the Bcache drivers at the end of a system READ (Bcache victim or system command with data movement) is fixed by the 21164PC design (refer to Figure 4–24).
21164PC Interface Restrictions 4.9.5.3 FILL to Private READ or WRITE Operation At the end of the fill, the 21164PC does not begin to drive the data bus until the fifth cpu_clk cycle after the sysclk that loads the last dack_h (refer to Figure 4–25). The 21164PC does not assert data_ram_oe_l until the fifth cycle after the sysclk that loads the last dack_h.
21164PC/System Race Conditions For a WRITE BLOCK operation followed by a fill operation, the earliest point the system can assert the fill_h signal is at the sysclk after the last assertion of dack_h. Fill operations followed by fill operations are special cases. Fill operations can be pipelined back-to-back so that 100% of the data bus bandwidth can be used.
Page 140
21164PC/System Race Conditions 6. There is one exception to rules 3, 4, and 5. If idle_bc_h or a system command arrives while the 21164PC is reading the Bcache, and that read transaction turns into a read miss transaction, and it does not produce a victim, then the 21164PC loads the miss into the pad ring.
21164PC/System Race Conditions 4.11.2 READ MISS with Victim Aborted by FILL Example In Figure 4–26, the 21164PC asserts a READ MISS command with a victim. The system asserts dack_h for two data cycles received from the Bcache and then asserts idle_bc_h.
4.11.3 idle_bc_h and cack_h Race Example
In Figure 4–27, idle_bc_h and cack_h are asserted in the same sysclk cycle. The system takes the READ MISS and BCACHE VICTIM commands before doing anything else. The last dack_h meets the requirement that the cack_h arrive before or with the last dack_h.
4.11.4 READ MISS with idle_bc_h Asserted Example
In Figure 4–28, the 21164PC has started a Bcache read operation that misses. The signal idle_bc_h is asserted, but no victim was created, so the read miss request is loaded into the pad ring. The system then takes the request.
Figure 4–28 READ MISS with idle_bc_h Asserted Example...
21164PC/System Race Conditions 4.11.5 READ MISS with Victim Aborted by System Command Example In Figure 4–29, the 21164PC produces a READ MISS command with a victim and is waiting for the system to take it when the system takes the bus and requests a flush transaction.
Data Integrity and Bcache Errors
4.11.6 Bcache Hit Under READ MISS Example
In Figure 4–30, the 21164PC produces a read miss transaction and requests a fill from the system. A Bcache hit to index j takes place while waiting for the fill. The system then returns the requested data in two bursts, asserting cack_h at the same time as the last assertion of dack_h.
Interrupts 4.12.2 Bcache Tag Data Parity The signal line tag_data_par_h is used to maintain parity over tag_data_h<32:19>, tag_valid_h, and tag_dirty_h. A Bcache tag data parity error is usually not recoverable. A Bcache hit is determined based on the tag alone, not the tag parity bit. The CBU records the Bcache probe address and the tag value read from the Bcache.
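The coverage of tag_data_par_h can be illustrated with a small calculation over the 14 tag bits plus the valid and dirty bits. This is only a sketch: the text above does not state whether the parity is even or odd, so the polarity is left as a parameter, and the function name is hypothetical.

```python
def tag_parity_bit(tag_32_19: int, valid: int, dirty: int, odd: bool = False) -> int:
    """Parity bit over tag_data_h<32:19>, tag_valid_h, and tag_dirty_h.

    With odd=False the returned bit makes the total number of ones even.
    The polarity actually used by the hardware is an assumption here.
    """
    if not 0 <= tag_32_19 < (1 << 14):  # <32:19> is 14 bits wide
        raise ValueError("tag must fit in 14 bits")
    ones = bin(tag_32_19).count("1") + (valid & 1) + (dirty & 1)
    even_bit = ones & 1
    return even_bit ^ 1 if odd else even_bit
```

For a tag of all zeros with valid set and dirty clear, the even-parity result is 1, because exactly one covered bit is set.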
Interrupts 4.13.1 Interrupt Signals During Initialization The 21164PC interrupt signals work in tandem with the sys_reset_l signal to set the values for clock ratios and clock delays. During initialization, the 21164PC reads system clock configuration parameters from the interrupt pins. Section 4.2.2 and Section 4.2.3 describe how the interrupt signals are used to set system clock values when the system is initialized.
Page 148
Interrupts Table 4–11 Interrupt Priority Level Effect (Sheet 2 of 2) Interrupt Source Target IPL Source Software Interrupt Request 15 Internal Asynchronous system trap ATR pending (for Internal current or more privileged mode) Performance counter interrupt Internal Powerfail interrupt pwr_fail_irq_h System machine check interrupt sys_mch_chk_irq_h and internal...
Page 149
Internal Processor Registers This chapter describes the 21164PC microprocessor internal processor registers (IPRs). It is organized as follows: • Instruction fetch/decode unit and branch unit (IDU) IPRs • Memory address translation unit (MTU) IPRs • Cache control and bus interface unit (CBU) IPRs •...
5.1 Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs
The IDU internal processor registers (IPRs) are described in Section 5.1.1 through Section 5.1.27.
5.1.1 Istream Translation Buffer Tag (ITB_TAG) Register (101)
ITB_TAG is a write-only register written by hardware on an ITBMISS/IACCVIO, with the tag field of the faulting virtual address.
contents of the ITB_TAG register. The PTE field is provided by the HW_MTPR ITB_PTE instruction. Write operations to this register use the memory format bits, as described in the Alpha AXP Architecture Reference Manual. Figure 5–2 shows the ITB_PTE register write format.
Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.3 Instruction Translation Buffer Address Space Number (ITB_ASN) Register (103) ITB_ASN is a read/write register that contains the address space number (ASN) of the current process. Figure 5–4 shows the ITB_ASN register format. Figure 5–4 Instruction Translation Buffer Address Space Number (ITB_ASN) Register RAZ/IGN...
Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.6 Instruction Translation Buffer Invalidate All (ITB_IA) Register (105) ITB_IA is a write-only register. A write operation to this register invalidates all ITB entries, and resets the ITB not-last-used (NLU) pointer to its initial state. RESET PALcode must execute an HW_MTPR ITB_IA instruction in order to initialize the NLU pointer.
Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.8 Formatted Faulting Virtual Address (IFAULT_VA_FORM) Register (112) IFAULT_VA_FORM is a read-only register containing the formatted faulting virtual address on an ITBMISS/IACCVIO (except on IACCVIOs generated by sign-check errors). The formatted faulting address generated depends on whether NT superpage mapping is enabled through ICSR bit SPE<0>.
Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.9 Virtual Page Table Base (IVPTBR) Register (113) IVPTBR is a read/write register. Bits <32:30> are UNDEFINED on a read of this register in non-NT mode. Figure 5–8 shows the IVPTBR register format in non-NT mode.
Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.10 Icache Parity Error Status (ICPERR_STAT) Register (11A) ICPERR_STAT is a read/write register. The Icache parity error status bits may be cleared by writing a 1 to the appropriate bits. Figure 5–10 and Table 5–3 describe the ICPERR_STAT register format.
Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.12 Exception Address (EXC_ADDR) Register (10B) EXC_ADDR is a read/write register used to restart the system after exceptions or interrupts. The HW_REI instruction causes a return to the instruction pointed to by the EXC_ADDR register.
Figure 5–12 Exception Summary (EXC_SUM) Register
Table 5–4 Exception Summary Register Fields
Name Extent Type Description
<10> Indicates software completion possible. This bit is set after a floating-point instruction containing the /S modifier completes with an arithmetic trap, and if all previous floating-point instructions that trapped since the last HW_MTPR EXC_SUM instruction also contained the /S modifier.
5.1.14 Exception Mask (EXC_MASK) Register (10D)
EXC_MASK is a read/write register that records the destinations of instructions that have caused an arithmetic trap between EXC_MASK write operations. The destination is recorded as a single bit mask in the 64-bit IPR representing F0–F31 and I0–I31.
Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.15 PAL Base Address (PAL_BASE) Register (10E) PAL_BASE is a read/write register containing the base address for PALcode. The register is cleared by hardware on reset. Figure 5–14 shows the PAL_BASE register format.
Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.17 IDU Control and Status (ICSR) Register (118) ICSR is a read/write register containing IDU-related control and status information. Figure 5–16 and Table 5–5 describe the ICSR register format. Figure 5–16 IDU Control and Status (ICSR) Register RAZ/IGN RAZ/IGN PME<1:0>...
Page 165
Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs Table 5–5 IDU Control and Status Register Fields (Sheet 2 of 3) Name Extent Type Description <19> RW,0 If set, enables the motion video instruction (MVI) set. If clear, causes any MVI class instructions to generate a RESDEC trap.
Table 5–5 IDU Control and Status Register Fields (Sheet 3 of 3)
Name Extent Type Description
<36> RW,0 If set, forces bad Icache data parity. MBZ in normal operation.
<37> RW,1 Reserved to DIGITAL.
Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.19 Interrupt ID (INTID) Register (111) INTID is a read-only register that is written by hardware with the target IPL of the highest priority pending interrupt. The hardware recognizes an interrupt if the IPL being read is greater than the IPL given by IPLR<04:00>.
Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.20 Asynchronous System Trap Request (ASTRR) Register (109) ASTRR is a read/write register containing bits to request asynchronous system trap (AST) interrupts in each of the four processor modes (U,S,E,K). In order to generate an AST interrupt, the corresponding enable bit in the ASTER must be set and the current processor mode given in the ICM<04:03>...
5.1.22 Software Interrupt Request (SIRR) Register (108)
SIRR is a read/write register used to control software interrupt requests. A software interrupt at a particular IPL may be requested by setting the appropriate bit in SIRR<15:01>.
5.1.24 Interrupt Summary (ISR) Register (100)
ISR is a read-only register containing information about all pending hardware, software, and asynchronous system trap (AST) interrupt requests. Figure 5–23 and Table 5–8 describe the ISR register format. Refer to Table 4–11 for a description of which interrupts are enabled for a given interrupt priority level (IPL).
Page 172
Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs Table 5–8 Interrupt Summary Register Fields (Sheet 2 of 2) Name Extent Type Description <22> External hardware interrupt—irq_h<2>. <23> External hardware interrupt—irq_h<3>. <27> External hardware interrupt—performance counter 0 (IPL 29). <28> External hardware interrupt—performance counter 1 (IPL 29).
5.1.25 Serial Line Transmit (SL_XMIT) Register (116)
SL_XMIT is a write-only register used to transmit bit-serial data out of the microprocessor chip under the control of a software timing loop. The value of the TMT bit is transmitted offchip on the srom_clk_h signal.
Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.26 Serial Line Receive (SL_RCV) Register (117) SL_RCV is a read-only register used to receive bit-serial data under the control of a software timing loop. The RCV bit in the SL_RCV register is functionally connected to the srom_data_h signal.
5.1.27 Performance Counter (PMCTR) Register (11C)
PMCTR is a read/write register that controls the three onchip performance counters. Figure 5–26 and Table 5–11 describe the PMCTR register format. Performance counter interrupt requests are summarized in Section 5.1.24. CBU inputs to the counter select options are described in the PM0_MUX<2:0>
Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs Table 5–11 Performance Counter Register Fields Name Extent Type Description CTR0<15:0> <63:48> A 16-bit counter of events selected by SEL0 and enabled by CTL0<1:0>. CTR1<15:0> <47:32> A 16-bit counter. SEL0 <31> Counter0 Select—refer to Table 5–12. <30>...
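The field extents listed in Table 5–11 translate directly into shift-and-mask operations on a 64-bit PMCTR value. The sketch below decodes only the three fields whose extents appear in the rows above (CTR0, CTR1, and SEL0); the helper and function names are illustrative, not part of any DIGITAL tool.

```python
def bits(value: int, hi: int, lo: int) -> int:
    """Extract the inclusive bit range <hi:lo> from an integer."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_pmctr(pmctr: int) -> dict:
    """Decode the PMCTR fields whose extents are listed in Table 5-11."""
    return {
        "CTR0": bits(pmctr, 63, 48),  # 16-bit counter 0
        "CTR1": bits(pmctr, 47, 32),  # 16-bit counter 1
        "SEL0": bits(pmctr, 31, 31),  # counter 0 select
    }
```

A value with 0x1234 in <63:48>, 0xBEEF in <47:32>, and bit <31> set decodes to CTR0 = 0x1234, CTR1 = 0xBEEF, and SEL0 = 1.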
5.2 Memory Address Translation Unit (MTU) IPRs
The MTU internal processor registers (IPRs) are described in Section 5.2.1 through Section 5.2.23.
5.2.1 Dstream Translation Buffer Address Space Number (DTB_ASN) Register (200)
DTB_ASN is a write-only register that must be written with an exact duplicate of the ITB_ASN register ASN field.
Memory Address Translation Unit (MTU) IPRs 5.2.3 Dstream Translation Buffer Tag (DTB_TAG) Register (202) DTB_TAG is a write-only register that writes the DTB tag and the contents of the DTB_PTE register to the DTB. To ensure the integrity of the DTBs, the DTB’s PTE array is updated simultaneously from the internal DTB_PTE register when the DTB_TAG register is written.
Page 181
Memory Address Translation Unit (MTU) IPRs Read operations of the DTB_PTE require two instructions. First, a read from the DTB_PTE sends the PTE data to the DTB_PTE_TEMP register. A zero value is returned to the integer register file (IRF) on a DTB_PTE read operation. A second instruction reading from the DTB_PTE_TEMP register returns the PTE entry to the register file.
Memory Address Translation Unit (MTU) IPRs 5.2.5 Dstream Translation Buffer Page Table Entry Temporary (DTB_PTE_TEMP) Register (204) DTB_PTE_TEMP is a read-only holding register used for DTB_PTE data. Read operations of the DTB_PTE require two instructions to return the PTE data to the register file.
5.2.6 Dstream Memory Management Fault Status (MM_STAT) Register (205)
MM_STAT is a read-only register that stores information on Dstream faults and Dcache parity errors. The VA, VA_FORM, and MM_STAT registers are locked against further updates until software reads the VA register. The MM_STAT bits are modified by hardware only when the register is not locked and a memory management error, DTB miss, or Dcache parity error occurs.
Memory Address Translation Unit (MTU) IPRs Table 5–14 Dstream Memory Management Fault Status Register Fields (Sheet 2 of 2) Name Extent Type Description BAD_VA <05> Set if reference had a bad virtual address. <10:06> RA field of the faulting instruction. OPCODE <16:11>...
Memory Address Translation Unit (MTU) IPRs 5.2.8 Formatted Virtual Address (VA_FORM) Register (207) VA_FORM is a read-only register containing the virtual page table entry (PTE) address calculated as a function of the faulting virtual address and the virtual page table base (VA and MVPTBR registers). This is done as a performance enhancement to the Dstream TBmiss PAL flow.
Memory Address Translation Unit (MTU) IPRs Table 5–15 describes the VA_FORM register fields. Table 5–15 Formatted Virtual Address Register Fields Name Extent Type Description NT_Mode=0 VPTB <63:33> Virtual page table base address as stored in MVPTBR. VA<42:13> <32:03> Subset of the original faulting virtual address. NT_Mode=1 VPTB <63:30>...
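The NT_Mode=0 layout in Table 5–15 can be worked through as arithmetic: VPTB occupies <63:33>, and VA<42:13> is placed in <32:03>, which leaves the low three bits zero so that the result addresses an 8-byte PTE. The sketch below models only that non-NT case; the function name is hypothetical, and the hardware computes this value itself, so the code is purely explanatory.

```python
def va_form_non_nt(mvptbr: int, va: int) -> int:
    """Model of VA_FORM with NT_Mode=0, following Table 5-15.

    <63:33> come from the virtual page table base (MVPTBR) and
    <32:03> hold VA<42:13> of the faulting virtual address.
    """
    vptb = mvptbr & (((1 << 31) - 1) << 33)  # keep bits <63:33>
    va_42_13 = (va >> 13) & ((1 << 30) - 1)  # 30-bit page-number subset
    return vptb | (va_42_13 << 3)            # bits <2:0> remain zero
```

For a faulting address of 0x2000 (only VA<13> set) and a page table base with only bit 33 set, the result has bit 33 and bit 3 set.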
Memory Address Translation Unit (MTU) IPRs 5.2.10 Dcache Parity Error Status (DC_PERR_STAT) Register (212) DC_PERR_STAT is a read/write register that locks and stores Dcache parity error status. The VA, VA_FORM, and MM_STAT registers are locked against further updates until software reads the VA register. If a Dcache parity error is detected while the Dcache parity error status register is unlocked, the error status is loaded into DC_PERR_STAT<05:02>.
Memory Address Translation Unit (MTU) IPRs Table 5–16 Dcache Parity Error Status Register Fields Name Extent Type Description <00> Set if second Dcache parity error occurred in a cycle after the register was locked. The SEO bit is not set as a result of a second parity error that occurs within the same cycle as the first.
Memory Address Translation Unit (MTU) IPRs 5.2.13 Dstream Translation Buffer Invalidate Single (DTB_IS) Register (20B) DTB_IS is a write-only register. Writing a virtual address to this register invalidates the DTB entry that meets either of the following criteria: • A DTB entry whose VA field matches DTB_IS<42:13> and whose ASN field matches DTB_ASN<63:57>.
Memory Address Translation Unit (MTU) IPRs 5.2.14 MTU Control (MCSR) Register (20F) MCSR is a read/write register that controls features and records status in the MTU. This register is cleared on chip reset but not on timeout reset. Figure 5–39 and Table 5–17 describe the MCSR register format.
Table 5–17 MTU Control Register Fields
Name Extent Type Description
M_BIG_ENDIAN <00> RW,0 MTU Big Endian mode enable. When set, bit 2 of the physical address is inverted for all longword Dstream references.
SP<1:0>...
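The effect of M_BIG_ENDIAN on a longword reference amounts to toggling address bit 2, which selects the other longword within the same aligned quadword. The following sketch shows that arithmetic; the function name is illustrative, and the code is not a description of the internal MTU data path.

```python
def longword_dstream_pa(pa: int, m_big_endian: bool) -> int:
    """Physical address used for a longword Dstream reference.

    When M_BIG_ENDIAN is set, bit 2 of the physical address is inverted.
    """
    return pa ^ 0x4 if m_big_endian else pa
```

With the bit set, a longword reference to 0x1000 uses 0x1004 and a reference to 0x1004 uses 0x1000; with the bit clear, the address is unchanged.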
Memory Address Translation Unit (MTU) IPRs 5.2.15 Dcache Mode (DC_MODE) Register (216) DC_MODE is a read/write register that controls diagnostic and test modes in the Dcache. This register is cleared on chip reset but not on timeout reset. Figure 5–40 and Table 5–18 describe the DC_MODE register format.
Memory Address Translation Unit (MTU) IPRs Table 5–18 Dcache Mode Register Fields Name Extent Type Description DC_ENA <00> RW,0 Software Dcache enable. When set, the DC_ENA bit enables the Dcache. When clear, the Dcache command is not updated by ST or FILL operations, and all LD operations are forced to miss in the Dcache.
Memory Address Translation Unit (MTU) IPRs 5.2.16 Miss Address File Mode (MAF_MODE) Register (217) MAF_MODE is a read/write register that controls diagnostic and test modes in the MTU miss address file (MAF). This register is cleared on chip reset. MAF_MODE<05> is also cleared on timeout reset. Figure 5–41 and Table 5–19 describe the MAF_MODE register format.
Table 5–19 Miss Address File Mode Register Fields (Sheet 1 of 2)
Name Extent Type Description
DREAD_NOMERGE <00> RW,0 Miss address file (MAF) DREAD Merge Disable. When set, this bit disables all merging in the DREAD portion of the MAF.
Page 196
Table 5–19 Miss Address File Mode Register Fields (Sheet 2 of 2)
Name Extent Type Description
WB_PENDING <07> This bit indicates the status of the MAF WB file. When set, there are one or more outstanding WB requests in the MAF file.
Memory Address Translation Unit (MTU) IPRs 5.2.17 Dcache Flush (DC_FLUSH) Register (210) DC_FLUSH is a write-only register. A write operation to this register clears all the valid bits in both banks of the Dcache. 5.2.18 Alternate Mode (ALT_MODE) Register (20C) ALT_MODE is a write-only register that specifies the alternate processor mode used by some HW_LD and HW_ST instructions.
5.2.19 Cycle Counter (CC) Register (20D)
CC is a read/write register. The 21164PC supports it as described in the Alpha AXP Architecture Reference Manual. The low half of the counter, when enabled, increments once each CPU cycle. The upper half of the CC register is the counter offset. An HW_MTPR instruction writes CC<63:32>.
Memory Address Translation Unit (MTU) IPRs 5.2.20 Cycle Counter Control (CC_CTL) Register (20E) CC_CTL is a write-only register that writes the low 32 bits of the cycle counter to enable or disable the counter. Bits CC<31:04> are written with the value in CC_CTL<31:04>...
5.2.21 Dcache Test Tag Control (DC_TEST_CTL) Register (213)
DC_TEST_CTL is a read/write register used exclusively for testing and diagnostics. An address written to this register is used to index into the Dcache array when reading or writing to the DC_TEST_TAG register.
Page 201
Memory Address Translation Unit (MTU) IPRs Table 5–22 Dcache Test Tag Control Register Fields (Sheet 2 of 2) Name Extent Type Description DATA <13> Data for Dcache soft repair. When set, a logic level 1 for the programmable soft repair fuses is sent to the Dcache. When clear, a logic level 0 is sent to the Dcache.
Memory Address Translation Unit (MTU) IPRs 5.2.22 Dcache Test Tag (DC_TEST_TAG) Register (214) DC_TEST_TAG is a read/write register used exclusively for testing and diagnostics. When DC_TEST_TAG is read, the value in the DC_TEST_CTL register is used to index into the Dcache. The value in the tag, tag parity, valid, and data parity bits for that index are read out of the Dcache and loaded into the DC_TEST_TAG_TEMP register.
Memory Address Translation Unit (MTU) IPRs Table 5–23 Dcache Test Tag Register Fields Name Extent Type Description TAG_PARITY <02> Tag parity. This bit refers to the Dcache tag parity bit that covers tag bits 32 through 13 (valid bits not covered). OW0_VALID <11>...
5.2.23 Dcache Test Tag Temporary (DC_TEST_TAG_TEMP) Register (215)
DC_TEST_TAG_TEMP is a read-only register used exclusively for testing and diagnostics. Reading the Dcache tag array requires a two-step read process:
1. The first read operation from DC_TEST_TAG reads the tag array and data parity bits and loads them into the DC_TEST_TAG_TEMP register.
Memory Address Translation Unit (MTU) IPRs Table 5–24 Dcache Test Tag Temporary Register Fields Name Extent Type Description TAG_PARITY <02> Tag parity. This bit refers to the Dcache tag parity bit that covers tag bits 32 through 13 (valid bits not covered).
5.3 External Interface Control (CBU) IPRs
Table 5–25 lists specific IPRs for controlling Bcache, system configuration, and logging error information. These IPRs cannot be read or written from the system. They are placed in the 1MB region of 21164PC-specific I/O address space ranging from FF FFF0 0000 to FF FFFF FFFF.
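The address range quoted above can be checked arithmetically, and the four CBU IPR addresses given in the Section 5.3.x headings of this chapter all fall inside it. The sketch below records those addresses in a table and tests membership in the region; the constant and function names are illustrative only.

```python
CBU_IPR_BASE = 0xFF_FFF0_0000  # first byte of the 21164PC-specific region
CBU_IPR_LAST = 0xFF_FFFF_FFFF  # last byte of the region

# Addresses taken from the Section 5.3.1 through 5.3.4 headings.
CBU_IPRS = {
    "CBOX_CONFIG": 0xFF_FFF0_0008,
    "CBOX_ADDR": 0xFF_FFF0_0088,
    "CBOX_STATUS": 0xFF_FFF0_0108,
    "CBOX_CONFIG2": 0xFF_FFF0_0188,
}


def in_cbu_ipr_region(pa: int) -> bool:
    """True if a physical address lies in the 1MB CBU IPR region."""
    return CBU_IPR_BASE <= pa <= CBU_IPR_LAST


# The region spans exactly 1MB (2**20 bytes).
REGION_BYTES = CBU_IPR_LAST - CBU_IPR_BASE + 1
```

The computed REGION_BYTES equals 2**20, which agrees with the 1MB size stated in the text.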
5.3.1 CBU Configuration (CBOX_CONFIG) Register (FF FFF0 0008)
CBOX_CONFIG is a read/write register that controls Bcache activity. Figure 5–48 and Table 5–26 describe the CBOX_CONFIG register format. The bits in this register are initialized to the value indicated in Table 5–26 on reset, but not on timeout reset.
Page 208
External Interface Control (CBU) IPRs Table 5–26 CBU Configuration Register Fields (Sheet 2 of 3) Name Extent Type Description <13:12> RW,0 This field is used to indicate the size of the Bcache. SIZE<1:0> At power-up, this field is initialized to a value that represents a 512KB Bcache.
Page 209
Table 5–26 CBU Configuration Register Fields (Sheet 3 of 3)
Name Extent Type Description
BC_FILL_DLY_OFF<2:0> <22:20> RW,1 This offset field represents the additional number of CPU cycles to delay the st_clk when processing FILL commands.
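Initialization code that changes one CBOX_CONFIG field typically needs to leave the other fields untouched, which is a read-modify-write on the register value. The sketch below shows that pattern using extents quoted in this manual: BC_CLK_RATIO<3:0> at <07:04>, BC_LATENCY_OFF<3:0> at <11:08>, and BC_FILL_DLY_OFF<2:0> at <22:20>. The helper name and the example values are hypothetical, and the code manipulates a plain integer rather than accessing the IPR.

```python
def set_field(reg: int, hi: int, lo: int, value: int) -> int:
    """Return reg with the inclusive bit range <hi:lo> replaced by value."""
    width = hi - lo + 1
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the field")
    mask = ((1 << width) - 1) << lo
    return (reg & ~mask) | (value << lo)


# Example values only; real settings depend on the module design.
cfg = 0
cfg = set_field(cfg, 7, 4, 3)    # BC_CLK_RATIO<3:0>
cfg = set_field(cfg, 11, 8, 5)   # BC_LATENCY_OFF<3:0>
cfg = set_field(cfg, 22, 20, 2)  # BC_FILL_DLY_OFF<2:0>
```

Range-checking the value before merging it keeps an oversized setting from spilling into a neighboring field.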
5.3.2 CBU Address (CBOX_ADDR) Register (FF FFF0 0088)
CBOX_ADDR is a read-only register that contains the physical address associated with errors reported by the CBOX_STATUS register. Its contents are meaningful only when one of the error bits is set. A read of CBOX_STATUS unlocks the CBOX_ADDR register.
External Interface Control (CBU) IPRs 5.3.3 CBU Status (CBOX_STATUS) Register (FF FFF0 0108) CBOX_STATUS is a read-only register. It is locked when any of the error bits are set. Additional errors set the MULTI_ERR error bit in CBOX_STATUS. A read of CBOX_STATUS unlocks and clears CBOX_STATUS and unlocks CBOX_ADDR.
Page 212
External Interface Control (CBU) IPRs Table 5–28 CBU Status Register Fields (Sheet 2 of 2) Name Extent Type Description TAG_DIRTY <17> RO,0 This bit is the value of the TAG_DIRTY bit for the failing address. If set, the data had been modified and not written to memory.
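The lock-on-first-error behavior described for CBOX_STATUS and CBOX_ADDR can be pictured with a small software model. This is a sketch under stated assumptions: the class and method names are hypothetical, error sources are represented by an arbitrary bit number, MULTI_ERR is modeled as a boolean because its bit position is not given here, and errors arriving in the same cycle are not modeled.

```python
class CboxStatusModel:
    """Hypothetical model of CBOX_STATUS/CBOX_ADDR locking."""

    def __init__(self) -> None:
        self.error_bits = 0     # locked while any error bit is set
        self.multi_err = False  # set by errors that arrive while locked
        self.addr = None        # address captured with the first error

    def report_error(self, bit: int, addr: int) -> None:
        if self.error_bits:
            # Register is locked: only note that another error occurred.
            self.multi_err = True
        else:
            self.error_bits = 1 << bit
            self.addr = addr

    def read_status(self) -> tuple:
        # A read returns the status, then unlocks and clears it.
        snapshot = (self.error_bits, self.multi_err)
        self.error_bits = 0
        self.multi_err = False
        return snapshot
```

In this model the address of the first error is preserved when a second error arrives, which mirrors why CBOX_ADDR is meaningful only while an error bit is set.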
5.3.4 CBU Configuration #2 (CBOX_CONFIG2) Register (FF FFF0 0188)
CBOX_CONFIG2 is a read/write register that controls Bcache and memory, the performance counters, and the debug test port. Figure 5–51 and Table 5–29 describe the CBOX_CONFIG2 register format.
Page 214
External Interface Control (CBU) IPRs Table 5–29 CBU Configuration #2 Register Fields (Sheet 2 of 3) Name Extent Type Description DBG_SEL=0 DBG_SEL=1 spa req spc != NOP replay req scc code<0> io_wr or scc code<1> rmv req BC_THREE_MISS <6> RW,0 Allow three read misses to be launched to the system. This feature assumes the system can guarantee that fills can be returned in order.
Page 215
Table 5–29 CBU Configuration #2 Register Fields (Sheet 3 of 3) Name Extent Type Description PM1_MUX<2:0> <13:11> RW,0 This field selects the CBU events used for performance counter #1. PM1_MUX <2:0> Counter 1 is used to count: Bcache Dstream read requests (the total number of Dstream read requests from the MTU).
5.4 PALcode Storage Registers
The 21164PC IEU register file has eight extra registers that are called the PALshadow registers. The PALshadow registers overlay R8 through R14 and R25 when the CPU is in PALmode and ICSR<SDE> is set. Thus, PALcode can consider R8 through R14 and R25 as local scratch.
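The overlay rule can be restated as a small predicate. The C sketch below is only a model of the condition given in the text (register number, PALmode, and ICSR<SDE>); the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* True for the register numbers that have a PALshadow copy:
 * R8 through R14, and R25. */
static inline bool has_palshadow(unsigned reg)
{
    return (reg >= 8 && reg <= 14) || reg == 25;
}

/* True when an access to integer register `reg` would be directed to
 * the PALshadow copy: the CPU is in PALmode and ICSR<SDE> is set. */
static inline bool uses_palshadow(unsigned reg, bool palmode, bool icsr_sde)
{
    return palmode && icsr_sde && has_palshadow(reg);
}
```

Counting the register numbers accepted by `has_palshadow` over R0 through R31 gives eight, which agrees with the "eight extra registers" stated above.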
Restrictions Table 5–30 CBU IPR PALcode Restrictions (Sheet 2 of 2) Condition Restriction Any undefined CBU IPR address. No store instructions. Bcache in force hit mode. No STx_C to cacheable space. Clearing of BC_FORCE_HIT in Must be followed by MB, read operation of CBOX_CONFIG.
Page 218
Restrictions Table 5–31 PALcode Restrictions Table (Sheet 2 of 5) Y if checked The following in cycle 0: Restrictions (Note: Numbers refer to cycle number): by PVC Any store instruction No HW_MFPR DC_PERR_STAT in 1,2. No HW_MFPR MAF_MODE in 1,2 (WB_PENDING may not be updated).
Page 219
Restrictions Table 5–31 PALcode Restrictions Table (Sheet 3 of 5) Y if checked The following in cycle 0: Restrictions (Note: Numbers refer to cycle number): by PVC HW_MTPR ICSR: SPE, If HW_REI_STALL, then no HW_REI_STALL in 0,1. If HW_REI, then no HW_REI in 0,1,2,3,4. HW_MTPR ICSR: SPE Must flush Icache.
Page 220
Restrictions Table 5–31 PALcode Restrictions Table (Sheet 4 of 5) Y if checked The following in cycle 0: Restrictions (Note: Numbers refer to cycle number): by PVC HW_MTPR No load or store instructions in 1. DC_PERR_STAT No HW_MFPR DC_PERR_STAT in 1,2. HW_MTPR No HW_MFPR DC_TEST_TAG in 1,2,3.
Page 221
Restrictions Table 5–31 PALcode Restrictions Table (Sheet 5 of 5) Y if checked The following in cycle 0: Restrictions (Note: Numbers refer to cycle number): by PVC HW_MFPR ITB_PTE No HW_MFPR ITB_PTE_TEMP in 1,2,3. HW_MFPR No outstanding DC fills in 0. DC_TEST_TAG No HW_MFPR DC_TEST_TAG_TEMP issued or slotted in 1.
Privileged Architecture Library Code
This chapter describes the 21164PC privileged architecture library code (PALcode). The chapter is organized as follows:
• PALcode description
• PALmode environment
• Invoking PALcode
• PALcode entry points
• Required PALcode function codes
• 21164PC implementation of the architecturally reserved opcodes
6.1 PALcode Description
Privileged architecture library code (PALcode) is macrocode that provides an architecturally defined operating-system-specific programming interface that is common...
PALmode Environment • CALL_PAL instructions PALcode has characteristics that make it appear to be a combination of microcode, ROM BIOS, and system service routines, though the analogy to any of these other items is not exact. PALcode exists for several major reasons: •...
Invoking PALcode
• The program has privileged access to all the computer hardware. Most of the functions handled by PALcode are privileged and need control of the lowest levels of the system.
• Interrupts are disabled. If a long sequence of instructions needs to be executed atomically, interrupts cannot be allowed.
Page 226
Invoking PALcode behaves as if the PC were still longword aligned for the purposes of Istream fetch and execute. On HW_REI, the new state of PALmode is copied from EXC_ADDR<00>. When an event occurs that needs to invoke PALcode, the 21164PC first drains the pipeline.
6.4 PALcode Entry Points
PALcode is invoked at specific entry points. The 21164PC has two types of PALcode entry points: CALL_PAL and traps.
6.4.1 CALL_PAL Entry
CALL_PAL entry points are used whenever the IDU encounters a CALL_PAL instruction in the instruction stream (Istream).
PALcode Entry Points • PC<05:01> = 0 • PC<00> = 1 (PALmode) The minimum number of cycles for a CALL_PAL execution is four. Number of Cycles Description Minimum TRAPB for empty pipe. Typically this will be four cycles. Issue the CALL_PAL instruction. The minimum length of a PAL flow.
21164PC Implementation of the Architecturally Reserved Opcodes
Note: These architecturally reserved opcodes contain different options to the 21064 opcodes of the same names.
Table 6–3 Opcodes Reserved for PALcode 21164PC Architecture Mnemonic Opcode Mnemonic Function HW_LD PAL1B Performs Dstream load instructions. HW_ST PAL1F Performs Dstream store instructions....
21164PC Implementation of the Architecturally Reserved Opcodes Figure 6–1 HW_LD Instruction Format OPCODE DISP LOCK VPTE QUAD WRTCK PHYS LJ-03469.AI4 Table 6–4 HW_LD Format Description Field Value Description OPCODE The OPCODE field contains 1B — Destination register number. — Base register for memory address. PHYS The effective address for the HW_LD is virtual.
21164PC Implementation of the Architecturally Reserved Opcodes
6.6.2 HW_ST Instruction
PALcode uses the HW_ST instruction to access memory outside of the realm of normal Alpha memory management and to do special forms of Dstream store instructions. Figure 6–2 and Table 6–5 describe the format and fields of the HW_ST instruction.
21164PC Implementation of the Architecturally Reserved Opcodes 6.6.3 HW_REI Instruction The HW_REI instruction is used to return instruction flow to the PC pointed to by the EXC_ADDR IPR. The value in EXC_ADDR<0> will be used as the new value of PALmode after the HW_REI instruction. The IDU uses the return prediction stack to speed the execution of HW_REI.
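The way HW_REI interprets EXC_ADDR can be sketched in C as follows. Bit 0 supplies the new PALmode state, as stated above. Clearing the low two bits to form the fetch address is an assumption made here to model the statement that the PC behaves as if it were longword aligned; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t fetch_pc;  /* longword-aligned address used for Istream fetch */
    bool     palmode;   /* new PALmode state after HW_REI */
} rei_target;

/* Split an EXC_ADDR value into the return PC and the PALmode bit. */
static inline rei_target hw_rei_target(uint64_t exc_addr)
{
    rei_target t;
    t.palmode  = (exc_addr & 1u) != 0;        /* EXC_ADDR<0> */
    t.fetch_pc = exc_addr & ~UINT64_C(3);     /* assumed alignment mask */
    return t;
}
```

For example, an EXC_ADDR of 0x4001 models a return to address 0x4000 with PALmode still set, while 0x8000 models a return to native mode at 0x8000.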
21164PC Implementation of the Architecturally Reserved Opcodes 6.6.4 HW_MFPR and HW_MTPR Instructions The HW_MFPR and HW_MTPR instructions are used to access internal state from the IDU, MTU, and Dcache. The HW_MFPR from IDU IPRs has a latency of one cycle (HW_MFPR in cycle n results in data available to the using instruction in cycle n+1).
Initialization and Configuration
This chapter provides information on 21164PC-specific microprocessor/system initialization and configuration. It is organized as follows:
• Input signals sys_reset_l and dc_ok_h and booting
• sysclk ratio and delay
• Built-in self-test (BiSt)
• Serial read-only memory (SROM) interface port
•...
Input Signals sys_reset_l and dc_ok_h and Booting After power has reached the proper operating point, signal dc_ok_h must be asserted. Then, signal sys_reset_l must be deasserted. At this point, the 21164PC recognizes a powered-up state. If signal dc_ok_h is not asserted, signal sys_reset_l is forced asserted internally.
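The relationship between dc_ok_h and sys_reset_l can be summarized as a one-line predicate. The C sketch below models only the rule stated above (reset is forced internally whenever dc_ok_h is not asserted); it takes electrical pin levels as inputs, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Inputs are electrical levels: true means the pin is high.
 * dc_ok_h is asserted high; sys_reset_l is asserted low. */
static inline bool internal_reset_active(bool dc_ok_h_level,
                                         bool sys_reset_l_level)
{
    bool pin_reset_asserted = !sys_reset_l_level;
    /* Reset is active if the pin asserts it, or if dc_ok_h is not
     * asserted, in which case it is forced asserted internally. */
    return !dc_ok_h_level || pin_reset_asserted;
}
```

Only one of the four input combinations releases reset: dc_ok_h high and sys_reset_l high, which corresponds to the powered-up state described in the text.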
Input Signals sys_reset_l and dc_ok_h and Booting Table 7–1 provides the reset state of each external signal pin. Table 7–1 21164PC Signal Pin Reset State (Sheet 1 of 3) Signal Reset State Clocks clk_mode_h<1:0> NA (input). cpu_clk_out_h Clock output. osc_clk_in_h,l Must be clocking.
Page 238
Input Signals sys_reset_l and dc_ok_h and Booting Table 7–1 21164PC Signal Pin Reset State (Sheet 2 of 3) Signal Reset State System Interface addr_h<39:4> Driven or tristated depending upon addr_bus_req_h at most recent sysclk edge. If driven, the value is unspecified. addr_bus_req_h NA (input).
Input Signals sys_reset_l and dc_ok_h and Booting Table 7–1 21164PC Signal Pin Reset State (Sheet 3 of 3) Signal Reset State srom_data_h NA (input). srom_oe_l Deasserted. srom_present_l NA (input). tck_h NA (input). tdi_h NA (input). tdo_h NA (input). temp_sense NA (input). test_status_h<1>...
7.2 sysclk Ratio and Delay
While in reset, the 21164PC reads sysclk configuration parameters from the interrupt signal pins. These inputs should be driven with the correct configuration values whenever sys_reset_l is asserted. Refer to Section 4.2.2 and Section 4.2.3 for relevant input signals and ratio/delay values.
Serial Read-Only Memory Interface Port If srom_present_l is asserted during setup, then the system performs an SROM load as follows: 1. The srom_oe_l signal supplies the output enable to the SROM. 2. The srom_clk_h signal supplies the clock to the ROM that causes it to advance to the next bit.
External Interface Initialization
7.6.1 Icache Initialization
The Icache is not kept coherent with memory. When it is necessary to make it coherent with memory, the following procedure is used. The CALL_PAL IMB function uses this procedure. 1....
7.8 Internal Processor Register Reset State
Many IPR bits are not initialized by reset. They are located in error-reporting registers and other IPR states. They must be initialized by initialization PALcode. Table 7–2 lists the state of all internal processor registers (IPRs) immediately following reset.
Page 245
Internal Processor Register Reset State Table 7–2 Internal Processor Register Reset State (Sheet 2 of 3) Reset State Comments INTID UNDEFINED ASTRR UNDEFINED PALcode must initialize. ASTER UNDEFINED PALcode must initialize. SIRR UNDEFINED PALcode must initialize. HWINT_CLR UNDEFINED PALcode must initialize. UNDEFINED SL_XMIT Cleared...
Timeout Reset Table 7–2 Internal Processor Register Reset State (Sheet 3 of 3) Reset State Comments MCSR Cleared Cleared on chip reset but not on timeout reset. DC_MODE Cleared Cleared on chip reset but not on timeout reset. MAF_MODE Cleared Cleared on chip reset.
IEEE 1149.1 Test Port Reset 7.10 IEEE 1149.1 Test Port Reset Signal trst_l must be asserted when sys_reset_l is asserted or when dc_ok_h is deasserted. Continuous trst_l assertion during normal operation is used to guarantee that the IEEE 1149.1 test port does not affect 21164PC operation. 29 September 1997 –...
Error Detection and Error Handling This chapter provides an overview of the error handling strategy of the 21164PC. Each internal cache (instruction cache [Icache] and data cache [Dcache]) implements parity protection for tag and data. Longword parity protection is implemented for memory and backup cache (Bcache) data.
Error Flows
Note: The Icache is not flushed by hardware in this event. If an Icache parity error occurs early in the PALcode routine at the machine check entry point, an infinite loop may result.
• Recommendation: Flush the Icache early in the MCHK routine.
8.1.2 Dcache Data Parity Error
•...
Error Flows • Probably will not be able to recover by deleting a single process, because exact address is unknown, and a load may have falsely hit. 8.1.4 Istream Data Parity Errors (Bcache or Memory) • Machine check occurs before the instruction causing the error is executed. •...
Error Flows • CBOX_STATUS: <MEMORY> is set if source of fill data is memory/system; is clear if source is Bcache. • CBOX_ADDR: Contains the physical address bits <39:04> of the octaword associated with the error. 8.1.6 Bcache Tag Parity Errors—Istream •...
Error Flows • CBOX_ADDR: Contains the physical address bits <39:04> of the octaword associated with the error. 8.1.8 System Read Operations of the Bcache The 21164PC does not check the parity on outgoing Bcache data. If it is bad, the receiving processor will detect it.
MCHK Flow to determine if certain hardware is present). The purpose of this error detection mechanism is to attempt to prevent system hang in order to write a machine check stack frame. • ICPERR_STAT: <TMR> is set. 8.2 MCHK Flow The following flow is the recommended IPR access order to determine the source of a machine check.
MCK_INTERRUPT Flow
• If none of the previous conditions are true, then there is either an IRD that can be retried or the source of the MCHK is a fill_error_h. Add code for query of system status.
• The case can be retried if any one or several of the following are true (and none of the previous conditions were true): –...
Electrical Data
This chapter describes the electrical characteristics of the 21164PC component and its interface pins. It is organized as follows:
• Electrical characteristics
• dc characteristics
• Clocking scheme
• ac characteristics
• Power supply considerations
9.1 Electrical Characteristics
Table 9–1 lists the maximum ratings for the 21164PC and Table 9–2 lists the operating voltages.
DC Characteristics Table 9–1 21164PC Absolute Maximum Ratings (Sheet 2 of 2) Characteristics Ratings −0.5 V to 4.6 V Signal input or output applied Typical Vdd worst case power @ Vdd = 3.3 V Frequency = 400 MHz 2.5 W Frequency = 466 MHz 2.5 W Frequency = 533 MHz...
DC Characteristics Vclamp will be clamped to Vclamp provided that the current does not exceed Iclamp. The 21164PC may be damaged if the voltage exceeds Vclamp or the current exceeds Iclamp. 9.2.3 Output Signal Pins Output pins are ordinary 3.3-V CMOS outputs. Although output signals are rail-to-rail, timing is specified to Vdd/2....
Page 260
DC Characteristics Table 9–3 CMOS DC Input/Output Characteristics (Sheet 2 of 2) Parameter Requirements Symbol Description Min. Max. Units Test Conditions Iozh_pu Output with pull-up leakage — ±100 µA Vin = Vdd V current (tristate) Vclamp Maximum clamping voltage — Vdd + 1.0 V Iclamp = 100 mA Peak power supply current for...
Clocking Scheme 9.3 Clocking Scheme The differential input clock signals osc_clk_in_h,l run at the internal frequency of the time base for the 21164PC. The output signal cpu_clk_out_h toggles with an unspecified propagation delay relative to the transitions on osc_clk_in_h,l. The 21164PC provides a system clock to run the chip synchronous to the system. The 21164PC generates and drives out a system clock, sys_clk_out1_h.
Clocking Scheme allows a clock source of arbitrary dc bias to be ac coupled to the 21164PC. The peak-to-peak amplitude of the clock source must be between 0.6 V and 3.0 V. Either a square-wave or a sinusoidal source may be used. Full-rail clocks may be driven by testers....
AC Characteristics 9.3.3 AC Coupling Using series coupling (blocking) capacitors renders the 21164PC clock input pins insensitive to the oscillator’s dc level. When connected this way, oscillators with any dc offset relative to Vss can be used provided they can drive a signal into the osc_clk_in_h,l pins with a peak-to-peak level of at least 600 mV, but no greater than 3.0 V peak-to-peak.
AC Characteristics
Figure 9–3 Input/Output Pin Timing (panels: Input Timing, Output Timing; signals: Internal CPU Clock, Input Signals, Output Signals; labels: Tcycle, Tdsu, 2.0 V and 0.8 V input levels)
Because the speed and complexity of microprocessors have increased substantially over the years, it is necessary to change the way they are tested. Traditional assumptions that all loads can be lumped into some accumulation of capacitance can no longer be employed....
AC Characteristics There is no source termination resistor in the 21164PC fabricated in 0.35-µm CMOS process technology. The source impedance of the driver is approximately 32 Ω ±17. The circuit is designed to deliver a TTL signal under worst-case conditions. Under light load, high drive voltages, and fast process conditions there may be considerable overdrive.
AC Characteristics Outgoing Bcache index and data signals are driven off the internal clock edge and the incoming Bcache tag and data signals are latched on the same internal clock edge. Table 9–6 and Table 9–7 show the output driver characteristics for the normal driver and big driver, respectively.
AC Characteristics Output pin timing is specified for lumped 40-pF and 10-pF loads for the normal driver and lumped 60-pF, 40-pF, and 10-pF loads for the big driver. In some cases, the circuit may have loads higher than 40 pF (60 pF for big driver). The 21164PC can safely drive higher loads provided the average charging or discharging current from each pin is 11 mA or less for normal output drivers or 25 mA or less for big output drivers.
AC Characteristics 9.4.2.2 sys_clk-Based Systems All timing is specified relative to the rising edge of the internal CPU clock. Table 9–8 shows 21164PC system clock sys_clk_out1_h output timing. Setup and hold times are specified independent of the relative capacitive loading of sys_clk_out1_h,l, addr_h<39:4>, data_h<127:0>, and cmd_h<3:0>...
AC Characteristics Figure 9–5 shows sys_clk system timing. Figure 9–5 sys_clk System Timing Relationship of CPU Clock and sys_clk_out1 CPU Clock Tsysd sys_clk_out1 Memory Read (Pipe_Latch Mode) sys_clk_out1 Tsysd Tsysd Tsysd CPU Clock Taod Taoh Address/Command Out Ttacksu dack Tdsu Data In Memory Read (Non-Pipe_Latch Mode) sys_clk_out1...
AC Characteristics 9.4.3 Timing—Additional Signals This section lists timing for all other signals. Asynchronous Input Signals The following is a list of the asynchronous input signals: clk_mode_h<1:0> dc_ok_h sys_reset_l irq_h<3:0> mch_hlt_irq_h pwr_fail_irq_h sys_mch_chk_irq_h These signals can also be used synchronously. Miscellaneous Signals Table 9–9 and Table 9–10 list the timing for miscellaneous input-only and output- only signals.
AC Characteristics Signals in Table 9–11 are used to control Bcache data transfers. These signals are driven off the CPU clock. The timing of these signals does not change when switch- ing over to the sys_clk_out timing domain. Table 9–11 Bcache Control Signal Timing Signal Specification Value...
AC Characteristics Figure 9–6 BiSt Timing Event —Timeline Deassert BiSt Start Deassert* BiSt Done sys_reset_l (test_status_h<1:0>=01) Internal Reset (test_status_h<1:0>=00) (T%Z_RESET_B_L) The timing for deassertion of internal reset (time t , see asterisk) is valid only if an SROM is not present (indicated by keeping signal srom_present_l deasserted). If an SROM is present, the SROM load is performed once the BiSt completes.
AC Characteristics 9.4.4.2 Automatic SROM Load Timing The SROM load is triggered by the conclusion of BiSt if srom_present_l is asserted. The SROM load occurs at the internal cycle time of approximately 126 CPU cycles for srom_clk_h, but the behavior at the pins may shift slightly. Refer to Chapter 7 for more information on input signals, booting, and the SROM interface port.
AC Characteristics Figure 9–8 is a timing diagram of an SROM load sequence. Figure 9–8 Serial ROM Load Timing sys_reset_l srom_oe_l srom_clk_h srom_data_h = 4 x sysclk period + 1.1 ns 131, 072 Bits Total = 0 ns MK145507B The minimum srom_clk_h cycle = (126 − sysclk ratio) × (CPU cycle time). The maximum srom_clk_h to srom_data_h delay allowable (in order to meet the required setup time) = [126 −...
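The srom_clk_h formula above lends itself to a quick calculation. The C sketch below evaluates the minimum srom_clk_h cycle and estimates the total load time for the 131,072 bits shown in Figure 9–8, assuming one bit is read per srom_clk_h cycle. The function names are hypothetical, and the sysclk ratio of 4 used in the usage note is only an illustrative value.

```c
#include <assert.h>

/* Total number of SROM bits loaded, from Figure 9-8. */
#define SROM_TOTAL_BITS 131072.0

/* Minimum srom_clk_h cycle in ns:
 * (126 - sysclk ratio) x (CPU cycle time). */
static inline double srom_clk_min_cycle_ns(unsigned sysclk_ratio,
                                           double cpu_cycle_ns)
{
    return (126.0 - (double)sysclk_ratio) * cpu_cycle_ns;
}

/* Estimated load time in ms for a given srom_clk_h cycle,
 * assuming one bit per clock cycle. */
static inline double srom_load_time_ms(double srom_clk_cycle_ns)
{
    return SROM_TOTAL_BITS * srom_clk_cycle_ns / 1.0e6;
}
```

With a 2.5 ns CPU cycle (400 MHz) and an assumed sysclk ratio of 4, `srom_clk_min_cycle_ns(4, 2.5)` evaluates to 305 ns. At the nominal 126 CPU cycles per srom_clk_h cycle (315 ns), the estimated load time is roughly 41.3 ms.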
Power Supply Considerations Table 9–16 lists the clock test modes. Table 9–16 Clock Test Modes clk_mode_h Mode <1> <0> Notes Normal (1×) clock mode Normal (1×) clock mode Symmetrator is enabled. Clock reset Clock reset Symmetrator is enabled. 9.4.6 IEEE 1149.1 (JTAG) Performance Table 9–17 lists the standard mandated performance specifications for the IEEE 1149.1 circuits.
Power Supply Considerations Plus 5 V is not used in the 21164PC. The voltage difference between the Vdd pins and Vss pins must never be greater than 3.46 V, and the voltage difference between the Vddi pins and Vss pins must never be greater than 2.6 V. If the differentials exceed these limits, the 21164PC chip will be damaged.
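A bring-up or test program might guard against the two supply limits quoted above with a simple check. The following C sketch encodes only those two limits; the names are hypothetical and the function is not a substitute for the full supply sequencing requirements in this section.

```c
#include <assert.h>
#include <stdbool.h>

/* Maximum allowed differentials, in volts, as stated in the text. */
#define VDD_VSS_MAX_V   3.46
#define VDDI_VSS_MAX_V  2.6

/* True if both supply differentials are within the stated limits. */
static inline bool supply_within_limits(double vdd, double vddi, double vss)
{
    return (vdd - vss) <= VDD_VSS_MAX_V && (vddi - vss) <= VDDI_VSS_MAX_V;
}
```

The nominal 3.3 V and 2.5 V supplies pass this check, while a Vdd of 3.5 V or a Vddi of 2.7 V (both relative to a 0 V Vss) would fail it.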
Power Supply Considerations Use capacitors that are as physically small as possible. Connect the capacitors directly to the 21164PC Vddi and Vss pins by short surface etch (0.64 cm [0.25 in] or less). The small capacitors generally have better electrical characteristics than the larger units, and will more readily fit close to the IPGA pin field.
Page 280
Power Supply Considerations There is no derating for shorter transient periods or lower transient voltages (for example, a 400-mV transient voltage lasting for 100 µs is not acceptable). All input and bidirectional signals are diode-clamped to Vdd and Vss. A current greater than Iclamp on an individual pin could damage the 21164PC.
Thermal Management This chapter describes the 21164PC thermal management and thermal design considerations. 10.1 Operating Temperature The 21164PC is specified to operate when the temperature at the center of the heat sink (T ) is 71.8°C for 400 MHz, 69.8°C for 466 MHz, or 67.5°C for 533 MHz. Temperature (T ) should be measured at the center of the heat sink (between the two package studs)....
Operating Temperature Table 10–2 Maximum T at Various Airflows Airflow (linear ft/min) 1000 Frequency: 400 MHz, Power: 26.5 W @Vdd = 3.3 V and @Vddi = 2.5 V with heat sink 1 (°C) — 26.8 46.6 51.9 54.6 57.2 with heat sink 2 (°C) 51.9 (includes 52×10 mm fan) Frequency: 466 MHz, Power: 30.5 W @Vdd = 3.3 V and @Vddi = 2.5 V...
Heat-Sink Specifications 10.2 Heat-Sink Specifications Figure 10–1 describes the specifications of heat sink 1. Heat sink 2 has the exact same specifications, plus an added 52×10 mm fan. Figure 10–1 Heat Sink 1 (1.870 in) 4.75 cm 4.75 cm 4.20 cm (1.655 in) (1.870 in) 2.16 cm...
Thermal Design Considerations 10.3 Thermal Design Considerations Follow these guidelines for printed circuit board (PCB) component placement: • Orient the 21164PC on the PCB with the heat-sink fins aligned with the airflow direction. • Avoid preheating ambient air. Place the 21164PC on the PCB so that inlet air is not preheated by any other PCB components.
Mechanical Packaging Information This chapter describes the 21164PC mechanical packaging including chip package physical specifications and a signal/pin list. For heat-sink dimensions, refer to Chapter 10. 11.1 Mechanical Specifications Figure 11–1 shows the package physical dimensions without a heat sink.
Signal Descriptions and Pin Assignment 11.2 Signal Descriptions and Pin Assignment This section provides detailed information about the 21164PC pinout. The 21164PC has 413 pins aligned in an interstitial pin grid array (IPGA) design. 11.2.1 Signal Pin Lists Table 11–1 lists the 21164PC signal pins and their corresponding pin grid array (PGA) locations in alphabetic order;...
Testability and Diagnostics This chapter describes the 21164PC user-oriented testability features. The 21164PC also has several internal testability features that are implemented for factory use only. These features are beyond the scope of this document. 12.1 Test Port Pins Table 12–1 summarizes the test port pins and their functions. Table 12–1 21164PC Test Port Pins Pin Name Type Function...
12.2 Test Interface
The 21164PC test interface supports a serial ROM interface, a serial diagnostic terminal interface, and an IEEE 1149.1 test access port. These ports are available and set to normal test interface mode when port_mode_h<1:0>=00. Driving these pins to a value of anything other than 00 redefines all other test interface pins and invokes special factory test modes not covered in this document....
Test Interface
Note: DIGITAL recommends that the trst_l pin be driven low (asserted) when the JTAG (IEEE 1149.1) logic is not in use.
2. Coverage of oscillator differential input pins
The two differential clock input pins, osc_clk_in_h and osc_clk_in_l, do not have any boundary-scan cells associated with them (noncompliant spec 10.4.1(b) in IEEE 1149.1–1993)....
Test Interface TAP Controller The TAP controller contains a state machine. It interprets IEEE 1149.1 protocols received on signal tms_h and generates appropriate clocks and control signals for the testability features under its jurisdiction. The state machine is shown in Figure 12–2. Figure 12–2 TAP Controller State Machine Test Logic Reset...
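For readers without the figure at hand, the standard IEEE 1149.1 TAP controller state machine can be written as a next-state table indexed by the current state and the value sampled on tms_h. The C sketch below follows the 16-state diagram defined by the standard; it is a software model with hypothetical names, and it does not describe any 21164PC-specific behavior beyond what the standard requires.

```c
#include <assert.h>

typedef enum {
    TAP_TEST_LOGIC_RESET, TAP_RUN_TEST_IDLE,
    TAP_SELECT_DR, TAP_CAPTURE_DR, TAP_SHIFT_DR, TAP_EXIT1_DR,
    TAP_PAUSE_DR, TAP_EXIT2_DR, TAP_UPDATE_DR,
    TAP_SELECT_IR, TAP_CAPTURE_IR, TAP_SHIFT_IR, TAP_EXIT1_IR,
    TAP_PAUSE_IR, TAP_EXIT2_IR, TAP_UPDATE_IR,
    TAP_STATE_COUNT
} tap_state;

/* tap_next[state][tms]: next state on the rising edge of the test clock. */
static const tap_state tap_next[TAP_STATE_COUNT][2] = {
    [TAP_TEST_LOGIC_RESET] = { TAP_RUN_TEST_IDLE, TAP_TEST_LOGIC_RESET },
    [TAP_RUN_TEST_IDLE]    = { TAP_RUN_TEST_IDLE, TAP_SELECT_DR },
    [TAP_SELECT_DR]        = { TAP_CAPTURE_DR,    TAP_SELECT_IR },
    [TAP_CAPTURE_DR]       = { TAP_SHIFT_DR,      TAP_EXIT1_DR },
    [TAP_SHIFT_DR]         = { TAP_SHIFT_DR,      TAP_EXIT1_DR },
    [TAP_EXIT1_DR]         = { TAP_PAUSE_DR,      TAP_UPDATE_DR },
    [TAP_PAUSE_DR]         = { TAP_PAUSE_DR,      TAP_EXIT2_DR },
    [TAP_EXIT2_DR]         = { TAP_SHIFT_DR,      TAP_UPDATE_DR },
    [TAP_UPDATE_DR]        = { TAP_RUN_TEST_IDLE, TAP_SELECT_DR },
    [TAP_SELECT_IR]        = { TAP_CAPTURE_IR,    TAP_TEST_LOGIC_RESET },
    [TAP_CAPTURE_IR]       = { TAP_SHIFT_IR,      TAP_EXIT1_IR },
    [TAP_SHIFT_IR]         = { TAP_SHIFT_IR,      TAP_EXIT1_IR },
    [TAP_EXIT1_IR]         = { TAP_PAUSE_IR,      TAP_UPDATE_IR },
    [TAP_PAUSE_IR]         = { TAP_PAUSE_IR,      TAP_EXIT2_IR },
    [TAP_EXIT2_IR]         = { TAP_SHIFT_IR,      TAP_UPDATE_IR },
    [TAP_UPDATE_IR]        = { TAP_RUN_TEST_IDLE, TAP_SELECT_DR },
};

/* Advance the model by one test clock with the given tms value. */
static inline tap_state tap_step(tap_state s, int tms)
{
    return tap_next[s][tms ? 1 : 0];
}
```

A useful property of this table is that holding tms high for five clocks reaches Test-Logic-Reset from any state, which gives software a way to synchronize with the controller without using trst_l.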
Page 299
Test Interface During the capture operation, the shift register stage of IR is loaded with the value 00001. This automatic load feature is useful for testing the integrity of the IEEE 1149.1 scan chain on the module. Table 12–3 Instruction Register Selected IR<4:0>...
Boundary-Scan Register
12.2.2 Test Status Pin
One test status pin, test_status_h<1>, is used for extracting test status information from the chip. System reset drives the test status pin low. The default operation for test_status_h<1> is to output the IPR-written value. •...
Alpha Instruction Set A.1 Alpha Instruction Summary This appendix contains a summary of all Alpha architecture instructions. All values are in hexadecimal radix. Table A–1 describes the contents of the Format and Opcode columns that are in Table A–2. Table A–1 Instruction Format and Opcode Notation Instruction Format Opcode...
Alpha Instruction Summary Qualifiers for operate instructions are shown in Table A–2. Qualifiers for IEEE and VAX floating-point instructions are shown in Tables A–5 and A–6, respectively. Table A–2 Architecture Instructions (Sheet 1 of 8) Mnemonic Format Opcode Description ADDF 15.080 Add F_floating ADDG...
Page 307
Alpha Instruction Summary Table A–2 Architecture Instructions (Sheet 2 of 8) Mnemonic Format Opcode Description CMOVE if ≥ zero CMOVGE 11.46 CMOVGT 11.66 CMOVE if > zero CMOVLBC 11.16 CMOVE if low bit clear CMOVLBS 11.14 CMOVE if low bit set CMOVE if ≤...
Page 308
Alpha Instruction Summary Table A–2 Architecture Instructions (Sheet 3 of 8) Mnemonic Format Opcode Description CVTGQ 15.0AF Convert G_floating to quadword CVTLQ 17.010 Convert longword to quadword CVTQF 15.0BC Convert quadword to F_floating CVTQG 15.0BE Convert quadword to G_floating CVTQL 17.030 Convert quadword to longword CVTQL/SV...
Page 309
Alpha Instruction Summary Table A–2 Architecture Instructions (Sheet 4 of 8) Mnemonic Format Opcode Description Floating branch if ≥ zero FBGE FBGT Floating branch if > zero Floating branch if ≤ zero FBLE FBLT Floating branch if < zero Floating branch if ≠ zero FBNE FCMOVEQ 17.02A...
Page 311
Alpha Instruction Summary Table A–2 Architecture Instructions (Sheet 6 of 8) Mnemonic Format Opcode Description MSKWH 12.52 Mask word high MSKWL 12.12 Mask word low MT_FPCR 17.024 Move to floating-point control register MULF 15.082 Multiply F_floating MULG 15.0A2 Multiply G_floating MULL 13.00 Multiply longword...
Page 312
Alpha Instruction Summary Table A–2 Architecture Instructions (Sheet 7 of 8) Mnemonic Format Opcode Description S8SUBQ 10.3B Scaled subtract quadword by 8 SEXTB 1C.00 Sign extend byte SEXTW 1C.01 Sign extend word 12.39 Shift left logical 12.3C Shift right arithmetic 12.34 Shift right logical Store byte Store F_floating Store G_floating...
VAX Floating-Point Instructions Table A–5 IEEE Floating-Point Instruction Function Codes (Sheet 3 of 3) Mnemonic None /SVC /SVI /SVIC CVTTQ Mnemonic /SVD /SVID /SVM /SVIM CVTTQ Note: Because underflow cannot occur for CMPTxx, there is no difference in function or performance between CMPTxx/S and CMPTxx/SU. It is intended that software generate CMPTxx/SU in place of CMPTxx/S.
Opcode Summary Table A–6 VAX Floating-Point Instruction Function Codes (Sheet 2 of 2) Mnemonic None /SUC DIVG MULF MULG SUBF SUBG Mnemonic None /SVC CVTGQ A.4 Opcode Summary Table A–7 lists all Alpha opcodes from 00 (CALL_PAL) through 3F (BGT). In the table, the column headings that appear over the instructions have a granularity of 8 The rows beneath the Offset column supply the individual hexadecimal number to resolve that granularity.
Required PALcode Function Codes A.5 Required PALcode Function Codes The opcodes listed in Table A–8 are required for all Alpha implementations. The notation used is oo.ffff, where oo is the hexadecimal 6-bit opcode and ffff is the hexadecimal 26-bit function code. Table A–8 Required PALcode Function Codes Mnemonic Type...
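The oo.ffff notation maps directly onto the 32-bit instruction word: the 6-bit opcode occupies bits <31:26> and the 26-bit function code occupies bits <25:0>. The C sketch below packs and unpacks these two fields. The function names are hypothetical; the example value 00.0086 for IMB is the function code defined by the Alpha architecture.

```c
#include <assert.h>
#include <stdint.h>

/* Build an instruction word from an oo.ffff pair:
 * opcode in bits <31:26>, function code in bits <25:0>. */
static inline uint32_t encode_oo_ffff(uint32_t opcode, uint32_t function)
{
    return ((opcode & 0x3Fu) << 26) | (function & 0x03FFFFFFu);
}

/* Recover the 6-bit opcode from an instruction word. */
static inline uint32_t insn_opcode(uint32_t insn)
{
    return insn >> 26;
}

/* Recover the 26-bit function code from an instruction word. */
static inline uint32_t insn_function(uint32_t insn)
{
    return insn & 0x03FFFFFFu;
}
```

For instance, CALL_PAL IMB (00.0086) encodes as the word 0x00000086, and an instruction with opcode 1B and a zero function field encodes as 0x6C000000.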
Page 320
21164PC Microprocessor IEEE Floating-Point Conformance The divide-by-zero trap is always enabled. If the trap occurs, then the destination register is UNPREDICTABLE. For VAX architecture format, this exception is signaled whenever the numerator is valid and the denominator is zero. For IEEE format, this exception is signaled whenever the numerator is valid and nonzero, with a denominator of ±0.
Page 321
21164PC Microprocessor IEEE Floating-Point Conformance If a CVTQL results in an integer overflow (IOV), then FPCR<INE> is automatically set. (The INE trap is never signaled to the IDU because there is no CVTQL opcode that enables the inexact trap.) •...
21164PC Microprocessor Specifications Table B–1 lists specifications for the 21164PC. Table B–1 21164PC Microprocessor Specifications (Sheet 1 of 2) Feature Description Cycle time range 2.50 ns (400 MHz) to 1.87 ns (533 MHz) Process technology 0.35-µm CMOS Transistor count 3.5 million Die size 8.65 ×...
Page 324
Table B–1 21164PC Microprocessor Specifications (Sheet 2 of 2) Feature Description Onchip L1 Dcache 8KB, physical, direct-mapped, write-through, 32-byte block, 32-byte fill Onchip L1 Icache 16KB, virtual, direct-mapped, 64-byte block, 32-byte fill, 128 address space numbers (ASNs) (MAX_ASN=127) Onchip data 64-entry, fully associative, not-last-used replacement, 8K pages, translation buffer 128 ASNs (MAX_ASN=127), full granularity hint support...
Page 325
Serial Icache Load Predecode Values The following C code calculates the predecode values of a serial Icache load. A software tool called the SROM Packer converts a binary image into a format suitable for Icache serial loading. This tool is available from DIGITAL. #include <stdio.h>...
Page 327
tag, predecodes, owparity; int device_size; ** define the ROM size in bits to determine the maximum number of instructions allowed ** define the number of bits per instruction for 21164PC ICache #define ROMSIZE 262144 #define B_PER_INST 64 main(int argc, char *argv[]) int i, j;...
Page 328
if (argc>1) strcpy(filename,argv[1]); if (argc>2) strcpy(ofilename,argv[2]); if (argc>3) strcpy(hfilename,argv[3]); if (NULL == (infile = fopen(filename,"rb"))) printf("input file open error: %s\n", filename); exit(0); if (NULL == (outfile = fopen(ofilename, "wb"))) printf("binary output file open error: %s\n", ofilename); exit(0); if (NULL == (hexfile = fopen(hfilename, "w"))) printf("hex output file open error: %s\n", hfilename);...
Page 329
build_vector(instr, outvector, &instatus, &instr_count); /* build the vector */ if (instr_count > MAX_INSTR){ printf("\nev5fmt Warning: input file too long.\n"); printf("\tThere are %d instructions in the input file\n", instr_count); printf("\tTruncated after %d instructions\n\n", MAX_INSTR); fprintf(hexfile,":15%04X00",offset); chksum = (offset & 0xff) + (offset >> 8) + 0x15; for (j=0;...
Page 330
/* invert bit 2 to match fill scan chain attribute */ owparity ^= eparity(instr[j]); pdparity = eparity(predecodes); /* bhtvector */ for (j=0;j<7;j++) t = BHTfillmap[j]; outvector[t>>5] |= ((bhtvector >> j) & 1) << (t&0x1f); /* instructions */ for (k=0;k<4;k++) for (j=0;j<32;j++) t = dfillmap[j+k*32];...
Page 331
/* asm */ outvector[asmfillmap>>5] |= asm << (asmfillmap&0x1f); /* tag */ for (j=0;j<29;j++) t = tagfillmap[j]; outvector[t>>5] |= ((tag >> j) & 1) << (t&0x1f); int eparity(int x) x = x ^ (x >> 16); x = x ^ (x >> 8); x = x ^ (x >>...
Page 332
int ld; int store; int br; int call_pal; int bsr; int ret_rei; int jmp; int jsr_cor; int jsr; int cond_br; opcode = EXTV(inst, 31, 26 ); func = EXTV(inst, 12, 5); jsr_type = EXTV(inst, 15,14); ra = EXTV(inst,25,21); e0_only = (opcode == 0x24) || /* STF */ (opcode == 0x25) || /* STG */...
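Two idioms recur in the listing above: extracting an instruction field by bit range (the EXTV calls) and folding a word down to a single parity bit (eparity). The C sketch below shows one way each could be written. It is not the code from the SROM Packer, whose macro definitions are not reproduced in this excerpt; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits <hi:lo> of a 32-bit value, right-justified.
 * Requires 31 >= hi >= lo. */
static inline uint32_t extract_field(uint32_t value, unsigned hi, unsigned lo)
{
    unsigned width = hi - lo + 1;
    uint32_t mask = (width >= 32) ? UINT32_MAX
                                  : ((UINT32_C(1) << width) - 1u);
    return (value >> lo) & mask;
}

/* XOR of all 32 bits: 1 if the number of set bits is odd, else 0.
 * Each step folds the upper half of the remaining bits onto the lower. */
static inline unsigned parity_bit32(uint32_t x)
{
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    x ^= x >> 2;
    x ^= x >> 1;
    return (unsigned)(x & 1u);
}
```

With these helpers, the opcode of an instruction word would be `extract_field(inst, 31, 26)` and the Ra field `extract_field(inst, 25, 21)`, mirroring the bit ranges used in the listing.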
Errata Sheet Table D–1 lists the revision history for this document. Table D–1 Document Revision History Date Revision September 29, 1997 Preliminary version, EC-R2W0A-TE
Page 339
Support, Products, and Documentation If you need technical support, a Digital Semiconductor Product Catalog, or help deciding which documentation best meets your needs, visit the Digital Semiconductor World Wide Web Internet site: http://www.digital.com/semiconductor You can also call the Digital Semiconductor Information Line or the Digital Semiconductor Customer Technology Center.
Page 340
Digital Semiconductor Products
To order the Digital Semiconductor Alpha 21164PC microprocessor, contact your local distributor. The following table lists some of the semiconductor products available from Digital Semiconductor.
Note: The following products and order numbers might have been revised. For the latest versions, contact your local distributor.
Page 341
Title                                                                                   Order Number
SPICE Models for Alpha Microprocessors and Peripheral Chips: An Application Note        EC–QA4XE–TE
Alpha Microprocessors SROM Mini-Debugger User’s Guide                                   EC–QHUXC–TE
Alpha Microprocessors Motherboard Debug Monitor User’s Guide                            EC–QHUVE–TE
Alpha Microprocessors Motherboard Software Design Tools User’s Guide                    EC–QHUWC–TE

To purchase the Alpha AXP Architecture Reference Manual, contact your local distributor or call Butterworth-Heinemann (Digital Press) at 1-800-366-2665.
Page 343
Glossary
The glossary defines terms and spells out acronyms associated with the Alpha 21164PC microprocessor and chips in general.

abort
The unit stops the operation it is performing, without saving status, to perform some other operation.

ABT
Advanced bipolar/CMOS technology.

address space number (ASN)
An optionally implemented register used to reduce the need for invalidation of cached address translations for process-specific addresses when a context switch occurs....
Page 344
assert
To cause a signal to change to its logical true state.

AST
See asynchronous system trap.

asynchronous system trap (AST)
A software-simulated interrupt to a user-defined routine. ASTs enable a user process to be notified asynchronously, with respect to that process, of the occurrence of a specific event....
Page 345
BiSr
Built-in self-repair.

BiSt
Built-in self-test.

bit
Binary digit. The smallest unit of data in a binary notation system, designated as 0 or 1.

BIU
Bus interface unit. See CBU.

block exchange
Memory feature that improves bus bandwidth by paralleling a cache victim write-back with a cache miss fill....
Page 346
byte
Eight contiguous bits starting on an addressable byte boundary. The bits are numbered right to left, 0 through 7.

byte granularity
Memory systems are said to have byte granularity if adjacent bytes can be written concurrently and independently by different processes or processors.

cache
See cache memory....
Page 347
The Alpha 21164PC microprocessor contains two onchip internal caches. See also write-through cache and write-back cache.
Page 348
CMOS
Complementary metal-oxide semiconductor. A silicon device formed by a process that combines PMOS and NMOS semiconductor material.

conditional branch instructions
Instructions that test a register for positive/negative or for zero/nonzero. They can also test integer registers for even/odd.

control and status register (CSR)
A device or controller register that resides in the processor’s I/O space....
Page 349
direct memory access (DMA)
Access to memory by an I/O device that does not require processor intervention.

dirty
One status item for a cache block. The cache block is valid and has been written so that it may differ from the copy in system main memory.

dirty victim
Used in reference to a cache block in the cache of a system bus node....
Page 350
EPLD
Erasable programmable logic device.

external cache
A cache memory provided outside of the microprocessor chip, usually located on the same module. Also called board-level or module-level cache.

FEPROM
Flash-erasable programmable read-only memory. FEPROMs can be bank- or bulk-erased. Contrast with EEPROM.

FET
Field-effect transistor....
Page 351
granularity
A characteristic of storage systems that defines the amount of data that can be read and/or written with a single instruction, or read and/or written independently. VAX systems have byte or multibyte granularities, whereas disk systems typically have 512-byte or greater granularities. For a given storage device, a higher granularity generally yields a greater throughput....
Page 352
The term INTnn, where nn is one of 2, 4, 8, 16, 32, or 64, refers to a data field size of nn contiguous NATURALLY ALIGNED bytes. For example, INT4 refers to a NATURALLY ALIGNED longword.

internal processor register (IPR)
One of many registers internal to the Alpha 21164PC microprocessor.

IPGA
Interstitial pin grid array.

JFET
Junction field-effect transistor....
Page 353
LSI
Large-scale integration.

machine check
An operating system action triggered by certain system hardware-detected errors that can be fatal to system operation. Once triggered, machine check handler software analyzes the error.

MAF
Miss address file.

main memory
The large memory, external to the microprocessor, used for holding most instruction code and data....
Page 354
module
A board on which logic devices (such as transistors, resistors, and memory chips) are mounted and connected to perform a specific system function.

module-level cache
See external cache.

MOS
Metal-oxide semiconductor.

MOSFET
Metal-oxide semiconductor field-effect transistor.

MSI
Medium-scale integration.

MTU
Memory address translation unit. The logic unit within the 21164PC microprocessor that performs address translation, interfaces to the Dcache, and performs several other functions....
Page 355
NATURALLY ALIGNED data
Data stored in memory such that the address of the data is evenly divisible by the size of the data in bytes. For example, an ALIGNED longword is stored such that the address of the longword is evenly divisible by 4.

NMOS
N-type metal-oxide semiconductor....
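The NATURALLY ALIGNED data entry above reduces to a divisibility test on the address. A minimal sketch follows, assuming a 64-bit byte address. The function name `is_naturally_aligned` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if addr is evenly divisible by the size of the datum in bytes,
 * which is the glossary's definition of natural alignment; otherwise 0. */
int is_naturally_aligned(uint64_t addr, uint64_t size_bytes)
{
    assert(size_bytes != 0);
    return addr % size_bytes == 0;
}
```

For a longword (4 bytes), address 0x1000 passes the test and address 0x1002 does not, although 0x1002 is still aligned for a 2-byte datum.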
Page 356
parity
A method for checking the accuracy of data by calculating the sum of the number of ones in a piece of binary data. Even parity requires the correct sum to be an even number. Odd parity requires the correct sum to be an odd number.

PGA
Pin grid array....
Page 357
programmable array logic (PAL)
A device that can be programmed by a process that blows individual fuses to create a circuit.

PROM
Programmable read-only memory.

pull-down resistor
A resistor placed between a signal line and a negative voltage.

pull-up resistor
A resistor placed between a signal line and a positive voltage....
Page 358
reliability
The probability a device or system will not fail to perform its intended functions during a specified time interval when operated under stated conditions.

reset
An action that causes a logic unit to interrupt the task it is performing and go to its initialized state....
Page 359
fully associative organization, in which data from anywhere in main memory can be put anywhere in the cache. An “n-way set-associative” cache allows data from a given address in main memory to be cached in any of n locations.

SIMM
Single inline memory module....
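The set-associative description above can be illustrated by the usual address split: the block number selects one set, and the block may then occupy any of the n ways in that set. The following is a generic sketch, not a description of the 21164PC caches. The type `cache_geom`, the function names, and the geometry in the usage note are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cache geometry: bytes per block and number of sets. */
typedef struct {
    uint64_t block_bytes;
    uint64_t num_sets;
} cache_geom;

/* Set that a given address maps to; any of the n ways in it may hold it. */
uint64_t cache_set_index(const cache_geom *g, uint64_t addr)
{
    assert(g->block_bytes != 0 && g->num_sets != 0);
    return (addr / g->block_bytes) % g->num_sets;
}

/* Tag stored with the block to tell apart addresses that share a set. */
uint64_t cache_tag(const cache_geom *g, uint64_t addr)
{
    assert(g->block_bytes != 0 && g->num_sets != 0);
    return addr / (g->block_bytes * g->num_sets);
}
```

With an illustrative geometry of 32-byte blocks and 128 sets, address 0x1000 maps to set 0 with tag 1, and address 0x1020 maps to set 1. Setting `num_sets` to 1 gives the fully associative case, where every block competes for the same set.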
Page 360
superpipelined
Describes a pipelined machine that has a larger number of pipe stages and more complex scheduling and control. See also pipeline.

superscalar
Describes a machine architecture that allows multiple independent instructions to be issued in parallel during a given clock cycle.

tag
The part of a cache block that holds the address information used to determine if a memory operation is a hit or a miss on that cache block....
Page 361
UVPROM
Ultraviolet (erasable) programmable read-only memory.

valid
Allocated. Valid cache blocks have been loaded with data and may return cache hits when accessed.

victim
Used in reference to a cache block in the cache of a system bus node. The cache block is valid but is about to be replaced due to a cache block resource conflict....
Page 362
WRITE BLOCK
A transaction in which the 21164PC requests that an external logic unit process write data.

write data wrapping
System feature that reduces apparent memory latency by allowing write data cycles to differ from the usual low-to-high sequence. Requires cooperation between the 21164PC and external hardware....
Page 363
Index Associated documentation Abbreviations ASTER register 5-20 register access ASTRR register 5-20 Aborts 2-17 Absolute Maximum Rating ac coupling Bcache 2-13 addr_bus_req_h errors 4-57 description hit under READ MISS example 4-57 operation 4-38 4-45 interface addr_cmd_par_h introduction 4-2 to 4-6 operation 4-45 9-16...
Page 369
SIRR register 5-21 SL_RCV register 5-26 Queues SL_XMIT register 5-25 entry-pointer 2-34 Slotting 2-21 Specifications mechanical 11-1 Race conditions SROM 2-13 21164PC and system 4-51 srom_clk_h Race examples operation 5-25 9-16 9-19 9-20 12-1 idle_bc_h and cack_h 4-54 srom_data_h READ MISS transaction (no Bcache) 4-31 operation 5-26...
Page 370
tag_data_par_h Timing diagrams description 3-13 Bcache hit under READ MISS 4-57 operation 4-58 9-17 bus contention 4-45 FILL to private read or write 4-50 tag_dirty_h idle_bc_h and cack_h 4-54 description 3-13 READ MISS with idle_bc_h asserted 4-55 operation 9-16 READ MISS with victim 4-53 tag_ram_oe_l READ MISS with victim abort...