MIPS MIPS32 74Kf Programming Manual

Programming the MIPS32® 74K™ Core Family

Document Number: MD00541

Revision 02.14

March 30, 2011

Summary of Contents for MIPS MIPS32 74Kf

Page 1 Programming the MIPS32® 74K™ Core Family Document Number: MD00541 Revision 02.14 March 30, 2011...
Page 2 MIPS and MIPS’ affiliates do not assume any liability arising out of the application or use of this information, or of any error or omission in such information.
Page 3: Table Of Contents
Table of Contents
Chapter 1: Introduction, page 11
1.1: Chapters of this manual, page 12
1.2: Conventions, page 12
1.3: 74K™ core features, page 13
1.4: A brief guide to the 74K™ core implementation, page 14
1.4.1: Notes on pipeline overview diagram (Figure 1.1), page 14
1.4.2: Branches and branch delays...
Page 4
5.1: Hazard barrier instructions, page 67
5.2: MIPS32® Architecture Release 2 - enhanced interrupt system(s), page 68
5.2.1: Traditional MIPS® interrupt signalling and priority, page 69
5.2.2: VI mode - multiple entry points, interrupt signalling and priority, page 70
5.2.3: External Interrupt Controller (EIC) mode, page 70
5.3: Exception Entry Points...
Page 5
6.5.7: Delays caused by dependency on FPU status register fields, page 85
6.5.8: Slower operation in MIPS I™ compatibility mode, page 85
Chapter 7: The MIPS32® DSP ASE, page 87
7.1: Features provided by the MIPS® DSP ASE, page 87
7.2: The DSP ASE control register, page 88
7.2.1: DSP accumulators...
Page 6
B.3.4: The DDataLo, IDataHi and IDataLo registers, page 150
B.3.5: The ErrorEPC register, page 150
Appendix C: MIPS® Architecture quick-reference sheet(s), page 151
C.1: General purpose register numbers and names, page 151
C.2: User-level changes with Release 2 of the MIPS32® Architecture, page 151
C.2.1: Release 2 of the MIPS32®...
Page 7
List of Figures
Figure 1.1: Overview of The 74K™ Pipeline, page 14
Figure 2.1: Fields in the Config Register, page 22
Figure 2.2: Fields in the Config1 Register, page 23
Figure 2.3: Fields in the Config2 Register, page 23
Figure 2.4: Config3 Register Format, page 24
Figure 2.5: Config6 Register Format...
Page 8
Figure 8.15: Fields in the hardware breakpoint control registers (IBCn, DBCn), page 117
Figure 8.16: Fields in the TCBCONTROLE register, page 122
Figure 8.17: Fields in the TCBCONFIG register, page 123
Figure 8.18: Fields in the TraceControl Register, page 123
Figure 8.19: Fields in the TraceControl2 Register...
Page 9
List of Tables
Table 2.1: Roles of Config registers, page 21
Table 2.2: 74K™ core releases and PRId[Revision] fields, page 26
Table 3.1: Basic MIPS32® architecture memory map, page 29
Table 3.2: Fixed memory mapping, page 30
Table 3.3: Cache Code Values, page 34
Table 3.4: Operations on a cache line available with the cache instruction...
Page 10 Programming the MIPS32® 74K™ Core Family, Revision 02.14...
Page 11: Chapter 1: Introduction
More precisely, you should definitely be reading this manual if you have an OS, compiler, or low-level application which already runs on some earlier MIPS CPU, and you want to adapt it to the 74K core. So this document concentrates on where a MIPS 74K family core behaves differently from its predecessors.
Page 12: Chapters Of This Manual
74Kf™.
• Chapter 7, “The MIPS32® DSP ASE” on page 87: A brief summary of the MIPS DSP ASE (revision 2), available on members of the 74K core family.
• Chapter 8, “74K™ core features for debug and profiling” on page 102: the debug unit, performance counters and watchpoints.
Page 13: 74K™ Core Features
74K core in your design and shows up as the Config7[AR] bit. L2 (secondary) cache: you can configure your 74K core with MIPS Technologies’ L2 cache between 128Kbyte and 1Mbyte in size. Full details are in “MIPS® PDtrace™ Interface and Trace Control Block Specification”, MIPS Technologies document MD00439. Current revision is 4.30: you need revision 4 or greater to get multithreading trace information.
Page 14: A Brief Guide To The 74K™ Core Implementation
(even on cache hits, the data cannot be available for some number of instructions). Earlier MIPS Technologies cores had no real trouble with dependencies (dependent instructions, in almost all cases, can run in consecutive cycles).
Page 15 units and each has a phrase (in italics) summarizing what it does. The three-letter acronyms match those found in the detailed descriptions, and the pipeline stage names used in the detailed descriptions are across the top. To simplify the picture the integer multiply unit and the (optional) floating point unit have been omitted -...
Page 16 There are a few simple instructions where the ALU produces its results in one clock (they’re listed in Table 4.3), but most ALU instructions require two clocks: so, in the 74K core, dependent ALU instructions cannot usually be run back-to-back.
Page 17: 2: Branches And Branch Delays
Branch instructions are identified very early (in fact, they’re marked when instructions are fetched into the I-cache). MIPS branch and jump instructions (at least those not dependent on register values) are easy to decode, and the IFU decodes them locally to calculate the target address.
Page 18: 3: Loads And Load-To-Use Delays
Even short-pipeline MIPS CPUs can’t deliver load data to the immediately following instruction without a delay, even on a cache hit. Simple MIPS pipelines typically deliver the data one clock later: a one clock “load-to-use delay”. Compilers and programmers try to put some useful and non-dependent operation between the load and its first use.
Page 19: 4: Queues, Resource Limits And Consequences
It’s like the skewed pipeline which experts in MIPS Technologies’ 24K® family might remember, and has the same motivation: ALU operations dependent on recent loads are more common than loads dependent on recent ALU operations.
Page 21: Chapter 2: Initialization And Identity
Chapter 2 Initialization and identity
What happens when the CPU is first powered up? These functions are perhaps more often associated with a ROM monitor than an OS.
2.1 Probing your CPU - Config CP0 registers
The four registers Config and Config1-3 are 32-bit CP0 registers which contain information about the CPU’s capabilities.
Page 22: 1: The Config Register
DSP: reads 1 if I-side and/or D-side scratchpad (SPRAM) is fitted, see Section 3.6, "Scratchpad memory/SPRAM". (Don’t confuse this with the MIPS DSP ASE, whose presence is indicated by Config3[DSPP].)
UDI: reads 1 if your core implements user-defined "CorExtend" instructions. “CorExtend” is available on cores whose name ends in "Pro".
Page 23: 2: The Config1-2 Registers
2 “BAT” type
3 MIPS-standard fixed mapping
VI: 1 if the L1 I-cache is virtual (both indexed and tagged using virtual address). No contemporary MIPS Technologies core has a virtual I-cache.
K0: as described in the notes above on Config[K23] etc, this field determines the caching behaviour of the fixed kseg0 memory region.
Page 24: 3: The Config3 Register
Writing this bit controls a signal out to the L2 cache hardware. However, reading it does not read back what you just wrote: it reflects the value of a signal sent back from the L2 cache. With MIPS Technologies' L2 cache logic, that feedback signal will reflect the value you just wrote, with some implementation-dependent delay (it's unlikely to be...
Page 25: 4: The Config6 Register
UserLocal register, typically used by software threads packages.
DSP2P, DSPP: DSPP reads 1 if the MIPS DSP extension is implemented - as described in Chapter 7, “The MIPS32® DSP ASE” on page 87. If so, DSP2P reads 1 if your CPU conforms to revision 2 of the DSP ASE - as the 74K core does.
Page 26: 5: Cpu-Specific Configuration - Config7
CP0 features must have a new field.
PRId[Rev]: The revision number of the core design, used to index entries in errata lists etc. By MIPS Technologies’ convention the revision field is divided into three subfields: a major and minor number, plus a "patch" revision number that is nonzero for a release with no functional change.
Page 27
2.2.0 / 0x48: Allow up to 9 TCs, alias-free 64KB L1 D-cache option. August 31, 2006
2_2_1: 2.2.1 / 0x49: Enable use of MIPS SOC-it® L2 Cache Controller. October 12, 2006
2_3_*: 2.3.0 / 0x4c: Fewer interlocks round cache instructions, relocatable reset exception vector location. January 3, 2007
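The pairing of dotted revisions with hex values above (2.2.0 / 0x48, 2.2.1 / 0x49, 2.3.0 / 0x4c) is consistent with a 3-bit major, 3-bit minor and 2-bit patch split of the low byte of PRId. The sketch below decodes on that assumption; the type and function names are hypothetical and not taken from any MIPS header.

```c
#include <stdint.h>

/* Hypothetical helper type for illustration only. */
typedef struct {
    unsigned major;
    unsigned minor;
    unsigned patch;
} prid_rev_t;

/*
 * Assumed layout of the 8-bit PRId[Revision] field:
 *   bits 7:5 major, bits 4:2 minor, bits 1:0 patch.
 * This matches the table rows quoted above (for example 0x4c -> 2.3.0).
 */
static prid_rev_t prid_decode_revision(uint32_t prid)
{
    prid_rev_t r;
    uint32_t rev = prid & 0xFFu;     /* revision is the low byte of PRId */

    r.major = (rev >> 5) & 0x7u;
    r.minor = (rev >> 2) & 0x7u;
    r.patch = rev & 0x3u;
    return r;
}
```

Software that keys errata workarounds on the revision could compare the decoded triple rather than raw hex values, which keeps the comparison readable.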
Page 29: Chapter 3: Memory Map, Caching, Reads, Writes And Translation
A TLB-equipped core sees the memory map described by the [MIPS32] architecture (which will be familiar to anyone who has used a 32-bit MIPS architecture CPU), which is summarized in Table 3.1. The TLB gives you access to a full 32-bit physical address on the system interface. More information about the TLB in Section 3.8, "The TLB and...
Page 30: Fixed Mapping Option
The MIPS architecture permits implementations a fair amount of freedom as to the order in which loads and stores appear at the CPU interface. Most of the time anything goes: so long as the software behaves correctly, the MIPS architecture places few constraints on the order of reads and writes seen by some other agent in a system.
Page 31: 2: The "Sync" Instruction In 74K™ Family Cores
The WBB (Write Back Buffer) queue holds data waiting to be sent out over the system interface, either from D-cache writebacks or uncached/write-through store instructions. FSB (Fill Store Buffer) queue entries are used to hold data that is waiting to be written into the D-cache. An FSB entry gets used during a cache miss (when it holds the refill data), or a write which hits in the cache (when it holds the data the CPU wrote).
Page 32: 3: Write Gathering And "Write Buffer Flushing" In 74K™ Family Cores
But this section is about when caches aren’t invisible any more. Like most modern MIPS CPUs, the 74K core has separate primary I- and D-caches. They are virtually-indexed and physically-tagged, so you may need to deal with cache aliases, see Section 3.4.9, "Cache...
Page 33: 2: Cacheability Options
There’s very little in this manual about that option - see “MIPS® PDtrace™ Interface and Trace Control Block Specification”, MIPS Technologies document MD00439. Current revision is 4.30: you need revision 4 or greater to get multithreading trace information. [L2CACHE].
Page 34: 3: Uncached Accelerated Writes
Table 3.3 shows the code values used in EntryLo[C] - the same codes are used in the Config entries used to set the behavior of regions with fixed mappings (the latter are described in Table 3.2.) Some of the undefined cacheability code values are reserved for use in cache-coherent systems.
Table 3.3 Cache Code Values
Code Cached?
Page 35: 5: Cache Instructions And Cp0 Cache Tag/Data Registers
3.4.5 Cache instructions and CP0 cache tag/data registers
MIPS Technologies’ cores use different CP0 registers for cache operations targeted at different caches. That’s already quite confusing, but to make it more interesting these registers have somehow got different names - those used here...
Page 36: Table 3.4 Operations On A Cache Line Available With The Cache Instruction
Table 3.4 Operations on a cache line available with the cache instruction
Columns: Value, Command, What it does
Index invalidate: Sets the line to “invalid”. If it’s a D-cache or L2 cache line which is valid and “dirty” (has been written by CPU since fetched from memory), then write the contents back to memory first.
Page 37: 6: L1 Cache Instruction Timing
and in C header files. I hope Table 3.1 helps. In the rest of this document we’ll either use the full software name or (quite often) just talk of TagLo without qualification.
Table 3.1 Caches and their CP0 cache tag/data registers
Columns: Cache, CP0 Registers, CP0 number...
Page 38: 9: Cache Aliases
1; but if it’s Config7[IAR] There’s a fair amount of rather ugly code in the MIPS Linux kernel to work around aliases. D-cache aliases (in particular) are dealt with at the cost of quite a large number of extra invalidate operations.
Page 39: 10: Cache Locking
1. Refer to Config7[IVA] Section B.2.1 “The Config7 register” for details. The MIPS Technologies-supplied L2 cache (if configured) is physically indexed and physically tagged, so does not suffer from aliases.
3.4.10 Cache locking
[MIPS32] provides for a mechanism to lock a cache line so it can’t be replaced. This avoids cache misses on one particular piece of data, at the cost of reducing overall cache efficiency.
Page 40: 12: L23Taglo Register
PTagLo: the cache address tag - a physical address because the 74K core’s caches are physically tagged. It holds bits 31–12 of the physical address - the low 12 bits of the address are implied by the position of the data in the cache. ×...
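Since PTagLo holds physical address bits 31–12, checking whether a tag read back by an index-load-tag operation belongs to a given physical address is a masked compare. The sketch below assumes the field sits in bits 31:12 of the TagLo value, which the excerpt does not state outright; the macro and function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumption: PTagLo occupies bits 31:12 of the 32-bit TagLo value. */
#define PTAGLO_MASK 0xFFFFF000u

/* Extract the physical address tag bits from a TagLo value. */
static uint32_t taglo_ptag(uint32_t taglo)
{
    return taglo & PTAGLO_MASK;
}

/*
 * True when the tag in 'taglo' covers physical address 'paddr'.
 * The low 12 bits of the address are ignored, since they are implied
 * by where the line sits in the cache.
 */
static bool taglo_matches_paddr(uint32_t taglo, uint32_t paddr)
{
    return taglo_ptag(taglo) == (paddr & PTAGLO_MASK);
}
```

The low bits of TagLo carry state flags rather than address bits, which is why both sides are masked before comparing.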
Page 41: 15: Taglo Registers In Special Modes
Figure 3.5 L23DataHi Register Format (single field: DATA)
Table 3.6 L23DataHi Register Field Description
Columns: Name, Bit(s), Description, Read / Write, Reset State
DATA, 31:0: High-order data read from the cache data array. Reset state: Undefined
3.4.15 TagLo registers in special modes
The usual register fields are a view of the underlying cache tags.
Page 42: 17: Errctl Register
: the way-number of the cache entry where the error occurred. Caution: for the L1 caches (which are no more than 4-way set associative) this is a two-bit field. But an L2 cache might be more highly set-associative, and then this field grows down. In particular, MIPS’ (possibly 8-way set associative) L2 cache uses a 3-bit field as shown.
Page 43: Bus Error Exception
Figure 3.7 Fields in the ErrCtl Register (fields shown: PE, PO, WST, SPR, PCO, ITC, LBE, WABE, L2P, PCD, DYT, SE)
Two fields are ‘overflow’ from the CacheErr register and relate to the error state:
FE/SE: Used to detect nested errors.
Page 44: Scratchpad Memory/Spram
I- and D-side (ISPRAM and DSPRAM). MIPS Technologies provide the interface on which users can build many types and sizes of SPRAM. We also provide a “reference design” for both ISPRAM and DSPRAM, which is what is described here. If you keep the programming interface the same as the reference design, you’re more likely to be able to find software support.
Page 45: Figure 3.8: Spram (Scratchpad Ram) Configuration Information In Taglo
(but it will always be a multiple of 4KB). In some MIPS cores using this sort of tag setup there could be multiple scratchpad regions indicated by two or more of these tag pairs. But the reference design provided with the 74K core can only have one I-side and one D-side region.
Page 46: Common Device Memory Map
Don’t forget to set ErrCtl[SPR] back to zero when you’re done.
3.7 Common Device Memory Map
In order to preserve the limited CP0 register address space, many new architectural enhancements, particularly those requiring several registers, will be memory mapped, that is, accessed by uncached load and store instructions. In order to avoid creating dozens of memory regions to be managed, the common device memory map (CDMM) was created to group them into one region.
Page 47: The Tlb And Translation
3.8 The TLB and translation
The TLB is the key piece of hardware which MIPS architecture CPUs have for memory management. It’s a hardware array, and for maintenance you access fields by their index. For memory translation, it’s a real content-addressed memory, whose input is a virtual page address together with the “address space identifier”...
Page 48: 2: Live Translation And Micro-Tlbs
It costs six extra clocks to refill the ITLB for any access whose translation is not already present. In 74K family cores (unlike other cores from MIPS Technologies) there is no D-side micro-TLB - D-side translation uses the main TLB directly. uTLB entries can only map 4KB and 16KB pages (main TLB entries can handle a whole range of sizes from 4KB to 256MB).
Page 49: 4: Reading And Writing Tlb Entries - Entrylo0-1, Entryhi And Pagemask Registers
Of these: Index determines which TLB entry is accessed by tlbwi. It’s also used for the result of a tlbp (the instruction you use to see whether a particular address would be successfully translated by the CPU). Index only implements enough bits to index the TLB, however big that is;...
Page 50: 5: Tlb Initialization And Duplicate Entries
Since the TLB is a fully-associative array and entries are written by index, it’s possible to load duplicate entries - two or more entries which match the same virtual address/ASID. In older MIPS CPUs it was essential to avoid duplicate entries - even duplicate entries where all the entries are marked “invalid”.
Page 51: 6: Tlb Exception Handlers - Badvaddr, Context, And Contextconfig Registers
“kseg0” virtual addresses for the initial all-invalid entries. Most MIPS Technologies cores protect themselves and you by taking a “machine check” exception if a TLB update would have created a duplicate entry. Some earlier MIPS Technologies cores suffer a machine check even if duplicate entries are both invalid.
Page 52: Figure 3.15: Fields In The Context Register When Config3Ctxtc=1 Or Config3Sm=1
Config3[CTXTC]=0 and Config3[SM]=0, then the Context register is organized in such a way that the operating system can directly reference a 16-byte structure in memory that describes the mapping. For PTE structures of other sizes, the content of this register can be used by the TLB refill handler after appropriate shifting and masking.
Page 53: Figure 3.16: Fields In The Contextconfig Register
ContextConfig register is optional and its existence is denoted by the Config3[CTXTC] or Config3[SM] register fields. Figure 3.16 shows the format of the ContextConfig register.
Figure 3.16 Fields in the ContextConfig register
VirtualIndex is a mask of 0 to 32 contiguous 1 bits that cause the corresponding bits of the register to be writ-...
Page 55: Chapter 4: Programming The 74K™ Core In User Mode
Chapter 4 Programming the 74K™ core in user mode
This chapter is not very long, because in user mode one MIPS32-compliant CPU looks much like another. But not everything - sections include:
• Section 4.1, "User-mode accessible “Hardware registers”"
• Section 4.2, "Prefetching data": how it works.
Page 56: Prefetching Data
SYNCI_Step returns zero, that means that your hardware ensures that your caches are instruction/data coherent, and you don’t need to use synci at all.
• CC (2): user-mode read-only access to the CP0 Count register, for high-resolution counting. Which wouldn’t be much good without.
Page 57: The Multiplier
Figure B.3 and notes.
4.4 The multiplier
As is traditional with MIPS CPUs, the integer multiplier is a semi-detached unit with its own pipeline. All MIPS32 CPUs implement:
• : a 32×32 multiply of two GPRs (signed and unsigned versions) with a 64-bit result delivered in the...
Page 58: Tuning Software For The 74K™ Family Pipeline
Many of the most powerful instructions in the MIPS DSP ASE are variants of multiply or multiply-accumulate operations, and are described in Chapter 7, “The MIPS32® DSP ASE”...
Page 59: 2: Branch Delay Slot
The rationale for this is that it’s extremely difficult to fetch the branch target quickly enough to avoid a delay, so the extra instruction runs “for free”... Most of the time, the compiler deals well with this single delay slot. MIPS low-level programmers find it odd at first, but you get used to it!
4.6 Tuning floating-point...
Page 60: Branch Misprediction Delays
4.7 Branch misprediction delays
In a long-pipeline design like this, branches would be expensive if you waited until the branch was executed before fetching any more instructions. See Section 1.4 “A brief guide to the 74K™ core implementation” for what is done about this: but the upshot is that where the fetch logic can’t compute the target address, or guesses wrong, that’s going to cost 12 or more lost cycles (since when we’re not blocked on a cache miss we hope to average substantially more...
Page 61: Data Dependency Delays
(and there are notes about it, above). The MIPS instruction set is efficient for short pipelines because, most of the time, dependent instructions can be run nose-to-tail, just one clock apart, without extra delay. Even in the more sophisticated 74K family CPUs, most dependent instructions can run just two clocks apart.
Page 62: Table 4.2 Register → Eager Consumer Delays
Table 4.2 Register → eager consumer delays
Columns: Reg → Eager consumer, delay, Applies when...
GPR → load/store, 1: the GPR value is an address operand. Store data is not needed early.
ACC → multiply instructions, 3: the ACC value came from any multiply instruction which saturates the accumulator value.
Page 63: Table 4.3: Producer → Register Delays
Table 4.3 Producer → register delays
Columns: Lazy producer → Reg, Applies when...
Producers:
All bitwise logical instructions, including immediate versions
addu rd,rs,$0 (add zero, aka mov)
sll with shift amount 8 or less
Applies when: These instructions only are “not lazy”: their result can be used in the next cycle by any ALU instruction.
Page 64: 1: More Complicated Dependencies
The access is interlocked, and will lead to a delay of up to three clocks. We don’t expect that to be a problem (but if you know different, please get in touch with MIPS Technologies).
4.10.1 More complicated dependencies
There can be delays which are dependent on the dynamic allocation of resources inside the CPU.
Page 65: Advice On Tuning Instruction Sequences (Particularly Dsp)
You can often avoid this by using the “masked” versions of these instructions to read or write only the field you’re particularly interested in.
4.12 Multiply/divide unit and timings
As is traditional with MIPS CPUs, the integer multiplier is a semi-detached unit with its own pipeline. All MIPS32 CPUs implement: •...
Page 66 No multiply/divide operation ever produces an exception - even divide-by-zero is silent - compilers typically insert explicit check code where it’s required. Timing varies. Multiply-accumulate instructions (there are many different flavors of MAC in the DSP ASE) have been pipelined and tuned to achieve a 1-instruction-per-clock repeat rate, even for sequences of instructions targeting the same accumulator.
Page 67: Chapter 5: Kernel-Mode (Os) Programming And Release 2 Of The Mips32® Architecture
But they’re most often met around CP0 read/writes, so they found their way to this chapter. Traditionally, MIPS CPUs left the kernel/low-level software engineer with the job of designing sequences which are guaranteed to run correctly, usually by padding the dangerous operation with enough nop or ssnop instructions.
Page 68: Mips32® Architecture Release 2 - Enhanced Interrupt System(S)
ducer affects even the instruction fetch of the consumer - that’s an “instruction hazard” - or only affecting the operation of the consuming instruction (an “execution hazard”). Hazard barriers come in two strengths: ehb deals only with execution hazards, while eret, jr.hb and jalr.hb are barriers to both kinds of hazard.
Page 69: 1: Traditional Mips® Interrupt Signalling And Priority
5.2.1 Traditional MIPS interrupt signalling and priority
Before we discuss the new features, we should remind you what was there already. On traditional MIPS systems the CPU takes an interrupt exception on any cycle where one of the eight possible interrupt sources visible in Cause[IP] is active, enabled by the corresponding enable bit in Status[IM], and not otherwise inhibited.
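The condition described above can be written as a small predicate on the Status and Cause values. This is a sketch under assumptions not stated in the excerpt: it uses the standard MIPS32 field positions (Cause[IP] and Status[IM] in bits 15:8, Status[IE] in bit 0, Status[EXL] in bit 1, Status[ERL] in bit 2) and treats "not otherwise inhibited" as just IE set with EXL and ERL clear, ignoring debug mode. The names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumed standard MIPS32 bit positions. */
#define ST_IE     (1u << 0)   /* global interrupt enable */
#define ST_EXL    (1u << 1)   /* exception level: inhibits interrupts */
#define ST_ERL    (1u << 2)   /* error level: inhibits interrupts */
#define INT_SHIFT 8           /* IP7..0 and IM7..0 both live in bits 15:8 */
#define INT_MASK  0xFFu

/* Returns true when an interrupt exception would be taken. */
static bool interrupt_would_be_taken(uint32_t status, uint32_t cause)
{
    uint32_t pending = (cause  >> INT_SHIFT) & INT_MASK;  /* Cause[IP]  */
    uint32_t enabled = (status >> INT_SHIFT) & INT_MASK;  /* Status[IM] */

    if ((pending & enabled) == 0)
        return false;                     /* nothing active and enabled */
    if ((status & ST_IE) == 0)
        return false;                     /* interrupts globally off */
    if (status & (ST_EXL | ST_ERL))
        return false;                     /* already in a handler */
    return true;
}
```

Note that nothing here ranks the eight sources: in this traditional scheme any priority between them is decided by software in the handler.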
Page 70: 2: Vi Mode - Multiple Entry Points, Interrupt Signalling And Priority
The original MIPS32 specification adds an option to this. If you set the Cause[IV] bit, the same priority-blind interrupt handling happens but control is passed to an interrupt exception entry point which is separate from the general exception handler.
Page 71: Exception Entry Points
fixed entry points. But there were already complications. When a CPU starts up main memory is typically random and the MIPS caches are unusable until initialized; so MIPS CPUs start up in uncached ROM memory space and the exception entry points...
Page 72: 1: Summary Of Exception Entry Points
BASE is 0x8000.0000, as it will be where the software, ignoring the EBase register, leaves it at its power-on value - that’s also compatible with older MIPS CPUs. Otherwise BASE is the 4Kbyte-aligned address found in EBase after you ignore the low 12 bits...
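Deriving BASE from an EBase value is a single mask operation, as the text describes. A minimal sketch follows; the function and macro names are hypothetical, and reading the register itself (an mfc0 on real hardware) is left out so the arithmetic can be run anywhere.

```c
#include <stdint.h>

/* Power-on value of BASE quoted in the text (0x8000.0000). */
#define EBASE_POWER_ON_BASE 0x80000000u

/* BASE is the EBase value with its low 12 bits ignored (4Kbyte aligned). */
static uint32_t exception_base_from_ebase(uint32_t ebase)
{
    return ebase & ~0xFFFu;
}
```

For example, an EBase value of 0x80012003 yields a BASE of 0x80012000; the discarded low bits are not part of the address.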
Page 73: Shadow Registers
On to the details... MIPS shadow registers come as one or more extra complete sets of 32 general purpose registers. The CPU only changes register sets on an exception or when returning from an exception with eret.
Page 74: Figure 5.4 Fields In The Srsmap Register
If the CPU is not in EIC mode, this field reads zero. In VI mode (no external interrupt controller, Config3[VInt] reads 1 and Cause[IV] has been set 1) the core sees only eight possible interrupt numbers; the SRSMap register contains eight 4-bit fields defining the register set to use for each of the eight interrupt levels.
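With eight 4-bit fields packed into one 32-bit register, picking out or changing the shadow set for one interrupt level is a shift and mask. The sketch below assumes the field for level 0 sits in bits 3:0 and the field for level 7 in bits 31:28, which the excerpt does not spell out; the function names are hypothetical.

```c
#include <stdint.h>

/* Shadow set number selected for interrupt level 0..7 (assumed: level n in bits 4n+3:4n). */
static unsigned srsmap_get_set(uint32_t srsmap, unsigned level)
{
    return (srsmap >> (4u * (level & 7u))) & 0xFu;
}

/* Return a new SRSMap value with the given level mapped to shadow set 'set'. */
static uint32_t srsmap_with_set(uint32_t srsmap, unsigned level, unsigned set)
{
    unsigned shift = 4u * (level & 7u);

    srsmap &= ~(0xFu << shift);                  /* clear the old field */
    srsmap |= ((uint32_t)(set & 0xFu)) << shift; /* insert the new one  */
    return srsmap;
}
```

A kernel might, for instance, build the value with srsmap_with_set(0, 7, 1) to give only the highest interrupt level its own register set, then write it with mtc0.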
Page 75: Saving Power
If you are remaining with “classic” interrupt mode, it’s still possible to use one shadow set for all exception handlers - including interrupt handlers - by setting SRSCtl[ESS] non-zero. In “EIC” interrupt mode, this register has no effect and the shadow set number to be used is determined by an input bus from the interrupt controller.
Page 76 5.6 The HWREna register - Control user rdhwr access
HWREna[CCRes]: Set this bit 1 so a user-mode rdhwr 3 can determine whether Count runs at the full clock rate or some divisor.
HWREna[CC]: Set this bit 1 so a user-mode rdhwr 2 can read out the value of the Count register.
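A kernel that wants to let user programs time themselves would set both bits described above. The sketch below only builds and queries the mask; it assumes that bit n of HWREna gates user-mode rdhwr n (so CC is bit 2 and CCRes is bit 3), and the macro and function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumption: HWREna bit n enables user-mode "rdhwr n". */
#define HWRENA_CC     (1u << 2)   /* rdhwr 2: Count value        */
#define HWRENA_CCRES  (1u << 3)   /* rdhwr 3: Count resolution   */

/* Mask a kernel might write to HWREna to allow user-mode cycle counting. */
static uint32_t hwrena_for_user_timing(void)
{
    return HWRENA_CC | HWRENA_CCRES;
}

/* True when user-mode rdhwr of register 'regnum' (0..31) is permitted. */
static bool hwrena_allows(uint32_t hwrena, unsigned regnum)
{
    return ((hwrena >> (regnum & 31u)) & 1u) != 0;
}
```

Enabling CC without CCRes leaves user code able to read Count but unable to convert it to clock cycles, so the two bits are normally set together.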
Page 77: Chapter 6: Floating Point Unit
• Omits “paired single” and MIPS-3D extensions: those are primarily aimed at 3D graphics, and are described as optional in [MIPS64V2]. • Uses an autonomous 7-stage pipeline: all data transfers are interlocked, so the programmer is never aware of the pipeline.
Page 78: Basic Instruction Set
When 32-bit data is held in a 64-bit register, the high 32 bits are don’t care. The MIPS Architecture’s 32-bit and 64-bit floating point formats are compatible with the definitions of “single precision” and “double precision” in [IEEE754].
Page 79: Floating Point Loads And Stores
6.4.2 FPU “unimplemented” exceptions (and how to avoid them)
It’s a long-standing feature of the MIPS Architecture that FPU hardware need not support every corner-case of the IEEE standard. But to ensure proper IEEE compatibility to the software system, an FPU which can’t manage to generate the correct value in every case must detect a combination of operation and operands it can’t do right.
Page 80: 3: Fpu Control Register Maps
F64/L/W/D/S: this is a 64-bit floating point unit and implements 64-bit integer (“L”), 32-bit integer (“W”), 64-bit FP double (“D”) and 32-bit FP single (“S”) operations.
• 3D: does not implement the MIPS-3D ASE.
• PS: does not implement the paired-single instructions described in [MIPS64V2]
•...
Page 81: Figure 6.3 Floating Point Control/Status Register And Alternate Views
FCSR[FS,FO,FN] == [0,0,0]. To get the best performance compatible with a guarantee of no “unimplemented” exceptions, set FCSR[FS,FO,FN] == [1,1,1]. Just occasionally for legacy applications developed with older MIPS CPUs which did not have the options, you might set FCSR[FS,FO,FN] == [1,0,0].
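The three settings named above differ only in three bits of FCSR. The sketch below shows how software might compute a new FCSR value for each; the bit positions (FS in bit 24, FO in bit 22, FN in bit 21) are an assumption not given in this excerpt and should be checked against the FCSR figure, and the names are hypothetical.

```c
#include <stdint.h>

/* Assumed FCSR bit positions; verify against the FCSR register figure. */
#define FCSR_FS (1u << 24)
#define FCSR_FO (1u << 22)
#define FCSR_FN (1u << 21)

/*
 * Return 'fcsr' with the FS/FO/FN bits replaced by the given 0/1 values,
 * leaving every other field (rounding mode, enables, flags) untouched.
 */
static uint32_t fcsr_with_fs_fo_fn(uint32_t fcsr, int fs, int fo, int fn)
{
    fcsr &= ~(FCSR_FS | FCSR_FO | FCSR_FN);
    if (fs) fcsr |= FCSR_FS;
    if (fo) fcsr |= FCSR_FO;
    if (fn) fcsr |= FCSR_FN;
    return fcsr;
}
```

With this helper, fcsr_with_fs_fo_fn(old, 1, 1, 1) gives the "no unimplemented exceptions" setting and fcsr_with_fs_fo_fn(old, 1, 0, 0) the legacy one; the result would then be written back with a ctc1 instruction.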
Page 82: Fpu Pipeline And Instruction Timing
- even 1/3 is inexact in binary. Then the:
– Enables field is "write 1 to take a MIPS exception if this condition occurs" - rarely done. With the IEEE exception-catcher disabled, the hardware/emulator together will provide a suitable exceptional result.
– Cause field records what if any conditions occurred in the last-executed FP instruction.
Page 84: 1: Fpu Register Dependency Delays
6.5.4 Delays when main pipeline waits for FPU to decide not to take an exception
The MIPS architecture requires FP exceptions to be “precise”, which (in particular) means that no instruction after the FP instruction causing the exception may do anything software-visible. That means that an FP instruction in the main pipeline may not be committed, nor leave the main pipeline, until the FPU can either report the exception, or confirm that the instruction will not cause an exception.
Page 85: 5: Delays When Main Pipeline Waits For Fpu To Accept An Instruction
Software written for those old CPUs is incompatible with the full modern FPU, so there’s a compatibility bit provided, Status[FR] - set zero to use MIPS I compatible code. This comes at the cost of slower repeat rates for FP instructions, because in compatibility mode not all the bypasses shown in the pipeline diagram above are active.
Page 87: Chapter 7: The Mips32® Dsp Ase
The MIPS32® DSP ASE
The MIPS DSP ASE is provided to accelerate a large range of DSP algorithms. You can get most programming information from this chapter. There’s more detail in the formal DSP ASE specification [MIPSDSP], but expect to read through lots of material aimed at hardware implementors.
Page 88: The Dsp Ase Control Register
Data” or vector operations, where the same arithmetic operation is applied in parallel to several sets of operands. In the MIPS DSP ASE, some operations are SIMD type - two 16-bit operations or four 8-bit operations are carried out in parallel on operands packed into a single 32-bit general-purpose register. Instructions operating on vectors can be recognized because the name includes .ph (paired-half, usually signed, often fractional) or .qb...
Page 89: 1: Dsp Accumulators
You can find out if your core supports the DSP ASE by testing the Config3[DSPP] bit (see notes to Figure 2.4). Then you need to enable use of instructions from the MIPS DSP ASE by setting Status[MX] to 1.
Page 90: Dsp Instructions
7.4 DSP instructions
The DSP instruction set is nothing like the regular and orthogonal MIPS32 instruction set. It’s a collection of special-case instructions, in many cases aimed at the known hot-spots of important algorithms. We’ll summarize the instructions under headings, but then list all of them in Section 7.2, "DSP instructions in alphabetical order", an alphabetically-ordered list which provides a terse but usually-sufficient description of what...
Page 91: 2: Arithmetic - 64-Bit
7.4.2 Arithmetic - 64-bit
addsc/addwc generate and use a carry bit, for efficient 64-bit add.
7.4.3 Arithmetic - saturating and/or SIMD Types
• 32-bit signed saturating arithmetic: addq_s.w, subq_s.w and absq_s.w.
• Paired-half and quad-byte SIMD arithmetic: perform the same operation simultaneously on both 16-bit halves or all four 8-bit bytes of a 32-bit register....
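As an illustration of what "32-bit signed saturating arithmetic" means, the following C sketch models the result value of a saturating add such as addq_s.w. The function name is hypothetical, and the model deliberately ignores any overflow flag the hardware may record in DSPControl.

```c
#include <stdint.h>

/* Hypothetical model of a 32-bit signed saturating add: the sum is
 * computed in 64 bits and clamped to the int32 range instead of
 * wrapping around. */
int32_t addq_s_w_model(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + (int64_t)b;
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
}
```

Clamping rather than wrapping is what keeps a signal-processing loop from turning a slightly-too-loud sample into a large value of the opposite sign.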
Page 92: 6: Conversions To And From SIMD Types
7.4.6 Conversions to and from SIMD types
Conversion operations from larger to smaller fractional types have names which start “precrq...” for “precision reduction, fractional”. Conversion operations from smaller to larger have names which start “prece...” for “precision expansion”.
Page 93: 8: Multiply Q15s From Paired-Half And Accumulate
The mask bits tie up with fields like this:
Table 7.1 Mask bits for instructions accessing the DSPControl register
Mask Bit; DSPControl field: scount, ouflag, ccond
22. Well, an integer instruction is also included in the MIPS SmartMIPS™ ASE.
Page 94: 11: Accumulator Access Instructions
7.4.11 Accumulator access instructions
• Historical instructions which now access new accumulators: the familiar mfhi/mflo/mthi/mtlo instructions now take an optional extra accumulator-number parameter.
• Shift and move to general register: extr.w/extr_r.w/extr_rs.w gets a 32-bit field from an accumulator (starting at any bit from 0 to 31) and puts the value in a general purpose register....
Page 95: 13: Other DSP ASE Instructions
• Q15 dot product from paired-half, and accumulate: dpaq_s.w.ph does a SIMD multiply of the Q15 halves of the operands, then adds the results and saturates to form a Q31 fraction, which is accumulated into a Q32.31 fraction in the accumulator....
Page 96: Almost Alphabetically-Ordered Table Of DSP ASE Instructions
typedef long long int64;
typedef int int32;
/* accumulator type */
typedef signed long long q32_31;
typedef signed int q31;
#define MAX31 0x7FFFFFFF
#define MIN31 (-MAX31 - 1)  /* avoids the signed overflow in -(1<<31) */
#define SAT31(x) ((x) > MAX31 ? MAX31 : (x) < MIN31 ? MIN31 : (x))
typedef signed short q15;...
Page 97
Table 7.2 DSP instructions in alphabetical order
Instruction: addsc rd,rs,rt and addwc rd,rs,rt
Description: Add setting carry, then add with carry. The carry bit is kept in DSPControl[c]. So to add the 64-bit values in registers yhi/ylo, zhi/zlo to produce a 64-bit value in xhi/xlo, just do: addsc xlo, ylo, zlo;...
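The carry-chained 64-bit add described in this table entry can be modelled in C as below. This is an illustrative sketch: the struct standing in for DSPControl[c] and all function names are hypothetical, and only the carry behaviour described above is modelled.

```c
#include <stdint.h>

/* Hypothetical stand-in for the single carry bit kept in DSPControl[c]. */
typedef struct { uint32_t carry; } dsp_control_model;

/* "Add setting carry": 32-bit add that records the carry-out. */
uint32_t addsc_model(dsp_control_model *dc, uint32_t rs, uint32_t rt)
{
    uint32_t sum = rs + rt;
    dc->carry = (sum < rs);        /* unsigned wrap-around means carry-out */
    return sum;
}

/* "Add with carry": 32-bit add that consumes the recorded carry. */
uint32_t addwc_model(const dsp_control_model *dc, uint32_t rs, uint32_t rt)
{
    return rs + rt + dc->carry;
}

/* 64-bit add built from the two steps: low words first, then high words. */
uint64_t add64_model(uint32_t yhi, uint32_t ylo, uint32_t zhi, uint32_t zlo)
{
    dsp_control_model dc = { 0 };
    uint32_t xlo = addsc_model(&dc, ylo, zlo);
    uint32_t xhi = addwc_model(&dc, yhi, zhi);
    return ((uint64_t)xhi << 32) | xlo;
}
```

The order matters: the low-word add must run first because the high-word add depends on the carry it leaves behind.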
Page 98
Table 7.2 DSP instructions in alphabetical order
Instruction: extr.w rt,ac,shift and extr_r.w rt,ac,shift
Description: Extracts a bit field from an accumulator into a general purpose register. The LS bit of the extracted field can start anywhere from bit zero to 31 of the accumulator: int64 ac;...
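A small C model may help picture the plain extr.w case: pick 32 bits out of the 64-bit accumulator, starting at bit position shift. The function name is hypothetical, and the rounding (extr_r.w) and saturating variants are intentionally not modelled.

```c
#include <stdint.h>

/* Hypothetical model of the plain extract: return bits
 * [shift+31 : shift] of the accumulator, with shift in 0..31.
 * The shift is done on the unsigned bit pattern, so no
 * implementation-defined signed right shift is involved; the final
 * conversion to int32_t reinterprets the low 32 bits as signed, which
 * is the behaviour of mainstream two's-complement compilers. */
int32_t extr_w_model(int64_t ac, unsigned shift)
{
    shift &= 31u;
    return (int32_t)(uint32_t)((uint64_t)ac >> shift);
}
```

Because the highest bit read is bit 62 at most, a logical shift of the bit pattern yields the same 32 result bits as an arithmetic shift would.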
Page 99
Table 7.2 DSP instructions in alphabetical order
Instruction: mulsaq_s.w.ph ac,rs,rt
Description: ac += (LEFT_H(rs)*LEFT_H(rt)) - (RIGHT_H(rs)*RIGHT_H(rt)); The multiplications are done to Q31 values, saturated if they overflow (which is only possible when -1×-1 makes +1). The accumulator is really a Q32.31 value, so is unlikely to overflow;...
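The formula in this entry can be followed step by step in C. The sketch below is a model under assumptions, not the architectural definition: the function names are hypothetical, a Q15 by Q15 product is taken to be the integer product doubled to give Q31, and accumulator overflow is not modelled.

```c
#include <stdint.h>

/* Q15 x Q15 -> Q31 multiply. The only overflowing case is -1 x -1,
 * which saturates to the largest Q31 value. Doubling by multiplication
 * avoids left-shifting a negative number. */
int32_t mul_q15_sat(int16_t a, int16_t b)
{
    if (a == INT16_MIN && b == INT16_MIN)
        return INT32_MAX;
    return (int32_t)a * (int32_t)b * 2;
}

/* Model of ac += LEFT*LEFT - RIGHT*RIGHT on packed Q15 halves.
 * The uint16_t -> int16_t conversions reinterpret the bit pattern
 * (two's-complement compilers assumed). */
int64_t mulsaq_s_w_ph_model(int64_t ac, uint32_t rs, uint32_t rt)
{
    int16_t ls = (int16_t)(uint16_t)(rs >> 16);
    int16_t lt = (int16_t)(uint16_t)(rt >> 16);
    int16_t rs_lo = (int16_t)(uint16_t)(rs & 0xFFFFu);
    int16_t rt_lo = (int16_t)(uint16_t)(rt & 0xFFFFu);
    return ac + (int64_t)mul_q15_sat(ls, lt) - (int64_t)mul_q15_sat(rs_lo, rt_lo);
}
```

For example, with 0x4000 (0.5 in Q15) in both left halves and zero in the right halves, the accumulator grows by 0x20000000, which is 0.25 in Q31.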
Page 100: DSP ASE Instruction Timing
Table 7.2 DSP instructions in alphabetical order
Instruction: shll.ph rd, rt, sa; shllv.ph rd, rt, rs; shll_s.ph rd, rt, sa; shllv_s.ph rd, rt, rs
Description: 2×SIMD (paired-half) shift left. The “v” versions take the shift amount from a register, and the “_s” versions saturate the result to a signed 16-bit range....
Page 102: Chapter 8: 74K™ Core Features For Debug And Profiling
8.1.5, "The “dseg” memory decode region".
• A distinguished debug exception. In MIPS EJTAG, this is a special “super-exception” marked by a special debug-exception-level flag, so you can use an EJTAG debugger even on regular exception handler code. See Section 8.1.2, "Debug mode"...
Page 103: 1: Debug Communications Through JTAG
Table 8.1 JTAG instructions for the EJTAG unit
JTAG “Instruction”; Description
IDCODE; Reads out the MIPS core and revision - not very interesting for software, not described further here.
Reads bit-field showing what EJTAG options are implemented - see Figure 8.5 below....
Page 104: 3: Exceptions In Debug Mode
Table 8.2. The MIPS trace solution provides software the ability to access the on-chip trace memory. The TCB Registers are mapped to drseg space and this allows software to directly access the on-chip trace memory using load and store instructions.
Page 105: Table 8.2: EJTAG Debug Memory Region Map ("dseg")
Table 8.2 EJTAG debug memory region map (“dseg”)
Region/sub-regions; Virtual Address; Location/register
kseg2; 0xE000.0000 to 0xFF1F.FFFF
dseg, dmseg; 0xFF20.0000 to 0xFF20.000F; fastdata
dseg, dmseg; 0xFF20.0010 to 0xFF2F.FFFF; debug entry at 0xFF20.0200
dseg, drseg; 0xFF30.0000; DCR register...
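As a sketch of how software might use this map, the C function below classifies a virtual address into the dmseg or drseg sub-region. The lower bounds come from the table; the upper bound of drseg is not visible in this excerpt, so 0xFF3F.FFFF is an assumption, and the enum and function names are hypothetical.

```c
#include <stdint.h>

typedef enum { REGION_OTHER, REGION_DMSEG, REGION_DRSEG } debug_region;

/* Classify a virtual address against the "dseg" sub-regions.
 * dmseg: 0xFF20.0000 to 0xFF2F.FFFF (from the table).
 * drseg: starts at 0xFF30.0000; the end address used here is an
 * assumption, not taken from the excerpt. */
debug_region classify_debug_address(uint32_t va)
{
    if (va >= 0xFF200000u && va <= 0xFF2FFFFFu)
        return REGION_DMSEG;
    if (va >= 0xFF300000u && va <= 0xFF3FFFFFu)
        return REGION_DRSEG;
    return REGION_OTHER;
}
```

Remember that these regions only decode this way in debug mode; outside debug mode the same addresses are ordinary mapped kernel space.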
Page 106: 6: EJTAG CP0 Registers, Particularly Debug
• dseg: is the whole debug-mode-only memory area. It’s possible for debug-mode software to read the “kseg2”-mapped locations “underneath” by setting Debug[LSNM] (see Figure 8.1).
• dmseg: is the memory region where reads and writes are implemented by the probe. But if no active probe is plugged in, or if DCR[PE] is clear, then accesses here cause reads and writes to be handled like regular “kseg3”...
Page 107: Figure 8.1 Fields In The EJTAG CP0 Debug Register
field is undefined. The value will be one of those defined for Cause[ExcCode], as shown in Table B.5.
NoSSt : read-only - reads 0 because single-step is implemented (it always is on MIPS Technologies cores).
SSt : set 1 to enable single-step.
Page 108: 7: The DCR (Debug Control) Memory-Mapped Register
Figure 8.2 Exception cause bits in the Debug register: DDBSImpr, DDBLImpr, DINT, DIB, DDBS, DDBL, DBp, DSS
DDBSImpr : imprecise store breakpoint - see Section 8.1.13, "Imprecise debug breaks" below. DEPC probably points to an instruction some time later in sequence than the store which triggered the breakpoint....
Page 109: Figure 8.4 Fields In The Memory-Mapped DCR (Debug Control) Register
Figure 8.4 Fields in the memory-mapped DCR (debug control) register: PCno, FDCI, PCIM, DASQ, DASe, DAS, ASID, RDVec, CBT, PCSE, INTE, NMIE, NMIP, SRE
Where:
: (read only) reports CPU endianness (1 == big).
FDCImpl : (read only) 1 if the Fast Debug Channel is available....
Page 110: 8: The DebugVectorAddr Memory-Mapped Register
8.1.8 The DebugVectorAddr memory-mapped register
This is another memory-mapped EJTAG register. It’s found in “drseg” at location 0xFF30.0020 as shown in Table (but only accessible if the CPU is in debug mode). The fields are in Figure 8.5: RdVec...
Page 111: Figure 8.7 Fields In The JTAG-Accessible EJTAG_CONTROL Register
MIPS16 : 1 because the 74K core always supports the MIPS16 instruction set extension.
NoDMA : 1 - MIPS Technologies cores do not provide EJTAG "DMA" (which would allow a probe to directly read and write anything attached to the 74K core’s OCP interface).
MIPS32/64 : the zero indicates this is a 32-bit CPU....
Page 112: 10: Fast Debug Channel
Psz : (read-only) when software reads or writes "dmseg" this tells the probe whether it was a word, byte or whatever-size transfer:
Columns: Byte-within-word address EJTAG_ADDRESS[1-0]; Size code EJTAG_CONTROL[Psz]; Transfer Size
Transfer sizes: Byte; Halfword; Word; Tri-byte (lowest address 3 bytes); Tri-byte (highest address 3 bytes)
Doze/Halt : (read-only) indicates CPU not fully awake....
Page 113: Figure 8.8 Fast Debug Channel
ing access to the transmit (core to probe) and receive FIFOs. These FIFOs are included to isolate the software visible interface from the physical transfer of bits to the probe and allow some ‘burstiness’ of data. Associated with each 32-bit piece of data is a 4-bit Channel ID....
Page 114: Figure 8.10: Fields In The FDC Config (FDCFG) Register
Where:
DevID : (read only) indicates the device ID - 0xfd in this case.
DevSize : (read only) indicates how many 64B blocks (minus 1) this device uses - value of 2, indicating 3 blocks.
: (read only) Revision number of the device - currently 0....
Page 115: 11: EJTAG Breakpoint Registers
. The latter has, in theory, two extra fields (bits 29-28) used to flag implementations which can’t do a load/store break conditional on the data value. However, MIPS cores with hardware breakpoints always include the value check, so these bits read zero anyway. So the registers are as shown in Figure 8.14.
Page 116: Figure 8.14 Fields In The IBS/DBS (EJTAG Breakpoint Status) Registers
Figure 8.14 Fields in the IBS/DBS (EJTAG breakpoint status) registers: ASIDsup, BCN = 2 with BS1-0, BCN = 4 with BSD3-0
Where:
ASIDsup : is 1 if the breakpoints can use ASID matching to distinguish addresses from different address spaces; on the 74K core that’s available if and only if a TLB is fitted....
Page 117: 12: Understanding Breakpoint Conditions
Figure 8.15 Fields in the hardware breakpoint control registers (IBCn, DBCn)
DBCn fields: ASIDuse, BAI7-0, NoSB, NoLB, BLM7-0, TE, BE
IBCn fields: ASIDuse, TE, BE
The fields are:
ASIDuse : set 1 to compare the ASID as well as the address.
BAI7-0 : "byte (lane) access ignore"...
Page 118: 13: Imprecise Debug Breaks
Most exceptions in MIPS architecture CPUs are precise. But because of the way the 74K core optimizes loads and stores by permitting the CPU to run on at least until it needs to use the data from a load, data breakpoints which filter on the data value are imprecise.
Page 119: 15: JTAG-Accessible And Memory-Mapped PDtrace TCB Registers
used intrusive interrupt-based PC-sampling for many years, so there are tools which can readily interpret this sort of data. When PC sampling is configured in your core, it runs continuously. Some sleight of hand is used if the CPU is hanging on a wait instruction....
Page 120
Table 8.5 Mapping TCB Registers in drseg (Continued)
Register Name; Offset in drseg; Description
TCBTW; 0x3100; Trace Word read register. This register holds the Trace Word just read from on-line trace memory.
TCBRDP; 0x3108; Trace Word Read pointer....
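Turning a drseg offset from this table into the address a load or store would use is a simple addition. The sketch below assumes that offsets are relative to the start of drseg at 0xFF30.0000 (where the memory map places the DCR register); the macro and function names are hypothetical.

```c
#include <stdint.h>

/* Assumed base of drseg, taken from the "dseg" memory map. */
#define DRSEG_BASE     0xFF300000u

/* Offsets listed in the TCB register table. */
#define TCBTW_OFFSET   0x3100u   /* Trace Word read register */
#define TCBRDP_OFFSET  0x3108u   /* Trace Word Read pointer  */

/* Virtual address of a memory-mapped register at the given drseg offset. */
uint32_t drseg_address(uint32_t offset)
{
    return DRSEG_BASE + offset;
}
```

On real hardware the result would be cast to a volatile pointer and accessed only while the CPU is in debug mode, since drseg is not visible otherwise.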
Page 121: PDtrace™ Instruction Trace Facility
8.2 PDtrace™ instruction trace facility An instruction trace is a set of data generated when a program runs which allows you to recreate the sequence of instructions executed, possibly with additional information included about data values. Instruction traces rapidly become enormous, and are typically generated in some kind of abbreviated form, which may be reconstructed by software which is in possession of a copy of the binary code of your system.
Page 122: Figure 8.16 Fields In The TCBCONTROLE Register
but we’ll document fields and configured values which are specific to 74K family CPUs. With the new feature of enabling software to access the on-chip trace memory, all the JTAG-accessible registers are visible to software via a load or store to their drseg memory mapped location....
Page 123: 2: CP0 Registers For The PDtrace™ Logic
Figure 8.17 Fields in the TCBCONFIG register: TRIG, CRMax, CRMin, PiN, OnT, OfT
In TCBCONFIG:
CF1: read-only, reads zero because there are no more TCB configuration registers.
PiN: read-only, reads zero because the 74K core is a single-issue (single pipeline) processor.
REV: reads 1, denoting compliance with revision 4.0 of the TCB specification....
Page 124
: "trace all branch" - when 1, output all branch addresses in full. Normally, predictable branches need not be sent.
: "inhibit overflow" - slow the CPU rather than lose trace data because you can’t capture it fast enough.
D, E, K, S, U : do trace in various CPU modes: separate bits independently filter for debug, exception, kernel, supervisor and user mode....
Page 125: 3: JTAG Triggers And Local Control Through TraceIBPC/TraceDBPC
TraceControl2[TBI,TBU] : best considered together, these read-only bits tell you whether there is an on-chip trace memory, on-probe trace memory, or both - and which is currently in use.
The TBI/TBU combinations distinguish: only on-chip memory available; only probe memory available; both available, currently using on-chip; both available, currently using probe...
Page 126: 4: UserTraceData1 Reg And UserTraceData2 Reg
IBPC8-0, DBPC8-0 : each three-bit field encodes tracing options independently, for up to nine EJTAG I- and D-side breakpoints (this is generous: your 74K core will typically have no more than 4 I- and 2 D-breakpoints). Each entry can be set as follows:
xBPC field; Description...
Page 127
• There must have been a cycle recently when there was an “on trigger”, that is:
– The CPU tripped an EJTAG breakpoint with the IBCn[TE] or DBCn[TE] bit set to request a trace trigger (for I-side and D-side respectively);
– (respectively) was set to enable triggers from EJTAG breakpoints;...
Page 128: CP0 Watchpoints
8.3 CP0 Watchpoints
Some cores may be built with no EJTAG debug unit to save space, and some debug software may not know how to use EJTAG resources. So it may be worth configuring the four non-EJTAG CP0 watchpoint registers. In 74K cores you get two I-side and two D-side registers....
Page 129: Performance Counters
WatchHi0-3[Mask]: implements address ranges. Set bits in WatchHi0-3[Mask] to mark corresponding WatchLo0-3[VAddr] address bits to be ignored when deciding whether this is a match.
WatchHi0-3[I,R,W]: read your WatchHi0-3 after a watch exception, and these fields tell you what type of access (if anything) matched....
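The "ignore these address bits" rule can be expressed in one line of C. This is a simplified sketch: the function name is hypothetical, whole 32-bit addresses are compared, and details such as the alignment granularity of the watched address or ASID matching are left out.

```c
#include <stdint.h>

/* Returns 1 if access_addr matches watch_vaddr once every bit set in
 * ignore_mask has been excluded from the comparison. */
int watch_matches(uint32_t access_addr, uint32_t watch_vaddr, uint32_t ignore_mask)
{
    return ((access_addr ^ watch_vaddr) & ~ignore_mask) == 0;
}
```

With an ignore mask of 0xFFF, for instance, any access within the same 4 KB-aligned block as the watched address counts as a match, which is how a single register covers an address range.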
Page 130: 1: Reading The Event Table
After the exception is handled and control returns, the branch instruction is re-executed: all MIPS branch instructions are contrived so the re-execution does exactly the same thing as the first time. But the branch instruction is “really” run twice, and any performance count will show that.
Page 131: Table 8.8: Performance Counter Event Codes In The PerfCtl0-3[Event] Field
Table 8.8 Performance Counter Event Codes in the PerfCtl0-3[Event] field.
Event; counter0/2; counter1/3
Cycles Instructions graduated jr $31 (return) instructions that are predicted jr $31 predicted but guessed wrong Cycles where no instruction is fetched because it has jr $31 (return) instructions fetched and not pre- no “next address”...
Page 132
Table 8.8 Performance Counter Event Codes in the PerfCtl0-3[Event] field.
Event; counter0/2; counter1/3
ALU-pipe bubble issued. The resulting empty pipe-stage guarantees that some resource will be unused for a cycle, sometime soon. Used, for example, to guarantee an opportunity to write mfc1 data into a
Reserved
Cycles when one instruction is issued....
Page 133
SI_PCEvent pin of the core.
Reserved for CP2 event
Implementation-specific event from ISPRAM block. MIPS standard ISPRAM (see Section 3.6 “Scratchpad memory/SPRAM”) does not provide such an event.
Implementation-specific event from DSPRAM block. MIPS standard DSPRAM (see Section 3.6 “Scratchpad memory/SPRAM”) does not provide such an event....
Page 135: Appendix A: References
Then there are some architectural extensions:
[MIPSDSP]: “The MIPS DSP Application-Specific Extension to the MIPS32 Architecture”, MIPS Technologies document MD00372.
[DSPWP]: “Programming the MIPS® 74K™ Family Cores for DSP”, MIPS Technologies white paper, document number MD00544.
[MIPS16e]: “The MIPS16e™ Application-Specific Extension to the MIPS32 Architecture”, MIPS Technologies document MD00074....
Page 136
Books about programming the MIPS® architecture
[SEEMIPSRUN]: “See MIPS Run, 2nd Edition”, author Dominic Sweetman, Morgan Kaufmann ISBN 1-55860-410-3. A general and wide-ranging programmer’s introduction to the MIPS architecture, updated in 2006 to reflect the current version of [MIPS32].
Page 137: Appendix B: CP0 Register Summary And Reference
Power-up state of CP0 registers
The traditions of the MIPS architecture regard it as software’s job to initialize CP0 registers. As a rule, only fields where a wrong setting would prevent the CPU from booting are forced to an appropriate state by reset; other fields - including other fields in the same register - are random.
Page 138: Table B.2 CP0 Registers By Number
Table B.1 Register index by name (Continued) Name Number Name Number Name Number Name Number Count IDataHi SRSCtl Wired 29.1 12.2 DDataLo IDataLo 28.3 28.1 Table B.2 CP0 registers by number Register Description Page Index Index into the TLB array 3.8.3, p.48 Random Randomly generated index into the TLB array...
Page 139
Table B.2 CP0 registers by number (Continued)
Register; Number; Description; Page
WatchLo0-3; 18.0-3; Watchpoint address and qualifiers; 8.3.1, p.128
WatchHi0-3; 19.0-3; Watchpoint control/status; 8.3.2, p.128
Debug; 23.0; EJTAG Debug status/control register; 8.1.6, p.106
TraceControl; 23.1; Control fields for the PDtrace unit; 8.1.6, p.106
TraceControl2; 23.2...
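The register numbers in this table are written as register.select pairs, and both parts end up in the instruction that reads the register. As an illustration, the C sketch below builds the 32-bit encoding of an mfc0 instruction from those two numbers. The function name is hypothetical; the field layout (COP0 opcode in the top six bits, rt in bits 20..16, the CP0 register in bits 15..11, the select in bits 2..0) follows the standard MIPS32 encoding as I understand it.

```c
#include <stdint.h>

/* Encode "mfc0 rt, reg, sel": move from CP0 register reg.sel into
 * general-purpose register rt. */
uint32_t encode_mfc0(unsigned rt, unsigned reg, unsigned sel)
{
    return 0x40000000u            /* COP0 major opcode, MF sub-operation */
         | ((rt  & 31u) << 16)    /* destination general-purpose register */
         | ((reg & 31u) << 11)    /* CP0 register number */
         |  (sel & 7u);           /* select field */
}
```

For instance, reading Debug (23.0) into general register 2 gives 0x4002B800, and TraceControl (23.1) differs only in the low select bits.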
Page 140: B.1: Miscellaneous CP0 Register Descriptions
B.1 Miscellaneous CP0 register descriptions Table B.3 CP0 Registers Grouped by Function Basic modes Status BadVAddr DEPC 12.0 24.0 OS/userland UserLocal Context DESAVE 31.0 EJTAG Debug thread ID Cause ContextConﬁg Debug 13.0 23.0 Exception Control EntryHi TraceControl 14.0 10.0 23.1 Management Compare 11.0...
Page 141: B.1.1: Status Register
Status[FR]: if there is a floating point unit, set 0 for MIPS I compatibility mode (which means you have only 16 real FP registers, with 16 odd FP register numbers reserved for access to the high bits of double-precision values).
Page 142: Table B.4 Encoding Privilege Level In Status[UM,SM]
Can be written back to zero, but never written to 1. The name of the ﬁeld originated as a "TLB Shutdown" — historical MIPS CPUs quietly stopped translating addresses when they detected TLB abuse. Status[SR]: MIPS32 architecture "soft reset" bit: the 74K core’s interface only supports a full external reset, so this always reads zero.
Page 143: B.1.2: The UserLocal Register
Status[IE]: global interrupt enable, 0 to disable all interrupts. The di/ei instructions allow you to write this bit without affecting the rest of Status.
B.1.2 The UserLocal register
Not interpreted by hardware, this register is suitable for a kernel-maintained thread ID whose value can be read by user-level code with rdhwr $29, so long as HWREna[UL] is set.
Page 144: Table B.5 Values Found In Cause[ExcCode]
Instruction or data reference matched a watchpoint
MCheck: “Machine check” - second valid TLB entry mapping same virtual address.
Thread: Thread-related exception, only for CPUs supporting the MIPS MT ASE.
Reserved (some kind of thread exception for a MT CPU).
Page 145: B.1.4: The EPC Register
If the instruction we’d really like to return to is in a branch delay slot, EPC points to the branch instruction and Cause[BD] will be set. All MIPS branch instructions may be re-executed successfully, so returning to the branch is the right thing to do in this case.
B.1.5 Count and Compare
These two 32-bit registers form a useful and flexible timer.
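A common way to use a free-running 32-bit counter with a compare register is sketched below in C. It relies on the generally documented MIPS behaviour that Count increments continuously and a timer interrupt is raised when Count equals Compare; that behaviour is not spelled out in this excerpt, and the function names are hypothetical.

```c
#include <stdint.h>

/* Value to program into Compare for an interrupt "interval" ticks
 * from now. Unsigned arithmetic wraps modulo 2^32, just as the
 * counter itself does. */
uint32_t next_compare(uint32_t count_now, uint32_t interval)
{
    return count_now + interval;
}

/* Wrap-safe check of whether the counter has reached or passed the
 * compare value, valid while the two are less than 2^31 ticks apart.
 * The conversion to int32_t assumes a two's-complement compiler. */
int deadline_passed(uint32_t count_now, uint32_t compare)
{
    return (int32_t)(count_now - compare) >= 0;
}
```

Comparing the difference rather than the raw values is what keeps the test correct across the moment the counter rolls over from 0xFFFFFFFF to 0.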
Page 146: Table B.6: Fields In The Config7 Register
CP1IO : Set either of these bits to arrange that for CP1/CP2 respectively, data will be sent only in instruction order. Data from the core to the CP is tagged with an “age” field. MIPS Technologies' standard FPU accepts data out-of-order, interpreting the age field to associate data with the correct instruction.
Page 147
Config7[SUI]: Strict Uncached Instruction (SUI) policy control. Set this to run uncached instructions strictly in order and (as far as possible) unpipelined. This will be quite slow (the policy of itself will introduce a 15-cycle bubble between instructions), but you’ll hardly notice because running uncached is already so slow.
Page 148: B.3: Registers For Cache Diagnostics
When it is set to “1”, the sync instruction will be signalled on the core’s OCP interface as an “ordering barrier” transaction, using a sync-specific encoding. The Config7[ES] bit cannot be set (will always read zero and will have no effect) unless the OCP input signal SI_SyncTxEn is asserted -...
Page 149: B.3.2: Dual (Virtual And Physical) Tags In The 74K Core D-Cache - DTagHi Register
From a software viewpoint the D-cache looks just like the “standard” MIPS virtually-indexed physically-tagged cache, though there is occasionally an unexpected delay when the virtual tag “prediction” is wrong - the CPU pipeline treats this like a cache miss, and as a side-effect the virtual tag is adjusted so it will work correctly next time.
Page 150: B.3.4: The DDataLo, IDataHi And IDataLo Registers
The individual PREC fields hold precode information for pairs of adjacent instructions in the I-cache line, and the fields hold parity over them.
B.3.4 The DDataLo, IDataHi and IDataLo registers
On 74K family cores, test software can read or write data directly from/to the cache array using a cache index load tag/store data instruction.
Page 151: Appendix C: MIPS® Architecture Quick-Reference Sheet(s)
Table C.1 shows those names as used in the “o32” ABI (almost universally used for 32-bit MIPS applications), together with the minor variations in the “n32” and “n64” ABIs defined by Silicon Graphics. If you’re not sure what an ABI is, just read the “o32” column!
Table C.1 Conventional names of registers with usage mnemonics...
Page 152: C.2.2: Release 2 Of The MIPS32® Architecture - Hardware Registers From User Mode
Table C.2 Release 2 of the MIPS32® Architecture - new instructions
Instruction(s): jalr.hb rd, rs; jr.hb rs
Description: Hazard barriers; wait until side-effects from earlier instructions are all complete (that is, can be guaranteed to apply in full to all instructions issued after the barrier). These defend you respectively against:...
Page 153: C.3: FPU Changes In Release 2 Of The MIPS32® Architecture
• CC (2): user-mode read-only access to the CP0 Count register, for high-resolution counting. Which wouldn’t be much good without...
• CCRes (3): which tells you how fast Count counts. It’s a divider from the pipeline clock (if the rdhwr instruction reads a value of “2”, then...
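If CCRes is read as "pipeline clocks per Count tick", which is how the description of it as a divider suggests it works, converting it into a tick rate is a single division. The sketch below rests on that reading; the function name and the idea of passing the pipeline clock frequency as a parameter are assumptions made for illustration.

```c
#include <stdint.h>

/* Count ticks per second for a given pipeline clock, assuming Count
 * advances once every ccres pipeline cycles. Returns 0 for a zero
 * divider rather than dividing by zero. */
uint64_t count_ticks_per_second(uint64_t pipeline_hz, uint32_t ccres)
{
    return ccres ? pipeline_hz / ccres : 0;
}
```

Under this reading, a core clocked at 500 MHz that reports a CCRes of 2 would advance Count 250 million times per second.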
Page 155: Appendix D: Revision History
Significant changes are defined as those which you should take note of as you use the MIPS IP. Changes to correct grammar, spelling errors or similar may or may not be noted with change bars.
Page 156
Revision; Date; Description
2.14; March 30, 2011
• Add Type and TypeInfo fields in implementation register.
• Add Cache miss PC Sampling feature.
Copyright © Wave Computing, Inc. All rights reserved. www.wavecomp.ai...

MIPS MIPS32 74Kf Programming Manual

Chapter 1: Introduction

Chapter 2: Initialization and Identity

Chapter 3: Memory Map, Caching, Reads, Writes and Translation

Chapter 4: Programming the 74K™ Core in User Mode

Chapter 5: Kernel-Mode (OS) Programming and Release 2 of the MIPS32® Architecture

Chapter 6: Floating Point Unit

Chapter 7: The MIPS32® DSP ASE

Chapter 8: 74K™ Core Features for Debug and Profiling

Appendix A: References

Appendix B: CP0 Register Summary and Reference

Appendix C: MIPS® Architecture Quick-Reference Sheet(s)

Appendix D: Revision History
