Renesas SuperH SH-4A Software Manual

32-bit RISC microcomputer

REJ09B0003-0150Z
Rev.1.50
Revision Date: Oct. 29, 2004
The revision list can be viewed directly by
clicking the title page.
The revision list summarizes the locations of
revisions and additions. Details should always
be checked by referring to the relevant text.
Renesas 32-Bit RISC Microcomputer
SH-4A
Software Manual
SuperH™ RISC engine Family

Summary of Contents for Renesas SuperH SH-4A

  • Page 1 The revision list summarizes the locations of revisions and additions. Details should always be checked by referring to the relevant text. SH-4A Software Manual Renesas 32-Bit RISC Microcomputer SuperH™ RISC engine Family Rev.1.50 Revision Date: Oct. 29, 2004...
  • Page 3 (iii) prevention against any malfunction or mishap. Notes regarding these materials 1. These materials are intended as a reference to assist our customers in the selection of the Renesas Technology Corp. product best suited to the customer's application; they do not convey any license under any intellectual property rights, or any other rights, belonging to Renesas Technology Corp.
  • Page 4: General Precautions On Handling Of Product

    General Precautions on Handling of Product 1. Treatment of NC Pins Note: Do not connect anything to the NC pins. The NC (not connected) pins are either not connected to any of the internal circuitry or are used as test pins or to reduce noise. If something is connected to the NC pins, the operation of the LSI is not guaranteed.
  • Page 5 Configuration of This Manual This manual comprises the following items: 1. General Precautions on Handling of Product 2. Configuration of This Manual 3. Preface 4. Contents 5. Overview 6. Description of Functional Modules • CPU and System-Control Modules • On-Chip Peripheral Modules The configuration of the functional description of each module differs according to the module.
  • Page 6: Preface

    Preface The SH-4A is a RISC (Reduced Instruction Set Computer) microcomputer which includes a Renesas Technology-original RISC CPU as its core. Target Users: This manual was written for users who will be using the SH-4A in the design of application systems. Users of this manual are expected to understand the fundamentals of electrical circuits, logical circuits, microcomputers, and assembly and C language programming.
  • Page 7 Abbreviations: ALU: Arithmetic Logic Unit; ASID: Address Space Identifier; CPU: Central Processing Unit; FPU: Floating Point Unit; LRU: Least Recently Used; LSB: Least Significant Bit; MMU: Memory Management Unit; MSB: Most Significant Bit; PC: Program Counter; RISC: Reduced Instruction Set Computer; TLB: Translation Lookaside Buffer
  • Page 9: Table Of Contents

    Contents Section 1 Overview....................1 Features..........................1 Changes from SH-4 to SH-4A ..................4 Section 2 Programming Model ................7 Data Formats........................7 Register Descriptions ......................8 2.2.1 Privileged Mode and Banks ................. 8 2.2.2 General Registers....................11 2.2.3 Floating-Point Registers..................12 2.2.4 Control Registers ....................
  • Page 10 Exception Flow ......................... 72 5.5.1 Exception Flow....................72 5.5.2 Exception Source Acceptance................73 5.5.3 Exception Requests and BL Bit ................74 5.5.4 Return from Exception Handling................. 74 Description of Exceptions....................75 5.6.1 Resets........................75 5.6.2 General Exceptions....................77 5.6.3 Interrupts......................91 5.6.4 Priority Order with Multiple Exceptions .............
  • Page 11 7.3.1 Unified TLB (UTLB) Configuration ..............131 7.3.2 Instruction TLB (ITLB) Configuration..............133 7.3.3 Address Translation Method................134 MMU Functions........................ 136 7.4.1 MMU Hardware Management ................136 7.4.2 MMU Software Management ................136 7.4.3 MMU Instruction (LDTLB)................. 137 7.4.4 Hardware ITLB Miss Handling ................139 7.4.5 Avoiding Synonym Problems ................
  • Page 12 8.3.6 OC Two-Way Mode .................... 173 Instruction Cache Operation ..................... 173 8.4.1 Read Operation ....................173 8.4.2 Prefetch Operation ....................174 8.4.3 IC Two-Way Mode....................174 Cache Operation Instruction ..................... 175 8.5.1 Coherency between Cache and External Memory ..........175 8.5.2 Prefetch Operation ....................
  • Page 13 10.1.2 ADDC (Add with Carry): Arithmetic Instruction ..........205 10.1.3 ADDV (Add with (V flag) Overflow Check): Arithmetic Instruction....206 10.1.4 AND (AND Logical): Logical Instruction............208 10.1.5 BF (Branch if False): Branch Instruction............. 210 10.1.6 BF/S (Branch if False with Delay Slot): Branch Instruction........ 212 10.1.7 BRA (Branch): Branch Instruction ..............
  • Page 14 10.1.44 NEGC (Negate with Carry): Arithmetic Instruction..........287 10.1.45 NOP (No Operation): System Control Instruction..........288 10.1.46 NOT (Not-logical Complement): Logical Instruction ......... 289 10.1.47 OCBI (Operand Cache Block Invalidate): Data Transfer Instruction....290 10.1.48 OCBP (Operand Cache Block Purge): Data Transfer Instruction......291 10.1.49 OCBWB (Operand Cache Block Write Back): Data Transfer Instruction...
  • Page 15 10.2.2 BSRF (Branch to Subroutine Far): Branch Instruction (Delayed Branch Instruction)................344 10.2.3 JSR (Jump to Subroutine): Branch Instruction (Delayed Branch Instruction)..346 10.2.4 LDC (Load to Control Register): System Control Instruction (Privileged Instruction) ..................348 10.2.5 LDS (Load to FPU System register): System Control Instruction ....... 349 10.2.6 STC (Store Control Register): System Control Instruction (Privileged Instruction) ..................
  • Page 16 Section 11 List of Registers................427 11.1 Register Addresses (by functional module, in order of the corresponding section numbers) ......428 11.2 Register States in Each Operating Mode ................430 Appendix ......................431 CPU Operation Mode Register (CPUOPM) ..............431 Instruction Prefetching and Its Side Effects..............
  • Page 17 Figures Section 1 Overview Figure 2.1 Data Formats ......................... 7 Figure 2.2 CPU Register Configuration in Each Processing Mode ..........10 Figure 2.3 General Registers ......................11 Figure 2.4 Floating-Point Registers ....................13 Figure 2.5 Relationship between SZ bit and Endian..............18 Figure 2.6 Formats of Byte Data and Word Data in Register ............
  • Page 18 Figure 7.9 Flowchart of Memory Access Using UTLB.............. 134 Figure 7.10 Flowchart of Memory Access Using ITLB ............. 135 Figure 7.11 Operation of LDTLB Instruction................138 Figure 7.12 Memory-Mapped ITLB Address Array..............147 Figure 7.13 Memory-Mapped ITLB Data Array ................ 148 Figure 7.14 Memory-Mapped UTLB Address Array ..............
  • Page 19 Tables Section 1 Overview Table 1.1 Features........................1 Table 1.2 Changes from SH-4 to SH-4A .................. 4 Section 2 Programming Model Table 2.1 Initial Register Values....................9 Table 2.2 Bit Allocation for FPU Exception Handling............19 Section 3 Instruction Set Table 3.1 Execution Order of Delayed Branch Instructions ...........
  • Page 20 Section 8 Caches Table 8.1 Cache Features...................... 159 Table 8.2 Store Queue Features .................... 159 Table 8.3 Register Configuration..................162 Table 8.4 Register States in Each Processing State .............. 162 Section 9 L Memory Table 9.1 L Memory Addresses.................... 187 Table 9.2 Register Configuration..................
  • Page 21: Section 1 Overview

    32-bit instructions. The features of the SH-4A are listed in table 1.1. Table 1.1 Features Item Features • Renesas Technology original architecture • 32-bit internal data bus • General-register files: - Sixteen 32-bit general registers (eight 32-bit shadow registers) - ...
  • Page 22 Item: Floating-point unit (FPU). Features: • On-chip floating-point coprocessor • Supports single-precision (32 bits) and double-precision (64 bits) • Supports IEEE754-compliant data types and exceptions • Two rounding modes: Round to Nearest and Round to Zero • Handling of denormalized numbers: Truncation to zero or interrupt generation for IEEE754 compliance •...
  • Page 23 Item: Cache memory. Features: • Instruction cache (IC) - 4-way set associative - 32-byte block length • Operand cache (OC) - 4-way set associative - 32-byte block length - Selectable write method (copy-back or write-through) • Store queue (32 bytes × 2 entries) Note: For the sizes of the instruction cache and operand cache, see the hardware manual for the corresponding product.
  • Page 24: Changes From Sh-4 To Sh-4A

    Changes from SH-4 to SH-4A Table 1.2 summarizes the changes from SH-4 to SH-4A based on the sections and sub-sections in this manual. Table 1.2 Changes from SH-4 to SH-4A Section No. and Sub- Sub-section Name section Name Changes  ...
  • Page 25 Section: 7. Memory Management Unit. Sub-section 7.1.1 Address Spaces: Area P4 configuration is modified. On-chip RAM space is deleted. Sub-section Register Descriptions: The page table entry assist register (PTEA) is deleted. A physical address space control register is added.
  • Page 26 Section: 8. Caches. Sub-section Features: Instruction cache capacity is changed to 32 Kbytes. The caching method is changed to a 4-way set-associative method. Sub-section Register Descriptions: An on-chip memory control register is added. 8.2.1 Cache Control Modified.
  • Page 27: Section 2 Programming Model

    Section 2 Programming Model The programming model of the SH-4A is explained in this section. The SH-4A has registers and data formats as shown below. Data Formats The data formats supported in the SH-4A are shown in figure 2.1. Byte (8 bits) Word (16 bits) Longword (32 bits) 31 30...
  • Page 28: Register Descriptions

    Register Descriptions 2.2.1 Privileged Mode and Banks Processing Modes: This LSI has two processing modes, user mode and privileged mode. This LSI normally operates in user mode, and switches to privileged mode when an exception occurs or an interrupt is accepted. There are four kinds of registers—general registers, system registers, control registers, and floating-point registers—and the registers that can be accessed differ in the two processing modes.
  • Page 29: Table 2.1 Initial Register Values

    Floating-Point Registers and System Registers Related to FPU: There are thirty-two floating-point registers, FR0–FR15 and XF0–XF15. FR0–FR15 and XF0–XF15 can be assigned to either of two banks (FPR0_BANK0–FPR15_BANK0 or FPR0_BANK1–FPR15_BANK1). FR0–FR15 can be used as the eight registers DR0/2/4/6/8/10/12/14 (double-precision floating-point registers, or pair registers) or the four registers FV0/4/8/12 (register vectors), while XF0–...
  • Page 30: Figure 2.2 Cpu Register Configuration In Each Processing Mode

    [Figure 2.2: CPU register configuration in each processing mode, showing R0_BANK0 to R7_BANK0 and R0_BANK1 to R7_BANK1; diagram not reproduced.]
  • Page 31: General Registers

    2.2.2 General Registers Figure 2.3 shows the relationship between the processing modes and general registers. The SH-4A has twenty-four 32-bit general registers (R0_BANK0 to R7_BANK0, R0_BANK1 to R7_BANK1, and R8 to R15). However, only 16 of these can be accessed as general registers R0 to R15 in one processing mode.
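The banked view of the general registers can be sketched in Python. This is a minimal model, not text from the manual: the function name is hypothetical, and the selection rule (BANK1 is visible as R0 to R7 only in privileged mode with the SR.RB bit set, otherwise BANK0) is an assumption drawn from the general SH-4 programming model, because the excerpt above is truncated before it states the rule.

```python
def physical_general_register(n, md, rb):
    """Return the physical register that the name Rn refers to.

    n  : register number 0..15
    md : SR.MD bit (0 = user mode, 1 = privileged mode)
    rb : SR.RB bit (assumed bank-select bit, only honoured when md == 1)
    """
    if not 0 <= n <= 15:
        raise ValueError("register number must be in 0..15")
    if n >= 8:
        # R8 to R15 are not banked.
        return f"R{n}"
    # Assumption: BANK1 is selected only in privileged mode with RB = 1.
    bank = 1 if (md == 1 and rb == 1) else 0
    return f"R{n}_BANK{bank}"


# Enumerate every physical register reachable over all mode combinations.
ALL_PHYSICAL = {
    physical_general_register(n, md, rb)
    for n in range(16) for md in (0, 1) for rb in (0, 1)
}
```

Under these assumptions the model reaches twenty-four distinct physical registers, of which sixteen are visible for any one setting of the mode bits, which matches the counts given in the text.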
  • Page 32: Floating-Point Registers

    2.2.3 Floating-Point Registers Figure 2.4 shows the floating-point register configuration. There are thirty-two 32-bit floating-point registers, FPR0_BANK0 to FPR15_BANK0, and FPR0_BANK1 to FPR15_BANK1, comprising two banks. These registers are referenced as FR0 to FR15, DR0/2/4/6/8/10/12/14, FV0/4/8/12, XF0 to XF15, XD0/2/4/6/8/10/12/14, or XMTRX. Reference names of each register are defined depending on the state of the FR bit in FPSCR (see figure 2.4).
  • Page 33: Figure 2.4 Floating-Point Registers

    [Figure 2.4: Floating-point registers, showing how FPR0_BANK0 to FPR15_BANK0 and FPR0_BANK1 to FPR15_BANK1 are referenced (FR, DR, FV, XF, XD, XMTRX) for FPSCR.FR = 0 and FPSCR.FR = 1; diagram not reproduced.]
  • Page 34: Control Registers

    2.2.4 Control Registers Status Register (SR) Bit: Initial value: R/W: Bit: IMASK Initial value: R/W: Initial Bit Name Value Description — Reserved For details on reading/writing this bit, see General Precautions on Handling of Product. Processing Mode Selects the processing mode. 0: User mode (Some instructions cannot be executed and some resources cannot be accessed.) 1: Privileged mode...
  • Page 35: Appendix

    Initial Bit Name Value Description FPU Disable Bit When this bit is set to 1 and an FPU instruction is not in a delay slot, a general FPU disable exception occurs. When this bit is set to 1 and an FPU instruction is in a delay slot, a slot FPU disable exception occurs.
  • Page 36: System Registers

    Vector Base Register (VBR) (32 bits, Privileged Mode, Initial Value = H'00000000): VBR is referenced as the branch destination base address in the event of an exception or interrupt. For details, see section 5, Exception Handling. Saved General Register 15 (SGR) (32 bits, Privileged Mode, Initial Value = Undefined): The contents of R15 are saved to SGR in the event of an exception or interrupt.
  • Page 37 Floating-Point Status/Control Register (FPSCR) Bit: Cause Initial value: R/W: Bit: Cause Enable (EN) Flag Initial value: R/W: Initial Bit Name Value Description 31 to 22 — All 0 Reserved For details on reading/writing this bit, see General Precautions on Handling of Product. Floating-Point Register Bank 0: FPR0_BANK0 to FPR15_BANK0 are assigned to FR0 to FR15 and FPR0_BANK1 to FPR15_BANK1...
  • Page 38: Figure 2.5 Relationship Between Sz Bit And Endian

    Bits 17 to 12, Cause (initial value: all 0): FPU exception cause field. Bits 11 to 7, Enable (EN) (initial value: all 0): FPU exception enable field. Bits 6 to 2, Flag (initial value: all 0): FPU exception flag field. Each time an FPU operation instruction is executed, the FPU exception cause field is cleared to 0.
  • Page 39: Memory-Mapped Registers

    Table 2.2 Bit Allocation for FPU Exception Handling. Columns: Field Name; Error (E); Invalid Operation (V); Division by Zero (Z); Overflow; Underflow; Inexact. Cause (FPU exception cause field): Bit 17, Bit 16, Bit 15, Bit 14, Bit 13, Bit 12. Enable (FPU exception enable field): None, Bit 11...
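The field layout given for FPSCR (cause in bits 17 to 12, enable in bits 11 to 7, flag in bits 6 to 2) can be illustrated with a small Python decoder. This is a sketch under stated assumptions: the function name and the one-letter labels O, U, and I are hypothetical shorthand for Overflow, Underflow, and Inexact, and the per-bit order inside the enable and flag fields is assumed to follow the cause field (minus the E column), since the table excerpt is truncated after "Bit 11".

```python
# Column order of table 2.2, most significant bit first.
CAUSE_LABELS = ("E", "V", "Z", "O", "U", "I")   # bits 17..12
ENABLE_FLAG_LABELS = CAUSE_LABELS[1:]           # bits 11..7 and 6..2 (no E column)


def _set_labels(field, labels):
    """List the labels whose bit is set; labels[0] is the field's top bit."""
    width = len(labels)
    return [labels[i] for i in range(width) if field & (1 << (width - 1 - i))]


def decode_fpscr_exceptions(fpscr):
    """Split the cause, enable, and flag fields out of a 32-bit FPSCR value."""
    cause = (fpscr >> 12) & 0x3F    # bits 17 to 12
    enable = (fpscr >> 7) & 0x1F    # bits 11 to 7
    flag = (fpscr >> 2) & 0x1F      # bits 6 to 2
    return {
        "cause": _set_labels(cause, CAUSE_LABELS),
        "enable": _set_labels(enable, ENABLE_FLAG_LABELS),
        "flag": _set_labels(flag, ENABLE_FLAG_LABELS),
    }
```

For example, a value with only bit 15 set decodes to a division-by-zero cause with no enable or flag bits.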
  • Page 40: Data Formats In Registers

    Data Formats in Registers Register operands are always longwords (32 bits). When a memory operand is only a byte (8 bits) or a word (16 bits), it is sign-extended into a longword when loaded into a register. Figure 2.6 Formats of Byte Data and Word Data in Register Data Formats in Memory Memory data formats are classified into bytes, words, and longwords.
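The sign extension applied when a byte or word memory operand is loaded into a 32-bit register can be shown with a short Python sketch. The function name is hypothetical; the code only models the arithmetic, not the load instruction itself.

```python
def sign_extend_to_longword(value, bits):
    """Sign-extend the low `bits` bits of value to a 32-bit register image.

    bits is 8 for a byte operand or 16 for a word operand.
    """
    if bits not in (8, 16):
        raise ValueError("only byte (8) and word (16) operands are extended")
    value &= (1 << bits) - 1          # keep only the operand bits
    sign = 1 << (bits - 1)            # position of the operand's sign bit
    # XOR then subtract propagates the sign bit into the upper bits.
    return ((value ^ sign) - sign) & 0xFFFFFFFF
```

A byte of H'80 therefore appears in the register as H'FFFFFF80, while H'7F stays H'0000007F.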
  • Page 41: Processing States

    [Figure residue: memory layout diagram showing Byte 0 to Byte 3 in both byte orders at addresses A to A + 11; diagram not reproduced.]
  • Page 42: Usage Notes

    Usage Notes 2.7.1 Notes on Self-Modified Codes The SH-4A prefetches instructions to accelerate processing. Therefore, if an instruction in memory is modified and then executed immediately, the pre-modification code in the prefetch buffer may be executed instead. In addition, because the SH4AL-DSP supports both an instruction cache and an operand cache, cache coherency should also be considered.
  • Page 43: Section 3 Instruction Set

    Section 3 Instruction Set The SH-4A's instruction set is implemented with 16-bit fixed-length instructions. The SH-4A can use byte (8-bit), word (16-bit), longword (32-bit), and quadword (64-bit) data sizes for memory access. Single-precision floating-point data (32 bits) can be moved to and from memory using longword or quadword size.
  • Page 44 ADD #1, R0 ; T bit is not changed by ADD operation
    CMP/EQ R1, R0 ; If R0 = R1, T bit is set to 1
    BT TARGET ; Branches to TARGET if T bit = 1 (R0 = R1)
    In an RTE delay slot, the SR bits are referenced as follows. In instruction access, the MD bit is used before modification, and in data access, the MD bit is accessed after modification.
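The T-bit behaviour in the three-instruction fragment above can be traced with a small Python model. This is an illustrative sketch only: the function name is hypothetical, and the mnemonics ADD and BT are inferred from the fragment's comments, since the extracted text lost them.

```python
def run_fragment(r0, r1, t=0):
    """Trace ADD #1,R0 / CMP/EQ R1,R0 / BT TARGET on 32-bit register values.

    Returns (new_r0, t, branch_taken).
    """
    # ADD #1, R0: the result wraps at 32 bits and the T bit is left unchanged.
    r0 = (r0 + 1) & 0xFFFFFFFF
    # CMP/EQ R1, R0: T is set to 1 when the registers are equal, else 0.
    t = 1 if r0 == (r1 & 0xFFFFFFFF) else 0
    # BT TARGET: the branch is taken only when T = 1.
    branch_taken = (t == 1)
    return r0, t, branch_taken
```

The point of the fragment is that the comparison, not the addition, decides whether the branch is taken.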
  • Page 45: Addressing Modes

    Addressing Modes Addressing modes and effective address calculation methods are shown in table 3.2. When a location in virtual memory space is accessed (AT in MMUCR = 1), the effective address is translated into a physical memory address. If multiple virtual memory space systems are selected (SV in MMUCR = 0), the least significant bit of PTEH is also referenced as the access ASID.
  • Page 46 Addressing mode: Register indirect with displacement. Instruction format: @(disp:4, Rn). Effective address calculation method: Effective address is register Rn contents with 4-bit displacement disp added. After disp is zero-extended, it is multiplied by 1 (byte), 2 (word), or 4 (longword), according to the operand size. Calculation formula: Byte: Rn + disp → EA; Word: Rn + disp × 2 → EA...
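The register-indirect-with-displacement calculation can be written out as a short Python sketch. The function and table names are hypothetical; the scaling factors of 1, 2, and 4 come from the description above, and the 32-bit wrap-around of the result is an assumption of this model.

```python
# Scale factor applied to the zero-extended 4-bit displacement.
SCALE_BY_SIZE = {"byte": 1, "word": 2, "longword": 4}


def ea_register_indirect_disp(rn, disp4, size):
    """Effective address for @(disp:4, Rn): Rn + disp * scale."""
    if not 0 <= disp4 <= 0xF:
        raise ValueError("disp must fit in 4 bits (zero-extended)")
    # Assumption: the address is truncated to 32 bits.
    return (rn + disp4 * SCALE_BY_SIZE[size]) & 0xFFFFFFFF
```

With Rn = H'1000 and disp = 3, the byte, word, and longword forms address H'1003, H'1006, and H'100C respectively.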
  • Page 47 Addressing mode: PC-relative with displacement. Instruction format: @(disp:8, PC). Effective address calculation method: Effective address is PC + 4 with 8-bit displacement disp added. After disp is zero-extended, it is multiplied by 2 (word), or 4 (longword), according to the operand size. Calculation formula: Word: PC + 4 + disp × 2 → EA...
  • Page 48 Addressing mode: PC-relative. Instruction format: disp:12. Effective address calculation method: Effective address is PC + 4 with 12-bit displacement disp added after being sign-extended and multiplied by 2. Calculation formula: PC + 4 + disp × 2 → Branch-Target...
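The PC-relative branch target with a 12-bit displacement can be computed as in the following Python sketch. The function name is hypothetical; the formula PC + 4 + disp × 2 with a sign-extended disp is taken from the description above, and wrapping the result to 32 bits is an assumption of this model.

```python
def branch_target_disp12(pc, disp12):
    """Branch target for a disp:12 PC-relative branch.

    pc     : address of the branch instruction
    disp12 : raw 12-bit displacement field from the instruction code
    """
    disp12 &= 0xFFF
    if disp12 & 0x800:
        # Sign-extend: values with bit 11 set are negative.
        disp12 -= 0x1000
    return (pc + 4 + disp12 * 2) & 0xFFFFFFFF
```

A displacement field of H'FFE is -2 after sign extension, so the target is PC + 4 - 4, which is the branch instruction's own address.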
  • Page 49: Instruction Set

    Instruction Set Table 3.3 shows the notation used in the SH instruction lists shown in tables 3.4 to 3.13. Table 3.3 Notation Used in Instruction List Item Format Description Instruction OP.Sz SRC, DEST Operation code mnemonic Size SRC: Source operand DEST: Source and/or destination operand Source register...
  • Page 50 Item: T bit. Description: Value of T bit after instruction execution (—: No change). "New" means an instruction that is newly added in this LSI. Note: Scaling (×1, ×2, ×4, or ×8) is executed according to the size of the instruction operand.
  • Page 51: Table 3.4 Fixed-Point Transfer Instructions

    Table 3.4 Fixed-Point Transfer Instructions Instruction Operation Instruction Code Privileged T Bit imm → sign extension → Rn #imm,Rn 1110nnnniiiiiiii — — — MOV.W @(disp*,PC), Rn (disp × 2 + PC + 4) → sign 1001nnnndddddddd — — — extension → Rn MOV.L @(disp*,PC), Rn (disp ×...
  • Page 52 — words → Rn Rm:Rn middle 32 bits → XTRCT Rm,Rn 0010nnnnmmmm1101 — — — Note: The Renesas assembler uses the value after scaling (×1, ×2, or ×4) as the displacement (disp).
  • Page 53: Table 3.5 Arithmetic Operation Instructions

    Table 3.5 Arithmetic Operation Instructions Instruction Operation Instruction Code Privileged T Bit Rn + Rm → Rn Rm,Rn 0011nnnnmmmm1100 — — — Rn + imm → Rn #imm,Rn 0111nnnniiiiiiii — — — Rn + Rm + T → Rn, ADDC Rm,Rn 0011nnnnmmmm1110 —...
  • Page 54 Instruction Operation Instruction Code Privileged T Bit Rn – 1 → Rn; 0100nnnn00010000 — Comparison — when Rn = 0, 1 → T result When Rn ≠ 0, 0 → T EXTS.B Rm,Rn Rm sign-extended from 0110nnnnmmmm1110 — — — byte →...
  • Page 55: Table 3.6 Logic Operation Instructions

    Table 3.6 Logic Operation Instructions Instruction Operation Instruction Code Privileged T Bit Rn & Rm → Rn Rm,Rn 0010nnnnmmmm1001 — — — R0 & imm → R0 #imm,R0 11001001iiiiiiii — — — AND.B #imm, @(R0,GBR) (R0 + GBR) & imm 11001101iiiiiiii —...
  • Page 56: Table 3.7 Shift Instructions

    Table 3.7 Shift Instructions Instruction Operation Instruction Code Privileged T Bit T ← Rn ← MSB ROTL 0100nnnn00000100 — — LSB → Rn → T ROTR 0100nnnn00000101 — — T ← Rn ← T ROTCL 0100nnnn00100100 — — T → Rn → T ROTCR 0100nnnn00100101 —...
  • Page 57: Table 3.8 Branch Instructions

    Table 3.8 Branch Instructions Instruction Operation Instruction Code Privileged T Bit When T = 0, disp × 2 + PC + label 10001011dddddddd — — — 4 → PC When T = 1, nop BF/S label Delayed branch; when T = 0, 10001111dddddddd —...
  • Page 58 Instruction Operation Instruction Code Privileged T Bit Rm → SPC Rm,SPC 0100mmmm01001110 Privileged — — Rm → DBR Rm,DBR 0100mmmm11111010 Privileged — — Rm → Rn_BANK (n = 0 to 7) Rm,Rn_BANK 0100mmmm1nnn1110 Privileged — — (Rm) → SR, Rm + 4 → Rm LDC.L @Rm+,SR 0100mmmm00000111 Privileged LSB...
  • Page 59 Instruction Operation Instruction Code Privileged T Bit VBR → Rn VBR,Rn 0000nnnn00100010 Privileged — — SSR → Rn SSR,Rn 0000nnnn00110010 Privileged — — SPC → Rn SPC,Rn 0000nnnn01000010 Privileged — — SGR → Rn SGR,Rn 0000nnnn00111010 Privileged — — DBR → Rn DBR,Rn 0000nnnn11111010 Privileged —...
  • Page 60: Table 3.10 Floating-Point Single-Precision Instructions

    Table 3.10 Floating-Point Single-Precision Instructions Instruction Operation Instruction Code Privileged T Bit H'0000 0000 → FRn FLDI0 1111nnnn10001101 — — — H'3F80 0000 → FRn FLDI1 1111nnnn10011101 — — — FRm → FRn FMOV FRm,FRn 1111nnnnmmmm1100 — — — (Rm) → FRn FMOV.S @Rm,FRn 1111nnnnmmmm1000 —...
  • Page 61: Table 3.11 Floating-Point Double-Precision Instructions

    Table 3.11 Floating-Point Double-Precision Instructions Instruction Operation Instruction Code Privileged T Bit FABS DRn & H'7FFF FFFF FFFF 1111nnn001011101 — — — FFFF → DRn DRn + DRm → DRn FADD DRm,DRn 1111nnn0mmm00000 — — — When DRn = DRm, 1 → T FCMP/EQ DRm,DRn 1111nnn0mmm00100 —...
  • Page 62: Table 3.13 Floating-Point Graphics Acceleration Instructions

    Table 3.13 Floating-Point Graphics Acceleration Instructions Instruction Operation Instruction Code Privileged T Bit DRm → XDn FMOV DRm,XDn — — — 1111nnn1mmm01100 XDm → DRn FMOV XDm,DRn — — — 1111nnn0mmm11100 XDm → XDn FMOV XDm,XDn — — — 1111nnn1mmm11100 (Rm) →...
  • Page 63: Section 4 Pipelining

    Section 4 Pipelining The SH-4A is a 2-ILP (instruction-level-parallelism) superscalar pipelining microprocessor. Instruction execution is pipelined, and two instructions can be executed in parallel. Pipelines Figure 4.1 shows the basic pipelines. Normally, a pipeline consists of seven stages: instruction fetch (I1/I2), decode and register read (ID), execution (E1/E2/E3), and write-back (WB). An instruction is executed as a combination of basic pipelines.
  • Page 64: Table 4.1 Representations Of Instruction Execution Patterns

    Figure 4.2 shows the instruction execution patterns. Representations in figure 4.2 and their descriptions are listed in table 4.1. Table 4.1 Representations of Instruction Execution Patterns Representation Description CPU EX pipe is occupied CPU LS pipe is occupied (with memory access) CPU LS pipe is occupied (without memory access) Either CPU EX pipe or CPU LS pipe is occupied E1/S1...
  • Page 65: Figure 4.2 Instruction Execution Patterns (1)

    (1-1) BF, BF/S, BT, BT/S, BRA, BSR: 1 issue cycle + 0 to 2 branch cycles Note: In branch instructions that are categorized E1/S1 E2/s2 E3/s3 as (1-1), the number of branch cycles may be reduced by prefetching. (I1) (I2) (ID) (Branch destination instruction) (1-2) JSR, JMP, BRAF, BSRF:...
  • Page 66: Figure 4.2 Instruction Execution Patterns (2)

    (2-1) 1-step operation (EX type): 1 issue cycle EXT[SU].[BW], MOVT, SWAP, XTRCT, ADD*, CMP*, DIV*, DT, NEG*, SUB*, AND, AND#, NOT, OR, OR#, TST, TST#, XOR, XOR#, ROT*, SHA*, SHL*, CLRS, CLRT, SETS, SETT Note: Except for AND#, OR#, TST#, and XOR# instructions using GBR relative addressing mode (2-2) 1-step operation (LS type): 1 issue cycle MOVA...
  • Page 67: Figure 4.2 Instruction Execution Patterns (3)

    (3-1) Load/store: 1 issue cycle MOV.[BWL], MOV.[BWL] @(d,GBR) (3-2) AND.B, OR.B, XOR.B, TST.B: 3 issue cycles E1S1 E2S2 E3S3 (3-3) TAS.B: 4 issue cycles E1S1 E2S2 E3S3 E1S1 E2S2 E3S3 (3-4) PREF, OCBI, OCBP, OCBWB, MOVCA.L, SYNCO: 1 issue cycle (3-5) LDTLB: 1 issue cycle E1s1 E2s2...
  • Page 68: Figure 4.2 Instruction Execution Patterns (4)

    (4-1) LDC to Rp_BANK/SSR/SPC/VBR: 1 issue cycle (4-2) LDC to DBR/SGR: 4 issue cycles (4-3) LDC to GBR: 1 issue cycle (4-4) LDC to SR: 4 issue cycles + 3 branch cycles E1s1 E2s2 E3s3 (Branch to the (I1) (I2) (ID) next instruction.) (4-5) LDC.L to Rp_BANK/SSR/SPC/VBR: 1 issue cycle...
  • Page 69: Figure 4.2 Instruction Execution Patterns (5)

    (4-9) STC from DBR/GBR/Rp_BANK/SSR/SPC/VBR/SGR: 1 issue cycle (4-10) STC from SR: 1 issue cycle E1s1 E2s2 E3s3 (4-11) STC.L from DBR/GBR/Rp_BANK/SSR/SPC/VBR/SGR: 1 issue cycle (4-12) STC.L from SR: 1 issue cycle E1S1 E2S2 E3S3 (4-13) LDS to PR: 1 issue cycle (4-14) LDS.L to PR: 1 issue cycle (4-15) STS from PR: 1 issue cycle (4-16) STS.L from PR: 1 issue cycle...
  • Page 70: Figure 4.2 Instruction Execution Patterns (6)

    (5-1) LDS to MACH/L: 1 issue cycle (5-2) LDS.L to MACH/L: 1 issue cycle (5-3) STS from MACH/L: 1 issue cycle (5-4) STS.L from MACH/L: 1 issue cycle (5-5) MULS.W, MULU.W: 1 issue cycle (5-6) DMULS.L, DMULU.L, MUL.L: 1 issue cycle (5-7) CLRMAC: 1 issue cycle (5-8) MAC.W: 2 issue cycles (5-9) MAC.L: 2 issue cycles...
  • Page 71: Figure 4.2 Instruction Execution Patterns (7)

    (6-1) LDS to FPUL: 1 issue cycle (6-2) STS from FPUL: 1 issue cycle (6-3) LDS.L to FPUL: 1 issue cycle (6-4) STS.L from FPUL: 1 issue cycle (6-5) LDS to FPSCR: 1 issue cycle (6-6) STS from FPSCR: 1 issue cycle (6-7) LDS.L to FPSCR: 1 issue cycle (6-8) STS.L from FPSCR: 1 issue cycle (6-9) FPU load/store instruction FMOV: 1 issue cycle...
  • Page 72: Figure 4.2 Instruction Execution Patterns (8)

    (6-12) Single-precision FABS, FNEG/double-precision FABS, FNEG: 1 issue cycle (6-13) FLDI0, FLDI1: 1 issue cycle (6-14) Single-precision floating-point computation: 1 issue cycle FCMP/EQ, FCMP/GT, FADD, FLOAT, FMAC, FMUL, FSUB, FTRC, FRCHG, FSCHG, FPCHG (6-15) Single-precision FDIV/FSQRT: 1 issue cycle FEDS (Divider occupied cycle) (6-16) Double-precision floating-point computation: 1 issue cycle FCMP/EQ, FCMP/GT, FADD, FLOAT, FSUB, FTRC, FCNVSD, FCNVDS (6-17) Double-precision floating-point computation: 1 issue cycle...
  • Page 73: Figure 4.2 Instruction Execution Patterns (9)

    (6-19) FIPR: 1 issue cycle (6-20) FTRV: 1 issue cycle (6-21) FSRRA: 1 issue cycle FEPL Function computing unit occupied cycle (6-22) FSCA: 1 issue cycle FEPL Function computing unit occupied cycle Figure 4.2 Instruction Execution Patterns (9) Rev. 1.50, 10/04, page 53 of 448...
  • Page 74: Parallel-Executability

    Parallel-Executability Instructions are categorized into six groups according to the internal function blocks used, as shown in table 4.2. Table 4.3 shows the parallel-executability of pairs of instructions in terms of groups. For example, ADD in the EX group and BRA in the BR group can be executed in parallel. Table 4.2 Instruction Groups Instruction...
  • Page 75: Table 4.3 Combination Of Preceding And Following Instructions

    Instruction Group Instruction AND.B #imm,@(R0,GBR) LDC.L @Rm+,SR PREFI TRAPA ICBI LDTLB TST.B #imm,@(R0,GBR) LDC Rm,DBR MAC.L SLEEP XOR.B #imm,@(R0,GBR) LDC Rm, SGR MAC.W STC SR,Rn LDC Rm,SR MOVCO STC.L SR,@-Rn LDC.L @Rm+,DBR MOVLI SYNCO LDC.L @Rm+,SGR TAS.B OR.B #imm,@(R0,GBR) [Legend] Rm/Rn @adr: Address SR1:...
  • Page 76: Issue Rates And Execution Cycles

    Issue Rates and Execution Cycles Instruction execution cycles are summarized in table 4.4. The Instruction Group column in table 4.4 corresponds to the categories in table 4.2. Penalty cycles due to a pipeline stall are not considered in the issue rates and execution cycles in this section. 1....
  • Page 77 Table 4.4 Issue Rates and Execution Cycles Functional Instruction Execution Execution Category No. Instruction Group Issue Rate Cycles Pattern Data transfer EXTS.B Rm,Rn instructions EXTS.W Rm,Rn EXTU.B Rm,Rn EXTU.W Rm,Rn Rm,Rn #imm,Rn MOVA @(disp,PC),R0 MOV.W @(disp,PC),Rn MOV.L @(disp,PC),Rn MOV.B @Rm,Rn MOV.W @Rm,Rn MOV.L...
  • Page 78 Functional Instruction Execution Execution Category No. Instruction Group Issue Rate Cycles Pattern Data transfer MOV.W R0,@(disp,Rn) instructions MOV.L Rm,@(disp,Rn) MOV.B Rm,@(R0,Rn) MOV.W Rm,@(R0,Rn) MOV.L Rm,@(R0,Rn) MOV.B R0,@(disp,GBR) MOV.W R0,@(disp,GBR) MOV.L R0,@(disp,GBR) MOVCA.L R0,@Rn MOVCO.L R0,@Rn MOVLI.L @Rm,R0 MOVUA.L @Rm,R0 3-10 MOVUA.L @Rm+,R0 3-10...
  • Page 79 Functional Instruction Execution Execution Category No. Instruction Group Issue Rate Cycles Pattern Fixed-point CMP/PL arithmetic CMP/PZ instructions CMP/STR Rm,Rn DIV0S Rm,Rn DIV0U DIV1 Rm,Rn DMULS.L Rm,Rn DMULU.L Rm,Rn MAC.L @Rm+,@Rn+ MAC.W @Rm+,@Rn+ MUL.L Rm,Rn MULS.W Rm,Rn MULU.W Rm,Rn Rm,Rn NEGC Rm,Rn Rm,Rn SUBC...
  • Page 80 Functional Instruction Execution Execution Category Group Cycles Pattern No. Instruction Issue Rate Logical XOR.B #imm,@(R0,GBR) instructions Shift ROTL instructions ROTR ROTCL ROTCR 100 SHAD Rm,Rn 101 SHAL 102 SHAR 103 SHLD Rm,Rn 104 SHLL 105 SHLL2 106 SHLL8 107 SHLL16 108 SHLR 109 SHLR2 110 SHLR8...
  • Page 81 Functional Instruction Execution Execution Category Group Cycles Pattern No. Instruction Issue Rate System 126 CLRT control 127 ICBI 8+5+3 instructions 128 SETS 129 SETT 130 PREFI 5+5+3 131 SYNCO Undefined Undefined 132 TRAPA #imm 8+5+1 133 RTE 134 SLEEP Undefined Undefined 135 LDTLB 136 LDC...
  • Page 82 Functional Instruction Execution Execution Category Group Cycles Pattern No. Instruction Issue Rate System 158 STC DBR,Rn control 159 STC SGR,Rn instructions 160 STC GBR,Rn 161 STC Rp_BANK,Rn 162 STC SR,Rn 4-10 163 STC SSR,Rn 164 STC SPC,Rn 165 STC VBR,Rn 166 STC.L DBR,@-Rn 4-11...
  • Page 83 Functional Instruction Execution Execution Category Group Cycles Pattern No. Instruction Issue Rate Single- 189 FLDS FRm,FPUL 6-10 precision 190 FSTS FPUL,FRn 6-11 floating-point 191 FABS 6-12 instructions 192 FADD FRm,FRn 6-14 193 FCMP/EQ FRm,FRn 6-14 194 FCMP/GT FRm,FRn 6-14 195 FDIV FRm,FRn 6-15 196 FLOAT...
  • Page 84 Functional Instruction Execution Execution Category Group Cycles Pattern No. Instruction Issue Rate Double- 220 FSQRT 6-18 precision 221 FSUB DRm,DRn 6-16 floating-point 222 FTRC DRm,FPUL 6-16 instructions FPU system 223 LDS Rm,FPUL control 224 LDS Rm,FPSCR instructions 225 LDS.L @Rm+,FPUL 226 LDS.L @Rm+,FPSCR 227 STS...
  • Page 85: Section 5 Exception Handling

    Section 5 Exception Handling Summary of Exception Handling Exception handling is performed by a special routine that is executed on a reset, a general exception, or an interrupt. For example, if the executing instruction ends abnormally, appropriate action must be taken in order to return to the original program sequence, or report the abnormality before terminating the processing.
  • Page 86: Trapa Exception Register (Tra)

    5.2.1 TRAPA Exception Register (TRA) The TRAPA exception register (TRA) consists of 8-bit immediate data (imm) for the TRAPA instruction. TRA is set automatically by hardware when a TRAPA instruction is executed. TRA can also be modified by software. Bit: Initial value: R/W: Bit:...
  • Page 87: Exception Event Register (Expevt)

    5.2.2 Exception Event Register (EXPEVT) The exception event register (EXPEVT) consists of a 12-bit exception code. The exception code set in EXPEVT is that for a reset or general exception event. The exception code is set automatically by hardware when an exception occurs. EXPEVT can also be modified by software. Bit: Initial value: R/W:...
  • Page 88: Interrupt Event Register (Intevt)

    5.2.3 Interrupt Event Register (INTEVT) The interrupt event register (INTEVT) consists of a 14-bit exception code. The exception code is set automatically by hardware when an exception occurs. INTEVT can also be modified by software. Bit field: INTCODE...
  • Page 89: Exception Handling Functions

    Exception Handling Functions 5.3.1 Exception Handling Flow In exception handling, the contents of the program counter (PC), status register (SR), and R15 are saved in the saved program counter (SPC), saved status register (SSR), and saved general register 15 (SGR), respectively, and the CPU starts execution of the appropriate exception handling routine according to the vector address.
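The register saving described above can be pictured with a small software model. This is only an illustrative sketch in Python, not SH-4A code: the `Cpu` class and `enter_exception` function are hypothetical names, only the three saves named in the text and the branch to VBR plus a vector offset are modelled, and any other SR updates made by the hardware are left out.

```python
from dataclasses import dataclass


@dataclass
class Cpu:
    """Hypothetical register file holding only what the text mentions."""
    pc: int = 0
    sr: int = 0
    r15: int = 0
    vbr: int = 0
    spc: int = 0
    ssr: int = 0
    sgr: int = 0


def enter_exception(cpu: Cpu, vector_offset: int) -> None:
    """Save PC, SR and R15, then branch to VBR + vector_offset."""
    cpu.spc = cpu.pc      # PC  -> SPC
    cpu.ssr = cpu.sr      # SR  -> SSR
    cpu.sgr = cpu.r15     # R15 -> SGR
    cpu.pc = (cpu.vbr + vector_offset) & 0xFFFFFFFF


if __name__ == "__main__":
    cpu = Cpu(pc=0x8C001000, sr=0x400000F0, r15=0x8C0FFFF0, vbr=0x8C000000)
    enter_exception(cpu, 0x100)  # offset used by several general exceptions
    print(hex(cpu.pc), hex(cpu.spc))
```

The offsets H'100, H'400 and H'600 that appear later in this section can be passed as `vector_offset`; reset vectors do not use VBR and are not covered by this sketch.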
  • Page 90: Exception Types And Priorities

    Exception Types and Priorities Table 5.3 shows the types of exceptions, with their relative priorities, vector addresses, and exception/interrupt codes. Table 5.3 Exceptions Exception Transition Direction* Exception Execution Priority Priority Vector Exception Category Mode Exception Level* Order* Address Offset Code* Reset Abort type Power-on reset...
  • Page 91 Exception Transition Direction* Exception Execution Priority Priority Vector Exception Category Mode Exception Level* Order* Address Offset Code* Interrupt Completion Nonmaskable interrupt — (VBR) H'600 H'1C0 type General interrupt request — (VBR) H'600 — Note: 1. When UBDE in CBCR = 1, PC = DBR. In other cases, PC = VBR + H'100. 2.
  • Page 92: Exception Flow

    Exception Flow 5.5.1 Exception Flow Figure 5.1 shows an outline flowchart of the basic operations in instruction execution and exception handling. For the sake of clarity, the following description assumes that instructions are executed sequentially, one by one. Figure 5.1 shows the relative priority order of the different kinds of exceptions (reset, general exception, and interrupt).
  • Page 93: Exception Source Acceptance

    5.5.2 Exception Source Acceptance A priority ranking is provided for all exceptions for use in determining which of two or more simultaneously generated exceptions should be accepted. Five of the general exceptions—general illegal instruction exception, slot illegal instruction exception, general FPU disable exception, slot FPU disable exception, and unconditional trap exception—are detected in the process of instruction decoding, and do not occur simultaneously in the instruction pipeline.
  • Page 94: Exception Requests And Bl Bit

    5.5.3 Exception Requests and BL Bit When the BL bit in SR is 0, exceptions and interrupts are accepted. When the BL bit in SR is 1 and an exception other than a user break is generated, the CPU's internal registers and the registers of the other modules are set to their states following a manual reset, and the CPU branches to the same address as in a reset (H'A0000000).
  • Page 95: Description Of Exceptions

    Description of Exceptions This section describes, for each exception, its source, the transition address when the exception occurs, and the processor operation when the transition is made. 5.6.1 Resets Power-On Reset: • Condition: Power-on reset request • Operations: Exception code H'000 is set in EXPEVT, initialization of the CPU and on-chip peripheral modules is carried out, and then a branch is made to the reset vector (H'A0000000).
  • Page 96 Instruction TLB Multiple Hit Exception: • Source: Multiple ITLB address matches • Transition address: H'A0000000 • Transition operations: The virtual address (32 bits) at which this exception occurred is set in TEA, and the corresponding virtual page number (22 bits) is set in PTEH [31:10]. ASID in PTEH indicates the ASID when this exception occurred.
  • Page 97: General Exceptions

    5.6.2 General Exceptions Data TLB Miss Exception: • Source: Address mismatch in UTLB address comparison • Transition address: VBR + H'00000400 • Transition operations: The virtual address (32 bits) at which this exception occurred is set in TEA, and the corresponding virtual page number (22 bits) is set in PTEH [31:10].
  • Page 98 Instruction TLB Miss Exception: • Source: Address mismatch in ITLB address comparison • Transition address: VBR + H'00000400 • Transition operations: The virtual address (32 bits) at which this exception occurred is set in TEA, and the corresponding virtual page number (22 bits) is set in PTEH [31:10]. ASID in PTEH indicates the ASID when this exception occurred.
  • Page 99 Initial Page Write Exception: • Source: TLB is hit in a store access, but dirty bit D = 0 • Transition address: VBR + H'00000100 • Transition operations: The virtual address (32 bits) at which this exception occurred is set in TEA, and the corresponding virtual page number (22 bits) is set in PTEH [31:10].
  • Page 100 Data TLB Protection Violation Exception: • Source: The access does not accord with the UTLB protection information (PR bits) shown below. Privileged Mode User Mode Only read access possible Access not possible Read/write access possible Access not possible Only read access possible Only read access possible Read/write access possible Read/write access possible...
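The protection table above can be written as a lookup. The sketch below is an assumption-laden illustration: the extracted table has lost its PR column, so the mapping of the four rows to PR = 00, 01, 10, 11 (in that order) is assumed, and `data_access_permitted` is a hypothetical helper name.

```python
# Rights per PR value as (privileged mode, user mode).
# "r" = read, "w" = write, "" = access not possible.
# ASSUMPTION: the four table rows correspond to PR = 00, 01, 10, 11 in order.
PR_RIGHTS = {
    0b00: ("r", ""),
    0b01: ("rw", ""),
    0b10: ("r", "r"),
    0b11: ("rw", "rw"),
}


def data_access_permitted(pr: int, privileged: bool, write: bool) -> bool:
    """Return True if the access accords with the PR bits."""
    rights = PR_RIGHTS[pr][0 if privileged else 1]
    return ("w" if write else "r") in rights


if __name__ == "__main__":
    # A user-mode write to a page that is read-only in user mode is refused.
    print(data_access_permitted(0b10, privileged=False, write=True))
```

An access for which this function returns False corresponds to the case in which the data TLB protection violation exception is raised.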
  • Page 101 Instruction TLB Protection Violation Exception: • Source: The access does not accord with the ITLB protection information (PR bits) shown below. Privileged Mode User Mode Access possible Access not possible Access possible Access possible • Transition address: VBR + H'00000100 •...
  • Page 102 Data Address Error: • Sources: - Word data access from other than a word boundary (2n + 1) - Longword data access from other than a longword data boundary (4n + 1, 4n + 2, or 4n + 3) - Quadword data access from other than a quadword data boundary (8n + 1, 8n + 2, 8n + 3, 8n + 4, 8n + 5, 8n + 6, or 8n + 7) - ...
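The boundary conditions listed above all reduce to one test: the address must be a multiple of the access size. A minimal sketch, assuming byte sizes of 2, 4 and 8 for word, longword and quadword (`is_misaligned` is a hypothetical name, and the remaining sources in the truncated list are not modelled):

```python
# Access size in bytes for each data width named in the text.
ACCESS_SIZES = {"word": 2, "longword": 4, "quadword": 8}


def is_misaligned(address: int, access: str) -> bool:
    """True if the address is not on the boundary required by the access."""
    return address % ACCESS_SIZES[access] != 0


if __name__ == "__main__":
    for addr in (0x1000, 0x1001, 0x1002, 0x1004):
        print(hex(addr), {k: is_misaligned(addr, k) for k in ACCESS_SIZES})
```

For example, address 4n + 2 is acceptable for a word access but misaligned for a longword access.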
  • Page 103 Instruction Address Error: • Sources: - Instruction fetch from other than a word boundary (2n + 1) - Instruction fetch from area H'80000000 to H'FFFFFFFF in user mode. Area H'E5000000 to H'E5FFFFFF can be accessed in user mode. For details, see section 9, L Memory.
  • Page 104 Unconditional Trap: • Source: Execution of TRAPA instruction • Transition address: VBR + H'00000100 • Transition operations: As this is a processing-completion-type exception, the PC contents for the instruction following the TRAPA instruction are saved in SPC. The values of SR and R15 when the TRAPA instruction is executed are saved in SSR and SGR.
  • Page 105 General Illegal Instruction Exception: • Sources: - Decoding of an undefined instruction not in a delay slot. Delayed branch instructions: JMP, JSR, BRA, BRAF, BSR, BSRF, RTS, RTE, BT/S, BF/S. Undefined instruction: H'FFFD - Decoding in user mode of a privileged instruction not in a delay slot. Privileged instructions: LDC, STC, RTE, LDTLB, SLEEP, but excluding LDC/STC instructions that access GBR •...
  • Page 106 Slot Illegal Instruction Exception: • Sources: - Decoding of an undefined instruction in a delay slot. Delayed branch instructions: JMP, JSR, BRA, BRAF, BSR, BSRF, RTS, RTE, BT/S, BF/S. Undefined instruction: H'FFFD - Decoding of an instruction that modifies PC in a delay slot. Instructions that modify PC: JMP, JSR, BRA, BRAF, BSR, BSRF, RTS, RTE, BT, BF, BT/S, BF/S, TRAPA, LDC Rm,SR, LDC.L @Rm+,SR, ICBI, PREFI - ...
  • Page 107 General FPU Disable Exception: • Source: Decoding of an FPU instruction* not in a delay slot with SR.FD =1 • Transition address: VBR + H'00000100 • Transition operations: The PC and SR contents for the instruction at which this exception occurred are saved in SPC and SSR.
  • Page 108 Slot FPU Disable Exception: • Source: Decoding of an FPU instruction in a delay slot with SR.FD =1 • Transition address: VBR + H'00000100 • Transition operations: The PC contents for the preceding delayed branch instruction are saved in SPC. The SR and R15 contents when this exception occurred are saved in SSR and SGR.
  • Page 109 Pre-Execution User Break/Post-Execution User Break: • Source: Fulfilling of a break condition set in the user break controller • Transition address: VBR + H'00000100, or DBR • Transition operations: In the case of a post-execution break, the PC contents for the instruction following the instruction at which the breakpoint is set are set in SPC.
  • Page 110 FPU Exception: • Source: Exception due to execution of a floating-point operation • Transition address: VBR + H'00000100 • Transition operations: The PC and SR contents for the instruction at which this exception occurred are saved in SPC and SSR. The R15 contents at this time are saved in SGR. Exception code H'120 is set in EXPEVT...
  • Page 111: Interrupts

    5.6.3 Interrupts NMI (Nonmaskable Interrupt): • Source: NMI pin edge detection • Transition address: VBR + H'00000600 • Transition operations: The PC and SR contents for the instruction immediately after this exception is accepted are saved in SPC and SSR. The R15 contents at this time are saved in SGR. Exception code H'1C0 is set in INTEVT.
  • Page 112: Priority Order With Multiple Exceptions

    General Interrupt Request: • Source: The interrupt mask level set in SR is smaller than the interrupt level of the interrupt request, and the BL bit in SR is 0 (accepted at instruction boundary). • Transition address: VBR + H'00000600 •...
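The acceptance condition for a general interrupt request can be expressed as a predicate on SR. The sketch below is illustrative only: the bit positions used (BL at bit 28, a 4-bit mask level field at bits 7 to 4) are not given in this excerpt and are stated here as assumptions, and `interrupt_accepted` is a hypothetical name.

```python
SR_BL = 1 << 28        # ASSUMPTION: position of the BL bit in SR
SR_IMASK_SHIFT = 4     # ASSUMPTION: mask level field occupies SR bits 7 to 4
SR_IMASK_WIDTH = 0xF


def interrupt_accepted(sr: int, request_level: int) -> bool:
    """True if mask level < request level and BL is 0."""
    mask_level = (sr >> SR_IMASK_SHIFT) & SR_IMASK_WIDTH
    blocked = bool(sr & SR_BL)
    return (not blocked) and mask_level < request_level


if __name__ == "__main__":
    print(interrupt_accepted(0x00000030, 4))  # mask level 3, request level 4
    print(interrupt_accepted(0x10000030, 4))  # same, but BL = 1
```

The NMI is not modelled here, since it is handled separately from maskable requests.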
  • Page 113 • Instructions that make two accesses to memory With MAC instructions, memory-to-memory arithmetic/logic instructions, TAS instructions, and MOVUA instructions, two data transfers are performed by a single instruction, and an exception will be detected for each of these data transfers. In these cases, therefore, the following order is used to determine priority.
  • Page 114: Usage Notes

    Usage Notes 1. Return from exception handling A. Check the BL bit in SR with software. If SPC and SSR have been saved to memory, set the BL bit in SR to 1 before restoring them. B. Issue an RTE instruction. When RTE is executed, the SPC contents are restored to PC, the SSR contents are restored to SR, and a branch is made to the SPC address to return from the exception handling routine.
  • Page 115 5. Changing the SR register value and accepting exception A. When the MD or BL bit in the SR register is changed by the LDC instruction, the acceptance of the exception is determined by the changed SR value, starting from the next instruction.* In the completion type exception, an exception is accepted after the next instruction has been executed.
  • Page 117: Section 6 Floating-Point Unit (Fpu)

    Section 6 Floating-Point Unit (FPU) Features The FPU has the following features. • Conforms to the IEEE 754 standard • 32 single-precision floating-point registers (can also be referenced as 16 double-precision registers) • Two rounding modes: Round to Nearest and Round to Zero •...
  • Page 118: Data Formats

    Data Formats 6.2.1 Floating-Point Format A floating-point number consists of the following three fields: • Sign bit (s) • Exponent field (e) • Fraction field (f) The SH-4A can handle single-precision and double-precision floating-point numbers, using the formats shown in figures 6.1 and 6.2. Figure 6.1 Format of Single-Precision Floating-Point Number. Figure 6.2 Format of Double-Precision Floating-Point Number...
  • Page 119 Table 6.1 Floating-Point Number Formats and Parameters (Parameter: Single-Precision / Double-Precision) Total bit width: 32 bits / 64 bits; Sign bit: 1 bit / 1 bit; Exponent field: 8 bits / 11 bits; Fraction field: 23 bits / 52 bits; Precision: 24 bits / 53 bits; Bias: +127 / +1023...
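The field widths and biases in Table 6.1 are enough to split a bit pattern into its sign, exponent and fraction. The following sketch uses Python on a host machine purely as an illustration; `split_fields` and `float_to_bits` are hypothetical helper names, and the host's `struct` module is used only to obtain a single-precision bit pattern.

```python
import struct

# Field widths and bias taken from Table 6.1.
FORMATS = {
    "single": {"exp_bits": 8, "frac_bits": 23, "bias": 127},
    "double": {"exp_bits": 11, "frac_bits": 52, "bias": 1023},
}


def split_fields(bits: int, fmt: str):
    """Return (sign, exponent field, fraction field) of a raw bit pattern."""
    p = FORMATS[fmt]
    fraction = bits & ((1 << p["frac_bits"]) - 1)
    exponent = (bits >> p["frac_bits"]) & ((1 << p["exp_bits"]) - 1)
    sign = (bits >> (p["frac_bits"] + p["exp_bits"])) & 1
    return sign, exponent, fraction


def float_to_bits(value: float) -> int:
    """Single-precision bit pattern of a host float."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


if __name__ == "__main__":
    s, e, f = split_fields(float_to_bits(-2.0), "single")
    print(s, e - FORMATS["single"]["bias"], f)
```

Subtracting the bias from the exponent field gives the unbiased exponent for normalized numbers.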
  • Page 120: Table 6.2 Floating-Point Ranges

    Table 6.2 Floating-Point Ranges (Type: Single-Precision / Double-Precision) Signaling non-number: H'7FFF FFFF to H'7FC0 0000 / H'7FFF FFFF FFFF FFFF to H'7FF8 0000 0000 0000; Quiet non-number: H'7FBF FFFF to H'7F80 0001 / H'7FF7 FFFF FFFF FFFF to H'7FF0 0000 0000 0001; Positive infinity: H'7F80 0000 / H'7FF0 0000 0000 0000; Positive normalized...
  • Page 121: Non-Numbers (Nan)

    6.2.2 Non-Numbers (NaN) Figure 6.3 shows the bit pattern of a non-number (NaN). A value is NaN in the following case: • Sign bit: Don't care • Exponent field: All bits are 1 • Fraction field: At least one bit is 1 The NaN is a signaling NaN (sNaN) if the MSB of the fraction field is 1, and a quiet NaN (qNaN) if the MSB is 0.
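The NaN conditions above translate directly into a bit test. This is a sketch for single-precision patterns only, following the convention stated in the text (fraction MSB = 1 means signaling, MSB = 0 means quiet); `classify_nan` is a hypothetical name.

```python
def classify_nan(bits: int) -> str:
    """Classify a 32-bit pattern as 'sNaN', 'qNaN' or 'not NaN'."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent != 0xFF or fraction == 0:
        return "not NaN"          # finite value or infinity
    # Per the text: fraction MSB (bit 22) set means signaling NaN.
    return "sNaN" if fraction & 0x400000 else "qNaN"


if __name__ == "__main__":
    for pattern in (0x7FC00000, 0x7FBFFFFF, 0x7F800000):
        print(hex(pattern), classify_nan(pattern))
```

The sign bit is ignored, as the text specifies, so negative patterns classify the same way as their positive counterparts.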
  • Page 122: Denormalized Numbers

    6.2.3 Denormalized Numbers For a denormalized floating-point value, the exponent field is 0 and the fraction field is non-zero. When the DN bit in FPSCR is 1, a denormalized number (source operand or operation result) is always treated as positive or negative zero in a floating-point operation that generates a value (an operation other than transfer instructions between registers, FNEG, or FABS).
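The effect of the DN bit on a denormalized single-precision value can be sketched as follows. This is an illustration under assumptions: the helper names are hypothetical, and the sign is assumed to be preserved when the value becomes zero, which is one reading of "positive or negative zero".

```python
def is_denormalized(bits: int) -> bool:
    """Exponent field 0 and fraction field non-zero (single precision)."""
    return ((bits >> 23) & 0xFF) == 0 and (bits & 0x7FFFFF) != 0


def apply_dn(bits: int, dn: int) -> int:
    """Model of FPSCR.DN = 1: a denormalized value becomes a signed zero.

    ASSUMPTION: the sign bit of the denormalized value is kept.
    """
    if dn == 1 and is_denormalized(bits):
        return bits & 0x80000000
    return bits


if __name__ == "__main__":
    print(hex(apply_dn(0x80000001, dn=1)))  # negative denormalized value
    print(hex(apply_dn(0x00000001, dn=0)))  # left unchanged when DN = 0
```

Transfer instructions, FNEG and FABS are excluded by the text and would bypass this step.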
  • Page 123: Register Descriptions

    Register Descriptions 6.3.1 Floating-Point Registers Figure 6.4 shows the floating-point register configuration. There are thirty-two 32-bit floating-point registers, organized in two banks: FPR0_BANK0 to FPR15_BANK0, and FPR0_BANK1 to FPR15_BANK1. These thirty-two registers are referenced as FR0 to FR15, DR0/2/4/6/8/10/12/14, FV0/4/8/12, XF0 to XF15, XD0/2/4/6/8/10/12/14, and XMTRX. Which of FPR0_BANK0 to FPR15_BANK0 and FPR0_BANK1 to FPR15_BANK1 each of these names refers to is determined by the FR bit of FPSCR.
  • Page 124: Figure 6.4 Floating-Point Registers

    7. Single-precision floating-point extended register matrix, XMTRX: XMTRX comprises all 16 XF registers. XMTRX = [XF0 XF4 XF8 XF12; XF1 XF5 XF9 XF13; XF2 XF6 XF10 XF14; XF3 XF7 XF11 XF15] (rows separated by semicolons). Figure 6.4 Floating-Point Registers: columns FPSCR.FR = 0 and FPSCR.FR = 1; rows FPR0_BANK0, FPR1_BANK0, FPR2_BANK0, FPR3_BANK0, FPR4_BANK0, FPR5_BANK0, FPR6_BANK0, FPR7_BANK0, FPR8_BANK0...
  • Page 125: Floating-Point Status/Control Register (Fpscr)

    6.3.2 Floating-Point Status/Control Register (FPSCR) Bits 31 to 22 (initial value: all 0): Reserved. These bits are always read as 0. The write value should always be 0.
  • Page 126: Figure 6.5 Relation Between Sz Bit And Endian

    Bits 17 to 12, Cause (initial value: all 0): FPU Exception Cause Field. Bits 11 to 7, Enable (initial value: all 0): FPU Exception Enable Field. Bits 6 to 2, Flag (initial value: all 0): FPU Exception Flag Field. Each time an FPU operation instruction is executed, the FPU exception cause field is cleared to 0...
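The three exception fields can be extracted from an FPSCR value with shifts and masks that follow the bit ranges given above (cause 17 to 12, enable 11 to 7, flag 6 to 2). A minimal sketch; `fpscr_exception_fields` is a hypothetical helper, not part of any Renesas library.

```python
def fpscr_exception_fields(fpscr: int) -> dict:
    """Extract the cause, enable and flag fields from an FPSCR value."""
    return {
        "cause": (fpscr >> 12) & 0x3F,   # bits 17 to 12 (6 bits)
        "enable": (fpscr >> 7) & 0x1F,   # bits 11 to 7  (5 bits)
        "flag": (fpscr >> 2) & 0x1F,     # bits 6 to 2   (5 bits)
    }


if __name__ == "__main__":
    value = (0b100000 << 12) | (0b10000 << 7) | (0b00001 << 2)
    print(fpscr_exception_fields(value))
```

The cause field is one bit wider than the other two because it also carries the FPU error (E) bit, which has no enable or flag counterpart.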
  • Page 127: Floating-Point Communication Register (Fpul)

    Table 6.3 Bit Allocation for FPU Exception Handling. Columns: Field Name, FPU Error (E), Invalid Operation (V), Division by Zero (Z), Overflow (O), Underflow (U), Inexact (I). Cause (FPU exception cause field): Bit 17, Bit 16, Bit 15, Bit 14, Bit 13, Bit 12. Enable (FPU exception enable field): None...
  • Page 128: Rounding

    Rounding In a floating-point instruction, rounding is performed when generating the final operation result from the intermediate result. Therefore, the result of combination instructions such as FMAC, FTRV, and FIPR will differ from the result when using a basic instruction such as FADD, FSUB, or FMUL.
  • Page 129: Floating-Point Exceptions

    Floating-Point Exceptions 6.5.1 General FPU Disable Exceptions and Slot FPU Disable Exceptions FPU-related exceptions occur when an FPU instruction is executed with SR.FD set to 1. When the FPU instruction is not in a delay slot, a general FPU disable exception occurs...
  • Page 130: Fpu Exception Handling

    6.5.3 FPU Exception Handling FPU exception handling is initiated in the following cases: • FPU error (E): FPSCR.DN = 0 and a denormalized number is input • Invalid operation (V): FPSCR.Enable.V = 1 and (instruction = FTRV or invalid operation) •...
  • Page 131: Graphics Support Functions

    Graphics Support Functions The SH-4A supports two kinds of graphics functions: new instructions for geometric operations, and pair single-precision transfer instructions that enable high-speed data transfer. 6.6.1 Geometric Operation Instructions Geometric operation instructions perform approximate-value computations. To enable high-speed computation with a minimum of hardware, the SH-4A ignores comparatively small values in the partial computation results of four multiplications.
  • Page 132: Pair Single-Precision Data Transfer

    Since an inexact exception is not detected by an FTRV instruction, the inexact exception (I) bit in both the FPU exception cause field and flag field is always set to 1 when an FTRV instruction is executed. Therefore, if the I bit is set in the FPU exception enable field, FPU exception handling will be executed...
  • Page 133: Section 7 Memory Management Unit (Mmu)

    Section 7 Memory Management Unit (MMU) The SH-4A supports an 8-bit address space identifier, a 32-bit virtual address space, and a 29-bit physical address space. Address translation from virtual addresses to physical addresses is enabled by the memory management unit (MMU) in the SH-4A. The MMU performs high-speed address translation by caching user-created address translation table information in an address translation buffer (translation lookaside buffer: TLB).
  • Page 134 When address translation from virtual memory to physical memory is performed using the MMU, it may happen that the translation information has not been recorded in the MMU, or the virtual memory of a different process is accessed by mistake. In such cases, the MMU will generate an exception, change the physical memory mapping, and record the new address translation information.
  • Page 135: Address Spaces

    Figure 7.1 Role of MMU (diagram: the virtual memory of processes 1 to 3 mapped onto physical memory by the MMU) 7.1.1 Address Spaces Virtual Address Space: The SH-4A supports a 32-bit virtual address space, and can access a 4-Gbyte address space.
  • Page 136: Figure 7.2 Virtual Address Space (At In Mmucr= 0)

    Physical address space H'0000 0000 H'0000 0000 Area 0 Area 1 Area 2 Area 3 U0 area Area 4 P0 area Cacheable Area 5 Cacheable Area 6 Area 7 H'8000 0000 H'8000 0000 P1 area Cacheable Address error H'A000 0000 P2 area Non-cacheable H'C000 0000...
  • Page 137 • P0, P3, and U0 Areas: The P0, P3, and U0 areas allow address translation using the TLB and access using the cache. When the MMU is disabled, replacing the upper 3 bits of an address with 0s gives the corresponding physical address.
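The area of a virtual address is decided by its upper 3 bits, and with the MMU disabled the physical address is obtained by clearing those bits. The sketch below illustrates this for privileged mode; the function names are hypothetical, and the P3 boundary at H'C000 0000 is assumed from the address map since the figure is truncated in this excerpt.

```python
def privileged_area(virtual_address: int) -> str:
    """Name of the privileged-mode area containing a 32-bit virtual address."""
    top3 = (virtual_address >> 29) & 0x7
    if top3 < 4:
        return "P0"                       # H'0000 0000 to H'7FFF FFFF
    # ASSUMPTION: P3 starts at H'C000 0000 (top 3 bits = 110).
    return {4: "P1", 5: "P2", 6: "P3", 7: "P4"}[top3]


def physical_without_mmu(virtual_address: int) -> int:
    """Replace the upper 3 bits with 0s, giving a 29-bit physical address."""
    return virtual_address & 0x1FFFFFFF


if __name__ == "__main__":
    for va in (0x0C001234, 0x8C001234, 0xAC001234):
        print(hex(va), privileged_area(va), hex(physical_without_mmu(va)))
```

The three example addresses all map to the same physical location, which shows how the P0, P1 and P2 areas can alias one physical region with different cache behaviour.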
  • Page 138: Figure 7.4 P4 Area

    H'E000 0000: Store queue; H'E400 0000: Reserved area; H'E500 0000: On-chip memory area; H'E600 0000: Reserved area; H'F000 0000: Instruction cache address array; H'F100 0000: Instruction cache data array; H'F200 0000: Instruction TLB address array; H'F300 0000: Instruction TLB data array; H'F400 0000: Operand cache address array; H'F500 0000...
  • Page 139: Figure 7.5 Physical Address Space

    The area from H'F500 0000 to H'F5FF FFFF is used for direct access to the operand cache data array. For details, see section 8.6.4, OC Data Array. The area from H'F600 0000 to H'F60F FFFF is used for direct access to the unified TLB address array.
  • Page 140 Address Translation: When the MMU is used, the virtual address space is divided into units called pages, and translation to physical addresses is carried out in these page units. The address translation table in external memory contains the physical addresses corresponding to virtual addresses and additional information such as memory protection codes.
  • Page 141: Register Descriptions

    Register Descriptions The following registers are related to MMU processing. Table 7.1 Register Configuration Area 7 Register Name Abbreviation R/W P4 Address* Address* Size Page table entry high register PTEH H'FF00 0000 H'1F00 0000 Page table entry low register PTEL H'FF00 0004 H'1F00 0004 Translation table base register...
  • Page 142: Page Table Entry High Register (Pteh)

    7.2.1 Page Table Entry High Register (PTEH) PTEH consists of the virtual page number (VPN) and address space identifier (ASID). When an MMU exception or address error exception occurs, the VPN of the virtual address at which the exception occurred is set in the VPN bit by hardware. VPN varies according to the page size, but the VPN set by hardware when an exception occurs consists of the upper 22 bits of the virtual address which caused the exception.
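The value that hardware places in the VPN field can be illustrated by composing a PTEH image from a virtual address and an ASID. This is a sketch under assumptions: the VPN occupies PTEH[31:10] as stated in the exception descriptions, the 8-bit ASID is assumed to sit in bits 7 to 0, and `make_pteh` is a hypothetical name.

```python
def make_pteh(virtual_address: int, asid: int) -> int:
    """Compose a PTEH image: upper 22 address bits as VPN, plus the ASID."""
    vpn_field = virtual_address & 0xFFFFFC00   # PTEH[31:10]
    asid_field = asid & 0xFF                   # ASSUMPTION: PTEH[7:0]
    return vpn_field | asid_field


if __name__ == "__main__":
    print(hex(make_pteh(0x12345678, 0x2A)))
```

Because the hardware always stores the upper 22 bits, software handling a page larger than 1 Kbyte would need to mask off the extra low-order VPN bits itself when searching its page table.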
  • Page 143: Page Table Entry Low Register (Ptel)

    7.2.2 Page Table Entry Low Register (PTEL) PTEL is used to hold the physical page number and page management information to be recorded in the UTLB by means of the LDTLB instruction. The contents of this register are not changed unless a software directive is issued.
  • Page 144: Translation Table Base Register (Ttb)

    7.2.3 Translation Table Base Register (TTB) TTB is used to store the base address of the currently used page table, and so on. The contents of TTB are not changed unless a software directive is issued. This register can be used freely by software.
  • Page 145: Mmu Control Register (Mmucr)

    7.2.5 MMU Control Register (MMUCR) The individual bits perform MMU settings as shown below. Therefore, MMUCR rewriting should be performed by a program in the P1 or P2 area. After MMUCR has been updated, execute one of the following three methods before an access (including an instruction fetch) to the P0, P3, U0, or store queue area is performed.
  • Page 146 Bit Name Initial Value Description 31 to 26 LRUI All 0 Least Recently Used ITLB These bits indicate the ITLB entry to be replaced. The LRU (least recently used) method is used to decide the ITLB entry to be replaced in the event of an ITLB miss.
  • Page 147 Bits 15 to 10, URC (initial value: all 0): UTLB Replace Counter. These bits serve as a random counter that indicates the UTLB entry to be replaced by an LDTLB instruction. The counter is incremented each time the UTLB is accessed...
  • Page 148: Physical Address Space Control Register (Pascr)

    7.2.6 Physical Address Space Control Register (PASCR) PASCR controls the operation in the physical address space. Bits 31 to 8 (initial value: all 0): Reserved. For details on reading from or writing to these bits, see the description in General Precautions on Handling of Product.
  • Page 149: Instruction Re-Fetch Inhibit Control Register (Irmcr)

    7.2.7 Instruction Re-Fetch Inhibit Control Register (IRMCR) IRMCR controls whether the next instruction is fetched again after a specific resource has been changed. The specific resources are some of the control registers, the TLB, and the cache. In the initial state, the next instruction is fetched again after the resource is changed.
  • Page 150 Initial Bit Name Value Description Re-Fetch Inhibit after LDTLB Execution This bit controls whether re-fetch is performed for the next instruction after the LDTLB instruction has been executed. 0: Re-fetch is performed 1: Re-fetch is not performed Re-Fetch Inhibit after Writing Memory-Mapped TLB This bit controls whether re-fetch is performed for the next instruction after writing memory-mapped ITLB/UTLB while the AT bit in MMUCR is set to 1.
  • Page 151: Tlb Functions

    TLB Functions 7.3.1 Unified TLB (UTLB) Configuration The UTLB is used for the following two purposes: 1. To translate a virtual address to a physical address in a data access 2. As a table of address translation information to be recorded in the ITLB in the event of an ITLB miss The UTLB is so called because of its use for the above two purposes.
  • Page 152 • SH: Share status bit When 0, pages are not shared by processes. When 1, pages are shared by processes. • SZ[1:0]: Page size bits Specify the page size. 00: 1-Kbyte page 01: 4-Kbyte page 10: 64-Kbyte page 11: 1-Mbyte page •...
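The SZ encoding above determines how many low-order address bits pass through translation unchanged. The sketch below illustrates this split; `translate` and its `page_base` parameter (the physical base address of the page) are hypothetical names, and TLB lookup itself is not modelled.

```python
# Page size in bytes for each SZ[1:0] value listed in the text.
PAGE_SIZES = {
    0b00: 1 << 10,   # 1-Kbyte page
    0b01: 1 << 12,   # 4-Kbyte page
    0b10: 1 << 16,   # 64-Kbyte page
    0b11: 1 << 20,   # 1-Mbyte page
}


def translate(virtual_address: int, page_base: int, sz: int) -> int:
    """Combine a physical page base with the in-page offset of the address."""
    offset_mask = PAGE_SIZES[sz] - 1
    return (page_base & ~offset_mask) | (virtual_address & offset_mask)


if __name__ == "__main__":
    print(hex(translate(0x00012ABC, 0x0C340000, 0b01)))  # 4-Kbyte page
    print(hex(translate(0x00012ABC, 0x0C340000, 0b10)))  # 64-Kbyte page
```

With a larger page, more of the virtual address is treated as offset, so the same virtual address lands at a different physical address.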
  • Page 153: Instruction Tlb (Itlb) Configuration

    • D: Dirty bit Indicates whether a write has been performed to a page. 0: Write has not been performed 1: Write has been performed • WT: Write-through bit Specifies the cache write mode. 0: Copy-back mode 1: Write-through mode •...
  • Page 154: Address Translation Method

    7.3.3 Address Translation Method Figure 7.9 shows a flowchart of a memory access using the UTLB. Data access to virtual address (VA) VA is VA is VA is VA is in P0, U0, in P4 area in P2 area in P1 area or P3 area MMUCR.AT = 1 CCR.OCE?
  • Page 155: Figure 7.10 Flowchart Of Memory Access Using Itlb

    Figure 7.10 shows a flowchart of a memory access using the ITLB. Instruction access to virtual address (VA) VA is in P0, U0, VA is VA is VA is or P3 area in P1 area in P4 area in P2 area MMUCR.AT = 1 CCR.ICE? SH = 0...
  • Page 156: Mmu Functions

    MMU Functions 7.4.1 MMU Hardware Management The SH-4A supports the following MMU functions. 1. The MMU decodes the virtual address to be accessed by software, and performs address translation by controlling the UTLB/ITLB in accordance with the MMUCR settings. 2. The MMU determines the cache access status on the basis of the page management information read during address translation (C and WT bits).
  • Page 157: Mmu Instruction (Ldtlb)

    7.4.3 MMU Instruction (LDTLB) A TLB load instruction (LDTLB) is provided for recording UTLB entries. When an LDTLB instruction is issued, the SH-4A copies the contents of PTEH and PTEL to the UTLB entry indicated by the URC bit in MMUCR. ITLB entries are not updated by the LDTLB instruction, and therefore address translation information purged from the UTLB entry may still remain in the ITLB entry.
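The LDTLB behaviour described above, copying PTEH and PTEL into the UTLB entry selected by URC, can be modelled in a few lines. This is only a software sketch: the UTLB is represented as a Python list, the entry count of 64 is assumed from the 6-bit URC field (MMUCR bits 15 to 10), and `ldtlb` is a hypothetical function name.

```python
URC_SHIFT = 10               # URC occupies MMUCR bits 15 to 10
URC_MASK = 0x3F
UTLB_ENTRIES = URC_MASK + 1  # ASSUMPTION: one entry per URC value


def ldtlb(utlb: list, mmucr: int, pteh: int, ptel: int) -> int:
    """Copy PTEH and PTEL into the UTLB entry indicated by MMUCR.URC."""
    entry = (mmucr >> URC_SHIFT) & URC_MASK
    utlb[entry] = (pteh, ptel)
    return entry


if __name__ == "__main__":
    utlb = [None] * UTLB_ENTRIES
    index = ldtlb(utlb, mmucr=5 << URC_SHIFT, pteh=0x1234542A, ptel=0x0C340000)
    print(index, utlb[index])
```

As the text notes, the ITLB is not touched by this operation, so a stale ITLB entry for the replaced page may have to be invalidated separately.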
  • Page 158: Figure 7.11 Operation Of Ldtlb Instruction

    The operation of the LDTLB instruction is shown in figure 7.11. MMUCR 26252423 18171615 10 9 8 7 3 2 1 0 LRUI — — — TI — AT SQMD Entry specification PTEH PTEL 10 9 8 7 2928 9 8 7 6 5 4 3 2 1 0 —...
  • Page 159: Hardware Itlb Miss Handling

    7.4.4 Hardware ITLB Miss Handling In an instruction access, the SH-4A searches the ITLB. If it cannot find the necessary address translation information (ITLB miss occurred), the UTLB is searched by hardware, and if the necessary address translation information is present, it is recorded in the ITLB. This procedure is known as hardware ITLB miss handling.
  • Page 160: Mmu Exceptions

    MMU Exceptions There are seven MMU exceptions: instruction TLB multiple hit exception, instruction TLB miss exception, instruction TLB protection violation exception, data TLB multiple hit exception, data TLB miss exception, data TLB protection violation exception, and initial page write exception. Refer to figures 7.9 and 7.10 for the conditions under which each of these exceptions occurs.
  • Page 161: Instruction Tlb Miss Exception

    7.5.2 Instruction TLB Miss Exception An instruction TLB miss exception occurs when address translation information for the virtual address to which an instruction access is made is not found in the UTLB entries by the hardware ITLB miss handling routine. The instruction TLB miss exception processing carried out by hardware and software is shown below.
  • Page 162: Instruction Tlb Protection Violation Exception

    7.5.3 Instruction TLB Protection Violation Exception An instruction TLB protection violation exception occurs when, even though an ITLB entry contains address translation information matching the virtual address to which an instruction access is made, the actual access type is not permitted by the access right specified by the PR bit. The instruction TLB protection violation exception processing carried out by hardware and software is shown below.
  • Page 163: Data Tlb Multiple Hit Exception

    7.5.4 Data TLB Multiple Hit Exception A data TLB multiple hit exception occurs when more than one UTLB entry matches the virtual address to which a data access has been made. When a data TLB multiple hit exception occurs, a reset is executed, and cache coherency is not guaranteed.
  • Page 164: Data Tlb Protection Violation Exception

    Software Processing (Data TLB Miss Exception Handling Routine): Software is responsible for searching the external memory page table and assigning the necessary page table entry. Software should carry out the following processing in order to find and assign the necessary page table entry.
  • Page 165: Initial Page Write Exception

    Software Processing (Data TLB Protection Violation Exception Handling Routine): Resolve the data TLB protection violation, execute the exception handling return instruction (RTE), terminate the exception handling routine, and return control to the normal flow. The RTE instruction should be issued at least one instruction after the LDTLB instruction. 7.5.7 Initial Page Write Exception An initial page write exception occurs when the D bit is 0 even though a UTLB entry contains...
  • Page 166: Memory-Mapped Tlb Configuration

    6. Finally, execute the exception handling return instruction (RTE), terminate the exception handling routine, and return control to the normal flow. The RTE instruction should be issued at least one instruction after the LDTLB instruction. Memory-Mapped TLB Configuration To enable the ITLB and UTLB to be managed by software, their contents are allowed to be read from and written to by a program in the P2 area with a MOV instruction in privileged mode.
  • Page 167: Itlb Address Array

    7.6.1 ITLB Address Array The ITLB address array is allocated to addresses H'F200 0000 to H'F2FF FFFF in the P4 area. An address array access requires a 32-bit address field specification (when reading or writing) and a 32-bit data field specification (when writing). Information for selecting the entry to be accessed is specified in the address field, and VPN, V, and ASID to be written to the address array are specified in the data field.
  • Page 168: Itlb Data Array

    7.6.2 ITLB Data Array The ITLB data array is allocated to addresses H'F300 0000 to H'F37F FFFF in the P4 area. A data array access requires a 32-bit address field specification (when reading or writing) and a 32-bit data field specification (when writing). Information for selecting the entry to be accessed is specified in the address field, and PPN, V, SZ, PR, C, and SH to be written to the data array are specified in the data field.
  • Page 169: Utlb Address Array

    7.6.3 UTLB Address Array The UTLB address array is allocated to addresses H'F600 0000 to H'F60F FFFF in the P4 area. An address array access requires a 32-bit address field specification (when reading or writing) and a 32-bit data field specification (when writing). Information for selecting the entry to be accessed is specified in the address field, and VPN, D, V, and ASID to be written to the address array are specified in the data field.
  • Page 170: Utlb Data Array

    14 13 Address field 1 1 1 1 0 1 1 0 0 0 0 0 * * * * * * * * * * 10 9 8 7 Data field ASID VPN: Virtual page number ASID: Address space identifier Validity bit Association bit Entry...
  • Page 171: 32-Bit Address Extended Mode

    14 13 Address field 1 1 1 1 0 1 1 1 0 0 0 0 * * * * * * * * * * 29 28 10 9 8 7 Data field PPN: Physical page number Protection key data Validity bit Cacheability bit Entry...
  • Page 172: Overview Of 32-Bit Address Extended Mode

    7.7.1 Overview of 32-Bit Address Extended Mode In 32-bit address extended mode, the privileged space mapping buffer (PMB) is introduced. The PMB maps virtual addresses in the P1 or P2 area, which are not translated in 29-bit address mode, to the 32-bit physical address space. In areas that are subject to address translation by the TLB (UTLB/ITLB), the PPN field of the UTLB or ITLB is extended by three upper bits so that addresses after TLB translation can cover the 32-bit physical address space.
  • Page 173 [Legend] • VPN: Virtual page number For 16-Mbyte page: Upper 8 bits of virtual address For 64-Mbyte page: Upper 6 bits of virtual address For 128-Mbyte page: Upper 5 bits of virtual address For 512-Mbyte page: Upper 3 bits of virtual address •...
  • Page 174: Pmb Function

    • UB: Buffered write bit Specifies whether a buffered write is performed. 0: Buffered write (Data access of subsequent processing proceeds without waiting for the write to complete.) 1: Unbuffered write (Data access of subsequent processing is stalled until the write has completed.) 7.7.4 PMB Function...
  • Page 175: Figure 7.18 Memory-Mapped Pmb Address Array

    1. PMB address array read When memory reading is performed while bits 31 to 20 in the address field are specified as H'F61 which indicates the PMB address array and bits 11 to 8 in the address field as an entry, bits 31 to 24 in the data field are read as VPN and bit 8 in the data field as V.
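The address and data field layout described for a PMB address array read can be sketched in C as follows. This only builds the P4 address and decodes a value that has already been read; the names are illustrative, and the range of 16 entries is an inference from the 4-bit entry field (bits 11 to 8), not a statement taken from the text above.

```c
#include <stdint.h>

#define PMB_ADDR_ARRAY_BASE 0xF6100000u  /* address bits 31 to 20 = H'F61 */

/* P4 address for a PMB address array access.
 * The entry number is placed in address bits 11 to 8. */
uint32_t pmb_addr_array_address(unsigned entry)
{
    return PMB_ADDR_ARRAY_BASE | ((uint32_t)(entry & 0xFu) << 8);
}

/* VPN: bits 31 to 24 of the data field. */
uint32_t pmb_data_vpn(uint32_t data)
{
    return data >> 24;
}

/* V: bit 8 of the data field. */
uint32_t pmb_data_v(uint32_t data)
{
    return (data >> 8) & 1u;
}
```

On real hardware the address returned by `pmb_addr_array_address` would be read through a volatile 32-bit pointer from privileged code; that access is omitted here so the sketch stays host-runnable.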
  • Page 176: Notes On Using 32-Bit Address Extended Mode

    [Figure: memory-mapped PMB data array. Address field: bits 31 to 20 are 1111 0111 0001 (H'F71); the bits outside the entry field are 0. Data field legend: PPN: Physical page number; Buffered write bit]...
  • Page 177 ITLB: The PPN field in the ITLB is extended to bits 31 to 10. UTLB: The PPN field in the UTLB is extended to bits 31 to 10. The same UB bit as that in the PMB is added in each entry of the UTLB. •...
  • Page 179: Section 8 Caches

    Section 8 Caches The SH-4A has an on-chip 32-Kbyte instruction cache (IC) for instructions and an on-chip 32-Kbyte operand cache (OC) for data. Note: For the size of the instruction cache and operand cache, see the hardware manual of the target product.
  • Page 180: Figure 8.1 Configuration Of Operand Cache (Oc)

    The operand cache of the SH-4A is 4-way set-associative, each way comprising 256 cache lines. Figure 8.1 shows the configuration of the operand cache. The instruction cache is 4-way set-associative, each way comprising 256 cache lines. Figure 8.2 shows the configuration of the instruction cache. Virtual address [12:5] Longword (LW) selection...
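The cache geometry can be checked with a small C sketch. It uses the figures given in this section (4 ways, 256 lines per way, 32 bytes per line, entry selected by virtual address bits [12:5]). The longword selection by bits [4:2] is an assumption derived from the 32-byte line size, and all names are illustrative.

```c
#include <stdint.h>

enum {
    CACHE_WAYS       = 4,    /* 4-way set-associative */
    CACHE_LINES      = 256,  /* cache lines per way */
    CACHE_LINE_BYTES = 32    /* bytes held per cache line */
};

/* Entry (cache line) index: virtual address bits [12:5]. */
unsigned cache_entry(uint32_t vaddr)
{
    return (vaddr >> 5) & 0xFFu;
}

/* Longword within the line: assumed to be address bits [4:2]. */
unsigned cache_longword(uint32_t vaddr)
{
    return (vaddr >> 2) & 0x7u;
}

/* Capacity implied by the geometry above, in bytes. */
unsigned cache_capacity(void)
{
    return CACHE_WAYS * CACHE_LINES * CACHE_LINE_BYTES;
}
```

The product 4 × 256 × 32 gives 32768 bytes, which agrees with the 32-Kbyte size quoted for each cache; products with a different cache size would need different constants.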
  • Page 181: Figure 8.2 Configuration Of Instruction Cache (Ic)

    [Figure 8.2 Configuration of Instruction Cache (IC): virtual address bits [12:5] select the entry; the address array (way 0 to way 3) holds 19 bits and 1 bit per line; the data array (way 0 to way 3) holds eight 32-bit longwords per line, chosen by the longword (LW) selection; 6 bits are held per entry; comparison]...
  • Page 182: Register Descriptions

    • Data array The data field holds 32 bytes (256 bits) of data per cache line. The data array is not initialized by a power-on or manual reset. • LRU In a 4-way set-associative method, up to 4 items of data can be registered in the cache at each entry address.
  • Page 183: Cache Control Register (Ccr)

    8.2.1 Cache Control Register (CCR) CCR controls the cache operating mode, the cache write mode, and invalidation of all cache entries. CCR modifications must only be made by a program in the non-cacheable P2 area. After CCR has been updated, execute one of the following three methods before an access (including an instruction fetch) to the cacheable area is performed.
  • Page 184 Initial Bit Name Value Description 10, 9 — All 0 Reserved For details on reading from or writing to these bits, see description in General Precautions on Handling of Product. IC Enable Bit Selects whether the IC is used. Note, however, that when address translation is performed, the IC cannot be used unless the C bit in the page management information is also 1.
  • Page 185: Queue Address Control Register 0 (Qacr0)

    8.2.2 Queue Address Control Register 0 (QACR0) QACR0 specifies the area onto which store queue 0 (SQ0) is mapped when the MMU is disabled. Bit: Initial value: R/W: Bit: AREA0 Initial value: R/W: Initial Bit Name Value Description 31 to 5 — All 0 Reserved For details on reading from or writing to these bits, see...
  • Page 186: Queue Address Control Register 1 (Qacr1)

    8.2.3 Queue Address Control Register 1 (QACR1) QACR1 specifies the area onto which store queue 1 (SQ1) is mapped when the MMU is disabled. Bit: Initial value: R/W: Bit: AREA1 Initial value: R/W: Initial Bit Name Value Description 31 to 5 — All 0 Reserved For details on reading from or writing to these bits, see...
  • Page 187: On-Chip Memory Control Register (Ramcr)

    8.2.4 On-Chip Memory Control Register (RAMCR) RAMCR controls the number of ways in the IC and OC. RAMCR modifications must only be made by a program in the non-cacheable P2 area. After RAMCR has been updated, execute one of the following three methods before an access (including an instruction fetch) to the cacheable area or the L memory area is performed.
  • Page 188 Initial Bit Name Value Description IC2W IC Two-Way Mode bit 0: IC is a four-way operation 1: IC is a two-way operation For details, see section 8.4.3, IC Two-Way Mode. OC2W OC Two-Way Mode bit 0: OC is a four-way operation 1: OC is a two-way operation For details, see section 8.3.6, OC Two-Way Mode.
  • Page 189: Operand Cache Operation

    Operand Cache Operation 8.3.1 Read Operation When the Operand Cache (OC) is enabled (OCE = 1 in CCR) and data is read from a cacheable area, the cache operates as follows: 1. The tag, V bit, U bit, and LRU bits on each way are read from the cache line indexed by virtual address bits [12:5].
  • Page 190: Prefetch Operation

    8.3.2 Prefetch Operation When the Operand Cache (OC) is enabled (OCE = 1 in CCR) and data is prefetched from a cacheable area, the cache operates as follows: 1. The tag, V bit, U bit, and LRU bits on each way are read from the cache line indexed by virtual address bits [12:5].
  • Page 191: Write Operation

    8.3.3 Write Operation When the Operand Cache (OC) is enabled (OCE = 1 in CCR) and data is written to a cacheable area, the cache operates as follows: 1. The tag, V bit, U bit, and LRU bits on each way are read from the cache line indexed by virtual address bits [12:5].
  • Page 192: Write-Back Buffer

    6. Cache miss (copy-back, with write-back) The tag and data field of the cache line on the way selected for replacement are saved in the write-back buffer. Then a data write in accordance with the access size is performed for the data field on the hit way which is indexed by virtual address bits [4:0].
  • Page 193: Oc Two-Way Mode

    8.3.6 OC Two-Way Mode When the OC2W bit in RAMCR is set to 1, OC two-way mode which only uses way 0 and way 1 in the OC is entered. Thus, power consumption can be reduced. In this mode, only way 0 and way 1 are used even if a memory-mapped OC access is made.
  • Page 194: Prefetch Operation

    8.4.2 Prefetch Operation When the IC is enabled (ICE = 1 in CCR) and instruction prefetches are performed from a cacheable area, the instruction cache operates as follows: 1. The tag, V bit, U bit, and LRU bits on each way are read from the cache line indexed by virtual address bits [12:5].
  • Page 195: Cache Operation Instruction

    Cache Operation Instruction 8.5.1 Coherency between Cache and External Memory Coherency between cache and external memory should be assured by software. In the SH-4A, the following six instructions are supported for cache operations. Details of these instructions are given in section 10, Instruction Descriptions. •...
  • Page 196: Prefetch Operation

    FLUSH transaction: When the operand cache is enabled, the FLUSH transaction checks the operand cache, and if the hit line is dirty, the data is written back to external memory. If the transaction does not hit the cache or the hit entry is not dirty, no operation is performed. 8.5.2 Prefetch Operation The SH-4A supports a prefetch instruction to reduce the cache fill penalty incurred as the result of...
  • Page 197: Ic Address Array

    8.6.1 IC Address Array The IC address array is allocated to addresses H'F000 0000 to H'F0FF FFFF in the P4 area. An address array access requires a 32-bit address field specification (when reading or writing) and a 32-bit data field specification. The way and entry to be accessed are specified in the address field, and the write tag and V bit are specified in the data field.
  • Page 198: Ic Data Array

    5 4 3 2 1 0 Address field 1 1 1 1 0 0 0 0 Entry * * * * * * * * * 10 9 Data field : Validity bit : Association bit : Reserved bits (write value should be 0 and read value is undefined ) : Don't care Figure 8.5 Memory-Mapped IC Address Array 8.6.2...
  • Page 199: Oc Address Array

    2 1 0 Address field 1 1 1 1 0 0 0 1 Entry * * * * * * * * * Data field Longword data : Longword specification bits : Don't care Figure 8.6 Memory-Mapped IC Data Array 8.6.3 OC Address Array The OC address array is allocated to addresses H'F400 0000 to H'F4FF FFFF in the P4 area.
  • Page 200: Figure 8.7 Memory-Mapped Oc Address Array

    3. OC address array write (associative) When a write is performed with the A bit in the address field set to 1, the tag in each way stored in the entry specified in the address field is compared with the tag specified in the data field.
  • Page 201: Oc Data Array

    8.6.4 OC Data Array The OC data array is allocated to addresses H'F500 0000 to H'F5FF FFFF in the P4 area. A data array access requires a 32-bit address field specification (when reading or writing) and a 32-bit data field specification. The way and entry to be accessed are specified in the address field, and the longword data to be written is specified in the data field.
  • Page 202: Store Queues

    Store Queues The SH-4A supports two 32-byte store queues (SQs) to perform high-speed writes to external memory. 8.7.1 SQ Configuration There are two 32-byte store queues, SQ0 and SQ1, as shown in figure 8.9. These two store queues can be set independently. SQ0[0] SQ0[1] SQ0[2]...
  • Page 203: Transfer To External Memory

    8.7.3 Transfer to External Memory Transfer from the SQs to external memory can be performed with a prefetch instruction (PREF). Issuing a PREF instruction for addresses H'E000 0000 to H'E3FF FFFC in the P4 area starts a transfer from the SQs to external memory. The transfer length is fixed at 32 bytes, and the start address is always at a 32-byte boundary.
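The address conditions for a store queue transfer can be sketched in C. The region check below treats the SQ area as the 64-Mbyte block starting at H'E000 0000, which contains the longword address range quoted above; the function names are illustrative, and nothing here issues a PREF instruction.

```c
#include <stdint.h>
#include <stdbool.h>

#define SQ_AREA_BASE 0xE0000000u
#define SQ_AREA_MASK 0xFC000000u  /* selects H'E000 0000 to H'E3FF FFFF */

/* True if the address lies in the P4 store queue area. */
bool is_sq_area(uint32_t addr)
{
    return (addr & SQ_AREA_MASK) == SQ_AREA_BASE;
}

/* Round an address down to a 32-byte boundary, the granularity
 * at which an SQ transfer always starts. */
uint32_t sq_align(uint32_t addr)
{
    return addr & ~0x1Fu;
}
```

Because the transfer length is fixed at 32 bytes, software that fills a store queue would typically write eight longwords and then issue one PREF for that block.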
  • Page 204: Determination Of Sq Access Exception

    8.7.4 Determination of SQ Access Exception Determination of an exception in a write to an SQ or transfer to external memory (PREF instruction) is performed as follows according to whether the MMU is enabled or disabled. If an exception occurs during a write to an SQ, the SQ contents before the write are retained. If an exception occurs in a data transfer from an SQ to external memory, the transfer to external memory will be aborted.
  • Page 205: Notes On Using 32-Bit Address Extended Mode

    Notes on Using 32-Bit Address Extended Mode In 32-bit address extended mode, the items described in this section are extended as follows. 1. The tag bits [28:10] (19 bits) in the IC and OC are extended to bits [31:10] (22 bits). 2.
  • Page 207: Section 9 L Memory

    Section 9 L Memory The SH-4A includes on-chip L-memory which stores instructions or data. Note: For the size of L-memory, see the hardware manual of the target product. Features • Capacity Total L memory can be selected from 16 Kbytes, 32 Kbytes, 64 Kbytes, or 128 Kbytes. •...
  • Page 208: Register Descriptions

    Register Descriptions The following registers are related to L memory. Table 9.2 Register Configuration Area 7 Name Abbreviation P4 Address* Address* Access Size On-chip memory control RAMCR H'FF000074 H'1F000074 register L memory transfer source LSA0 H'FF000050 H'1F000050 address register 0 L memory transfer source LSA1 H'FF000054...
  • Page 209: On-Chip Memory Control Register (Ramcr)

    9.2.1 On-Chip Memory Control Register (RAMCR) RAMCR controls the protective functions in the L memory. Bit : Initial value : R/W: Bit : IC2W OC2W Initial value : R/W: Initial Bit Name Value Description 31 to 10 — All 0 Reserved For read/write in these bits, refer to General Precautions on Handling of Product.
  • Page 210: L Memory Transfer Source Address Register 0 (Lsa0)

    9.2.2 L Memory Transfer Source Address Register 0 (LSA0) When MMUCR.AT = 0 or RAMCR.RP = 0, the LSA0 specifies the transfer source physical address for block transfer to page 0 of the L memory. Bit : L0SADR Initial value : R/W: Bit : L0SADR...
  • Page 211: L Memory Transfer Source Address Register 1 (Lsa1)

    Initial Bit Name Value Description 5 to 0 L0SSZ Undefined R/W L Memory Page 0 Block Transfer Source Address Select When MMUCR.AT = 0 or RAMCR.RP = 0, these bits select whether the operand addresses or L0SADR values are used as bits 15 to 10 of the transfer source physical address for block transfer to the L memory.
  • Page 212 Initial Bit Name Value Description 31 to 29 — All 0 Reserved For read/write in these bits, refer to General Precautions on Handling of Product. 28 to 10 L1SADR Undefined R/W L Memory Page 1 Block Transfer Source Address When MMUCR.AT = 0 or RAMCR.RP = 0, these bits specify transfer source physical address for block transfer to page 1 in the L memory.
  • Page 213: L Memory Transfer Destination Address Register 0 (Lda0)

    9.2.4 L Memory Transfer Destination Address Register 0 (LDA0) When MMUCR.AT = 0 or RAMCR.RP = 0, LDA0 specifies the transfer destination physical address for block transfer to page 0 of the L memory. Bit : L0DADR Initial value : R/W: Bit : L0DADR...
  • Page 214 Initial Bit Name Value Description 5 to 0 L0DSZ Undefined R/W L Memory Page 0 Block Transfer Destination Address Select When MMUCR.AT = 0 or RAMCR.RP = 0, these bits select whether the operand addresses or L0DADR values are used as bits 15 to 10 of the transfer destination physical address for block transfer to page 0 in the L memory.
  • Page 215: L Memory Transfer Destination Address Register 1 (Lda1)

    9.2.5 L Memory Transfer Destination Address Register 1 (LDA1) When MMUCR.AT = 0 or RAMCR.RP = 0, LDA1 specifies the transfer destination physical address for block transfer to page 1 in the L memory. Bit : L1DADR Initial value : R/W: Bit : L1DADR...
  • Page 216 Initial Bit Name Value Description 5 to 0 L1DSZ Undefined R/W L Memory Page 1 Block Transfer Destination Address Select When MMUCR.AT = 0 or RAMCR.RP = 0, these bits select whether the operand addresses or L1DADR values are used as bits 15 to 10 of the transfer destination physical address for block transfer to page 1 in the L memory.
  • Page 217: Operation

    Operation 9.3.1 Access from the CPU and FPU L memory access from the CPU and FPU is direct via the instruction bus and operand bus by means of the virtual address. As long as there is no conflict on the page, the L memory is accessed in one cycle.
  • Page 218 When the PREF instruction is issued to the L memory area, address conversion is performed in order to generate the physical address bits [28:10] in accordance with the SZ bit specification. The physical address bits [9:5] are generated from the virtual address prior to address conversion. The physical address bits [4:0] are fixed to 0.
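The way the physical address is assembled for a PREF to the L memory area can be sketched in C. The SZ-dependent conversion itself is not reproduced here; its result is passed in as the hypothetical parameter `converted`, and only the bit composition described above is shown.

```c
#include <stdint.h>

/* Compose the physical address for a PREF to the L memory area.
 *   converted: result of the SZ-dependent address conversion,
 *              supplies physical address bits [28:10]
 *   vaddr:     virtual address before conversion,
 *              supplies physical address bits [9:5]
 * Bits [4:0] are fixed to 0. Bits [31:29] are left as 0 in this sketch. */
uint32_t lmem_pref_paddr(uint32_t converted, uint32_t vaddr)
{
    uint32_t upper = converted & 0x1FFFFC00u;  /* bits [28:10] */
    uint32_t mid   = vaddr     & 0x000003E0u;  /* bits [9:5]   */
    return upper | mid;
}
```

The result is always 32-byte aligned, since the low five bits are never taken from either input.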
  • Page 219: L Memory Protective Functions

    L Memory Protective Functions The SH-4A implements the following protective functions to the L memory by using the on-chip memory access mode bit (RMD) and the on-chip memory protection enable bit (RP) in the on-chip memory control register (RAMCR). • Protective functions for access from the CPU and FPU When RAMCR.RMD = 0, and the L memory is accessed in user mode, it is determined to be an address error exception.
  • Page 220: Usage Notes

    Usage Notes 9.5.1 Page Conflict In the event of simultaneous access to the same page from different buses, page conflict occurs. Although each access is completed correctly, this kind of conflict tends to lower L memory accessibility. Therefore it is advisable to provide all possible preventative software measures. For example, conflicts will not occur if each bus accesses different pages.
  • Page 221: Section 10 Instruction Descriptions

    Section 10 Instruction Descriptions This section describes instructions in alphabetical order using the format shown below. Instruction Name (Full Name): Instruction Type (Indication of delayed branch instruction or interrupt-disabling instruction) Cycle Instruction Code Operation T Bit Format Assembler input format; Number of The value of A brief description...
  • Page 222: Cpu Instruction

    10.1 CPU instruction Note: Of the SH-4A's CPU instructions, those that support the FPU or differ functionally from those of the SH4AL-DSP are described in section 10.2, CPU Instructions (FPU Related). The other instructions are described in section 10.1, CPU Instructions.
  • Page 223 struct SR0 { unsigned long dummy0:22; unsigned long M0:1; unsigned long Q0:1; unsigned long I0:4; unsigned long dummy1:2; unsigned long S0:1; unsigned long T0:1; }; /* SR structure definitions */ #define M ((*(struct SR0 *)(&SR)).M0) #define Q ((*(struct SR0 *)(&SR)).Q0) #define S ((*(struct SR0 *)(&SR)).S0) #define T ((*(struct SR0 *)(&SR)).T0) /* Definitions of bits in SR */ Error( char *er );...
  • Page 224: Add (Add Binary): Arithmetic Instruction

    10.1.1 ADD (Add binary): Arithmetic Instruction Format Operation Instruction Code Cycle T Bit Rn + Rm → Rn — ADD Rm,Rn 0011nnnnmmmm1100 Rn + imm → Rn — ADD #imm,Rn 0111nnnniiiiiiii Description: This instruction adds together the contents of general registers Rn and Rm and stores the result in Rn.
  • Page 225: Addc (Add With Carry): Arithmetic Instruction

    10.1.2 ADDC (Add with Carry): Arithmetic Instruction Format Operation Instruction Code Cycle T Bit Rn + Rm + T → Rn, Carry ADDC Rm,Rn 0011nnnnmmmm1110 carry → T Description: This instruction adds together the contents of general registers Rn and Rm and the T bit, and stores the result in Rn.
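The operation Rn + Rm + T → Rn with carry → T can be modelled on a host in a few lines of C. This is a sketch rather than the manual's own operation listing: the `addc_result` type and the `add64` helper are hypothetical, and `add64` assumes the common use of chaining two ADDC steps to add values wider than 32 bits.

```c
#include <stdint.h>

typedef struct {
    uint32_t value;  /* new Rn */
    unsigned t;      /* new T bit (carry out) */
} addc_result;       /* hypothetical type, not from the manual */

/* Rn + Rm + T -> Rn, carry -> T */
addc_result addc(uint32_t rn, uint32_t rm, unsigned t)
{
    uint64_t wide = (uint64_t)rn + rm + (t & 1u);
    addc_result r = { (uint32_t)wide, (unsigned)(wide >> 32) };
    return r;
}

/* 64-bit addition built from two ADDC steps, starting with T = 0. */
uint64_t add64(uint64_t a, uint64_t b)
{
    addc_result lo = addc((uint32_t)a, (uint32_t)b, 0);
    addc_result hi = addc((uint32_t)(a >> 32), (uint32_t)(b >> 32), lo.t);
    return ((uint64_t)hi.value << 32) | lo.value;
}
```

The carry produced by the lower-word addition is consumed as the T input of the upper-word addition, which is what makes the chained form work.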
  • Page 226: Addv (Add With (V Flag) Overflow Check): Arithmetic Instruction

    10.1.3 ADDV (Add with (V flag) Overflow Check): Arithmetic Instruction Format Operation Instruction Code Cycle T Bit Rn + Rm → Rn, Overflow ADDV Rm,Rn 0011nnnnmmmm1111 overflow → T Description: This instruction adds together the contents of general registers Rn and Rm and stores the result in Rn.
  • Page 227 Example: ;Before execution R0 = H'00000001, R1 = H'7FFFFFFE, T=0 ADDV R0,R1 ;After execution R1 = H'7FFFFFFF, T=0 ;Before execution R0 = H'00000002, R1 = H'7FFFFFFE, T=0 ADDV R0,R1 ;After execution R1 = H'80000000, T=1...
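The two ADDV examples above can be reproduced with a short host-side C model. This is an illustrative sketch, not the manual's operation listing; the `addv_result` type is hypothetical, and the overflow test uses the standard sign-bit identity for two's complement addition.

```c
#include <stdint.h>

typedef struct {
    uint32_t value;  /* new Rn */
    unsigned t;      /* new T bit (signed overflow) */
} addv_result;       /* hypothetical type, not from the manual */

/* Rn + Rm -> Rn, overflow -> T */
addv_result addv(uint32_t rn, uint32_t rm)
{
    uint32_t sum = rn + rm;
    /* Overflow occurs when both operands have the same sign
     * and the sign of the sum differs from it. */
    unsigned overflow = (~(rn ^ rm) & (rn ^ sum)) >> 31;
    addv_result r = { sum, overflow };
    return r;
}
```

With R0 = H'00000001 and R1 = H'7FFFFFFE the model returns H'7FFFFFFF with T = 0, and with R0 = H'00000002 it returns H'80000000 with T = 1, as in the examples.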
  • Page 228: And (And Logical): Logical Instruction

    10.1.4 AND (AND Logical): Logical Instruction Format Operation Instruction Code Cycle T Bit Rn & Rm → Rn — 0010nnnnmmmm1001 Rm,Rn R0 & imm → R0 — 11001001iiiiiiii #imm,R0 (R0 + GBR) & imm → — 11001101iiiiiiii AND.B #imm,@(R0,GBR) (R0 + GBR) Description: This instruction ANDs the contents of general registers Rn and Rm and stores the result in Rn.
  • Page 229 temp &= (0x000000FF & (long)i); Write_Byte(GBR+R[0],temp); PC += 2; Example: AND R0,R1 ;Before execution R0 = H'AAAAAAAA, R1 = H'55555555 ;After execution R1 = H'00000000 AND #H'0F,R0 ;Before execution R0 = H'FFFFFFFF ;After execution R0 = H'0000000F AND.B #H'80,@(R0,GBR) ;Before execution (R0,GBR) = H'A5 ;After execution (R0,GBR) = H'80 Possible Exceptions: Exceptions may occur when AND.B instruction is executed.
  • Page 230: Bf (Branch If False): Branch Instruction

    10.1.5 BF (Branch if False): Branch Instruction Format Operation Instruction Code Cycle T Bit If T = 0 — label 10001011dddddddd PC + 4 + disp × 2 → PC If T = 1, nop Description: This is a conditional branch instruction that references the T bit. The branch is taken if T = 0, and not taken if T = 1.
  • Page 231 Possible Exceptions: • Slot illegal instruction exception...
  • Page 232: Bf/S (Branch If False With Delay Slot): Branch Instruction

    10.1.6 BF/S (Branch if False with Delay Slot): Branch Instruction Format Operation Instruction Code Cycle T Bit If T = 0, — BF/S label 10001111dddddddd PC + 4 + disp × 2 → PC If T = 1, nop Description: This is a delayed conditional branch instruction that references the T bit. If T = 1, the next instruction is executed and the branch is not taken.
  • Page 233 Operation: BFS(int d) /* BFS disp */ int disp; unsigned int temp; temp = PC; if ((d&0x80)==0) disp = (0x000000FF & d); else disp = (0xFFFFFF00 | d); if (T==0) PC = PC + 4 + (disp<<1); else PC += 4; Delay_Slot(temp+2);...
  • Page 234: Bra (Branch): Branch Instruction

    10.1.7 BRA (Branch): Branch Instruction Format Operation Instruction Code Cycle T Bit PC + 4 + disp × 2 → PC — label 1010dddddddddddd Description: This is an unconditional branch instruction. The branch destination is address (PC + 4 + displacement × 2). The PC source value is the BRA instruction address. As the 12-bit displacement is multiplied by two after sign-extension, the branch destination can be located in the range from –4096 to +4094 bytes from the BRA instruction.
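The destination calculation PC + 4 + displacement × 2 can be sketched in C. The function names are illustrative; the instruction code layout (1010 followed by a 12-bit displacement) is taken from the format table above.

```c
#include <stdint.h>

/* Sign-extend the 12-bit displacement field of a BRA instruction code. */
int32_t bra_disp(uint16_t opcode)
{
    int32_t d = opcode & 0x0FFF;
    if (d & 0x0800)
        d -= 0x1000;
    return d;
}

/* Branch destination: PC + 4 + disp * 2, where pc is the
 * address of the BRA instruction itself. */
uint32_t bra_target(uint32_t pc, uint16_t opcode)
{
    return pc + 4u + (uint32_t)(bra_disp(opcode) * 2);
}
```

The extreme displacement values -2048 and +2047 contribute -4096 and +4094 bytes respectively, which is where the stated branch range comes from.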
  • Page 235 Possible Exceptions: • Slot illegal instruction exception...
  • Page 236: Braf (Branch Far): Branch Instruction (Delayed Branch Instruction)

    10.1.8 BRAF (Branch Far): Branch Instruction (Delayed Branch Instruction) Format Operation Instruction Code Cycle T Bit PC + 4 + Rn → PC — BRAF 0000nnnn00100011 Description: This is an unconditional branch instruction. The branch destination is address (PC + 4 + Rn).
  • Page 237: Bt (Branch If True): Branch Instruction

    10.1.9 BT (Branch if True): Branch Instruction Format Operation Instruction Code Cycle T Bit If T = 1 — label 10001001dddddddd PC + 4 + disp × 2 → PC If T = 0, nop Description: This is a conditional branch instruction that references the T bit. The branch is taken if T = 1, and not taken if T = 0.
  • Page 238 Possible Exceptions: • Slot illegal instruction exception...
  • Page 239: Bt/S (Branch If True With Delay Slot): Branch Instruction

    10.1.10 BT/S (Branch if True with Delay Slot): Branch Instruction Format Operation Instruction Code Cycle T Bit If T = 1, — BT/S label 10001101dddddddd PC + 4 + disp × 2 → PC If T = 0, nop Description: This is a conditional branch instruction that references the T bit. The branch is taken if T = 1, and not taken if T = 0.
  • Page 240 Example: ;Normally T = 1 SETT ;T = 1, so branch is not taken. BF/S TRGET_F ;T = 1, so branch to TRGET_T. BT/S TRGET_T ;Executed before branch. R0,R1 ;← BT/S instruction branch destination TRGET_T: Possible Exceptions: • Slot illegal instruction exception Rev.
  • Page 241: Clrmac (Clear Mac Register): System Control Instruction

    10.1.11 CLRMAC (Clear MAC Register): System Control Instruction Format Operation Instruction Code Cycle T Bit 0 → MACH, MACL — CLRMAC 0000000000101000 Description: This instruction clears the MACH and MACL registers. Notes: None Operation: CLRMAC( ) /* CLRMAC */ MACH = 0; MACL = 0;...
  • Page 242: Clrs (Clear S Bit): System Control Instruction

    10.1.12 CLRS (Clear S Bit): System Control Instruction Format Operation Instruction Code Cycle T Bit 0 → S — CLRS 0000000001001000 Description: This instruction clears the S bit to 0. Notes: None Operation: CLRS( ) /* CLRS */ S = 0; PC += 2;...
  • Page 243: Clrt (Clear T Bit): System Control Instruction

    10.1.13 CLRT (Clear T Bit): System Control Instruction Format Operation Instruction Code Cycle T Bit 0 → T CLRT 0000000000001000 Description: This instruction clears the T bit. Notes: None Operation: CLRT( ) /* CLRT */ T = 0; PC += 2; Example: ;Before execution T = 1 CLRT...
  • Page 244: Cmp/Cond (Compare Conditionally): Arithmetic Instruction

    10.1.14 CMP/cond (Compare Conditionally): Arithmetic Instruction Format Operation Instruction Code Cycle T Bit If Rn = Rm, 1 → T CMP/EQ Rm,Rn 0011nnnnmmmm0000 Result of comparison Otherwise, 0 → T If Rn ≥ Rm, signed, 1 → T CMP/GE Rm,Rn 0011nnnnmmmm0011 Result of comparison...
  • Page 245 Mnemonic Description CMP/EQ Rm,Rn If Rn = Rm, T = 1 If Rn ≥ Rm as signed values, T = 1 CMP/GE Rm,Rn CMP/GT Rm,Rn If Rn > Rm as signed values, T = 1 CMP/HI Rm,Rn If Rn > Rm as unsigned values, T = 1 If Rn ≥...
  • Page 246 CMPHI(long m, long n) /* CMP_HI Rm,Rn */ if ((unsigned long)R[n]>(unsigned long)R[m]) T = 1; else T = 0; PC += 2; CMPHS(long m, long n) /* CMP_HS Rm,Rn */ if ((unsigned long)R[n]>=(unsigned long)R[m]) T = 1; else T = 0; PC += 2;...
  • Page 247 temp=R[n]^R[m]; HH = (temp & 0xFF000000) >> 24; HL = (temp & 0x00FF0000) >> 16; LH = (temp & 0x0000FF00) >> 8; LL = temp & 0x000000FF; HH = HH && HL && LH && LL; if (HH==0) T = 1; else T = 0;...
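The byte-wise comparison performed by the listing above (XOR the two registers, then test whether any of the four result bytes is zero) can be written as a self-contained C function. The name `cmp_str` is illustrative, and the function returns the T bit instead of updating processor state.

```c
#include <stdint.h>

/* T = 1 if any of the four bytes of Rn equals the byte at the
 * same position in Rm; otherwise T = 0. */
unsigned cmp_str(uint32_t rn, uint32_t rm)
{
    uint32_t x = rn ^ rm;  /* a zero byte in x marks an equal byte pair */
    for (int i = 0; i < 4; i++) {
        if (((x >> (8 * i)) & 0xFFu) == 0)
            return 1;
    }
    return 0;
}
```

The loop form expresses the same condition as the `HH && HL && LH && LL` test in the listing: the logical AND of the four bytes is zero exactly when at least one byte is zero.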
  • Page 248: Div0S (Divide (Step 0) As Signed): Arithmetic Instruction

    10.1.15 DIV0S (Divide (Step 0) as Signed): Arithmetic Instruction Format Operation Instruction Code Cycle T Bit MSB of Rn → Q, DIV0S Rm,Rn 0010nnnnmmmm0111 Result of MSB of Rm → M, calculation M^Q → T Description: This instruction performs initial settings for signed division. This instruction is followed by a DIV1 instruction that executes 1-digit division, for example, and repeated divisions are executed to find the quotient.
  • Page 249: Div0U (Divide (Step 0) As Unsigned): Arithmetic Instruction

    10.1.16 DIV0U (Divide (Step 0) as Unsigned): Arithmetic Instruction Format Operation Instruction Code Cycle T Bit 0 → M/Q/T DIV0U 0000000000011001 Description: This instruction performs initial settings for unsigned division. This instruction is followed by a DIV1 instruction that executes 1-digit division, for example, and repeated divisions are executed to find the quotient.
  • Page 250: Div1 (Divide 1 Step): Arithmetic Instruction

    10.1.17 DIV1 (Divide 1 Step): Arithmetic Instruction Format Operation Instruction Code Cycle T Bit 1-step division DIV1 Rm,Rn 0011nnnnmmmm0100 Result of (Rn ÷ Rm) calculation Description: This instruction performs 1-digit division (1-step division) of the 32-bit contents of general register Rn (dividend) by the contents of Rm (divisor). The quotient is obtained by repeated execution of this instruction alone or in combination with other instructions.
  • Page 251 switch(old_q){ case 0:switch(M){ case 0:tmp0 = R[n]; R[n] -= tmp2; tmp1 = (R[n]>tmp0); switch(Q){ case 0:Q = tmp1; break; case 1:Q = (unsigned char)(tmp1==0); break; break; case 1:tmp0 = R[n]; R[n] += tmp2; tmp1 = (R[n]<tmp0); switch(Q){ case 0:Q = (unsigned char)(tmp1==0); break;...
  • Page 252 R[n] -= tmp2; tmp1 = (R[n]>tmp0); switch(Q){ case 0:Q = (unsigned char)(tmp1==0); break; case 1:Q = tmp1; break; break; break; T = (Q==M); PC += 2; Example 1: ;R1 (32 bits) ÷ R0 (16 bits) = R1 (16 bits); unsigned ;Set divisor in upper 16 bits, clear lower 16 bits to 0 SHLL16 ;Check for division by zero...
  • Page 253 Example 2: ; R1:R2 (64 bits) ÷ R0 (32 bits) = R2 (32 bits); unsigned ;Check for division by zero R0,R0 ZERO_DIV ;Check for overflow CMP/HS R0,R1 OVER_DIV ;Flag initialization DIV0U .arepeat ;Repeat 32 times ROTCL DIV1 R0,R1 .aendr ;R2 = quotient ROTCL Example 3: ;R1 (16 bits) ÷...
  • Page 254 Example 4: ;R2 (32 bits) ÷ R0 (32 bits) = R2 (32 bits); signed R2,R3 ROTCL ;Dividend sign-extended to 64 bits (R1:R2) SUBC R1,R1 ;R3 = 0 R3,R3 ;If dividend is negative, subtract 1 to convert to one's complement notation SUBC R3,R2 ;Flag initialization...
  • Page 255: Dmuls.l (Double-Length Multiply As Signed): Arithmetic Instruction

    10.1.18 DMULS.L (Double-length Multiply as Signed): Arithmetic Instruction Format Operation Instruction Code Cycle T Bit Signed, — DMULS.L Rm,Rn 0011nnnnmmmm1101 Rn × Rm →MAC Description: This instruction performs 32-bit multiplication of the contents of general register Rn by the contents of Rm, and stores the 64-bit result in the MACH and MACL registers. The multiplication is performed as a signed arithmetic operation.
  • Page 256 temp0 = RmL*RnL; temp1 = RmH*RnL; temp2 = RmL*RnH; temp3 = RmH*RnH; Res2 = 0; Res1 = temp1+temp2; if (Res1<temp1) Res2 += 0x00010000; temp1 = (Res1<<16)&0xFFFF0000; Res0 = temp0 + temp1; if (Res0<temp0) Res2++; Res2 = Res2 + ((Res1>>16)&0x0000FFFF) + temp3; if (fnLmL<0) { Res2 = ~Res2;...
  • Page 257: Dmulu.l (Double-Length Multiply As Unsigned): Arithmetic Instruction

    10.1.19 DMULU.L (Double-length Multiply as Unsigned): Arithmetic Instruction Format Operation Instruction Code Cycle T Bit Rm,Rn Unsigned, 0011nnnnmmmm0101 2 — DMULU.L Rn × Rm →MAC Description: This instruction performs 32-bit multiplication of the contents of general register Rn by the contents of Rm, and stores the 64-bit result in the MACH and MACL registers. The multiplication is performed as an unsigned arithmetic operation.
  • Page 258 Res2 = Res2 + ((Res1>>16)&0x0000FFFF) + temp3; MACH = Res2; MACL = Res0; PC += 2; Example: DMULU.L R0,R1 ;Before execution R0 = H'FFFFFFFE, R1 = H'00005555 ;After execution MACH = H'00005554, MACL = H'FFFF5556 STS MACH,R0 ;Get operation result (upper) STS MACL,R1 ;Get operation result (lower)...
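The manual's listing builds the 64-bit product from 16-bit partial products. On a host with a 64-bit integer type, the same result can be obtained directly, which makes a convenient cross-check of the example values. This is a sketch; the `mac_regs` type and the function name are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint32_t mach;  /* upper 32 bits of the product */
    uint32_t macl;  /* lower 32 bits of the product */
} mac_regs;         /* hypothetical type, not from the manual */

/* Unsigned Rn x Rm -> MACH:MACL */
mac_regs dmulu_l(uint32_t rn, uint32_t rm)
{
    uint64_t product = (uint64_t)rn * rm;
    mac_regs r = { (uint32_t)(product >> 32), (uint32_t)product };
    return r;
}
```

For R0 = H'FFFFFFFE and R1 = H'00005555 this gives MACH = H'00005554 and MACL = H'FFFF5556, in agreement with the example.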
  • Page 259: Dt (Decrement And Test): Arithmetic Instruction

    10.1.20 DT (Decrement and Test): Arithmetic Instruction Format Operation Instruction Code Cycle T Bit Rn – 1 → Rn; 0100nnnn00010000 Result of if Rn = 0, 1 → T comparison if Rn ≠ 0, 0 → T Description: This instruction decrements the contents of general register Rn by 1 and compares the result with zero.
  • Page 260: Exts (Extend As Signed): Arithmetic Instruction

    10.1.21 EXTS (Extend as Signed): Arithmetic Instruction Format Operation Instruction Code Cycle T Bit Rm sign-extended from — EXTS.B Rm,Rn 0110nnnnmmmm1110 byte → Rn Rm sign-extended from — EXTS.W Rm,Rn 0110nnnnmmmm1111 word → Rn Description: This instruction sign-extends the contents of general register Rm and stores the result in Rn. For a byte specification, the value of Rm bit 7 is transferred to Rn bits 8 to 31.
  • Page 261 Example: ;Before execution R0 = H'00000080 EXTS.B R0,R1 ;After execution R1 = H'FFFFFF80 ;Before execution R0 = H'00008000 EXTS.W R0,R1 ;After execution R1 = H'FFFF8000...
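The sign extension shown in the examples above can be modelled portably in C by testing the sign bit of the byte or word and filling the upper bits accordingly. The function names are illustrative, and the functions return the new Rn value instead of updating register state.

```c
#include <stdint.h>

/* EXTS.B: bit 7 of Rm is copied into bits 8 to 31 of the result. */
uint32_t exts_b(uint32_t rm)
{
    uint32_t v = rm & 0x000000FFu;
    return (v & 0x00000080u) ? (v | 0xFFFFFF00u) : v;
}

/* EXTS.W: bit 15 of Rm is copied into bits 16 to 31 of the result. */
uint32_t exts_w(uint32_t rm)
{
    uint32_t v = rm & 0x0000FFFFu;
    return (v & 0x00008000u) ? (v | 0xFFFF0000u) : v;
}
```

Using an explicit mask and OR avoids relying on implementation-defined conversions to narrow signed types.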
  • Page 262: Extu (Extend As Unsigned): Arithmetic Instruction

    10.1.22 EXTU (Extend as Unsigned): Arithmetic Instruction Format Operation Instruction Code Cycle T Bit Rm zero-extended from — EXTU.B Rm,Rn 0110nnnnmmmm1100 byte → Rn Rm zero-extended from — EXTU.W Rm,Rn 0110nnnnmmmm1101 word → Rn Description: This instruction zero-extends the contents of general register Rm and stores the result in Rn.
  • Page 263: Icbi (Instruction Cache Block Invalidate): Data Transfer Instruction

    10.1.23 ICBI (Instruction Cache Block Invalidate): Data Transfer Instruction Format Operation Instruction Code Cycle T Bit Invalidates the instruction — ICBI 0000nnnn11100011 cache block indicated by logical address Rn Description: This instruction accesses the instruction cache at the effective address indicated by the contents of Rn.
  • Page 264: Jmp (Jump): Branch Instruction

    10.1.24 JMP (Jump): Branch Instruction Format Operation Instruction Code Cycle T Bit Rn → PC — JMP @Rn 0100nnnn00101011 Description: Unconditionally makes a delayed branch to the address specified by Rn. Notes: As this is a delayed branch instruction, the instruction following this instruction is executed before the branch destination instruction.
  • Page 265: Ldc (Load To Control Register): System Control Instruction

    10.1.25 LDC (Load to Control Register): System Control Instruction Format Operation Instruction Code Cycle T Bit Rm → GBR — Rm, GBR 0100mmmm00011110 Rm → VBR — Rm, VBR 0100mmmm00101110 Rm → SGR — Rm, SGR 0100mmmm00111010 Rm → SSR —...
  • Page 266 Notes: With the exception of LDC Rm,GBR and LDC.L @Rm+,GBR, the LDC/LDC.L instructions are privileged instructions and can only be used in privileged mode. Use in user mode will cause an illegal instruction exception. However, LDC Rm,GBR and LDC.L @Rm+,GBR can also be used in user mode.
  • Page 267 LDCDBR(int m) /* LDC Rm,DBR : Privileged */ DBR = R[m]; PC += 2; LDCRn_BANK(int m) /* LDC Rm,Rn_BANK : Privileged */ /* n=0–7 */ Rn_BANK = R[m]; PC += 2; LDCMGBR(int m) /* LDC.L @Rm+,GBR */ GBR=Read_Long(R[m]); R[m] += 4; PC += 2;...
  • Page 268 LDCMSSR(int m) /* LDC.L @Rm+,SSR : Privileged */ SSR=Read_Long(R[m]); R[m] += 4; PC += 2; LDCMSPC(int m) /* LDC.L @Rm+,SPC : Privileged */ SPC = Read_Long(R[m]); R[m] += 4; PC += 2; LDCMDBR(int m) /* LDC.L @Rm+,DBR : Privileged */ DBR = Read_Long(R[m]);...
  • Page 269: LDS (Load To System Register): System Control Instruction

    10.1.26 LDS (Load to System Register): System Control Instruction Format Operation Instruction Code Cycle T Bit Rm → MACH — 0100mmmm00001010 Rm,MACH Rm → MACL — 0100mmmm00011010 Rm,MACL — Rm→ PR 0100mmmm00101010 Rm,PR (Rm) → MACH, Rm + 4 → Rm —...
  • Page 270 LDSMMACH(long m) /* LDS.L @Rm+,MACH */ MACH = Read_Long(R[m]); R[m] += 4; PC += 2; LDSMMACL(long m) /* LDS.L @Rm+,MACL */ MACL = Read_Long(R[m]); R[m] += 4; PC += 2; LDSMPR(long m) /* LDS.L @Rm+,PR */ PR = Read_Long(R[m]); R[m] += 4; PC += 2;...
  • Page 271: LDTLB (Load PTEH/PTEL To TLB): System Control Instruction

    10.1.27 LDTLB (Load PTEH/PTEL to TLB): System Control Instruction (Privileged Instruction) Format Operation Instruction Code Cycle T Bit PTEH/PTEL → TLB — LDTLB 0000000000111000 Description: This instruction loads the contents of the PTEH/PTEL registers into the TLB (translation lookaside buffer) specified by MMUCR.URC (random counter field in the MMU control register).
  • Page 272 Example: ;Load page table entry (upper) into R1 @R0,R1 ;Load R1 into PTEH; R2 is PTEH address (H'FF000000) R1,@R2 ;Load PTEH, PTEL registers into TLB LDTLB Possible Exceptions: • General illegal instruction exception • Slot illegal instruction exception
  • Page 273: MAC.L (Multiply And Accumulate Long): Arithmetic Instruction

    10.1.28 MAC.L (Multiply and Accumulate Long): Arithmetic Instruction Format Operation Instruction Code Cycle T Bit MAC.L @Rm+,@Rn+ Signed, — 0000nnnnmmmm1111 (Rn) × (Rm) + MAC → MAC Rn + 4 → Rn, Rm + 4 → Rm Description: This instruction performs signed multiplication of the 32-bit operands whose addresses are the contents of general registers Rm and Rn, adds the 64-bit result to the MAC register contents, and stores the result in the MAC register.
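The multiply-and-accumulate step described above can be sketched in C as follows. This is a hedged illustration of the non-saturating case only (S bit = 0 is assumed); the `mac_regs` type and `mac_l_step` function are hypothetical names, and the operands are passed as values already read from memory rather than through @Rn and @Rm.

```c
#include <stdint.h>

/* Hypothetical container for the MACH:MACL register pair. */
typedef struct {
    uint32_t mach;
    uint32_t macl;
} mac_regs;

/* One MAC.L step with S = 0: MAC += (signed)rn_val * (signed)rm_val.
 * rn_val and rm_val stand for the longwords read from @Rn and @Rm;
 * the post-increment of Rn and Rm by 4 is left to the caller. */
void mac_l_step(mac_regs *mac, int32_t rn_val, int32_t rm_val)
{
    uint64_t acc = ((uint64_t)mac->mach << 32) | mac->macl;
    int64_t product = (int64_t)rn_val * (int64_t)rm_val;

    acc += (uint64_t)product;            /* wraps modulo 2^64 */
    mac->mach = (uint32_t)(acc >> 32);
    mac->macl = (uint32_t)acc;
}
```

Starting from a cleared MAC register (as after CLRMAC), calling the function once per operand pair accumulates the sum of products in MACH:MACL.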
  • Page 274 temp1 = (unsigned long)tempn; temp2 = (unsigned long)tempm; RnL = temp1&0x0000FFFF; RnH = (temp1>>16) & 0x0000FFFF; RmL = temp2 & 0x0000FFFF; RmH = (temp2>>16) & 0x0000FFFF; temp0 = RmL*RnL; temp1 = RmH*RnL; temp2 = RmL*RnH; temp3 = RmH*RnH; Res2 = 0; Res1 = temp1 + temp2;...
  • Page 275 if(((long)Res2<0)&&(Res2 < 0xFFFF8000)){ Res2 = 0xFFFF8000; Res0 = 0x00000000; if(((long)Res2>0)&&(Res2 > 0x00007FFF)){ Res2 = 0x00007FFF; Res0 = 0xFFFFFFFF; MACH = (Res2 & 0x0000FFFF)|(MACH & 0xFFFF0000); MACL = Res0; else { Res0 = MACL + Res0; if (MACL>Res0) Res2++; Res2 += MACH; MACH = Res2;...
  • Page 276 Example: ;Get table address MOVA TBLM,R0 R0,R1 ;Get table address MOVA TBLN,R0 ;MAC register initialization CLRMAC MAC.L @R0+,@R1+ MAC.L @R0+,@R1+ ;Get result in R0 MACL,R0 ..align TBLM .data.l H'1234ABCD .data.l H'5678EF01 TBLN .data.l H'0123ABCD .data.l H'4567DEF0 Possible Exceptions: • Data TLB multiple-hit exception •...
  • Page 277: MAC.W (Multiply And Accumulate Word): Arithmetic Instruction

    10.1.29 MAC.W (Multiply and Accumulate Word): Arithmetic Instruction Format Operation Instruction Code Cycle T Bit — Signed, 0100nnnnmmmm1111 MAC.W @Rm+,@Rn+ (Rn) × (Rm) + MAC →MAC @Rm+,@Rn+ Rn + 2 → Rn, Rm + 2 → Rm Description: This instruction performs signed multiplication of the 16-bit operands whose addresses are the contents of general registers Rm and Rn, adds the 32-bit result to the MAC register contents, and stores the result in the MAC register.
  • Page 278 if ((long)tempm>=0) { src = 0; tempn = 0; else { src = 1; tempn = 0xFFFFFFFF; src += dest; MACL += tempm; if ((long)MACL>=0) ans = 0; else ans = 1; ans += dest; if (S==1) { if (ans==1) { if (src==0) MACL = 0x7FFFFFFF;...
  • Page 279 Example: ;Get table address MOVA TBLM,R0 R0,R1 ;Get table address MOVA TBLN,R0 ;MAC register initialization CLRMAC MAC.W @R0+,@R1+ MAC.W @R0+,@R1+ ;Get result in R0 MACL,R0 ... .align 2 TBLM .data.w H'1234 .data.w H'5678 TBLN .data.w H'0123 .data.w H'4567 Possible Exceptions: •...
  • Page 280: MOV (Move Data): Data Transfer Instruction

    10.1.30 MOV (Move data): Data Transfer Instruction Format Operation Instruction Code Cycle T Bit Rm → Rn 0110nnnnmmmm0011 1 — Rm,Rn Rm → (Rn) 0010nnnnmmmm0000 1 — MOV.B Rm,@Rn Rm → (Rn) 0010nnnnmmmm0001 1 — MOV.W Rm,@Rn Rm → (Rn) 0010nnnnmmmm0010 1 —...
  • Page 281 Operation: MOV(long m, long n) /* MOV Rm,Rn */ R[n] = R[m]; PC += 2; MOVBS(long m, long n) /* MOV.B Rm,@Rn */ Write_Byte(R[n],R[m]); PC += 2; MOVWS(long m, long n) /* MOV.W Rm,@Rn */ Write_Word(R[n],R[m]); PC += 2; MOVLS(long m, long n) /* MOV.L Rm,@Rn */ Write_Long(R[n],R[m]);...
  • Page 282 MOVWL(long m, long n) /* MOV.W @Rm,Rn */ R[n] = (long)Read_Word(R[m]); if ((R[n]&0x8000)==0) R[n] &= 0x0000FFFF; else R[n] |= 0xFFFF0000; PC += 2; MOVLL(long m, long n) /* MOV.L @Rm,Rn */ R[n] = Read_Long(R[m]); PC += 2; MOVBM(long m, long n) /* MOV.B Rm,@-Rn */ Write_Byte(R[n]-1,R[m]);...
  • Page 283 MOVBP(long m, long n) /* MOV.B @Rm+,Rn */ R[n] = (long)Read_Byte(R[m]); if ((R[n]&0x80)==0) R[n] &= 0x000000FF; else R[n] |= 0xFFFFFF00; if (n!=m) R[m] += 1; PC += 2; MOVWP(long m, long n) /* MOV.W @Rm+,Rn */ R[n] = (long)Read_Word(R[m]); if ((R[n]&0x8000)==0) R[n] &= 0x0000FFFF; else R[n] |= 0xFFFF0000;...
  • Page 284 MOVLS0(long m, long n) /* MOV.L Rm,@(R0,Rn) */ Write_Long(R[n]+R[0],R[m]); PC += 2; MOVBL0(long m, long n) /* MOV.B @(R0,Rm),Rn */ R[n] = (long)Read_Byte(R[m]+R[0]); if ((R[n]&0x80)==0) R[n] &= 0x000000FF; else R[n] |= 0xFFFFFF00; PC += 2; MOVWL0(long m, long n) /* MOV.W @(R0,Rm),Rn */ R[n] = (long)Read_Word(R[m]+R[0]);...
  • Page 285 Example: ;Before execution R0 = H'FFFFFFFF, R1 = H'00000000 R0,R1 ;After execution R1 = H'FFFFFFFF ;Before execution R0 = H'FFFF7F80 MOV.W R0,@R1 ;After execution (R1) = H'7F80 ;Before execution (R0) = H'80, R1 = H'00000000 MOV.B @R0,R1 ;After execution R1 = H'FFFFFF80 ;Before execution R0 = H'AAAAAAAA, (R1) = H'FFFF7F80 MOV.W R0,@-R1...
  • Page 286: MOV (Move Constant Value): Data Transfer Instruction

    1101nnnndddddddd MOV.L @(disp*,PC),Rn H'FFFFFFFC + 4) → Rn Note: * The assembler of Renesas Technology uses the value after scaling (×1, ×2, or ×4) as the displacement (disp). Description: This instruction stores immediate data, sign-extended to longword, in general register Rn. In the case of word or longword data, the data is stored from memory address (PC + 4 + displacement ×...
  • Page 287 Operation: MOVI(int i, int n) /* MOV #imm,Rn */ if ((i&0x80)==0) R[n] = (0x000000FF & i); else R[n] = (0xFFFFFF00 | i); PC += 2; MOVWI(d, n) /* MOV.W @(disp,PC),Rn */ unsigned int disp; disp = (unsigned int)(0x000000FF & d); R[n] = (int)Read_Word(PC+4+(disp<<1));...
  • Page 288 H'12345678 101C .data.l H'9ABCDEF0 Note: * The assembler of Renesas Technology uses the value after scaling (×1, ×2, or ×4) as the displacement (disp). Possible Exceptions: Exceptions may occur when PC-relative load instruction is executed. • Data TLB multiple-hit exception •...
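The immediate sign extension and the PC-relative source addresses used by this instruction can be sketched in C as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the parameter `d` is taken to be the raw 8-bit displacement field of the instruction code (before the ×2 or ×4 scaling), as in the operation listing.

```c
#include <stdint.h>

/* MOV #imm,Rn: sign-extend the 8-bit immediate to a longword. */
uint32_t mov_imm(uint32_t imm8)
{
    imm8 &= 0xFFu;
    return (imm8 & 0x80u) ? (0xFFFFFF00u | imm8) : imm8;
}

/* MOV.W @(disp,PC),Rn: source address = PC + 4 + disp * 2. */
uint32_t movw_pc_address(uint32_t pc, uint32_t d)
{
    return pc + 4u + ((d & 0xFFu) << 1);
}

/* MOV.L @(disp,PC),Rn: source address = (PC & H'FFFFFFFC) + 4 + disp * 4. */
uint32_t movl_pc_address(uint32_t pc, uint32_t d)
{
    return (pc & 0xFFFFFFFCu) + 4u + ((d & 0xFFu) << 2);
}
```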
  • Page 289: MOV (Move Global Data): Data Transfer Instruction

    MOV.L R0,@(disp*,GBR) R0 → (disp × 4 + GBR) — 11000010dddddddd Note: * The assembler of Renesas Technology uses the value after scaling (×1, ×2, or ×4) as the displacement (disp). Description: This instruction transfers the source operand to the destination. Byte, word, or longword can be specified as the data size, but the register is always R0.
  • Page 290 MOVWLG(int d) /* MOV.W @(disp,GBR),R0 */ unsigned int disp; disp = (unsigned int)(0x000000FF & d); R[0] = (int)Read_Word(GBR+(disp<<1)); if ((R[0]&0x8000)==0) R[0] &= 0x0000FFFF; else R[0] |= 0xFFFF0000; PC += 2; MOVLLG(int d) /* MOV.L @(disp,GBR),R0 */ unsigned int disp; disp = (unsigned int)(0x000000FF & d); R[0] = Read_Long(GBR+(disp<<2));...
  • Page 291 R0 = H'12345670 MOV.B R0,@(1*,GBR) ;Before execution R0 = H'FFFF7F80 ;After execution (GBR+1) = H'80 Note: * The assembler of Renesas Technology uses the value after scaling (×1, ×2, or ×4) as the displacement (disp). Possible Exceptions: • Data TLB multiple-hit exception •...
  • Page 292: MOV (Move Structure Data): Data Transfer Instruction

    MOV.L @(disp*,Rm),Rn (disp × 4 + Rm) → Rn — 0101nnnnmmmmdddd Note: * The assembler of Renesas Technology uses the value after scaling (×1, ×2, or ×4) as the displacement (disp). Description: This instruction transfers the source operand to the destination. It is ideal for accessing data inside a structure or stack.
  • Page 293 Operation: MOVBS4(long d, long n) /* MOV.B R0,@(disp,Rn) */ long disp; disp = (0x0000000F & (long)d); Write_Byte(R[n]+disp,R[0]); PC += 2; MOVWS4(long d, long n) /* MOV.W R0,@(disp,Rn) */ long disp; disp = (0x0000000F & (long)d); Write_Word(R[n]+(disp<<1),R[0]); PC += 2; MOVLS4(long m, long d, long n) /* MOV.L Rm,@(disp,Rn) */ long disp;...
  • Page 294 ;Before execution R0 = H'FFFF7F80 MOV.L R0,@(H'F,R1) ;After execution (R1+60) = H'FFFF7F80 Note: * The assembler of Renesas Technology uses the value after scaling (×1, ×2, or ×4) as the displacement (disp). Possible Exceptions: • Data TLB multiple-hit exception • Slot illegal instruction exception •...
  • Page 295: MOVA (Move Effective Address): Data Transfer Instruction

    100A R4,R5 .align 100C STR:.sdata "XYZP12" Note: * The assembler of Renesas Technology uses the value after scaling (×1, ×2, or ×4) as the displacement (disp). Possible Exceptions: • Slot illegal instruction exception
  • Page 296: MOVCA.L (Move With Cache Block Allocation): Data Transfer Instruction

    10.1.35 MOVCA.L (Move with Cache Block Allocation): Data Transfer Instruction Format Operation Instruction Code Cycle T Bit R0 → (Rn) — MOVCA.L R0,@Rn 0000nnnn11000011 (without fetching cache block) Description: This instruction stores the contents of general register R0 in the memory location indicated by effective address Rn.
  • Page 297: MOVCO (Move Conditional): Data Transfer Instruction

    10.1.36 MOVCO (Move Conditional): Data Transfer Instruction Format Operation Instruction Code Cycle T Bit LDST → T LDST MOVCO.L R0,@Rn 0000nnnn01110011 if (T==1) R0 → (Rn) 0 → LDST Description: MOVCO is used in combination with MOVLI to realize an atomic read-modify-write operation in a single processor.
  • Page 298 Example: ; Atomic incrementation Retry: MOVLI.L @Rn,R0 ADD #1,R0 MOVCO.L R0,@Rn BF Retry ; Reexecute if an interrupt or other exception occurs between the MOVLI and MOVCO instructions Possible Exceptions: • Data TLB multiple-hit exception • Data TLB miss exception • Data TLB protection violation exception •...
  • Page 299: MOVLI (Move Linked): Data Transfer Instruction

    10.1.37 MOVLI (Move Linked): Data Transfer Instruction Format Operation Instruction Code Cycle T Bit 1 → LDST — MOVLI.L @Rm,R0 0000mmmm01100011 (Rm) → R0 If an interrupt or exception has occurred 0 → LDST Description: MOVLI is used in combination with MOVCO to realize an atomic read-modify-write operation in a single processor.
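For readers more familiar with C, the MOVLI/MOVCO retry pattern has a rough analog in the C11 compare-and-exchange loop sketched below. This is an analogy only, not the instruction semantics: a compare-and-exchange retries when the stored value has changed, whereas MOVCO fails whenever LDST has been cleared by an interrupt or exception. The function name `atomic_increment` is hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Atomically increment *counter and return the new value. */
uint32_t atomic_increment(_Atomic uint32_t *counter)
{
    /* Initial read: plays the role of MOVLI.L @Rn,R0. */
    uint32_t seen = atomic_load(counter);

    /* Conditional store: plays the role of MOVCO.L R0,@Rn.
     * On failure, seen is refreshed and the loop retries. */
    while (!atomic_compare_exchange_weak(counter, &seen, seen + 1u)) {
        /* retry */
    }
    return seen + 1u;
}
```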
  • Page 300: MOVT (Move T Bit): Data Transfer Instruction

    10.1.38 MOVT (Move T Bit): Data Transfer Instruction Format Operation Instruction Code Cycle T Bit T → Rn — MOVT 0000nnnn00101001 Description: This instruction stores the T bit in general register Rn. When T = 1, Rn = 1; when T = 0, Rn = 0.
  • Page 301: MOVUA (Move Unaligned): Data Transfer Instruction

    10.1.39 MOVUA (Move Unaligned): Data Transfer Instruction Format Operation Instruction Code Cycle T Bit (Rm) → R0 — MOVUA.L @Rm,R0 0100mmmm10101001 Load non-boundary-aligned data (Rm) → R0, Rm + 4 → Rm — MOVUA.L @Rm+,R0 0100mmmm11101001 Load non-boundary-aligned data Description: This instruction loads the longword of data from the effective address indicated by the contents of Rm in memory to R0.
  • Page 302 Example: ;Before execution MOVUA.L @R1,R0 R1=H'00001001, R0=H'00000000 ;After execution R0=(H'00001001) ;Before execution MOVUA.L @R1+,R0 R1=H'00001007, R0=H'00000000 ;After execution R1=H'0000100B, R0=(H'00001007) ; Special case in which the source operand is @R0 ;Before execution MOVUA.L @R0,R0 R0=H'00001001 ;After execution R0=(H'00001001) ;Before execution MOVUA.L @R0+,R0 R0=H'00001001...
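In portable C, a longword load from an address that is not on a 4-byte boundary is usually written with `memcpy`, which a compiler targeting this CPU may or may not translate to MOVUA.L. The sketch below is an illustration under that assumption; the function name `load_unaligned_long` is hypothetical, and the byte order of the result follows the endianness of the machine it runs on.

```c
#include <stdint.h>
#include <string.h>

/* Load a 32-bit value from an address with no alignment requirement. */
uint32_t load_unaligned_long(const unsigned char *addr)
{
    uint32_t value;

    memcpy(&value, addr, sizeof value);   /* byte-wise copy, alignment-safe */
    return value;
}
```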
  • Page 303: MUL.L (Multiply Long): Arithmetic Instruction

    10.1.40 MUL.L (Multiply Long): Arithmetic Instruction Format Operation Instruction Code Cycle T Bit Rn × Rm → MACL — MUL.L Rm,Rn 0000nnnnmmmm0111 Description: This instruction performs 32-bit multiplication of the contents of general registers Rn and Rm, and stores the lower 32 bits of the result in the MACL register. The contents of MACH are not changed.
  • Page 304: MULS.W (Multiply As Signed Word): Arithmetic Instruction

    10.1.41 MULS.W (Multiply as Signed Word): Arithmetic Instruction Format Operation Instruction Code Cycle T Bit Signed, Rn × Rm → MACL 0010nnnnmmmm1111 — MULS.W Rm,Rn Description: This instruction performs 16-bit multiplication of the contents of general registers Rn and Rm, and stores the 32-bit result in the MACL register. The multiplication is performed as a signed arithmetic operation.
  • Page 305: MULU.W (Multiply As Unsigned Word): Arithmetic Instruction

    10.1.42 MULU.W (Multiply as Unsigned Word): Arithmetic Instruction Format Operation Instruction Code Cycle T Bit Unsigned, Rn × Rm → — MULU.W Rm,Rn 0010nnnnmmmm1110 MACL Description: This instruction performs 16-bit multiplication of the contents of general registers Rn and Rm, and stores the 32-bit result in the MACL register. The multiplication is performed as an unsigned arithmetic operation.
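The difference between the signed (MULS.W) and unsigned (MULU.W) 16-bit multiplications can be seen in the following C sketch. The function names are hypothetical; each function returns the 32-bit value that would be stored in MACL, and only the lower 16 bits of each register are used as operands.

```c
#include <stdint.h>

/* Interpret the low 16 bits of a register as a signed value. */
static int32_t low_word_signed(uint32_t r)
{
    int32_t v = (int32_t)(r & 0xFFFFu);

    return (v & 0x8000) ? v - 0x10000 : v;
}

/* MULS.W Rm,Rn: signed 16 x 16 -> 32-bit result for MACL. */
uint32_t muls_w(uint32_t rn, uint32_t rm)
{
    return (uint32_t)(low_word_signed(rn) * low_word_signed(rm));
}

/* MULU.W Rm,Rn: unsigned 16 x 16 -> 32-bit result for MACL. */
uint32_t mulu_w(uint32_t rn, uint32_t rm)
{
    return (rn & 0xFFFFu) * (rm & 0xFFFFu);
}
```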
  • Page 306: NEG (Negate): Arithmetic Instruction

    10.1.43 NEG (Negate): Arithmetic Instruction Format Operation Instruction Code Cycle T Bit 0 - Rm → Rn — Rm,Rn 0110nnnnmmmm1011 Description: This instruction finds the two's complement of the contents of general register Rm and stores the result in Rn. That is, it subtracts Rm from 0 and stores the result in Rn. Notes: None Operation: NEG(long m, long n) /* NEG Rm,Rn */...
  • Page 307: NEGC (Negate With Carry): Arithmetic Instruction

    10.1.44 NEGC (Negate with Carry): Arithmetic Instruction Format Operation Instruction Code Cycle T Bit 0 – Rm – T → Rn, Borrow NEGC Rm,Rn 0110nnnnmmmm1010 borrow → T Description: This instruction subtracts the contents of general register Rm and the T bit from 0 and stores the result in Rn.
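A typical use of the borrow produced by this instruction is sign inversion of a value wider than 32 bits. The C sketch below models one NEGC step and chains two of them to negate a 64-bit value; the type and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint32_t rn;   /* result written to Rn */
    unsigned t;    /* borrow written to the T bit */
} negc_result;

/* NEGC Rm,Rn: 0 - Rm - T -> Rn, borrow -> T. */
negc_result negc(uint32_t rm, unsigned t)
{
    negc_result r;

    t &= 1u;
    r.rn = 0u - rm - t;
    r.t = (rm != 0u || t != 0u) ? 1u : 0u;   /* borrow unless 0 - 0 - 0 */
    return r;
}

/* Negate a 64-bit value held as two longwords: lower half first, T = 0. */
void negate64(uint32_t *hi, uint32_t *lo)
{
    negc_result low = negc(*lo, 0u);
    negc_result high = negc(*hi, low.t);

    *lo = low.rn;
    *hi = high.rn;
}
```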
  • Page 308: NOP (No Operation): System Control Instruction

    10.1.45 NOP (No Operation): System Control Instruction Format Operation Instruction Code Cycle T Bit No operation — 0000000000001001 Description: This instruction simply increments the program counter (PC), advancing the processing flow to execution of the next instruction. Notes: None Operation: NOP( ) /* NOP */ PC += 2;...
  • Page 309: NOT (Not-Logical Complement): Logical Instruction

    10.1.46 NOT (Not-logical Complement): Logical Instruction Format Operation Instruction Code Cycle T Bit ∼Rm → Rn — Rm,Rn 0110nnnnmmmm0111 Description: This instruction finds the one's complement of the contents of general register Rm and stores the result in Rn. That is, it inverts the Rm bits and stores the result in Rn. Notes: None Operation: NOT(long m, long n) /* NOT Rm,Rn */...
  • Page 310: OCBI (Operand Cache Block Invalidate): Data Transfer Instruction

    10.1.47 OCBI (Operand Cache Block Invalidate): Data Transfer Instruction Format Operation Instruction Code Cycle T Bit Operand cache block — OCBI 0000nnnn10010011 invalidation Description: This instruction accesses data using the contents indicated by effective address Rn. In the case of a hit in the cache, the corresponding cache block is invalidated (the V bit is cleared to 0).
  • Page 311: OCBP (Operand Cache Block Purge): Data Transfer Instruction

    10.1.48 OCBP (Operand Cache Block Purge): Data Transfer Instruction Format Operation Instruction Code Cycle T Bit Writes back and invalidates — OCBP @Rn 0000nnnn10100011 operand cache block Description: This instruction accesses data using the contents indicated by effective address Rn. If the cache is hit and there is unwritten information (U bit = 1), the corresponding cache block is written back to external memory and that block is invalidated (the V bit is cleared to 0).
  • Page 312: OCBWB (Operand Cache Block Write Back): Data Transfer Instruction

    10.1.49 OCBWB (Operand Cache Block Write Back): Data Transfer Instruction Format Operation Instruction Code Cycle T Bit Writes back operand cache — OCBWB 0000nnnn10110011 block Description: This instruction accesses data using the contents indicated by effective address Rn. If the cache is hit and there is unwritten information (U bit = 1), the corresponding cache block is written back to external memory and that block is cleaned (the U bit is cleared to 0).
  • Page 313: OR (OR Logical): Logical Instruction

    10.1.50 OR (OR Logical): Logical Instruction Format Operation Instruction Code Cycle T Bit Rn | Rm → Rn — Rm,Rn 0010nnnnmmmm1011 R0 | imm → R0 — #imm,R0 11001011iiiiiiii (R0 + GBR) | imm — OR.B #imm,@(R0,GBR) 11001111iiiiiiii → (R0 + GBR) Description: This instruction ORs the contents of general registers Rn and Rm and stores the result in Rn.
  • Page 314 Operation: OR(long m, long n) /* OR Rm,Rn */ R[n] |= R[m]; PC += 2; ORI(long i) /* OR #imm,R0 */ R[0] |= (0x000000FF & (long)i); PC += 2; ORM(long i) /* OR.B #imm,@(R0,GBR) */ long temp; temp = (long)Read_Byte(GBR+R[0]); temp |= (0x000000FF &...
  • Page 315 Possible Exceptions: Exceptions may occur when the OR.B instruction is executed. • Data TLB multiple-hit exception • Data TLB miss exception • Data TLB protection violation exception • Initial page write exception • Data address error Exceptions are checked by treating the data access of this instruction as a byte load and a byte store.
  • Page 316: PREF (Prefetch Data To Cache): Data Transfer Instruction

    10.1.51 PREF (Prefetch Data to Cache): Data Transfer Instruction Format Operation Instruction Code Cycle T Bit (Rn) → operand cache — PREF 0000nnnn10000011 Description: This instruction reads a 32-byte data block starting at a 32-byte boundary into the operand cache. The lower 5 bits of the address specified by Rn are masked to zero. This instruction does not generate data address errors or MMU exceptions, with the exception of the data TLB multiple-hit exception.
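The address masking mentioned above can be illustrated with a one-line C sketch. The function name `pref_block_base` is hypothetical; it only computes the 32-byte-aligned start address of the block that would be prefetched and does not model the cache itself.

```c
#include <stdint.h>

/* Clear the lower 5 bits of the address in Rn to obtain the start
 * of the 32-byte block that contains it. */
uint32_t pref_block_base(uint32_t rn)
{
    return rn & 0xFFFFFFE0u;
}
```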
  • Page 317: PREFI (Prefetch Instruction Cache Block): Data Transfer Instruction

    10.1.52 PREFI (Prefetch Instruction Cache Block): Data Transfer Instruction Format Operation Instruction Code Cycle T Bit Reads 32-byte instruction block — PREFI 0000nnnn11010011 into instruction cache Description: This instruction reads a 32-byte block of data starting at a 32-byte boundary into the instruction cache.
  • Page 318: ROTCL (Rotate With Carry Left): Shift Instruction

    10.1.53 ROTCL (Rotate with Carry Left): Shift Instruction Format Operation Instruction Code Cycle T Bit T ← Rn ← T ROTCL 0100nnnn00100100 Description: This instruction rotates the contents of general register Rn one bit to the left through the T bit, and stores the result in Rn. The bit rotated out of the operand is transferred to the T bit. ROTCL Notes: None Operation:...
  • Page 319: ROTCR (Rotate With Carry Right): Shift Instruction

    10.1.54 ROTCR (Rotate with Carry Right): Shift Instruction Format Operation Instruction Code Cycle T Bit T → Rn → T ROTCR 0100nnnn00100101 Description: This instruction rotates the contents of general register Rn one bit to the right through the T bit, and stores the result in Rn. The bit rotated out of the operand is transferred to the T bit.
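The two rotate-through-carry instructions treat Rn and the T bit as a 33-bit ring. A minimal C sketch of both directions follows; the result type and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint32_t rn;   /* rotated register value */
    unsigned t;    /* new T bit */
} rotc_result;

/* ROTCL Rn: the old T enters bit 0, the old bit 31 becomes T. */
rotc_result rotcl(uint32_t rn, unsigned t)
{
    rotc_result r;

    r.t = (unsigned)(rn >> 31);
    r.rn = (rn << 1) | (t & 1u);
    return r;
}

/* ROTCR Rn: the old T enters bit 31, the old bit 0 becomes T. */
rotc_result rotcr(uint32_t rn, unsigned t)
{
    rotc_result r;

    r.t = (unsigned)(rn & 1u);
    r.rn = (rn >> 1) | ((uint32_t)(t & 1u) << 31);
    return r;
}
```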
  • Page 320: ROTL (Rotate Left): Shift Instruction

    10.1.55 ROTL (Rotate Left): Shift Instruction Format Operation Instruction Code Cycle T Bit T ← Rn ← MSB 0100nnnn00000100 1 ROTL Description: This instruction rotates the contents of general register Rn one bit to the left, and stores the result in Rn. The bit rotated out of the operand is transferred to the T bit. ROTL Notes: None Operation:...
  • Page 321: ROTR (Rotate Right): Shift Instruction

    10.1.56 ROTR (Rotate Right): Shift Instruction Format Operation Instruction Code Cycle T Bit LSB → Rn → T ROTR 0100nnnn00000101 Description: This instruction rotates the contents of general register Rn one bit to the right, and stores the result in Rn. The bit rotated out of the operand is transferred to the T bit. ROTR Notes: None Operation:...
  • Page 322: RTE (Return From Exception): System Control Instruction

    10.1.57 RTE (Return from Exception): System Control Instruction Format Operation Instruction Code Cycle T Bit SSR → SR, SPC→ PC — 0000000000101011 Description: This instruction returns from an exception or interrupt handling routine by restoring the PC and SR values from SPC and SSR. Program execution continues from the address specified by the restored PC value.
  • Page 323 Note: In a delayed branch, the actual branch operation occurs after execution of the slot instruction, but instruction execution (register updating, etc.) is in fact performed in delayed branch instruction → delay slot instruction order. For example, even if the register holding the branch destination address is modified in the delay slot, the branch destination address will still be the register contents prior to the modification.
  • Page 324: RTS (Return From Subroutine): Branch Instruction

    10.1.58 RTS (Return from Subroutine): Branch Instruction Format Operation Instruction Code Cycle T Bit PR → PC — 0000000000001011 Description: This instruction returns from a subroutine procedure by restoring the PC from PR. Processing continues from the address indicated by the restored PC value. This instruction can be used to return from a subroutine procedure called by a BSR or JSR instruction to the source of the call.
  • Page 325 Example: ;R3 = TRGET address MOV.L TABLE,R3 ; Branch to TRGET. ;NOP executed before branch. ;← Subroutine procedure return destination (PR contents) R0,R1 ..;Jump table TABLE: .data.l TRGET ..;← Entry to procedure TRGET: R1,R0 ;PR contents → PC ;MOV executed before branch.
  • Page 326: SETS (Set S Bit): System Control Instruction

    10.1.59 SETS (Set S Bit): System Control Instruction Format Operation Instruction Code Cycle T Bit 1 → S — SETS 0000000001011000 Description: This instruction sets the S bit to 1. Notes: None Operation: SETS( ) /* SETS */ S = 1; PC += 2;...
  • Page 327: SETT (Set T Bit): System Control Instruction

    10.1.60 SETT (Set T Bit): System Control Instruction Format Operation Instruction Code Cycle T Bit 1 → T SETT 0000000000011000 Description: This instruction sets the T bit to 1. Notes: None Operation: SETT( ) /* SETT */ T = 1; PC += 2;...
  • Page 328: SHAD (Shift Arithmetic Dynamically): Shift Instruction

    10.1.61 SHAD (Shift Arithmetic Dynamically): Shift Instruction Format Operation Instruction Code Cycle T Bit When Rm ≥ 0, — SHAD Rm, Rn 0100nnnnmmmm1100 Rn << Rm → Rn When Rm < 0, Rn >> Rm → [MSB → Rn] Description: This instruction arithmetically shifts the contents of general register Rn. General register Rm specifies the shift direction and the number of bits to be shifted.
  • Page 329 Operation: SHAD(int m,n) /*SHAD Rm,Rn */ int sgn = R[m] & 0x80000000; if (sgn==0) R[n] <<= (R[m] & 0x1F); else if ((R[m] & 0x1F) == 0) { if ((R[n] & 0x80000000) == 0) R[n] = 0; else R[n] = 0xFFFFFFFF; else R[n] = (long)R[n] >>...
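The operation listing above relies on a signed right shift, whose behavior for negative values is implementation-defined in C. The following sketch expresses the same SHAD behavior with unsigned arithmetic only; the function name `shad` and its value-passing interface are hypothetical.

```c
#include <stdint.h>

/* SHAD Rm,Rn: returns the new Rn for the given Rn and Rm values. */
uint32_t shad(uint32_t rn, uint32_t rm)
{
    unsigned amount = (unsigned)(rm & 0x1Fu);
    uint32_t sign_fill = (rn & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    unsigned right;

    if ((rm & 0x80000000u) == 0u)
        return rn << amount;             /* Rm >= 0: shift left */
    if (amount == 0u)
        return sign_fill;                /* right shift by 32: all sign bits */

    right = 32u - amount;                /* equals (~Rm & 0x1F) + 1 */
    return (rn >> right) | (sign_fill << (32u - right));
}
```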
  • Page 330: SHAL (Shift Arithmetic Left): Shift Instruction

    10.1.62 SHAL (Shift Arithmetic Left): Shift Instruction Format Operation Instruction Code Cycle T Bit T ← Rn ← 0 SHAL 0100nnnn00100000 Description: This instruction arithmetically shifts the contents of general register Rn one bit to the left, and stores the result in Rn. The bit shifted out of the operand is transferred to the T bit. SHAL Notes: None Operation:...
  • Page 331: SHAR (Shift Arithmetic Right): Shift Instruction

    10.1.63 SHAR (Shift Arithmetic Right): Shift Instruction Format Operation Instruction Code Cycle T Bit MSB → Rn → T SHAR 0100nnnn00100001 Description: This instruction arithmetically shifts the contents of general register Rn one bit to the right, and stores the result in Rn. The bit shifted out of the operand is transferred to the T bit. SHAR Notes: None Operation:...
  • Page 332: SHLD (Shift Logical Dynamically): Shift Instruction

    10.1.64 SHLD (Shift Logical Dynamically): Shift Instruction Format Operation Instruction Code Cycle T Bit When Rm ≥ 0, — SHLD Rm, Rn 0100nnnnmmmm1101 Rn << Rm → Rn When Rm < 0, Rn >> Rm → [0 → Rn] Description: This instruction logically shifts the contents of general register Rn. General register Rm specifies the shift direction and the number of bits to be shifted.
  • Page 333 Operation: SHLD(int m,n)/*SHLD Rm,Rn */ int sgn = R[m] & 0x80000000; if (sgn == 0) R[n] <<= (R[m] & 0x1F); else if ((R[m] & 0x1F) == 0) R[n] = 0; else R[n] = (unsigned)R[n] >> ((~R[m] & 0x1F)+1); PC += 2; Example: ;Before execution R1 = H'FFFFFFEC, R2 = H'80180000 SHLD...
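As a self-contained counterpart to the operation listing above, the following C sketch computes the SHLD result from register values. The function name `shld` is hypothetical; the shift amount for the right-shift case is written as 32 minus the lower 5 bits of Rm, which equals the (~Rm & 0x1F) + 1 expression used in the listing.

```c
#include <stdint.h>

/* SHLD Rm,Rn: returns the new Rn for the given Rn and Rm values. */
uint32_t shld(uint32_t rn, uint32_t rm)
{
    unsigned amount = (unsigned)(rm & 0x1Fu);

    if ((rm & 0x80000000u) == 0u)
        return rn << amount;             /* Rm >= 0: shift left */
    if (amount == 0u)
        return 0u;                       /* right shift by 32: all zeros */
    return rn >> (32u - amount);         /* Rm < 0: logical shift right */
}
```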
  • Page 334: SHLL (Shift Logical Left): Shift Instruction

    10.1.65 SHLL (Shift Logical Left): Shift Instruction Format Operation Instruction Code Cycle T Bit T ← Rn ← 0 SHLL 0100nnnn00000000 Description: This instruction logically shifts the contents of general register Rn one bit to the left, and stores the result in Rn. The bit shifted out of the operand is transferred to the T bit. SHLL Notes: None Operation:...
  • Page 335: SHLLn (N Bits Shift Logical Left): Shift Instruction

    10.1.66 SHLLn (n bits Shift Logical Left): Shift Instruction Format Operation Instruction Code Cycle T Bit Rn<<2 → Rn — SHLL2 0100nnnn00001000 Rn<<8 → Rn — SHLL8 0100nnnn00011000 Rn<<16 → Rn — SHLL16 0100nnnn00101000 Description: This instruction logically shifts the contents of general register Rn 2, 8, or 16 bits to the left, and stores the result in Rn.
  • Page 336 Operation: SHLL2(long n) /* SHLL2 Rn */ R[n] <<= 2; PC += 2; SHLL8(long n) /* SHLL8 Rn */ R[n] <<= 8; PC += 2; SHLL16(long n) /* SHLL16 Rn */ R[n] <<= 16; PC += 2; Example: ;Before execution R0 = H'12345678 SHLL2 ;After execution R0 = H'48D159E0...
  • Page 337: SHLR (Shift Logical Right): Shift Instruction

    10.1.67 SHLR (Shift Logical Right): Shift Instruction Format Operation Instruction Code Cycle T Bit 0 → Rn → T SHLR 0100nnnn00000001 Description: This instruction logically shifts the contents of general register Rn one bit to the right, and stores the result in Rn. The bit shifted out of the operand is transferred to the T bit. SHLR Notes: None Operation:...
  • Page 338: SHLRn (N Bits Shift Logical Right): Shift Instruction

    10.1.68 SHLRn (n bits Shift Logical Right): Shift Instruction Format Operation Instruction Code Cycle T Bit Rn>>2 → Rn — SHLR2 0100nnnn00001001 Rn>>8 → Rn — SHLR8 0100nnnn00011001 Rn>>16 → Rn — SHLR16 0100nnnn00101001 Description: This instruction logically shifts the contents of general register Rn 2, 8, or 16 bits to the right, and stores the result in Rn.
  • Page 339 Operation: SHLR2(long n) /* SHLR2 Rn */ R[n] >>= 2; R[n] &= 0x3FFFFFFF; PC += 2; SHLR8(long n) /* SHLR8 Rn */ R[n] >>= 8; R[n] &= 0x00FFFFFF; PC += 2; SHLR16(long n) /* SHLR16 Rn */ R[n] >>= 16; R[n] &= 0x0000FFFF;...
  • Page 340: SLEEP (Sleep): System Control Instruction (Privileged Instruction)

    10.1.69 SLEEP (Sleep): System Control Instruction (Privileged Instruction) Format Operation Instruction Code Cycle T Bit Sleep or standby Undefined — SLEEP 0000000000011011 Description: This instruction places the CPU in the power-down state. In power-down mode, the CPU retains its internal state, but immediately stops executing instructions and waits for an interrupt request.
  • Page 341: STC (Store Control Register): System Control Instruction (Privileged Instruction)

    10.1.70 STC (Store Control Register): System Control Instruction (Privileged Instruction) Format Operation Instruction Code Cycle T Bit GBR → Rn — GBR, Rn 0000nnnn00010010 VBR → Rn — VBR, Rn 0000nnnn00100010 SSR → Rn — SSR, Rn 0000nnnn00110010 SPC → Rn —...
  • Page 342 Notes: With the exception of STC GBR,Rn and STC.L GBR,@-Rn, the STC/STC.L instructions are privileged instructions and can only be used in privileged mode. Use of the privileged forms in user mode will cause an illegal instruction exception. Operation: STCGBR(int n) /* STC GBR,Rn */ R[n] = GBR; PC += 2; STCVBR(int n) /* STC VBR,Rn : Privileged */ R[n] = VBR;...
  • Page 343 STCRm_BANK(int n) /* STC Rm_BANK,Rn : Privileged */ /* m=0–7 */ R[n] = Rm_BANK; PC += 2; STCMGBR(int n) /* STC.L GBR,@-Rn */ R[n] -= 4; Write_Long(R[n],GBR); PC += 2; STCMVBR(int n) /* STC.L VBR,@-Rn : Privileged */ R[n] -= 4; Write_Long(R[n],VBR);...
  • Page 344 STCMSGR(int n) /* STC.L SGR,@-Rn : Privileged */ R[n] -= 4; Write_Long(R[n],SGR); PC += 2; STCMDBR(int n) /* STC.L DBR,@-Rn : Privileged */ R[n] -= 4; Write_Long(R[n],DBR); PC += 2; STCMRm_BANK(int n) /* STC.L Rm_BANK,@-Rn : Privileged */ /* m=0–7 */ R[n] -= 4;...
  • Page 345: STS (Store System Register): System Control Instruction

    10.1.71 STS (Store System Register): System Control Instruction Format Operation Instruction Code Cycle T Bit MACH → Rn — MACH,Rn 0000nnnn00001010 MACL → Rn — MACL,Rn 0000nnnn00011010 PR → Rn — PR,Rn 0000nnnn00101010 Rn - 4 → Rn, MACH → (Rn) —...
  • Page 346 STSMMACH(int n) /* STS.L MACH,@-Rn */ R[n] -= 4; Write_Long(R[n],MACH); PC += 2; STSMMACL(int n) /* STS.L MACL,@-Rn */ R[n] -= 4; Write_Long(R[n],MACL); PC += 2; STSMPR(int n) /* STS.L PR,@-Rn */ R[n] -= 4; Write_Long(R[n],PR); PC += 2; Example: ;...
  • Page 347: SUB (Subtract Binary): Arithmetic Instruction

    10.1.72 SUB (Subtract Binary): Arithmetic Instruction Format Operation Instruction Code Cycle T Bit Rn - Rm → Rn — Rm,Rn 0011nnnnmmmm1000 Description: This instruction subtracts the contents of general register Rm from the contents of general register Rn and stores the result in Rn. For immediate data subtraction, ADD #imm,Rn should be used.
  • Page 348: SUBC (Subtract With Carry): Arithmetic Instruction

    10.1.73 SUBC (Subtract with Carry): Arithmetic Instruction Format Operation Instruction Code Cycle T Bit Rn - Rm-T → Rn, borrow Borrow SUBC Rm,Rn 0011nnnnmmmm1010 → T Description: This instruction subtracts the contents of general register Rm and the T bit from the contents of general register Rn, and stores the result in Rn.
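The borrow left in the T bit lets two SUBC steps be chained into a subtraction wider than 32 bits. The C sketch below models one SUBC step; the type and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint32_t rn;   /* result written to Rn */
    unsigned t;    /* borrow written to the T bit */
} subc_result;

/* SUBC Rm,Rn: Rn - Rm - T -> Rn, borrow -> T. */
subc_result subc(uint32_t rn, uint32_t rm, unsigned t)
{
    subc_result r;

    t &= 1u;
    r.rn = rn - rm - t;
    /* A borrow occurs when the amount subtracted exceeds Rn. */
    r.t = ((uint64_t)rm + t > (uint64_t)rn) ? 1u : 0u;
    return r;
}
```

For a 64-bit subtraction, the lower longwords are processed first with T = 0, and the resulting T is passed to the step for the upper longwords.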
  • Page 349: SUBV (Subtract With (V Flag) Underflow Check): Arithmetic Instruction

    10.1.74 SUBV (Subtract with (V flag) Underflow Check): Arithmetic Instruction Format Operation Instruction Code Cycle T Bit Rn - Rm → Rn, underflow SUBV Rm,Rn 0011nnnnmmmm1011 Underflow → T Description: This instruction subtracts the contents of general register Rm from the contents of general register Rn, and stores the result in Rn.
  • Page 350 Example: ;Before execution R0 = H'00000002, R1 = H'80000001 SUBV R0,R1 ;After execution R1 = H'7FFFFFFF, T = 1 ;Before execution R2 = H'FFFFFFFE, R3 = H'7FFFFFFE SUBV R2,R3 ;After execution R3 = H'80000000, T = 1
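The two examples above can be reproduced with the C sketch below, which detects signed overflow of the subtraction from the sign bits of the operands and the result. This is one common formulation and not necessarily the one used in the manual's operation listing; the type and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint32_t rn;   /* result written to Rn */
    unsigned t;    /* 1 if the signed subtraction overflowed or underflowed */
} subv_result;

/* SUBV Rm,Rn: Rn - Rm -> Rn, underflow -> T. */
subv_result subv(uint32_t rn, uint32_t rm)
{
    subv_result r;

    r.rn = rn - rm;
    /* The signed result is out of range only when the operands differ in
     * sign and the sign of the result differs from the sign of Rn. */
    r.t = (unsigned)((((rn ^ rm) & (rn ^ r.rn)) >> 31) & 1u);
    return r;
}
```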
  • Page 351: SWAP (Swap Register Halves): Data Transfer Instruction

    10.1.75 SWAP (Swap Register Halves): Data Transfer Instruction Format Operation Instruction Code Cycle T Bit Rm → lower-2-byte upper/ — SWAP.B Rm,Rn 0110nnnnmmmm1000 lower-byte swap → Rn Rm → upper-/lower-word SWAP.W Rm,Rn 0110nnnnmmmm1001 swap → Rn Description: This instruction swaps the upper and lower parts of the contents of general register Rm, and stores the result in Rn.
  • Page 352 temp = (R[m]>>16)&0x0000FFFF; R[n] = R[m]<<16; R[n] |= temp; PC += 2; Example: ;Before execution R0 = H'12345678 SWAP.B R0,R1 ;After execution R1 = H'12347856 ;Before execution R0 = H'12345678 SWAP.W R0,R1 ;After execution R1 = H'56781234
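The two SWAP forms can be written as pure functions in C, as in the sketch below. The function names `swap_b` and `swap_w` are hypothetical; each returns the value that would be stored in Rn.

```c
#include <stdint.h>

/* SWAP.B Rm,Rn: exchange the two bytes of the lower word;
 * the upper word is copied unchanged. */
uint32_t swap_b(uint32_t rm)
{
    return (rm & 0xFFFF0000u) | ((rm & 0x000000FFu) << 8) | ((rm >> 8) & 0x000000FFu);
}

/* SWAP.W Rm,Rn: exchange the upper and lower words. */
uint32_t swap_w(uint32_t rm)
{
    return (rm << 16) | (rm >> 16);
}
```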
  • Page 353: SYNCO (Synchronize Data Operation): Data Transfer Instruction

    10.1.76 SYNCO (Synchronize Data Operation): Data Transfer Instruction Format Operation Instruction Code Cycle T Bit Data accesses invoked by the — SYNCO 0000000010101011 Undefined following instruction are not executed until execution of data accesses which precede this instruction has been completed.
  • Page 354: TAS (Test And Set): Logical Instruction

    10.1.77 TAS (Test And Set): Logical Instruction Format Operation Instruction Code Cycle T Bit If (Rn) = 0, 1 → T, else 0 → T Test result TAS.B 0100nnnn00011011 1 → MSB of (Rn) Description: This instruction purges the cache block corresponding to the memory area specified by the contents of general register Rn, reads the byte data indicated by that address, and sets the T bit to 1 if that data is zero, or clears the T bit to 0 if the data is nonzero.
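The test-and-set behavior on the byte itself can be sketched in C as follows. This models only the data transformation (the T bit result and the setting of bit 7); it does not model the cache purge or any bus-level behavior of the real instruction, and the function name `tas_b` is hypothetical.

```c
#include <stdint.h>

/* TAS.B @Rn: returns the new T bit and sets bit 7 of the byte. */
unsigned tas_b(uint8_t *byte)
{
    unsigned t = (*byte == 0u) ? 1u : 0u;   /* T = 1 if the byte was zero */

    *byte |= 0x80u;                         /* 1 -> MSB of (Rn) */
    return t;
}
```

When the byte is used as a lock flag, T = 1 after the first call indicates that the flag was free, and T = 0 on a later call indicates that it had already been taken.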
  • Page 355 Possible Exceptions: • Data TLB multiple-hit exception • Data TLB miss exception • Data TLB protection violation exception • Initial page write exception • Data address error Exceptions are checked by treating the data access made by this instruction as both a byte load and a byte store.
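The test-and-set behavior described for TAS.B (test the byte for zero, then set its most significant bit) is typically used to build a simple lock. The following C sketch models only the data effect; it assumes nothing about the cache purge or bus locking, and the names `tas_b` and `try_lock` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of TAS.B @Rn. Returns the T bit: 1 if the byte was
 * zero before the operation, otherwise 0. Bit 7 of the byte is always set.
 * Plain C like this is NOT atomic; the real instruction performs the
 * read-modify-write as one indivisible operation. */
static int tas_b(volatile uint8_t *addr)
{
    uint8_t old = *addr;
    *addr = (uint8_t)(old | 0x80u);
    return old == 0;
}

/* Hypothetical usage: a lock byte of 0 means "free"; the caller that
 * observes T = 1 has acquired the lock. */
static int try_lock(volatile uint8_t *lock)
{
    return tas_b(lock);
}
```

A second call to `try_lock` on the same byte returns 0 until the owner writes 0 back to release it.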
  • Page 356: Trapa (Trap Always): System Control Instruction

    10.1.78 TRAPA (Trap Always): System Control Instruction Format Operation Instruction Code Cycle T Bit Imm<<2 → TRA, PC + 2 → — TRAPA #imm 11000011iiiiiiii SPC, SR → SSR, R15 → SGR, 1 → SR.MD/BL/RB, H'160 → EXPEVT, VBR + H'00000100 → PC Description: This instruction starts trap exception handling.
  • Page 357: Tst (Test Logical): Logical Instruction

    10.1.79 TST (Test Logical): Logical Instruction Format Operation Instruction Code Cycle T Bit Rn & Rm; if result is 0, Test result Rm,Rn 0010nnnnmmmm1000 1 → T, else 0 → T R0 & imm; if result is 0, Test result #imm,R0 11001000iiiiiiii 1 →...
  • Page 358 TSTM(long i) /* TST.B #imm,@(R0,GBR) */ long temp; temp = (long)Read_Byte(GBR+R[0]); temp &= (0x000000FF & (long)i); if (temp==0) T = 1; else T = 0; PC += 2; Example: ;Before execution R0 = H'00000000 TST R0,R0 ;After execution T = 1 ;Before execution R0 = H'FFFFFF7F TST #H'80,R0 ;After execution...
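The register and immediate forms of TST follow the same rule as the TST.B listing above: AND the operands and set T when the result is zero, without storing the result. A minimal C sketch, with the hypothetical names `tst` and `tst_imm`:

```c
#include <stdint.h>

/* Hypothetical model of TST Rm,Rn: returns the T bit. The operands are
 * ANDed and neither register is modified. */
static int tst(uint32_t rm, uint32_t rn)
{
    return (rn & rm) == 0;
}

/* Hypothetical model of TST #imm,R0: the 8-bit immediate is
 * zero-extended before the AND. */
static int tst_imm(uint8_t imm, uint32_t r0)
{
    return (r0 & (uint32_t)imm) == 0;
}
```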
  • Page 359: Xor (Exclusive Or Logical): Logical Instruction

    10.1.80 XOR (Exclusive OR Logical): Logical Instruction Format / Operation / Instruction Code / T Bit: XOR Rm,Rn: Rn ^ Rm → Rn; 0010nnnnmmmm1010; —. XOR #imm,R0: R0 ^ imm → R0; 11001010iiiiiiii; —. XOR.B #imm,@(R0,GBR): (R0 + GBR) ^ imm → (R0 + GBR); 11001110iiiiiiii; —. Description: This instruction exclusively ORs the contents of general registers Rn and Rm, and stores the result in Rn.
  • Page 360 Example: ;Before execution R0 = H'AAAAAAAA, R1 = H'55555555 XOR R0,R1 ;After execution R1 = H'FFFFFFFF ;Before execution R0 = H'FFFFFFFF XOR #H'F0,R0 ;After execution R0 = H'FFFFFF0F XOR.B #H'A5,@(R0,GBR) ;Before execution (R0,GBR) = H'A5 ;After execution (R0,GBR) = H'00 Possible Exceptions: Exceptions may occur when the XOR.B instruction is executed. •...
  • Page 361: Xtrct (Extract): Data Transfer Instruction

    10.1.81 XTRCT (Extract): Data Transfer Instruction Format Operation Instruction Code Cycle T Bit Middle 32 bits of Rm:Rn → Rn 0010nnnnmmmm1101 — XTRCT Rm,Rn Description: This instruction extracts the middle 32 bits from the 64-bit contents of linked general registers Rm and Rn, and stores the result in Rn. Notes: None Operation: XTRCT(long m, long n)
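The "middle 32 bits" of the 64-bit value Rm:Rn are the lower 16 bits of Rm followed by the upper 16 bits of Rn. The following C sketch illustrates that selection; the function name `xtrct` is an assumption for this example and the body is not copied from the manual's operation listing.

```c
#include <stdint.h>

/* Hypothetical model of XTRCT Rm,Rn: returns the new value of Rn.
 * Rm supplies the upper half of the result (its low 16 bits) and
 * Rn supplies the lower half (its high 16 bits). */
static uint32_t xtrct(uint32_t rm, uint32_t rn)
{
    return (rm << 16) | (rn >> 16);
}
```

For example, with Rm = H'01234567 and Rn = H'89ABCDEF the function returns H'456789AB.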
  • Page 362: Cpu Instructions (Fpu Related)

    10.2 CPU Instructions (FPU related) Of the SH-4A CPU's instructions, those which support the FPU and those which differ in function from instructions of the SH3A-DSP are described in this section. 10.2.1 BSR (Branch to Subroutine): Branch Instruction (Delayed Branch Instruction) Format Operation Instruction Code...
  • Page 363 Example: BSR TRGET ;Branch to TRGET. MOV R3,R4 ;MOV executed before branch. ADD R0,R1 ;Subroutine procedure return destination (contents of PR) ... TRGET: ;← Entry to procedure MOV R2,R3 RTS ;Return to above ADD instruction. MOV #1,R0 ;MOV executed before branch. Possible Exceptions: • Slot illegal instruction exception
  • Page 364: Bsrf (Branch To Subroutine Far): Branch Instruction (Delayed Branch

    10.2.2 BSRF (Branch to Subroutine Far): Branch Instruction (Delayed Branch Instruction) Format Operation Instruction Code Cycle T Bit PC+4 → PR, — BSRF 0000nnnn00000011 PC+4+Rn → PC Description: This instruction branches to address (PC + 4 + Rn), and stores address (PC + 4) in PR.
  • Page 365 Example: MOV.L #(TRGET-BSRF_PC),R0 ;Set displacement. BSRF R0 ;Branch to TRGET. MOV R3,R4 ;MOV executed before branch. BSRF_PC: ADD R0,R1 ... TRGET: ;← Entry to procedure MOV R2,R3 RTS ;Return to above ADD instruction. MOV #1,R0 ;MOV executed before branch. Possible Exceptions: • Slot illegal instruction exception
  • Page 366: Jsr (Jump To Subroutine): Branch Instruction (Delayed Branch Instruction)

    10.2.3 JSR (Jump to Subroutine): Branch Instruction (Delayed Branch Instruction) Format Operation Instruction Code Cycle T Bit PC+4 → PR, Rn → PC — JSR @Rn 0100nnnn00001011 Description: This instruction makes a delayed branch to the subroutine procedure at the specified address after execution of the following instruction.
  • Page 367 Example: ;R0 = TRGET address MOV.L JSR_TABLE,R0 ;Branch to TRGET. ;XOR executed before branch. R1,R1 ;← Procedure return destination (PR contents) R0,R1 ..align ;Jump table JSR_TABLE: .data.l TRGET ;← Entry to procedure TRGET: R2,R3 ;Return to above ADD instruction. ;MOV executed before RTS.
  • Page 368: Ldc (Load To Control Register): System Control Instruction (Privileged

    10.2.4 LDC (Load to Control Register): System Control Instruction (Privileged Instruction) Format Operation Instruction Code Cycle T Bit Rm → SR LDC Rm,SR 0100mmmm00001110 (Rm) → SR, Rm+4 → Rm LDC.L @Rm+,SR 0100mmmm00000111 Description: This instruction stores the source operand in the control register SR. Notes: This instruction is only usable in privileged mode.
  • Page 369: Lds (Load To Fpu System Register): System Control Instruction

    10.2.5 LDS (Load to FPU System register): System Control Instruction Format Operation Instruction Code Cycle T Bit Rm → FPUL — Rm,FPUL 0100mmmm01011010 (Rm) → FPUL, Rm + 4 — LDS.L @Rm+,FPUL 0100mmmm01010110 → Rm Rm → FPSCR — Rm,FPSCR 0100mmmm01101010 (Rm) →...
  • Page 370 LDSMFPSCR(int m) /* LDS.L @Rm+,FPSCR */ FPSCR = Read_Long(R[m]) & FPSCR_MASK; R[m] += 4; PC += 2; Possible Exceptions: • Data TLB multiple-hit exception • Data TLB miss exception • Data TLB protection violation exception • Data address error
  • Page 371: Stc (Store Control Register): System Control Instruction (Privileged Instruction)

    10.2.6 STC (Store Control Register): System Control Instruction (Privileged Instruction) Format Operation Instruction Code Cycle T Bit SR → Rn — STC SR,Rn 0000nnnn00000010 Rn - 4 →Rn, SR → (Rn) — STC.L SR,@-Rn 0100nnnn00000011 Description: This instruction stores the control register SR in the destination. Notes: STC can only be used in privileged mode.
  • Page 372: Sts (Store From Fpu System Register): System Control Instruction

    10.2.7 STS (Store from FPU System Register): System Control Instruction Format Operation Instruction Code Cycle T Bit FPUL → Rn — 0000nnnn01011010 FPUL,Rn FPSCR → Rn — 0000nnnn01101010 FPSCR,Rn Rn-4 → Rn, FPUL → (Rn) — 0100nnnn01010010 STS.L FPUL,@-Rn Rn-4 → Rn, FPSCR → (Rn) —...
  • Page 373 Examples: • STS Example 1: MOV.L #H'12ABCDEF, R12 R12, FPUL FPUL, R13 ; After executing the STS instruction: ; R13 = 12ABCDEF Example 2: FPSCR, R2 ; After executing the STS instruction: ; The current content of FPSCR is stored in register R2 •...
  • Page 374: Fpu Instruction

    10.3 FPU Instruction The following resources and functions are for use in C-language descriptions of the operation of FPU instructions and supplement the resources and functions used in describing the operation of CPU instructions. These are floating-point number definition statements. #define PZERO #define NZERO #define DENORM...
  • Page 375 #define FPSCR_FR FPSCR>>21&1 #define FPSCR_PR FPSCR>>19&1 #define FPSCR_DN FPSCR>>18&1 #define FPSCR_I FPSCR>>12&1 #define FPSCR_RM FPSCR&1 #define FR_HEX frf.l[ FPSCR_FR] #define FR frf.f[ FPSCR_FR] #define DR_HEX frf.l[ FPSCR_FR] #define DR frf.d[ FPSCR_FR] #define XF_HEX frf.l[~FPSCR_FR] #define XF frf.f[~FPSCR_FR] #define XD frf.d[~FPSCR_FR] union { l[2][16];...
  • Page 376 else if(abs < 0x7f800000) return(NORM); else if(abs == 0x7f800000) { if(sign_of(n) == 0) return(PINF); else return(NINF); else if(abs < 0x7fc00000) return(qNaN); else return(sNaN); /* Double-precision */ else { if(abs < 0x00100000){ if((FPSCR_DN == 1) || ((abs == 0x00000000) && (FR_HEX[n+1] == 0x00000000)){ if(sign_of(n) == 0) {zero(n, 0);...
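The single-precision thresholds used in the classification code above can be collected into one standalone C function. This is a sketch under stated assumptions: the enum and function names are hypothetical, the input is the raw 32-bit pattern rather than a register index, and the `dn` argument stands in for FPSCR.DN. Note that, following the thresholds in the listing, patterns below H'7FC00000 (above infinity) are classed as qNaN and the rest as sNaN.

```c
#include <stdint.h>

/* Hypothetical class codes for this sketch only. */
enum sh_fp_class {
    SH_PZERO, SH_NZERO, SH_DENORM, SH_NORM,
    SH_PINF, SH_NINF, SH_QNAN, SH_SNAN
};

/* Classifies a single-precision bit pattern using the same thresholds as
 * the listing: 0x00800000, 0x7f800000 and 0x7fc00000. */
static enum sh_fp_class classify_single(uint32_t bits, int dn)
{
    uint32_t mag = bits & 0x7FFFFFFFu;   /* magnitude without sign bit */
    int negative = (int)(bits >> 31);

    if (mag < 0x00800000u) {
        /* Zero, or a denormalized number flushed to zero when DN = 1. */
        if (dn == 1 || mag == 0)
            return negative ? SH_NZERO : SH_PZERO;
        return SH_DENORM;
    }
    if (mag < 0x7F800000u)
        return SH_NORM;
    if (mag == 0x7F800000u)
        return negative ? SH_NINF : SH_PINF;
    if (mag < 0x7FC00000u)
        return SH_QNAN;
    return SH_SNAN;
}
```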
  • Page 377 union { float f; int l; dstf,srcf; union { long d; int l[2]; dstd,srcd; union { /* “long double” format: */ long double x; 1-bit sign int l[4]; 15-bit exponent dstx; 112-bit mantissa if(FPSCR_PR == 0) { if(type == FADD) srcf.f = FR[m];...
  • Page 378 if(type == FADD) srcd.d = DR[m>>1]; else srcd.d = -DR[m>>1]; dstx.x = DR[n>>1]; /* Conversion from double-precision to extended double-precision */ dstx.x += srcd.d; if(((dstx.x == DR[n>>1]) && (srcd.d != 0.0)) || ((dstx.x == srcd.d) && (DR[n>>1] != 0.0)) ) { set_I();...
  • Page 379 union { long double x; int l[4]; tmpx; if(FPSCR_PR == 0) { tmpd.d = FR[n]; /* Single-precision to double-precision */ tmpd.d *= FR[m]; /* Precise creation */ tmpf.f *= FR[m]; /* Round to nearest */ if(tmpf.f != tmpd.d) set_I(); if((tmpf.f > tmpd.d) && (FPSCR_RM == 1)) { tmpf.l -= 1;...
  • Page 380 float dstf; if((data_type_of(m) == sNaN) || (data_type_of(n) == sNaN) || (data_type_of(m+1) == sNaN) || (data_type_of(n+1) == sNaN) || (data_type_of(m+2) == sNaN) || (data_type_of(n+2) == sNaN) || (data_type_of(m+3) == sNaN) || (data_type_of(n+3) == sNaN) || (check_product_invalid(m,n)) || (check_product_invalid(m+1,n+1)) || (check_product_invalid(m+2,n+2)) || (check_product_invalid(m+3,n+3)) ) invalid(n+3);...
  • Page 381 check_single_exception(&FR[n+3],dstf); void check_single_exception(float *dst,result) union { float f; int l; tmp; float abs; if(result < 0.0) tmp.l = 0xff800000; /* – infinity */ else tmp.l = 0x7f800000; /* + infinity */ if(result == tmp.f) { set_O(); set_I(); if(FPSCR_RM == 1) tmp.l -= 1;...
  • Page 382 void check_double_exception(double *dst,result) union { double d; int l[2]; tmp; double abs; if(result < 0.0) tmp.l[0] = 0xfff00000; /* – infinity */ else tmp.l[0] = 0x7ff00000; /* + infinity */ tmp.l[1] = 0x00000000; if(result == tmp.d) set_O(); set_I(); if(FPSCR_RM == 1) { tmp.l[0] -= 1;...
  • Page 383 int check_product_invalid(int m,n) return(check_product_infinity(m,n) && ((data_type_of(m) == PZERO) || (data_type_of(n) == PZERO) || (data_type_of(m) == NZERO) || (data_type_of(n) == NZERO))); int check_product_infinity(int m,n) return((data_type_of(m) == PINF) || (data_type_of(n) == PINF) || (data_type_of(m) == NINF) || (data_type_of(n) == NINF)); int check_positive_infinity(int m,n) return(((check_product_infinity(m,n) &&...
  • Page 384 void invalid(int n) set_V(); if((FPSCR & ENABLE_V) == 0) qnan(n); else fpu_exception_trap(); void dz(int n,sign) set_Z(); if((FPSCR & ENABLE_Z) == 0) inf(n,sign); else fpu_exception_trap(); void zero(int n,sign) if(sign == 0) FR_HEX [n] = 0x00000000; else FR_HEX [n] = 0x80000000; if (FPSCR_PR==1) FR_HEX [n+1] = 0x00000000; void inf(int n,sign) { if (FPSCR_PR==0) { if(sign == 0)
  • Page 385: Fabs (Floating-Point Absolute Value): Floating-Point Instruction

    10.3.1 FABS (Floating-point Absolute Value): Floating-Point Instruction Format Operation Instruction Code Cycle T Bit FRn & H'7FFFFFFF → FRn 1111nnnn01011101 1 — FABS 1111nnn001011101 1 — FABS DRn & H'7FFFFFFFFFFFFFFF → DRn Description: This instruction clears the most significant bit of the contents of floating-point register FRn/DRn to 0, and stores the result in FRn/DRn.
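Because FABS only clears the sign bit, it can be modelled as a mask on the raw bit pattern. The following C sketch shows the single-precision case; the helper names are hypothetical, and it assumes that the C `float` type is a 32-bit IEEE 754 value.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the raw bit pattern of a float.
 * Assumes float is a 32-bit IEEE 754 single-precision value. */
static uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Hypothetical model of FABS FRn: FRn & H'7FFFFFFF → FRn. */
static float fabs_single(float frn)
{
    uint32_t bits = float_bits(frn) & 0x7FFFFFFFu; /* clear sign bit */
    memcpy(&frn, &bits, sizeof frn);
    return frn;
}
```

The double-precision form applies the same idea with the 64-bit mask H'7FFFFFFFFFFFFFFF.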
  • Page 386: Fadd (Floating-Point Add): Floating-Point Instruction

    10.3.2 FADD (Floating-point ADD): Floating-Point Instruction Format Operation Instruction Code Cycle T Bit FRn+FRm → FRn 1111nnnnmmmm0000 1 — FADD FRm,FRn DRn+DRm → DRn 1111nnn0mmm00000 1 — FADD DRm,DRn Description: When FPSCR.PR = 0: Arithmetically adds the two single-precision floating-point numbers in FRn and FRm, and stores the result in FRn.
  • Page 387 break; case PZERO: switch (data_type_of(n)){ case NZERO: zero(n,0); break; default: break; break; case NZERO: break; case PINF: switch (data_type_of(n)){ case NINF: invalid(n); break; default: inf(n,0); break; break; case NINF: switch (data_type_of(n)){ case PINF: invalid(n); break; default: inf(n,1); break; break; FADD Special Cases FADD FRn,DRm FRm,DRm +NORM...
  • Page 388 Possible Exceptions and Overflow/Underflow Exception Trap Generating Conditions: • FPU error • Invalid operation • Overflow Generation of overflow-exception traps FPSCR.PR = 0: FRn and FRm have the same sign and the exponent of at least one value is H'FE FPSCR.PR = 1: DRn and DRm have the same sign and the exponent of at least one value is H'7FE •...
  • Page 389: Fcmp (Floating-Point Compare): Floating-Point Instruction

    10.3.3 FCMP (Floating-point Compare): Floating-Point Instruction No. PR Format Operation Instruction Code Cycle T Bit When FRn = FRm,1 → T 1111nnnnmmmm0100 1 FCMP/EQ FRm,FRn Otherwise, 0 → T When DRn = DRm,1 → T 1111nnn0mmm00100 1 FCMP/EQ DRm,DRn Otherwise, 0 → T When FRn >...
  • Page 390 Operation: void FCMP_EQ(int m,n) /* FCMP/EQ FRm,FRn */ pc += 2; clear_cause(); if(fcmp_chk(m,n) == INVALID) fcmp_invalid(); else if(fcmp_chk(m,n) == EQ) T = 1; else T = 0; void FCMP_GT(int m,n) /* FCMP/GT FRm,FRn */ pc += 2; clear_cause(); if ((fcmp_chk(m,n) == INVALID) || (fcmp_chk(m,n) == UO)) fcmp_invalid();...
  • Page 391 case PINF : switch(data_type_of(n)){ case PINF :return(EQ); break; default:return(LT); break; break; case NINF : switch(data_type_of(n)){ case NINF :return(EQ); break; default:return(GT); break; break; if(FPSCR_PR == 0) { if(FR[n] == FR[m]) return(EQ); else if(FR[n] > FR[m]) return(GT); else return(LT); }else { if(DR[n>>1] == DR[m>>1]) return(EQ);...
  • Page 392 FCMP Special Cases FCMP/EQ FRn,DRn FRm,DRm NORM DNORM –0 +INF –INF qNaN sNaN NORM DNORM –0 +INF –INF qNaN sNaN Invalid Note: When DN = 1, the value of a denormalized number is treated as 0. FCMP/GT FRn,DRn FRm,DRm NORM DENORM +0 –0 +INF...
  • Page 393: Floating-Point Instruction

    10.3.4 FCNVDS (Floating-point Convert Double to Single Precision): Floating-Point Instruction Format Operation Instruction Code Cycle T Bit — — — — — FCNVDS DRm,FPUL (float)DRm → FPUL — 1111mmm010111101 Description: When FPSCR.PR = 1: This instruction converts the double-precision floating-point number in DRm to a single-precision floating-point number, and stores the result in FPUL.
  • Page 394: Fcnvds (Floating-Point Convert Double To Single Precision)

    qNaN *FPUL = 0x7fbfffff; break; sNaN set_V(); if((FPSCR & ENABLE_V) == 0) *FPUL = 0x7fbfffff; else fpu_exception_trap(); break; void normal_fcnvds(int m, float *FPUL) int sign; float abs; union { float f; int l; dstf,tmpf; union { double d; int l[2]; dstd;...
  • Page 395 Possible Exceptions and Overflow/Underflow Exception Trap Generating Conditions: • FPU error • Invalid operation • Overflow Generation of overflow-exception traps The exponent of DRn is not less than H'47E • Underflow Generation of underflow-exception traps The exponent of DRn is not more than H'380 •...
  • Page 396: Fcnvsd (Floating-Point Convert Single To Double Precision)

    10.3.5 FCNVSD (Floating-point Convert Single to Double Precision): Floating-Point Instruction Format Operation Instruction Code Cycle T Bit — — — — — FCNVSD FPUL,DRn (double) FPUL → DRn — 1111nnn010101101 Description: When FPSCR.PR = 1: This instruction converts the single-precision floating-point number in FPUL to a double-precision floating-point number, and stores the result in DRn.
  • Page 397 abs = *FPUL & 0x7fffffff; if(abs < 0x00800000){ if((FPSCR_DN == 1) || (abs == 0x00000000)){ if(sign_of(src) == 0) return(PZERO); else return(NZERO); else return(DENORM); else if(abs < 0x7f800000) return(NORM); else if(abs == 0x7f800000) { if(sign_of(src) == 0) return(PINF); else return(NINF); else if(abs < 0x7fc00000) return(qNaN);...
  • Page 398: Fdiv (Floating-Point Divide): Floating-Point Instruction

    10.3.6 FDIV (Floating-point Divide): Floating-Point Instruction Format Operation Instruction Code Cycle T Bit FRm,FRn FRn/FRm → FRn 1111nnnnmmmm0011 14 — FDIV DRm,DRn DRn/DRm → DRn 1111nnn0mmm00011 30 — FDIV Description: When FPSCR.PR = 0: Arithmetically divides the single-precision floating-point number in FRn by the single-precision floating-point number in FRm, and stores the result in FRn. When FPSCR.PR = 1: Arithmetically divides the double-precision floating-point number in DRn by the double-precision floating-point number in DRm, and stores the result in DRn.
  • Page 399 break; case PZERO: switch (data_type_of(n)){ case PZERO: case NZERO: invalid(n);break; case PINF: case NINF: break; default: dz(n,sign_of(m)^sign_of(n));break; break; case NZERO: switch (data_type_of(n)){ case PZERO: case NZERO: invalid(n); break; case PINF: inf(n,1); break; case NINF: inf(n,0); break; default: dz(n,sign_of(m)^sign_of(n)); break; break; case DENORM: set_E();...
  • Page 400 int l[4]; tmpx; if(FPSCR_PR == 0) { tmpf.f = FR[n]; /* save destination value */ dstf.f /= FR[m]; /* round toward nearest or even */ tmpd.d = dstf.f; /* convert single to double */ tmpd.d *= FR[m]; if(tmpf.f != tmpd.d) set_I(); if((tmpf.f <...
  • Page 401 FDIV Special Cases FDIV FRn,DRn FRm,DRm +NORM -NORM +DENORM –DENORM +0 +inf –inf qNaN sNaN +NORM FDIV +inf -inf -NORM -inf +inf +DENORM +inf -inf –DENORM Error -inf +inf +inf -inf invalid -inf DZ+inf +inf –inf invalid qNaN qNaN sNaN invalid Note: When DN = 1, the value of a denormalized number is treated as 0.
  • Page 402: Fipr (Floating-Point Inner Product): Floating-Point Instruction

    10.3.7 FIPR (Floating-point Inner Product): Floating-Point Instruction Format Operation Instruction Code Cycle T Bit FVm,FVn Inner_product(FVm, FVn) 1111nnmm11101101 1 — FIPR → FR[n+3] — — — — — — Notes: FV0 = {FR0, FR1, FR2, FR3} FV4 = {FR4, FR5, FR6, FR7} FV8 = {FR8, FR9, FR10, FR11} FV12 = {FR12, FR13, FR14, FR15} Description: When FPSCR.PR = 0: This instruction calculates the inner products of the 4-...
  • Page 403 and FPSCR.flag, and FR[n+3] is not updated. Appropriate processing should therefore be performed by software. Notes: None Operation: void FIPR(int m,n) /* FIPR FVm,FVn */ if(FPSCR_PR == 0) { pc += 2; clear_cause(); fipr(m,n); else undefined_operation(); Possible Exceptions and Overflow Exception Trap Generating Conditions: •...
  • Page 404: Fldi0 (Floating-Point Load Immediate 0.0): Floating-Point Instruction

    10.3.8 FLDI0 (Floating-point Load Immediate 0.0): Floating-Point Instruction Format Operation Instruction Code Cycle T Bit 0x00000000 → FRn 1111nnnn10001101 1 — FLDI0 — — — — — Description: When FPSCR.PR = 0, this instruction loads floating-point 0.0 (0x00000000) into FRn. Notes: None Operation: void FLDI0(int n)
  • Page 405: Fldi1 (Floating-Point Load Immediate 1.0): Floating-Point Instruction

    10.3.9 FLDI1 (Floating-point Load Immediate 1.0): Floating-Point Instruction Format Operation Instruction Code Cycle T Bit 0x3F800000 → FRn 1111nnnn10011101 1 — FLDI1 — — — — — Description: When FPSCR.PR = 0, this instruction loads floating-point 1.0 (0x3F800000) into FRn. Notes: None Operation: void FLDI1(int n)
  • Page 406: Flds (Floating-Point Load To System Register): Floating-Point Instruction

    10.3.10 FLDS (Floating-point Load to System register): Floating-Point Instruction Format Operation Instruction Code Cycle T Bit FRm → FPUL 1111mmmm00011101 1 — FLDS FRm,FPUL Description: This instruction loads the contents of floating-point register FRm into system register FPUL. Notes: None Operation: void FLDS(int m, float *FPUL) *FPUL = FR[m];...
  • Page 407: Float (Floating-Point Convert From Integer): Floating-Point Instruction

    10.3.11 FLOAT (Floating-point Convert from Integer): Floating-Point Instruction Format Operation Instruction Code Cycle T Bit (float)FPUL → FRn 1111nnnn00101101 1 — FLOAT FPUL,FRn (double)FPUL → DRn 1111nnn000101101 1 — FLOAT FPUL,DRn Description: When FPSCR.PR = 0: Taking the contents of FPUL as a 32-bit integer, converts this integer to a single-precision floating-point number and stores the result in FRn.
  • Page 408 Possible Exceptions: • Inexact: Not generated when FPSCR.PR = 1.
  • Page 409: Fmac (Floating-Point Multiply And Accumulate): Floating-Point Instruction

    10.3.12 FMAC (Floating-point Multiply and Accumulate): Floating-Point Instruction Format Operation Instruction Code Cycle T Bit FMAC FR0,FRm,FRn FR0 × FRm+FRn → FRn 1111nnnnmmmm1110 1 — — — — — — Description: When FPSCR.PR = 0: This instruction arithmetically multiplies the two single-precision floating-point numbers in FR0 and FRm, arithmetically adds the contents of FRn, and stores the result in FRn.
  • Page 410 case PZERO: case NZERO: zero(n,sign_of(0)^ sign_of(m)^sign_of(n)); break; default: break; case PINF: case NINF: switch (data_type_of(n)){ case DENORM: set_E(); break; case qNaN: qnan(n); break; case PINF: case NINF: if(sign_of(0)^ sign_of(m)^sign_of(n)) invalid(n); else inf(n,sign_of(0)^ sign_of(m)); break; default: inf(n,sign_of(0)^ sign_of(m)); break; case NORM: switch (data_type_of(n)){ case DENORM: set_E();...
  • Page 411 case NINF : switch (data_type_of(m)){ case PZERO: case NZERO:invalid(n); break; default: switch (data_type_of(n)){ case DENORM: set_E(); break; case qNaN: qnan(n); break; default: inf(n,sign_of(0)^sign_of(m)^sign_of(n));break; break; break; void normal_fmac(int m,n) union { long double x; int l[4]; dstx,tmpx; float dstf,srcf; if((data_type_of(n) == PZERO)|| (data_type_of(n) == NZERO)) srcf = 0.0;...
  • Page 412 dstx.l[1] &= 0xfe000000; /* round toward zero */ dstx.l[2] = 0x00000000; dstx.l[3] = 0x00000000; dstf = dstx.x; check_single_exception(&FR[n],dstf);
  • Page 413 FMAC Special Cases FMAC +NORM -NORM +0 –0 +inf –inf qNaN sNaN NORM +NORM FMAC +inf -inf -NORM -inf +inf invalid +inf +inf -inf +inf -inf -inf -inf +inf invalid -inf +inf +NORM FMAC +inf -inf -NORM -inf +inf invalid +inf +inf -inf...
  • Page 414 FMAC +NORM -NORM +0 –0 +inf –inf qNaN sNaN qNaN +NORM -NORM invalid +inf -inf invalid !sNaN qNaN qNaN all types sNaN sNaN all types invalid Notes: When DN = 1, the value of a denormalized number is treated as 0. When DN = 0, calculation for denormalized numbers is the same as for normalized numbers.
  • Page 415: Fmov (Floating-Point Move): Floating-Point Instruction

    10.3.13 FMOV (Floating-point Move): Floating-Point Instruction No. SZ Format Operation Instruction Code Cycle T Bit FRm → FRn — FMOV FRm,FRn 1111nnnnmmmm1100 DRm → DRn — FMOV DRm,DRn 1111nnn0mmm01100 FRm → (Rn) — FMOV.S FRm,@Rn 1111nnnnmmmm1010 DRm → (Rn) — FMOV DRm,@Rn 1111nnnnmmm01010...
  • Page 416 12. This instruction transfers contents of memory at address indicated by (R0 + Rm) to DRn. 13. This instruction transfers FRm contents to memory at address indicated by (R0 + Rn). 14. This instruction transfers DRm contents to memory at address indicated by (R0 + Rn). Notes: None Operation: void FMOV(int m,n)
  • Page 417 load_int(R[m],FR[n]); R[m] += 4; pc += 2; void FMOV_RESTORE_DR(int m,n) /* FMOV @Rm+,DRn */ load_quad(R[m],DR[n>>1]) ; R[m] += 8; pc += 2; void FMOV_SAVE(int m,n) /* FMOV.S FRm,@–Rn */ store_int(FR[m],R[n]-4); R[n] -= 4; pc += 2; void FMOV_SAVE_DR(int m,n) /* FMOV DRm,@–Rn */ store_quad(DR[m>>1],R[n]-8);...
  • Page 418 void FMOV_INDEX_STORE_DR(int m,n)/*FMOV DRm,@(R0,Rn)*/ store_quad(DR[m>>1], R[0] + R[n]); pc += 2; Possible Exceptions: • Data TLB miss exception • Data protection violation exception • Initial page write exception • Data address error
  • Page 419: Fmov (Floating-Point Move Extension): Floating-Point Instruction

    10.3.14 FMOV (Floating-point Move Extension): Floating-Point Instruction No. SZ Format Operation Instruction Code Cycle T Bit XRm → (Rn) 1111nnnnmmm11010 1 — FMOV XDm,@Rn (Rm) → XDn 1111nnn1mmmm1000 1 — FMOV @Rm,XDn (Rm) → XDn, Rm+8 1111nnn1mmmm1001 1 — FMOV @Rm+,XDn →...
  • Page 420 Operation: void FMOV_STORE_XD(int m,n) /* FMOV XDm,@Rn */ store_quad(XD[m>>1],R[n]); pc += 2; void FMOV_LOAD_XD(int m,n) /* FMOV @Rm,XDn */ load_quad(R[m],XD[n>>1]); pc += 2; void FMOV_RESTORE_XD(int m,n) /* FMOV @Rm+,XDn */ load_quad(R[m],XD[n>>1]); R[m] += 8; pc += 2; void FMOV_SAVE_XD(int m,n) /* FMOV XDm,@–Rn */ store_quad(XD[m>>1],R[n]-8);...
  • Page 421 void FMOV_XDDR(int m,n) /* FMOV XDm,DRn */ DR[n>>1] = XD[m>>1]; pc += 2; void FMOV_DRXD(int m,n) /* FMOV DRm,XDn */ XD[n>>1] = DR[m>>1]; pc += 2; Possible Exceptions: • Data TLB miss exception • Data protection violation exception • Initial page write exception •...
  • Page 422: Fmul (Floating-Point Multiply): Floating-Point Instruction

    10.3.15 FMUL (Floating-point Multiply): Floating-Point Instruction Format Operation Instruction Code Cycle T Bit FRn × FRm → FRn 1111nnnnmmmm0010 1 — FMUL FRm,FRn DRn × DRm → DRn 1111nnn0mmm00010 3 — FMUL DRm,DRn Description: When FPSCR.PR = 0: Arithmetically multiplies the two single-precision floating-point numbers in FRn and FRm, and stores the result in FRn.
  • Page 423 default: normal_fmul(m,n); break; break; case PZERO: case NZERO: switch (data_type_of(n)){ case PINF: case NINF: invalid(n); break; default: zero(n,sign_of(m)^sign_of(n));break; break; case PINF : case NINF : switch (data_type_of(n)){ case PZERO: case NZERO: invalid(n); break; default: inf(n,sign_of(m)^sign_of(n));break break; FMUL Special Cases (FPSCR.PR = 0) FMUL +NORM -NORM...
  • Page 424 FMUL Special Cases (FPSCR.PR = 1) FMUL +NORM -NORM +DENORM –DENORM +0 +inf –inf qNaN sNaN +NORM FMUL +inf -inf -NORM -inf +inf +DENORM +inf -inf –DENORM Error -inf +inf invalid +inf +inf -inf +inf -inf +inf -inf –inf -inf +inf -inf +inf...
  • Page 425: Fneg (Floating-Point Negate Value): Floating-Point Instruction

    10.3.16 FNEG (Floating-point Negate Value): Floating-Point Instruction Format / Operation / Instruction Code / Cycle / T Bit: FNEG FRn: FRn ^ H'80000000 → FRn; 1111nnnn01001101; 1; —. FNEG DRn: DRn ^ H'8000000000000000 → DRn; 1111nnn001001101; 1; —. Description: This instruction inverts the most significant bit (sign bit) of the contents of floating-point register FRn/DRn, and stores the result in FRn/DRn.
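Since FNEG is a pure sign-bit inversion, it can be sketched directly on the register bit patterns. The function names below are hypothetical and operate on raw 32-bit and 64-bit values rather than on C floating-point types.

```c
#include <stdint.h>

/* Hypothetical model of FNEG FRn: FRn ^ H'80000000 → FRn. */
static uint32_t fneg_single_bits(uint32_t frn)
{
    return frn ^ 0x80000000u;
}

/* Hypothetical model of FNEG DRn: DRn ^ H'8000000000000000 → DRn. */
static uint64_t fneg_double_bits(uint64_t drn)
{
    return drn ^ 0x8000000000000000ull;
}
```

Because only the sign bit changes, applying either function twice returns the original pattern, including for zeros, infinities and NaN encodings.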
  • Page 426: Fpchg (Pr-Bit Change): Floating-Point Instruction

    10.3.17 FPCHG (PR-bit Change): Floating-Point Instruction Format Operation Instruction Code Cycle T Bit ~FPSCR.PR → FPSCR.PR 1111011111111101 1 — FPCHG Description: This instruction inverts the PR bit of the floating-point status register FPSCR. The value of this bit selects single-precision or double-precision operation. Notes: None Operation: void FPCHG(){/* FPCHG */}...
  • Page 427: Frchg (Fr-Bit Change): Floating-Point Instruction

    10.3.18 FRCHG (FR-bit Change): Floating-Point Instruction Format Operation Instruction Code Cycle T Bit ~FPSCR.FR → FPSCR.FR 1111101111111101 1 — FRCHG — — — — — Description: This instruction inverts the FR bit in floating-point register FPSCR. When the FR bit in FPSCR is changed, FR0 to FR15 in FPR0_BANK0 to FPR15_BANK0 and FPR0_BANK1 to FPR15_BANK1 become XR0 to XR15, and XR0 to XR15 become FR0 to FR15.
  • Page 428: Floating-Point Instruction

    10.3.19 FSCA (Floating Point Sine And Cosine Approximate): Floating-Point Instruction Format Operation Instruction Code Cycle T Bit sin(FPUL) → FRn 1111nnn011111101 3 — FSCA FPUL,DRn cos(FPUL) → FR[n+1] — reserved 1111nnnn11111101 — — Description: This instruction calculates the sine and cosine approximations of FPUL (absolute error is within ±2^–21) as single-precision floating point values, and places the values of the sine and cosine in FRn and FR[n + 1], respectively.
  • Page 429 1: undefined_operation(); /* reserved */ Data Format of Source Operand: Angle is specified as shown below, i.e., as a signed fraction in twos complement. The result of sin/cos is a single-precision floating-point number. 0x7FFFFFFF to 0x00000001 : 360 × 2 −...
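The source operand format above can be illustrated with a small conversion helper. This sketch assumes that FPUL is read as a signed fixed-point count in which one unit equals 360/2^16 degrees, which is consistent with the range endpoints quoted above; the function name is hypothetical and the sine and cosine approximation itself is not modelled.

```c
#include <stdint.h>

/* Hypothetical helper: converts an FSCA source operand to degrees,
 * assuming one unit of FPUL corresponds to 360 / 2^16 degrees. */
static double fsca_angle_degrees(int32_t fpul)
{
    return (double)fpul * (360.0 / 65536.0);
}
```

Under this assumption, H'00004000 corresponds to 90 degrees and H'00010000 to one full rotation.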
  • Page 430: Fschg (Sz-Bit Change): Floating-Point Instruction

    10.3.20 FSCHG (SZ-bit Change): Floating-Point Instruction Format Operation Instruction Code Cycle T Bit ~FPSCR.SZ → FPSCR.SZ 1111001111111101 1 — FSCHG Description: This instruction inverts the SZ bit of the floating-point status register FPSCR. Changing the value of the SZ bit in FPSCR switches the amount of data transferred by the FMOV instruction between one single-precision value and a pair of single-precision values.
  • Page 431: Fsqrt (Floating-Point Square Root): Floating-Point Instruction

    10.3.21 FSQRT (Floating-point Square Root): Floating-Point Instruction Format Operation Instruction Code Cycle T Bit sqrt (FRn)* → FRn 1111nnnn01101101 14 — FSQRT FRn sqrt (DRn)* → DRn 1111nnn001101101 30 — FSQRT DRn Note: sqrt(FRn) and sqrt(DRn) are the square roots of FRn and DRn, respectively. Description: When FPSCR.PR = 0: Finds the arithmetical square root of the single-precision floating-point number in FRn, and stores the result in FRn.
  • Page 432 void normal_fsqrt(int n) union { float f; int l; dstf,tmpf; union { double d; int l[2]; dstd,tmpd; union { long double x; int l[4]; tmpx; if(FPSCR_PR == 0) { tmpf.f = FR[n]; /* save destination value */ dstf.f = sqrt(FR[n]); /* round toward nearest or even */ tmpd.d = dstf.f;...
  • Page 433 FSQRT Special Cases: +NORM –NORM +DENORM –DENORM +0 –0 +INF –INF qNaN sNaN FSQRT SQRT Invalid Error Error –0 +INF Invalid qNaN Invalid (FRn) Note: When DN = 1, the value of a denormalized number is treated as 0. Possible Exceptions: •...
  • Page 434: Floating-Point Instruction

    10.3.22 FSRRA (Floating Point Square Reciprocal Approximate): Floating-Point Instruction Format Operation Instruction Code Cycle T Bit 1/ sqrt(FRn)* → FRn — FSRRA FRn 1111nnnn01111101 — reserved 1111nnnn01111101 Note: sqrt(FRn) is the square root of FRn. Description: This instruction takes the approximate inverse of the arithmetic square root (absolute error is within ±2^–21) of the single-precision floating-point number in FRn and writes the result to FRn.
  • Page 435 PZERO: NZERO: dz(n,sign_of(n)); break; PINF: FR[n]=0;break; NINF: invalid(n); break; qNAN: qnan(n); break; sNAN: invalid(n); break; FSRRA Special Cases +NORM –NORM +DENORM –DENORM +0 –0 +INF –INF qNaN sNaN FSRRA(FRn) 1/SQRT Invalid Error Invalid DZ DZ +0 Invalid qNaN Invalid Note: When DN = 1, the value of a denormalized number is treated as 0. Possible Exceptions: •...
  • Page 436: Fsts (Floating-Point Store System Register): Floating-Point Instruction

    10.3.23 FSTS (Floating-point Store System Register): Floating-Point Instruction Format Operation Instruction Code Cycle T Bit FPUL → FRn — FSTS FPUL,FRn 1111nnnn00001101 Description: This instruction transfers the contents of system register FPUL to floating-point register FRn. Notes: None Operation: void FSTS(int n, float *FPUL) FR[n] = *FPUL;...
  • Page 437: Fsub (Floating-Point Subtract): Floating-Point Instruction

    10.3.24 FSUB (Floating-point Subtract): Floating-Point Instruction Format Operation Instruction Code Cycle T Bit FRn-FRm → FRn 1111nnnnmmmm0001 1 — FSUB FRm,FRn DRn-DRm → DRn 1111nnn0mmm00001 1 — FSUB DRm,DRn Description: When FPSCR.PR = 0: Arithmetically subtracts the single-precision floating-point number in FRm from the single-precision floating-point number in FRn, and stores the result in FRn.
  • Page 438 case PZERO: break; case NZERO: switch (data_type_of(n)){ case NZERO: zero(n,0); break; default: break; break; case PINF: switch (data_type_of(n)){ case PINF: invalid(n); break; default: inf(n,1); break; break; case NINF: switch (data_type_of(n)){ case NINF: invalid(n); break; default: inf(n,0); break; break; FSUB Special Cases FSUB FRn,DRn FRm,DRm +NORM...
  • Page 439 Possible Exceptions and Overflow/Underflow Exception Trap Generating Conditions: • FPU error • Invalid operation • Overflow Generation of overflow-exception traps FPSCR.PR = 0: FRn and FRm have different signs and the exponent of at least one value is H'FE FPSCR.PR = 1: DRn and DRm have different signs and the exponent of at least one value is H'7FE •...
  • Page 440: Floating-Point Instruction

    10.3.25 FTRC (Floating-point Truncate and Convert to integer): Floating-Point Instruction Format Operation Instruction Code Cycle T Bit (long)FRm → FPUL 1111mmmm00111101 1 — FTRC FRm,FPUL (long)DRm → FPUL 1111mmm000111101 1 — FTRC DRm,FPUL Description: When FPSCR.PR = 0: Converts the single-precision floating-point number in FRm to a 32-bit integer, and stores the result in FPUL.
  • Page 441 else{ /* case FPSCR.PR=1 */ switch(ftrc_double_type_of(m)){ case NORM: *FPUL = DR[m>>1]; break; case PINF: ftrc_invalid(0,*FPUL); break; case NINF: ftrc_invalid(1, *FPUL); break; int ftrc_single_type_of(int m) if(sign_of(m) == 0){ if(FR_HEX[m] > 0x7f800000) return(NINF); /* NaN */ else if(FR_HEX[m] > P_INT_SINGLE_RANGE) return(PINF); /* out of range,+INF */ else return(NORM);...
  • Page 442

    void ftrc_invalid(int sign, int *FPUL)
    {
        set_V();
        if((FPSCR & ENABLE_V) == 0){
            if(sign == 0) *FPUL = 0x7fffffff;
            else *FPUL = 0x80000000;
        }
        else fpu_exception_trap();
    }

    FTRC Special Cases Positive Negative Out of Out of FRn,DRn NORM –0 Range Range +INF –INF qNaN sNaN FTRC Invalid...
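The saturating behavior of FTRC when the invalid-operation trap is disabled can be modeled in portable C. This is a rough sketch under stated assumptions, not the manual's pseudocode: `ftrc_single_model` is a hypothetical name, "out of range" is assumed to mean outside the signed 32-bit integer range (the manual's `P_INT_SINGLE_RANGE` constant is not shown in this excerpt), and NaN inputs of either sign are assumed to give H'80000000, which the excerpt confirms only for positive-sign NaN.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of FTRC FRm,FPUL (FPSCR.PR = 0) with the
 * invalid-operation trap disabled. Returns the value left in FPUL. */
int32_t ftrc_single_model(float frm)
{
    if (frm != frm)                 /* NaN: assumed to give H'80000000 */
        return INT32_MIN;
    if (frm >= 2147483648.0f)       /* positive out of range or +INF */
        return INT32_MAX;           /* H'7FFFFFFF */
    if (frm < -2147483648.0f)       /* negative out of range or -INF */
        return INT32_MIN;           /* H'80000000 */
    return (int32_t)frm;            /* in range: C truncates toward zero */
}
```

With these assumptions, `ftrc_single_model(1.9f)` gives 1 and `ftrc_single_model(-1.9f)` gives -1, while values such as 3.0e9f saturate to H'7FFFFFFF. The model does not set FPSCR cause or flag bits.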
  • Page 443: FTRV (Floating-Point Transform Vector): Floating-Point Instruction

    10.3.26 FTRV (Floating-point Transform Vector): Floating-Point Instruction

    Format            Operation                              Instruction Code   Cycle  T Bit
    FTRV XMTRX,FVn    transform_vector (XMTRX, FVn) → FVn    1111nn0111111101   4      —
    —                 —                                      —                  —      —

    Description: When FPSCR.PR = 0: This instruction takes the contents of floating-point registers XF0 to XF15 indicated by XMTRX as a 4-row ×...
  • Page 444 When FPSCR.enable.V/O/U/I is set, an FPU exception trap is generated regardless of whether or not an exception has occurred. When an exception occurs, correct exception information is reflected in FPSCR.cause and FPSCR.flag, and FVn is not updated. Appropriate processing should therefore be performed by software. Notes: None Operation: void FTRV (int n)
  • Page 445 Possible Exceptions: • Invalid operation • Overflow • Underflow • Inexact
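The FTRV operation, `transform_vector (XMTRX, FVn) → FVn`, can be pictured as a 4 × 4 matrix applied to a 4-element vector. The sketch below is illustrative only. It assumes the operation is a plain matrix-vector product and that the matrix element in row i, column j is held in XF[i + 4j]; neither detail is stated in the excerpt above, and `ftrv_model` is a hypothetical name. The hardware result may also differ in rounding from this straightforward summation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: xf[0..15] stands for XF0..XF15 (XMTRX) and
 * fv[0..3] for the four registers of FVn, which is overwritten.
 * Assumption: matrix element (row i, column j) lives in xf[i + 4*j]. */
void ftrv_model(const float xf[16], float fv[4])
{
    float result[4];

    assert(xf != NULL && fv != NULL);

    for (int i = 0; i < 4; i++) {
        result[i] = 0.0f;
        for (int j = 0; j < 4; j++)
            result[i] += xf[i + 4 * j] * fv[j];
    }
    /* Write back only after all four sums are formed, since FVn is both
     * the source vector and the destination. */
    for (int i = 0; i < 4; i++)
        fv[i] = result[i];
}
```

The temporary `result` array matters: writing each sum straight into `fv` would corrupt the inputs of the remaining rows.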
  • Page 447: Section 11 List Of Registers

    Section 11 List of Registers The address map gives information on the on-chip I/O registers and is configured as described below. Register Addresses (by functional module, in order of the corresponding section numbers): • Descriptions by functional module, in order of the corresponding section numbers •...
  • Page 448: Register Addresses

    11.1 Register Addresses (by functional module, in order of the corresponding section numbers) Entries under Access Size indicate the number of bits. Note: Access to undefined or reserved addresses is prohibited. Operation and continued operation are not guaranteed if such addresses are accessed, so do not attempt such access. Area 7 Access Module...
  • Page 449

    Module    Name                                          Abbreviation  R/W  P4 Address*   Area 7 Address*  Access Size
    L memory  L memory transfer source address register 0   LSA0               H'FF00 0050   H'1F00 0050
              L memory transfer source address register 1   LSA1               H'FF00 0054   H'1F00 0054
              L memory transfer                             LDA0               H'FF00 0058   H'1F00 0058...
  • Page 450: Register States In Each Operating Mode

    11.2 Register States in Each Operating Mode

    Module              Name                       Abbreviation  Power-on Reset  Manual Reset  Sleep     Standby
    Exception handling  TRAPA exception register   TRA           Undefined       Undefined     Retained  Retained
                        Exception event register   EXPEVT        H'0000 0000     H'0000 0020   Retained  Retained
                        Interrupt event register   INTEVT        Undefined       Undefined     Retained...
  • Page 451: Appendix

    Appendix CPU Operation Mode Register (CPUOPM) The CPUOPM register controls the CPU operation mode. It can be read or written with 32-bit access at address H'FF2F0000 in the P4 area or H'1F2F0000 in area 7. Reserved bits must be written with their initial values; operation is not guaranteed if any other value is written.
  • Page 452

    Bit      Bit Name  Initial Value  R/W  Description
    31 to 6  —         H'000000F      R    Reserved. The write value must be the initial value.
             RABD                          Speculative execution bit for subroutine return. 0: Instruction fetch for subroutine return is issued speculatively. When this bit is set to 0, refer to Appendix C, Speculative Execution for Subroutine Return.
  • Page 453: Instruction Prefetching And Its Side Effects

    Instruction Prefetching and Its Side Effects This LSI has an internal buffer that holds pre-read instructions, and it always performs pre-reading. Therefore, program code must not be located in the last 64-byte area of any memory space. If program code is located in such an area, an instruction-prefetch bus access may extend beyond the boundary of the memory area.
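A build-time or boot-time check can catch code placed in the last 64 bytes of a memory area. The following is a minimal sketch: the function name, the parameters, and the idea of describing a memory area by a start address and size are assumptions made for illustration, not part of the manual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PREFETCH_GUARD_BYTES 64u   /* last 64 bytes must hold no code */

/* Hypothetical check: returns true when the code region
 * [code_start, code_start + code_size) lies inside the memory area
 * [area_start, area_start + area_size) and stays clear of the final
 * 64 bytes of that area. 64-bit arithmetic avoids wrap-around at the
 * top of the 32-bit address space. */
bool code_avoids_prefetch_guard(uint64_t code_start, uint64_t code_size,
                                uint64_t area_start, uint64_t area_size)
{
    assert(area_size >= PREFETCH_GUARD_BYTES);

    uint64_t guard_start = area_start + area_size - PREFETCH_GUARD_BYTES;
    uint64_t code_end    = code_start + code_size;   /* exclusive */

    return code_start >= area_start && code_end <= guard_start;
}
```

In practice the same constraint is often enforced in the linker script by shortening the code region by 64 bytes; the function above simply makes the rule explicit.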
  • Page 454: Speculative Execution For Subroutine Return

    Speculative Execution for Subroutine Return The SH-4A has a mechanism that issues an instruction fetch speculatively when returning from a subroutine. Issuing the fetch speculatively may shorten the number of execution cycles needed to return from the subroutine. This function is enabled by clearing bit 5 (RABD) of the CPU Operation Mode register (CPUOPM) to 0.
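Because reserved CPUOPM bits must keep their initial values, a change to RABD is best done as a read-modify-write that touches bit 5 only. The sketch below computes the new register value and nothing more; the function and macro names are hypothetical, and treating RABD = 1 as "speculative fetch disabled" is an assumption drawn from the statement that 0 enables it. The actual register store, and any sequence the manual requires around it, is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPUOPM_RABD (1u << 5)   /* bit 5: 0 = speculative fetch enabled */

/* Hypothetical helper: given the current CPUOPM value (read from
 * H'FF2F0000 in the P4 area), return the value to write back so that
 * only RABD changes and all other bits are preserved. */
uint32_t cpuopm_with_speculative_return(uint32_t cpuopm, bool enable)
{
    if (enable)
        return cpuopm & ~CPUOPM_RABD;   /* clear RABD to enable */
    return cpuopm | CPUOPM_RABD;        /* assumed: set RABD to disable */
}
```

For instance, starting from H'000003E0, enabling speculative return yields H'000003C0, with every bit other than bit 5 unchanged.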
  • Page 455: Version Registers (PVR, PRR)

    Since the values of the version registers differ for every product, refer to the hardware manual or contact Renesas Technology Corp. Note: Bits 7 to 0 of the PVR register and bits 3 to 0 of the PRR register should be masked by software.
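The masking described in the note amounts to clearing the low byte of PVR and the low nibble of PRR before comparing against an expected product code. A minimal sketch follows; the function names are hypothetical and the register values used in any comparison must come from the relevant hardware manual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PVR_COMPARE_MASK 0xFFFFFF00u   /* ignore PVR bits 7 to 0 */
#define PRR_COMPARE_MASK 0xFFFFFFF0u   /* ignore PRR bits 3 to 0 */

/* Hypothetical helpers: strip the bits that software should mask. */
uint32_t pvr_masked(uint32_t pvr) { return pvr & PVR_COMPARE_MASK; }
uint32_t prr_masked(uint32_t prr) { return prr & PRR_COMPARE_MASK; }

/* Compare raw register readings against expected values, ignoring the
 * masked bits on both sides. */
bool version_matches(uint32_t pvr, uint32_t prr,
                     uint32_t expected_pvr, uint32_t expected_prr)
{
    return pvr_masked(pvr) == pvr_masked(expected_pvr) &&
           prr_masked(prr) == prr_masked(expected_prr);
}
```

Masking both the reading and the expected value keeps the comparison correct even if the expected constant was copied from a device with different low-order bits.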
  • Page 456 Product Register (PRR): [Register bit-layout figure with rows Bit, Initial value, and R/W; the figure includes a Product field.] Columns: Bit, Bit Name, Initial Value, Description. 31 to 16 —...
  • Page 457: Main Revisions And Additions In This Edition

    Page Revision (See Manual for Details) Preface — Deleted. The SH-4A is a RISC (Reduced Instruction Set Computer) microcomputer which includes a Renesas Technology-original RISC CPU as its core. and the peripheral functions required to configure a system. 1.1 Features Amended.
  • Page 458 Item Page Revision (See Manual for Details) Table 1.2 Changes from SH-4 to Added. SH-4A Section No. and Sub-section Name Sub-section Name Changes 8. Caches 8.3.6 OC Two-Way Newly added. Mode Instruction Cache IC index mode is Operation deleted. 8.4.3 IC Two-Way Newly added.
  • Page 459 Item Page Revision (See Manual for Details) Figure 4.2 Instruction Execution Amended. Patterns (7) (6-3) LDS.L to FPUL: 1 issue cycle (6-5) LDS to FPSCR: 1 issue cycle (6-7) LDS.L to FPSCR: 1 issue cycle Table 4.2 Instruction Groups Amended. Instruction Group Instruction...
  • Page 460: Section 8 Caches

    Item Page Revision (See Manual for Details) 7.2.2 Page Table Entry Low Added. Register (PTEL) Bit Name Initial Value — 7.2.6 Physical Address Space Amended. Control Register (PASCR) Name Description 7 to 0 Buffered Write Control for Each Area (64 Mbytes) When writing is performed without using the cache or in the cache write-through mode, these bits specify whether the next bus access from the CPU...
  • Page 461: Table 7.1 Register Configuration

    Item Page Revision (See Manual for Details) 8.7.3 Transfer to External Deleted. Memory The SQ area (H'E000 0000 to H'E3FF FFFF) is set in • VPN of the UTLB, and the transfer destination physical When MMU is enabled (AT = address in PPN.
  • Page 462 Item Page Revision (See Manual for Details) 10.1.4 AND (AND Logical) Added. • Exceptions are checked taking a data access by this Possible Exceptions instruction as a byte load and a byte store. 10.1.50 OR (OR Logical) Added. • Exceptions are checked taking a data access by this Possible Exceptions instruction as a byte load and a byte store.
  • Page 463 Item Page Revision (See Manual for Details) 10.1.76 SYNCO (Synchronize Deleted. Data Operation) 1. Ordering access to memory areas which are shared • Example with other memory users 2. Ordering access to memory-mapped hardware registers 2. Flushing all write buffers 3.
  • Page 464 Item Page Revision (See Manual for Details) Appendix A Added. The write value to the reserved bits should be the initial value. The operation is not guaranteed if the write value is not the initial value. The CPUOPM register should be updated by the CPU store instruction not the access from SuperHyway bus master except CPU.
  • Page 465: Index

    Index

    32-Bit address extended mode, 151
    Address space identifier (ASID), 120
    Address translation, 120
    Addressing modes, 25
    Arithmetic operation instructions, 33
    Floating-point graphics acceleration instructions, 42
    Floating-point registers, 9, 12
    Floating-point single-precision instructions, 40
    FPU error, 109
    ASID............
  • Page 466

    Manual reset, 75
    Memory management unit, 113
    Memory-mapped registers, 19
    Multiple virtual memory mode, 120
    NMI (nonmaskable interrupt), 91
    MACL, 16
    MMUCR, 125, 428, 430
    PASCR, 128, 428, 430
    PC, 16
    PR, 16
    PRR, 436
    PTEH ........
  • Page 467

    Unconditional trap, 84
    Underflow, 109
    User mode, 8
    UTLB, 131
    UTLB address array, 149
    UTLB data array, 150
    Validity bit, 132
    Vector addresses, 70
    Virtual address space, 115
    VPN, 131
    Write-through bit, 133
  • Page 469 Publication Date: Rev.1.00, Nov 27, 2003 Rev.1.50, Oct 29, 2004 Published by: Sales Strategic Planning Div. Renesas Technology Corp. Edited by: Technical Documentation & Information Department Renesas Kodaira Semiconductor Co., Ltd. © 2004. Renesas Technology Corp., All rights reserved. Printed in Japan.
  • Page 470 Nippon Bldg., 2-6-2, Ohte-machi, Chiyoda-ku, Tokyo 100-0004, Japan http://www.renesas.com RENESAS SALES OFFICES Refer to "http://www.renesas.com/en/network" for the latest and detailed information. Renesas Technology America, Inc. 450 Holger Way, San Jose, CA 95134-1368, U.S.A Tel: <1> (408) 382-7500, Fax: <1> (408) 382-7501 Renesas Technology Europe Limited Dukes Meadow, Millboard Road, Bourne End, Buckinghamshire, SL8 5FH, U.K.
  • Page 472 SH-4A Software Manual...
