This document is intended only to assist the reader in the use of the product. ARM Limited shall not be liable for any loss or damage arising from the use of any information in this document, or any error or omission in such information, or any incorrect use of the product.
Preface
This preface introduces the ARM9TDMI (Revision 1 and subsequent revisions), which is a member of the ARM family of general-purpose microprocessors. It contains the following sections:
• About this document on page viii.
• Further reading on page ix.
About this document
This document is a reference manual for the ARM9TDMI microprocessor. The ARM9TDMI includes the following features:
• The option, selectable using the UNIEN signal, of using two unidirectional buses DD[31:0] and DDIN[31:0], instead of a single bidirectional data bus. This is described in Unidirectional/bidirectional mode interface on page 3-10.
Further reading
This section lists publications by ARM Limited, and by third parties.
ARM publications
• ARM Architecture Reference Manual (ARM DDI 0100).
• ARM7TDMI Data Sheet (ARM DDI 0029).
Other reading
• IEEE Std. 1149.1 - 1990, Standard Test Access Port and Boundary-Scan Architecture.
Typographical conventions
The following typographical conventions are used in this document:
bold    Highlights ARM processor signal names within text, and interface elements such as menu names. May also be used for emphasis in descriptive lists where appropriate.
italic  Highlights special terminology, cross-references and citations.
• a concise explanation of your comments. General suggestions for additions and improvements are also welcome. Feedback on the ARM9TDMI If you have any comments or suggestions about the ARM9TDMI, please contact your supplier giving: • the product name •...
The ARM9TDMI supports both the 32-bit ARM and 16-bit Thumb instruction sets, allowing the user to trade off between high performance and high code density. The ARM9TDMI supports the ARM debug architecture and includes logic to assist in both hardware and software debug.
ARM Architecture Reference Manual. The ARM v4T architecture specifies a small number of implementation options. The options selected in the ARM9TDMI implementation are listed in the table below. For comparison, the options selected for the ARM7TDMI implementation are also shown:...
2.1.2 Instruction set extension spaces All ARM processors implement the undefined instruction space as one of the entry mechanisms for the Undefined Instruction Exception. That is, ARM instructions with opcode[27:25] = 0b011 and opcode[4] = 1 are UNDEFINED on all ARM processors including the ARM9TDMI and ARM7TDMI.
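The bit test described above can be written out as a short check. This is only an illustrative sketch: the function name `in_undefined_space` is hypothetical, and the opcode is assumed to be supplied as a 32-bit integer.

```python
def in_undefined_space(opcode: int) -> bool:
    """Return True if a 32-bit ARM opcode has opcode[27:25] = 0b011 and opcode[4] = 1."""
    bits_27_25 = (opcode >> 25) & 0b111  # extract opcode[27:25]
    bit_4 = (opcode >> 4) & 1            # extract opcode[4]
    return bits_27_25 == 0b011 and bit_4 == 1
```

For example, 0xE7F000F0 satisfies both conditions, while 0xE7900001 has opcode[27:25] = 0b011 but opcode[4] = 0 and so falls outside this space.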
Programmer’s Model
Pipeline implementation and interlocks
The ARM9TDMI implementation uses a five-stage pipeline design. These five stages are:
• instruction fetch (F)
• instruction decode (D)
• execute (E)
• data memory access (M)
• register write (W).
ARM implementations are fully interlocked, so that software will function identically across different implementations without concern for pipeline effects.
Chapter 3 ARM9TDMI Processor Core Memory Interface This chapter describes the memory interface of the ARM9TDMI processor core. The processor core has a Harvard memory architecture, and so the memory interface is separated into the instruction interface and the data interface. The information in this chapter is broken down as follows: •...
About the memory interface
The ARM9TDMI has a Harvard bus architecture with separate instruction and data interfaces. This allows concurrent instruction and data accesses, and greatly reduces the CPI of the processor. For optimal performance, single-cycle memory accesses are required on both interfaces, although the core can be wait-stated for non-sequential accesses or for slower memory systems.
Alternatively, wait states may be inserted by stretching either phase of GCLK before it is applied to the processor. ARM9TDMI does not contain any dynamic logic which relies on regular clocking to maintain its state. Therefore there is no limit on the maximum period for which GCLK may be stretched, in either phase, or the time for which nWAIT may be held LOW.
However, in order to ease the system design, it is possible to connect the ARM9TDMI to memory which takes two (or more) cycles for a non-sequential (N) access, and one cycle for a sequential (S) access.
ITBIT signal. When this signal is LOW, the processor is in ARM state, and 32-bit instructions are fetched. When it is HIGH, the processor is in Thumb state and 16-bit instructions are fetched.
If the memory control logic does not make use of the DABORT signal, it must be tied LOW, but with the exception that data can be transferred to and from the ARM9TDMI core. ARM DDI0145B...
For coprocessor transfers, access to memory is not required, but data is transferred between the ARM9TDMI and the coprocessor using the data buses, DD[31:0] and DDIN[31:0]. DnRW indicates the direction of the transfer and DMAS[1:0] indicates word transfers, as all coprocessor transfers are word sized.
If UNIEN is LOW, DD[31:0] is a tristate output bus used to transfer write data. It is only driven when the ARM9TDMI is performing a write to memory. By wiring DD[31:0] to the input DDIN[31:0] bus (externally to the ARM9TDMI), a bidirectional data bus can be formed.
Endian effects for data transfers
The ARM9TDMI supports 32-bit, 16-bit and 8-bit data memory access sizes. The endian configuration of the processor, set by BIGEND, affects only non-word transfers (16-bit and 8-bit transfers). For data writes by the processor, the write data is duplicated on the data bus. So for a 16-bit data store, one copy of the data appears on the upper half of the data bus, DD[31:16], and the same data appears on the lower half, DD[15:0].
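The duplication of write data can be illustrated with a small software model. This is a sketch only: the function name `replicate_write_data` is hypothetical, and the byte case (four identical copies across DD[31:0]) is an assumption extrapolated from the 16-bit case described above.

```python
def replicate_write_data(value: int, size_bits: int) -> int:
    """Model the 32-bit pattern driven on DD[31:0] for a store of the given size."""
    if size_bits == 32:
        return value & 0xFFFFFFFF          # word: driven as-is
    if size_bits == 16:
        half = value & 0xFFFF
        return (half << 16) | half         # same halfword on DD[31:16] and DD[15:0]
    if size_bits == 8:
        byte = value & 0xFF
        return byte * 0x01010101           # assumed: same byte on all four lanes
    raise ValueError("size_bits must be 8, 16 or 32")
```

With this model, a 16-bit store of 0xABCD appears on the bus as 0xABCDABCD, so the memory system can pick whichever half matches the addressed location.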
If GCLK is LOW, they will not change until after the GCLK goes HIGH. When nRESET is driven HIGH, the ARM9TDMI starts requesting memory again once the signal has been synchronized, and the first memory access will start two cycles later.
Chapter 4 ARM9TDMI Coprocessor Interface This chapter describes the ARM9TDMI coprocessor interface, and details the following operations: • About the coprocessor interface on page 4-2. • LDC/STC on page 4-3. • MCR/MRC on page 4-9. • Interlocked MCR on page 4-11.
Coprocessors determine the instructions they need to execute using a pipeline follower in the coprocessor. As each instruction arrives from memory, it enters both the ARM pipeline and the coprocessor pipeline. Typically, a coprocessor operates one clock phase behind the ARM9TDMI pipeline. The...
• the fetched instruction should be latched. In all other cases, the ARM9TDMI pipeline is stalled, and the coprocessor pipeline should not advance. Figure 4-2 shows the timing for these signals, and indicates when the coprocessor pipeline should advance its state. In this timing diagram, Coproc Clock shows a processed version of GCLK with InMREQ and nWAIT.
This is the only cycle in which LATECANCEL can be asserted. On the falling edge of the clock, the ARM9TDMI processor core examines the coprocessor handshake signals CHSD[1:0] or CHSE[1:0]: •...
When only one further word is to be transferred, the coprocessor drives the handshake signals with LAST. In phase 2 of the execute stage, the ARM9TDMI processor core outputs the address for the LDC/STC. Also in this phase, DnMREQ is driven LOW, indicating to the memory system that a memory access is required at the data end of the device.
If a coprocessor is not attached to the ARM9TDMI, the handshake signals must be driven with “10” ABSENT, otherwise the ARM9TDMI processor will hang if a coprocessor instruction enters the pipeline. If multiple coprocessors are to be attached to the interface, the handshaking signals can be combined by ANDing bit 1, and ORing bit 0.
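The AND/OR combination rule can be modelled in a few lines. This is a hedged sketch: only the ABSENT encoding ("10") is given in the text above, so the WAIT, GO and LAST encodings used here are assumptions, as is the function name `combine_handshake`.

```python
ABSENT = 0b10  # stated in the text
WAIT = 0b00    # assumed encoding
GO = 0b01      # assumed encoding
LAST = 0b11    # assumed encoding

def combine_handshake(responses):
    """Combine 2-bit handshake responses by ANDing bit 1 and ORing bit 0."""
    bit1 = 1  # identity for AND
    bit0 = 0  # identity for OR
    for response in responses:
        bit1 &= (response >> 1) & 1
        bit0 |= response & 1
    return (bit1 << 1) | bit0
```

Under these assumptions, any single coprocessor that responds overrides the ABSENT responses of the others, and an empty list of coprocessors yields ABSENT, which matches the requirement for a system with no coprocessor attached.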
These cycles look very similar to STC/LDC. An example, with a busy-wait state, is shown in Figure 4-3:
Figure 4-3 ARM9TDMI MCR / MRC transfer timing
First, InMREQ is driven LOW to denote that the instruction on ID is entering the decode stage of the pipeline.
In the case of an MCR, the DD[31:0] bus is driven with the register data. In the case of an MRC, DDIN[31:0] is sampled at the end of the ARM9TDMI memory stage and written to the destination register during the next cycle.
Interlocked MCR
If the data for an MCR operation is not available inside the ARM9TDMI pipeline during its first decode cycle, the ARM9TDMI pipeline will interlock for one or more cycles until the data is available. An example of this is where the register being transferred is the destination from a preceding LDR instruction.
InTRANS changes after a mode change.
Figure 4-6 ARM9TDMI privileged instructions
The first two CHSD responses are ignored by the ARM9TDMI because it is only the final CHSD response, as the instruction moves from decode into execute, that counts. This allows the coprocessor to change its response as InTRANS/InM[4:0] changes.
Chapter 5 Debug Support This chapter describes the debug support for the ARM9TDMI, including the EmbeddedICE macrocell: • About debug on page 5-2. • Debug systems on page 5-3. • Debug interface signals on page 5-5. • Scan chains and JTAG interface on page 5-11.
• asynchronously by a debug request. When this happens, the ARM9TDMI is said to be in debug state. At this point, the internal state of the core and the external state of the system may be examined. Once examination is complete, the core and system state may be restored and program execution resumed.
The debug host is connected to the ARM9TDMI development system via an interface (an RS232, for example). The messages broadcast over this connection must be converted to the interface signals of the ARM9TDMI. This function is performed by the protocol converter, for example, Multi-ICE.
5.2.3 The ARM9TDMI The ARM9TDMI, with hardware extensions to ease debugging, is the lowest level of the system. The debug extensions allow the user to stall the core from program execution, examine its internal state and the state of the memory system, and then resume program execution.
IEBKPT, DEWPT, and EDBGRQ, with which the system asks the ARM9TDMI to enter debug state
• DBGACK, which the ARM9TDMI uses to flag back to the system when it is in debug state.
5.3.1 Entry into debug state on breakpoint
Any instruction being fetched from memory is latched at the end of phase 2.
Figure 5-4 on page 5-9. However, it is always possible to restart the processor. Once the processor has entered debug state, the ARM9TDMI core may be interrogated to determine its state. In the case of a watchpoint, the PC contains a value that is five instructions on from the address of the next instruction to be executed.
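The PC offset for a watchpoint can be turned into a small calculation. This is a sketch under stated assumptions: the function name `next_instruction_after_watchpoint` is hypothetical, and the Thumb case assumes that "five instructions" means five 16-bit instructions.

```python
def next_instruction_after_watchpoint(pc: int, thumb: bool = False) -> int:
    """Return the address of the next instruction to execute, given the PC read in debug state."""
    instruction_size = 2 if thumb else 4   # bytes per Thumb or ARM instruction
    return (pc - 5 * instruction_size) & 0xFFFFFFFF
```

In ARM state this amounts to subtracting 20 bytes from the PC value read after debug entry.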
If there is an abort with the data access as well as a watchpoint, the watchpoint condition is latched, the exception entry sequence performed, and then the processor enters debug state. If there is an interrupt pending, again the ARM9TDMI allows the exception entry sequence to occur and then enters debug state.
The signals provided for this scan chain are described in Scan chain 3 on page 5-25. The three scan chains of the ARM9TDMI are referred to as scan chain 0, 1 and 2.
Note
The ARM9TDMI TAP controller supports 32 scan chains.
In order to minimize static current draw, these resistors are not fitted to the ARM9TDMI. Accordingly, the four inputs to the test interface (the nTRST, TDI and TMS signals plus TCK) must all be driven to valid logic levels to achieve normal circuit operation.
When the HIGHZ instruction is loaded into the instruction register and scan chain 0 is selected, all ARM9TDMI outputs are driven to the high impedance state and the external HIGHZ signal is driven HIGH. This is as if the signal TBE had been driven LOW.
SCREG[4:0], IR[3:0], TAPSM[3:0], TCK1 and TCK2. The list of scan chain numbers allocated by ARM is shown in Table 5-3. An external scan chain may take any other number. The serial data stream applied to the external scan chain is made present on SDIN.
• After the ARM9TDMI has entered debug state, the first time SYSSPEED is captured and scanned out, its value tells the debugger whether the core has entered debug state due to a breakpoint (SYSSPEED LOW), or a watchpoint (SYSSPEED HIGH).
Scan chain 3 is provided so that an optional external boundary scan chain may be controlled via the ARM9TDMI. Typically this would be used for a scan chain around the pad ring of a packaged device. The following control signals are provided and are generated only when scan chain 3 has been selected.
DCLK. During normal operation, the core is clocked by GCLK, and internal logic holds DCLK LOW. When the ARM9TDMI is in the debug state, the core is clocked by DCLK under control of the TAP state machine, and GCLK may free run.
Clock switching during debug
When the ARM9TDMI enters debug state, it must switch from GCLK to DCLK. This is handled automatically by logic in the ARM9TDMI. On entry to debug state, the ARM9TDMI asserts DBGACK in the HIGH phase of GCLK. The switch between the two clocks occurs on the next falling edge of GCLK.
On the way into test, GCLK must be held LOW. The TAP controller can now be used to perform serial testing on the ARM9TDMI. If scan chain 0 and INTEST are selected, DCLK is generated while the state machine is in RUN-TEST/IDLE state.
If the processor has entered debug state from Thumb state, the simplest course of action is for the debugger to force the core back into ARM state. Once this is done, the debugger can always execute the same sequence of instructions to determine the processor state.
Executing instructions more slowly than usual is fine for accessing the core’s state since the ARM9TDMI is fully static. However, this same method cannot be used for determining the state of the rest of the system.
SYSCOMP (bit 3 of the Debug status register). To access memory, the ARM9TDMI must go through the data bus interface, and this access may be stalled indefinitely by nWAIT. Therefore, the only way to determine whether the memory access has completed is to examine the SYSCOMP bit.
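A debugger-side polling loop for SYSCOMP might look like the following sketch. The `read_status` callable is a hypothetical stand-in for whatever mechanism the debugger uses to read the Debug status register, and the `max_polls` limit is an illustrative choice, not something the manual specifies.

```python
SYSCOMP_BIT = 3  # bit 3 of the Debug status register, as stated in the text

def syscomp_set(status: int) -> bool:
    """Return True if the SYSCOMP bit is set in a Debug status register value."""
    return bool((status >> SYSCOMP_BIT) & 1)

def wait_for_syscomp(read_status, max_polls: int = 1000) -> bool:
    """Poll read_status() until SYSCOMP is set; return False if max_polls is exhausted."""
    for _ in range(max_polls):
        if syscomp_set(read_status()):
            return True
    return False
```

Because nWAIT can stall the access indefinitely, a real debugger would need its own policy for what to do when the poll limit is reached.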
Then, when RUN-TEST/IDLE state is entered, all the processors resume operation simultaneously. The function of DBGACK is to tell the rest of the system when the ARM9TDMI is in debug state. This can be used to inhibit peripherals such as watchdog timers that have real time characteristics.
5.12 The behavior of the program counter during debug To force the ARM9TDMI to branch back to the place at which program flow was interrupted by debug, the debugger must keep track of what happens to the PC. There are six cases: •...
A similar sequence is followed when an interrupt, or any other exception, occurs during a watchpointed memory access. The ARM9TDMI will enter debug state in the mode of the exception, and so the debugger must check to see whether this happened. The debugger can deduce whether an exception occurred by looking at the current and previous mode (in the CPSR and SPSR), and the value of the PC.
If an abort occurs during a system speed memory access, the ARM9TDMI enters abort mode before returning to debug state. This is similar to an aborted watchpoint. However, the problem is much harder to fix because the abort was not caused by an instruction in the main program, and the PC does not point to the instruction that caused the abort.
5.13 EmbeddedICE macrocell
The EmbeddedICE macrocell is integral to the ARM9TDMI processor core. It has two hardware breakpoint/watchpoint units, each of which may be configured to monitor either the instruction memory interface or the data memory interface. Each watchpoint unit has a value and mask register, with an address, data and control field.
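A value/mask comparison of the kind used by each watchpoint field can be sketched as below. The mask polarity is an assumption not stated in this excerpt: here a mask bit of 1 means "ignore this bit". The function name `field_matches` is hypothetical.

```python
def field_matches(observed: int, value: int, mask: int, width: int = 32) -> bool:
    """Compare observed against value, ignoring bit positions where mask is 1 (assumed polarity)."""
    care = ~mask & ((1 << width) - 1)   # bits that take part in the comparison
    return (observed & care) == (value & care)
```

With a value of 0x1000 and a mask of 0xF, this model matches any address from 0x1000 to 0x100F.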
Table 5-4 ARM9TDMI EmbeddedICE macrocell register map (continued)
Address  Width  Function
01000           Watchpoint 0 address value
01001           Watchpoint 0 address mask
01010           Watchpoint 0 data value
01011           Watchpoint 0 data mask
01100           Watchpoint 0 control value
01101           Watchpoint 0 control mask
...
“breakpoint on address YYY only when in process XXX”. In the ARM9TDMI EmbeddedICE macrocell, the CHAINOUT output of watchpoint 1 is connected to the CHAIN input of watchpoint 0. The CHAINOUT output is derived from a latch. The address/control field comparator drives the write enable for the latch and the input to the latch is the value of the data field comparator.
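The latch behaviour described above can be modelled in software as a rough illustration. The class name `ChainLatch` and its initial value of 0 are assumptions made for this sketch.

```python
class ChainLatch:
    """Model of the CHAINOUT latch: address/control match is the write enable, data match is the input."""

    def __init__(self):
        self.chainout = 0  # assumed reset value

    def update(self, addr_ctrl_match: bool, data_match: bool) -> int:
        """Load the data comparator result when the address/control comparator matches."""
        if addr_ctrl_match:
            self.chainout = int(data_match)
        return self.chainout
```

The latch holds its value between address/control matches, which is what lets watchpoint 0 be qualified by an earlier event seen by watchpoint 1.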
ITBIT    Compares against the Thumb state signal from the core to distinguish between a Thumb (ITBIT = 1) instruction fetch and an ARM (ITBIT = 0) fetch.
InTRANS  Compares against the not translate signal from the core to distinguish between a user mode (InTRANS = 0) instruction fetch and a privileged mode (InTRANS = 1) fetch.
5.13.4 Debug control register
The ARM9TDMI debug control register is four bits wide and is shown in Figure 5-12:
Figure 5-12 Debug control register
Bit 3 controls the single-step hardware, and this is explained in more detail in Figure 5-15 on page 5-48.
5.13.6 Vector catch register
The ARM9TDMI EmbeddedICE macrocell contains logic to enable accesses to the exception vectors to be trapped in an efficient manner. This is controlled by the vector catch register, as shown in Figure 5-14. The functionality is described in Vector catching on page 5-46.
For example, if the processor executes a SWI instruction while bit 2 of the Vector catch register is set, the ARM9TDMI fetches an instruction from location 0x8. The vector catch hardware detects this access and forces the internal Breakpoint signal HIGH into the ARM9TDMI control logic.
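The SWI example can be generalized into a small model. The text above only gives the pairing of bit 2 with location 0x8; the assumption made here is that bit n of the vector catch register corresponds to the vector at address 4 × n, for addresses 0x0 to 0x1C. The function name `vector_catch_hit` is hypothetical.

```python
def vector_catch_hit(fetch_address: int, catch_register: int) -> bool:
    """Return True if an instruction fetch from fetch_address would be trapped (assumed bit mapping)."""
    if fetch_address % 4 != 0 or not 0 <= fetch_address <= 0x1C:
        return False                      # not an exception vector address
    bit = fetch_address >> 2              # assumed: vector at 4*n maps to bit n
    return bool((catch_register >> bit) & 1)
```

Under this mapping, a fetch from 0x8 is trapped exactly when bit 2 is set, which agrees with the SWI case described above.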
5.15 Single stepping
The ARM9TDMI EmbeddedICE macrocell contains logic that allows efficient single stepping through code. This leaves the macrocell watchpoint comparators free for general use. This function is enabled by setting bit 3 of the debug control register. The state of this bit should only be altered while the processor is in debug state.
5.16 Debug communications channel
The ARM9TDMI EmbeddedICE macrocell contains a communication channel for passing information between the target and the host debugger. This is implemented as coprocessor 14. The communications channel consists of a 32-bit wide comms data read register, a 32-bit wide comms data write register and a 6-bit wide comms control register for synchronized handshaking between the processor and the asynchronous debugger.
MRC p14, 0, Rd, C1, C0
Returns the debug data read register into Rd.
Note
The Thumb instruction set does not support coprocessors so the ARM9TDMI must be operated in ARM state in order to access the debug comms channel.
5.16.2...
W bit. At this point, the communications process may begin again. As an alternative to polling, the debug comms channel can be interrupt driven by connecting the ARM9TDMI COMMRX and COMMTX signals to the system's interrupt controller.
Receiving a message from the debugger
Message transfer from the debugger to the processor is similar to sending a message to the debugger.
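The polled handshake for sending a word to the debugger can be illustrated with a toy model. Everything here is a sketch under assumptions: the class name `CommsChannel`, the bit position chosen for the W flag, and the rule that W is set when the target writes and cleared when the debugger collects the word are not taken from this excerpt.

```python
class CommsChannel:
    """Toy model of the target-to-debugger half of the comms channel."""

    W_BIT = 0x2  # assumed position of the W flag in the comms control register

    def __init__(self):
        self.control = 0
        self.write_reg = 0

    @property
    def commtx(self) -> int:
        """Model COMMTX: 1 when the transmit buffer is empty (W clear)."""
        return int(not (self.control & self.W_BIT))

    def target_send(self, word: int) -> bool:
        """Target side: write a word only if the previous one has been collected."""
        if self.control & self.W_BIT:
            return False                   # buffer still full, caller must poll again
        self.write_reg = word & 0xFFFFFFFF
        self.control |= self.W_BIT
        return True

    def debugger_collect(self):
        """Debugger side: read the pending word and clear W, or return None if empty."""
        if not (self.control & self.W_BIT):
            return None
        self.control &= ~self.W_BIT
        return self.write_reg
```

In an interrupt-driven design, the `commtx` value in this model corresponds to the signal that would be routed to the interrupt controller instead of being polled.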
Chapter 6 Test Issues
This chapter examines the test issues for the ARM9TDMI and lists the scan chain 0 bit order under the headings:
• About testing on page 6-2.
• Scan chain 0 bit order on page 6-3.
About testing The ARM9TDMI processor core supports parallel and serial test methodologies. The parallel test patterns are derived from assembler ARM code programs written to achieve a high fault coverage. The ARM9TDMI processor core has a fully JTAG-compatible scan chain which intersects all the inputs and outputs.
The number of words transferred in an LDM/STM/LDC/STC Coprocessor register transfer (C-cycle) Internal cycle (I-cycle) Non-sequential cycle (N-cycle) Sequential cycle (S-cycle) Table 7-2 summarizes the ARM9TDMI instruction cycle counts and bus activity when executing the ARM instruction set. Table 7-2 Instruction cycle bus times Instruction Instruction...
The number of cycles that a multiply instruction takes to complete depends on which instruction it is, and on the value of the multiplier-operand. The multiplier-operand is the contents of the register specified by bits [11:8] of the ARM multiply instructions, or bits [2:0] of the Thumb multiply instructions.
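Extracting the register number that holds the multiplier-operand is a simple field decode. This is a minimal sketch: the function name `multiplier_operand_register` is hypothetical, and the instruction is assumed to be passed as an integer (32 bits for ARM, 16 bits for Thumb).

```python
def multiplier_operand_register(instruction: int, thumb: bool = False) -> int:
    """Return the register number holding the multiplier-operand."""
    if thumb:
        return instruction & 0x7          # bits 2 down to 0 of a Thumb multiply
    return (instruction >> 8) & 0xF       # bits 11 down to 8 of an ARM multiply
```

For the ARM encoding 0xE0000291, this returns register 2; the cycle count then depends on the value held in that register.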
Pipeline interlocks occur when the data required for an instruction is not available due to the incomplete execution of an earlier instruction. When an interlock occurs, instruction fetches stop on the instruction memory interface of the ARM9TDMI. Four examples of this are given below.
Appendix A ARM9TDMI Signal Descriptions
This appendix lists and describes the ARM9TDMI signals:
• Instruction memory interface signals on page A-2.
• Data memory interface signals on page A-3.
• Coprocessor interface signals on page A-5.
• JTAG and TAP controller signals on page A-6.
Output Coprocessor PASS. This signal indicates that there is a coprocessor instruction in the execute stage of the pipeline, and it should be executed. For further information on the coprocessor interface refer to Chapter 4 ARM9TDMI Coprocessor Interface.
ARM9TDMI. COMMTX Output Communications Channel Transmit. When HIGH, this signal denotes that the comms channel transmit buffer is empty and the ARM9TDMI can write new data to the comms channel. DBGACK Output Debug Acknowledge.
Name    Direction  Description
BIGEND  Input      Big-Endian Configuration. When this input is HIGH, the ARM9TDMI processor treats bytes in memory as being in big-endian format. When it is LOW, memory is treated as little-endian.
ECLK    Output     External Clock. The clock by which the ARM9TDMI is currently being clocked. This clock reflects any wait states applied by nWAIT and, once debug state has been entered, the debug clock.
Input Not Wait. When a memory request cannot be processed in a single cycle, the ARM9TDMI can be made to wait for a number of GCLK cycles by driving nWAIT LOW. Internally, the inverse of nWAIT is ORed with GCLK, so nWAIT must only change when GCLK is HIGH. If nWAIT is not used, it must be tied HIGH.
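The gating of GCLK by nWAIT can be shown with a small logic model. This is an illustrative sketch, with signal levels represented as 0 and 1 and with hypothetical function names `internal_clock` and `clock_waveform`.

```python
def internal_clock(gclk: int, nwait: int) -> int:
    """Model the internal clock: the inverse of nWAIT ORed with GCLK."""
    return gclk | (1 - nwait)

def clock_waveform(gclk_samples, nwait_samples):
    """Apply internal_clock to paired samples of GCLK and nWAIT."""
    return [internal_clock(g, w) for g, w in zip(gclk_samples, nwait_samples)]
```

While nWAIT is LOW the modelled internal clock stays HIGH, so the LOW phase of GCLK is hidden from the core. Changing nWAIT while GCLK is HIGH leaves the OR output unchanged at that moment, which is why that is the only safe time to change it.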