Page 1
ARM946E-S Technical Reference Manual ARM DDI 0155A...
Page 2
This document is intended only to assist the reader in the use of the product. ARM Limited shall not be liable for any loss or damage arising from the use of any information in this document, or any error or omission in such information, or any incorrect use of the product.
This document is a reference manual for the ARM946E-S. Intended audience This document has been written for hardware and software engineers who want to design or develop products based upon the ARM946E-S processor. It assumes no prior knowledge of ARM products. Using this manual...
Page 13
Chapter 8 Debug Support This chapter describes the debug support for the ARM946E-S and the EmbeddedICE-RT logic. Chapter 9 ETM Interface This chapter describes the ETM interface, including details of how to enable the interface. Chapter 10 Test Support This chapter describes the test methodology used for the ARM946E-S synthesized logic and tightly-coupled SRAM.
Further reading This section lists publications by ARM Limited, and by third parties. If you would like further information on ARM products, or if you have questions not answered by this document, please contact info@arm.com or visit our web site at http://www.arm.com.
Feedback ARM Limited welcomes feedback both on the ARM946E-S, and on the documentation. Feedback on the ARM946E-S If you have any comments or suggestions about this product, please contact your supplier giving: • the product name • a concise explanation of your comments...
Introduction About the ARM946E-S The ARM946E-S is a synthesizable macrocell that combines an ARM9E-S processor core with tightly-coupled SRAM memory and instruction and data caches. It is a member of the ARM9 Thumb family of high-performance, 32-bit system-on-chip processor solutions. The ARM946E-S is targeted at a wide range of embedded applications where high performance, low system cost, small die size, and low power are all important.
Introduction Microprocessor block diagram The ARM946E-S block diagram is shown in Figure 1-1. [Figure 1-1: ARM946E-S block diagram. Labelled blocks include the ARM9E-S core, instruction SRAM, data SRAM, system control coprocessor (CP15), external coprocessor interface, and bus interface unit and write buffer.]
Page 21
Chapter 2 Programmer’s Model This chapter describes the programmer’s model for the ARM946E-S. It contains the following sections: • About the ARM946E-S programmer’s model on page 2-2 • About the ARM9E-S programmer’s model on page 2-3 • CP15 register map summary on page 2-4.
CP14 within the ARM9E-S core allows software access to the debug communications channel • CP15 allows configuration of the caches, tightly-coupled SRAM, protection unit, write buffer, and other ARM946E-S system options such as big or little-endian operation. The registers defined in CP14 are accessible with MCR and MRC instructions, and are described in The debug communications channel on page 8-29.
About the ARM9E-S programmer’s model The ARM9E-S processor core implements the ARMv5TExP architecture, which includes the 32-bit ARM instruction set and the 16-bit Thumb instruction set. For a description of both instruction sets, see the ARM Architecture Reference Manual. Contact ARM for complete descriptions of both instruction sets.
Programmer’s Model CP15 register map summary The ARM946E-S incorporates CP15 for system control. CP15 allows configuration of the caches, tightly-coupled SRAM, and protection unit. It also allows configuration of ARM946E-S system options including big or little-endian operation. The register map for CP15 is shown in Table 2-1.
This is a read-only register that contains information about the size and architecture of the instruction cache (ICache) and data cache (DCache), allowing operating systems to establish how to perform operations such as cache cleaning and lockdown. Future ARM cached processors will contain this register, allowing RTOS vendors to produce future-proof versions of their operating systems.
2.3.5 Register 1, Control register This register contains the control bits of the ARM946E-S. All reserved bits must either be written with zero or one, as indicated, or written using read-modify-write. The reserved bits have an unpredictable value when read. To read and write this register: MRC p15, 0, rd, c1, c0, 0;...
Page 32
You can use the instruction RAM load mode for initializing the instruction RAM. The instruction RAM load mode allows you to load data into ARM registers from either data cache or main memory, and then write to the same address but within the tightly-coupled instruction RAM.
Page 33
You can use the data RAM load mode for initializing the data RAM. The data RAM load mode allows you to load data into ARM registers from either data cache or main memory, and then write to the same address but within the tightly-coupled data RAM.
Page 34
HIGH. This can be done with a single write to register 1. At reset this bit is cleared. Bit 7, Endian Selects the endian configuration of the ARM946E-S. When this bit is HIGH, big-endian configuration is selected. When LOW, little-endian configuration is selected. At reset this bit is cleared.
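The read-modify-write requirement for the control register can be illustrated with a small model. This is only a sketch: the bit positions for the protection unit enable (bit 0) and the endian select (bit 7) come from this manual, while the helper names and the starting value are hypothetical and stand in for the value an MRC read of register 1 would return.

```python
# Bit positions taken from the text: bit 0 enables the protection unit,
# bit 7 selects big-endian operation. All other bits are left untouched.
PROTECTION_UNIT_ENABLE = 1 << 0
BIG_ENDIAN = 1 << 7


def set_bits(control: int, mask: int) -> int:
    """Return the control value with the masked bits set, others preserved."""
    return (control | mask) & 0xFFFFFFFF


def clear_bits(control: int, mask: int) -> int:
    """Return the control value with the masked bits cleared, others preserved."""
    return control & ~mask & 0xFFFFFFFF


# Hypothetical value standing in for the result of reading register 1.
current = 0x00000078
updated = set_bits(current, BIG_ENDIAN)
```

On the processor itself, the modified value would then be written back with an MCR to register 1; the model only shows that the reserved bits pass through unchanged.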
Programmer’s Model The following instructions are supported for backwards compatibility with existing ARM processors with memory protection, and access the standard registers: MRC p15, 0, rd, c5, c0, 0; read data access permission bits MRC p15, 0, rd, c5, c0, 1; read instruction access permission bits MCR p15, 0, rd, c5, c0, 0;...
• clean and flush the DCache. The ARM946E-S uses a subset of the ARM architecture v4 functions (defined in the ARM Architecture Reference Manual). The available operations are summarized in Table 2-19. Table 2-19 Cache operations...
Figure 2-2 Index and segment format The size of the index varies depending on the implemented cache size. Table 2-20 shows how the index size changes for the cache sizes supported by the ARM946E-S. Table 2-20 Index fields for supported cache sizes...
Wait for interrupt This operation allows the ARM946E-S to enter a low-power standby mode. When you invoke the operation, the CLKEN signal to the processor core is negated and the cache and tightly-coupled memories are placed in a low-power state until either an interrupt or a debug request occurs.
2.3.14 Register 15, RAM and TAG BIST test registers Register 15 gives you access to the test features included within the ARM946E-S. The register map for CP15 register 15 BIST-related instructions is shown in Table 2-24. Table 2-24 Register 15, BIST instructions...
MCR p15, 2, Rd, c15, c0, 7 register Note ARM recommends that you do not write application code that relies on the presence of the BIST address and general registers. ARM does not guarantee to support these registers in future versions of the ARM946E-S.
2.3.16 Register 15, Cache debug index register Register 15 gives you access to the test features included within the ARM946E-S. Additional instructions and operations are required to support debug operations within the cache. Instructions for the additional operations are listed in Table 2-27.
Page 55
Chapter 3 Caches To reduce the effective memory access time, the ARM946E-S uses a cache controller, an Instruction Cache (ICache), and a Data Cache (DCache). This chapter describes the features and behavior of each of these blocks. It contains the following sections: •...
Caches Cache architecture The ARM946E-S incorporates ICache and DCache. You can tailor the size of these to suit individual applications. A range of different cache sizes is supported: • • • • 16KB • 32KB • 64KB • 128KB •...
Caches ICache The ARM946E-S has a four-way set-associative ICache. You can choose the size of the ICache from any of the supported cache sizes. The ICache uses the physical address generated by the processor core. It uses a policy of allocate on read-miss, and is always reloaded one cache line (eight words) at a time, through the external interface.
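The relationship between cache size and index width follows from the organization described here: four ways, with lines of eight words. The sketch below derives the number of index bits for a given cache size, assuming 32-bit words; the function name is illustrative and not part of any ARM tool.

```python
WAYS = 4            # four-way set-associative, per the text
LINE_BYTES = 8 * 4  # eight 32-bit words per cache line


def index_bits(cache_bytes: int) -> int:
    """Return the number of address bits needed to select a set."""
    if cache_bytes <= 0 or cache_bytes % (WAYS * LINE_BYTES):
        raise ValueError("cache size must be a positive multiple of one set")
    sets = cache_bytes // (WAYS * LINE_BYTES)
    if sets & (sets - 1):
        raise ValueError("number of sets must be a power of two")
    return sets.bit_length() - 1
```

For example, a 16KB cache holds 128 sets of four lines, so seven address bits select the set.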
Page 61
As shown in Table 2-19 on page 2-22, you can flush the entire ICache using an instruction. In this case, the contents of the ARM register transferred to CP15 must be zero. You can use the following code segment to do this: MOV r0, #0 ;...
Caches DCache The ARM946E-S has a four-way set-associative DCache. You can choose the size of the DCache from any of the supported cache sizes. The DCache uses the physical address generated by the processor core. It uses an allocate on read-miss policy, and is always reloaded one cache line (eight words) at a time, through the external interface.
The ARM946E-S does not support memory translation so you can always consider the data in the DCache as valid within the context of the ARM946E-S. However, if you use external memory translation, and the mappings are changed, the DCache is no longer consistent with external memory, and you must flush it.
You can carry out lockdown in the DCache using CP15 register 9. ICache lockdown uses both CP15 registers 7 and 9. As described in Cache architecture on page 3-2, the ARM946E-S ICache and DCache each comprise four segments. You can perform lockdown with a granularity of one segment.
Page 68
- Interrupts must be disabled - Subroutine must be called using the BL instruction - r1-r3 can be corrupted in line with ARM/Thumb Procedure Call Standards (ATPCS) - Returns final ICache lockdown index in r0 if successful...
Enabling the protection unit Before the protection unit is enabled, you must program at least one valid protection region. If you do not do this the ARM946E-S can enter a state that is recoverable only by reset. Setting bit 0 of the CP15 register 1, the control register, enables the protection unit.
• read and write access permissions. The ARM architecture uses constants known as inline literals to perform address calculations. These constants are automatically generated by the assembler and compiler and are stored inline with the instruction code. To ensure correct operation, you must define an area of memory, from where code is to be executed, that allows both data and instruction accesses.
Page 77
If the address issued by the processor falls outside any of the defined regions, the ARM946E-S protection unit is hard-wired to abort the access. You can override this behavior by programming region 0 to be a 4GB background region. In this way, if the address does not fall into any of the other seven regions, the access is controlled by the attributes you have specified for region 0.
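The effect of a background region can be sketched as a lookup over the eight regions. This is a minimal model under one stated assumption: when regions overlap, the highest-numbered enabled region takes priority. The `Region` type and `matching_region` function are hypothetical names used only for illustration.

```python
from typing import List, NamedTuple, Optional


class Region(NamedTuple):
    base: int      # start address
    size: int      # size in bytes
    enabled: bool


def matching_region(regions: List[Region], address: int) -> Optional[int]:
    """Return the number of the region that controls the access, or None.

    None models the hard-wired abort for an address outside every region.
    """
    # Assumption: the highest-numbered matching region wins.
    for number in range(len(regions) - 1, -1, -1):
        region = regions[number]
        if region.enabled and region.base <= address < region.base + region.size:
            return number
    return None


# Region 0 as a 4GB background region, region 1 as a 4KB window.
regions = [Region(0x00000000, 1 << 32, True)] + [Region(0, 0, False)] * 7
regions[1] = Region(0x10000000, 0x1000, True)
```

With this setup an address inside the 4KB window resolves to region 1, and every other address falls back to the attributes of region 0 instead of aborting.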
Chapter 5 Tightly-coupled SRAM This chapter describes the tightly-coupled SRAM in the ARM946E-S. It contains the following sections: • ARM946E-S SRAM requirements on page 5-2 • Using CP15 control register on page 5-3. For details of the ARM9E-S interface signals referenced in this chapter, see the ARM9E-S Technical Reference Manual.
Tightly-coupled SRAM ARM946E-S SRAM requirements The ARM946E-S tightly-coupled SRAM is built from blocks of ASIC library compiled SRAM. The instruction SRAM (I-SRAM) and data SRAM (D-SRAM) can each be of any size supported by the protection unit, from 0 bytes to 1MB, although to ease implementation the size must be an integer power of two.
ARM9E-S instruction fetches and data accesses to the I-SRAM address space cause the I-SRAM to be accessed. Enabling the I-SRAM greatly increases the performance of the ARM946E-S because the majority of accesses to it can be performed with no stall cycles. Accessing the AHB however, can cause several stall cycles for each access.
Page 82
The procedure for initializing the I-SRAM using the load mode is as follows:
1. Enable the I-SRAM and instruction load mode.
2. Load ARM registers from main memory, data cache or data RAM.
3. Store ARM registers into I-SRAM.
4. Increment address pointers and repeat load/store steps until the code image has been copied.
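The load-mode procedure can be modelled as a copy loop in which each load is served by main memory and each store lands at the same address in the I-SRAM. The following is a toy sketch, not ARM code: memories are plain dictionaries keyed by byte address, and the function name is hypothetical.

```python
def copy_image_in_load_mode(main_memory: dict, base: int, word_count: int) -> dict:
    """Model the load/store loop that fills the I-SRAM while load mode is on."""
    isram = {}
    for offset in range(0, 4 * word_count, 4):  # advance the pointer one word
        address = base + offset
        value = main_memory[address]            # load: served by main memory
        isram[address] = value                  # store: same address, into I-SRAM
    return isram


# Hypothetical three-word memory image; only the first two words are copied.
main_memory = {0x0: 0xE3A00000, 0x4: 0xE12FFF1E, 0x8: 0x00000000}
isram = copy_image_in_load_mode(main_memory, 0x0, 2)
```

The same loop shape applies to the data RAM load mode, with the D-SRAM as the store target.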
Page 83
The procedure for initializing the D-SRAM using the load mode is as follows:
1. Enable the D-SRAM and data load mode.
2. Load ARM registers from main memory or data cache.
3. Store ARM registers into data RAM.
4. Increment address pointers and repeat load/store steps until the data image has been copied.
Page 85
Chapter 6 Bus Interface Unit and Write Buffer This chapter describes the ARM946E-S Bus Interface Unit (BIU) and write buffer. It contains the following sections: • About the BIU and write buffer on page 6-2 • AHB bus master interface on page 6-3 •...
See the AMBA Rev 2.0 AHB Specification for full details of this bus architecture. The ARM946E-S BIU implements a fully-compliant AHB bus master interface and incorporates a write buffer to increase system performance. The BIU is the link between the ARM9E-S core with the caches and tightly-coupled SRAM and the external AHB memory.
Bus Interface Unit and Write Buffer AHB bus master interface The ARM946E-S implements a fully compliant AHB bus master interface as defined in the AMBA Rev 2.0 Specification. See this document for a detailed description of the AHB protocol. 6.2.1...
Incrementing bursts have an address increment of four (that is, word increment). 6.2.4 Linefetch transfers The ARM946E-S is optimized to run with both the ICache and DCache enabled. If a memory request (either instruction or data) to a cachable area misses in the cache the ARM946E-S performs a linefetch.
Figure 6-2 Back to back linefetches 6.2.6 Uncached transfers If a memory request is made to an uncachable region, or the ARM946E-S cache is not enabled, the memory requests are serviced by the AHB interface. Sequential instruction fetches are treated as nonsequential reads.
If the slave being accessed at the HCLK rate has a multi-cycle response, the HREADY input to the ARM946E-S is driven LOW until the data is ready to be returned. The BIU must therefore perform a logical AND on the HREADY response with HCLKEN to detect that the AHB transfer has completed.
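The completion condition can be expressed as a per-cycle AND of the two signals. The sketch below applies it to two hypothetical signal traces, sampled once per rising edge of CLK; the trace values are invented for illustration.

```python
from typing import List


def transfer_complete(hready: bool, hclken: bool) -> bool:
    """An AHB transfer completes only when HREADY and HCLKEN are both HIGH."""
    return hready and hclken


def completion_cycles(hready_trace: List[bool], hclken_trace: List[bool]) -> List[int]:
    """Return the CLK cycle numbers in which a transfer completes."""
    return [
        cycle
        for cycle, (hready, hclken) in enumerate(zip(hready_trace, hclken_trace))
        if transfer_complete(hready, hclken)
    ]


# Hypothetical traces: HCLKEN is HIGH every second CLK cycle, and the slave
# holds HREADY LOW for cycles 2 and 3 to insert wait states.
hclken_trace = [False, True, False, True, False, True]
hready_trace = [True, True, False, False, True, True]
cycles = completion_cycles(hready_trace, hclken_trace)
```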
ARM946E-S. You can achieve this using a clock tree insertion tool, if the clock tree is inserted for the ARM946E-S and the embedded system at the same time (top level insertion).
In Figure 6-7, the slave peripheral has an input setup and hold, and an output hold and valid time relative to HCLK. The ARM946E-S has an input setup and hold, and an output hold and valid time relative to CLK’, the clock at the bottom of the clock tree.
Bus Interface Unit and Write Buffer The write buffer The ARM946E-S provides a write buffer to improve system performance. The write buffer has a 16-entry FIFO. Each entry can be either address or data. The type of entry is determined by the setting of an address/data flag. Each address entry is tagged with the size of transfer, as indicated by the ARM9E-S core (byte, halfword, or word).
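The entry layout described here can be sketched as a 16-deep FIFO whose entries are flagged as address or data. This is a simplified model: the class and method names are hypothetical, and refusing a write when there is not enough room is an assumption standing in for whatever stall behavior the hardware implements.

```python
from collections import deque


class WriteBuffer:
    """Toy model of a FIFO whose entries are either address or data."""

    DEPTH = 16

    def __init__(self):
        self.entries = deque()

    def free_entries(self) -> int:
        return self.DEPTH - len(self.entries)

    def push_write(self, address: int, size: str, data_words: list) -> bool:
        """Queue one address entry followed by its data entries."""
        if size not in ("byte", "halfword", "word"):
            raise ValueError("size must be byte, halfword, or word")
        if 1 + len(data_words) > self.free_entries():
            return False  # assumption: the write waits until entries drain
        self.entries.append(("A", address, size))  # address entry, tagged with size
        for word in data_words:
            self.entries.append(("D", word))       # data entry
        return True

    def drain_one(self):
        """Remove and return the oldest entry, or None if the buffer is empty."""
        return self.entries.popleft() if self.entries else None
```

In this model a single-word store consumes two entries, so the buffer holds at most eight independent single-word writes.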
Page 99
Chapter 7 Coprocessor Interface This chapter describes the ARM946E-S pipelined coprocessor interface. It contains the following sections: • About the coprocessor interface on page 7-2 • LDC/STC on page 7-4 • MCR/MRC on page 7-8 • Interlocked MCR on page 7-10 •...
Technical Reference Manual. Coprocessors determine the instructions they must execute using a pipeline follower in the coprocessor. As each instruction arrives from memory it enters both the ARM pipeline and the coprocessor pipeline. To avoid a critical path for the instruction being registered by the coprocessor, the coprocessor pipeline operates one clock cycle behind the ARM9E-S core pipeline.
Page 101
There are three classes of coprocessor instructions: LDC/STC: Load from memory to coprocessor, or store from coprocessor to memory. MCR/MRC: Register transfer between coprocessor and ARM processor core. CDP: Coprocessor data operation. The following sections give examples of how a coprocessor must execute these instruction classes: •...
LDC/STC instructions are used respectively to transfer data to and from external coprocessor registers and memory. For the ARM946E-S, the memory can be either internal memory (cache or tightly-coupled memory) or AHB depending on the address range of the access and the protection unit settings.
Page 104
. Also in this cycle, DnMREQ is driven LOW, indicating to the ARM946E-S memory system that a memory access is required at the data end of the device. The timing for the data on CPDOUT and CPDIN is shown in Figure 7-2 on page 7-4.
Meaning ABSENT WAIT LAST Note If an external coprocessor is not attached in the ARM946E-S embedded system, the CHSDE[1:0] and CHSEX[1:0] handshake inputs must be tied off to indicate ABSENT. 7.2.3 Multiple external coprocessors If multiple external coprocessors are to be attached to the ARM946E-S interface, you can combine the handshaking signals by ANDing bit 1, and ORing bit 0.
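The AND/OR combination rule can be checked with a small model. The two-bit encodings used below (WAIT 00, GO 01, ABSENT 10, LAST 11) are an assumption, since the encoding table is not legible in this section; verify them against the handshake encoding table before relying on the result. The function name is hypothetical.

```python
from typing import Iterable

# Assumed handshake encodings for CHSDE[1:0] and CHSEX[1:0].
WAIT, GO, ABSENT, LAST = 0b00, 0b01, 0b10, 0b11


def combine_handshake(signals: Iterable[int]) -> int:
    """Combine per-coprocessor handshakes: AND bit 1 together, OR bit 0 together."""
    bit1, bit0 = 1, 0
    for signal in signals:
        bit1 &= (signal >> 1) & 1
        bit0 |= signal & 1
    return (bit1 << 1) | bit0
```

Under these encodings, a coprocessor that answers ABSENT does not disturb the response of the coprocessor that accepts the instruction, and the result is ABSENT only when every coprocessor answers ABSENT.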
Page 113
Chapter 8 Debug Support This chapter describes the ARM946E-S debug interface. It contains the following sections: • About the debug interface on page 8-2 • Debug systems on page 8-4 • The JTAG state machine on page 8-7 • Scan chains on page 8-12 •...
• an external debug request. This is known as debug state. In debug state, the core and ARM946E-S memory system are effectively stopped, and isolated from the rest of the system. This is known as halt mode operation and allows you to examine the internal state of the ARM9E-S core, ARM946E-S system, and external AHB state, while all other system activity continues as normal.
Debug Support 8.1.1 Debug clocks You must synchronize the system and test clocks externally to the ARM946E-S macrocell. The ARM Multi-ICE debug agent directly supports one or more cores within an ASIC design. To synchronize off-chip debug clocking with the ARM946E-S macrocell you must use a three-stage synchronizer.
Debug Support Debug systems The ARM946E-S forms one component of a debug system that interfaces from the high-level debugging performed by the user to the low-level interface supported by the ARM946E-S. Figure 8-2 shows a typical debug system. Debug Debug...
Page 117
8.2.2 The protocol converter An interface, such as a parallel port, connects the debug host to the ARM946E-S development system. The messages broadcast over this connection must be converted to the interface signals of the ARM946E-S. The protocol converter performs the conversion.
15. This is used for debug access to the CP15 register bank, to allow you to configure the system state within the ARM946E-S while in debug state, for instance to enable or disable the SRAM before performing a debug load or store.
Debug Support Scan chains ARM946E-S supports 32 scan chains. Three scan chains are used inside ARM946E-S. These allow testing, debugging, and programming of the EmbeddedICE macrocell watchpoint units. The supported scan chains are listed in Table 8-2. Table 8-2 ARM946E-S scan chain allocations...
While debugging, the value placed in the SYSSPEED control bit determines if the ARM9E-S core executes the instruction at system speed. After the ARM946E-S has entered debug state, the first time SYSSPEED is captured and scanned out tells the debugger whether the core has entered debug state due to a breakpoint (SYSSPEED LOW) or a watchpoint (SYSSPEED HIGH).
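A debugger might record the cause of debug entry from the first SYSSPEED value it scans out. The helper below is a minimal sketch of that decision, using the LOW and HIGH meanings given above; the function name is hypothetical.

```python
def debug_entry_cause(sysspeed_high: bool) -> str:
    """Interpret the first SYSSPEED value scanned out after debug entry."""
    return "watchpoint" if sysspeed_high else "breakpoint"
```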
Debug Support 8.4.3 Scan chain 3 This scan chain allows ARM946E-S to control an optional external boundary scan chain. You can determine the length of scan chain 3. 8.4.4 Scan chain 15 Scan chain 15 allows debug access to the CP15 register bank and allows the cache to be interrogated.
DBGIEBKPT, DBGDEWPT, and EDBGRQ are system requests for the ARM946E-S to enter debug state. • DBGACK is used by the ARM946E-S to flag back to the system that it is in debug state. 8.6.1 Entry into debug state on breakpoint Any instruction being fetched from memory is sampled at the end of a cycle.
Debug Support entry to debug state, in ARM state, the instruction SUB PC, PC, #20 is scanned in and the processor restarted, execution flow returns to the next instruction in the code sequence. Fldr Dldr Eldr Mldr Wldr Ddebug Edebug1...
Page 135
Actions of the ARM9E-S in debug state When the ARM9E-S is in debug state, both memory interfaces indicate internal cycles. This ensures that the tightly-coupled SRAM within the ARM946E-S, and the AHB interface, are both quiescent, allowing the rest of the AHB system to ignore the ARM9E-S and function as normal.
Debug Support Determining the core and system state When the ARM946E-S is in debug state, you can examine the core and system state by forcing the load and store multiples into the instruction pipeline. Before you can examine the core and system state, the debugger must determine whether the processor entered debug from Thumb state or ARM state, by examining bit 4 of the EmbeddedICE-RT debug status register.
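The state check can be sketched as a single bit test on the value read from the debug status register. The text identifies bit 4 as the bit to examine; that a set bit means Thumb state is an assumption here, and the names are hypothetical.

```python
THUMB_STATE_BIT = 1 << 4  # bit 4 of the EmbeddedICE-RT debug status register


def entered_from_thumb(debug_status: int) -> bool:
    """Return True if bit 4 is set (assumed to indicate Thumb state)."""
    return bool(debug_status & THUMB_STATE_BIT)
```

A debugger would use the result to choose between ARM and Thumb instruction sequences when forcing load and store multiples into the pipeline.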
EmbeddedICE-RT logic is configured so that a breakpoint/watchpoint causes the ARM to enter abort mode, taking the Prefetch Abort or Data Abort vectors respectively. You must be aware of a number of restrictions when the ARM is configured for real-time debugging: •...
The ETM interface is primarily one way. To provide code tracing, the ETM block must be able to monitor various ARM9E-S inputs and outputs. The required ARM9E-S inputs and outputs are collected and driven out from the ARM946E-S as the ETM interface.
Page 151
Chapter 10 Test Support This chapter describes the test methodology used for the ARM946E-S synthesized logic and tightly-coupled SRAM. It contains the following sections: • About the ARM946E-S test methodology on page 10-2 • Scan insertion and ATPG on page 10-3 •...
To achieve a high level of fault coverage, you can use scan insertion and ATPG techniques on the ARM9E-S core and ARM946E-S control logic as part of the synthesis flow. You can use BIST to provide high fault coverage of the compiled SRAM.
BIST of the SRAM. ATPG You can use the INTEST scan chain to enable an ATPG tool to access the ARM946E-S top-level inputs and outputs in an embedded design. This wrapper adds a scan source for each ARM946E-S input and a capture cell for each output. The ATPG tools use this...
Page 154
The INTEST wrapper allows the full range of BIST tests to be applied as detailed in BIST of memory arrays on page 10-5. The flow for generating the serialized patterns from ARM assembler source is supplied with the ARM946E-S implementation scripts.
SRAM can be BIST tested, while code is executed over the AHB. Serial scan access to the CP15 BIST operations is also provided for production test purposes, using a special mode of operation of the INTEST wrapper. See ARM946E-S INTEST wrapper on page 10-3.
Page 156
Note ARM recommends that you do not write application code that relies on the presence of the BIST address and general registers. ARM does not guarantee to support these registers in future versions of the ARM946E-S.
DBIST poke data 10.3.3 Pause modes ARM recommends that you use the following production test sequence for the SRAM: Test each SRAM using a full test. Test the BIST hardware for each SRAM. To allow testing of the BIST hardware, a pause mechanism enables you to halt the BIST test.
Page 158
• User pause on page 10-8. Note ARM recommends that you do not write application code that relies on the presence of the BIST pause mode. ARM does not guarantee to support this feature in future versions of the ARM946E-S.
Appendix B Signal Descriptions This appendix describes the ARM946E-S signals. It contains the following sections: • Signal properties and requirements on page B-2 • Clock interface signals on page B-3 • AHB signals on page B-4 • Coprocessor interface signals on page B-6 •...
Signal Descriptions Signal properties and requirements In order to ensure ease of integration of the ARM946E-S into embedded applications and to simplify synthesis flow, the following design techniques have been used: • a single rising edge clock times all activity •...
CLK is also a rising edge of HCLK in the AHB system that the ARM946E-S is embedded in. Must be tied HIGH in systems where CLK and HCLK are intended to be the same frequency.
Bus grant highest priority master. Ownership of the address/control signals changes at the end of a transfer when HREADY is HIGH, so the ARM946E-S gets access to the bus when both HREADY and HGRANT are HIGH. HLOCK Output When HIGH, indicates that the ARM946E-S...
Page 177
The response can be OKAY (00), ERROR (01), RETRY (10), or SPLIT (11). HSIZE[2:0] (Transfer size) Output Indicates the size of an ARM946E-S transfer. This can be Byte (000), Halfword (001), or Word (010). Bit [2] is tied LOW.
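The encodings listed for the transfer response and HSIZE[2:0] can be captured in two lookup tables. This is a small illustrative decoder, for example for a testbench or trace viewer; the function names are hypothetical.

```python
# Encodings as listed in the signal descriptions.
RESPONSES = {0b00: "OKAY", 0b01: "ERROR", 0b10: "RETRY", 0b11: "SPLIT"}
TRANSFER_BYTES = {0b000: 1, 0b001: 2, 0b010: 4}  # byte, halfword, word


def decode_response(value: int) -> str:
    """Return the name of a two-bit transfer response."""
    return RESPONSES[value & 0b11]


def transfer_bytes(hsize: int) -> int:
    """Return the transfer width in bytes for an HSIZE[2:0] value."""
    if hsize not in TRANSFER_BYTES:
        # Bit [2] is tied LOW and 011 is not listed, so other values are rejected.
        raise ValueError("HSIZE value not driven by the ARM946E-S")
    return TRANSFER_BYTES[hsize]
```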
Signal Descriptions Coprocessor interface signals Table B-3 describes the ARM946E-S coprocessor interface signals. Table B-3 Coprocessor interface signals Name Direction Description CPCLKEN (Coprocessor clock enable) Output Synchronous enable for coprocessor pipeline follower. When HIGH on the rising edge of CLK the pipeline follower logic can advance.
Page 179
Table B-3 Coprocessor interface signals (continued) Name Direction Description CPTBIT (Coprocessor instruction Thumb) Output When HIGH indicates that the ARM946E-S is in Thumb state. When LOW indicates that the ARM946E-S is in ARM state. Sampled by the coprocessor pipeline follower.
(Instruction breakpoint) Asserted by external hardware to halt execution of the processor for debug purposes. If HIGH at the end of an instruction fetch, it causes the ARM946E-S to enter debug state if that instruction reaches the Execute stage of the processor pipeline. DBGINSTREXEC...
Signal Descriptions JTAG signals Table B-5 describes the ARM946E-S JTAG signals. Table B-5 JTAG signals Name Direction Description DBGIR[3:0] (TAP controller instruction register) Output These four bits reflect the current instruction loaded into the TAP controller instruction register. These bits change when the TAP controller is in the UPDATE-IR state.
Signal Descriptions Miscellaneous signals Table B-6 describes the ARM946E-S miscellaneous signals. Table B-6 Miscellaneous signals Name Direction Description BIGENDOUT Output When HIGH, the ARM946E-S treats bytes in memory as being in big-endian format. When LOW, memory is treated as little-endian.
Signal Descriptions ETM interface signals Table B-7 describes the ARM946E-S ETM interface signals. Table B-7 ETM interface signals Name Direction Description ETMEN Input Synchronous ETM interface enable. This signal must be tied LOW if an ETM is not used. ETMBIGEND Output Big-endian configuration indication for the ETM.
Signal Descriptions INTEST wrapper signals Table B-8 describes the ARM946E-S INTEST wrapper signals. Table B-8 INTEST wrapper signals Name Direction Description INnotEXTEST Input Selects between INTEST and EXTEST mode of the INTEST wrapper scan chain. Input Serial input data for the INTEST wrapper scan chain.