This document is intended only to assist the reader in the use of the product. ARM Limited shall not be liable for any loss or damage arising from the use of any information in this document, or any error or omission in such information, or any incorrect use of the product.
About the programmer’s model 2-2
About the ARM9E-S programmer’s model 2-3
ARM966E-S CP15 registers 2-4
Chapter 3 Memory Map
About the ARM966E-S memory map 3-2
Tightly-coupled SRAM address space 3-3
Bufferable write address space 3-4
Chapter 4 Tightly-coupled SRAM
ARM966E-S SRAM requirements ...
Read this chapter for a description of the programmer’s model including a summary of the ARM966E-S coprocessor registers. Chapter 3 Memory Map Read this chapter for a description of the ARM966E-S fixed memory map implementation. Chapter 4 Tightly-coupled SRAM Read this chapter for a description of the requirements and operation of the tightly-coupled SRAM.
Read this chapter for a description of the test methodology used for the ARM966E-S synthesized logic and tightly-coupled SRAM. Appendix A Signal Descriptions Read this appendix for a description of the ARM966E-S signals. Appendix B AC Parameters Read this appendix for a description of the timing parameters applicable to the ARM966E-S.
Further reading This section lists publications by ARM Limited, and by third parties. If you would like further information on ARM products, or if you have questions not answered by this document, please contact info@arm.com or visit our web site...
Preface Feedback ARM Limited welcomes feedback both on the ARM966E-S, and on the documentation. Feedback on the ARM966E-S If you have any comments or suggestions about this product, please contact your supplier giving: • the product name • a concise explanation of your comments...
The ARM966E-S includes support for external coprocessors allowing floating point or other application-specific hardware acceleration to be added. To minimize die size and power consumption the ARM966E-S does not provide virtual to physical address mapping as this is not required by most embedded systems. A simple fixed memory map is implemented for the close-coupled local RAM, ideally suited to small, fast, real-time embedded control applications.
Introduction Microprocessor block diagram The ARM966E-S block diagram is shown in Figure 1-1.
Figure 1-1 (block diagram; labeled blocks include the instruction and data SRAM, the system control coprocessor (CP15), the Bus Interface Unit and write buffer, the external coprocessor interface, the DMA controller interface, and the AHB peripherals)...
Chapter 2 Programmer’s Model This chapter describes the programmer’s model for the ARM966E-S. It contains the following sections: • About the programmer’s model on page 2-2 • About the ARM9E-S programmer’s model on page 2-3 • ARM966E-S CP15 registers on page 2-4.
Programmer’s Model About the programmer’s model The programmer’s model for the ARM966E-S macrocell primarily consists of the ARM9E-S core programmer’s model (see About the ARM9E-S programmer’s model on page 2-3). Additions to this model are required to control the operation of the ARM966E-S internal coprocessors, and any coprocessor connected to the external coprocessor interface.
About the ARM9E-S programmer’s model The ARM9E-S processor core implements the ARM architecture v5TE, which includes the 32-bit ARM instruction set and the 16-bit Thumb instruction set. For a description of both instruction sets, see the ARM Architecture Reference Manual. Contact ARM for complete descriptions of both instruction sets.
Register 7, Core control on page 2-7 • Register 15, Test on page 2-9. 2.3.1 CP15 register map summary The ARM966E-S incorporates CP15 for system control. The register map for CP15 is shown in Table 2-1. Table 2-1 CP15 register map Register Function...
2.3.3 Register 1, Control register This register contains the global control bits of the ARM966E-S (see Table 2-3). All reserved bits must either be written with zero or one, as indicated, or written using read-modify-write. The reserved bits have an unpredictable value when read. To read and write this register: MRC p15, 0, rd, c1, c0, 0;...
Bit 12 is initialized either HIGH or LOW during system reset depending on the value of the input pin INITRAM. Bit 7, Endian Selects the endian configuration of the ARM966E-S. When this bit is HIGH, big-endian configuration is selected. When LOW, little-endian configuration is selected. This bit is cleared LOW during reset.
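The read-modify-write rule for the control register can be illustrated with a minimal C sketch. It only computes the value to write back; the CP15 access itself (the MRC/MCR pair) is left out. The macro and function names are hypothetical and are not part of any ARM-supplied header. Only bit 7 (Endian), as described above, is modeled, and all other bits are passed through unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 7 of the CP15 control register selects the endian configuration
 * (HIGH = big-endian, LOW = little-endian), as stated in the text.
 * The macro name is an assumption made for this sketch. */
#define CP15_CTRL_ENDIAN_BIT 7u

/* Return the value to write back after a read-modify-write of the
 * control register. Every bit other than bit 7 is preserved, which is
 * what the reserved-bit rule asks for. */
uint32_t cp15_ctrl_set_endian(uint32_t current, int big_endian)
{
    const uint32_t mask = UINT32_C(1) << CP15_CTRL_ENDIAN_BIT;

    assert(big_endian == 0 || big_endian == 1);
    return big_endian ? (current | mask) : (current & ~mask);
}

/* Decode the endian selection from a value read from the register. */
int cp15_ctrl_is_big_endian(uint32_t value)
{
    return (int)((value >> CP15_CTRL_ENDIAN_BIT) & 1u);
}
```

In a real system the `current` argument would come from the MRC read of Register 1 and the result would be written back with the matching MCR.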
Programmer’s Model Wait for interrupt This operation allows the ARM966E-S to enter a low-power standby mode. When the operation is invoked, the clock enable to the processor core is negated until either an interrupt or a debug request occurs. This function is invoked by a write to Register 7.
The contents of this register are replicated on the ETMPROCID pins of the ARM966E-S. The ETMPROCIDWR signal is set HIGH for a single clock cycle whenever this register is written to. Table 2-4 shows the trace process identifier for read and write.
Chapter 3 Memory Map This chapter describes the ARM966E-S fixed memory map implementation. It contains the following sections: • About the ARM966E-S memory map on page 3-2 • Tightly-coupled SRAM address space on page 3-3 • Bufferable write address space on page 3-4.
Memory Map About the ARM966E-S memory map The ARM966E-S couples Instruction and Data SRAM memories of configurable size to the ARM9E-S core. This allows high-speed operation without incurring the performance and power penalties of accessing the system bus. A write buffer is used to minimize traffic on the AHB bus.
Memory Map Bufferable write address space The use of the ARM966E-S write buffer is controlled by both the CP15 control register and the fixed address map. When the ARM966E-S comes out of reset, the write buffer is disabled by default. All data writes to the AHB are performed as unbuffered.
Chapter 4 Tightly-coupled SRAM This chapter describes the tightly-coupled SRAM in the ARM966E-S. It contains the following sections: • ARM966E-S SRAM requirements on page 4-2 • SRAM stall cycles on page 4-3 • Enabling the SRAM on page 4-4 •...
Tightly-coupled SRAM ARM966E-S SRAM requirements The ARM966E-S tightly-coupled SRAM is built from blocks of ASIC library compiled SRAM. The Instruction SRAM (I-SRAM) and Data SRAM (D-SRAM) can each be any size from 0 bytes to 64MB, although to ease implementation the size must be an integer power of two.
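The size rule above can be checked with a short C sketch. The function names are hypothetical, and treating 0 bytes as a valid "SRAM absent" configuration is an assumption drawn from the stated range of 0 bytes to 64MB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Upper limit quoted in the text: 64MB per SRAM block. */
#define SRAM_MAX_BYTES (UINT32_C(64) * 1024u * 1024u)

/* True if 'bytes' is an acceptable I-SRAM or D-SRAM size:
 * zero (assumed to mean the SRAM is absent), or a power of two
 * no larger than 64MB. */
bool sram_size_is_valid(uint32_t bytes)
{
    if (bytes == 0u) {
        return true;
    }
    if (bytes > SRAM_MAX_BYTES) {
        return false;
    }
    return (bytes & (bytes - 1u)) == 0u; /* exactly one bit set */
}

/* Round a required footprint up to the next valid SRAM size. */
uint32_t sram_round_up_size(uint32_t bytes)
{
    uint32_t size = 1u;

    assert(bytes <= SRAM_MAX_BYTES);
    if (bytes == 0u) {
        return 0u;
    }
    while (size < bytes) {
        size <<= 1;
    }
    return size;
}
```

For example, a 3000-byte requirement is not itself a valid size and would be rounded up to 4096 bytes.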
I-SRAM are pipelined by one clock cycle. Any stall requirement is detected by the SRAM control and factored into its response to the ARM966E-S system controller. The ARM9E-S SYSCLKEN input is then de-asserted until the SRAM has performed the access.
If however, INITRAM is held HIGH during reset, both SRAM blocks are enabled when the ARM966E-S comes out of reset. This is normally used for a warm reset where the SRAM has already been programmed before the application of nRESET to the ARM966E-S.
See ARM966E-S CP15 registers on page 2-4 for details of how to read and write the CP15 control register. When the I-SRAM has been enabled, all future ARM9E-S instruction fetches and data accesses to the I-SRAM address space as shown in Figure 3-1 on page 3-2 cause the I-SRAM to be accessed.
Tightly-coupled SRAM ARM966E-S SRAM wrapper The ARM966E-S allows you to have control over the size of the I-SRAM and D-SRAM (up to a maximum of 64MB each). It is not possible to have a single generic interface between the ARM966E-S and the SRAM, due to the large number of differing compiled SRAMs that can be integrated into an ARM966E-S system, potentially each with a unique interface.
ByteWrite signals. Your own library RAMs are instantiated inside InstrRAM.v and DataRAM.v. 4.4.1 Example SRAM interfaces The example wrapper supplied by ARM contains three RAM interface examples. All of the interface modifications are done in the IRamIF.v and the DRamIF.v blocks for the...
Chapter 5 Direct Memory Access (DMA) This chapter describes the optional DMA interface in the ARM966E-S. It contains the following sections: • About the DMA interface on page 5-2 • Timing interface on page 5-5 • DMAENABLE setup and hold cycles on page 5-11 •...
Direct Memory Access (DMA) About the DMA interface A DMA port is provided on the ARM966E-S. You can connect this port to the D-SRAM in the ARM966E-S. This allows direct access to the D-SRAM from outside the ARM966E-S boundary. If this feature is not required the DMA port is tied off in the RTL and made redundant.
Write Write Illegal Figure 5-2 shows how the ARM966E-S DMA port interfaces to a dual-port RAM. For modelling purposes, the dual-port DMA solution also supports the single-port access route. Single-port access reduces performance in the dual-port solution and is unlikely to be used, so to prevent the core from being stalled, DMAWait must be tied LOW.
Timing interface To ease the system integration task and to provide RAM-independent timings, the ARM966E-S registers all DMA inputs and outputs. This section details the behavior of the ARM966E-S for DMA reads and writes to single-port and dual-port RAMs.
DMAReady is asserted. Read data is driven on DMARData in the third cycle after the read address is sampled by the ARM966E-S (one cycle to register the address, one cycle for the RAM read and one cycle for registering the RAM read data). The first read address, DMAAddr, is registered by the ARM966E-S on the next rising clock edge after DMAReady is asserted.
DMAAddr, must be valid in the same cycle. The read data, DMARData, is returned in the third cycle after the request is registered by the ARM966E-S (one cycle to register the request, one cycle to read the RAM, and one cycle to register the output data).
Note Because the ARM966E-S core does not need to be stalled for dual-port DMA accesses, the DMA controller can access the data RAM continuously. DMAWait must be tied LOW otherwise the DMA access is by the first port of the RAM and the interface behaves as described in Single-port RAM writes on page 5-6.
Chapter 6 Bus Interface Unit This chapter describes the ARM966E-S Bus Interface Unit (BIU) and write buffer. It contains the following sections: • About the BIU and write buffer on page 6-2 • Write buffer operation on page 6-3 •...
See the AMBA Rev 2.0 AHB specification for full details of this bus architecture. The ARM966E-S BIU implements a fully-compliant AHB bus master interface and incorporates a write buffer to increase system performance. The BIU is the link between the ARM9E-S core with its tightly-coupled SRAM and the external AHB memory.
Bus Interface Unit Write buffer operation The ARM966E-S implements a 12-entry write buffer, where the entries can be address or data depending on the nature of the writes being executed by the ARM9E-S core. The write buffer helps to decouple the core from the wait cycles incurred when accessing the AHB.
Additionally, the AHB can be run at a lower rate than the ARM966E-S system introducing extra delay to the buffered write process. This can lead to the core trying to commit data at a higher rate than the FIFO can be drained, resulting in the FIFO becoming full.
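The fill condition described above can be explored with a deliberately simplified C model. It assumes that the core commits one buffer entry every `core_period` CLK cycles and that the AHB side drains one entry every `ahb_period` CLK cycles; both parameters and the function name are hypothetical. The model ignores the distinction between address and data entries and is not a cycle-accurate description of the BIU.

```c
#include <assert.h>

/* Return the CLK cycle (counting from 1) at which the core first tries
 * to commit an entry while the FIFO is already full, or -1 if that does
 * not happen within max_cycles.
 *
 * depth       : number of FIFO entries (12 for the write buffer above)
 * core_period : core commits one entry every core_period cycles
 * ahb_period  : AHB drains one entry every ahb_period cycles
 */
int cycles_until_stall(int depth, int core_period, int ahb_period,
                       int max_cycles)
{
    int occupancy = 0;
    int cycle;

    assert(depth > 0 && core_period > 0 && ahb_period > 0);

    for (cycle = 1; cycle <= max_cycles; ++cycle) {
        /* Drain first, so an entry freed in this cycle can be reused. */
        if (cycle % ahb_period == 0 && occupancy > 0) {
            occupancy--;
        }
        if (cycle % core_period == 0) {
            if (occupancy == depth) {
                return cycle; /* FIFO full: the core would be stalled */
            }
            occupancy++;
        }
    }
    return -1;
}
```

With a 12-entry buffer, a core committing every cycle and an AHB draining every second cycle, this model reports a stall at cycle 25. If the drain rate is at least the commit rate, the buffer never fills in this model.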
6.3.2 ARM966E-S transfer descriptions The ARM966E-S BIU performs a subset of the possible AHB bus transfers available. This section describes the transfers that can be performed and some back-to-back transfer cases: •...
Bus request At the start of every AHB access, the ARM966E-S requests access to the bus by asserting HBUSREQ to the arbiter. It must then wait for an acknowledge signal from the arbiter (HGRANT), before beginning the transfer on the next rising edge of HCLK.
Figure 6-3 Sequential instruction fetches, no AHB data access required
Back-to-back LDR or STR accesses Figure 6-4 shows ARM966E-S bus activity when a sequence of LDR instructions is executed...
STM crossing a 1KB boundary The AMBA Rev 2.0 specification states that sequential accesses must not cross 1KB boundaries. The ARM966E-S splits sequential accesses that cross a 1KB boundary into two sets of separate accesses. Figure 6-10 on page 6-15 shows bus activity when an STM, writing four words, crosses a 1KB boundary.
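The address arithmetic behind the split can be sketched in C as follows. This is an illustration of the boundary rule only, not the BIU implementation. The type and function names are hypothetical, and the sketch assumes word-aligned addresses and a transfer of at most 16 words, so that at most one 1KB boundary can be crossed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One run of sequential word accesses (hypothetical type). */
typedef struct {
    uint32_t addr;  /* byte address of the first word */
    uint32_t words; /* number of 32-bit words */
} burst_t;

/* Split a sequential word transfer so that no run crosses a 1KB
 * boundary. Writes one or two runs to 'out' and returns how many. */
size_t split_at_1kb(uint32_t addr, uint32_t words, burst_t out[2])
{
    const uint32_t bytes_to_boundary = 0x400u - (addr & 0x3FFu);
    const uint32_t words_before = bytes_to_boundary / 4u;

    assert((addr & 0x3u) == 0u); /* word aligned */
    assert(words <= 16u);        /* at most one boundary crossing */
    assert(out != NULL);

    if (words == 0u) {
        return 0;
    }
    if (words <= words_before) {
        out[0].addr = addr;
        out[0].words = words;
        return 1;
    }
    out[0].addr = addr;
    out[0].words = words_before;
    out[1].addr = addr + bytes_to_boundary; /* first word past the boundary */
    out[1].words = words - words_before;
    return 2;
}
```

For a four-word store starting at 0x3F8, the sketch yields two words at 0x3F8 followed by two words at 0x400.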
The ARM966E-S design uses a single rising edge clock CLK to time all internal activity. In many systems where the ARM966E-S is embedded, it is desirable to run the AHB at a lower rate. To support this requirement, the ARM966E-S requires a clock enable, HCLKEN, to time AHB transfers.
If the slave being accessed at the HCLK rate has a multi-cycle response, the HREADY input to the ARM966E-S is driven LOW until the data is ready to be returned. The BIU must therefore perform a logical AND on the HREADY response with HCLKEN to detect that the AHB transfer has completed.
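The completion test described above amounts to a single AND, sampled on each rising edge of CLK. The following C sketch models it over a short trace of sampled signal values. The function names and the array-based trace are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* An AHB transfer is seen as complete in a CLK cycle only when the
 * slave reports ready and the cycle is also an enabled HCLK edge. */
bool ahb_transfer_done(bool hready, bool hclken)
{
    return hready && hclken;
}

/* Given per-CLK-cycle samples of HREADY and HCLKEN, return the number
 * of CLK cycles until the transfer completes, or -1 if it does not
 * complete within the n sampled cycles. */
int clk_cycles_to_complete(const bool *hready, const bool *hclken, int n)
{
    int i;

    assert(hready != NULL && hclken != NULL && n > 0);

    for (i = 0; i < n; ++i) {
        if (ahb_transfer_done(hready[i], hclken[i])) {
            return i + 1;
        }
    }
    return -1;
}
```

A HIGH HREADY in a cycle where HCLKEN is LOW is ignored, which is why the AND is needed when the AHB runs slower than CLK.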
In this example, the slave peripheral has an input setup and hold, and an output hold and valid time relative to HCLK. The ARM966E-S has an input setup and hold, and an output hold and valid relative to CLK’, the clock at the bottom of the clock tree. Clock tree insertion must be used to position the HCLK to match CLK’...
Chapter 7 Coprocessor Interface This chapter describes the ARM966E-S pipelined coprocessor interface. It contains the following sections: • About the coprocessor interface on page 7-2 • LDC/STC on page 7-4 • MCR/MRC on page 7-8 • Interlocked MCR on page 7-9 •...
The interface differs from the basic ARM9E-S coprocessor interface. To ease integration of an external coprocessor, the interface from the ARM966E-S to the coprocessor has been pipelined by a single clock cycle.
The coprocessor data processing instruction (CDP) is used for coprocessor instructions that do not operate on values in ARM registers or in main memory. One example is a floating-point multiply instruction for a floating-point accelerator processor. To enable coprocessors to continue execution of...
LDC/STC instructions are used respectively to transfer data to and from external coprocessor registers and memory. In the case of the ARM966E-S, the memory can be either tightly-coupled SRAM or AHB depending on the address range of the access and SRAM enable.
ARM9E-S processor core outputs the address for the LDC/STC. Also in this cycle, DnMREQ is driven LOW, indicating to the ARM966E-S memory system that a memory access is required at the data end of the device. The timing for the data on CPDOUT and CPDIN is shown in Figure 7-1 on page 7-4.
Meaning WAIT LAST Note If an external coprocessor is not attached in the ARM966E-S embedded system, the CHSDE[1:0] and CHSEX[1:0] handshake inputs must be tied off to indicate ABSENT. 7.2.3 Multiple external coprocessors If multiple external coprocessors are to be attached to the ARM966E-S interface, the handshaking signals can be combined by ANDing bit1, and ORing bit0.
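The combining rule for multiple coprocessors can be written as a small C function. It models each coprocessor's 2-bit handshake value as an integer, ANDs bit 1 across all of them, and ORs bit 0. The function name is hypothetical, and the sketch does not assume any particular encoding of the handshake states, because the encoding table is not reproduced in this excerpt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combine the 2-bit handshake outputs of n coprocessors into the single
 * 2-bit value presented to the ARM966E-S: bit 1 is the AND of all
 * bit 1s, bit 0 is the OR of all bit 0s. */
uint8_t combine_handshake(const uint8_t *chs, size_t n)
{
    uint8_t bit1 = 1u;
    uint8_t bit0 = 0u;
    size_t i;

    assert(chs != NULL && n > 0);

    for (i = 0; i < n; ++i) {
        bit1 &= (uint8_t)((chs[i] >> 1) & 1u);
        bit0 |= (uint8_t)(chs[i] & 1u);
    }
    return (uint8_t)((bit1 << 1) | bit0);
}
```

In hardware this corresponds to one AND gate and one OR gate per handshake bus; the same function would be applied separately to the CHSDE and CHSEX signals.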
Chapter 8 Debug Support This chapter describes the ARM966E-S debug interface. It contains the following sections: • About the debug interface on page 8-2 • Debug systems on page 8-4 • ARM966E-S scan chain 15 on page 8-7 • Debug interface signals on page 8-9 •...
This is known as halt mode operation and allows the internal state of the ARM9E-S core, ARM966E-S system, and external state of the AHB to be examined while all other system activity continues as normal. When debug is complete, the ARM9E-S restores the core and system state, and resumes program execution.
Debug Support 8.1.2 Clocks The system and test clocks must be synchronized externally to the ARM966E-S macrocell. The ARM Multi-ICE debug agent directly supports one or more cores within an ASIC design. Synchronizing off-chip debug clocking with the ARM966E-S macrocell requires a three-stage synchronizer.
8.2.2 The protocol converter An interface, such as a parallel port, connects the debug host to the ARM966E-S development system. The messages broadcast over this connection must be converted to the interface signals of the ARM966E-S. The protocol converter performs the conversion.
15. This is used for debug access to the CP15 register bank, to allow the system state within the ARM966E-S to be configured while in debug state, for instance to enable or disable the SRAM before performing a debug load or store.
Scan chain 15 is provided to allow debug access to the CP15 register bank, to allow the system state within the ARM966E-S to be configured while in debug state. The order of scan chain 15 from the DBGTDI input to the DBGTDO output is shown...
Table 8-2 on page 8-7. These bits are not used in the BIST Address registers and so there are no debug restrictions when accessing these registers. The ability to control the ARM966E-S system state through scan chain 15 provides extra debug visibility. For example, if the debugger wishes to compare the contents of...
DBGIEBKPT, DBGDEWPT, and EDBGRQ are system requests for the ARM966E-S to enter debug state • DBGACK is used by the ARM966E-S to flag back to the system that it is in debug state. 8.4.1 Entry into debug state on breakpoint Any instruction being fetched from memory is sampled at the end of a cycle.
When the ARM9E-S is in debug state, both memory interfaces indicate internal cycles. This ensures that both the tightly-coupled SRAM within the ARM966E-S and the AHB interface are quiescent, allowing the rest of the AHB system to ignore the ARM9E-S and function as normal.
Debug Support Determining the core and system state When the ARM966E-S is in debug state, you can examine the core and system state by forcing the load and store multiples into the instruction pipeline. Before you can examine the core and system state, the debugger must determine whether the processor entered debug from Thumb state or ARM state, by examining bit 4 of the EmbeddedICE-RT debug status register.
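A debugger-side check of that status bit might look like the following C sketch. The macro and function names are hypothetical. The polarity, bit 4 HIGH meaning that debug state was entered from Thumb state, is an assumption that this excerpt does not state and that should be confirmed against the ARM9E-S Technical Reference Manual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 4 of the EmbeddedICE-RT debug status register, as referenced in
 * the text. The macro name is an assumption made for this sketch. */
#define DBG_STATUS_STATE_BIT 4u

/* True if the processor entered debug state from Thumb state.
 * Assumes bit 4 HIGH = Thumb, bit 4 LOW = ARM. */
bool entered_debug_from_thumb(uint32_t dbg_status)
{
    return ((dbg_status >> DBG_STATUS_STATE_BIT) & 1u) != 0u;
}
```

The debugger would read the status register through the scan chain first, then use this result to decide whether to force ARM or Thumb instructions into the pipeline.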
EmbeddedICE-RT logic is configured so that a breakpoint or watchpoint causes the ARM to enter abort mode, taking the Prefetch Abort or Data Abort vectors respectively. When the ARM is configured for real-time debugging you must be aware of the following restrictions: •...
Chapter 9 Embedded Trace Macrocell Interface This chapter describes the ARM966E-S Embedded Trace Macrocell (ETM) interface. It contains the following sections: • About the ETM interface on page 9-2 • Enabling the ETM interface on page 9-3. ARM DDI 0186A...
The ARM966E-S supports the connection of an external Embedded Trace Macrocell (ETM) to provide real-time code tracing of the ARM966E-S in an embedded system. The ETM interface is primarily one way. In order to provide code tracing, the ETM block must be able to monitor various ARM9E-S inputs and outputs.
The ETMEN input is usually driven by the ETM, and driven HIGH once the ETM is programmed using its TAP controller. Note If an ETM is not used in an embedded ARM966E-S design, the ETMEN input must be tied LOW to save power.
9.3.1 FIFOFULL The signal, FIFOFULL, is an input to the ARM966E-S driven by the ETM9. Whenever the programmed upper watermark of the ETM FIFO is filled, FIFOFULL is asserted. The ARM966E-S uses FIFOFULL to stall the ARM9E-S core, preventing trace loss.
Chapter 10 Test Support This chapter describes the test methodology employed for the ARM966E-S synthesized logic and tightly-coupled SRAM. It contains the following sections: • About the ARM966E-S test methodology on page 10-2 • Scan insertion and ATPG on page 10-3 •...
To achieve a high level of fault coverage, scan insertion and ATPG techniques are used on the ARM9E-S core and ARM966E-S control logic as part of the synthesis flow. BIST is used to provide high fault coverage of the compiled SRAM.
10.2.1 ARM966E-S INTEST wrapper To facilitate testing of the shadow logic between the ARM966E-S scan chains and the scan chains in an OEM ASIC, a synthesis option allows an INTEST wrapper to be inserted into the ARM966E-S. The INTEST wrapper is a scan chain around the boundary of the ARM966E-S, connecting to all input and output pins.
CP15 register 15 address space. For details of the instructions used to access these registers, see Register 15, Test on page 2-9. Access to these registers is also available in debug mode, see ARM966E-S scan chain 15 on page 8-7.
Chapter 11 Instruction cycle timings This chapter describes the instruction cycle timings for the ARM966E-S. It contains the following sections: • Introduction to instruction cycle timings on page 11-2 • When stall cycles do not occur on page 11-3 •...
In a system such as the ARM966E-S, the CLKEN input to the ARM9E-S core might be pulled LOW to stall the processor until the memory system is able to respond to the access.
ARM9E-S core can run within the ARM966E-S with no stall cycles introduced by the system controller. When this is the case, the ARM966E-S is running at peak efficiency and the instruction cycles exactly match those quoted in the ARM9E-S Technical Reference Manual.
11.4.1 Synchronization penalty At the start of an AHB access, the BIU within the ARM966E-S must wait for the first rising edge of HCLK (the HCLKEN input is true) before it can broadcast the necessary AHB control and address information for the access. This delay is the synchronization penalty.
HCLK cycle required for an IDLE cycle (=R) Number of words accessed by the transfer Table 11-4 lists the types of AHB transfers performed by the ARM966E-S and the number of CLK cycles required to perform them. This table indicates cycles where the ARM9E-S core must be stalled until one or more AHB accesses have completed, that is, for reads and unbuffered writes.
Technical Reference Manual. The number quoted assumes that the CLKEN input to the core is HIGH, ensuring no stall cycles. In the ARM966E-S, the best-case figure could match the latency quoted for the ARM9E-S core, if the necessary data and instructions were already in the D-SRAM and I-SRAM respectively.
AHB slave responses that might exist in the AHB system to which the ARM966E-S interfaces. Table 11-7 gives examples of interrupt latency for systems with different CLK to HCLK ratios. For each system, slaves can have different response times to NONSEQ and SEQ transfers.
Appendix A Signal Descriptions This appendix describes the ARM966E-S signals. It contains the following sections: • Signal properties and requirements on page A-2 • Clock interface signals on page A-3 • AHB signals on page A-4 • Coprocessor interface signals on page A-6 •...
Signal Descriptions Signal properties and requirements In order to ensure ease of integration of the ARM966E-S into embedded applications and to simplify synthesis flow, the following design techniques have been used: • a single rising edge clock times all activity •...
Signal Descriptions Clock interface signals Table A-1 describes the ARM966E-S clock interface signals. Table A-1 Clock interface signals Name Direction Description Input This clock times all operations in the ARM966E-S design. All outputs change from the rising edge and System clock all inputs are sampled on the rising edge.
(01--). Bit [3] is driven to 0 indicating not cacheable. HWDATA[31:0] Output The 32-bit write data bus is used to transfer data from the ARM966E-S to a selected bus slave during write Write data bus operations. HRDATA[31:0] Input...
Ownership of the address and Bus grant control signals changes at the end of a transfer when HREADY is HIGH, so the ARM966E-S gets access to the bus when both HREADY and HGRANT are HIGH.
Signal Descriptions Coprocessor interface signals Table A-3 describes the ARM966E-S coprocessor interface signals. Table A-3 Coprocessor interface signals Name Direction Description CPCLKEN Output Synchronous enable for coprocessor pipeline follower. When HIGH on the rising edge of CLK the Coprocessor clock pipeline follower logic is able to advance.
Not coprocessor must enter the coprocessor pipeline. instruction request nCPTRANS Output When LOW indicates that the ARM966E-S is in User mode. When HIGH indicates that the ARM966E-S is Not coprocessor in privileged mode. Sampled by the coprocessor memory translate pipeline follower.
Signal Descriptions Debug signals Table A-4 describes the ARM966E-S debug signals. Table A-4 Debug signals Name Direction Description DBGIR[3:0] Output These four bits reflect the current instruction loaded into the TAP controller control register. These bits TAP controller change when the TAP controller is in the instruction register UPDATE-IR state.
Asserted by external hardware to halt execution of the processor for debug purposes. If HIGH at the end Instruction of an instruction fetch, it causes the ARM966E-S to breakpoint enter debug state if that instruction reaches the Execute stage of the processor pipeline.
Signal Descriptions Miscellaneous signals Table A-5 describes the ARM966E-S miscellaneous signals.
Table A-5 Miscellaneous signals
Name | Direction | Description
nFIQ (Not fast interrupt request) | Input | This is the Fast Interrupt Request signal. This signal must be synchronous to CLK.
nIRQ | Input | This is the Interrupt Request signal...
Signal Descriptions ETM interface signals Table A-6 describes the ARM966E-S ETM interface signals.
Table A-6 ETM interface signals
Name | Direction | Description
ETMEN | Input | Synchronous ETM interface enable. This signal must be tied LOW if an ETM is not used.
FIFOFULL | Input | Asserted when ETM FIFO fills...
Selects the INTEST wrapper scan chain as the source for ARM966E-S inputs. SERIALEN Input Enables the INTEST wrapper BIST activation mode where the scan chain is used to apply serialized ARM instructions to the ARM966E-S to activate BIST test of the tightly-coupled SRAM. ICAPTUREEN Input 1 = INTEST wrapper in INTEST mode 0 = INTEST wrapper in EXTEST mode.
DMA write data. DMAWait Input DMA Wait. Used to stall the ARM966E-S to allow a DMA access to take place. This functionality is only required if the data RAM is single-port. This signal must be tied LOW if the data RAM is dual-port.
Appendix C SRAM Stall Cycles This appendix describes the tightly-coupled SRAM in the ARM966E-S. It contains the following section: • About SRAM stall cycles on page C-2. For details of the ARM9E-S interface signals referenced in this section, refer to the ARM9E-S Technical Reference Manual.
SRAMs and additional stall mechanism attributed to the I-SRAM only. Any stall requirement is detected by the SRAM control and factored into its response to the ARM966E-S system controller. The ARM9E-S SYSCLKEN input is then deasserted until the SRAM has performed the access.