A6.4 L1 Data Memory System - ARM Cortex-A76 Core Technical Reference Manual

A6.4 L1 data memory system
The L1 data cache is organized as a Virtually Indexed, Physically Tagged (VIPT) cache featuring four
ways.
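Data cache invalidate on reset
The Armv8-A architecture does not support an operation to invalidate the entire data cache. If
software requires this function, it must be constructed by iterating over the cache geometry and
executing a series of individual invalidate by set/way instructions.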
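Invalidating a whole data cache by set/way amounts to visiting every (way, set) pair and issuing one operation per pair. The following is a minimal sketch of that iteration, not the manual's own code. It follows the Armv8-A set/way operand layout (way in the top bits, set above the line-offset bits, level in bits [3:1]). The struct, the function names, and the example geometry used with it are assumptions for illustration; only the four ways come from this section, and real software would read the geometry from CCSIDR_EL1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of one cache level; field names are illustrative. */
struct cache_geometry {
    unsigned level;      /* 1-based cache level (1 = L1) */
    unsigned ways;       /* associativity */
    unsigned sets;       /* number of sets */
    unsigned line_bytes; /* line length in bytes, a power of two */
};

/* Smallest n such that (1 << n) >= x, for x >= 1. */
static unsigned ceil_log2(unsigned x)
{
    unsigned n = 0;
    while ((1u << n) < x)
        n++;
    return n;
}

/* Build the 32-bit set/way operand: way in the top bits, set above the
 * line-offset bits, and (level - 1) in bits [3:1]. */
static uint32_t setway_operand(const struct cache_geometry *g,
                               unsigned way, unsigned set)
{
    assert(way < g->ways && set < g->sets);
    unsigned way_bits = ceil_log2(g->ways);
    unsigned line_shift = ceil_log2(g->line_bytes);
    uint32_t v = (uint32_t)set << line_shift;
    v |= (uint32_t)(g->level - 1) << 1;
    if (way_bits > 0)                 /* avoid an undefined shift by 32 */
        v |= (uint32_t)way << (32 - way_bits);
    return v;
}

/* Visit every set/way pair once. `op` stands in for the privileged
 * DC ISW instruction; returns the number of operations issued. */
static unsigned long for_each_setway(const struct cache_geometry *g,
                                     void (*op)(uint32_t operand))
{
    unsigned long count = 0;
    for (unsigned way = 0; way < g->ways; way++) {
        for (unsigned set = 0; set < g->sets; set++) {
            if (op)
                op(setway_operand(g, way, set));
            count++;
        }
    }
    return count;
}
```

With an assumed geometry of 4 ways, 256 sets, and 64-byte lines, the loop issues 1024 operations. On hardware, each operand would be passed to DC ISW at a privileged Exception level, and the loop would be followed by a DSB.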
A6.4.1 Memory system implementation
This section describes the implementation of the L1 memory system.
Limited Order Regions
The core offers support for four limited ordering region descriptors, as introduced by the Armv8.1
Limited Ordering Regions.
Atomic instructions
The Cortex-A76 core supports the atomic instructions added in the Armv8.1 architecture.
Atomic instructions to cacheable memory can be performed as either near atomics or far atomics,
depending on where the cache line containing the data resides.
When an instruction hits in the L1 data cache in a unique state, then it is performed as a near atomic in
the L1 memory system. If the atomic operation misses in the L1 cache, or the line is shared with another
core, then the atomic is sent as a far atomic on the core CHI interface.
If the operation misses everywhere within the cluster, and the interconnect supports far atomics, then the
atomic is passed on to the interconnect to perform the operation.
When the operation hits anywhere inside the cluster, or when an interconnect does not support atomics,
the L3 memory system performs the atomic operation. If the line is not already there, it allocates the
line into the L3 cache. This depends on whether the DSU is configured with an L3 cache.
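The routing rules above can be summarized as a small decision function. This is a behavioral sketch only: the enum, struct, and function names are hypothetical, and it models only the three conditions named in the text (a unique hit in the L1 data cache, a hit anywhere in the cluster, and interconnect support for far atomics).

```c
#include <assert.h>
#include <stdbool.h>

/* Where the atomic operation is carried out (illustrative names). */
enum atomic_site {
    ATOMIC_NEAR_L1,          /* near atomic in the L1 memory system */
    ATOMIC_FAR_L3,           /* far atomic performed by the L3 memory system */
    ATOMIC_FAR_INTERCONNECT  /* far atomic passed on to the interconnect */
};

/* Hypothetical snapshot of the lookup results for one atomic instruction. */
struct atomic_lookup {
    bool l1_hit_unique;            /* hit in the L1 data cache in a unique state */
    bool hit_in_cluster;           /* line present anywhere inside the cluster */
    bool interconnect_far_atomics; /* interconnect supports far atomics */
};

static enum atomic_site route_atomic(const struct atomic_lookup *s)
{
    /* A unique L1 hit implies the line is inside the cluster. */
    assert(!s->l1_hit_unique || s->hit_in_cluster);

    if (s->l1_hit_unique)
        return ATOMIC_NEAR_L1;
    /* L1 miss or shared line: sent as a far atomic on the CHI interface. */
    if (!s->hit_in_cluster && s->interconnect_far_atomics)
        return ATOMIC_FAR_INTERCONNECT;
    /* Hit somewhere in the cluster, or no interconnect support. */
    return ATOMIC_FAR_L3;
}
```

A line that is shared with another core has `l1_hit_unique` false and `hit_in_cluster` true, so the sketch routes it to the L3 memory system.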
Therefore, if software prefers that the atomic is performed as a near atomic, precede the atomic
instruction with a PLDW or PRFM PSTL1KEEP instruction.
Alternatively, CPUECTLR can be programmed such that different types of atomic instructions attempt to
execute as a near atomic. One cache fill will be made on an atomic. If the cache line is lost before the
atomic operation can be made, it will be sent as a far atomic.
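As an illustration of the prefetch-for-store hint described above, the sketch below issues a write prefetch before an atomic add, using the GCC and Clang builtins `__builtin_prefetch` and `__atomic_fetch_add`. The function name is hypothetical. When targeting AArch64, compilers typically lower the prefetch to PRFM PSTL1KEEP and, with the Armv8.1 atomics enabled (for example `-march=armv8.1-a`), the add to a single LDADD-family instruction; neither lowering is guaranteed, and the prefetch remains only a hint.

```c
#include <assert.h>
#include <stdint.h>

/* Atomically add `delta` to `*counter` and return the previous value.
 * The write prefetch asks for the line to be brought into L1 for a store,
 * which makes a near atomic more likely but does not guarantee one. */
static uint64_t fetch_add_near_hint(uint64_t *counter, uint64_t delta)
{
    assert(counter != 0);
    __builtin_prefetch(counter, 1 /* prepare for write */, 3 /* keep in cache */);
    return __atomic_fetch_add(counter, delta, __ATOMIC_RELAXED);
}
```

Relaxed ordering is used only to keep the example small; the memory order should match what the surrounding algorithm requires.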
The Cortex-A76 core supports atomics to device or non-cacheable memory; however, this relies on the
interconnect also supporting atomics. If such an atomic instruction is executed when the interconnect
does not support them, it will result in an abort.
For more information on the CPUECTLR register, see B2.26 CPUECTLR_EL1, CPU Extended Control
Register, EL1 on page B2-172.
LDAPR instructions
The core supports Load acquire instructions adhering to the RCpc consistency semantic introduced in the
Armv8.3 extensions for A profile. This is reflected in register ID_AA64ISAR1_EL1 where bits[23:20]
are set to 0b0001 to indicate that the core supports LDAPRB, LDAPRH, and LDAPR instructions
implemented in AArch64.
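The ID register check described above can be sketched as a plain bit-field decode. The helper names are hypothetical, and the register value is taken as a parameter, so the example makes no assumption about how software obtains it (for example, an MRS at a privileged Exception level).

```c
#include <assert.h>
#include <stdint.h>

/* Extract one 4-bit identification field whose least significant bit is `lsb`. */
static unsigned id_field(uint64_t reg, unsigned lsb)
{
    assert(lsb <= 60 && lsb % 4 == 0);  /* ID fields are 4-bit aligned */
    return (unsigned)((reg >> lsb) & 0xFu);
}

/* ID_AA64ISAR1_EL1 bits[23:20]: nonzero means the RCpc load-acquire
 * instructions (LDAPR, LDAPRB, LDAPRH) are implemented. */
static int has_ldapr(uint64_t id_aa64isar1)
{
    return id_field(id_aa64isar1, 20) >= 1u;
}
```

For a register value with bits[23:20] equal to 0b0001, as stated for this core, `has_ldapr` returns nonzero.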
Transient memory region
The core has a specific behavior for memory regions that are marked as write-back cacheable and
transient, as defined in the Armv8.0 architecture.
100798_0300_00_en
Copyright © 2016–2018 Arm Limited or its affiliates. All rights reserved.
Non-Confidential
