Instruction Cache

An example of inefficient cache code appears in Table 3-1. The PM bus data access at address 0x101 in the loop, Outer, causes a bus conflict and also causes the cache to load the instruction being fetched at 0x103 (into set 3). Each time the program calls the subroutine, Inner, the program memory data accesses at 0x201 and 0x211 displace the instruction at 0x103 by loading the instructions at 0x203 and 0x213 (also into set 3). If the program rarely calls the Inner subroutine during the Outer loop execution, the repeated cache loads do not greatly influence performance. If the program frequently calls the subroutine while in the loop, cache inefficiency has a noticeable effect on performance.

To improve cache efficiency on this code (if, for instance, execution of the Outer loop is time critical), rearrange the order of some instructions. Moving the subroutine call up one location (starting at 0x201) also works. By using that order, the two cached instructions end up in cache set 4, instead of set 3.

Table 3-1. Cache Inefficient Code

Address   Instruction
0x0100    lcntr = 1024, do Outer until LCE;
0x0101    r0 = dm(i0,m0), pm(i8,m8) = f3;
0x0102    r1 = r0 - r15;
0x0103    if eq call (Inner);
0x0104    f2 = float r1;
0x0105    f3 = f2 * f2;
0x0106    Outer: f3 = f3 + f4;
0x0107    pm(i8,m8) = f3;
...
0x0200    Inner: r1 = R13;
0x0201    r14 = pm(i9,m9);
...
0x0211    pm(i9,m9) = r12;

ADSP-2126x SHARC Processor Hardware Reference
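
The set conflict described in this section can be illustrated with a small model. The following Python sketch is not taken from the manual and makes several assumptions that the text only implies: the cache set is selected by the low four bits of the instruction address (0x103, 0x203, and 0x213 all map to set 3), each set holds two entries with least-recently-used replacement, and the instruction that gets cached lies two locations past the instruction performing the PM data access (0x101 caches 0x103). The names `cache_set`, `cached_fetch`, `InstructionCache`, and `count_misses` are hypothetical, and the model treats every loop iteration as calling Inner, which is the worst case.

```python
from collections import OrderedDict

# Assumed cache geometry, inferred from the addresses in the example.
NUM_SETS = 16      # set index = low 4 bits of the instruction address
WAYS = 2           # entries per set (assumption)
FETCH_OFFSET = 2   # conflicting fetch is 2 locations past the PM access


def cache_set(addr):
    """Return the set an instruction address maps to (assumed mapping)."""
    return addr % NUM_SETS


def cached_fetch(pm_access_addr):
    """Address of the instruction cached because of a PM data access."""
    return pm_access_addr + FETCH_OFFSET


class InstructionCache:
    """Toy set-associative cache with LRU replacement inside each set."""

    def __init__(self):
        self.sets = [OrderedDict() for _ in range(NUM_SETS)]
        self.misses = 0

    def access(self, fetch_addr):
        entries = self.sets[cache_set(fetch_addr)]
        if fetch_addr in entries:
            entries.move_to_end(fetch_addr)   # mark as most recently used
            return True
        self.misses += 1
        if len(entries) >= WAYS:
            entries.popitem(last=False)       # evict least recently used
        entries[fetch_addr] = True
        return False


def count_misses(pm_access_addrs, iterations):
    """Count cache misses when every iteration performs all PM accesses."""
    cache = InstructionCache()
    for _ in range(iterations):
        for addr in pm_access_addrs:
            cache.access(cached_fetch(addr))
    return cache.misses


# Layout from Table 3-1: PM data accesses at 0x101, 0x201, 0x211.
original = count_misses([0x101, 0x201, 0x211], iterations=10)
# Subroutine shifted by one location: PM data accesses at 0x202, 0x212.
shifted = count_misses([0x101, 0x202, 0x212], iterations=10)
print(original, shifted)
```

Under these assumptions, the original layout misses on every access (30 misses in 10 iterations) because three instructions compete for the two entries of set 3. With the subroutine shifted by one location, the two subroutine instructions map to set 4, and only the three initial loads miss.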