the instruction cache is either enabled or disabled. There is no performance benefit in
not using the instruction cache. The one exception is that code that locks other code
into the instruction cache must itself execute from non-cached memory.
3.10.4.1.1 Cache Miss Cost
The performance of the IXP45X/IXP46X network processors is highly dependent on
reducing the cache miss rate.
Note that the cycle penalty of a cache miss becomes significant when the core is running
much faster than external memory. Executing non-cached instructions severely curtails the
processor's performance in this case, so it is very important to do everything possible
to minimize cache misses.
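The cost described above can be estimated with the standard relation of hit time plus miss rate times miss penalty. The sketch below is illustrative only: the function name and all cycle counts are assumptions chosen for the example, not figures from this manual.

```python
def average_fetch_cycles(hit_cycles, miss_rate, miss_penalty_cycles):
    """Average cycles per instruction fetch: hit time + miss rate * miss penalty."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_cycles + miss_rate * miss_penalty_cycles


# Hypothetical numbers: a 1-cycle hit, and a miss penalty (in core cycles)
# that grows as the core clock outpaces external memory.
for penalty in (20, 60):
    for miss_rate in (0.01, 0.05):
        cycles = average_fetch_cycles(1, miss_rate, penalty)
        print(f"penalty={penalty:3d} miss_rate={miss_rate:.2f} -> {cycles:.2f} cycles/fetch")
```

With these assumed values, the same miss rate costs more per fetch as the penalty grows, which is why the miss rate matters most when the core is much faster than memory.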
For the IXP45X/IXP46X network processors, care must be taken to optimize code for the
highest possible cache hit rate when accesses are made to the Expansion Bus Interface or
the PCI Bus Controller. This recommendation stems from the latency that may be associated
with accessing the PCI Bus Controller and the Expansion Bus Controller: retries are issued
to the Intel XScale processor until the requested transaction is completed.
3.10.4.1.2 Round Robin Replacement Cache Policy
Both the data and the instruction caches use a round robin replacement policy to evict
a cache line. The simple consequence of this is that, at some time, every line will be
evicted, assuming a non-trivial program. The less obvious consequence is that it is very
difficult to predict when evictions take place and which cache lines they affect. This
information must be gained by experimentation using performance profiling.
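The first consequence can be seen in a small simulation. This is a minimal sketch of one cache set with a round robin victim pointer, not a model of the actual hardware: the class name, the 4-way set size, and the assumption that only misses advance the pointer are all choices made for the illustration.

```python
class RoundRobinSet:
    """One cache set with `ways` line slots and a round robin victim pointer."""

    def __init__(self, ways):
        self.lines = [None] * ways   # tag held in each way (None = empty)
        self.victim = 0              # next way to be replaced on a miss
        self.evictions = []          # tags evicted so far, in order

    def access(self, tag):
        """Return True on a hit; on a miss, fill the way at the victim pointer."""
        if tag in self.lines:
            return True              # assumed: hits do not move the pointer
        old = self.lines[self.victim]
        if old is not None:
            self.evictions.append(old)
        self.lines[self.victim] = tag
        self.victim = (self.victim + 1) % len(self.lines)
        return False


# A "hot" line is touched before every access to a new, distinct cold line.
cache_set = RoundRobinSet(ways=4)   # hypothetical small set for readability
hot_misses = 0
for cold_tag in range(12):
    if not cache_set.access("hot"):
        hot_misses += 1
    cache_set.access(cold_tag)

print("misses on the hot line:", hot_misses)
print("hot line was evicted:", "hot" in cache_set.evictions)
```

Because the policy ignores how recently or how often a line was used, the hot line is evicted whenever the pointer reaches it, even though it is accessed more often than any other line.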
3.10.4.1.3 Code Placement to Reduce Cache Misses
Code placement can greatly affect cache misses. One way to view the cache is to think
of it as 32 sets of 32 bytes, which together span an address range of 1,024 bytes. When
running, the code maps into these 32 blocks modulo 1,024 bytes of cache space. Any sets
that are overused will thrash the cache. The ideal situation is for the software tools to
distribute the code evenly over this space with respect to when it executes.
This is very difficult, if not impossible, for a compiler to do. Most of the input needed to
best estimate how to distribute the code will come from profiling followed by
compiler-based two-pass optimizations.
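The mapping of code addresses onto sets can be sketched as follows. The 32-byte line and 32 sets come from the description above. The function names and the example addresses and sizes are hypothetical; in practice they would come from a linker map and profiling data.

```python
from collections import Counter

LINE_BYTES = 32   # bytes per cache line, from the text
NUM_SETS = 32     # number of sets, from the text; 32 * 32 = 1,024-byte span


def cache_set_index(address):
    """Set that an address maps to: line number modulo the number of sets."""
    return (address // LINE_BYTES) % NUM_SETS


def set_pressure(functions):
    """Count the lines each set receives from (start_address, size_bytes) pairs."""
    pressure = Counter()
    for start, size in functions:
        first_line = start // LINE_BYTES
        last_line = (start + size - 1) // LINE_BYTES
        for line in range(first_line, last_line + 1):
            pressure[line % NUM_SETS] += 1
    return pressure


# Hypothetical layout: three 64-byte hot routines placed 1,024 bytes apart.
# Each one occupies the same two sets, leaving the other 30 sets unused.
hot_functions = [(0x0000, 64), (0x0400, 64), (0x0800, 64)]
print(sorted(set_pressure(hot_functions).items()))
```

A histogram like this only shows which sets are crowded. Whether a crowded set actually thrashes depends on how many lines the set can hold and on when the routines run, which is why profiling is still required.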
3.10.4.1.4 Locking Code into the Instruction Cache
One very important instruction cache feature is the ability to lock code into the
instruction cache. Once locked into the instruction cache, the code is always available
for fast execution. Another reason for locking critical code into cache is that with the
round robin replacement policy, eventually the code will be evicted, even if it is a very
frequently executed function. Key code components to consider for locking are:
• Interrupt handlers
• Real time clock handlers
• OS critical code
• Time critical application code
The disadvantage to locking code into the cache is that it reduces the cache size for the
rest of the program. How much code to lock is very application dependent and requires
experimentation to optimize.
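As a starting point for that experimentation, the capacity trade-off can be estimated with simple arithmetic. This is a rough sketch under stated assumptions: the 32-byte line comes from the description earlier in this section, while the 32 KB cache size, the routine sizes, and the function name are assumptions for the example and should be checked against the actual part and linker map. It also assumes the locked code is contiguous and line-aligned.

```python
import math

LINE_BYTES = 32  # line size, from the "32 sets of 32 bytes" view of the cache


def locking_budget(locked_code_bytes, cache_bytes):
    """Return (lines_locked, bytes_left_for_unlocked_code, fraction_locked)."""
    lines_locked = math.ceil(locked_code_bytes / LINE_BYTES)
    locked_bytes = lines_locked * LINE_BYTES
    if locked_bytes > cache_bytes:
        raise ValueError("locked code does not fit in the cache")
    return lines_locked, cache_bytes - locked_bytes, locked_bytes / cache_bytes


# Hypothetical sizes for an interrupt handler and a real-time clock handler.
routine_sizes = {"irq_handler": 600, "rtc_handler": 210}
ASSUMED_CACHE_BYTES = 32 * 1024  # assumption, not taken from this section

lines, remaining, fraction = locking_budget(sum(routine_sizes.values()), ASSUMED_CACHE_BYTES)
print(f"{lines} lines locked, {remaining} bytes left, {fraction:.1%} of the cache locked")
```

The estimate says nothing about which sets lose capacity or how the remaining code behaves, so measured results should decide how much to lock.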
Intel® IXP45X and Intel® IXP46X Product Line of Network Processors Developer's Manual
Intel XScale® Processor
August 2006
Order Number: 306262-004US