Case Study 2: Optimizing Memory Fill - Intel PXA270 Optimization Manual


Intel XScale® Microarchitecture & Intel® Wireless MMX™ Technology Optimization
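Using preloads appropriately, the code can be desensitized to memory latency (preloads and
prefetches are the same). Preloads are described further in Section 5.1.1.1.2, "Preload Loop
Scheduling" on page 5-2. The following code performs memcpy with optimizations for latency
desensitization.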
; for cache-line-aligned case
PLD [r5]
PLD [r5, #32]
PLD [r5, #64]
PLD [r5, #96]
LOOP:
ldrd r0, [r5], #8
ldrd r2, [r5], #8
ldrd r8, [r5], #8 ; r8/r9 rather than r4/r5: LDRD into r4 would overwrite r5, the source pointer
str r0, [r6], #4
str r1, [r6], #4
ldrd r0, [r5], #8
pld [r5, #96] ; preloading 3 cache lines ahead
str r2, [r6], #4
str r3, [r6], #4
str r8, [r6], #4
str r9, [r6], #4
str r0, [r6], #4
str r1, [r6], #4
....
This code preloads three cache lines ahead of its current iteration. It also uses LDRD and groups
the STRs together so that the stores can coalesce.
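For readers working in C rather than assembly, the same idea can be sketched as follows. This is a minimal illustration, not code from the guide: the function name copy_lines_preload and the macros are hypothetical, the 32-byte line size is taken from the PLD offsets in the listing above, and __builtin_prefetch is the GCC/Clang builtin that is assumed to map to PLD on this target. The compiler is free to reorder the loads and stores, so the grouping shown here is only a hint.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 32u                                   /* assumed: PLD offsets step by 32 */
#define WORDS_PER_LINE   (CACHE_LINE_BYTES / sizeof(uint32_t)) /* 8 words per line */
#define LINES_PRIMED     4u                                    /* lines preloaded before the loop */

/* Hypothetical helper: copies `lines` whole cache lines from src to dst. */
static void copy_lines_preload(uint32_t *dst, const uint32_t *src, size_t lines)
{
    assert(dst != NULL && src != NULL);

    /* Prime the cache, as the four PLDs before LOOP do. */
    for (size_t k = 0; k < LINES_PRIMED && k < lines; k++)
        __builtin_prefetch(src + k * WORDS_PER_LINE, 0);

    for (size_t i = 0; i < lines; i++) {
        uint32_t w[WORDS_PER_LINE];

        /* Group the loads for one cache line. */
        for (size_t j = 0; j < WORDS_PER_LINE; j++)
            w[j] = src[j];

        /* Request a line that later iterations will read; stay inside the buffer. */
        if (i + LINES_PRIMED < lines)
            __builtin_prefetch(src + LINES_PRIMED * WORDS_PER_LINE, 0);

        /* Group the stores so that a write buffer has the chance to coalesce them. */
        for (size_t j = 0; j < WORDS_PER_LINE; j++)
            dst[j] = w[j];

        src += WORDS_PER_LINE;
        dst += WORDS_PER_LINE;
    }
}
```

The prefetch distance is a tuning parameter; the value used here simply mirrors the assembly listing and would need to be measured on real hardware.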
4.6.2 Case Study 2: Optimizing Memory Fill

Graphics applications use fill routines. Most personal digital assistant (PDA) LCD displays use an
RGB output color format (16 bits or 8 bits per pixel). Therefore, most fill routines write out pixels
as bytes or half-words, which is not recommended in terms of bus-bandwidth usage. However,
multiple pixels can be packed into a 32-bit data format and written to memory as a single word. Use
packing to improve efficiency.
Fill routines can make effective use of the write-coalescing feature that the PXA27x processor
provides if the LCD frame buffer is allocated as uncached but bufferable. This code example shows
a common fill function:
unsigned short wColor, *pDst, DstStride;
int iRows, iCols; // assumed to be set by the caller before BlitFill() runs

void BlitFill(void)
{
    for (int i = 0; i < iRows; i++) {
        // Set this solid color for whole scanline, then advance to next
        for (int j = 0; j < iCols; j++)
            *pDst++ = wColor;
        pDst += DstStride;
    }
}
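The packing advice in this case study can be illustrated with a variant of the fill loop that writes two 16-bit pixels per 32-bit store. This is a sketch under stated assumptions, not the guide's own optimized routine: the function name blit_fill_packed and its parameters are hypothetical, pixels are assumed to be 16 bits wide, and stride_skip plays the role of DstStride (the number of pixels to skip after each filled scanline). memcpy is used for the word store to stay within C aliasing rules; compilers typically lower it to a single 32-bit store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packed fill: rows x cols pixels of `color`, skipping
   stride_skip pixels after each scanline. */
static void blit_fill_packed(uint16_t *dst, uint16_t color,
                             int rows, int cols, int stride_skip)
{
    assert(dst != NULL && rows >= 0 && cols >= 0);

    /* Two identical pixels in one word, so byte order does not matter. */
    const uint32_t packed = ((uint32_t)color << 16) | color;

    for (int i = 0; i < rows; i++) {
        int n = cols;

        /* Write one leading pixel if needed to reach 4-byte alignment. */
        if (n > 0 && ((uintptr_t)dst & 2u)) {
            *dst++ = color;
            n--;
        }

        /* Main body: one 32-bit store per pair of pixels. */
        for (; n >= 2; n -= 2) {
            memcpy(dst, &packed, sizeof packed);
            dst += 2;
        }

        /* Trailing odd pixel, if any. */
        if (n > 0)
            *dst++ = color;

        dst += stride_skip;
    }
}
```

Compared with the half-word loop, this roughly halves the number of store instructions per scanline. Whether that translates into a measurable gain depends on how the frame buffer is mapped, as the text above notes.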
Intel® PXA27x Processor Family Optimization Guide

