6.1 Assembly Code
The source that you write for the assembly optimizer is similar to assembly
source code; however, linear assembly does not include information about
parallel instructions, instruction latencies, or register usage. The assembly
optimizer takes care of the difficulties of streamlining your code by:
- Finding instructions that can be executed in parallel
- Handling pipeline latencies during software pipelining
- Assigning register usage
- Defining which unit to use
Although the 'C6000 tools give you the option of specifying the functional unit
or register used, doing so may restrict the compiler's ability to fully optimize
your code. See the TMS320C6000 Optimizing C/C++ Compiler User's Guide for more
information.
This chapter takes you through the optimization process manually to show you
how the assembly optimizer works and to help you understand when you might
want to perform some of the optimizations yourself. The sections introduce
optimization techniques in order of increasing complexity:
- Sections 6.3 and 6.4 begin with a dot product algorithm to show you
  how to translate the C code to assembly code and then how to optimize
  the linear assembly code with several simple techniques.
- Sections 6.5 and 6.6 introduce techniques for the more complex
  algorithms associated with software pipelining, such as modulo iteration
  interval scheduling for both single-cycle loops and multicycle loops.
- Section 6.7 uses an IIR filter algorithm to discuss the problems with loop
  carry paths.
- Sections 6.8 and 6.9 discuss the problems encountered with
  if-then-else statements in a loop and how loop unrolling can be used to
  resolve them.
- Section 6.10 introduces live-too-long issues in your code.
- Section 6.11 uses a simple FIR filter algorithm to discuss redundant load
  elimination.
- Section 6.12 discusses the same FIR filter in terms of the interleaved
  memory bank scheme used by 'C6000 devices.
- Sections 6.13 and 6.14 show you how to execute the outer loop of
  the FIR filter conditionally and in parallel with the inner loop.
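
As a point of reference for the dot product that the early sections start from, the following is a minimal C sketch of such a kernel. The function name dotp, the parameter names, and the choice of 16-bit inputs with an int accumulator are assumptions made for illustration here; they are not quoted from the later sections.

```c
/* Illustrative dot product kernel.
 * Assumptions (not taken from this section): 16-bit input samples,
 * an int accumulator, and the names dotp, a, b, and count. */
int dotp(const short *a, const short *b, int count)
{
    int sum = 0;                    /* running sum of products */
    int i;

    for (i = 0; i < count; i++) {
        sum += a[i] * b[i];         /* one multiply-accumulate per iteration */
    }
    return sum;
}
```

Each pass through a loop of this shape needs roughly two loads, a multiply, an add, and the loop-control overhead. These are the operations for which parallel execution, latencies, registers, and functional units have to be worked out, whether by the assembly optimizer or by hand.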