Packed-Data Processing on the 'C64x
8.2.7.2 Combining Operations in the Vector Complex Multiply Kernel
The Vector Complex Multiply kernel that was originally shown in Example 8–4
can be optimized with a technique similar to the one used with the Dot
Product kernel in Section 8.2.4.1. First, the loads and stores are vectorized in
order to bring data in more efficiently. Next, operations are combined together
into intrinsics to make full use of the machine.
Example 8–12 illustrates the vectorization step. For details, consult the earlier
examples, such as the Vector Sum. The complex multiplication step itself has
not yet been optimized at all.
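
The following is a minimal portable C sketch of what the vectorization step amounts to. It is not the guide's own listing. Each complex element is fetched and stored as one 32-bit word rather than as two halfword accesses, while the multiply itself stays in plain scalar arithmetic. The function name vec_cx_mpy_words, the helpers pack_cx, lo16 and hi16, the placement of the real part in the low halfword, and the shift by 16 are all assumptions made for illustration. A 'C64x build would use the compiler's own word-access and packing intrinsics instead of these helpers.

```c
#include <stdint.h>

/* Hypothetical helper: pack one complex value into a 32-bit word.
 * Assumption: real part in the low halfword, imaginary part in the high. */
static uint32_t pack_cx(int16_t re, int16_t im)
{
    return ((uint32_t)(uint16_t)im << 16) | (uint16_t)re;
}

/* Hypothetical helpers: extract the signed low / high halfword of a word. */
static int16_t lo16(uint32_t w) { return (int16_t)(uint16_t)(w & 0xFFFFu); }
static int16_t hi16(uint32_t w) { return (int16_t)(uint16_t)(w >> 16); }

/* Vectorized loads and stores, unoptimized complex multiply.
 * n is the number of complex elements. Results are scaled by >> 16
 * (an assumed scaling); no saturation or rounding is performed. */
void vec_cx_mpy_words(const uint32_t *a, const uint32_t *b,
                      uint32_t *c, int n)
{
    for (int i = 0; i < n; i++) {
        uint32_t aw = a[i];              /* one word load per complex input */
        uint32_t bw = b[i];

        int64_t ar = lo16(aw), ai = hi16(aw);
        int64_t br = lo16(bw), bi = hi16(bw);

        /* Scalar complex multiply. 64-bit intermediates avoid signed
         * overflow; >> on a negative value is assumed to be arithmetic. */
        int64_t re = (ar * br - ai * bi) >> 16;
        int64_t im = (ar * bi + ai * br) >> 16;

        c[i] = pack_cx((int16_t)re, (int16_t)im);  /* one word store */
    }
}
```

In this form the loop performs two loads and one store per complex element instead of four loads and two stores, but it still spends separate multiplies, adds, and shifts on every output, which is the part left for the combining step.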