Texas Instruments TMS320C64X Programmer's Reference Manual page 103

DSP little-endian DSP library
DSP_mat_trans    Matrix Transpose

Function
    void DSP_mat_trans (const short *x, short rows, short columns, short *r)

Arguments
    x[rows*columns]    Pointer to the input matrix.
    rows               Number of rows in the input matrix. Must be a multiple of 4.
    columns            Number of columns in the input matrix. Must be a multiple of 4.
    r[columns*rows]    Pointer to the output matrix, of size columns*rows.

Description
    This function transposes the input matrix x[ ] and writes the result to matrix r[ ].

Algorithm
    This is the C equivalent of the assembly code without restrictions. Note that
    the assembly code is hand optimized and restrictions may apply.

    void DSP_mat_trans(const short *x, short rows, short columns, short *r)
    {
        short i, j;

        /* r is a columns x rows matrix: r[i][j] = x[j][i] */
        for (i = 0; i < columns; i++)
            for (j = 0; j < rows; j++)
                *(r + i * rows + j) = *(x + i + columns * j);
    }

Special Requirements
    - Rows and columns must each be a multiple of 4.
    - Matrices are assumed to have 16-bit elements.

Implementation Notes
    - Bank Conflicts: No bank conflicts occur.
    - Interruptibility: The code is interrupt-tolerant but not interruptible.
    - Data from four adjacent rows, spaced "columns" elements apart, is read, and a
      local 4x4 transpose is performed in the register file. This produces four
      double words that are stored "rows" elements apart. These loads and stores
      can cause bank conflicts; hence, non-aligned loads and stores are used.

Benchmarks
    Cycles      (2 * rows + 9) * columns/4 + 3
    Codesize    224 bytes
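
The 4x4 tiling described in the implementation notes and the cycle formula can be tried out in portable C. The following is a minimal sketch, not TI's hand-optimized assembly: the names mat_trans_ref, mat_trans_4x4, mat_trans_cycles and mat_trans_selfcheck are hypothetical and do not exist in DSPLIB, and the local tile[][] array only stands in for the register file.

```c
#include <assert.h>
#include <string.h>

/* Reference transpose: the same loop as the C equivalent in the manual. */
void mat_trans_ref(const short *x, short rows, short columns, short *r)
{
    int i, j;
    for (i = 0; i < columns; i++)
        for (j = 0; j < rows; j++)
            r[i * rows + j] = x[i + columns * j];
}

/* Hypothetical blocked variant: transposes one 4x4 tile at a time. */
void mat_trans_4x4(const short *x, short rows, short columns, short *r)
{
    int bi, bj, i, j;
    short tile[4][4];              /* stand-in for the register file */

    /* Same restriction as the optimized routine. */
    assert(rows % 4 == 0 && columns % 4 == 0);

    for (bj = 0; bj < rows; bj += 4) {
        for (bi = 0; bi < columns; bi += 4) {
            /* Read four adjacent input rows, "columns" elements apart. */
            for (j = 0; j < 4; j++)
                for (i = 0; i < 4; i++)
                    tile[i][j] = x[(bj + j) * columns + bi + i];
            /* Write four output rows, "rows" elements apart. */
            for (i = 0; i < 4; i++)
                for (j = 0; j < 4; j++)
                    r[(bi + i) * rows + bj + j] = tile[i][j];
        }
    }
}

/* Cycle estimate taken directly from the Benchmarks formula. */
long mat_trans_cycles(long rows, long columns)
{
    return (2 * rows + 9) * columns / 4 + 3;
}

/* Returns 1 if both variants agree on an 8x4 test matrix. */
int mat_trans_selfcheck(void)
{
    enum { ROWS = 8, COLS = 4 };
    short x[ROWS * COLS], a[COLS * ROWS], b[COLS * ROWS];
    int k;

    for (k = 0; k < ROWS * COLS; k++)
        x[k] = (short)k;           /* x[j][i] = j * COLS + i */

    mat_trans_ref(x, ROWS, COLS, a);
    mat_trans_4x4(x, ROWS, COLS, b);

    /* a[0][1] must equal x[1][0], which is COLS. */
    return memcmp(a, b, sizeof a) == 0 && a[1] == COLS;
}
```

As a worked instance of the formula, a 16x16 matrix gives (2 * 16 + 9) * 16/4 + 3 = 167 cycles, which is what mat_trans_cycles(16, 16) returns. Because columns is required to be a multiple of 4, the division is always exact.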
DSP_mat_trans
C64x+ DSPLIB Reference
4-75
