DSPF_sp_vecsum_sq
Algorithm
Special Requirements There are no alignment requirements.
Implementation Notes
Benchmarks
Function
Arguments
Description
This is the C equivalent of the assembly code, without restrictions.
void DSPF_sp_vecrecip(const float *x, float *restrict r, int n)
{
    int i;

    for (i = 0; i < n; i++)
        r[i] = 1 / x[i];
}
- The inner loop is unrolled four times to allow calculation of four reciprocals in the kernel. However, the stores are executed conditionally to allow n to be any number > 0.
- Register sharing is used to make optimal use of available registers.
- No extraneous loads occur except when n ≤ 4, in which case a pad of 16 bytes is required.
- Endianness: This code is little endian.
- Interruptibility: This code is interrupt-tolerant but not interruptible.
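The unrolling with conditional stores described in the notes above can be sketched in plain C. This is only an illustration of the idea, not the library's assembly kernel: the name vecrecip_unrolled4 is hypothetical, and the sketch guards its loads as well as its stores, so it says nothing about the load or padding behaviour of the optimized routine.

```c
/* Hypothetical sketch: four reciprocals per loop iteration, with the
 * second, third and fourth stores guarded so that n can be any value > 0
 * and need not be a multiple of four. */
void vecrecip_unrolled4(const float *x, float *restrict r, int n)
{
    int i;

    for (i = 0; i < n; i += 4) {
        /* The loop condition already guarantees i < n here. */
        r[i] = 1.0f / x[i];

        /* Conditional stores: skipped once the index passes the end. */
        if (i + 1 < n)
            r[i + 1] = 1.0f / x[i + 1];
        if (i + 2 < n)
            r[i + 2] = 1.0f / x[i + 2];
        if (i + 3 < n)
            r[i + 3] = 1.0f / x[i + 3];
    }
}
```

For n = 5, for example, the loop body runs twice and the second pass performs only one store, leaving r[5] and beyond untouched.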
Cycles: 8*floor((n-1)/4) + 53
e.g., for n = 100, cycles = 245
Code size: 512 (in bytes)
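As a quick check of the cycle formula, the estimate can be evaluated with integer arithmetic. The helper name vecrecip_cycles is hypothetical and is not part of the library; the sketch assumes n > 0, so that C integer division matches floor.

```c
/* Hypothetical helper: evaluates 8*floor((n-1)/4) + 53 for n > 0.
 * For non-negative (n - 1), integer division truncates exactly as
 * floor does. */
int vecrecip_cycles(int n)
{
    return 8 * ((n - 1) / 4) + 53;
}
```

For n = 100 this gives 8*24 + 53 = 245, matching the figure quoted above.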
Single-precision sum of squares
float DSPF_sp_vecsum_sq (const float *x, int n)
x    Pointer to the input array.
n    Number of elements in the array.
This routine computes the sum of the squares of the elements of the array x and returns the sum.
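A plain C reference for this behaviour might look like the following. It is a minimal sketch under assumptions: the name vecsum_sq_ref is hypothetical, and the accumulation is done in natural order, so an optimized routine that sums in a different order may differ from it in the last bits of the result.

```c
/* Hypothetical reference sketch: returns x[0]^2 + x[1]^2 + ... + x[n-1]^2,
 * accumulated in single precision in index order. Returns 0 for n <= 0. */
float vecsum_sq_ref(const float *x, int n)
{
    float sum = 0.0f;
    int i;

    for (i = 0; i < n; i++)
        sum += x[i] * x[i];

    return sum;
}
```

For the input {1, 2, 3, 4}, the result is 1 + 4 + 9 + 16 = 30.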