Clipping To An Arbitrary Unsigned Range [High, Low]; Example 4-21 Simplified Clipping To An Arbitrary Signed Range - Intel ARCHITECTURE IA-32 Reference Manual

IA-32 Intel® Architecture Optimization
The code above converts values to unsigned numbers first and then clips
them to an unsigned range. The last instruction converts the data back to
signed data and places the data within the signed range. Conversion to
unsigned data is required for correct results when (high - low) < 0x8000.

If (high - low) >= 0x8000, the algorithm can be simplified as shown in
Example 4-21.
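
The four-step sequence described above (bias to unsigned, clip high,
clip low, restore) can be modelled on a single 16-bit lane. The sketch
below is a scalar Python model, not MMX code. The instruction names in
the comments (pxor, paddusw, psubusw, paddw) and the helper names
high_us and low_us are assumptions about the preceding listing, which is
not reproduced on this page.

```python
MASK = 0xFFFF

def paddusw(a, b):
    """One lane of an unsigned saturating 16-bit add."""
    return min(a + b, 0xFFFF)

def psubusw(a, b):
    """One lane of an unsigned saturating 16-bit subtract."""
    return max(a - b, 0)

def paddw(a, b):
    """One lane of a wrapping 16-bit add."""
    return (a + b) & MASK

def to_signed(w):
    """Interpret a 16-bit pattern as a signed word."""
    return w - 0x10000 if w & 0x8000 else w

def clip_signed_biased(x, low, high):
    """Clip signed word x to [low, high] by way of the unsigned domain."""
    high_us = (high + 0x8000) & MASK           # high biased into 0..0xffff (assumed name)
    low_us = (low + 0x8000) & MASK             # low biased into 0..0xffff (assumed name)
    w = (x & MASK) ^ 0x8000                    # pxor: signed -> 0..0xffff
    w = paddusw(w, 0xFFFF - high_us)           # anything above high saturates at 0xffff
    w = psubusw(w, 0xFFFF - high_us + low_us)  # anything below low saturates at 0
    w = paddw(w, low & MASK)                   # undo the two offsets
    return to_signed(w)
```

Because both saturating steps work on unsigned values, the constants stay
in 0..0xffff for any low <= high, which is why this form has no
restriction on the width of the range.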

Example 4-21 Simplified Clipping to an Arbitrary Signed Range

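; Input:
;    MM0    signed source operands
; Output:
;    MM0    signed operands clipped to the signed
;           range [high, low]
;
paddssw MM0, (packed_max - packed_high)
                        ; in effect this clips to high
psubssw MM0, (packed_usmax - packed_high + packed_low)
                        ; clips to low
paddw   MM0, low        ; undo the previous two offsets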
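This algorithm saves a cycle when it is known that (high - low) >=
0x8000. The three-instruction algorithm does not work when
(high - low) < 0x8000, because 0xffff minus any number < 0x8000 yields
a value of 0x8000 or greater, which is a negative number when
interpreted as a signed word. When the second instruction,
psubssw MM0, (0xffff - high + low), in the three-step algorithm
(Example 4-21) is executed, a negative number is subtracted. The result
of this subtraction causes the values in MM0 to be increased instead of
decreased, as should be the case, and an incorrect answer is generated.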
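
The range restriction can be checked with a scalar model of the
three-step sequence. The sketch below is Python, one 16-bit lane at a
time, and is an illustration under stated assumptions rather than the
manual's code. In particular, the last step adds back
(low + 0x8000) mod 2^16, which is what the arithmetic needs for the
result to land on the clamped value; reading the final operand of the
listing that way is an assumption. The helper names sat16 and wrap16 are
hypothetical.

```python
def sat16(v):
    """Saturate to the signed 16-bit range, as a signed saturating op would."""
    return max(-0x8000, min(v, 0x7FFF))

def wrap16(v):
    """Wrap to 16 bits and interpret the result as a signed word."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v

def clip_signed_simplified(x, low, high):
    """Three-step clip; only valid when high - low >= 0x8000."""
    c1 = wrap16(0x7FFF - high)        # packed_max - packed_high
    c2 = wrap16(0xFFFF - high + low)  # packed_usmax - packed_high + packed_low
    v = sat16(x + c1)                 # signed saturating add: clips to high
    v = sat16(v - c2)                 # signed saturating subtract: clips to low
    return wrap16(v + low + 0x8000)   # wrapping add: undo the offsets (assumed constant)
```

With a wide range such as [-20000, 20000], c2 is a positive signed word
and the result matches an ordinary clamp. With the narrow range
[-10, 10], c2 wraps to -21, the subtraction moves the value upward
instead of toward the low saturation point, and
clip_signed_simplified(-100, -10, 10) returns -100 rather than -10.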

Clipping to an Arbitrary Unsigned Range [high, low]

Example 4-22 clips an unsigned value to the unsigned range [high, low].
If the value is less than low or greater than high, then clip to low or
high, respectively. This technique uses the packed-add and
packed-subtract instructions with unsigned saturation.
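
A scalar sketch of the unsigned case follows. It is written in Python
for one 16-bit lane and assumes the sequence mirrors the signed
examples: an unsigned saturating add, an unsigned saturating subtract,
and a wrapping add. The listing for Example 4-22 is not reproduced on
this page, so the instruction names in the comments are assumptions.

```python
def clip_unsigned(x, low, high):
    """Clip unsigned word x (0..0xffff) to [low, high] using saturation."""
    w = min(x + (0xFFFF - high), 0xFFFF)   # paddusw: values above high saturate at 0xffff
    w = max(w - (0xFFFF - high + low), 0)  # psubusw: values below low saturate at 0
    return (w + low) & 0xFFFF              # paddw: undo the two offsets
```

Both constants lie in 0..0xffff whenever low <= high, so no bias step and
no restriction on the width of the range is needed in this model.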
