Replace Certain Shld Instructions With Alternative Code; Use 8-Bit Sign-Extended Immediates - AMD Athlon Processor x86 Optimization Manual

AMD Athlon™ Processor x86 Code Optimization

Replace Certain SHLD Instructions with Alternative Code

Certain instances of the SHLD instruction can be replaced by
alternative code using SHR and LEA. The alternative code has
lower latency and requires fewer execution resources. SHR and
LEA (32-bit version) are DirectPath instructions, while SHLD is
a VectorPath instruction. Using SHR and LEA preserves decode
bandwidth because it potentially enables the decoding of a third
DirectPath instruction.

Example 1 (Avoid):
   SHLD REG1, REG2, 1

Example 1 (Preferred):
   SHR REG2, 31
   LEA REG1, [REG1*2 + REG2]

Example 2 (Avoid):
   SHLD REG1, REG2, 2

Example 2 (Preferred):
   SHR REG2, 30
   LEA REG1, [REG1*4 + REG2]

Example 3 (Avoid):
   SHLD REG1, REG2, 3

Example 3 (Preferred):
   SHR REG2, 29
   LEA REG1, [REG1*8 + REG2]
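
To see why the replacement computes the same result, the following C sketch models both sequences on 32-bit values. It is an illustration only, not code from the manual: the function names shld_model and shr_lea_model are hypothetical, and the model covers register values only, not flags. The key observation is that REG1 shifted left by n has its low n bits clear, so adding the top n bits of REG2 is the same as OR-ing them in.

```c
#include <assert.h>
#include <stdint.h>

/* Reference model of SHLD dst, src, n on 32-bit registers (1 <= n <= 31):
   dst is shifted left by n and the vacated low bits are filled with the
   top n bits of src. */
static uint32_t shld_model(uint32_t dst, uint32_t src, unsigned n)
{
    assert(n >= 1 && n <= 31);
    return (dst << n) | (src >> (32 - n));
}

/* Model of the SHR + LEA replacement. Valid for n = 1, 2, 3 only,
   because LEA scale factors are limited to 2, 4, and 8. */
static uint32_t shr_lea_model(uint32_t dst, uint32_t src, unsigned n)
{
    assert(n >= 1 && n <= 3);
    uint32_t reg2 = src >> (32 - n);   /* SHR REG2, 32-n (REG2 is overwritten) */
    return dst * (1u << n) + reg2;     /* LEA REG1, [REG1*2^n + REG2] */
}
```

Two differences from SHLD are worth keeping in mind when applying the replacement: the SHR overwrites REG2, which SHLD leaves unchanged, and the resulting flags differ because LEA does not modify flags. The substitution therefore fits cases where REG2 is not needed afterward and no later instruction depends on the flags SHLD would have set.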

Use 8-Bit Sign-Extended Immediates

Using 8-bit sign-extended immediates improves code density
with no negative effects on the AMD Athlon processor. For
example, ADD BX, -5 should be encoded "83 C3 FB" and not
"81 C3 FB FF" (the 16-bit immediate FFFBh is stored low byte
first).
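
The choice between the two encodings can be sketched as a small C routine of the kind an assembler or code generator might use. This is a minimal illustration under stated assumptions: the helper name encode_add_bx is hypothetical, and it assumes a 16-bit default operand size, so no 66h operand-size prefix is emitted. Opcode 83h takes a sign-extended 8-bit immediate, while opcode 81h takes a full 16-bit immediate stored low byte first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write the bytes of ADD BX, imm (16-bit operand size)
   into out and return the instruction length in bytes. */
static size_t encode_add_bx(int16_t imm, uint8_t out[4])
{
    const uint8_t modrm = 0xC3;        /* mod=11, reg=/0 (ADD), rm=011 (BX) */
    uint16_t bits = (uint16_t)imm;     /* two's-complement bit pattern */

    assert(out != NULL);
    out[1] = modrm;
    if (imm >= -128 && imm <= 127) {
        out[0] = 0x83;                 /* ADD r/m16, imm8 (sign-extended) */
        out[2] = (uint8_t)(bits & 0xFF);
        return 3;
    }
    out[0] = 0x81;                     /* ADD r/m16, imm16 */
    out[2] = (uint8_t)(bits & 0xFF);   /* low byte first */
    out[3] = (uint8_t)(bits >> 8);     /* then high byte */
    return 4;
}
```

Called with -5, this sketch produces the three bytes 83 C3 FB; a value outside the range -128 to 127, such as 300, falls back to the four-byte form.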
22007E/0—November 1999
