Intel Embedded Intel486 Hardware Reference Manual page 281

The Intel486 processor's on-chip cache dramatically speeds floating-point loads and stores. For
the Intel386 processor with a math coprocessor, instructions such as FLD (floating-point load)
take 14 to 20 clock cycles if any external memory addressing is required. Once operands are
on the internal stack, the floating-point add instruction takes 23 to 31 cycles to execute,
depending on the value of the operands. Finally, an external memory store can take 11 to 44 cycles.
Because the floating-point unit of the Intel486 processor is integrated, the entire operation
executes in fewer cycles. Data from external memory can be cached. After that, it can be accessed
by the floating-point unit and loaded onto the stack in three cycles on a cache hit. The
floating-point add instruction takes between 8 and 20 cycles, depending on the value of the
operands. Finally, the store instruction takes 7 clocks.
Because the Intel486 processor provides higher performance not only for floating-point loads
and stores but also for floating-point compute operations, a 3x to 4x performance boost is
realized for numerics-intensive routines. A large portion of the performance improvement is
attributed to the fact that synchronous floating-point transfers occur on-chip.
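
The cycle counts quoted above can be added up for a single load, add, store sequence. The following Python sketch does only that arithmetic. The dictionary and function names are illustrative and do not come from the manual, and the simple sum assumes the three instructions execute back to back with no overlap and, on the Intel486 processor, a cache hit on the load.

```python
# Cycle ranges (min, max) as quoted in the text for one load/add/store sequence.
# The names below are illustrative; they are not defined by the manual.
I386_WITH_COPROCESSOR = {"load": (14, 20), "add": (23, 31), "store": (11, 44)}
I486 = {"load": (3, 3), "add": (8, 20), "store": (7, 7)}


def total_cycles(steps):
    """Sum the best-case and worst-case cycle counts over all steps."""
    best = sum(lo for lo, _ in steps.values())
    worst = sum(hi for _, hi in steps.values())
    return best, worst


i386_best, i386_worst = total_cycles(I386_WITH_COPROCESSOR)
i486_best, i486_worst = total_cycles(I486)

print(f"Intel386 + coprocessor: {i386_best}-{i386_worst} cycles")
print(f"Intel486:               {i486_best}-{i486_worst} cycles")
print(f"Best-case ratio:  {i386_best / i486_best:.2f}x")
print(f"Worst-case ratio: {i386_worst / i486_worst:.2f}x")
```

For this one sequence the sums are 48 to 95 cycles against 18 to 30 cycles. The 3x to 4x figure in the text refers to numerics-intensive routines as a whole, so it need not match the ratio for any single sequence.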
9.9.2 Performance of the Floating-Point Unit
To achieve three to four times the floating-point performance of a non-integrated math
coprocessor, the Intel486 processor's floating-point circuitry has been enhanced to reduce the
number of clocks needed to execute frequently used instructions. Also, the interface to the
processor's registers and buses is much more efficient because all of the interacting units are
on the same chip. Table 9-3 shows the number of clocks per instruction on the Intel486 processor.
Table 9-3. Floating-Point Instruction Execution

Instruction                 Clock Counts (Intel486™ Processor)
FLD - Load                  3
FST - Store                 3
FADD/FSUB                   8-20
FMUL - Floating multiply    16
FDIV - Floating divide      73
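
The clock counts in Table 9-3 can be summed for a rough estimate of a short instruction sequence. The Python sketch below is a back-of-the-envelope estimator, not a timing model. The instruction sequence is a hypothetical example, the function name is illustrative, and the sum assumes cache hits and no overlap between instructions.

```python
# Clock counts from Table 9-3 as (min, max) pairs for the Intel486 processor.
CLOCKS = {
    "FLD": (3, 3),
    "FST": (3, 3),
    "FADD": (8, 20),
    "FSUB": (8, 20),
    "FMUL": (16, 16),
    "FDIV": (73, 73),
}


def estimate_clocks(sequence):
    """Return (best, worst) total clocks for a list of instruction mnemonics."""
    best = sum(CLOCKS[op][0] for op in sequence)
    worst = sum(CLOCKS[op][1] for op in sequence)
    return best, worst


# Hypothetical sequence computing a*b + c and storing the result.
multiply_add = ["FLD", "FLD", "FMUL", "FLD", "FADD", "FST"]

best, worst = estimate_clocks(multiply_add)
print(f"Estimated clocks: {best}-{worst}")
```

Because FDIV costs far more clocks than the other entries, an estimate like this is dominated by the number of divides in the sequence.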