Intel ARCHITECTURE IA-32 User Manual Page 535

IA-32 Instruction Latency and Throughput (Appendix C)
For simplicity, all requested data is assumed to reside in the
first-level data cache (a cache hit). In general, IA-32 instructions
with load operations that execute in the integer ALU units require two
more clock cycles of latency than the corresponding register-to-register
form of the same instruction. The throughput of these instructions with
a load operation is the same as that of the register-to-register form.
Floating-point, MMX technology, Streaming SIMD Extensions (SSE), and
Streaming SIMD Extensions 2 (SSE2) instructions with load operations
have 6 more clock cycles of latency than the register-only versions of
the instructions, but their throughput remains the same.
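
The two rules above amount to simple arithmetic, sketched below as a
rough estimate only. The category names, function names, and the sample
register-only latency are hypothetical and are not taken from the
manual's tables; the sketch also assumes a first-level data cache hit,
as the text does.

```python
# Extra latency (in clock cycles) that a load operand adds, per the text.
# Assumes a first-level data cache hit. Category names are illustrative.
LOAD_LATENCY_PENALTY = {
    "integer_alu": 2,       # integer ALU instructions with a load
    "fp_mmx_sse_sse2": 6,   # floating-point, MMX, SSE, SSE2 with a load
}


def latency_with_load(reg_latency, category):
    """Estimate the load-form latency from the register-only latency."""
    return reg_latency + LOAD_LATENCY_PENALTY[category]


def throughput_with_load(reg_throughput):
    """The text states throughput is unchanged by the load operand."""
    return reg_throughput


# Hypothetical register-only latency of 4 cycles (not a value from the manual).
print(latency_with_load(4, "integer_alu"))      # 6
print(latency_with_load(4, "fp_mmx_sse_sse2"))  # 10
```

Because only latency changes, the adjustment matters mainly when the
instruction sits on a dependency chain; independent instructions remain
bounded by the unchanged throughput.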
When store operations are on the critical path, their results can generally
be forwarded to a dependent load in as few as zero cycles. Thus, the
latency to complete the store isn't relevant here.