Intel ARCHITECTURE IA-32 User Manual Page 125

General Optimization Guidelines 2
2-53
User/Source Coding Rule 8. (H impact, H generality) To amortize bus latency
effectively, software should favor data access patterns that produce higher
concentrations of cache misses, with cache miss strides that are
significantly smaller than half of the hardware prefetch trigger threshold.
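
As a rough illustration of the rule, the sketch below walks the same matrix
in two orders. The function names and the flat row-major layout are
assumptions made for this example, not part of the manual. The row-major walk
touches addresses sizeof(double) bytes apart, so successive cache misses fall
on adjacent lines; the column-major walk jumps cols * sizeof(double) bytes
between accesses. Whether a given stride falls below half of the prefetch
trigger threshold depends on the processor, since this passage does not state
the threshold.

```c
#include <stddef.h>

/* Row-major walk over a matrix stored row by row (assumed layout).
   Consecutive accesses are sizeof(double) bytes apart, so cache misses
   are concentrated on neighboring cache lines. */
double sum_row_major(const double *m, size_t rows, size_t cols) {
    double s = 0.0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            s += m[r * cols + c];
    return s;
}

/* Column-major walk over the same storage.
   Consecutive accesses are cols * sizeof(double) bytes apart, so each
   access may miss on a different, distant cache line. */
double sum_col_major(const double *m, size_t rows, size_t cols) {
    double s = 0.0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            s += m[r * cols + c];
    return s;
}
```

Both functions compute the same sum; only the order of memory accesses, and
therefore the cache miss stride, differs.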
Non-Temporal Store Bus Traffic
Peak system bus bandwidth is shared by several types of bus activity,
including reads (from memory), reads for ownership (of a cache line),
and writes. The data transfer rate for bus write transactions is higher if
64 bytes are written out to the bus at a time.
Typically, bus writes to Writeback (WB) type memory must share the
system bus bandwidth with read-for-ownership (RFO) traffic.
Non-temporal stores do not require RFO traffic, but they do require care in
managing the access patterns to ensure that 64 bytes are evicted at
once (rather than as several 8-byte chunks).
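
One way to arrange such an access pattern is sketched below, assuming SSE2
intrinsics are available. The helper names alloc_lines and fill_lines_nt and
the LINE_BYTES constant are hypothetical and chosen for this example. The
loop issues four consecutive 16-byte streaming stores per 64-byte line, so
each line is written completely before the next one is started. The code
cannot guarantee that the hardware evicts each line in a single bus write; it
only avoids leaving lines partially written. Without SSE2 the sketch falls
back to ordinary stores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_set1_epi64x, _mm_stream_si128, _mm_sfence */
#endif

#define LINE_BYTES 64   /* line size assumed from the text above */

/* Hypothetical helper: allocate `lines` cache lines, 64-byte aligned. */
void *alloc_lines(size_t lines) {
    return aligned_alloc(LINE_BYTES, lines * LINE_BYTES);
}

/* Fill `bytes` bytes at `dst` with `value`, one full line at a time.
   `dst` must be 64-byte aligned and `bytes` a multiple of 64. */
void fill_lines_nt(void *dst, uint64_t value, size_t bytes) {
    assert((uintptr_t)dst % LINE_BYTES == 0);
    assert(bytes % LINE_BYTES == 0);
#if defined(__SSE2__)
    __m128i v = _mm_set1_epi64x((long long)value);
    __m128i *p = (__m128i *)dst;
    for (size_t i = 0; i < bytes / LINE_BYTES; i++, p += 4) {
        /* Four back-to-back 16-byte non-temporal stores cover one line. */
        _mm_stream_si128(p + 0, v);
        _mm_stream_si128(p + 1, v);
        _mm_stream_si128(p + 2, v);
        _mm_stream_si128(p + 3, v);
    }
    _mm_sfence();  /* order the streaming stores before later stores */
#else
    /* Fallback without SSE2: ordinary (temporal) stores. */
    uint64_t *p = (uint64_t *)dst;
    for (size_t i = 0; i < bytes / sizeof(uint64_t); i++)
        p[i] = value;
#endif
}
```

A caller would typically obtain a buffer with alloc_lines(n), pass it to
fill_lines_nt with a size of n * LINE_BYTES, and release it with free.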
Although full 64-byte bus writes due to non-temporal stores have data
bandwidth that is twice that of bus writes to WB memory, transferring
8-byte chunks wastes bus request bandwidth and delivers significantly
lower data bandwidth.
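
The cost in bus requests can be estimated with simple arithmetic. The sketch
below uses a deliberately simplified model, assumed for illustration only, in
which every write transaction carries a fixed number of bytes; the function
name write_transactions is hypothetical.

```c
#include <stddef.h>

/* Number of write transactions needed to move `bytes` bytes when each
   transaction carries `chunk` bytes (rounded up). Simplified model. */
size_t write_transactions(size_t bytes, size_t chunk) {
    return (bytes + chunk - 1) / chunk;
}
```

Under this model a 4096-byte buffer needs 64 transactions when written as
full 64-byte lines and 512 transactions when written as 8-byte chunks, that
is, eight times as many bus requests for the same data.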