Intel ARCHITECTURE IA-32 User Manual Page 326

IA-32 Intel® Architecture Optimization
Figure 6-7 shows how prefetch instructions and strip-mining can be
applied to increase performance in both of these scenarios.
For Pentium 4 processors, the left scenario illustrates using
prefetchnta to prefetch data into selected ways of the second-level
cache only (SM1 denotes strip-mining one way of the second-level
cache), minimizing second-level cache pollution. Use prefetchnta if
the data is touched only once during the entire execution pass, in
order to minimize cache pollution in the higher-level caches. Provided
the prefetch was issued far enough ahead, the data is immediately
available when the read access is issued.
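The left scenario can be sketched in C roughly as follows. This is a
minimal illustration, not code from the manual: the function name
scale_then_sum, the strip size, the prefetch distance, and the 64-byte
line size are all assumptions chosen for the example, and real values
would need tuning for the target processor. The sketch uses the
_mm_prefetch intrinsic with _MM_HINT_NTA where SSE is available and
falls back to a no-op elsewhere.

```c
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
/* Non-temporal prefetch hint (prefetchnta). */
#define PREFETCH_NTA(p) _mm_prefetch((const char *)(p), _MM_HINT_NTA)
#else
/* Assumed fallback: no prefetch on targets without SSE. */
#define PREFETCH_NTA(p) ((void)(p))
#endif

enum {
    STRIP_ELEMS = 4096, /* assumed strip size in floats (16 KB) */
    PF_DIST     = 64,   /* assumed prefetch distance in floats */
    LINE_ELEMS  = 16    /* floats per assumed 64-byte cache line */
};

/* Hypothetical two-pass kernel: pass 1 scales a strip, pass 2 sums the
 * same strip while it is still cached (temporally adjacent passes). */
float scale_then_sum(float *data, size_t n, float k)
{
    float total = 0.0f;

    for (size_t base = 0; base < n; base += STRIP_ELEMS) {
        size_t end = (base + STRIP_ELEMS < n) ? base + STRIP_ELEMS : n;

        /* Pass 1: issue one prefetchnta per cache line, PF_DIST
         * elements ahead of the element being processed. */
        for (size_t i = base; i < end; i++) {
            if (i % LINE_ELEMS == 0 && i + PF_DIST < n)
                PREFETCH_NTA(&data[i + PF_DIST]);
            data[i] *= k;
        }

        /* Pass 2: reuse the strip immediately, before moving on. */
        for (size_t i = base; i < end; i++)
            total += data[i];
    }
    return total;
}
```

The strip is sized so that both passes over it complete before the
next strip is brought in. Whether the prefetch distance is "far enough
ahead" depends on memory latency and the work done per element, so
PF_DIST is best determined by measurement.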
Figure 6-7 Examples of Prefetch and Strip-mining for Temporally Adjacent and
Non-Adjacent Passes Loops
[Figure: two panels.
Left panel, temporally adjacent passes (SM1): Prefetchnta Dataset A,
Reuse Dataset A, Prefetchnta Dataset B, Reuse Dataset B.
Right panel, temporally non-adjacent passes (SM2): Prefetcht0 Dataset A,
Prefetcht0 Dataset B, Reuse Dataset A, Reuse Dataset B.]