Intel ARCHITECTURE IA-32 User Manual Page 77

General Optimization Guidelines 2
Optimize Branch Predictability
  • Improve branch predictability and optimize instruction prefetching
    by arranging code to be consistent with the static branch prediction
    assumption: backward taken and forward not taken.
  • Avoid mixing near calls, far calls and returns.
  • Avoid implementing a call by pushing the return address and
    jumping to the target. The hardware can pair up call and return
    instructions to enhance predictability.
  • Use the pause instruction in spin-wait loops.
  • Inline functions according to coding recommendations.
  • Whenever possible, eliminate branches.
  • Avoid indirect calls.
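
As an illustration of the "eliminate branches" guideline, the following is a minimal sketch in C, not code taken from the manual. The function names count_above_branchy and count_above_branchless are hypothetical. The second version adds the 0-or-1 result of the comparison instead of jumping around an increment; whether the compiler actually emits branch-free machine code (for example setcc or cmov) depends on the compiler and its settings, so this should be checked in the generated assembly.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: count the elements greater than a threshold.
 * Branchy version: the if-statement is a data-dependent conditional
 * branch, which is hard to predict when the data is irregular. */
size_t count_above_branchy(const int32_t *a, size_t n, int32_t threshold)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (a[i] > threshold)
            count++;
    }
    return count;
}

/* Branch-free version: in C a comparison evaluates to 0 or 1, so the
 * result can be added directly. The only branch left in the source is
 * the loop's backward branch, which matches the static prediction
 * assumption (backward taken). */
size_t count_above_branchless(const int32_t *a, size_t n, int32_t threshold)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count += (size_t)(a[i] > threshold);
    return count;
}
```

Both functions return the same result for any input; the difference, if any, is in how predictable the generated control flow is.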
Optimize Memory Access
  • Observe store-forwarding constraints.
  • Ensure proper data alignment to prevent data from being split
    across a cache line boundary. This includes the stack and passed
    parameters.
  • Avoid mixing code and data (self-modifying code).
  • Choose data types carefully (see next bullet below) and avoid type
    casting.
  • Employ data structure layout optimization to ensure efficient use of
    the 64-byte cache line size.
  • Favor parallel data accesses, which mask latency, over dependent
    data accesses, which expose latency.
  • For cache-miss data traffic, favor smaller cache-miss strides to
    avoid frequent DTLB misses.
  • Use prefetching appropriately.
  • Use the following techniques to enhance locality: blocking,
    hardware-friendly tiling, loop interchange, loop skewing.
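
To make the blocking technique concrete, here is a minimal sketch in C of a blocked matrix transpose, offered as one possible illustration rather than the manual's own example. The names transpose_naive, transpose_blocked and TILE are hypothetical. The tile edge of 16 assumes 4-byte floats, so that one tile row fills the 64-byte cache line mentioned above; a suitable tile size for a real workload would need to be measured.

```c
#include <stddef.h>

/* Hypothetical tile edge: 16 floats * 4 bytes = 64 bytes, assuming a
 * 4-byte float and the 64-byte cache line size named in the text. */
#define TILE 16

/* Naive transpose of an n x n row-major matrix. Consecutive stores to
 * dst are n elements apart, so each store lands in a different cache
 * line once n is large. */
void transpose_naive(float *dst, const float *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            dst[j * n + i] = src[i * n + j];
}

/* Blocked transpose: processes one TILE x TILE sub-block at a time so
 * the cache lines touched in src and dst are reused before moving on.
 * The i_end/j_end bounds handle n that is not a multiple of TILE. */
void transpose_blocked(float *dst, const float *src, size_t n)
{
    for (size_t ii = 0; ii < n; ii += TILE) {
        size_t i_end = (ii + TILE < n) ? ii + TILE : n;
        for (size_t jj = 0; jj < n; jj += TILE) {
            size_t j_end = (jj + TILE < n) ? jj + TILE : n;
            for (size_t i = ii; i < i_end; i++)
                for (size_t j = jj; j < j_end; j++)
                    dst[j * n + i] = src[i * n + j];
        }
    }
}
```

Both routines produce the same output; only the order in which elements are visited changes, which is what blocking relies on to improve locality.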