Intel ARCHITECTURE IA-32 User Manual Page 175

General Optimization Guidelines 2
2-103
first-level cache working set. Avoid having more than 8 cache lines that
are some multiple of 64 KB apart in the same second-level cache working
set. Avoid having a store followed by a non-dependent load with addresses
that differ by a multiple of 4 KB. 2-46
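The 4 KB condition above can be checked mechanically when laying out buffers. The following is a minimal sketch, not taken from the manual: the helper names differ_by_multiple and aliases_4k are hypothetical, and the check only compares the addresses themselves, not how the processor tracks them.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the two addresses differ by a
 * nonzero multiple of `stride` bytes, 0 otherwise. */
static int differ_by_multiple(const void *a, const void *b, size_t stride)
{
    uintptr_t x = (uintptr_t)a;
    uintptr_t y = (uintptr_t)b;
    uintptr_t d = x > y ? x - y : y - x;
    return d != 0 && d % stride == 0;
}

/* The store/load case from the text: addresses a multiple of 4 KB apart. */
static int aliases_4k(const void *store_addr, const void *load_addr)
{
    return differ_by_multiple(store_addr, load_addr, 4096);
}
```

If such a check fires for a hot store/load pair, padding one of the buffers by a cache line or so is one possible way to move the addresses off the 4 KB multiple.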
Assembly/Compiler Coding Rule 26. (M impact, L generality) If
(hopefully read-only) data must occur on the same page as code, avoid
placing it immediately after an indirect jump. For example, follow an
indirect jump with its most likely target, and place the data after an
unconditional branch. 2-47
Assembly/Compiler Coding Rule 27. (H impact, L generality) Always
put code and data on separate pages. Avoid self-modifying code wherever
possible. If code is to be modified, try to do it all at once and make sure
the code that performs the modifications and the code being modified are
on separate 4 KB pages or on separate aligned 1 KB subpages. 2-47
Assembly/Compiler Coding Rule 28. (H impact, L generality) If an
inner loop writes to more than four arrays (four distinct cache lines),
apply loop fission to break up the body of the loop such that only four
arrays are being written to in each iteration of each of the resulting loops.
2-48
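Loop fission as described in Rule 28 can be sketched in C as follows. The function names and the arithmetic are illustrative assumptions only; the point is that the fused loop writes six arrays per iteration, while each of the split loops writes at most four. The output arrays are assumed not to overlap.

```c
#include <stddef.h>

/* Before: a single loop writes to six arrays in every iteration. */
void fill_fused(size_t n, const float *src,
                float *a, float *b, float *c,
                float *d, float *e, float *f)
{
    for (size_t i = 0; i < n; i++) {
        a[i] = src[i] + 1.0f;
        b[i] = src[i] * 2.0f;
        c[i] = src[i] - 1.0f;
        d[i] = src[i] * 0.5f;
        e[i] = src[i] * src[i];
        f[i] = -src[i];
    }
}

/* After fission: the first loop writes four arrays, the second writes two. */
void fill_split(size_t n, const float *src,
                float *a, float *b, float *c,
                float *d, float *e, float *f)
{
    for (size_t i = 0; i < n; i++) {
        a[i] = src[i] + 1.0f;
        b[i] = src[i] * 2.0f;
        c[i] = src[i] - 1.0f;
        d[i] = src[i] * 0.5f;
    }
    for (size_t i = 0; i < n; i++) {
        e[i] = src[i] * src[i];
        f[i] = -src[i];
    }
}
```

Both versions compute the same results; the split version reads src twice, which is the cost traded for fewer simultaneous write streams.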
Assembly/Compiler Coding Rule 29. (M impact, H generality) All
branch targets should be 16-byte aligned. 2-57
Assembly/Compiler Coding Rule 30. (M impact, H generality) If the
body of a conditional is not likely to be executed, it should be placed in
another part of the program. If it is highly unlikely to be executed and
code locality is an issue, the body of the conditional should be placed on a
different code page. 2-57
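One way to apply Rule 30 from C is to move the rarely executed body into a separate function and mark it as cold. The sketch below assumes a GCC- or Clang-compatible compiler for __builtin_expect and the cold and noinline attributes, and falls back to plain code elsewhere. The names UNLIKELY, COLD, handle_negative and sum_nonnegative are hypothetical. Whether the cold function actually lands on a different code page depends on the compiler and linker and is not guaranteed.

```c
#include <stddef.h>

#if defined(__GNUC__)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#define COLD __attribute__((cold, noinline))
#else
#define UNLIKELY(x) (x)
#define COLD
#endif

/* Rarely executed body, kept out of line so it does not occupy space
 * in the hot loop. Illustrative policy: negative inputs are ignored. */
static COLD long handle_negative(long running_sum, int value)
{
    (void)value;
    return running_sum;
}

/* Hot path: sums the nonnegative elements of v. */
long sum_nonnegative(const int *v, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (UNLIKELY(v[i] < 0)) {
            sum = handle_negative(sum, v[i]);
            continue;
        }
        sum += v[i];
    }
    return sum;
}
```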
Assembly/Compiler Coding Rule 31. (H impact, M generality)
Minimize changes to bits 8-12 of the floating-point control word (FCW).
Changing among more than two values (each value being a combination
of these bits: precision, rounding and infinity control, and the rest of the
bits in the FCW) leads to delays that are on the order of the pipeline depth. 2-64
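To make the bit positions in Rule 31 concrete, the sketch below decodes bits 8-12 of an x87 control word value (precision control in bits 8-9, rounding control in bits 10-11, infinity control in bit 12) and counts how many distinct control word values a planned sequence of settings uses. It does not touch the hardware register. The macro names and count_distinct_fcw are hypothetical, and the example values (0x037F, 0x0F7F, 0x027F) are assumptions chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Fields of the x87 control word covered by the rule (bits 8-12). */
#define FCW_PC(fcw) (((unsigned)(fcw) >> 8) & 0x3u)   /* precision control */
#define FCW_RC(fcw) (((unsigned)(fcw) >> 10) & 0x3u)  /* rounding control  */
#define FCW_IC(fcw) (((unsigned)(fcw) >> 12) & 0x1u)  /* infinity control  */

/* Counts the distinct control word values in a planned sequence of
 * settings. A result above two means the code changes among more than
 * two values, which is the case the rule advises against. */
size_t count_distinct_fcw(const uint16_t *seq, size_t n)
{
    size_t distinct = 0;
    for (size_t i = 0; i < n; i++) {
        int seen = 0;
        for (size_t j = 0; j < i; j++) {
            if (seq[j] == seq[i]) {
                seen = 1;
                break;
            }
        }
        if (!seen)
            distinct++;
    }
    return distinct;
}
```

For instance, alternating between 0x037F (extended precision, round to nearest) and 0x0F7F (extended precision, truncate) stays at two values, while adding 0x027F (double precision) makes three.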