Intel Architecture IA-32 User Manual
Multi-Core and Hyper-Threading Technology
On Hyper-Threading-Technology-enabled processors, excessive loop
unrolling is likely to reduce the Trace Cache’s ability to deliver high
bandwidth μop streams to the execution engine.
Optimization for Code Size
When the Trace Cache is continuously and repeatedly delivering μop
traces that are pre-built, the scheduler in the execution engine can
dispatch μops for execution at a high rate and maximize the utilization
of available execution resources. Optimizing application code size by
organizing code sequences that are repeatedly executed into sections,
each with a footprint that can fit into the Trace Cache, can improve
application performance greatly.
On Hyper-Threading-Technology-enabled processors, multithreaded
applications should improve the code locality of frequently executed
sections and, when optimizing for code size, target one half of the
Trace Cache for each application thread. If code size becomes an issue
that affects front-end efficiency, this can be detected by evaluating
the performance metrics discussed in the previous sub-section on loop
unrolling.
User/Source Coding Rule 38. (L impact, L generality) Optimize code size to
improve the locality of the Trace Cache and increase the delivered trace length.
Using Thread Affinities to Manage Shared Platform
Resources
Each logical processor in an MP system has a unique initial APIC_ID,
which can be queried using CPUID. Resources shared by more than one
logical processor in a multi-threading platform can be mapped into a
three-level hierarchy for a non-clustered MP system. Each of the three
levels can be identified by a label, which can be extracted from the