IA-32 Intel® Architecture Optimization

throughput of a physical processor package. The non-halted CPI metric

can be interpreted as the inverse of the throughput of a logical

processor.
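
As a minimal sketch of the inverse relationship described above, the following Python snippet computes non-halted CPI from two counter readings and inverts it to get throughput. The function names, parameter names, and counter values are hypothetical and chosen only for illustration; how the counts are collected is outside the scope of the sketch.

```python
def non_halted_cpi(non_halted_clockticks: int, instructions_retired: int) -> float:
    """Cycles per instruction, counting only cycles in which the
    logical processor was not halted."""
    if instructions_retired <= 0:
        raise ValueError("instructions_retired must be positive")
    return non_halted_clockticks / instructions_retired


def throughput_ipc(cpi: float) -> float:
    """Throughput in instructions per non-halted cycle: the inverse of CPI."""
    if cpi <= 0:
        raise ValueError("cpi must be positive")
    return 1.0 / cpi


# Hypothetical counter readings (not measured data).
cpi = non_halted_cpi(non_halted_clockticks=2_000_000, instructions_retired=1_000_000)
print(f"non-halted CPI = {cpi}, throughput = {throughput_ipc(cpi)} instructions/cycle")
```

A CPI of 2.0 in this example corresponds to a throughput of 0.5 instructions per non-halted cycle.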

When a single thread is executing and all on-chip execution resources

are available to it, non-halted CPI can indicate the unused execution

bandwidth available in the physical processor package. If the value of a

non-halted CPI is significantly higher than unity and overall on-chip

execution resource utilization is low, a multithreaded application can

direct tuning efforts to encompass the factors discussed earlier.

An optimized single thread with exclusive use of on-chip execution

resources may exhibit a non-halted CPI in the neighborhood of unity.

Because most frequently used instructions typically decode into a single

micro-op and have throughput of no more than two cycles, an optimized

thread that retires one micro-op per cycle is only consuming about one

third of peak retirement bandwidth. Significant portions of the issue port

bandwidth are left unused. Thus, optimizing single-thread performance

is usually complementary to optimizing a multithreaded

application to take advantage of the benefits of Hyper-Threading

Technology.
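
The "one third" figure can be checked with a short calculation. The sketch below assumes a peak retirement rate of three micro-ops per cycle, which is the value implied by the one-third figure and by the 1/3 lower bound on CPI given in the footnote; the constant and function names are hypothetical.

```python
# Assumption: peak retirement bandwidth of 3 micro-ops per cycle, as implied
# by the "one third" figure in the text and the 1/3 CPI lower bound.
PEAK_RETIRE_UOPS_PER_CYCLE = 3


def retirement_utilization(uops_retired: int, non_halted_clockticks: int,
                           peak: int = PEAK_RETIRE_UOPS_PER_CYCLE) -> float:
    """Fraction of peak retirement bandwidth actually consumed."""
    if non_halted_clockticks <= 0:
        raise ValueError("non_halted_clockticks must be positive")
    return (uops_retired / non_halted_clockticks) / peak


def cpi_lower_bound(peak: int = PEAK_RETIRE_UOPS_PER_CYCLE) -> float:
    """Theoretical lower bound on CPI when every instruction is one micro-op."""
    return 1.0 / peak


# A thread retiring one micro-op per cycle (hypothetical counts).
used = retirement_utilization(uops_retired=1_000_000, non_halted_clockticks=1_000_000)
print(f"utilization = {used:.3f}, unused = {1 - used:.3f}")
print(f"CPI lower bound = {cpi_lower_bound():.3f}")
```

Under this assumption, a thread with a non-halted CPI of 1 leaves roughly two thirds of the retirement bandwidth idle, which is the headroom a second logical processor can use.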

On a processor supporting Hyper-Threading Technology, it is possible

that an execution unit with lower throughput than one issue every two

cycles may find itself in contention from two threads implemented using

a data decomposition threading model. In one scenario, this can happen

when the inner loops of both threads rely on executing a low-throughput

instruction, such as fdiv, and the execution time of the inner loop is

bound by the throughput of fdiv.
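
The contention scenario above can be illustrated with a deliberately simplified model. It assumes that the divider is the only shared resource, that other work in the loop overlaps perfectly with the divides, and that the fdiv issue interval of 30 cycles is an illustrative placeholder rather than a measured value. All names are hypothetical.

```python
def loop_cycles(fdiv_per_iter: int, fdiv_issue_interval: int,
                other_cycles: int, threads_sharing_divider: int = 1) -> int:
    """Cycles per inner-loop iteration for one thread.

    The divider accepts one fdiv every `fdiv_issue_interval` cycles in total,
    so with N threads sharing it each thread effectively sees N times the
    interval. Other work is assumed to overlap fully with the divides.
    """
    divider_cycles = threads_sharing_divider * fdiv_per_iter * fdiv_issue_interval
    return max(divider_cycles, other_cycles)


def two_thread_speedup(fdiv_per_iter: int, fdiv_issue_interval: int,
                       other_cycles: int) -> float:
    """Combined throughput of two threads relative to one thread alone."""
    single = loop_cycles(fdiv_per_iter, fdiv_issue_interval, other_cycles, 1)
    shared = loop_cycles(fdiv_per_iter, fdiv_issue_interval, other_cycles, 2)
    return 2 * single / shared


# Loop bound by fdiv throughput: the second thread adds nothing.
print(two_thread_speedup(fdiv_per_iter=1, fdiv_issue_interval=30, other_cycles=10))
# Loop dominated by other work: the divider is no longer the bottleneck.
print(two_thread_speedup(fdiv_per_iter=1, fdiv_issue_interval=30, other_cycles=80))
```

In this idealized model, two threads whose inner loops are bound by fdiv throughput gain no combined speedup, while the same threads with enough independent work per divide are not limited by the divider. Real behavior depends on other shared resources that the model ignores.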

9. Non-halted CPI can correlate to the resource utilization of an application thread, if the

application thread is affinitized to a fixed logical processor.

10. In current implementations of processors based on Intel NetBurst microarchitecture, the

theoretical lower bound for either non-halted CPI or non-sleep CPI is 1/3. Practical

applications rarely achieve any value close to the lower bound.