Intel ARCHITECTURE IA-32 User Manual download pdf (Page 383)


Multi-Core and Hyper-Threading Technology 7

7-37

latency of scattered memory reads can be improved by issuing multiple

memory reads back-to-back to overlap multiple outstanding memory

read transactions. The average latency of back-to-back bus reads is

likely to be lower than the average latency of scattered reads

interspersed with other bus transactions. This is because only the first

memory read needs to wait for the full delay of a cache miss.

User/Source Coding Rule 29. (M impact, M generality) Consider issuing

multiple memory reads back-to-back so that they overlap, to improve effective

cache miss latencies.
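
A minimal sketch of the idea behind this rule, in C. The function name gather_sum_overlapped and its parameters are hypothetical and do not come from the manual. The loop requests four independent scattered reads before it uses any of the results, so that cache misses have a chance to be outstanding at the same time. A compiler or an out-of-order core may already reorder loads in a similar way, so treat this as an illustration rather than a guaranteed speedup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sums table[idx[i]] for i in [0, n).
 * The indices may point anywhere in the table (scattered reads). */
uint64_t gather_sum_overlapped(const uint32_t *table, const size_t *idx, size_t n)
{
    assert(n == 0 || (table != NULL && idx != NULL));

    uint64_t sum = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        /* Four independent loads requested back-to-back; none depends
         * on the result of another, so their misses can overlap. */
        uint32_t a = table[idx[i]];
        uint32_t b = table[idx[i + 1]];
        uint32_t c = table[idx[i + 2]];
        uint32_t d = table[idx[i + 3]];

        /* The loaded values are consumed only after all four reads
         * have been issued. */
        sum += (uint64_t)a + b + c + d;
    }

    /* Remaining elements when n is not a multiple of four. */
    for (; i < n; i++)
        sum += table[idx[i]];

    return sum;
}
```

By contrast, a loop in which each read address depends on the previous read (pointer chasing, for example) cannot overlap its misses in this way.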

Another technique to reduce effective memory latency is possible if one

can adjust the data access pattern such that the access strides causing

successive cache misses in the last-level cache are predominantly less

than the trigger threshold distance of the automatic hardware prefetcher.

See “Example of Effective Latency Reduction with H/W Prefetch” in

Chapter 6.

User/Source Coding Rule 30. (M impact, M generality) Consider adjusting

the sequencing of memory references such that the distribution of distances of

successive cache misses of the last-level cache peaks towards 64 bytes.
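
One common way to move the miss distance toward 64 bytes is to visit data in ascending address order. The sketch below is an illustration only: the function names sum_strided and sum_sequential are hypothetical, and the matrix is assumed to be stored row by row in one contiguous array. Both functions compute the same sum, but they produce very different distances between successive accesses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Column-first walk over a row-major matrix: consecutive accesses are
 * cols * sizeof(int32_t) bytes apart, so for wide rows each access can
 * touch a different, distant cache line. */
int64_t sum_strided(const int32_t *m, size_t rows, size_t cols)
{
    assert(m != NULL || rows * cols == 0);
    int64_t sum = 0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            sum += m[r * cols + c];
    return sum;
}

/* Row-first walk: addresses increase by sizeof(int32_t) per access, so
 * when misses occur they fall on adjacent lines (64 bytes apart,
 * assuming 64-byte cache lines). */
int64_t sum_sequential(const int32_t *m, size_t rows, size_t cols)
{
    assert(m != NULL || rows * cols == 0);
    int64_t sum = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            sum += m[r * cols + c];
    return sum;
}
```

Whether the hardware prefetcher actually engages depends on the processor and on its trigger threshold, which this sketch does not model.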

Use Full Write Transactions to Achieve Higher Data Rate

Write transactions across the bus can result in writes to physical memory

either using the full line size of 64 bytes or less than the full line size.

The latter is referred to as a partial write. Typically, writes to writeback

(WB) memory addresses are full-size and writes to write-combine (WC)

or uncacheable (UC) type memory addresses result in partial writes.

Both cached WB store operations and WC store operations utilize a set

of six WC buffers (64 bytes wide) to manage the traffic of write

transactions. When competing traffic closes a WC buffer before all

writes to the buffer are finished, this results in a series of 8-byte partial

bus transactions rather than a single 64-byte write transaction.

User/Source Coding Rule 31. (M impact, M generality) Use full write

transactions to achieve higher data throughput.
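
As a hedged illustration of this rule, the sketch below writes every byte of a 64-byte line with consecutive stores before moving on to the next line, and does not interleave stores to other destinations inside the loop. It assumes SSE2 non-temporal stores (the _mm_stream_si128 intrinsic) as one way to route stores through the WC buffers; the function name fill_full_lines and the LINE_BYTES constant are hypothetical. On builds without SSE2 the code falls back to ordinary stores in ascending address order. Whether a buffer is actually flushed as a single 64-byte transaction still depends on competing traffic and on the processor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <xmmintrin.h>  /* _mm_sfence */
#include <emmintrin.h>  /* _mm_set1_epi32, _mm_stream_si128 */
#endif

#define LINE_BYTES 64   /* full line size stated in the text */

/* Hypothetical helper: fills `lines` consecutive 64-byte lines starting
 * at dst with the 32-bit pattern `value`. dst must be 64-byte aligned. */
void fill_full_lines(void *dst, size_t lines, uint32_t value)
{
    assert(((uintptr_t)dst % LINE_BYTES) == 0);

#if defined(__SSE2__)
    const __m128i v = _mm_set1_epi32((int)value);
    char *p = (char *)dst;

    for (size_t i = 0; i < lines; i++, p += LINE_BYTES) {
        /* Four 16-byte stores cover the whole line back-to-back, with
         * no stores to other lines in between. */
        _mm_stream_si128((__m128i *)(p + 0), v);
        _mm_stream_si128((__m128i *)(p + 16), v);
        _mm_stream_si128((__m128i *)(p + 32), v);
        _mm_stream_si128((__m128i *)(p + 48), v);
    }
    /* Make the non-temporal stores globally visible before returning. */
    _mm_sfence();
#else
    /* Portable fallback: ordinary stores in ascending address order. */
    uint32_t *q = (uint32_t *)dst;
    for (size_t i = 0; i < lines * (LINE_BYTES / sizeof(uint32_t)); i++)
        q[i] = value;
#endif
}
```

A caller would pass a buffer aligned to 64 bytes whose size is a multiple of 64, for example one obtained from C11 aligned_alloc(64, size).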
