Intel ARCHITECTURE IA-32 User Manual Page 376

IA-32 Intel® Architecture Optimization
Prevent Sharing of Modified Data and False-Sharing
On an Intel Core Duo processor, sharing of modified data incurs a
performance penalty when a thread running on one core tries to read or
write data that is currently present in modified state in the first-level
cache of the other core. This causes the modified cache line to be
evicted back to memory and then read into the first-level cache of the
core that requested it. The latency of such a cache line transfer is much
higher than that of using data already in the local first-level or
second-level cache.
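One common way to reduce this kind of traffic is to keep intermediate
results in thread-local storage and write to the shared location only
once. The following is a minimal sketch of that idea, not code from this
manual; the function name sum_slice and its parameters are hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical per-thread worker. It accumulates into a local variable
 * (which the compiler can keep in a register or on the thread's own
 * stack) and stores to the shared location exactly once, so the cache
 * line holding *shared_result is modified once per call instead of once
 * per element.
 */
static void sum_slice(const int *data, size_t begin, size_t end,
                      long *shared_result)
{
    long local = 0;

    for (size_t i = begin; i < end; ++i) {
        local += data[i];
    }

    /* Single store to memory that other threads may read later. */
    *shared_result = local;
}
```

In a real multithreaded program the reader of the result would still
need to synchronize with the writer, for example by joining the worker
thread before reading.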
False sharing applies to data used by one thread that happens to reside
on the same cache line as different data used by another thread. These
situations can also incur performance delay depending on the topology
of the logical processors/cores in the platform.
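The layout difference can be sketched in C11 as follows. This is an
illustration only: the structure names are hypothetical, and the 64-byte
line size is an assumption taken from the line size quoted in this
section.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64  /* assumed cache line size in bytes */

/* Prone to false sharing: both counters fit within one 64-byte line. */
struct counters_packed {
    uint64_t thread0_count;  /* written only by thread 0 */
    uint64_t thread1_count;  /* written only by thread 1 */
};

/* Each counter starts on its own cache line boundary. */
struct counters_padded {
    alignas(CACHE_LINE_SIZE) uint64_t thread0_count;
    alignas(CACHE_LINE_SIZE) uint64_t thread1_count;
};

/*
 * Returns 1 if two byte offsets, measured from a line-aligned base
 * address, fall within the same cache line, and 0 otherwise.
 */
static int same_cache_line(size_t off_a, size_t off_b)
{
    return (off_a / CACHE_LINE_SIZE) == (off_b / CACHE_LINE_SIZE);
}
```

The padded layout trades memory (128 bytes instead of 16) for the
guarantee that a write by one thread does not invalidate the line that
holds the other thread's counter.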
An example of false sharing in a multithreaded environment using
processors based on Intel NetBurst microarchitecture is when
thread-private data and a thread synchronization variable are located
within the same line size boundary (64 bytes) or sector boundary (128 bytes).
When one thread modifies the synchronization variable, the “dirty”
cache line must be written out to memory and updated for each physical
processor sharing the bus. Subsequently, data is fetched into each target
processor 128 bytes at a time, causing previously cached data to be
evicted from its cache on each target processor.
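A possible way to avoid the situation just described is to place the
synchronization variable and each thread's private data in separate
128-byte sectors. The sketch below assumes C11 and uses hypothetical
type and field names; only the 128-byte sector size comes from the text
above.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 128  /* sector size quoted in the text, in bytes */

/* The synchronization variable occupies a whole sector by itself. */
struct sync_block {
    alignas(SECTOR_SIZE) int lock;
    char pad[SECTOR_SIZE - sizeof(int)];
};

/* Each thread's private data starts on its own sector boundary. */
struct thread_private {
    alignas(SECTOR_SIZE) long items_done;
    long scratch[4];
};

struct shared_state {
    struct sync_block     sync;
    struct thread_private per_thread[2];
};

/* Returns 1 if both addresses lie within the same 128-byte sector. */
static int shares_sector(const void *a, const void *b)
{
    return ((uintptr_t)a / SECTOR_SIZE) == ((uintptr_t)b / SECTOR_SIZE);
}
```

With this layout, a write to lock does not touch the sector that holds
either thread's private fields, and the two threads' private fields do
not share a sector with each other.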
False sharing can incur a performance penalty when the threads are
running on logical processors that reside on different physical processors.
For processors that support Hyper-Threading Technology, false sharing
incurs a performance penalty when two threads run on different cores,
on different physical processors, or on two logical processors in the
same physical processor package. In the first two cases, the performance
penalty is due to cache evictions to maintain cache coherency. In the
last case, the performance penalty is due to memory order machine clear
conditions.
False sharing is not expected to have a performance impact with a single
Intel Core Duo processor.