Intel ARCHITECTURE IA-32 User Manual Page 280

  • Download
  • Add to my manuals
  • Print
  • Page
    / 636
  • Table of contents
  • BOOKMARKS
  • Rated. / 5. Based on customer reviews
Page view 279
7-12 Vol. 3A
MULTIPLE-PROCESSOR MANAGEMENT
Program synchronization can also be carried out with serializing instructions (see Section 7.4).
These instructions are typically used at critical procedure or task boundaries to force completion
of all previous instructions before a jump to a new section of code or a context switch occurs.
Like the I/O and locking instructions, the processor waits until all previous instructions have
been completed and all buffered writes have been drained to memory before executing the seri-
alizing instruction.
The SFENCE, LFENCE, and MFENCE instructions provide a performance-efficient way of
insuring load and store memory ordering between routines that produce weakly-ordered results
and routines that consume that data. The functions of these instructions are as follows:
SFENCE — Serializes all store (write) operations that occurred prior to the SFENCE
instruction in the program instruction stream, but does not affect load operations.
LFENCE — Serializes all load (read) operations that occurred prior to the LFENCE
instruction in the program instruction stream, but does not affect store operations.
MFENCE — Serializes all store and load operations that occurred prior to the MFENCE
instruction in the program instruction stream.
Note that the SFENCE, LFENCE, and MFENCE instructions provide a more efficient method
of controlling memory ordering than the CPUID instruction.
The MTRRs were introduced in the P6 family processors to define the cache characteristics for
specified areas of physical memory. The following are two examples of how memory types set
up with MTRRs can be used strengthen or weaken memory ordering for the Pentium 4, Intel
Xeon, and P6 family processors:
The strong uncached (UC) memory type forces a strong-ordering model on memory
accesses. Here, all reads and writes to the UC memory region appear on the bus and out-of-
order or speculative accesses are not performed. This memory type can be applied to an
address range dedicated to memory mapped I/O devices to force strong memory ordering.
For areas of memory where weak ordering is acceptable, the write back (WB) memory
type can be chosen. Here, reads can be performed speculatively and writes can be buffered
and combined. For this type of memory, cache locking is performed on atomic (locked)
operations that do not split across cache lines, which helps to reduce the performance
penalty associated with the use of the typical synchronization instructions, such as XCHG,
that lock the bus during the entire read-modify-write operation. With the WB memory
type, the XCHG instruction locks the cache instead of the bus if the memory access is
contained within a cache line.
The PAT was introduced in the Pentium III processor to enhance the caching characteristics that
can be assigned to pages or groups of pages. The PAT mechanism typically used to strengthen
caching characteristics at the page level with respect to the caching characteristics established
by the MTRRs. Table 10-7 shows the interaction of the PAT with the MTRRs.
Page view 279
1 2 ... 275 276 277 278 279 280 281 282 283 284 285 ... 635 636

Comments to this Manuals

No comments