Intel ARCHITECTURE IA-32 User Manual Page 284

IA-32 Intel® Architecture Optimization
avoided since there is a penalty associated with writing this register;
typically, through the use of the cvttps2pi and cvttss2si instructions,
the rounding control in MXCSR can always be set to round-nearest.
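The truncating conversions can be sketched in C as follows. This is a minimal illustration, not code from the manual: the function names are hypothetical, and the example assumes a compiler that provides `<xmmintrin.h>` on SSE targets. The intrinsic `_mm_cvttss_si32` maps to cvttss2si, which truncates toward zero regardless of the rounding control in MXCSR; on other targets the sketch falls back to a plain C cast, which also truncates.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
/* cvttss2si through its intrinsic: truncates toward zero, so the
   rounding control in MXCSR never has to be rewritten. */
static int to_int_trunc(float x) { return _mm_cvttss_si32(_mm_set_ss(x)); }
#else
/* Assumed fallback for non-SSE targets: a C cast also truncates. */
static int to_int_trunc(float x) { return (int)x; }
#endif

/* Hypothetical helper: convert n floats to ints by truncation. */
void floats_to_ints_trunc(const float *src, int *dst, size_t n)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; ++i)
        dst[i] = to_int_trunc(src[i]);
}
```

Because the conversion never depends on the rounding mode, the surrounding code can leave MXCSR at round-nearest and avoid the write penalty mentioned above.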
Flush-to-Zero and Denormals-are-Zero Modes
The flush-to-zero (FTZ) and denormals-are-zero (DAZ) modes are not
compatible with IEEE Standard 754. They are provided to improve
performance for applications where underflow is common and where
the generation of a denormalized result is not necessary. See
“Floating-point Modes and Exceptions” in Chapter 2.
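The effect of these modes can be observed with a small experiment. The sketch below is an illustration under stated assumptions, not code from the manual: the function name and the `HAVE_MXCSR` macro are hypothetical, the bit masks follow the documented MXCSR layout (DAZ is bit 6, FTZ is bit 15), and the processor is assumed to support DAZ. It halves the smallest normal float, which produces a denormal under IEEE 754 rules and zero when the modes are set. The macros `_MM_SET_FLUSH_ZERO_MODE` (in `<xmmintrin.h>`) and `_MM_SET_DENORMALS_ZERO_MODE` (in `<pmmintrin.h>`) offer the same control.

```c
#include <assert.h>
#include <float.h>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>   /* _mm_getcsr, _mm_setcsr */
#define HAVE_MXCSR 1
#else
#define HAVE_MXCSR 0     /* not an SSE target: FP state is left alone */
#endif

#define MXCSR_DAZ 0x0040u   /* denormals-are-zero, bit 6  */
#define MXCSR_FTZ 0x8000u   /* flush-to-zero,      bit 15 */

/* Hypothetical demo: compute FLT_MIN * 0.5 with FTZ and DAZ set
   (flush == 1) or cleared (flush == 0), then restore MXCSR. */
float half_of_flt_min(int flush)
{
    /* volatile keeps the compiler from folding the product */
    volatile float a = FLT_MIN, b = 0.5f, r;
    assert(flush == 0 || flush == 1);
#if HAVE_MXCSR
    unsigned int saved = _mm_getcsr();
    if (flush)
        _mm_setcsr(saved | MXCSR_FTZ | MXCSR_DAZ);
    else
        _mm_setcsr(saved & ~(MXCSR_FTZ | MXCSR_DAZ));
#endif
    r = a * b;            /* result is below the normal range */
#if HAVE_MXCSR
    _mm_setcsr(saved);    /* restore the caller's MXCSR */
#endif
    return r;
}
```

With the modes cleared the call returns a nonzero denormal; with them set, on an SSE target, it returns exactly zero. Note that each call writes MXCSR twice, which is the kind of cost the preceding text advises against in hot code; a real application would set the modes once at startup.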
SIMD Floating-point Programming Using SSE3
SSE3 enhances SSE and SSE2 with nine instructions targeted for SIMD
floating-point programming. In contrast to many SSE and SSE2
instructions, which offer homogeneous arithmetic operations on parallel
data elements (see Figure 5-1) and favor the vertical computation
model, SSE3 offers instructions that perform asymmetric arithmetic
operations and arithmetic operations on horizontal data elements.
ADDSUBPS and ADDSUBPD are two instructions with asymmetric
arithmetic processing capability (see Figure 5-4). HADDPS, HADDPD,
HSUBPS and HSUBPD offer horizontal arithmetic processing
capability (see Figure 5-5). In addition, MOVSLDUP, MOVSHDUP
and MOVDDUP can load data from memory (or an XMM register) and
replicate data elements at once.
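The lane-level behavior of the single-precision forms can be modeled in portable scalar C. This is a reference sketch only, with hypothetical function names; it does not execute the SSE3 instructions themselves. Element 0 is the lowest lane of the register. In real code the corresponding intrinsics from `<pmmintrin.h>` are `_mm_addsub_ps`, `_mm_hadd_ps`, `_mm_moveldup_ps` and `_mm_movehdup_ps`.

```c
#include <assert.h>
#include <stddef.h>

/* Model of ADDSUBPS: subtract in even lanes, add in odd lanes. */
void addsub_ps(const float a[4], const float b[4], float out[4])
{
    assert(a && b && out);
    for (size_t i = 0; i < 4; ++i)
        out[i] = (i % 2 == 0) ? a[i] - b[i] : a[i] + b[i];
}

/* Model of HADDPS: sum adjacent pairs; the low half of the result
   comes from a, the high half from b. */
void hadd_ps(const float a[4], const float b[4], float out[4])
{
    assert(a && b && out);
    /* Compute into a temporary so out may alias a or b. */
    float r[4] = { a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3] };
    for (size_t i = 0; i < 4; ++i)
        out[i] = r[i];
}

/* Model of MOVSLDUP: replicate the even-indexed elements. */
void movsldup_ps(const float src[4], float out[4])
{
    assert(src && out);
    float r[4] = { src[0], src[0], src[2], src[2] };
    for (size_t i = 0; i < 4; ++i)
        out[i] = r[i];
}

/* Model of MOVSHDUP: replicate the odd-indexed elements. */
void movshdup_ps(const float src[4], float out[4])
{
    assert(src && out);
    float r[4] = { src[1], src[1], src[3], src[3] };
    for (size_t i = 0; i < 4; ++i)
        out[i] = r[i];
}
```

For a = {1, 2, 3, 4} and b = {10, 20, 30, 40}, the model gives {-9, 22, -27, 44} for the asymmetric add/subtract and {3, 7, 30, 70} for the horizontal add. The double-precision forms (ADDSUBPD, HADDPD, HSUBPD, MOVDDUP) follow the same pattern on two 64-bit elements.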