Intel ARCHITECTURE IA-32 User Manual Page 9

Data Alignment ........ 5-4
  Data Arrangement ........ 5-4
    Vertical versus Horizontal Computation ........ 5-5
    Data Swizzling ........ 5-9
    Data Deswizzling ........ 5-14
    Using MMX Technology Code for Copy or Shuffling Functions ........ 5-17
    Horizontal ADD Using SSE ........ 5-18
Use of cvttps2pi/cvttss2si Instructions ........ 5-21
Flush-to-Zero and Denormals-are-Zero Modes ........ 5-22
SIMD Floating-point Programming Using SSE3 ........ 5-22
  SSE3 and Complex Arithmetics ........ 5-23
  SSE3 and Horizontal Computation ........ 5-26
  SIMD Optimizations and Microarchitectures ........ 5-27
    Packed Floating-Point Performance ........ 5-27

Chapter 6 Optimizing Cache Usage
General Prefetch Coding Guidelines ........ 6-2
Hardware Prefetching of Data ........ 6-4
Prefetch and Cacheability Instructions ........ 6-5
Prefetch ........ 6-6
  Software Data Prefetch ........ 6-6
  The Prefetch Instructions – Pentium 4 Processor Implementation ........ 6-8
  Prefetch and Load Instructions ........ 6-8
Cacheability Control ........ 6-9
  The Non-temporal Store Instructions ........ 6-10
    Fencing ........ 6-10
    Streaming Non-temporal Stores ........ 6-10
    Memory Type and Non-temporal Stores ........ 6-11
    Write-Combining ........ 6-12
  Streaming Store Usage Models ........ 6-13
    Coherent Requests ........ 6-13
    Non-coherent Requests ........ 6-13
  Streaming Store Instruction Descriptions ........ 6-14
  The fence Instructions ........ 6-15
    The sfence Instruction ........ 6-15
    The lfence Instruction ........ 6-16
    The mfence Instruction ........ 6-16
  The clflush Instruction ........ 6-17
Memory Optimization Using Prefetch ........ 6-18
  Software-controlled Prefetch ........ 6-18