IA-32 Intel® Architecture Optimization

but is somewhat inefficient as there is the overhead of extra instructions during computation. Performing the swizzle statically, when the data structures are being laid out, is best as there is no runtime overhead.
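
To make the runtime cost concrete, the following is a minimal sketch of a dynamic swizzle in plain C, assuming a three-component vertex. The names VertexAoS and swizzle_aos_to_soa are hypothetical and do not come from the manual. Every vertex costs three extra loads and stores before any computation starts, which is the overhead a static SoA layout avoids.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical AoS vertex: the components of one vertex sit together. */
typedef struct {
    float x, y, z;
} VertexAoS;

/* Dynamic swizzle: copy n AoS vertices into three separate SoA arrays.
   This copy loop is pure overhead that runs before the real computation. */
void swizzle_aos_to_soa(const VertexAoS *in, size_t n,
                        float *x, float *y, float *z)
{
    assert(in != NULL && x != NULL && y != NULL && z != NULL);
    for (size_t i = 0; i < n; ++i) {
        x[i] = in[i].x;
        y[i] = in[i].y;
        z[i] = in[i].z;
    }
}
```

If the data were laid out as separate x, y, and z arrays from the start, this function would not be needed at all.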

As mentioned earlier, the SoA arrangement allows more efficient use of the parallelism of the SIMD technologies because the data is ready for computation in a more optimal vertical manner: multiplying components x0, x1, x2, x3 by xF, xF, xF, xF using 4 SIMD execution slots to produce 4 unique results. In contrast, computing directly on AoS data can lead to horizontal operations that consume SIMD execution slots but produce only a single scalar result as shown by the many “don’t-care” (DC) slots in Example 3-16.
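
The difference between vertical and horizontal computation can be sketched in scalar C, where each loop over LANES stands in for one 4-wide SIMD instruction. This is an illustrative model only: the names LANES, mul_vertical, and dot_horizontal are hypothetical, and the horizontal case assumes a dot-product-style computation on a single vertex.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4  /* assumed SIMD width: four single-precision slots */

/* Vertical (SoA): x0..x3 come from four different vertices and are each
   multiplied by the same xF. All four slots yield a useful result. */
void mul_vertical(const float x[LANES], float xF, float out[LANES])
{
    assert(x != NULL && out != NULL);
    for (int i = 0; i < LANES; ++i)
        out[i] = x[i] * xF;
}

/* Horizontal (AoS): the four slots hold the components of one vertex.
   After the multiply, the slots must be summed across the register,
   so four slots of work collapse into a single scalar. */
float dot_horizontal(const float v[LANES], const float f[LANES])
{
    assert(v != NULL && f != NULL);
    float prod[LANES];
    for (int i = 0; i < LANES; ++i)
        prod[i] = v[i] * f[i];
    return prod[0] + prod[1] + prod[2] + prod[3];
}
```

In the vertical case one multiply step yields four independent results. In the horizontal case the same multiply is followed by a cross-slot reduction and yields one.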

Use of the SoA format for data structures can also lead to more efficient use of caches and bandwidth. When the elements of the structure are not accessed with equal frequency, such as when elements x, y, z are accessed ten times more often than the other entries, then SoA not only saves memory, but it also prevents fetching unnecessary data items a, b, and

Example 3-17 Hybrid SoA Data Structure

NumOfGroups = NumOfVertices/SIMDwidth

typedef struct{
    float x[SIMDwidth];
    float y[SIMDwidth];
    float z[SIMDwidth];
} VerticesCoordList;

typedef struct{
    int a[SIMDwidth];
    int b[SIMDwidth];
    int c[SIMDwidth];
    . . .
} VerticesColorList;

VerticesCoordList VerticesCoord[NumOfGroups];
VerticesColorList VerticesColor[NumOfGroups];
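
As a sketch of how such a hybrid layout might be used, the code below fixes SIMDwidth at 4 and NumOfVertices at 16. Both values are assumptions for illustration, and the helpers vertex_x and scale_x are hypothetical. The color structure keeps only the a, b, c fields shown in the example, since the remaining fields are elided in the source. Vertex i lives in group i / SIMDwidth at lane i % SIMDwidth, and a pass over the coordinates never touches the color array.

```c
#include <assert.h>
#include <stddef.h>

#define SIMDwidth     4    /* assumed: four 32-bit slots per SIMD register */
#define NumOfVertices 16   /* assumed example size, a multiple of SIMDwidth */
#define NumOfGroups   (NumOfVertices / SIMDwidth)

typedef struct {
    float x[SIMDwidth];
    float y[SIMDwidth];
    float z[SIMDwidth];
} VerticesCoordList;

typedef struct {
    int a[SIMDwidth];
    int b[SIMDwidth];
    int c[SIMDwidth];
    /* further fields are elided in the source example */
} VerticesColorList;

VerticesCoordList VerticesCoord[NumOfGroups];
VerticesColorList VerticesColor[NumOfGroups];

/* Hypothetical helper: address of the x coordinate of vertex i. */
float *vertex_x(size_t i)
{
    assert(i < NumOfVertices);
    return &VerticesCoord[i / SIMDwidth].x[i % SIMDwidth];
}

/* Hypothetical pass over the frequently used data: scale every x.
   Only VerticesCoord is read and written, so no color data is fetched. */
void scale_x(float s)
{
    for (size_t g = 0; g < NumOfGroups; ++g)
        for (size_t lane = 0; lane < SIMDwidth; ++lane)
            VerticesCoord[g].x[lane] *= s;
}
```

Each inner loop covers SIMDwidth contiguous floats, which is the shape a single SIMD load, multiply, and store would operate on.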
