Intel ARCHITECTURE IA-32 User Manual download pdf (Page 201)


Coding for SIMD Architectures 3


By adding the padding variable pad, the structure is now 8 bytes, and if

the first element is aligned to 8 bytes (64 bits), all following elements

will also be aligned. The sample declaration follows:

typedef struct { short x, y, z; char a; char pad; } Point;

Point pt[N];
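The size and alignment claim can be checked with a minimal C11 sketch. The value of N, the helper all_points_aligned, and the use of _Alignas to place the first element on an 8-byte boundary are assumptions made for illustration; the original declaration does not say how the array is aligned. The sketch also assumes a 2-byte short, as on IA-32.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t, offsetof */
#include <stdint.h>   /* uintptr_t */

#define N 16  /* hypothetical element count, chosen only for this sketch */

typedef struct { short x, y, z; char a; char pad; } Point;

/* With a 2-byte short: 3*2 + 1 + 1 = 8 bytes, no compiler-inserted padding. */
static_assert(sizeof(Point) == 8, "Point is expected to occupy 8 bytes");

/* Assumption: _Alignas(8) pins the first element to an 8-byte boundary. */
static _Alignas(8) Point pt[N];

/* Hypothetical helper: returns 1 if every element of pt starts on an
   8-byte boundary, 0 otherwise. */
int all_points_aligned(void)
{
    for (size_t i = 0; i < N; i++) {
        if ((uintptr_t)&pt[i] % 8 != 0)
            return 0;
    }
    return 1;
}
```

Because the element size equals the alignment of the first element, every later element lands on a multiple of 8 bytes as well, which is what the helper verifies.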

Using Arrays to Make Data Contiguous

In the following code,

for (i=0; i<N; i++) pt[i].y *= scale;

the y component of each point is multiplied by a scaling value. Here the for loop accesses only the y member of each element of the array pt, so the data it touches is not contiguous. This can degrade application performance by increasing cache misses, by making poor use of each cache line that is fetched, and by increasing the chance of accesses that span multiple cache lines.

The following declaration allows you to vectorize the scaling operation

and further improve the alignment of the data access patterns:

short ptx[N], pty[N], ptz[N];

for (i=0; i<N; i++) pty[i] *= scale;
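The difference between the two layouts can be sketched in plain C as follows. The function names scale_y_aos, scale_y_soa, y_stride_aos, and y_stride_soa are hypothetical and introduced only for illustration. The sketch does not use SIMD instructions itself; it only shows that both loops compute the same result while the distance between consecutive y values shrinks from sizeof(Point) to sizeof(short).

```c
#include <stddef.h>   /* size_t */

typedef struct { short x, y, z; char a; char pad; } Point;

/* Array of structures: consecutive y values are sizeof(Point) bytes apart,
   so each fetched cache line carries x, z, a, and pad along with y. */
void scale_y_aos(Point *pt, size_t n, short scale)
{
    for (size_t i = 0; i < n; i++)
        pt[i].y *= scale;
}

/* Separate arrays: consecutive y values are adjacent in memory, so every
   byte of a fetched cache line is useful to the loop. */
void scale_y_soa(short *pty, size_t n, short scale)
{
    for (size_t i = 0; i < n; i++)
        pty[i] *= scale;
}

/* Byte distance between consecutive y values in each layout. */
size_t y_stride_aos(void) { return sizeof(Point); }
size_t y_stride_soa(void) { return sizeof(short); }
```

With a unit-stride array such as pty, a packed 16-bit multiply can process several elements per instruction, which is the vectorization opportunity the separate-array declaration exposes.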

With SIMD technology, the choice of data organization becomes more important and should be made carefully, based on the operations that will be performed on the data. In some applications, traditional data arrangements may not lead to maximum performance.

A simple example of this is an FIR filter. An FIR filter is effectively a vector dot product whose length is the number of coefficient taps.

Consider the following code:

(data[j]*coeff[0] + data[j+1]*coeff[1] + ... + data[j+num_of_taps-1]*coeff[num_of_taps-1]),

If, in the code above, the filter operation of data element i is the vector dot product that begins at data element j, then the filter operation of data element i+1 begins at data element j+1.
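The sliding dot product described above can be written out as a minimal scalar sketch. The function name fir_filter, the short element type, the int accumulator, and the choice j = i (output i starts at data element i) are assumptions made for illustration; the original text does not fix any of them.

```c
#include <stddef.h>   /* size_t */

/* Hypothetical scalar FIR filter:
   out[i] = data[i]*coeff[0] + data[i+1]*coeff[1] + ...
            + data[i+num_of_taps-1]*coeff[num_of_taps-1].
   data must hold at least n_out + num_of_taps - 1 elements. */
void fir_filter(const short *data, const short *coeff, size_t num_of_taps,
                int *out, size_t n_out)
{
    for (size_t i = 0; i < n_out; i++) {
        int acc = 0;  /* wider accumulator for the products of 16-bit values */
        for (size_t k = 0; k < num_of_taps; k++)
            acc += data[i + k] * coeff[k];
        out[i] = acc;
    }
}
```

Each output starts its dot product one element later than the previous one, so consecutive outputs read overlapping windows of data that begin at different alignments.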
