IA-32 Intel® Architecture Optimization


The code above converts values to unsigned numbers first and then clips them to an unsigned range. The last instruction converts the data back to signed data and places the data within the signed range. Conversion to unsigned data is required for correct results when (high - low) < 0x8000.

If (high - low) >= 0x8000, the algorithm can be simplified as shown in Example 4-21.

This algorithm saves a cycle when it is known that (high - low) >= 0x8000. The three-instruction algorithm does not work when (high - low) < 0x8000, because 0xffff minus any number < 0x8000 yields a number of at least 0x8000, which is a negative number when interpreted as a signed word. When the second instruction, psubssw MM0, (0xffff - high + low), in the three-step algorithm (Example 4-21) is executed, a negative number is subtracted. This subtraction increases the values in MM0 instead of decreasing them, and an incorrect answer is generated.
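
The sign problem described above can be checked with a small scalar model. The following C sketch is illustrative only and is not part of the manual: the helper names as_signed_word, sub_constant, and subs16 are hypothetical, and subs16 assumes that psubssw acts as a signed saturating subtract on each 16-bit lane.

```c
#include <stdint.h>

/* Reinterpret the low 16 bits of v as a signed word,
 * the way a packed-word lane would see it. */
static int16_t as_signed_word(uint32_t v) {
    v &= 0xFFFFu;
    return (v & 0x8000u) ? (int16_t)((int32_t)v - 0x10000) : (int16_t)v;
}

/* Constant used by the second instruction, (0xffff - high + low),
 * viewed as a signed word. */
static int16_t sub_constant(int16_t low, int16_t high) {
    return as_signed_word(0xFFFFu - (uint32_t)(uint16_t)high
                                  + (uint32_t)(uint16_t)low);
}

/* One lane of a signed saturating subtract (assumed psubssw behavior). */
static int16_t subs16(int16_t a, int16_t b) {
    int32_t d = (int32_t)a - (int32_t)b;
    if (d > INT16_MAX) d = INT16_MAX;
    if (d < INT16_MIN) d = INT16_MIN;
    return (int16_t)d;
}
```

Under these assumptions, low = -100 and high = 100 give a constant of -201, so subs16(1000, sub_constant(-100, 100)) returns 1201: the value grows instead of shrinking. With low = -20000 and high = 20000 the constant is 25535, and the same subtraction decreases the value as intended.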

Clipping to an Arbitrary Unsigned Range [high, low]

Example 4-22 clips an unsigned value to the unsigned range [high, low]. If the value is less than low or greater than high, then it is clipped to low or high, respectively. This technique uses the packed-add and

Example 4-21 Simplified Clipping to an Arbitrary Signed Range

; Input: MM0 signed source operands
; Output: MM0 signed operands clipped to the signed
; range [high, low]
paddssw MM0, (packed_max - packed_high)
; in effect this clips to high
psubssw MM0, (packed_usmax - packed_high + packed_low)
; clips to low
paddw MM0, low ; undo the previous two offsets
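
The three steps of Example 4-21 can be modeled on a single 16-bit lane. The C sketch below is a hedged illustration, not the manual's code. It assumes packed_max is 0x7fff and packed_usmax is 0xffff in every lane, and that paddssw and psubssw saturate to the signed word range while paddw wraps. The function and helper names are hypothetical. The final addend is computed as the value that cancels the first two offsets modulo 2^16, which is one reading of the comment "undo the previous two offsets"; in this model it works out to low + 0x8000.

```c
#include <stdint.h>

/* Reinterpret the low 16 bits of v as a signed word (wrapping). */
static int16_t wrap16(uint32_t v) {
    v &= 0xFFFFu;
    return (v & 0x8000u) ? (int16_t)((int32_t)v - 0x10000) : (int16_t)v;
}

/* Saturate a 32-bit intermediate result to the signed word range. */
static int16_t sat16(int32_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Scalar model of the three-instruction clip.
 * Intended only for (high - low) >= 0x8000. */
static int16_t clip_signed_wide(int16_t x, int16_t low, int16_t high) {
    /* packed_max - packed_high */
    int16_t c1 = wrap16(0x7FFFu - (uint32_t)(uint16_t)high);
    /* packed_usmax - packed_high + packed_low */
    int16_t c2 = wrap16(0xFFFFu - (uint32_t)(uint16_t)high
                                + (uint32_t)(uint16_t)low);
    /* Assumed final addend: cancels the net offset (c1 - c2). */
    int16_t c3 = wrap16((uint32_t)((int32_t)c2 - (int32_t)c1));

    int16_t v = sat16((int32_t)x + c1);          /* saturating add: clips to high */
    v = sat16((int32_t)v - c2);                  /* saturating subtract: clips to low */
    return wrap16((uint32_t)((int32_t)v + c3));  /* wrapping add: undo the offsets */
}
```

With low = -20000 and high = 20000, this model maps 30000 to 20000, -30000 to -20000, and leaves 5 unchanged. With low = -100 and high = 100, where (high - low) < 0x8000, an input of 0 does not come back as 0, which is consistent with the restriction stated in the text.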
