x64 floating point blends

Question

Description: Double-precision floating-point values from the second source operand (third operand) are conditionally merged with values from the first source operand (second operand) and written to the destination operand (first operand). The immediate bits [3:0] determine whether the corresponding double-precision floating-point value in the destination is copied from the second source or first source. If a bit in the mask, corresponding to a word, is "1", then the double-precision floating-point value in the second source operand is copied, else the value in the first source operand is copied.

What bit in the 8-bit immediate value is the one that matters? Do the other bits matter at all?

Jester · Accepted Answer · 2015-01-04T00:28:24

As your quote says, the relevant bits are [3:0], that is, the low 4 bits. Each of those bits controls the operation for the corresponding word. Since you have 4 words (floats) in an SSE register, you have 4 control bits. The top 4 bits are ignored. Note that the operation section of the manual has pseudocode that clearly describes the, erm, operation:

BLENDPS (128-bit Legacy SSE version)
IF (IMM8[0] = 0) THEN DEST[31:0] <- DEST[31:0]
        ELSE DEST [31:0] <- SRC[31:0] FI
IF (IMM8[1] = 0) THEN DEST[63:32] <- DEST[63:32]
        ELSE DEST [63:32] <- SRC[63:32] FI
IF (IMM8[2] = 0) THEN DEST[95:64] <- DEST[95:64]
        ELSE DEST [95:64] <- SRC[95:64] FI
IF (IMM8[3] = 0) THEN DEST[127:96] <- DEST[127:96]
        ELSE DEST [127:96] <- SRC[127:96] FI

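Well, this is the single-precision BLENDPS. You mention double precision with 4 bits, so that must mean BLENDPD. With SSE registers, that only uses 2 bits (IMM8[1:0]), since you can only fit 2 doubles into 128 bits. The 256-bit AVX version (VBLENDPD on ymm registers) indeed uses 4 bits, one per double. The logic is the same as above: bit i of the immediate selects element i, and the bits above the element count do not select anything.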
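
To make the bit-to-element mapping concrete, here is a minimal scalar sketch in C of the selection rule described above. It is only a model, not the instruction itself: `blend_pd_model` and its parameter names are made up for illustration, and `lanes` is an assumed knob that stands in for the register width (2 for the 128-bit form, 4 for the 256-bit AVX form).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar model of the BLENDPD / VBLENDPD selection rule.
 * This is an illustration, not an intrinsic or a library function.
 *   lanes = 2 models the 128-bit form (2 doubles),
 *   lanes = 4 models the 256-bit AVX form (4 doubles).
 * Bit i of imm8 set   -> element i is taken from src2 (second source).
 * Bit i of imm8 clear -> element i is taken from src1 (first source).
 * Bits at positions >= lanes are never read, so they cannot matter. */
static void blend_pd_model(double *dst, const double *src1,
                           const double *src2, size_t lanes, unsigned imm8)
{
    assert(lanes == 2 || lanes == 4);
    for (size_t i = 0; i < lanes; i++) {
        dst[i] = ((imm8 >> i) & 1u) ? src2[i] : src1[i];
    }
}
```

With `src1 = {1, 2, 3, 4}` and `src2 = {10, 20, 30, 40}`, an immediate of `0x5` with 4 lanes picks elements 0 and 2 from `src2`, and `0xF5` gives the same result because only the low 4 bits are looked at. In real code you would normally reach these instructions through the `_mm_blend_pd` (SSE4.1) and `_mm256_blend_pd` (AVX) intrinsics from `<immintrin.h>`, where the immediate has to be a compile-time constant.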
