VPDPWSSD — Multiply and Add Signed Word Integers

Opcode/Instruction: EVEX.128.66.0F38.W0 52 /r VPDPWSSD xmm1{k1}{z}, xmm2, xmm3/m128/m32bcst
Op/En: A
64/32 bit Mode Support: V/V
CPUID Feature Flag: AVX512_VNNI AVX512VL
Description: Multiply groups of 2 pairs signed words in xmm3/m128/m32bcst with corresponding signed words of xmm2, summing those products and adding them to doubleword result in xmm1, under writemask k1.

Opcode/Instruction: EVEX.256.66.0F38.W0 52 /r VPDPWSSD ymm1{k1}{z}, ymm2, ymm3/m256/m32bcst
Op/En: A
64/32 bit Mode Support: V/V
CPUID Feature Flag: AVX512_VNNI AVX512VL
Description: Multiply groups of 2 pairs signed words in ymm3/m256/m32bcst with corresponding signed words of ymm2, summing those products and adding them to doubleword result in ymm1, under writemask k1.

Opcode/Instruction: EVEX.512.66.0F38.W0 52 /r VPDPWSSD zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst
Op/En: A
64/32 bit Mode Support: V/V
CPUID Feature Flag: AVX512_VNNI
Description: Multiply groups of 2 pairs signed words in zmm3/m512/m32bcst with corresponding signed words of zmm2, summing those products and adding them to doubleword result in zmm1, under writemask k1.

Instruction Operand Encoding

Op/En Tuple Operand 1 Operand 2 Operand 3 Operand 4
A Full ModRM:reg (r, w) EVEX.vvvv (r) ModRM:r/m (r) NA

Description

Multiplies the individual signed words of the first source operand by the corresponding signed words of the second source operand, producing intermediate signed doubleword results. Each pair of adjacent doubleword results is then summed and added to the corresponding doubleword element of the destination operand.
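The arithmetic for a single doubleword element can be modeled in scalar C. This is a minimal sketch, not Intel code: the helper name dpwssd_lane is hypothetical, and it assumes the accumulation wraps on overflow (two's complement) rather than saturating.

```c
#include <stdint.h>

/* Hypothetical helper modeling one doubleword element of VPDPWSSD.
   a0, a1: the two signed words taken from the first source operand.
   b0, b1: the corresponding signed words from the second source operand.
   acc:    the doubleword already held in the destination element. */
static int32_t dpwssd_lane(int32_t acc, int16_t a0, int16_t a1,
                           int16_t b0, int16_t b1)
{
    /* A 16x16-bit signed product always fits in 32 bits. */
    int32_t p1 = (int32_t)a0 * (int32_t)b0;
    int32_t p2 = (int32_t)a1 * (int32_t)b1;

    /* Add in unsigned arithmetic so wraparound is well defined in C.
       Assumption: the result wraps instead of saturating. The final
       conversion back to int32_t relies on the usual two's-complement
       behavior of mainstream compilers. */
    uint32_t sum = (uint32_t)acc + (uint32_t)p1 + (uint32_t)p2;
    return (int32_t)sum;
}
```

For example, dpwssd_lane(10, 2, -3, 4, 5) evaluates 10 + 2*4 + (-3)*5 and returns 3.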

This instruction supports memory fault suppression.

Operation

VPDPWSSD dest, src1, src2
(KL,VL)=(4,128), (8,256), (16,512)
ORIGDEST := DEST
FOR i := 0 TO KL-1:
    IF k1[i] or *no writemask*:
        IF SRC2 is memory and EVEX.b == 1:
            t := SRC2.dword[0]
        ELSE:
            t := SRC2.dword[i]
        p1dword := SIGN_EXTEND(SRC1.word[2*i]) * SIGN_EXTEND(t.word[0])
        p2dword := SIGN_EXTEND(SRC1.word[2*i+1]) * SIGN_EXTEND(t.word[1])
        DEST.dword[i] := ORIGDEST.dword[i] + p1dword + p2dword
    ELSE IF *zeroing*:
        DEST.dword[i] := 0
    ELSE:
        // Merge masking, dest element unchanged
        DEST.dword[i] := ORIGDEST.dword[i]
DEST[MAX_VL-1:VL] := 0
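The pseudocode above can be mirrored by a portable scalar model, which may help when checking results on hardware without AVX512_VNNI. This is a sketch under stated assumptions: the function name vpdpwssd_model and its parameters are hypothetical, the broadcast flag stands in for "SRC2 is memory and EVEX.b == 1", passing k1 = 0xFFFF stands in for "no writemask", the accumulation is assumed to wrap on overflow, and the zeroing of DEST[MAX_VL-1:VL] is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical scalar model of the VPDPWSSD pseudocode.
   dest:      KL doubleword accumulators, updated in place.
   src1/src2: 2*KL signed words each (src2 needs only 2 when broadcast).
   kl:        4, 8, or 16 for VL = 128, 256, or 512.
   k1:        writemask, bit i controls element i.
   zeroing:   true for {z}, false for merge masking.
   Assumption: dest does not overlap src1 or src2, so no ORIGDEST copy
   is needed (each element is read before it is written). */
static void vpdpwssd_model(int32_t *dest, const int16_t *src1,
                           const int16_t *src2, int kl,
                           uint16_t k1, bool zeroing, bool broadcast)
{
    assert(kl == 4 || kl == 8 || kl == 16);
    for (int i = 0; i < kl; i++) {
        if ((k1 >> i) & 1) {
            /* Broadcast reuses doubleword 0 of src2 for every element. */
            int t = broadcast ? 0 : i;
            int32_t p1 = (int32_t)src1[2 * i]     * (int32_t)src2[2 * t];
            int32_t p2 = (int32_t)src1[2 * i + 1] * (int32_t)src2[2 * t + 1];
            /* Unsigned addition gives well-defined wraparound. */
            dest[i] = (int32_t)((uint32_t)dest[i] + (uint32_t)p1 + (uint32_t)p2);
        } else if (zeroing) {
            dest[i] = 0;
        }
        /* Otherwise merge masking: dest[i] is left unchanged. */
    }
}
```

Keeping the mask, zeroing, and broadcast choices as explicit parameters makes each branch of the pseudocode visible in one place; a real caller would normally reach the instruction through the intrinsics instead.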

Intel C/C++ Compiler Intrinsic Equivalent

VPDPWSSD __m128i _mm_dpwssd_epi32(__m128i, __m128i, __m128i);

VPDPWSSD __m128i _mm_mask_dpwssd_epi32(__m128i, __mmask8, __m128i, __m128i);

VPDPWSSD __m128i _mm_maskz_dpwssd_epi32(__mmask8, __m128i, __m128i, __m128i);

VPDPWSSD __m256i _mm256_dpwssd_epi32(__m256i, __m256i, __m256i);

VPDPWSSD __m256i _mm256_mask_dpwssd_epi32(__m256i, __mmask8, __m256i, __m256i);

VPDPWSSD __m256i _mm256_maskz_dpwssd_epi32(__mmask8, __m256i, __m256i, __m256i);

VPDPWSSD __m512i _mm512_dpwssd_epi32(__m512i, __m512i, __m512i);

VPDPWSSD __m512i _mm512_mask_dpwssd_epi32(__m512i, __mmask16, __m512i, __m512i);

VPDPWSSD __m512i _mm512_maskz_dpwssd_epi32(__mmask16, __m512i, __m512i, __m512i);

SIMD Floating-Point Exceptions

None.

Other Exceptions

See Table 2-49, “Type E4 Class Exception Conditions”.