[PATCH] D116039: [X86] Combine reduce (add (mul x, y)) to VNNI instruction.

LuoYuanke via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Dec 21 23:41:10 PST 2021


LuoYuanke added a comment.

In D116039#3206040 <https://reviews.llvm.org/D116039#3206040>, @lebedev.ri wrote:

> Could you please explain why there are both the knownbits-based checks, and checks for ISD::SIGN/ZERO_EXTEND nodes?

VPDPBUSD multiplies each unsigned byte of the first source operand by the corresponding signed byte of the second source operand, producing intermediate signed word results. The word results are then summed and accumulated into the corresponding dword element of the destination operand.
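
To make the operand roles concrete, here is a rough scalar model of a single dword lane. It is only a sketch for illustration, not code from the patch; the name dpbusd_lane and the array-based signature are made up here, and it models the non-saturating form with wrap-around accumulation.

```cpp
#include <cstdint>

// Hypothetical scalar model of one 32-bit lane of VPDPBUSD (not LLVM code).
//   acc: dword already held in the destination
//   a:   four bytes of the first source, read as unsigned (0..255)
//   b:   four bytes of the second source, read as signed (-128..127)
constexpr int32_t dpbusd_lane(int32_t acc, const uint8_t (&a)[4],
                              const int8_t (&b)[4]) {
  int32_t sum = 0;
  for (int i = 0; i < 4; ++i) {
    // Each product fits in a signed word: |255 * -128| < 2^15.
    sum += static_cast<int32_t>(a[i]) * static_cast<int32_t>(b[i]);
  }
  // Accumulate with wrap-around by going through unsigned arithmetic.
  return static_cast<int32_t>(static_cast<uint32_t>(acc) +
                              static_cast<uint32_t>(sum));
}
```

Note that the same byte pattern means different things in the two positions: 0xFF is 255 as an element of a, but -1 as an element of b.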

src2 is treated as a signed value, so we don't need to check for ISD::SIGN_EXTEND nodes there: a sign bit of 1 means a negative value, a sign bit of 0 means a positive value, and both are acceptable.
src1, however, is treated as an unsigned value. If the original value is non-negative that is fine, but if it may be negative we can't use VPDPBUSD to combine the original nodes. See the test cases mul_sext_i4i4 and mul_zext_i4i4 in dpbusd_i4.ll: for mul_zext_i4i4 we can use VPDPBUSD, but for mul_sext_i4i4 we can't, because src1 may be negative.
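
The i4 cases can be illustrated with a small sketch. This is not the IR from dpbusd_i4.ll; the helper names below are hypothetical and only show what happens to a single 4-bit value once it has been widened to a byte and read in each operand position.

```cpp
#include <cstdint>

// Hypothetical helpers: widen the low nibble of v to a byte.
constexpr uint8_t zext_i4(uint8_t v) { return v & 0x0F; }
constexpr uint8_t sext_i4(uint8_t v) {
  return (v & 0x08) ? static_cast<uint8_t>((v & 0x0F) | 0xF0)
                    : static_cast<uint8_t>(v & 0x0F);
}

// How a byte is interpreted in each operand position.
constexpr int32_t as_src1(uint8_t byte) { return byte; }  // unsigned read
constexpr int32_t as_src2(uint8_t byte) {                 // signed read
  return byte >= 128 ? static_cast<int32_t>(byte) - 256
                     : static_cast<int32_t>(byte);
}

// The nibble 0b1111 is 15 when zero-extended and -1 when sign-extended.
static_assert(as_src1(zext_i4(0xF)) == 15, "zext value survives as src1");
static_assert(as_src2(zext_i4(0xF)) == 15, "zext value survives as src2");
static_assert(as_src2(sext_i4(0xF)) == -1, "sext value survives as src2");
static_assert(as_src1(sext_i4(0xF)) == 255, "sext value is misread as src1");
```

The last assertion is the problematic case: a sign-extended -1 becomes 0xFF, which the unsigned read of src1 would treat as 255, so the combine has to be rejected when src1 may be negative.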


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116039/new/

https://reviews.llvm.org/D116039


