[PATCH] D116039: [X86] Combine reduce (add (mul x, y)) to VNNI instruction.

Wed Dec 22 00:32:42 PST 2021

LuoYuanke added inline comments.

================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:41780
+  // value, so we just check the signed bits.
+  if ((IsFreeTruncation(Op0) && DAG.ComputeMinSignedBits(Op0) <= 9 &&
+       Op0.getOpcode() == ISD::ZERO_EXTEND) &&
----------------
LuoYuanke wrote:
> Maybe we can remove IsFreeTruncation() check as Roman mentions.
> Roman, do you mean to remove IsFreeTruncation() check?

If I remove the ISD::SIGN/ZERO_EXTEND check, I get a crash in createVPDPBUSD() with the test case below. I think there is room to improve the patch to cover more patterns, but to be conservative I'd like to do that in a separate patch, so that if we hit a regression we can revert less code.
Hi Roman,
What do you think?

```
declare i32 @llvm.vector.reduce.add.v16i32(<16 x i32>)

define dso_local i32 @mul_i4i2(<16 x i4> %b, i32 %c) {
entry:
  %0 = trunc <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15> to <16 x i16>
  %1 = zext <16 x i16> %0 to <16 x i32>
  %2 = zext <16 x i4> %b to <16 x i32>
  %3 = mul nsw <16 x i32> %2, %1
  %4 = call i32 @llvm.vector.reduce.add.v16i32(<16 x i32> %3)
  %op.extra = add nsw i32 %4, %c
  ret i32 %op.extra
}

```
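For readers who want to check the expected result of the reduction by hand, here is a rough scalar sketch of what the IR above computes. It is only an illustration and not part of the patch: the function name `mul_i4i2_reference` and the use of `uint8_t` to carry each `i4` lane are assumptions made for this example.

```cpp
#include <array>
#include <cstdint>

// Hypothetical scalar model of @mul_i4i2 from the test case; illustration only.
// Each i4 lane of %b is carried in the low 4 bits of a uint8_t (an assumption).
int32_t mul_i4i2_reference(const std::array<uint8_t, 16> &b, int32_t c) {
  int32_t sum = 0;
  for (int i = 0; i < 16; ++i) {
    // zext <16 x i4> %b to <16 x i32>: keep only the low 4 bits of the lane.
    int32_t lane = b[i] & 0xF;
    // The other operand is the constant vector <0, 1, ..., 15>; the
    // trunc-to-i16 / zext-to-i32 round trip leaves those values unchanged.
    sum += lane * i;
  }
  // %op.extra = add nsw i32 %4, %c
  return sum + c;
}
```

With every lane of `b` set to 1 and `c` set to 0, the sketch returns 0 + 1 + ... + 15 = 120.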

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116039/new/

https://reviews.llvm.org/D116039