[PATCH] D116039: [X86] Combine reduce (add (mul x, y)) to VNNI instruction.
Phoebe Wang via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Dec 25 21:57:03 PST 2021
pengfei added a comment.
Format the code as Lint suggested?
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:41810
+ std::swap(Op0, Op1);
+
+ auto IsFreeTruncation =
----------------
Can we return false here if `Op0` is not ZERO_EXTEND?
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:41813-41814
+ [](SDValue &Op) -> bool {
+ if ((Op.getOpcode() == ISD::ZERO_EXTEND ||
+ Op.getOpcode() == ISD::SIGN_EXTEND) &&
+ Op.getOperand(0).getValueType().getScalarSizeInBits() <= 8)
----------------
How about `ANY_EXTEND` ? The same below.
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:41826
+ if ((IsFreeTruncation(Op0) && DAG.ComputeMinSignedBits(Op0) <= 9 &&
+ Op0.getOpcode() == ISD::ZERO_EXTEND) &&
+ (IsFreeTruncation(Op1) && DAG.ComputeMinSignedBits(Op1) <= 8))
----------------
It's better to hoist to line 41810. How about `ANY_EXTEND`?
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:41870
+ ZExt0.getValueType();
+ LogBias = 2;
+
----------------
It is always 2? Better to add a comment to explain.
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:41887-41888
+
+ // Actually build the DotProduct, split as 128/256/512 bits for
+ // SSE/AVX2/AVX512BW.
+ auto DpBuilder = [&](SelectionDAG &DAG, const SDLoc &DL,
----------------
AVXVNNI implies AVX2. We won't need to split to 128 bits.
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:42156-42157
+ EVT ExtractVT = Extract->getValueType(0);
+ // Verify the type we're extracting is i32, as the output element type of
+ // vpdpbusd and vpdpwssd is i32.
+ if (ExtractVT != MVT::i32)
----------------
Can we generate i32 first then do the truncation?
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:42162-42163
+ EVT VT = Extract->getOperand(0).getValueType();
+ if (!isPowerOf2_32(VT.getVectorNumElements()))
+ return SDValue();
+
----------------
Can check `DCI.isAfterLegalizeDAG()` before calling the function instead?
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:42210
+ EVT ResVT = EVT::getVectorVT(*DAG.getContext(), ExtractVT,
+ DpVT.getSizeInBits() / ExtractSizeInBits);
+ DP = DAG.getBitcast(ResVT, DP);
----------------
Can ues `ExtractVT.getSizeInBits()` derectly.
================
Comment at: llvm/test/CodeGen/X86/dpbusd.ll:13
+; AVXVNNI-NEXT: vextracti128 $1, %ymm0, %xmm1
+; AVXVNNI-NEXT: vpaddd %xmm1, %xmm0, %xmm0
+; AVXVNNI-NEXT: vpshufd {{.*#+}} xmm1 = xmm0[2,3,2,3]
----------------
This is the only and a strange diff with the AVX512 code. Is there anything wrong in one of each?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D116039/new/
https://reviews.llvm.org/D116039
More information about the llvm-commits
mailing list