[PATCH] D133768: [DAGCombine] Do not fold SRA/SRL of MUL into MULH when MUL's LSB are used, and MUL_LOHI is available
Juan Manuel Martinez Caamaño via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 13 04:11:26 PDT 2022
jmmartinez created this revision.
jmmartinez added projects: AMDGPU, LLVM.
Herald added subscribers: kosarev, ecnelises, kerbowa, hiraditya, jvesely.
Herald added a project: All.
jmmartinez requested review of this revision.
Herald added a subscriber: llvm-commits.
Folding sra(mul) / srl(mul) into a mulh introduces an extra multiplication to compute the high half of the product when the low half of the mul is also used.
In that case it is more profitable to compute the high and low halves with a single mul_lohi.
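As a rough illustration of the arithmetic behind this (not code from the patch), the sketch below models the two forms on plain 32-bit integers. The helper names umul_lohi, mul_lo and mulhu are hypothetical stand-ins for the ISD::UMUL_LOHI, ISD::MUL and ISD::MULHU nodes. It only shows that both forms produce the same values, while the folded form needs two multiplications once the low half is also used.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical model of UMUL_LOHI: a single widening multiply that
// yields both halves of the product as {low, high}.
std::pair<uint32_t, uint32_t> umul_lohi(uint32_t a, uint32_t b) {
    uint64_t wide = static_cast<uint64_t>(a) * b;
    return {static_cast<uint32_t>(wide), static_cast<uint32_t>(wide >> 32)};
}

// Hypothetical model of the folded form: the low half still comes from
// a plain MUL (wrapping 32-bit multiply)...
uint32_t mul_lo(uint32_t a, uint32_t b) { return a * b; }

// ...and the high half from a separate MULHU, i.e. a second multiplication.
uint32_t mulhu(uint32_t a, uint32_t b) {
    return static_cast<uint32_t>((static_cast<uint64_t>(a) * b) >> 32);
}

// Returns true when the single-multiply form and the two-multiply form
// agree on both halves for the given operands.
bool forms_agree(uint32_t a, uint32_t b) {
    std::pair<uint32_t, uint32_t> lohi = umul_lohi(a, b);
    return lohi.first == mul_lo(a, b) && lohi.second == mulhu(a, b);
}
```

Both forms are equivalent in value, so the choice between them is purely a question of cost: when some user needs the low bits, keeping the original mul lets it be lowered to one mul_lohi instead of a mul plus a mulh.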
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D133768
Files:
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
llvm/test/CodeGen/AMDGPU/mul_lohi.ll
Index: llvm/test/CodeGen/AMDGPU/mul_lohi.ll
===================================================================
--- /dev/null
+++ llvm/test/CodeGen/AMDGPU/mul_lohi.ll
@@ -0,0 +1,16 @@
+; RUN: llc -march=amdgcn -mcpu=gfx900 < %s | FileCheck -check-prefix=GCN %s
+
+define i32 @kernel(i32 %0, i32 %1, i32* %2) {
+ ; GCN-LABEL: kernel:
+ ; GCN: ; %bb.0:
+ ; GCN-NOT: v_mul_{{lo|hi}}
+ ; GCN: v_mad_u64_u32
+ %4 = zext i32 %0 to i64
+ %5 = zext i32 %1 to i64
+ %6 = mul nuw i64 %5, %4
+ %7 = lshr i64 %6, 32
+ %8 = trunc i64 %7 to i32
+ store i32 %8, i32* %2, align 4
+ %9 = trunc i64 %6 to i32
+ ret i32 %9
+}
Index: llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
===================================================================
--- llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -9244,6 +9244,28 @@
EVT NarrowVT = LeftOp.getOperand(0).getValueType();
unsigned NarrowVTSize = NarrowVT.getScalarSizeInBits();
+ // return true if U may use the lower bits of its operands
+ auto UserOfLowerBits = [NarrowVTSize](SDNode *U) {
+ if (U->getOpcode() != ISD::SRL && U->getOpcode() != ISD::SRA) {
+ return true;
+ }
+ ConstantSDNode *UShiftAmtSrc = isConstOrConstSplat(U->getOperand(1));
+ if (!UShiftAmtSrc) {
+ return true;
+ }
+ unsigned UShiftAmt = UShiftAmtSrc->getZExtValue();
+ return UShiftAmt < NarrowVTSize;
+ };
+
+ // If the lower part of the MUL is also used and MUL_LOHI is supported
+ // do not introduce the MULH in favor of MUL_LOHI
+ unsigned MulLoHiOp = IsSignExt ? ISD::SMUL_LOHI : ISD::UMUL_LOHI;
+ if (ShiftOperand->use_size() > 1 &&
+ TLI.isOperationLegalOrCustom(MulLoHiOp, NarrowVT) &&
+ llvm::any_of(ShiftOperand->uses(), UserOfLowerBits)) {
+ return SDValue();
+ }
+
SDValue MulhRightOp;
if (ConstantSDNode *Constant = isConstOrConstSplat(RightOp)) {
unsigned ActiveBits = IsSignExt
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D133768.459710.patch
Type: text/x-patch
Size: 1949 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20220913/57f294aa/attachment.bin>