[llvm] [AMDGPU] Folding imm offset in more cases for scratch access (PR #70634)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Wed Nov 8 00:56:22 PST 2023
================
@@ -1146,13 +1146,62 @@ bool AMDGPUDAGToDAGISel::isDSOffset2Legal(SDValue Base, unsigned Offset0,
return CurDAG->SignBitIsZero(Base);
}
-bool AMDGPUDAGToDAGISel::isFlatScratchBaseLegal(SDValue Base,
+// Check that the address value of flat scratch load/store being put into
+// SGPR/VGPR is legal with respect to hardware's requirement that address in
+// SGPR/VGPR should be unsigned. When \p CheckTwoInstrs is set, we will check
+// against the instruction that defines \p Addr as well as the instruction that
+// defines the base address. When \p CheckTwoOperands is set, we will check both
+// operands (In case of two instructions, they are the operands from the
+// instruction that defines the base address).
+bool AMDGPUDAGToDAGISel::isFlatScratchBaseLegal(SDValue Addr,
+ bool CheckTwoInstrs,
+ bool CheckTwoOperands,
uint64_t FlatVariant) const {
if (FlatVariant != SIInstrFlags::FlatScratch)
return true;
- // When value in 32-bit Base can be negative calculate scratch offset using
- // 32-bit add instruction, otherwise use Base(unsigned) + offset.
- return CurDAG->SignBitIsZero(Base);
+
+ // Whether we can infer the operands are non-negative if the result is
+ // non-negative.
+ auto HasOnlyNonNegativeOperands = [](SDValue Addr) -> bool {
+ return (Addr.getOpcode() == ISD::ADD &&
+ Addr->getFlags().hasNoUnsignedWrap()) ||
+ Addr->getOpcode() == ISD::OR;
+ };
+
+ if (CheckTwoInstrs) {
+ auto Base = Addr.getOperand(0);
+ // Make sure we are doing SGPR + VGPR + Imm.
+ assert(isa<ConstantSDNode>(Addr.getOperand(1)));
+ auto *RHSImm = cast<ConstantSDNode>(Addr.getOperand(1));
+ if (HasOnlyNonNegativeOperands(Base) &&
+ (HasOnlyNonNegativeOperands(Addr) || RHSImm->getSExtValue() < 0))
+ return true;
+
+ auto LHS = Base.getOperand(0);
+ auto RHS = Base.getOperand(1);
+ return CurDAG->SignBitIsZero(LHS) && CurDAG->SignBitIsZero(RHS);
----------------
arsenm wrote:
Probably should check RHS first, as that's canonically the simpler operand.
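
For illustration only, here is a minimal standalone sketch of why operand order can matter in a short-circuit `&&`. It does not use LLVM: `ToyValue`, `signBitIsZero`, and the query counter are hypothetical stand-ins, and it assumes the simpler operand (for example, a constant) sits on the RHS and is cheap to test.

```cpp
#include <cstdint>

// Hypothetical stand-in for an SDValue: either a known constant or an opaque
// value whose sign-bit query needs a (counted) analysis step.
struct ToyValue {
  bool IsConstant;
  int32_t Constant;     // meaningful only when IsConstant is true
  bool OpaqueSignClear; // analysis result for the non-constant case
};

// Counts how many expensive (non-constant) queries were made.
static int ExpensiveQueries = 0;

// Toy analogue of a sign-bit-is-zero query; not the LLVM implementation.
static bool signBitIsZero(const ToyValue &V) {
  if (V.IsConstant)
    return V.Constant >= 0; // cheap: inspect the constant directly
  ++ExpensiveQueries;       // expensive: stands in for a known-bits walk
  return V.OpaqueSignClear;
}

// Order used in the hunk above: LHS is queried first.
static bool bothNonNegativeLHSFirst(const ToyValue &LHS, const ToyValue &RHS) {
  return signBitIsZero(LHS) && signBitIsZero(RHS);
}

// Suggested order: RHS is queried first, so a cheap failure on the simpler
// operand short-circuits before the expensive query on LHS runs.
static bool bothNonNegativeRHSFirst(const ToyValue &LHS, const ToyValue &RHS) {
  return signBitIsZero(RHS) && signBitIsZero(LHS);
}
```

Both orders return the same result; only the amount of work done before an early exit differs. In this toy model, a negative constant on the RHS lets the RHS-first version skip the LHS query entirely.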
https://github.com/llvm/llvm-project/pull/70634