[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)
Fabian Ritter via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Thu Jun 5 00:17:01 PDT 2025
================
@@ -2627,6 +2629,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc &DL) {
return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+ SDValue N0 = N->getOperand(0);
+ SDValue N1 = N->getOperand(1);
+ EVT PtrVT = N0.getValueType();
+ EVT IntVT = N1.getValueType();
+ SDLoc DL(N);
+
+ // This is already ensured by an assert in SelectionDAG::getNode(). Several
+ // combines here depend on this assumption.
+ assert(PtrVT == IntVT &&
+ "PTRADD with different operand types is not supported");
+
+ // fold (ptradd undef, y) -> undef
+ if (N0.isUndef())
+ return N0;
+
+ // fold (ptradd x, undef) -> undef
+ if (N1.isUndef())
+ return DAG.getUNDEF(PtrVT);
+
+ // fold (ptradd x, 0) -> x
+ if (isNullConstant(N1))
+ return N0;
+
+ // fold (ptradd 0, x) -> x
+ if (isNullConstant(N0))
+ return N1;
+
+ if (N0.getOpcode() == ISD::PTRADD &&
+ !reassociationCanBreakAddressingModePattern(ISD::PTRADD, DL, N, N0, N1)) {
+ SDValue X = N0.getOperand(0);
+ SDValue Y = N0.getOperand(1);
+ SDValue Z = N1;
+ bool N0OneUse = N0.hasOneUse();
+ bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
+ bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
+
+ // (ptradd (ptradd x, y), z) -> (ptradd x, (add y, z)) if:
+ // * y is a constant and (ptradd x, y) has one use; or
+ // * y and z are both constants.
+ if ((YIsConstant && N0OneUse) || (YIsConstant && ZIsConstant)) {
+ SDNodeFlags Flags;
+ // If both additions in the original were NUW, the new ones are as well.
+ if (N->getFlags().hasNoUnsignedWrap() &&
+ N0->getFlags().hasNoUnsignedWrap())
+ Flags |= SDNodeFlags::NoUnsignedWrap;
+ SDValue Add = DAG.getNode(ISD::ADD, DL, IntVT, {Y, Z}, Flags);
+ AddToWorklist(Add.getNode());
+ return DAG.getMemBasePlusOffset(X, Add, DL, Flags);
+ }
+
+ // TODO: There is another possible fold here that was proven useful.
+ // It would be this:
+ //
+ // (ptradd (ptradd x, y), z) -> (ptradd (ptradd x, z), y) if:
+ // * (ptradd x, y) has one use; and
+ // * y is a constant; and
+ // * z is not a constant.
+ //
+ // In some cases, specifically in AArch64's FEAT_CPA, it exposes the
+ // opportunity to select more complex instructions such as SUBPT and
+ // MSUBPT. However, a hypothetical corner case has been found that we could
+ // not avoid. Consider this (pseudo-POSIX C):
+ //
+ // char *foo(char *x, int z) {return (x + LARGE_CONSTANT) + z;}
+ // char *p = mmap(LARGE_CONSTANT);
+ // char *q = foo(p, -LARGE_CONSTANT);
+ //
+ // Then x + LARGE_CONSTANT is one-past-the-end, so valid, and a
+ // further + z takes it back to the start of the mapping, so valid,
+ // regardless of the address mmap gave back. However, if mmap gives you an
+ // address < LARGE_CONSTANT (ignoring high bits), x - LARGE_CONSTANT will
+ // borrow from the high bits (with the subsequent + z carrying back into
+ // the high bits to give you a well-defined pointer) and thus trip
+ // FEAT_CPA's pointer corruption checks.
+ //
+ // We leave this fold as an opportunity for future work, addressing the
+ // corner case for FEAT_CPA, as well as reconciling the solution with the
+ // more general application of pointer arithmetic in other future targets.
----------------
ritter-x2a wrote:
My vague idea of handling this properly in the future would be to
- have an `inbounds` flag on `PTRADD` nodes (see #131862),
- have backends generate instructions that break for out-of-bounds arithmetic only when the `inbounds` flag is present, and
- give targets an option to request that transformations that would generate `PTRADD`s without `inbounds` flag are not applied.
I think that would give things like the CPA implementation a better semantic footing, since otherwise they would just be miscompiling the IR's `getelementptr`s without `inbounds` flags. However, at least the last point above is currently not on my critical path, so I'm open to adding the comment here or moving the other transform here.
https://github.com/llvm/llvm-project/pull/142739
More information about the llvm-branch-commits
mailing list