[llvm] [AArch64] Add CodeGen support for FEAT_CPA (PR #105669)
Jessica Clarke via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 22 10:07:54 PDT 2024
================
@@ -2578,6 +2581,98 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc &DL) {
   return SDValue();
}
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+/// This function was adapted from DAGCombiner::visitPTRADD() from the Morello
+/// project, which is based on CHERI.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+  SDValue N0 = N->getOperand(0);
+  SDValue N1 = N->getOperand(1);
+  EVT PtrVT = N0.getValueType();
+  EVT IntVT = N1.getValueType();
+  SDLoc DL(N);
+
+  // fold (ptradd undef, y) -> undef
+  if (N0.isUndef())
+    return N0;
+
+  // fold (ptradd x, undef) -> undef
+  if (N1.isUndef())
+    return DAG.getUNDEF(PtrVT);
+
+  // fold (ptradd x, 0) -> x
+  if (isNullConstant(N1))
+    return N0;
+
+  if (N0.getOpcode() == ISD::PTRADD &&
+      !reassociationCanBreakAddressingModePattern(ISD::PTRADD, DL, N, N0, N1)) {
+    SDValue X = N0.getOperand(0);
+    SDValue Y = N0.getOperand(1);
+    SDValue Z = N1;
+    bool N0OneUse = N0.hasOneUse();
+    bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
+    bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
+
+    // (ptradd (ptradd x, y), z) -> (ptradd (ptradd x, z), y) if:
----------------
jrtc27 wrote:
This assumes pointer addition is commutative (not quite the right word), which isn't true for us. I also don't think it's true for you in the completely general case. Consider (pseudo-POSIX C, ignoring whatever annotations and allocator functions are needed to get CPA to kick in):
```
char *foo(char *x, int z) {
return (x + LARGE_CONSTANT) + z;
}
char *p = mmap(LARGE_CONSTANT);
char *q = foo(p, -LARGE_CONSTANT);
```
Then `x + LARGE_CONSTANT` is one-past-the-end, so valid, and a further `+ z` takes it back to the start of the mapping, so valid, regardless of the address mmap gave back. However, once the additions are swapped, if mmap gives you an address `< LARGE_CONSTANT` (ignoring high bits), the intermediate `x + z`, i.e. `x - LARGE_CONSTANT`, will borrow from the high bits (with the subsequent `+ LARGE_CONSTANT` carrying back into the high bits to give you a well-defined pointer) and thus trip CPA's checks.
You can also come up with a similar case where y is negative and z is positive if you have an address at the end of the address space (high bits not included).
I wouldn't like to say exactly what conditions you need to prove before you can commute like this (I believe both offsets having the same sign is sufficient, but it may not be fully necessary for you?). I suspect they're rather similar to ours, though, which is why we've not done this for CHERI capabilities. We hit this much more often, since our boundaries for where this goes wrong are the capability's (representable) bounds rather than the ends of the entire address space.
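To make the failure mode concrete, here is a minimal sketch of the arithmetic in C++. It is a toy model and not the architectural definition of FEAT_CPA: it assumes the top 8 bits of a 64-bit pointer are the "high bits" that a checked pointer add must leave unchanged. The names `kHighMask`, `PtrAddResult`, `checkedPtrAdd`, `originalOrderOk`, `reassociatedOrderOk` and `finalAddress` are hypothetical, as are the concrete address and constant.

```cpp
#include <cassert>
#include <cstdint>

// ASSUMPTION: toy model only. The top 8 bits stand in for the "high bits"
// that a checked pointer add is not allowed to change.
constexpr uint64_t kHighMask = 0xFF00000000000000ULL;

struct PtrAddResult {
  uint64_t value;         // wrapped 64-bit sum
  bool highBitsPreserved; // false if the add carried/borrowed into the high bits
};

// One checked pointer add: ptr + off, with a flag for the high-bits check.
constexpr PtrAddResult checkedPtrAdd(uint64_t ptr, int64_t off) {
  uint64_t sum = ptr + static_cast<uint64_t>(off); // unsigned wraparound is defined
  return {sum, (sum & kHighMask) == (ptr & kHighMask)};
}

// (ptradd (ptradd x, y), z): the order written in the source program.
constexpr bool originalOrderOk(uint64_t x, int64_t y, int64_t z) {
  PtrAddResult a = checkedPtrAdd(x, y);
  PtrAddResult b = checkedPtrAdd(a.value, z);
  return a.highBitsPreserved && b.highBitsPreserved;
}

// (ptradd (ptradd x, z), y): the order produced by the proposed combine.
constexpr bool reassociatedOrderOk(uint64_t x, int64_t y, int64_t z) {
  return originalOrderOk(x, z, y);
}

// Final address of ((x + a) + b), ignoring the checks.
constexpr uint64_t finalAddress(uint64_t x, int64_t a, int64_t b) {
  return checkedPtrAdd(checkedPtrAdd(x, a).value, b).value;
}

// Hypothetical values: the mapping starts at 0x1000, LARGE_CONSTANT is 0x10000.
static_assert(originalOrderOk(0x1000, 0x10000, -0x10000),
              "x + LARGE_CONSTANT - LARGE_CONSTANT never touches the high bits");
static_assert(!reassociatedOrderOk(0x1000, 0x10000, -0x10000),
              "x - LARGE_CONSTANT borrows from the high bits");
static_assert(finalAddress(0x1000, 0x10000, -0x10000) ==
                  finalAddress(0x1000, -0x10000, 0x10000),
              "both orders compute the same address; only the checks differ");

// One same-sign data point (not a proof): both orders pass the check.
static_assert(originalOrderOk(0x1000, 0x10, 0x20) &&
                  reassociatedOrderOk(0x1000, 0x10, 0x20),
              "same-sign offsets pass in either order here");
```

In this model the two orders agree on the final address, so the swap is invisible to plain integer addition, but the intermediate value in the swapped order differs in the high bits, which is the step a check on each individual add would reject.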
https://github.com/llvm/llvm-project/pull/105669