[llvm] [AArch64] Don't tail call memset if it would convert to a bzero. (PR #98969)

Tue Jul 16 11:20:49 PDT 2024

================
@@ -685,7 +687,10 @@ bool llvm::returnTypeIsEligibleForTailCall(const Function *F,
          (IID == Intrinsic::memmove &&
           TLI.getLibcallName(RTLIB::MEMMOVE) == StringRef("memmove")) ||
          (IID == Intrinsic::memset &&
-          TLI.getLibcallName(RTLIB::MEMSET) == StringRef("memset"))) &&
+          TLI.getLibcallName(RTLIB::MEMSET) == StringRef("memset") &&
+          (!isa<ConstantInt>(Call->getOperand(1)) ||
+           !cast<ConstantInt>(Call->getOperand(1))->isZero() ||
+           !TLI.getLibcallName(RTLIB::BZERO)))) &&
----------------
efriedma-quic wrote:

Due to the way calls are lowered, we have two different tail-call-position analysis passes, yes: one for regular calls, which are lowered as part of building the SelectionDAG, and one for library calls, which are created as part of legalization/optimization/etc.
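
For reference, the guard added in the hunk above (on the IR-level side) can be read as a standalone predicate. This is only a sketch with hypothetical stand-in types: `LibcallNames` and `memsetMayBeTailCall` are not LLVM APIs; they just model the `TLI.getLibcallName()` lookups and the `ConstantInt` zero check.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for the libcall names queried via TLI.getLibcallName().
struct LibcallNames {
    std::string memset_name = "memset";
    std::optional<std::string> bzero_name;  // empty when the target has no bzero
};

// Models the memset clause of the condition in the hunk: the call stays
// eligible unless the fill value is a constant zero AND a bzero libcall
// exists (in which case the memset may be lowered to bzero instead).
// `constant_value` is empty when the fill value is not a compile-time constant.
bool memsetMayBeTailCall(const LibcallNames &names,
                         std::optional<long> constant_value) {
    if (names.memset_name != "memset")
        return false;
    const bool is_constant_zero =
        constant_value.has_value() && *constant_value == 0;
    const bool has_bzero = names.bzero_name.has_value();
    return !is_constant_zero || !has_bzero;
}
```

With a bzero libcall available, only the constant-zero case is rejected; non-constant or non-zero fill values remain eligible.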

We could teach getMemmove() etc. to call the correct isInTailCallPosition() based on the arguments: if the caller passed in an Instruction, use the IR version; if the caller passed in an SDNode, use the SelectionDAG version.  Or we could try to use the SelectionDAG version exclusively: add an ISD::MEMMOVE node, and lower it after SelectionDAGBuilder.
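
A rough sketch of the first option, dispatching on what the caller passed in. Everything here is a hypothetical stand-in: the `Instruction` and `SDNode` structs are toy types, and `queryTailPosition` is an invented name used only to illustrate overload-based dispatch, not an existing LLVM function.

```cpp
// Toy stand-ins; the real llvm::Instruction and llvm::SDNode are far richer.
struct Instruction { bool in_tail_position; };
struct SDNode { bool in_tail_position; };

// Records which analysis answered the query (for illustration only).
enum class Analysis { IR, SelectionDAG };

struct TailCallQuery {
    Analysis used;
    bool is_tail;
};

// Caller has an IR instruction: answer with the IR-level analysis.
TailCallQuery queryTailPosition(const Instruction &call) {
    return {Analysis::IR, call.in_tail_position};
}

// Caller has a DAG node: answer with the SelectionDAG-level analysis.
TailCallQuery queryTailPosition(const SDNode &node) {
    return {Analysis::SelectionDAG, node.in_tail_position};
}
```

Overload resolution picks the analysis from the argument type, so a helper in the style of getMemmove() would not need a separate flag to say where the call originated.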

https://github.com/llvm/llvm-project/pull/98969