[llvm] x86: fix musttail sibcall miscompilation (PR #168956)

Eli Friedman via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 24 15:46:35 PST 2025


================
@@ -2018,6 +2018,61 @@ SDValue X86TargetLowering::getMOVL(SelectionDAG &DAG, const SDLoc &dl, MVT VT,
   return DAG.getVectorShuffle(VT, dl, V1, V2, Mask);
 }
 
+// Returns the type of copying which is required to set up a byval argument to
+// a tail-called function. This isn't needed for non-tail calls, because they
+// always need the equivalent of CopyOnce, but tail-calls sometimes need two to
+// avoid clobbering another argument (CopyViaTemp), and sometimes can be
+// optimised to zero copies when forwarding an argument from the caller's
+// caller (NoCopy).
+X86TargetLowering::ByValCopyKind X86TargetLowering::ByValNeedsCopyForTailCall(
+    SelectionDAG &DAG, SDValue Src, SDValue Dst, ISD::ArgFlagsTy Flags) const {
+  MachineFrameInfo &MFI = DAG.getMachineFunction().getFrameInfo();
+
+  // Globals are always safe to copy from.
+  if (isa<GlobalAddressSDNode>(Src) || isa<ExternalSymbolSDNode>(Src))
+    return CopyOnce;
+
+  // Can only analyse frame index nodes, conservatively assume we need a
+  // temporary.
+  auto *SrcFrameIdxNode = dyn_cast<FrameIndexSDNode>(Src);
+  auto *DstFrameIdxNode = dyn_cast<FrameIndexSDNode>(Dst);
+  if (!SrcFrameIdxNode || !DstFrameIdxNode)
+    return CopyViaTemp;
+
+  int SrcFI = SrcFrameIdxNode->getIndex();
+  int DstFI = DstFrameIdxNode->getIndex();
+  assert(MFI.isFixedObjectIndex(DstFI) &&
+         "byval passed in non-fixed stack slot");
+
+  int64_t SrcOffset = MFI.getObjectOffset(SrcFI);
+  int64_t DstOffset = MFI.getObjectOffset(DstFI);
+
+  // FIXME:
+
+  //  // If the source is in the local frame, then the copy to the argument
+  //  memory
+  //  // is always valid.
+  //  bool FixedSrc = MFI.isFixedObjectIndex(SrcFI);
+  //  if (!FixedSrc ||
+  //      (FixedSrc && SrcOffset < -(int64_t)AFI->getArgRegsSaveSize()))
+  //    return CopyOnce;
----------------
efriedma-quic wrote:

The check here distinguishes three cases: the source is a non-fixed stack allocation (which will be allocated later), a fixed stack offset which doesn't overlap the argument list, or a fixed stack offset which does overlap the argument list.  In the first two cases, we just need a single copy; in the third case, we need CopyViaTemp.

In the Arm calling convention, this is a bit more complicated because byval values can be split across registers and memory: the allocation for a byval value can start below the argument list, but extend into it.

x86 doesn't split byval values this way, so I think you can just replace the reference to getArgRegsSaveSize() with zero.
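
To illustrate, here is a rough standalone sketch of that three-way check with the register-save size taken as zero. It is not LLVM code: `CopyKind`, `Slot`, and `classifySource` are hypothetical names, and it assumes the argument list begins at offset 0, so that a fixed object with a negative offset cannot overlap it. The later NoCopy comparison between source and destination offsets is left out.

```cpp
#include <cstdint>

// Hypothetical stand-in for the copy kinds discussed in the patch.
// NoCopy is omitted; this sketch only covers the check under review.
enum class CopyKind { CopyOnce, CopyViaTemp };

// Hypothetical model of a stack object: whether it is a fixed object,
// and its offset (only meaningful for fixed objects).
struct Slot {
  bool IsFixed;
  int64_t Offset;
};

// Assumption: the argument list starts at offset 0, which is what
// replacing getArgRegsSaveSize() with zero amounts to.
CopyKind classifySource(const Slot &Src) {
  // Case 1: non-fixed allocation, placed later, so it cannot alias
  // the outgoing argument area.
  if (!Src.IsFixed)
    return CopyKind::CopyOnce;

  // Case 2: fixed offset that lies entirely below the argument list.
  if (Src.Offset < 0)
    return CopyKind::CopyOnce;

  // Case 3: fixed offset that overlaps the argument list; a direct
  // copy could clobber another argument, so go through a temporary.
  return CopyKind::CopyViaTemp;
}
```

With the Arm version, the threshold in case 2 would be the negated register-save size rather than zero, since a byval value there can start below the argument list and extend into it.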

https://github.com/llvm/llvm-project/pull/168956

