[PATCH] D105807: [X86] pr51000 struct return tailcalling

Nathan Sidwell via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jul 12 04:53:39 PDT 2021


urnathan created this revision.
urnathan added a reviewer: craig.topper.
Herald added subscribers: pengfei, hiraditya.
urnathan requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

This addresses pr51000 -- a failure to tailcall an object factory function.  The underlying issue is tailcalling to or from functions that return a structure.
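
For readers without the bug report to hand, the shape of code involved is roughly the following. This is a hypothetical reduction, not the testcase from the patch: the names `Handle`, `make_handle_impl` and `make_handle` are made up for illustration, and `__attribute__((noinline))` is a GCC/Clang extension used only to keep the call visible to the backend. On x86-64 SysV a 16-byte struct of two integers is returned in registers, so the wrapper's call is in tail position and its result is passed straight through.

```cpp
// Hypothetical reduction of a struct-returning factory; names are illustrative only.
#include <cassert>

struct Handle {
  long id;          // two 8-byte integer fields: 16 bytes total,
  long generation;  // returned in RAX:RDX under the x86-64 SysV ABI
};

// Out-of-line worker. noinline (GCC/Clang extension) stops the optimizer
// from folding it into the wrapper, so a real call remains.
__attribute__((noinline)) Handle make_handle_impl(long id) {
  return Handle{id, 1};
}

// Factory wrapper: the call is the last thing the function does and its
// result is returned unchanged, which makes it a tail call candidate.
Handle make_handle(long id) {
  return make_handle_impl(id);
}

// Behavioral check only; it says nothing about call versus jmp.
void check_factory() {
  Handle h = make_handle(7);
  assert(h.id == 7);
  assert(h.generation == 1);
}
```

Whether `make_handle` is emitted as a `jmp` to `make_handle_impl` or as a `call` followed by `ret` depends on the compiler version and optimization level; inspecting the assembly output (for example with `-O2 -S`) is the way to see which one a given compiler produces.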

The x86 backend -- in common with many others -- disables tailcalling when either the caller or the callee returns a struct (regardless of whether the struct is returned in registers or via an additional out parameter).  I suspect this is from when the middle end was less smart, and didn't handle the return object's lifetime well.  However, AFAICT, it is now smarter, and its tail call optimizations do account for that object's lifetime -- just like any other local object or pointer parameter.  Plus gone are the days of ABIs that returned structs in globals! (at least I hope so).  However, I'm not sure whether there are x86 ABIs that return structs via a pointer in an abnormal register that the middle end doesn't know about.  I'm not sure how such an implementation would be representable without actually modeling at least a proxy for that register in the middle end, as otherwise it wouldn't know about liveness or escaping of the object to which that register ends up pointing.

The x86 backend only tries to tailcall fns the middle end has marked as tail calls, and then it checks for additional constraints like not being able to tail call through the PLT, or variadic fns in all cases.

So, just removing the struct restriction appears to be sufficient. Obviously this affects the existing sibcall.ll test. I also added a new C++ testcase to check that we generate the expected code in the pr51000 cases and in some closely related cases, and that we do not tail call when the callee returns a large struct into the caller's stack frame.
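
The large-struct distinction can be sketched as follows. Again this is a hypothetical illustration rather than the contents of pr51000.cpp: `Big`, `make_big`, `forward_big` and `discard_big` are invented names, and the comments describe what I would expect on x86-64 SysV, where a 64-byte struct is returned through a hidden pointer supplied by the caller.

```cpp
// Hypothetical sketch of the large-struct cases; names are illustrative only.
#include <cassert>

struct Big {
  long v[8];  // 64 bytes: too large for registers, returned via a hidden pointer
};

// Out-of-line producer (noinline is a GCC/Clang extension).
__attribute__((noinline)) Big make_big(long seed) {
  Big b{};
  for (int i = 0; i < 8; ++i)
    b.v[i] = seed + i;
  return b;
}

// Tail call candidate: this function also returns Big, so it can hand its
// own hidden return pointer straight to make_big.
Big forward_big(long seed) {
  return make_big(seed);
}

// Not a tail call candidate: the discarded result is a temporary in this
// function's stack frame, and that frame must stay alive until make_big
// has finished writing to it.
void discard_big(long seed) {
  (void)make_big(seed);
}

// Behavioral check only; it does not inspect the generated code.
void check_big() {
  Big b = forward_big(10);
  assert(b.v[0] == 10);
  assert(b.v[7] == 17);
  discard_big(10);
}
```

In the second function the call is still syntactically last, which is why the lifetime of the return object, not just the position of the call, decides whether a tail call is legal.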

I bootstrapped an x86_64-linux release compiler and observed that the stage 2 and stage 3 compilers have exactly the same size text, data and bss.  Both compilers validate, of course.

pr51000 hypothesizes the issue is arch independent -- that is incorrect.  Each code generator that manifests this issue has the same restriction. I've only looked at x86 for the moment.


https://reviews.llvm.org/D105807

Files:
  clang/test/CodeGenCXX/pr51000.cpp
  llvm/lib/Target/X86/X86ISelLowering.cpp
  llvm/lib/Target/X86/X86ISelLowering.h
  llvm/test/CodeGen/X86/sibcall.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D105807.357901.patch
Type: text/x-patch
Size: 18992 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210712/8d547629/attachment-0001.bin>

