[PATCH] D105807: [X86] pr51000 struct return tailcalling
Eli Friedman via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 13 11:28:20 PDT 2021
efriedma added inline comments.
================
Comment at: llvm/test/CodeGen/X86/sibcall.ll:854
%b = call fastcc %struct.foo* @ret_struct()
tail call fastcc void @t21_f_sret2(%struct.foo* noalias sret(%struct.foo) %a, %struct.foo* noalias %b) nounwind
ret void
----------------
urnathan wrote:
> efriedma wrote:
> > Oh, now I remember what happened the last time someone tried to fix this... t21_sret_to_sret_structs_mismatch() is also illegal to transform, for similar reasons to t21_sret_to_non_sret().
> Ah, I think I understand now -- I was thinking this would be UB, but I can see how it might be legit. Let me walk though my understanding.
>
> `define fastcc void @t21_sret_to_sret_structs_mismatch(%struct.foo* noalias sret(%struct.foo) %agg.result, %struct.foo* noalias %a) nounwind {`
> we receive an sret as %agg.result and a explicit pointer in %a (to uninitialized memory? or is that too C++-specific?).
>
> ` %b = call fastcc %struct.foo* @ret_struct()`
> this gives us a pointer to a foo in %b [aside, this seems superfluous to the xform being tested?]
>
> ` tail call fastcc void @t21_f_sret2(%struct.foo* noalias sret(%struct.foo) %a, %struct.foo* noalias %b) nounwind`
> We call t21_f_sret2 passing it %a as its incoming sret pointer and %b as an explicit arg. It'll return %a, not our incoming sret.
>
> I thought this would be UB, because we're not initializing our own return value -- in C++ we should be calling a constructor. but (a) we could have inlined that and (b) it could be trivial and do nothing. That was my logical error.
>
> So, I think
> a) if we're not an sret function we can tail call[*],
> b) else if we're calling a non-sret function we cannot tail call,
> c) else if we're passing our sret register as the callee's sret register we can tail call[*],
> d) else we cannot.
> [*] provided the remaining checks are ok
That set of rules looks right.
For the testcase, a C equivalent would be something like this:
```
struct A { int x[10]; };
__attribute((noinline)) struct A t21_f_sret2(struct A *__restrict x) {
return (struct A){0};
}
struct A t21_sret_to_sret_structs_mismatch(struct A a[__restrict static 1]) {
struct A r = {};
*a = t21_f_sret2(0);
return r;
}
```
If you run this through clang -O2, you get IR almost identical to this testcase. (The result is missing a "tail" marker, but that's just a missed optimization; we're missing a run of tailcallelim in the normal pass pipeline.)
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D105807/new/
https://reviews.llvm.org/D105807
More information about the llvm-commits
mailing list