[llvm] r219899 - TRE: make TRE a bit more aggressive

Nick Lewycky nlewycky at google.com
Fri Oct 17 13:22:48 PDT 2014


On 17 October 2014 13:00, Reid Kleckner <rnk at google.com> wrote:

> On Fri, Oct 17, 2014 at 12:56 PM, Nick Lewycky <nlewycky at google.com>
> wrote:
>
>> On 17 October 2014 12:17, Rafael EspĂ­ndola <rafael.espindola at gmail.com>
>> wrote:
>>
>>> In fact this was PR7272.
>>>
>>
>> PR7272 is the other way around?
>>
>> define void @foo(i32* %x) {
>>   tail call void @bar(i32* byval %x)  ;; bad tail from PR7272
>>   ret void
>> }
>>
>> define void @foo(i32* byval %x) {
>>   tail call void @bar(i32* %x)  ;; okay tail from this patch
>>   ret void
>> }
>>
>
> This is precisely the case when it's not safe to tail call. You're
> capturing the address of something allocated on the current stack frame and
> passing it down, which you can't do.
>

Maybe I'm misunderstanding, but I really think this is fine. Specifically I
claim your error is on the clause "allocated on the current stack frame".
It's not allocated on the current stack frame, it's allocated on the stack
frame one above us.

declare void @bar(i32*)

define void @foo(i32* byval %x) {
  call void @bar(i32* %x)
  ret void
}

define void @test(i32* %y) {
  call void @foo(i32* byval %y)
  ret void
}

foo:                                    # @foo
        pushq   %rax
        leaq    16(%rsp), %rdi
        callq   bar
        popq    %rax
        retq

test:                                   # @test
        pushq   %rax
        movl    (%rdi), %eax
        movl    %eax, (%rsp)
        callq   foo
        popq    %rax
        retq

Let's see that again with a larger value being passed.

%large = type { [33 x i8] }

declare void @bar(%large*)

define void @foo(%large* byval %x) {
  call void @bar(%large* %x)
  ret void
}

define void @test(%large* %y) {
  call void @foo(%large* byval %y)
  ret void
}

foo:                                    # @foo
        pushq   %rax
        leaq    16(%rsp), %rdi
        callq   bar
        popq    %rax
        retq

test:                                   # @test
        subq    $40, %rsp
        movb    32(%rdi), %al
        movb    %al, 32(%rsp)
        movq    24(%rdi), %rax
        movq    %rax, 24(%rsp)
        movq    16(%rdi), %rax
        movq    %rax, 16(%rsp)
        movq    (%rdi), %rax
        movq    8(%rdi), %rcx
        movq    %rcx, 8(%rsp)
        movq    %rax, (%rsp)
        callq   foo
        addq    $40, %rsp
        retq

Nice big stack adjustment in test, not in foo. Passing a byval argument is
always, as far as I can tell, in the caller's stack not the callee's stack.
foo is free to tail call that argument down to bar. It's equivalent to
having an alloca in test, then foo can still tail call bar (we get that
case right today).

There might be other ABIs which but byvals in the callee's stack. I don't
know.

The PR7272 case doesn't have that problem. It could hypothetically be
> lowered correctly, and IMO should be rejected by the backend instead of the
> middle end.
>

It's hard for the backend to redo the analysis to determine whether any
byval arguments flow into tail calls. I think if we do this in the middle,
the backend would have to simply not do tail calls in the case where there
function has a byval argument.

Nick

At least I think so?
>>
>> On 17 October 2014 15:14, Rafael EspĂ­ndola <rafael.espindola at gmail.com>
>>> wrote:
>>> > Is this actually safe?
>>> >
>>> > The caller will deallocate the stack and then pass a pointer to that
>>> > area to the callee...
>>> >
>>> > On 15 October 2014 23:27, Saleem Abdulrasool <compnerd at compnerd.org>
>>> wrote:
>>> >> Author: compnerd
>>> >> Date: Wed Oct 15 22:27:30 2014
>>> >> New Revision: 219899
>>> >>
>>> >> URL: http://llvm.org/viewvc/llvm-project?rev=219899&view=rev
>>> >> Log:
>>> >> TRE: make TRE a bit more aggressive
>>> >>
>>> >> Make tail recursion elimination a bit more aggressive.  This allows
>>> us to get
>>> >> tail recursion on functions that are just branches to a different
>>> function.  The
>>> >> fact that the function takes a byval argument does not restrict it
>>> from being
>>> >> optimised into just a tail call.
>>> >>
>>> >> Added:
>>> >>     llvm/trunk/test/Transforms/TailCallElim/byval.ll
>>> >> Modified:
>>> >>     llvm/trunk/lib/Transforms/Scalar/TailRecursionElimination.cpp
>>> >>     llvm/trunk/test/Transforms/Inline/byval-tail-call.ll
>>> >>     llvm/trunk/test/Transforms/TailCallElim/basic.ll
>>> >>
>>> >> Modified:
>>> llvm/trunk/lib/Transforms/Scalar/TailRecursionElimination.cpp
>>> >> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/TailRecursionElimination.cpp?rev=219899&r1=219898&r2=219899&view=diff
>>> >>
>>> ==============================================================================
>>> >> --- llvm/trunk/lib/Transforms/Scalar/TailRecursionElimination.cpp
>>> (original)
>>> >> +++ llvm/trunk/lib/Transforms/Scalar/TailRecursionElimination.cpp Wed
>>> Oct 15 22:27:30 2014
>>> >> @@ -249,12 +249,7 @@ bool TailCallElim::markTails(Function &F
>>> >>      return false;
>>> >>    AllCallsAreTailCalls = true;
>>> >>
>>> >> -  // The local stack holds all alloca instructions and all byval
>>> arguments.
>>> >>    AllocaDerivedValueTracker Tracker;
>>> >> -  for (Argument &Arg : F.args()) {
>>> >> -    if (Arg.hasByValAttr())
>>> >> -      Tracker.walk(&Arg);
>>> >> -  }
>>> >>    for (auto &BB : F) {
>>> >>      for (auto &I : BB)
>>> >>        if (AllocaInst *AI = dyn_cast<AllocaInst>(&I))
>>> >> @@ -310,9 +305,8 @@ bool TailCallElim::markTails(Function &F
>>> >>          for (auto &Arg : CI->arg_operands()) {
>>> >>            if (isa<Constant>(Arg.getUser()))
>>> >>              continue;
>>> >> -          if (Argument *A = dyn_cast<Argument>(Arg.getUser()))
>>> >> -            if (!A->hasByValAttr())
>>> >> -              continue;
>>> >> +          if (isa<Argument>(Arg.getUser()))
>>> >> +            continue;
>>> >>            SafeToTail = false;
>>> >>            break;
>>> >>          }
>>> >>
>>> >> Modified: llvm/trunk/test/Transforms/Inline/byval-tail-call.ll
>>> >> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/Inline/byval-tail-call.ll?rev=219899&r1=219898&r2=219899&view=diff
>>> >>
>>> ==============================================================================
>>> >> --- llvm/trunk/test/Transforms/Inline/byval-tail-call.ll (original)
>>> >> +++ llvm/trunk/test/Transforms/Inline/byval-tail-call.ll Wed Oct 15
>>> 22:27:30 2014
>>> >> @@ -27,10 +27,11 @@ define internal void @qux(i32* byval %x)
>>> >>    tail call void @ext(i32* null)
>>> >>    ret void
>>> >>  }
>>> >> +
>>> >>  define void @frob(i32* %x) {
>>> >>  ; CHECK-LABEL: define void @frob(
>>> >>  ; CHECK: alloca i32
>>> >> -; CHECK: {{^ *}}call void @ext(
>>> >> +; CHECK: {{^ *}}tail call void @ext(
>>> >>  ; CHECK: tail call void @ext(i32* null)
>>> >>  ; CHECK: ret void
>>> >>    tail call void @qux(i32* byval %x)
>>> >>
>>> >> Modified: llvm/trunk/test/Transforms/TailCallElim/basic.ll
>>> >> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/TailCallElim/basic.ll?rev=219899&r1=219898&r2=219899&view=diff
>>> >>
>>> ==============================================================================
>>> >> --- llvm/trunk/test/Transforms/TailCallElim/basic.ll (original)
>>> >> +++ llvm/trunk/test/Transforms/TailCallElim/basic.ll Wed Oct 15
>>> 22:27:30 2014
>>> >> @@ -147,7 +147,7 @@ cond_false:
>>> >>  ; Don't tail call if a byval arg is captured.
>>> >>  define void @test9(i32* byval %a) {
>>> >>  ; CHECK-LABEL: define void @test9(
>>> >> -; CHECK: {{^ *}}call void @use(
>>> >> +; CHECK: {{^ *}}tail call void @use(
>>> >>    call void @use(i32* %a)
>>> >>    ret void
>>> >>  }
>>> >>
>>> >> Added: llvm/trunk/test/Transforms/TailCallElim/byval.ll
>>> >> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/TailCallElim/byval.ll?rev=219899&view=auto
>>> >>
>>> ==============================================================================
>>> >> --- llvm/trunk/test/Transforms/TailCallElim/byval.ll (added)
>>> >> +++ llvm/trunk/test/Transforms/TailCallElim/byval.ll Wed Oct 15
>>> 22:27:30 2014
>>> >> @@ -0,0 +1,34 @@
>>> >> +; RUN: opt -mtriple i386 -Os -S %s -o - | FileCheck %s
>>> >> +; RUN: opt -mtriple x86_64 -Os -S %s -o - | FileCheck %s
>>> >> +; RUN: opt -mtriple armv7 -Os -S %s -o - | FileCheck %s
>>> >> +
>>> >> +%struct.D16 = type { [16 x double] }
>>> >> +
>>> >> +declare void @_Z2OpP3D16PKS_S2_(%struct.D16*, %struct.D16*,
>>> %struct.D16*)
>>> >> +
>>> >> +define void @_Z7TestRefRK3D16S1_(%struct.D16* noalias sret
>>> %agg.result, %struct.D16* %RHS, %struct.D16* %LHS) {
>>> >> +  %1 = alloca %struct.D16*, align 8
>>> >> +  %2 = alloca %struct.D16*, align 8
>>> >> +  store %struct.D16* %RHS, %struct.D16** %1, align 8
>>> >> +  store %struct.D16* %LHS, %struct.D16** %2, align 8
>>> >> +  %3 = load %struct.D16** %1, align 8
>>> >> +  %4 = load %struct.D16** %2, align 8
>>> >> +  call void @_Z2OpP3D16PKS_S2_(%struct.D16* %agg.result,
>>> %struct.D16* %3, %struct.D16* %4)
>>> >> +  ret void
>>> >> +}
>>> >> +
>>> >> +; CHECK: define void @_Z7TestRefRK3D16S1_({{.*}}) {
>>> >> +; CHECK:   tail call void @_Z2OpP3D16PKS_S2_(%struct.D16*
>>> %agg.result, %struct.D16* %RHS, %struct.D16* %LHS)
>>> >> +; CHECK:   ret void
>>> >> +; CHECK: }
>>> >> +
>>> >> +define void @_Z7TestVal3D16S_(%struct.D16* noalias sret %agg.result,
>>> %struct.D16* byval align 8 %RHS, %struct.D16* byval align 8 %LHS) {
>>> >> +  call void @_Z2OpP3D16PKS_S2_(%struct.D16* %agg.result,
>>> %struct.D16* %RHS, %struct.D16* %LHS)
>>> >> +  ret void
>>> >> +}
>>> >> +
>>> >> +; CHECK: define void @_Z7TestVal3D16S_({{.*}}) {
>>> >> +; CHECK:   tail call void @_Z2OpP3D16PKS_S2_(%struct.D16*
>>> %agg.result, %struct.D16* %RHS, %struct.D16* %LHS)
>>> >> +; CHECK:   ret void
>>> >> +; CHECK: }
>>> >> +
>>> >>
>>> >>
>>> >> _______________________________________________
>>> >> llvm-commits mailing list
>>> >> llvm-commits at cs.uiuc.edu
>>> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141017/7e183016/attachment.html>


More information about the llvm-commits mailing list