[PATCH] D91020: [X86] Unbind the ebx with GOT address in regcall calling convention

Xiang Zhang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Nov 28 23:31:10 PST 2020


xiangzhangllvm marked 5 inline comments as done.
xiangzhangllvm added inline comments.


================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:4136
 
       // Note: The actual moving to ECX is done further down.
       GlobalAddressSDNode *G = dyn_cast<GlobalAddressSDNode>(Callee);
----------------
xiangzhangllvm wrote:
> xiangzhangllvm wrote:
> > LuoYuanke wrote:
> > > xiangzhangllvm wrote:
> > > > LuoYuanke wrote:
> > > > > Would you add a test case for tail call? Is there any conflict to ECX?
> > > > I checked this locally before. For 32-bit X86, tail call does not work with regcall. For X86_64 it works, but X86_64 does not bind ebx to the GOT pointer.
> > > > Please check the following case: it does not generate a jmp for the 2nd command.
> > > > 
> > > > ```
> > > > ; llc -mtriple=x86_64-unknown-linux-gnu -relocation-model=pic
> > > > ; llc -mtriple=i386-unknown-linux-gnu -relocation-model=pic  <-tailcallopt>
> > > > 
> > > > declare x86_regcallcc void @regcall_not_lazy(i32 %a0, i32 %b0)
> > > > 
> > > > define void @tail_call_regcall() nounwind {
> > > >   tail call x86_regcallcc void @regcall_not_lazy(i32 0, i32 1)
> > > >   ret void
> > > > }
> > > > ```
> > > It seems the compiler generates a jmp instruction only when the number of arguments is less than or equal to 2 and the PIC relocation model is not used.
> > > ```
> > > ; llc -mtriple=x86_64-unknown-linux-gnu
> > > ; llc -mtriple=i386-unknown-linux-gnu
> > > 
> > > @foo6 = external global void (i32 %0, i32 %1, i32 %2, i32 %3, i32 %4, i32 %5)*
> > > 
> > > define void @tail_call_regcall6(i32 %a, i32 %b, i32 %c, i32 %d, i32 %e, ...) nounwind {
> > >   %t0 = alloca i32, align 128
> > >   %t1 = load void (i32, i32, i32, i32, i32, i32)*, void (i32, i32, i32, i32, i32, i32)** @foo6, align 4
> > >   tail call x86_regcallcc void %t1(i32 0, i32 1, i32 2, i32 3, i32 4, i32 5) nounwind
> > >   ret void
> > > }
> > > 
> > > @foo5 = external global void (i32 %0, i32 %1, i32 %2, i32 %3, i32 %4)*
> > > 
> > > define void @tail_call_regcall5(i32 %a, i32 %b, i32 %c, i32 %d, i32 %e) nounwind {
> > >   %t1 = load void (i32, i32, i32, i32, i32)*, void (i32, i32, i32, i32, i32)** @foo5, align 4
> > >   ; tail call x86_regcallcc void %t1(i32 0, i32 1, i32 2, i32 3, i32 4) nounwind
> > >   tail call x86_regcallcc void %t1(i32 %a, i32 %b, i32 %c, i32 %d, i32 %e) nounwind
> > >   ret void
> > > }
> > > 
> > > @foo4 = external global void (i32 %0, i32 %1, i32 %2, i32 %3)*
> > > 
> > > define void @tail_call_regcall4(i32 %a, i32 %b, i32 %c, i32 %d) nounwind {
> > >   %t1 = load void (i32, i32, i32, i32)*, void (i32, i32, i32, i32)** @foo4, align 4
> > >   ; tail call x86_regcallcc void %t1(i32 0, i32 1, i32 2, i32 3, i32 4) nounwind
> > >   tail call x86_regcallcc void %t1(i32 %a, i32 %b, i32 %c, i32 %d) nounwind
> > >   ret void
> > > }
> > > 
> > > @foo3 = external global void (i32 %0, i32 %1, i32 %2)*
> > > 
> > > define void @tail_call_regcall3(i32 %a, i32 %b) nounwind {
> > >   %t1 = load void (i32, i32, i32)*, void (i32, i32, i32)** @foo3, align 4
> > >   tail call x86_regcallcc void %t1(i32 0, i32 1, i32 2) nounwind
> > >   ret void
> > > }
> > > 
> > > @foo2 = external global void (i32 %0, i32 %1)*
> > > 
> > > define void @tail_call_regcall2(i32 %a, i32 %b) nounwind {
> > >   %t1 = load void (i32, i32)*, void (i32, i32)** @foo2, align 4
> > >   tail call x86_regcallcc void %t1(i32 0, i32 1) nounwind
> > >   ; tail call x86_regcallcc void %t1(i32 %a, i32 %b) nounwind
> > >   ret void
> > > }
> > > ```
> > I added "-tailcallopt" to your test, and all the jumps disappeared.
> > The constraint on tail calls should just be the one about "variable argument lists are used" (it should not depend on the number of function arguments).
> > I guess there must be some bug in tail call itself.
> > Anyway, I'll take a deeper look at the tail call; all the tests I checked before were under PIC mode.
> > And for the relation between ebx and the GOT, we only need to check PIC mode.
> > It seems the compiler generates a jmp instruction only when the number of arguments is less than or equal to 2 and the PIC relocation model is not used.
>     Yes. In fact, the current X86 lowering already considers that the "tailcall address may be in a register", and it tries to avoid running out of registers during allocation, so it limits the number of registers used for function args.
>     In PIC mode, one more register needs to be bound to the GOT, so fewer registers are available for function args than in non-PIC mode. PIC mode also disables tail calls to external symbols with default visibility.
>    So I reproduced the tail call for PIC with the following case:
> 
> ```
> ; llc -mtriple=i386-unknown-linux-gnu -relocation-model=pic tail.ll
> 
> @a0 = global i32 0, align 4
> 
> define x86_regcallcc void @tail_call_regcall2(i32 %a) nounwind {
>   tail call x86_regcallcc void @__regcall3__func(i32 %a) nounwind
>   ret void
> }
> 
> define internal x86_regcallcc void @__regcall3__func(i32 %i1) #0 {
> entry:
>   store i32 %i1, i32* @a0, align 4
>   ret void
> }
> ```
> The current change has no effect on it. (In PIC mode, the tail call loads the address of the callee into ECX due to the ebx/callee-saved problem.)
Though this patch does not affect tail calls, I still added a test (X86/tailregccpic.ll) to make sure of it.
What's more, tail call lowering already limits the number of arguments for the tail callee (to 1 in PIC mode), so it has already taken the register allocation problem into account.
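
As a rough illustration of that limit, here is a minimal Python sketch of the register budget. It is a simplified model, not the code in X86ISelLowering.cpp: the names `max_reg_args` and `tail_call_fits` are hypothetical, and the limits (2 register arguments without PIC, 1 with PIC) are taken from the observations in this thread. It assumes a 32-bit target where the tail-call target address has to sit in EAX, ECX or EDX.

```python
# Simplified model of the 32-bit register budget discussed in this thread.
# Assumption: the tail-call target address must live in one of these
# registers, so arguments passed in them compete with it.
CALLER_SAVED = ("EAX", "ECX", "EDX")


def max_reg_args(pic: bool) -> int:
    """Number of arguments that may occupy EAX/ECX/EDX in a tail call."""
    # One register is kept free for the callee address; per the thread,
    # PIC mode needs one more register on top of that.
    reserved = 2 if pic else 1
    return len(CALLER_SAVED) - reserved


def tail_call_fits(arg_regs, pic: bool) -> bool:
    """Return True if the argument registers leave room for the callee address."""
    # Only arguments in EAX/ECX/EDX count against the budget.
    used = sum(1 for reg in arg_regs if reg in CALLER_SAVED)
    return used <= max_reg_args(pic)


if __name__ == "__main__":
    # The register assignments below are illustrative only.
    print(tail_call_fits(["EAX", "ECX"], pic=False))
    print(tail_call_fits(["EAX", "ECX"], pic=True))
```

Under this model, two register arguments fit without PIC but not with PIC, which matches the behaviour reported above.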


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91020/new/

https://reviews.llvm.org/D91020


