[PATCH] D91020: [X86] Unbind the ebx with GOT address in regcall calling convention

Xiang Zhang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Nov 7 22:55:43 PST 2020


xiangzhangllvm marked 2 inline comments as not done.
xiangzhangllvm added inline comments.


================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:4120
   if (Subtarget.isPICStyleGOT()) {
     // ELF / PIC requires GOT in the EBX register before function calls via PLT
+    // GOT pointer (except regcall).
----------------
LuoYuanke wrote:
> To be conservative, do you need to check if the argument number exceed 4 (>=5)?
It is unclear to me why we need fix ebx for GOT points other than let it freely choose, (even in other calling convention), 
In my eyes, It is not well/strange to bind the ebx with GOT according to argument number.


================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:4136
 
       // Note: The actual moving to ECX is done further down.
       GlobalAddressSDNode *G = dyn_cast<GlobalAddressSDNode>(Callee);
----------------
LuoYuanke wrote:
> xiangzhangllvm wrote:
> > LuoYuanke wrote:
> > > Would you add a test case for tail call? Is there any conflict to ECX?
> > I check here local before, For X86, tail call will not work with regcall, For X86_64 it will work, but X86_64 don't fix ebx with GOT point.
> > please check the following case, it will not generate jump  for 2nd command.
> > 
> > ```
> >   1 ; llc -mtriple=x86_64-unknown-linux-gnu -relocation-model=pic
> >   2 ; llc -mtriple=i386-unknown-linux-gnu -relocation-model=pic  <-tailcallopt>
> >   3
> >   4 declare x86_regcallcc void @regcall_not_lazy(i32 %a0, i32 %b0)
> >   5
> >   6 define void @tail_call_regcall() nounwind {
> >   7   tail call x86_regcallcc void @regcall_not_lazy(i32 0, i32 1)
> >   8   ret void
> >   9 }
> > ```
> It seems compiler generate jmp instruction only when the argument number is less or equal to 2 and without pic relocation model.
> ```
> ; llc -mtriple=x86_64-unknown-linux-gnu
> ; llc -mtriple=i386-unknown-linux-gnu
> 
> @foo6 = external global void (i32 %0, i32 %1, i32 %2, i32 %3, i32 %4, i32 %5)*
> 
> define void @tail_call_regcall6(i32 %a, i32 %b, i32 %c, i32 %d, i32 %e, ...) nounwind {
>   %t0 = alloca i32, align 128
>   %t1 = load void (i32, i32, i32, i32, i32, i32)*, void (i32, i32, i32, i32, i32, i32)** @foo6, align 4
>   tail call x86_regcallcc void %t1(i32 0, i32 1, i32 2, i32 3, i32 4, i32 5) nounwind
>   ret void
> }
> 
> @foo5 = external global void (i32 %0, i32 %1, i32 %2, i32 %3, i32 %4)*
> 
> define void @tail_call_regcall5(i32 %a, i32 %b, i32 %c, i32 %d, i32 %e) nounwind {
>   %t1 = load void (i32, i32, i32, i32, i32)*, void (i32, i32, i32, i32, i32)** @foo5, align 4
>   ; tail call x86_regcallcc void %t1(i32 0, i32 1, i32 2, i32 3, i32 4) nounwind
>   tail call x86_regcallcc void %t1(i32 %a, i32 %b, i32 %c, i32 %d, i32 %e) nounwind
>   ret void
> }
> 
> @foo4 = external global void (i32 %0, i32 %1, i32 %2, i32 %3)*
> 
> define void @tail_call_regcall4(i32 %a, i32 %b, i32 %c, i32 %d) nounwind {
>   %t1 = load void (i32, i32, i32, i32)*, void (i32, i32, i32, i32)** @foo4, align 4
>   ; tail call x86_regcallcc void %t1(i32 0, i32 1, i32 2, i32 3, i32 4) nounwind
>   tail call x86_regcallcc void %t1(i32 %a, i32 %b, i32 %c, i32 %d) nounwind
>   ret void
> }
> 
> @foo3 = external global void (i32 %0, i32 %1, i32 %2)*
> 
> define void @tail_call_regcall3(i32 %a, i32 %b) nounwind {
>   %t1 = load void (i32, i32, i32)*, void (i32, i32, i32)** @foo3, align 4
>   tail call x86_regcallcc void %t1(i32 0, i32 1, i32 2) nounwind
>   ret void
> }
> 
> @foo2 = external global void (i32 %0, i32 %1)*
> 
> define void @tail_call_regcall2(i32 %a, i32 %b) nounwind {
>   %t1 = load void (i32, i32)*, void (i32, i32)** @foo2, align 4
>   tail call x86_regcallcc void %t1(i32 0, i32 1) nounwind
>   ; tail call x86_regcallcc void %t1(i32 %a, i32 %b) nounwind
>   ret void
> }
> ```
I add "-tailcallopt" on your test, all jump disappeared.
The constrain of tail call should just be "variable argument lists are used" (should not according to the numbers of function).
I guess there must be some bug about tail call itself.
Anyway I'll take a deeper look to check the tail call, all my tests I checked before under pic mode.
And for the relation of ebx and GOT we only need to check pic mode.


================
Comment at: llvm/test/CodeGen/X86/x86-regcall-got.ll:55
+attributes #0 = { "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="pentium4" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" "unsafe-fp-math"="false" "use-soft-float"="false" }
+attributes #1 = { noinline nounwind optnone "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="pentium4" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" "unsafe-fp-math"="false" "use-soft-float"="false" }
+
----------------
MaskRay wrote:
> The attributes can be reduced 
> The attributes can be reduced 




================
Comment at: llvm/test/CodeGen/X86/x86-regcall-got.ll:63
+!2 = !{i32 7, !"PIC Level", i32 1}
+!3 = !{!"Intel(R) oneAPI DPC++ Compiler 2021.1 (YYYY.x.0.MMDD)"}
----------------
MaskRay wrote:
> Unneeded metadatas should be stripped.
I'll fix it, thank you!


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91020/new/

https://reviews.llvm.org/D91020



More information about the llvm-commits mailing list