[PATCH] D69372: [X86][VARARG] Avoid spilling xmm vararg arguments.

Alexey Lapshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 14 10:42:08 PST 2020


avl added a comment.

In D69372#1818166 <https://reviews.llvm.org/D69372#1818166>, @rnk wrote:

> > For the noimplicitfloat mode, the compiler mustn't generate
> >  floating-point code if it was not asked directly to do so.
>
> I don't think checking AL on function entry will work reliably. In general, musttail is meant to forward all possible register parameters. Normal arguments may be passed in XMM registers without setting AL. Consider this code:
>
>   typedef float v4f32 __attribute__((__vector_size__(16), __aligned__(16)));
>   int gv_i32;
>   v4f32 gv_v4f32;
>   void bar(int x, v4f32 v) {
>     gv_i32 = x;
>     gv_v4f32 = v;
>   }
>   void foo() {
>     v4f32 v = {};
>     bar(42, v);
>   }
>
>
> It should be possible to use musttail thunk with a prototype of `void (int, ...)` between the call to foo and bar.


if I correctly understood the example - such a situation should not occur.
My understanding is that musttail thunk and it`s target _must_ have identical signatures.

In the above example, thunk assumed this definition:  void (int, ...) while the target has this definition: void (int x, v4f32 v), that is wrong.

There should be either:

thunk: void (int, ...)
target: void (int, ...)

either:

thunk: void (int x, v4f32 v) 
target: void (int x, v4f32 v)

In both these cases, this patch would work correctly.
Thus, assuming identical declarations, checking AL should work correctly.

> And, if that thunk contains a function call, it will need to spill&fill all XMM argument registers. If you look at the generated assembly for calling `bar`, AL is not set to 1:
> 
>   # %bb.0:                                # %entry
>           pushq   %rax
>           .cfi_def_cfa_offset 16
>           xorps   %xmm0, %xmm0
>           movl    $42, %edi
>           callq   _Z3bariDv4_f

Right. AL is not set to 1. That is so because bar() does not use variable arguments according to its declaration. 
Thunk function should assume the same signature and do not use AL trick then.

> Is there actually a use case for musttail thunks and `noimplicitfloat`? Could we instead just declare that such a thunk represents explicit FP usage?

I do not know the real examples. But in general, it looks like a valid use case(musttail thunks and noimplicitfloat).

> Furthermore, this will probably do the wrong thing on Windows, where AL is never set, to my knowledge. Consider this C++ code that uses musttail in the MS ABI:
> 
>   int gv_i32;
>   double gv_f64;
>   class Foo {
>     void virtual bar(int x, double v);
>     void foo();
>   };
>   void Foo::bar(int x, double v) {
>     gv_i32 = x;
>     gv_f64 = v;
>   }
>   void Foo::foo() {
>     auto mp = &Foo::bar;
>     (this->*mp)(42, 0.0);
>   }
> 
> 
> The virtual member pointer thunk needs to pass along XMM2, although it doesn't contain a function call.

This patch does not work on windows . It is only for amd64 ABI:

bool X86CallLowering::lowerCall()
...

  if (STI.is64Bit() && !IsFixed && !STI.isCallingConvWin64(Info.CallConv)) {
    // From AMD64 ABI document:
    // For calls that may call functions that use varargs or stdargs
    // (prototype-less calls or calls to functions containing ellipsis (...) in
    // the declaration) %al is used as hidden argument to specify the number
    // of SSE registers used. The contents of %al do not need to match exactly
    // the number of registers, but must be an ubound on the number of SSE
    // registers used and is in the range 0 - 8 inclusive.
  
    MIRBuilder.buildInstr(X86::MOV8ri)
        .addDef(X86::AL)
        .addImm(Handler.getNumXmmRegs());
    MIB.addUse(X86::AL, RegState::Implicit);
  }

X86ISelLowering.cpp:
// FIXME: Get this from tablegen.
static ArrayRef<MCPhysReg> get64BitArgumentXMMs(MachineFunction &MF,

                                                CallingConv::ID CallConv,
                                                const X86Subtarget &Subtarget) {
  assert(Subtarget.is64Bit());
  if (Subtarget.isCallingConvWin64(CallConv)) {
    // The XMM registers which might contain var arg parameters are shadowed
    // in their paired GPR.  So we only need to save the GPR to their home
    // slots.
    // TODO: __vectorcall will change this.
    return None;
  }


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69372/new/

https://reviews.llvm.org/D69372





More information about the llvm-commits mailing list