[llvm] r269001 - [X86][AVX512] Use the proper load/store for AVX512 registers.

Quentin Colombet via llvm-commits llvm-commits at lists.llvm.org
Tue May 10 09:10:15 PDT 2016


> On May 9, 2016, at 10:37 PM, Craig Topper <craig.topper at gmail.com> wrote:
> 
> I made a few updates in r269017, r269018, r269019. We're now saving the full ZMM registers with AVX512. I also added them to the getCallPreservedMask function too since the code was already similar to the callee save code.

Thanks for catching this Craig!

> 
> Any idea why 32-bit mode doesn't have an AVX or AVX512 check?

It is probably broken.
The whole calling convention stuff is pretty broken as far as I can tell with respect to honoring actual features. I’ve pinged Juergen who added some of that code to know what was the intended semantic of those callee saved lists.

Cheers,
-Quentin
> 
> On Mon, May 9, 2016 at 7:28 PM, Craig Topper <craig.topper at gmail.com <mailto:craig.topper at gmail.com>> wrote:
> Though the test it fails is codeGen/X86/x86-interrupt_cc.ll     so maybe its another calling convention issue.
> 
> On Mon, May 9, 2016 at 7:24 PM, Craig Topper <craig.topper at gmail.com <mailto:craig.topper at gmail.com>> wrote:
> Those assertions should check for hasVLX() not hasAVX512() and that fails regressions. So this still isn't completely right.
> 
> On Mon, May 9, 2016 at 6:09 PM, Quentin Colombet via llvm-commits <llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>> wrote:
> Author: qcolombet
> Date: Mon May  9 20:09:14 2016
> New Revision: 269001
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=269001&view=rev <http://llvm.org/viewvc/llvm-project?rev=269001&view=rev>
> Log:
> [X86][AVX512] Use the proper load/store for AVX512 registers.
> 
> When loading or storing AVX512 registers we were not using the AVX512
> variant of the load and store for VR128 and VR256 like registers.
> Thus, we ended up with the wrong encoding and actually were dropping the
> high bits of the instruction. The result was that we load or store the
> wrong register. The effect is visible only when we emit the object file
> directly and disassemble it. Then, the output of the disassembler does
> not match the assembly input.
> 
> This is related to llvm.org/PR27481 <http://llvm.org/PR27481>.
> 
> Added:
>     llvm/trunk/test/CodeGen/X86/x86-interrupt_cc.ll
> Modified:
>     llvm/trunk/lib/Target/X86/X86CallingConv.td
>     llvm/trunk/lib/Target/X86/X86InstrInfo.cpp
>     llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp
> 
> Modified: llvm/trunk/lib/Target/X86/X86CallingConv.td
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86CallingConv.td?rev=269001&r1=269000&r2=269001&view=diff <http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86CallingConv.td?rev=269001&r1=269000&r2=269001&view=diff>
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86CallingConv.td (original)
> +++ llvm/trunk/lib/Target/X86/X86CallingConv.td Mon May  9 20:09:14 2016
> @@ -899,6 +899,8 @@ def CSR_64_AllRegs     : CalleeSavedRegs
>  def CSR_64_AllRegs_AVX : CalleeSavedRegs<(sub (add CSR_64_MostRegs, RAX, RSP,
>                                                     (sequence "YMM%u", 0, 15)),
>                                                (sequence "XMM%u", 0, 15))>;
> +def CSR_64_AllRegs_AVX512 : CalleeSavedRegs<(add CSR_64_AllRegs_AVX,
> +                                             (sequence "YMM%u", 16, 31))>;
> 
>  // Standard C + YMM6-15
>  def CSR_Win64_Intel_OCL_BI_AVX : CalleeSavedRegs<(add RBX, RBP, RDI, RSI, R12,
> 
> Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.cpp?rev=269001&r1=269000&r2=269001&view=diff <http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.cpp?rev=269001&r1=269000&r2=269001&view=diff>
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86InstrInfo.cpp (original)
> +++ llvm/trunk/lib/Target/X86/X86InstrInfo.cpp Mon May  9 20:09:14 2016
> @@ -4645,23 +4645,35 @@ static unsigned getLoadStoreRegOpcode(un
>      assert((X86::VR128RegClass.hasSubClassEq(RC) ||
>              X86::VR128XRegClass.hasSubClassEq(RC))&& "Unknown 16-byte regclass");
>      // If stack is realigned we can use aligned stores.
> +    if (X86::VR128RegClass.hasSubClassEq(RC)) {
> +      if (isStackAligned)
> +        return load ? (HasAVX ? X86::VMOVAPSrm : X86::MOVAPSrm)
> +                    : (HasAVX ? X86::VMOVAPSmr : X86::MOVAPSmr);
> +      else
> +        return load ? (HasAVX ? X86::VMOVUPSrm : X86::MOVUPSrm)
> +                    : (HasAVX ? X86::VMOVUPSmr : X86::MOVUPSmr);
> +    }
> +    assert(STI.hasAVX512() && "Using extended register requires AVX512");
>      if (isStackAligned)
> -      return load ?
> -        (HasAVX ? X86::VMOVAPSrm : X86::MOVAPSrm) :
> -        (HasAVX ? X86::VMOVAPSmr : X86::MOVAPSmr);
> +      return load ? X86::VMOVAPSZ128rm : X86::VMOVAPSZ128mr;
>      else
> -      return load ?
> -        (HasAVX ? X86::VMOVUPSrm : X86::MOVUPSrm) :
> -        (HasAVX ? X86::VMOVUPSmr : X86::MOVUPSmr);
> +      return load ? X86::VMOVUPSZ128rm : X86::VMOVUPSZ128mr;
>    }
>    case 32:
>      assert((X86::VR256RegClass.hasSubClassEq(RC) ||
>              X86::VR256XRegClass.hasSubClassEq(RC)) && "Unknown 32-byte regclass");
>      // If stack is realigned we can use aligned stores.
> +    if (X86::VR256RegClass.hasSubClassEq(RC)) {
> +      if (isStackAligned)
> +        return load ? X86::VMOVAPSYrm : X86::VMOVAPSYmr;
> +      else
> +        return load ? X86::VMOVUPSYrm : X86::VMOVUPSYmr;
> +    }
> +    assert(STI.hasAVX512() && "Using extended register requires AVX512");
>      if (isStackAligned)
> -      return load ? X86::VMOVAPSYrm : X86::VMOVAPSYmr;
> +      return load ? X86::VMOVAPSZ256rm : X86::VMOVAPSZ256mr;
>      else
> -      return load ? X86::VMOVUPSYrm : X86::VMOVUPSYmr;
> +      return load ? X86::VMOVUPSZ256rm : X86::VMOVUPSZ256mr;
>    case 64:
>      assert(X86::VR512RegClass.hasSubClassEq(RC) && "Unknown 64-byte regclass");
>      if (isStackAligned)
> 
> Modified: llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp?rev=269001&r1=269000&r2=269001&view=diff <http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp?rev=269001&r1=269000&r2=269001&view=diff>
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp (original)
> +++ llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp Mon May  9 20:09:14 2016
> @@ -294,10 +294,11 @@ X86RegisterInfo::getCalleeSavedRegs(cons
>      return CSR_64_SaveList;
>    case CallingConv::X86_INTR:
>      if (Is64Bit) {
> +      if (HasAVX512)
> +        return CSR_64_AllRegs_AVX512_SaveList;
>        if (HasAVX)
>          return CSR_64_AllRegs_AVX_SaveList;
> -      else
> -        return CSR_64_AllRegs_SaveList;
> +      return CSR_64_AllRegs_SaveList;
>      } else {
>        if (HasSSE)
>          return CSR_32_AllRegs_SSE_SaveList;
> 
> Added: llvm/trunk/test/CodeGen/X86/x86-interrupt_cc.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/x86-interrupt_cc.ll?rev=269001&view=auto <http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/x86-interrupt_cc.ll?rev=269001&view=auto>
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/x86-interrupt_cc.ll (added)
> +++ llvm/trunk/test/CodeGen/X86/x86-interrupt_cc.ll Mon May  9 20:09:14 2016
> @@ -0,0 +1,19 @@
> +; RUN: llc -verify-machineinstrs -mtriple=x86_64-apple-macosx -show-mc-encoding -mattr=+avx512f < %s | FileCheck %s
> +
> +
> +; Make sure we spill the high numbered YMM registers with the right encoding.
> +; CHECK-LABEL: foo
> +; CHECK: movups %ymm31, {{.+}}
> +; CHECK:      encoding: [0x62,0x61,0x7c,0x28,0x11,0xbc,0x24,0xf0,0x03,0x00,0x00]
> +; ymm30 is used as an anchor for the previous regexp.
> +; CHECK-NEXT: movups %ymm30
> +; CHECK: call
> +; CHECK: iret
> +
> +define x86_intrcc void @foo(i8* %frame) {
> +  call void @bar()
> +  ret void
> +}
> +
> +declare void @bar()
> +
> 
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org>
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits>
> 
> 
> 
> -- 
> ~Craig
> 
> 
> 
> -- 
> ~Craig
> 
> 
> 
> -- 
> ~Craig

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160510/b7631621/attachment.html>


More information about the llvm-commits mailing list