<div dir="ltr">I made a few updates in r269017, r269018, r269019. We're now saving the full ZMM registers with AVX512. I also added them to the getCallPreservedMask function too since the code was already similar to the callee save code.<div><br></div><div>Any idea why 32-bit mode doesn't have an AVX or AVX512 check?</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, May 9, 2016 at 7:28 PM, Craig Topper <span dir="ltr"><<a href="mailto:craig.topper@gmail.com" target="_blank">craig.topper@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Though the test it fails is codeGen/X86/x86-interrupt_cc.ll so maybe its another calling convention issue.</div><div class="gmail_extra"><div><div class="h5"><br><div class="gmail_quote">On Mon, May 9, 2016 at 7:24 PM, Craig Topper <span dir="ltr"><<a href="mailto:craig.topper@gmail.com" target="_blank">craig.topper@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Those assertions should check for hasVLX() not hasAVX512() and that fails regressions. So this still isn't completely right.</div><div class="gmail_extra"><div><div><br><div class="gmail_quote">On Mon, May 9, 2016 at 6:09 PM, Quentin Colombet via llvm-commits <span dir="ltr"><<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Author: qcolombet<br>
On Mon, May 9, 2016 at 7:28 PM, Craig Topper <craig.topper@gmail.com> wrote:

Though the test that fails is CodeGen/X86/x86-interrupt_cc.ll, so maybe it's
another calling-convention issue.

On Mon, May 9, 2016 at 7:24 PM, Craig Topper <craig.topper@gmail.com> wrote:

Those assertions should check for hasVLX(), not hasAVX512(), and that change
fails regression tests. So this still isn't completely right.

On Mon, May 9, 2016 at 6:09 PM, Quentin Colombet via llvm-commits
<llvm-commits@lists.llvm.org> wrote:

Author: qcolombet
Date: Mon May 9 20:09:14 2016
New Revision: 269001

URL: http://llvm.org/viewvc/llvm-project?rev=269001&view=rev
Log:
[X86][AVX512] Use the proper load/store for AVX512 registers.

When loading or storing AVX512 registers we were not using the AVX512
variant of the load and store for VR128- and VR256-like register classes.
Thus, we ended up with the wrong encoding and were actually dropping the
high bits of the instruction. The result was that we loaded or stored the
wrong register. The effect is visible only when we emit the object file
directly and disassemble it; the output of the disassembler then does not
match the assembly input.

This is related to llvm.org/PR27481.
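For readers skimming the log, the 16-byte half of the fix boils down to the selection rule sketched below. This is a standalone illustration only: plain strings stand in for the real X86:: opcode enums and the helper name is invented; the actual code is the getLoadStoreRegOpcode hunk further down.

#include <string>

// Illustrative distillation of the 16-byte spill/reload opcode choice after
// the patch (store side shown; loads pick the matching *rm forms).
// XMM0-15 can keep the classic SSE/VEX opcodes; XMM16-31 exist only in the
// extended VR128X class with AVX512 and need an EVEX-encoded Z128 opcode,
// because a VEX encoding has no bits to name the high registers.
std::string select16ByteStoreOpcode(bool IsExtendedReg, bool HasAVX,
                                    bool StackAligned) {
  if (!IsExtendedReg) {
    if (StackAligned)
      return HasAVX ? "VMOVAPSmr" : "MOVAPSmr";
    return HasAVX ? "VMOVUPSmr" : "MOVUPSmr";
  }
  // The patch asserts hasAVX512() here; Craig's follow-up above suggests
  // hasVLX() is the more precise check for these 128/256-bit EVEX forms.
  return StackAligned ? "VMOVAPSZ128mr" : "VMOVUPSZ128mr";
}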

Added:
    llvm/trunk/test/CodeGen/X86/x86-interrupt_cc.ll
Modified:
    llvm/trunk/lib/Target/X86/X86CallingConv.td
    llvm/trunk/lib/Target/X86/X86InstrInfo.cpp
    llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp

Modified: llvm/trunk/lib/Target/X86/X86CallingConv.td
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86CallingConv.td?rev=269001&r1=269000&r2=269001&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86CallingConv.td (original)
+++ llvm/trunk/lib/Target/X86/X86CallingConv.td Mon May 9 20:09:14 2016
@@ -899,6 +899,8 @@ def CSR_64_AllRegs : CalleeSavedRegs<
 def CSR_64_AllRegs_AVX : CalleeSavedRegs<(sub (add CSR_64_MostRegs, RAX, RSP,
                                               (sequence "YMM%u", 0, 15)),
                                              (sequence "XMM%u", 0, 15))>;
+def CSR_64_AllRegs_AVX512 : CalleeSavedRegs<(add CSR_64_AllRegs_AVX,
+                                             (sequence "YMM%u", 16, 31))>;
 
 // Standard C + YMM6-15
 def CSR_Win64_Intel_OCL_BI_AVX : CalleeSavedRegs<(add RBX, RBP, RDI, RSI, R12,

Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.cpp?rev=269001&r1=269000&r2=269001&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86InstrInfo.cpp (original)
+++ llvm/trunk/lib/Target/X86/X86InstrInfo.cpp Mon May 9 20:09:14 2016
@@ -4645,23 +4645,35 @@ static unsigned getLoadStoreRegOpcode(un
     assert((X86::VR128RegClass.hasSubClassEq(RC) ||
             X86::VR128XRegClass.hasSubClassEq(RC))&& "Unknown 16-byte regclass");
     // If stack is realigned we can use aligned stores.
+    if (X86::VR128RegClass.hasSubClassEq(RC)) {
+      if (isStackAligned)
+        return load ? (HasAVX ? X86::VMOVAPSrm : X86::MOVAPSrm)
+                    : (HasAVX ? X86::VMOVAPSmr : X86::MOVAPSmr);
+      else
+        return load ? (HasAVX ? X86::VMOVUPSrm : X86::MOVUPSrm)
+                    : (HasAVX ? X86::VMOVUPSmr : X86::MOVUPSmr);
+    }
+    assert(STI.hasAVX512() && "Using extended register requires AVX512");
     if (isStackAligned)
-      return load ?
-        (HasAVX ? X86::VMOVAPSrm : X86::MOVAPSrm) :
-        (HasAVX ? X86::VMOVAPSmr : X86::MOVAPSmr);
+      return load ? X86::VMOVAPSZ128rm : X86::VMOVAPSZ128mr;
     else
-      return load ?
-        (HasAVX ? X86::VMOVUPSrm : X86::MOVUPSrm) :
-        (HasAVX ? X86::VMOVUPSmr : X86::MOVUPSmr);
+      return load ? X86::VMOVUPSZ128rm : X86::VMOVUPSZ128mr;
   }
   case 32:
     assert((X86::VR256RegClass.hasSubClassEq(RC) ||
             X86::VR256XRegClass.hasSubClassEq(RC)) && "Unknown 32-byte regclass");
     // If stack is realigned we can use aligned stores.
+    if (X86::VR256RegClass.hasSubClassEq(RC)) {
+      if (isStackAligned)
+        return load ? X86::VMOVAPSYrm : X86::VMOVAPSYmr;
+      else
+        return load ? X86::VMOVUPSYrm : X86::VMOVUPSYmr;
+    }
+    assert(STI.hasAVX512() && "Using extended register requires AVX512");
     if (isStackAligned)
-      return load ? X86::VMOVAPSYrm : X86::VMOVAPSYmr;
+      return load ? X86::VMOVAPSZ256rm : X86::VMOVAPSZ256mr;
     else
-      return load ? X86::VMOVUPSYrm : X86::VMOVUPSYmr;
+      return load ? X86::VMOVUPSZ256rm : X86::VMOVUPSZ256mr;
   case 64:
     assert(X86::VR512RegClass.hasSubClassEq(RC) && "Unknown 64-byte regclass");
     if (isStackAligned)

Modified: llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp?rev=269001&r1=269000&r2=269001&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp (original)
+++ llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp Mon May 9 20:09:14 2016
@@ -294,10 +294,11 @@ X86RegisterInfo::getCalleeSavedRegs(cons
     return CSR_64_SaveList;
   case CallingConv::X86_INTR:
     if (Is64Bit) {
+      if (HasAVX512)
+        return CSR_64_AllRegs_AVX512_SaveList;
       if (HasAVX)
         return CSR_64_AllRegs_AVX_SaveList;
-      else
-        return CSR_64_AllRegs_SaveList;
+      return CSR_64_AllRegs_SaveList;
     } else {
       if (HasSSE)
         return CSR_32_AllRegs_SSE_SaveList;

Added: llvm/trunk/test/CodeGen/X86/x86-interrupt_cc.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/x86-interrupt_cc.ll?rev=269001&view=auto
==============================================================================
--- llvm/trunk/test/CodeGen/X86/x86-interrupt_cc.ll (added)
+++ llvm/trunk/test/CodeGen/X86/x86-interrupt_cc.ll Mon May 9 20:09:14 2016
@@ -0,0 +1,19 @@
+; RUN: llc -verify-machineinstrs -mtriple=x86_64-apple-macosx -show-mc-encoding -mattr=+avx512f < %s | FileCheck %s
+
+
+; Make sure we spill the high numbered YMM registers with the right encoding.
+; CHECK-LABEL: foo
+; CHECK: movups %ymm31, {{.+}}
+; CHECK: encoding: [0x62,0x61,0x7c,0x28,0x11,0xbc,0x24,0xf0,0x03,0x00,0x00]
+; ymm30 is used as an anchor for the previous regexp.
+; CHECK-NEXT: movups %ymm30
+; CHECK: call
+; CHECK: iret
+
+define x86_intrcc void @foo(i8* %frame) {
+  call void @bar()
+  ret void
+}
+
+declare void @bar()
+


_______________________________________________
llvm-commits mailing list
llvm-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits

--
~Craig