<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On May 9, 2016, at 10:37 PM, Craig Topper <<a href="mailto:craig.topper@gmail.com" class="">craig.topper@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">I made a few updates in r269017, r269018, r269019. We're now saving the full ZMM registers with AVX512. I also added them to the getCallPreservedMask function, since the code was already similar to the callee-save code.</div></div></blockquote><div><br class=""></div><div>Thanks for catching this, Craig!</div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class=""><br class=""></div><div class="">Any idea why 32-bit mode doesn't have an AVX or AVX512 check?</div></div></div></blockquote><div><br class=""></div>It is probably broken.</div><div>As far as I can tell, the whole calling convention code is pretty broken with respect to honoring actual features. 
I’ve pinged Juergen, who added some of that code, to find out what the intended semantics of those callee-saved lists are.</div><div><br class=""></div><div>Cheers,</div><div>-Quentin<br class=""><blockquote type="cite" class=""><div class=""><div class="gmail_extra"><br class=""><div class="gmail_quote">On Mon, May 9, 2016 at 7:28 PM, Craig Topper <span dir="ltr" class=""><<a href="mailto:craig.topper@gmail.com" target="_blank" class="">craig.topper@gmail.com</a>></span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" class="">Though the test it fails is CodeGen/X86/x86-interrupt_cc.ll, so maybe it's another calling convention issue.</div><div class="gmail_extra"><div class=""><div class="h5"><br class=""><div class="gmail_quote">On Mon, May 9, 2016 at 7:24 PM, Craig Topper <span dir="ltr" class=""><<a href="mailto:craig.topper@gmail.com" target="_blank" class="">craig.topper@gmail.com</a>></span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" class="">Those assertions should check for hasVLX(), not hasAVX512(), and making that change fails regression tests. So this still isn't completely right.</div><div class="gmail_extra"><div class=""><div class=""><br class=""><div class="gmail_quote">On Mon, May 9, 2016 at 6:09 PM, Quentin Colombet via llvm-commits <span dir="ltr" class=""><<a href="mailto:llvm-commits@lists.llvm.org" target="_blank" class="">llvm-commits@lists.llvm.org</a>></span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Author: qcolombet<br class="">
Date: Mon May 9 20:09:14 2016<br class="">
New Revision: 269001<br class="">
<br class="">
URL: <a href="http://llvm.org/viewvc/llvm-project?rev=269001&view=rev" rel="noreferrer" target="_blank" class="">http://llvm.org/viewvc/llvm-project?rev=269001&view=rev</a><br class="">
Log:<br class="">
[X86][AVX512] Use the proper load/store for AVX512 registers.<br class="">
<br class="">
When loading or storing AVX512 registers, we were not using the AVX512<br class="">
variants of the load and store instructions for VR128- and VR256-like<br class="">
registers. Thus, we ended up with the wrong encoding and were actually<br class="">
dropping the high bits of the register number. As a result, we loaded or<br class="">
stored the wrong register. The effect is visible only when we emit the<br class="">
object file directly and disassemble it: the output of the disassembler<br class="">
then does not match the assembly input.<br class="">
<br class="">
This is related to <a href="http://llvm.org/PR27481" rel="noreferrer" target="_blank" class="">llvm.org/PR27481</a>.<br class="">
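<br class="">
A minimal sketch of one plausible reading of the failure described in the log, assuming the<br class="">
problem is the width of the register field: a VEX-encoded instruction has 4 bits for a<br class="">
vector register number, while an EVEX-encoded one has 5. The helper names below are<br class="">
hypothetical and are not LLVM code; only the bit widths come from the x86 encoding rules.<br class="">
<pre class="">
```cpp
// Hypothetical helpers, not part of LLVM: they only model how many bits of a
// vector register number survive in each encoding.

// VEX: 3 bits in ModRM plus 1 extension bit, so 4 bits total (registers 0-15).
constexpr unsigned vexRegField(unsigned regIndex) { return regIndex & 0xFu; }

// EVEX: one more extension bit, so 5 bits total (registers 0-31).
constexpr unsigned evexRegField(unsigned regIndex) { return regIndex & 0x1Fu; }

// ymm31 squeezed into a VEX-sized field names ymm15: the wrong register.
static_assert(vexRegField(31) == 15, "VEX field aliases 31 to 15");
// The EVEX-sized field keeps the register number intact.
static_assert(evexRegField(31) == 31, "EVEX field preserves 31");
// Registers 0-15 are unaffected, which is why only the high registers break.
static_assert(vexRegField(7) == evexRegField(7), "low registers agree");
```
</pre>
Under that assumption, picking the Z128/Z256 opcodes for the extended register classes is<br class="">
what keeps the fifth bit of the register number in the emitted bytes.<br class="">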
<br class="">
Added:<br class="">
llvm/trunk/test/CodeGen/X86/x86-interrupt_cc.ll<br class="">
Modified:<br class="">
llvm/trunk/lib/Target/X86/X86CallingConv.td<br class="">
llvm/trunk/lib/Target/X86/X86InstrInfo.cpp<br class="">
llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp<br class="">
<br class="">
Modified: llvm/trunk/lib/Target/X86/X86CallingConv.td<br class="">
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86CallingConv.td?rev=269001&r1=269000&r2=269001&view=diff" rel="noreferrer" target="_blank" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86CallingConv.td?rev=269001&r1=269000&r2=269001&view=diff</a><br class="">
==============================================================================<br class="">
--- llvm/trunk/lib/Target/X86/X86CallingConv.td (original)<br class="">
+++ llvm/trunk/lib/Target/X86/X86CallingConv.td Mon May 9 20:09:14 2016<br class="">
@@ -899,6 +899,8 @@ def CSR_64_AllRegs : CalleeSavedRegs<br class="">
def CSR_64_AllRegs_AVX : CalleeSavedRegs<(sub (add CSR_64_MostRegs, RAX, RSP,<br class="">
(sequence "YMM%u", 0, 15)),<br class="">
(sequence "XMM%u", 0, 15))>;<br class="">
+def CSR_64_AllRegs_AVX512 : CalleeSavedRegs<(add CSR_64_AllRegs_AVX,<br class="">
+ (sequence "YMM%u", 16, 31))>;<br class="">
<br class="">
// Standard C + YMM6-15<br class="">
def CSR_Win64_Intel_OCL_BI_AVX : CalleeSavedRegs<(add RBX, RBP, RDI, RSI, R12,<br class="">
<br class="">
Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.cpp<br class="">
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.cpp?rev=269001&r1=269000&r2=269001&view=diff" rel="noreferrer" target="_blank" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.cpp?rev=269001&r1=269000&r2=269001&view=diff</a><br class="">
==============================================================================<br class="">
--- llvm/trunk/lib/Target/X86/X86InstrInfo.cpp (original)<br class="">
+++ llvm/trunk/lib/Target/X86/X86InstrInfo.cpp Mon May 9 20:09:14 2016<br class="">
@@ -4645,23 +4645,35 @@ static unsigned getLoadStoreRegOpcode(un<br class="">
assert((X86::VR128RegClass.hasSubClassEq(RC) ||<br class="">
X86::VR128XRegClass.hasSubClassEq(RC))&& "Unknown 16-byte regclass");<br class="">
// If stack is realigned we can use aligned stores.<br class="">
+ if (X86::VR128RegClass.hasSubClassEq(RC)) {<br class="">
+ if (isStackAligned)<br class="">
+ return load ? (HasAVX ? X86::VMOVAPSrm : X86::MOVAPSrm)<br class="">
+ : (HasAVX ? X86::VMOVAPSmr : X86::MOVAPSmr);<br class="">
+ else<br class="">
+ return load ? (HasAVX ? X86::VMOVUPSrm : X86::MOVUPSrm)<br class="">
+ : (HasAVX ? X86::VMOVUPSmr : X86::MOVUPSmr);<br class="">
+ }<br class="">
+ assert(STI.hasAVX512() && "Using extended register requires AVX512");<br class="">
if (isStackAligned)<br class="">
- return load ?<br class="">
- (HasAVX ? X86::VMOVAPSrm : X86::MOVAPSrm) :<br class="">
- (HasAVX ? X86::VMOVAPSmr : X86::MOVAPSmr);<br class="">
+ return load ? X86::VMOVAPSZ128rm : X86::VMOVAPSZ128mr;<br class="">
else<br class="">
- return load ?<br class="">
- (HasAVX ? X86::VMOVUPSrm : X86::MOVUPSrm) :<br class="">
- (HasAVX ? X86::VMOVUPSmr : X86::MOVUPSmr);<br class="">
+ return load ? X86::VMOVUPSZ128rm : X86::VMOVUPSZ128mr;<br class="">
}<br class="">
case 32:<br class="">
assert((X86::VR256RegClass.hasSubClassEq(RC) ||<br class="">
X86::VR256XRegClass.hasSubClassEq(RC)) && "Unknown 32-byte regclass");<br class="">
// If stack is realigned we can use aligned stores.<br class="">
+ if (X86::VR256RegClass.hasSubClassEq(RC)) {<br class="">
+ if (isStackAligned)<br class="">
+ return load ? X86::VMOVAPSYrm : X86::VMOVAPSYmr;<br class="">
+ else<br class="">
+ return load ? X86::VMOVUPSYrm : X86::VMOVUPSYmr;<br class="">
+ }<br class="">
+ assert(STI.hasAVX512() && "Using extended register requires AVX512");<br class="">
if (isStackAligned)<br class="">
- return load ? X86::VMOVAPSYrm : X86::VMOVAPSYmr;<br class="">
+ return load ? X86::VMOVAPSZ256rm : X86::VMOVAPSZ256mr;<br class="">
else<br class="">
- return load ? X86::VMOVUPSYrm : X86::VMOVUPSYmr;<br class="">
+ return load ? X86::VMOVUPSZ256rm : X86::VMOVUPSZ256mr;<br class="">
case 64:<br class="">
assert(X86::VR512RegClass.hasSubClassEq(RC) && "Unknown 64-byte regclass");<br class="">
if (isStackAligned)<br class="">
<br class="">
Modified: llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp<br class="">
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp?rev=269001&r1=269000&r2=269001&view=diff" rel="noreferrer" target="_blank" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp?rev=269001&r1=269000&r2=269001&view=diff</a><br class="">
==============================================================================<br class="">
--- llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp (original)<br class="">
+++ llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp Mon May 9 20:09:14 2016<br class="">
@@ -294,10 +294,11 @@ X86RegisterInfo::getCalleeSavedRegs(cons<br class="">
return CSR_64_SaveList;<br class="">
case CallingConv::X86_INTR:<br class="">
if (Is64Bit) {<br class="">
+ if (HasAVX512)<br class="">
+ return CSR_64_AllRegs_AVX512_SaveList;<br class="">
if (HasAVX)<br class="">
return CSR_64_AllRegs_AVX_SaveList;<br class="">
- else<br class="">
- return CSR_64_AllRegs_SaveList;<br class="">
+ return CSR_64_AllRegs_SaveList;<br class="">
} else {<br class="">
if (HasSSE)<br class="">
return CSR_32_AllRegs_SSE_SaveList;<br class="">
<br class="">
Added: llvm/trunk/test/CodeGen/X86/x86-interrupt_cc.ll<br class="">
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/x86-interrupt_cc.ll?rev=269001&view=auto" rel="noreferrer" target="_blank" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/x86-interrupt_cc.ll?rev=269001&view=auto</a><br class="">
==============================================================================<br class="">
--- llvm/trunk/test/CodeGen/X86/x86-interrupt_cc.ll (added)<br class="">
+++ llvm/trunk/test/CodeGen/X86/x86-interrupt_cc.ll Mon May 9 20:09:14 2016<br class="">
@@ -0,0 +1,19 @@<br class="">
+; RUN: llc -verify-machineinstrs -mtriple=x86_64-apple-macosx -show-mc-encoding -mattr=+avx512f < %s | FileCheck %s<br class="">
+<br class="">
+<br class="">
+; Make sure we spill the high numbered YMM registers with the right encoding.<br class="">
+; CHECK-LABEL: foo<br class="">
+; CHECK: movups %ymm31, {{.+}}<br class="">
+; CHECK: encoding: [0x62,0x61,0x7c,0x28,0x11,0xbc,0x24,0xf0,0x03,0x00,0x00]<br class="">
+; ymm30 is used as an anchor for the previous regexp.<br class="">
+; CHECK-NEXT: movups %ymm30<br class="">
+; CHECK: call<br class="">
+; CHECK: iret<br class="">
+<br class="">
+define x86_intrcc void @foo(i8* %frame) {<br class="">
+ call void @bar()<br class="">
+ ret void<br class="">
+}<br class="">
+<br class="">
+declare void @bar()<br class="">
+<br class="">
<br class="">
<br class="">
_______________________________________________<br class="">
llvm-commits mailing list<br class="">
<a href="mailto:llvm-commits@lists.llvm.org" target="_blank" class="">llvm-commits@lists.llvm.org</a><br class="">
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits" rel="noreferrer" target="_blank" class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits</a><br class="">
</blockquote></div><br class=""><br clear="all" class=""><div class=""><br class=""></div></div></div><span class=""><font color="#888888" class="">-- <br class=""><div class="">~Craig</div>
</font></span></div>
</blockquote></div><br class=""><br clear="all" class=""><div class=""><br class=""></div></div></div><span class="HOEnZb"><font color="#888888" class="">-- <br class=""><div class="">~Craig</div>
</font></span></div>
</blockquote></div><br class=""><br clear="all" class=""><div class=""><br class=""></div>-- <br class=""><div class="gmail_signature">~Craig</div>
</div>
</div></blockquote></div><br class=""></body></html>