<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Nov 25, 2016, at 2:50 PM, Michael Kuperstein <<a href="mailto:mkuper@google.com" class="">mkuper@google.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">Hi Volkan,<div class=""><br class=""></div><div class="">I'll take a look once I'm back from vacation (mid next week), but I don't think we should be passing an empty Ops here.</div></div></div></blockquote><div><br class=""></div><div>InlineSpiller::foldMemoryOperand(…) doesn’t check if FoldOps is empty, so it is possible. I think we should fix this in either TargetInstrInfo::foldMemoryOperand or InlineSpiller::foldMemoryOperand.</div><div> </div><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class="">Do you have a test that this fails with?</div></div></div></blockquote><div><br class=""></div><div>This change breaks one of our internal tests.</div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class=""><br class=""></div><div class="">Thanks,</div><div class=""> Michael</div></div></div></blockquote><div><br class=""></div>Thank you,</div><div>Volkan</div><div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div class=""><br class=""></div></div><div class="gmail_extra"><br class=""><div class="gmail_quote">On Thu, Nov 24, 2016 at 7:45 AM, Volkan Keles <span dir="ltr" class=""><<a href="mailto:vkeles@apple.com" target="_blank" class="">vkeles@apple.com</a>></span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word" class="">Hi Michael,<div class=""><br class=""><div class=""><div class=""><div class="h5"><blockquote type="cite" class=""><div class="">On Nov 23, 2016, at 6:33 PM, Michael Kuperstein via llvm-commits <<a href="mailto:llvm-commits@lists.llvm.org" target="_blank" class="">llvm-commits@lists.llvm.org</a>> wrote:</div><br class="m_2580039044826008080Apple-interchange-newline"><div class=""><div class="">Author: mkuper<br class="">Date: Wed Nov 23 12:33:49 2016<br class="">New Revision: 287792<br class=""><br class="">URL: <a href="http://llvm.org/viewvc/llvm-project?rev=287792&view=rev" target="_blank" class="">http://llvm.org/viewvc/llvm-<wbr class="">project?rev=287792&view=rev</a><br class="">Log:<br class="">[X86] Allow folding of stack reloads when loading a subreg of the spilled reg<br class=""><br class="">We did not support subregs in InlineSpiller:<wbr class="">foldMemoryOperand() because targets<br class="">may not deal with them correctly.<br class=""><br class="">This adds a target hook to let the spiller know that a target can handle<br class="">subregs, and actually enables it for x86 for the case of stack slot reloads.<br class="">This fixes PR30832.<br class=""><br class="">Differential Revision: <a href="https://reviews.llvm.org/D26521" target="_blank" class="">https://reviews.llvm.org/<wbr class="">D26521</a><br class=""><br class="">Modified:<br class=""> llvm/trunk/include/llvm/<wbr class="">Target/TargetInstrInfo.h<br class=""> llvm/trunk/lib/CodeGen/<wbr class="">InlineSpiller.cpp<br class=""> llvm/trunk/lib/CodeGen/<wbr class="">TargetInstrInfo.cpp<br class=""> llvm/trunk/lib/Target/X86/<wbr class="">X86InstrInfo.cpp<br class=""> llvm/trunk/lib/Target/X86/<wbr class="">X86InstrInfo.h<br class=""> llvm/trunk/test/CodeGen/<wbr class="">X86/partial-fold32.ll<br class=""> llvm/trunk/test/CodeGen/<wbr class="">X86/partial-fold64.ll<br class=""> llvm/trunk/test/CodeGen/<wbr class="">X86/vector-half-conversions.ll<br class=""><br class="">Modified: llvm/trunk/include/llvm/<wbr class="">Target/TargetInstrInfo.h<br class="">URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetInstrInfo.h?rev=287792&r1=287791&r2=287792&view=diff" target="_blank" class="">http://llvm.org/viewvc/llvm-<wbr class="">project/llvm/trunk/include/<wbr class="">llvm/Target/TargetInstrInfo.h?<wbr class="">rev=287792&r1=287791&r2=<wbr class="">287792&view=diff</a><br class="">==============================<wbr class="">==============================<wbr class="">==================<br class="">--- llvm/trunk/include/llvm/<wbr class="">Target/TargetInstrInfo.h (original)<br class="">+++ llvm/trunk/include/llvm/<wbr class="">Target/TargetInstrInfo.h Wed Nov 23 12:33:49 2016<br class="">@@ -817,6 +817,20 @@ public:<br class=""> /// anything was changed.<br class=""> virtual bool expandPostRAPseudo(<wbr class="">MachineInstr &MI) const { return false; }<br class=""><br class="">+ /// Check whether the target can fold a load that feeds a subreg operand<br class="">+ /// (or a subreg operand that feeds a store).<br class="">+ /// For example, X86 may want to return true if it can fold<br class="">+ /// movl (%esp), %eax<br class="">+ /// subb, %al, ...<br class="">+ /// Into:<br class="">+ /// subb (%esp), ...<br class="">+ ///<br class="">+ /// Ideally, we'd like the target implementation of foldMemoryOperand() to<br class="">+ /// reject subregs - but since this behavior used to be enforced in the<br class="">+ /// target-independent code, moving this responsibility to the targets<br class="">+ /// has the potential of causing nasty silent breakage in out-of-tree targets.<br class="">+ virtual bool isSubregFoldable() const { return false; }<br class="">+<br class=""> /// Attempt to fold a load or store of the specified stack<br class=""> /// slot into the specified machine instruction for the specified operand(s).<br class=""> /// If this is possible, a new instruction is returned with the specified<br class=""><br class="">Modified: llvm/trunk/lib/CodeGen/<wbr class="">InlineSpiller.cpp<br class="">URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/InlineSpiller.cpp?rev=287792&r1=287791&r2=287792&view=diff" target="_blank" class="">http://llvm.org/viewvc/llvm-<wbr class="">project/llvm/trunk/lib/<wbr class="">CodeGen/InlineSpiller.cpp?rev=<wbr class="">287792&r1=287791&r2=287792&<wbr class="">view=diff</a><br class="">==============================<wbr class="">==============================<wbr class="">==================<br class="">--- llvm/trunk/lib/CodeGen/<wbr class="">InlineSpiller.cpp (original)<br class="">+++ llvm/trunk/lib/CodeGen/<wbr class="">InlineSpiller.cpp Wed Nov 23 12:33:49 2016<br class="">@@ -739,9 +739,12 @@ foldMemoryOperand(ArrayRef<<wbr class="">std::pair<Mac<br class=""> bool WasCopy = MI->isCopy();<br class=""> unsigned ImpReg = 0;<br class=""><br class="">- bool SpillSubRegs = (MI->getOpcode() == TargetOpcode::STATEPOINT ||<br class="">- MI-><wbr class="">getOpcode() == TargetOpcode::PATCHPOINT ||<br class="">- MI-><wbr class="">getOpcode() == TargetOpcode::STACKMAP);<br class="">+ // Spill subregs if the target allows it.<br class="">+ // We always want to spill subregs for stackmap/patchpoint pseudos.<br class="">+ bool SpillSubRegs = TII.isSubregFoldable() ||<br class="">+ MI-><wbr class="">getOpcode() == TargetOpcode::STATEPOINT ||<br class="">+ MI-><wbr class="">getOpcode() == TargetOpcode::PATCHPOINT ||<br class="">+ MI-><wbr class="">getOpcode() == TargetOpcode::STACKMAP;<br class=""><br class=""> // TargetInstrInfo::<wbr class="">foldMemoryOperand only expects explicit, non-tied<br class=""> // operands.<br class="">@@ -754,7 +757,7 @@ foldMemoryOperand(ArrayRef<<wbr class="">std::pair<Mac<br class=""> ImpReg = MO.getReg();<br class=""> continue;<br class=""> }<br class="">- // FIXME: Teach targets to deal with subregs.<br class="">+<br class=""> if (!SpillSubRegs && MO.getSubReg())<br class=""> return false;<br class=""> // We cannot fold a load instruction into a def.<br class=""><br class="">Modified: llvm/trunk/lib/CodeGen/<wbr class="">TargetInstrInfo.cpp<br class="">URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/TargetInstrInfo.cpp?rev=287792&r1=287791&r2=287792&view=diff" target="_blank" class="">http://llvm.org/viewvc/llvm-<wbr class="">project/llvm/trunk/lib/<wbr class="">CodeGen/TargetInstrInfo.cpp?<wbr class="">rev=287792&r1=287791&r2=<wbr class="">287792&view=diff</a><br class="">==============================<wbr class="">==============================<wbr class="">==================<br class="">--- llvm/trunk/lib/CodeGen/<wbr class="">TargetInstrInfo.cpp (original)<br class="">+++ llvm/trunk/lib/CodeGen/<wbr class="">TargetInstrInfo.cpp Wed Nov 23 12:33:49 2016<br class="">@@ -515,6 +515,31 @@ MachineInstr *TargetInstrInfo::foldMemor<br class=""> assert(MBB && "foldMemoryOperand needs an inserted instruction");<br class=""> MachineFunction &MF = *MBB->getParent();<br class=""><br class="">+ // If we're not folding a load into a subreg, the size of the load is the<br class="">+ // size of the spill slot. But if we are, we need to figure out what the<br class="">+ // actual load size is.<br class="">+ int64_t MemSize = 0;<br class="">+ const MachineFrameInfo &MFI = MF.getFrameInfo();<br class="">+ const TargetRegisterInfo *TRI = MF.getSubtarget().<wbr class="">getRegisterInfo();<br class="">+<br class=""></div></div></blockquote><div class=""><br class=""></div></div></div><div class="">I think there is a missing check here. If the Ops is empty, MemSize will be 0 and the compiler will hit the assertion below.</div><div class="">What about:</div><div class="">+ if ((Flags & MachineMemOperand::MOStore) || Ops.empty()) {<span class=""><br class="">+ MemSize = MFI.getObjectSize(FI);<br class="">+ } else {</span></div><div class=""><br class=""></div><div class="">Can you please look into this?</div><div class=""><div class="h5"><br class=""><blockquote type="cite" class=""><div class=""><div class="">+ if (Flags & MachineMemOperand::MOStore) {<br class="">+ MemSize = MFI.getObjectSize(FI);<br class="">+ } else {<br class="">+ for (unsigned Idx : Ops) {<br class="">+ int64_t OpSize = MFI.getObjectSize(FI);<br class="">+<br class="">+ if (auto SubReg = MI.getOperand(Idx).getSubReg()<wbr class="">) {<br class="">+ unsigned SubRegSize = TRI->getSubRegIdxSize(SubReg);<br class="">+ if (SubRegSize > 0 && !(SubRegSize % 8))<br class="">+ OpSize = SubRegSize / 8;<br class="">+ }<br class="">+<br class="">+ MemSize = std::max(MemSize, OpSize);<br class="">+ }<br class="">+ }<br class="">+<br class="">+ assert(MemSize && "Did not expect a zero-sized stack slot");<br class="">+<br class=""> MachineInstr *NewMI = nullptr;<br class=""><br class=""> if (MI.getOpcode() == TargetOpcode::STACKMAP ||<br class="">@@ -538,10 +563,9 @@ MachineInstr *TargetInstrInfo::foldMemor<br class=""> assert((!(Flags & MachineMemOperand::MOLoad) ||<br class=""> NewMI->mayLoad()) &&<br class=""> "Folded a use to a non-load!");<br class="">- const MachineFrameInfo &MFI = MF.getFrameInfo();<br class=""> assert(MFI.<wbr class="">getObjectOffset(FI) != -1);<br class=""> MachineMemOperand *MMO = MF.getMachineMemOperand(<br class="">- MachinePointerInfo::<wbr class="">getFixedStack(MF, FI), Flags, MFI.getObjectSize(FI),<br class="">+ MachinePointerInfo::<wbr class="">getFixedStack(MF, FI), Flags, MemSize,<br class=""> MFI.<wbr class="">getObjectAlignment(FI));<br class=""> NewMI->addMemOperand(MF, MMO);<br class=""><br class="">@@ -558,7 +582,6 @@ MachineInstr *TargetInstrInfo::foldMemor<br class=""><br class=""> const MachineOperand &MO = MI.getOperand(1 - Ops[0]);<br class=""> MachineBasicBlock::iterator Pos = MI;<br class="">- const TargetRegisterInfo *TRI = MF.getSubtarget().<wbr class="">getRegisterInfo();<br class=""><br class=""> if (Flags == MachineMemOperand::MOStore)<br class=""> storeRegToStackSlot(*MBB, Pos, MO.getReg(), MO.isKill(), FI, RC, TRI);<br class=""><br class="">Modified: llvm/trunk/lib/Target/X86/<wbr class="">X86InstrInfo.cpp<br class="">URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.cpp?rev=287792&r1=287791&r2=287792&view=diff" target="_blank" class="">http://llvm.org/viewvc/llvm-<wbr class="">project/llvm/trunk/lib/Target/<wbr class="">X86/X86InstrInfo.cpp?rev=<wbr class="">287792&r1=287791&r2=287792&<wbr class="">view=diff</a><br class="">==============================<wbr class="">==============================<wbr class="">==================<br class="">--- llvm/trunk/lib/Target/X86/<wbr class="">X86InstrInfo.cpp (original)<br class="">+++ llvm/trunk/lib/Target/X86/<wbr class="">X86InstrInfo.cpp Wed Nov 23 12:33:49 2016<br class="">@@ -6843,6 +6843,14 @@ X86InstrInfo::<wbr class="">foldMemoryOperandImpl(Mach<br class=""> if (!MF.getFunction()-><wbr class="">optForSize() && hasPartialRegUpdate(MI.<wbr class="">getOpcode()))<br class=""> return nullptr;<br class=""><br class="">+ // Don't fold subreg spills, or reloads that use a high subreg.<br class="">+ for (auto Op : Ops) {<br class="">+ MachineOperand &MO = MI.getOperand(Op);<br class="">+ auto SubReg = MO.getSubReg();<br class="">+ if (SubReg && (MO.isDef() || SubReg == X86::sub_8bit_hi))<br class="">+ return nullptr;<br class="">+ }<br class="">+<br class=""> const MachineFrameInfo &MFI = MF.getFrameInfo();<br class=""> unsigned Size = MFI.getObjectSize(FrameIndex);<br class=""> unsigned Alignment = MFI.getObjectAlignment(<wbr class="">FrameIndex);<br class="">@@ -6967,6 +6975,14 @@ MachineInstr *X86InstrInfo::foldMemoryOp<br class=""> MachineFunction &MF, MachineInstr &MI, ArrayRef<unsigned> Ops,<br class=""> MachineBasicBlock::<wbr class="">iterator InsertPt, MachineInstr &LoadMI,<br class=""> LiveIntervals *LIS) const {<br class="">+<br class="">+ // TODO: Support the case where LoadMI loads a wide register, but MI<br class="">+ // only uses a subreg.<br class="">+ for (auto Op : Ops) {<br class="">+ if (MI.getOperand(Op).getSubReg()<wbr class="">)<br class="">+ return nullptr;<br class="">+ }<br class="">+<br class=""> // If loading from a FrameIndex, fold directly from the FrameIndex.<br class=""> unsigned NumOps = LoadMI.getDesc().<wbr class="">getNumOperands();<br class=""> int FrameIndex;<br class=""><br class="">Modified: llvm/trunk/lib/Target/X86/<wbr class="">X86InstrInfo.h<br class="">URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.h?rev=287792&r1=287791&r2=287792&view=diff" target="_blank" class="">http://llvm.org/viewvc/llvm-<wbr class="">project/llvm/trunk/lib/Target/<wbr class="">X86/X86InstrInfo.h?rev=287792&<wbr class="">r1=287791&r2=287792&view=diff</a><br class="">==============================<wbr class="">==============================<wbr class="">==================<br class="">--- llvm/trunk/lib/Target/X86/<wbr class="">X86InstrInfo.h (original)<br class="">+++ llvm/trunk/lib/Target/X86/<wbr class="">X86InstrInfo.h Wed Nov 23 12:33:49 2016<br class="">@@ -378,6 +378,10 @@ public:<br class=""><br class=""> bool expandPostRAPseudo(<wbr class="">MachineInstr &MI) const override;<br class=""><br class="">+ /// Check whether the target can fold a load that feeds a subreg operand<br class="">+ /// (or a subreg operand that feeds a store).<br class="">+ bool isSubregFoldable() const override { return true; }<br class="">+<br class=""> /// foldMemoryOperand - If this target supports it, fold a load or store of<br class=""> /// the specified stack slot into the specified machine instruction for the<br class=""> /// specified operand(s). If this is possible, the target should perform the<br class=""><br class="">Modified: llvm/trunk/test/CodeGen/X86/<wbr class="">partial-fold32.ll<br class="">URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/partial-fold32.ll?rev=287792&r1=287791&r2=287792&view=diff" target="_blank" class="">http://llvm.org/viewvc/llvm-<wbr class="">project/llvm/trunk/test/<wbr class="">CodeGen/X86/partial-fold32.ll?<wbr class="">rev=287792&r1=287791&r2=<wbr class="">287792&view=diff</a><br class="">==============================<wbr class="">==============================<wbr class="">==================<br class="">--- llvm/trunk/test/CodeGen/X86/<wbr class="">partial-fold32.ll (original)<br class="">+++ llvm/trunk/test/CodeGen/X86/<wbr class="">partial-fold32.ll Wed Nov 23 12:33:49 2016<br class="">@@ -3,8 +3,7 @@<br class=""> define fastcc i8 @fold32to8(i32 %add, i8 %spill) {<br class=""> ; CHECK-LABEL: fold32to8:<br class=""> ; CHECK: movl %ecx, (%esp) # 4-byte Spill<br class="">-; CHECK: movl (%esp), %eax # 4-byte Reload<br class="">-; CHECK: subb %al, %dl<br class="">+; CHECK: subb (%esp), %dl # 1-byte Folded Reload<br class=""> entry:<br class=""> tail call void asm sideeffect "", "~{eax},~{ebx},~{ecx},~{edi},~<wbr class="">{esi},~{ebp},~{dirflag},~{<wbr class="">fpsr},~{flags}"()<br class=""> %trunc = trunc i32 %add to i8<br class=""><br class="">Modified: llvm/trunk/test/CodeGen/X86/<wbr class="">partial-fold64.ll<br class="">URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/partial-fold64.ll?rev=287792&r1=287791&r2=287792&view=diff" target="_blank" class="">http://llvm.org/viewvc/llvm-<wbr class="">project/llvm/trunk/test/<wbr class="">CodeGen/X86/partial-fold64.ll?<wbr class="">rev=287792&r1=287791&r2=<wbr class="">287792&view=diff</a><br class="">==============================<wbr class="">==============================<wbr class="">==================<br class="">--- llvm/trunk/test/CodeGen/X86/<wbr class="">partial-fold64.ll (original)<br class="">+++ llvm/trunk/test/CodeGen/X86/<wbr class="">partial-fold64.ll Wed Nov 23 12:33:49 2016<br class="">@@ -3,8 +3,7 @@<br class=""> define i32 @fold64to32(i64 %add, i32 %spill) {<br class=""> ; CHECK-LABEL: fold64to32:<br class=""> ; CHECK: movq %rdi, -{{[0-9]+}}(%rsp) # 8-byte Spill<br class="">-; CHECK: movq -{{[0-9]+}}(%rsp), %rax # 8-byte Reload<br class="">-; CHECK: subl %eax, %esi<br class="">+; CHECK: subl -{{[0-9]+}}(%rsp), %esi # 4-byte Folded Reload<br class=""> entry:<br class=""> tail call void asm sideeffect "", "~{rax},~{rbx},~{rcx},~{rdx},~<wbr class="">{rdi},~{rbp},~{r8},~{r9},~{<wbr class="">r10},~{r11},~{r12},~{r13},~{<wbr class="">r14},~{r15},~{dirflag},~{fpsr}<wbr class="">,~{flags}"()<br class=""> %trunc = trunc i64 %add to i32<br class="">@@ -15,8 +14,7 @@ entry:<br class=""> define i8 @fold64to8(i64 %add, i8 %spill) {<br class=""> ; CHECK-LABEL: fold64to8:<br class=""> ; CHECK: movq %rdi, -{{[0-9]+}}(%rsp) # 8-byte Spill<br class="">-; CHECK: movq -{{[0-9]+}}(%rsp), %rax # 8-byte Reload<br class="">-; CHECK: subb %al, %sil<br class="">+; CHECK: subb -{{[0-9]+}}(%rsp), %sil # 1-byte Folded Reload<br class=""> entry:<br class=""> tail call void asm sideeffect "", "~{rax},~{rbx},~{rcx},~{rdx},~<wbr class="">{rdi},~{rbp},~{r8},~{r9},~{<wbr class="">r10},~{r11},~{r12},~{r13},~{<wbr class="">r14},~{r15},~{dirflag},~{fpsr}<wbr class="">,~{flags}"()<br class=""> %trunc = trunc i64 %add to i8<br class=""><br class="">Modified: llvm/trunk/test/CodeGen/X86/<wbr class="">vector-half-conversions.ll<br class="">URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-half-conversions.ll?rev=287792&r1=287791&r2=287792&view=diff" target="_blank" class="">http://llvm.org/viewvc/llvm-<wbr class="">project/llvm/trunk/test/<wbr class="">CodeGen/X86/vector-half-<wbr class="">conversions.ll?rev=287792&r1=<wbr class="">287791&r2=287792&view=diff</a><br class="">==============================<wbr class="">==============================<wbr class="">==================<br class="">--- llvm/trunk/test/CodeGen/X86/<wbr class="">vector-half-conversions.ll (original)<br class="">+++ llvm/trunk/test/CodeGen/X86/<wbr class="">vector-half-conversions.ll Wed Nov 23 12:33:49 2016<br class="">@@ -4788,9 +4788,8 @@ define <8 x i16> @cvt_8f64_to_8i16(<8 x<br class=""> ; AVX1-NEXT: orl %ebx, %r14d<br class=""> ; AVX1-NEXT: shlq $32, %r14<br class=""> ; AVX1-NEXT: orq %r15, %r14<br class="">-; AVX1-NEXT: vmovupd (%rsp), %ymm0 # 32-byte Reload<br class="">-; AVX1-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]<br class="">-; AVX1-NEXT: vzeroupper<br class="">+; AVX1-NEXT: vpermilpd $1, (%rsp), %xmm0 # 16-byte Folded Reload<br class="">+; AVX1-NEXT: # xmm0 = mem[1,0]<br class=""> ; AVX1-NEXT: callq __truncdfhf2<br class=""> ; AVX1-NEXT: movw %ax, %bx<br class=""> ; AVX1-NEXT: shll $16, %ebx<br class="">@@ -4856,9 +4855,8 @@ define <8 x i16> @cvt_8f64_to_8i16(<8 x<br class=""> ; AVX2-NEXT: orl %ebx, %r14d<br class=""> ; AVX2-NEXT: shlq $32, %r14<br class=""> ; AVX2-NEXT: orq %r15, %r14<br class="">-; AVX2-NEXT: vmovupd (%rsp), %ymm0 # 32-byte Reload<br class="">-; AVX2-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]<br class="">-; AVX2-NEXT: vzeroupper<br class="">+; AVX2-NEXT: vpermilpd $1, (%rsp), %xmm0 # 16-byte Folded Reload<br class="">+; AVX2-NEXT: # xmm0 = mem[1,0]<br class=""> ; AVX2-NEXT: callq __truncdfhf2<br class=""> ; AVX2-NEXT: movw %ax, %bx<br class=""> ; AVX2-NEXT: shll $16, %ebx<br class="">@@ -5585,9 +5583,8 @@ define void @store_cvt_8f64_to_8i16(<8 x<br class=""> ; AVX1-NEXT: vzeroupper<br class=""> ; AVX1-NEXT: callq __truncdfhf2<br class=""> ; AVX1-NEXT: movw %ax, {{[0-9]+}}(%rsp) # 2-byte Spill<br class="">-; AVX1-NEXT: vmovupd {{[0-9]+}}(%rsp), %ymm0 # 32-byte Reload<br class="">-; AVX1-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]<br class="">-; AVX1-NEXT: vzeroupper<br class="">+; AVX1-NEXT: vpermilpd $1, {{[0-9]+}}(%rsp), %xmm0 # 16-byte Folded Reload<br class="">+; AVX1-NEXT: # xmm0 = mem[1,0]<br class=""> ; AVX1-NEXT: callq __truncdfhf2<br class=""> ; AVX1-NEXT: movl %eax, %r12d<br class=""> ; AVX1-NEXT: vmovupd {{[0-9]+}}(%rsp), %ymm0 # 32-byte Reload<br class="">@@ -5654,9 +5651,8 @@ define void @store_cvt_8f64_to_8i16(<8 x<br class=""> ; AVX2-NEXT: vzeroupper<br class=""> ; AVX2-NEXT: callq __truncdfhf2<br class=""> ; AVX2-NEXT: movw %ax, {{[0-9]+}}(%rsp) # 2-byte Spill<br class="">-; AVX2-NEXT: vmovupd {{[0-9]+}}(%rsp), %ymm0 # 32-byte Reload<br class="">-; AVX2-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]<br class="">-; AVX2-NEXT: vzeroupper<br class="">+; AVX2-NEXT: vpermilpd $1, {{[0-9]+}}(%rsp), %xmm0 # 16-byte Folded Reload<br class="">+; AVX2-NEXT: # xmm0 = mem[1,0]<br class=""> ; AVX2-NEXT: callq __truncdfhf2<br class=""> ; AVX2-NEXT: movl %eax, %r12d<br class=""> ; AVX2-NEXT: vmovupd {{[0-9]+}}(%rsp), %ymm0 # 32-byte Reload<br class=""><br class=""><br class="">______________________________<wbr class="">_________________<br class="">llvm-commits mailing list<br class=""><a href="mailto:llvm-commits@lists.llvm.org" target="_blank" class="">llvm-commits@lists.llvm.org</a><br class=""><a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits" target="_blank" class="">http://lists.llvm.org/cgi-bin/<wbr class="">mailman/listinfo/llvm-commits</a><br class=""></div></div></blockquote><br class=""></div></div></div><div class="">Thank you,</div><div class="">Volkan</div><br class=""></div></div></blockquote></div><br class=""></div>
</div></blockquote></div><br class=""></body></html>