<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix">On 11/25/2016 06:55 AM, Michael
      Kuperstein wrote:<br>
    </div>
    <blockquote
cite="mid:CAL_y90mux5wikbA4xHwN1Cb8-e+NnTCdbPvFZrhBWF1-wY8aSA@mail.gmail.com"
      type="cite">
      <div dir="ltr">So, I had this exact discussion with Matthias on
        the review thread, and on IRC.</div>
    </blockquote>
    Ah, glad to know it got discussed.  That was my primary concern.  I'll
    share my two cents below, but that's just for the record, not because I'm
    asking for changes.<br>
    <blockquote
cite="mid:CAL_y90mux5wikbA4xHwN1Cb8-e+NnTCdbPvFZrhBWF1-wY8aSA@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div><br>
        </div>
        <div>The problem isn't in-tree targets, it's out-of-tree
          targets. Now, generally speaking, breaking out-of-tree targets
          is fine, but in this case, I think it's a particularly nasty
          kind of break - it's a change that silently relaxes an API
          invariant. And the way the breakage would manifest is by
          creating nasty-to-debug miscompiles.</div>
        <div>I'd really rather not be *that* hostile to downstream.</div>
      </div>
    </blockquote>
    Honestly, this really feels like the wrong tradeoff to me.  We
    shouldn't be taking on code complexity upstream to prevent possible
    problems in downstream out-of-tree backends.  We should give notice
    of potentially breaking changes (llvm-dev email, release notes,
    etc.), but the maintenance responsibility for out-of-tree code
    should lie with the out-of-tree users.  Beyond the obvious goal of
    avoiding confusing complexity upstream, this is one of our main
    incentive mechanisms for out-of-tree users to follow ToT and
    eventually become upstream contributors.<br>
    <br>
    One possible middle ground would be to offer the callback (with the
    safe default) for a limited migration period.  Explicitly document
    the callback as being present for only one release, update the
    release notes to clearly state the required migration, and remove
    the callback the day after the next release has landed.  That would
    give a softer migration path without accumulating technical
    complexity long term.<br>
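    <br>
    Concretely, the transitional form of the hook might look something like
    this.  It is just a sketch of the documentation I have in mind, not an
    actual patch; the hook name and default are the ones r287792 adds, and
    the removal wording is mine:<br>
    <pre>
/// Check whether the target can fold a load that feeds a subreg operand
/// (or a subreg operand that feeds a store).
///
/// NOTE: this is a transitional hook.  The conservative default ("return
/// false") exists only to give out-of-tree targets one release cycle to
/// audit their foldMemoryOperandImpl() implementations for subreg
/// operands.  It is scheduled for removal right after the next release
/// branches, at which point subreg operands will be passed to the targets
/// unconditionally.
virtual bool isSubregFoldable() const { return false; }
    </pre>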
    <blockquote
cite="mid:CAL_y90mux5wikbA4xHwN1Cb8-e+NnTCdbPvFZrhBWF1-wY8aSA@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div><br>
        </div>
        <div>We could make the break less silent by changing the
          foldMemoryOperand API in a way that breaks at compile time, but
          it's really not clear it's worth it.</div>
      </div>
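      <div dir="ltr">
        <div><br>
        </div>
        <div>For illustration, the kind of compile-time break in question is
          just a signature change that stops stale out-of-tree overrides from
          compiling.  A minimal, self-contained sketch (made-up names, not the
          real TargetInstrInfo interface):</div>
        <pre>
// Illustration only: these names are made up; this is not the real
// TargetInstrInfo interface.
#include <cstdio>

struct GenericTII {
  // New shape of the hook: the subreg index is an explicit parameter, so
  // the relaxed invariant is visible in the signature itself.
  virtual bool foldMemoryOperandImpl(unsigned OpIdx, unsigned SubRegIdx) const {
    (void)OpIdx;
    (void)SubRegIdx;
    return false; // conservative default: do not fold
  }
  virtual ~GenericTII() = default;
};

struct OutOfTreeTII : GenericTII {
  // An out-of-tree target still written against the old one-argument
  // signature.  Spelled with "override", this would now be a hard compile
  // error instead of a silently relaxed invariant.
  bool foldMemoryOperandImpl(unsigned OpIdx) const { return OpIdx == 0; }
};

int main() {
  OutOfTreeTII TII;
  const GenericTII &Base = TII;
  // The stale method no longer overrides anything, so this call hits the
  // conservative base default.  That is the mismatch a compile-time
  // signature break (or "override") surfaces immediately.
  std::printf("folded: %d\n", Base.foldMemoryOperandImpl(0, 0));
  return 0;
}
        </pre>
      </div>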
      <div class="gmail_extra"><br>
        <div class="gmail_quote">On Thu, Nov 24, 2016 at 7:50 PM, Philip
          Reames <span dir="ltr"><<a moz-do-not-send="true"
              href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>></span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex"><span
              class="">On 11/23/2016 10:33 AM, Michael Kuperstein via
              llvm-commits wrote:<br>
              <blockquote class="gmail_quote" style="margin:0 0 0
                .8ex;border-left:1px #ccc solid;padding-left:1ex">
                Author: mkuper<br>
                Date: Wed Nov 23 12:33:49 2016<br>
                New Revision: 287792<br>
                <br>
                URL: <a moz-do-not-send="true"
                  href="http://llvm.org/viewvc/llvm-project?rev=287792&view=rev"
                  rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-pr<wbr>oject?rev=287792&view=rev</a><br>
                Log:<br>
                [X86] Allow folding of stack reloads when loading a
                subreg of the spilled reg<br>
                <br>
                We did not support subregs in
                InlineSpiller::foldMemoryOperand() because targets<br>
                may not deal with them correctly.<br>
                <br>
                This adds a target hook to let the spiller know that a
                target can handle<br>
                subregs, and actually enables it for x86 for the case of
                stack slot reloads.<br>
                This fixes PR30832.<br>
              </blockquote>
            </span>
            This feels like a weird design.  If I remember correctly,
            foldMemoryOperand is allowed to do nothing if it doesn't know
            how to fold.  Given that, why not just update the in-tree
            targets to check for a subreg load and bail out?  Why do we
            need yet another target hook?
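            <br>
            <br>
            Something along these lines at the top of each in-tree
            foldMemoryOperandImpl() is what I have in mind.  Just a sketch:
            "Foo" is a made-up target and the parameter list is from memory,
            so treat the exact signature as illustrative.
            <pre>
#include "llvm/Target/TargetInstrInfo.h"
using namespace llvm;

MachineInstr *FooInstrInfo::foldMemoryOperandImpl(
    MachineFunction &MF, MachineInstr &MI, ArrayRef<unsigned> Ops,
    MachineBasicBlock::iterator InsertPt, int FrameIndex,
    LiveIntervals *LIS) const {
  // Decline to fold if any operand being folded carries a subreg index;
  // the generic code already treats a nullptr return as "don't fold".
  for (unsigned OpIdx : Ops)
    if (MI.getOperand(OpIdx).getSubReg())
      return nullptr;

  // ... the target's existing folding logic would follow here ...
  return nullptr;
}
            </pre>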
            <div class="HOEnZb">
              <div class="h5"><br>
                <br>
                <blockquote class="gmail_quote" style="margin:0 0 0
                  .8ex;border-left:1px #ccc solid;padding-left:1ex">
                  <br>
                  Differential Revision: <a moz-do-not-send="true"
                    href="https://reviews.llvm.org/D26521"
                    rel="noreferrer" target="_blank">https://reviews.llvm.org/D2652<wbr>1</a><br>
                  <br>
                  Modified:<br>
                       llvm/trunk/include/llvm/Targe<wbr>t/TargetInstrInfo.h<br>
                       llvm/trunk/lib/CodeGen/Inline<wbr>Spiller.cpp<br>
                       llvm/trunk/lib/CodeGen/Target<wbr>InstrInfo.cpp<br>
                       llvm/trunk/lib/Target/X86/X86<wbr>InstrInfo.cpp<br>
                       llvm/trunk/lib/Target/X86/X86<wbr>InstrInfo.h<br>
                       llvm/trunk/test/CodeGen/X86/p<wbr>artial-fold32.ll<br>
                       llvm/trunk/test/CodeGen/X86/p<wbr>artial-fold64.ll<br>
                       llvm/trunk/test/CodeGen/X86/v<wbr>ector-half-conversions.ll<br>
                  <br>
                  Modified: llvm/trunk/include/llvm/Target<wbr>/TargetInstrInfo.h<br>
                  URL: <a moz-do-not-send="true"
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetInstrInfo.h?rev=287792&r1=287791&r2=287792&view=diff"
                    rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-pr<wbr>oject/llvm/trunk/include/llvm/<wbr>Target/TargetInstrInfo.h?rev=<wbr>287792&r1=287791&r2=287792&<wbr>view=diff</a><br>
                  ==============================<wbr>==============================<wbr>==================<br>
                  --- llvm/trunk/include/llvm/Target<wbr>/TargetInstrInfo.h
                  (original)<br>
                  +++ llvm/trunk/include/llvm/Target<wbr>/TargetInstrInfo.h
                  Wed Nov 23 12:33:49 2016<br>
                  @@ -817,6 +817,20 @@ public:<br>
                      /// anything was changed.<br>
                      virtual bool expandPostRAPseudo(MachineInst<wbr>r
                  &MI) const { return false; }<br>
                    +  /// Check whether the target can fold a load that
                  feeds a subreg operand<br>
                  +  /// (or a subreg operand that feeds a store).<br>
                  +  /// For example, X86 may want to return true if it
                  can fold<br>
                  +  /// movl (%esp), %eax<br>
                  +  /// subb, %al, ...<br>
                  +  /// Into:<br>
                  +  /// subb (%esp), ...<br>
                  +  ///<br>
                  +  /// Ideally, we'd like the target implementation of
                  foldMemoryOperand() to<br>
                  +  /// reject subregs - but since this behavior used
                  to be enforced in the<br>
                  +  /// target-independent code, moving this
                  responsibility to the targets<br>
                  +  /// has the potential of causing nasty silent
                  breakage in out-of-tree targets.<br>
                  +  virtual bool isSubregFoldable() const { return
                  false; }<br>
                  +<br>
                      /// Attempt to fold a load or store of the
                  specified stack<br>
                      /// slot into the specified machine instruction
                  for the specified operand(s).<br>
                      /// If this is possible, a new instruction is
                  returned with the specified<br>
                  <br>
                  Modified: llvm/trunk/lib/CodeGen/InlineS<wbr>piller.cpp<br>
                  URL: <a moz-do-not-send="true"
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/InlineSpiller.cpp?rev=287792&r1=287791&r2=287792&view=diff"
                    rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-pr<wbr>oject/llvm/trunk/lib/CodeGen/<wbr>InlineSpiller.cpp?rev=287792&<wbr>r1=287791&r2=287792&view=diff</a><br>
                  ==============================<wbr>==============================<wbr>==================<br>
                  --- llvm/trunk/lib/CodeGen/InlineS<wbr>piller.cpp
                  (original)<br>
                  +++ llvm/trunk/lib/CodeGen/InlineS<wbr>piller.cpp Wed
                  Nov 23 12:33:49 2016<br>
                  @@ -739,9 +739,12 @@ foldMemoryOperand(ArrayRef<std<wbr>::pair<Mac<br>
                      bool WasCopy = MI->isCopy();<br>
                      unsigned ImpReg = 0;<br>
                    -  bool SpillSubRegs = (MI->getOpcode() ==
                  TargetOpcode::STATEPOINT ||<br>
                  -                       MI->getOpcode() ==
                  TargetOpcode::PATCHPOINT ||<br>
                  -                       MI->getOpcode() ==
                  TargetOpcode::STACKMAP);<br>
                  +  // Spill subregs if the target allows it.<br>
                  +  // We always want to spill subregs for
                  stackmap/patchpoint pseudos.<br>
                  +  bool SpillSubRegs = TII.isSubregFoldable() ||<br>
                  +                      MI->getOpcode() ==
                  TargetOpcode::STATEPOINT ||<br>
                  +                      MI->getOpcode() ==
                  TargetOpcode::PATCHPOINT ||<br>
                  +                      MI->getOpcode() ==
                  TargetOpcode::STACKMAP;<br>
                        // TargetInstrInfo::foldMemoryOpe<wbr>rand only
                  expects explicit, non-tied<br>
                      // operands.<br>
                  @@ -754,7 +757,7 @@ foldMemoryOperand(ArrayRef<std<wbr>::pair<Mac<br>
                          ImpReg = MO.getReg();<br>
                          continue;<br>
                        }<br>
                  -    // FIXME: Teach targets to deal with subregs.<br>
                  +<br>
                        if (!SpillSubRegs && MO.getSubReg())<br>
                          return false;<br>
                        // We cannot fold a load instruction into a def.<br>
                  <br>
                  Modified: llvm/trunk/lib/CodeGen/TargetI<wbr>nstrInfo.cpp<br>
                  URL: <a moz-do-not-send="true"
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/TargetInstrInfo.cpp?rev=287792&r1=287791&r2=287792&view=diff"
                    rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-pr<wbr>oject/llvm/trunk/lib/CodeGen/<wbr>TargetInstrInfo.cpp?rev=<wbr>287792&r1=287791&r2=287792&<wbr>view=diff</a><br>
                  ==============================<wbr>==============================<wbr>==================<br>
                  --- llvm/trunk/lib/CodeGen/TargetI<wbr>nstrInfo.cpp
                  (original)<br>
                  +++ llvm/trunk/lib/CodeGen/TargetI<wbr>nstrInfo.cpp
                  Wed Nov 23 12:33:49 2016<br>
                  @@ -515,6 +515,31 @@ MachineInstr
                  *TargetInstrInfo::foldMemor<br>
                      assert(MBB && "foldMemoryOperand needs an
                  inserted instruction");<br>
                      MachineFunction &MF = *MBB->getParent();<br>
                    +  // If we're not folding a load into a subreg, the
                  size of the load is the<br>
                  +  // size of the spill slot. But if we are, we need
                  to figure out what the<br>
                  +  // actual load size is.<br>
                  +  int64_t MemSize = 0;<br>
                  +  const MachineFrameInfo &MFI =
                  MF.getFrameInfo();<br>
                  +  const TargetRegisterInfo *TRI =
                  MF.getSubtarget().getRegisterI<wbr>nfo();<br>
                  +<br>
                  +  if (Flags & MachineMemOperand::MOStore) {<br>
                  +    MemSize = MFI.getObjectSize(FI);<br>
                  +  } else {<br>
                  +    for (unsigned Idx : Ops) {<br>
                  +      int64_t OpSize = MFI.getObjectSize(FI);<br>
                  +<br>
                  +      if (auto SubReg =
                  MI.getOperand(Idx).getSubReg()<wbr>) {<br>
                  +        unsigned SubRegSize =
                  TRI->getSubRegIdxSize(SubReg);<br>
                  +        if (SubRegSize > 0 && !(SubRegSize
                  % 8))<br>
                  +          OpSize = SubRegSize / 8;<br>
                  +      }<br>
                  +<br>
                  +      MemSize = std::max(MemSize, OpSize);<br>
                  +    }<br>
                  +  }<br>
                  +<br>
                  +  assert(MemSize && "Did not expect a
                  zero-sized stack slot");<br>
                  +<br>
                      MachineInstr *NewMI = nullptr;<br>
                        if (MI.getOpcode() == TargetOpcode::STACKMAP ||<br>
                  @@ -538,10 +563,9 @@ MachineInstr
                  *TargetInstrInfo::foldMemor<br>
                        assert((!(Flags & MachineMemOperand::MOLoad)
                  ||<br>
                                NewMI->mayLoad()) &&<br>
                               "Folded a use to a non-load!");<br>
                  -    const MachineFrameInfo &MFI =
                  MF.getFrameInfo();<br>
                        assert(MFI.getObjectOffset(FI) != -1);<br>
                        MachineMemOperand *MMO =
                  MF.getMachineMemOperand(<br>
                  -        MachinePointerInfo::getFixedSt<wbr>ack(MF,
                  FI), Flags, MFI.getObjectSize(FI),<br>
                  +        MachinePointerInfo::getFixedSt<wbr>ack(MF,
                  FI), Flags, MemSize,<br>
                            MFI.getObjectAlignment(FI));<br>
                        NewMI->addMemOperand(MF, MMO);<br>
                    @@ -558,7 +582,6 @@ MachineInstr
                  *TargetInstrInfo::foldMemor<br>
                        const MachineOperand &MO = MI.getOperand(1 -
                  Ops[0]);<br>
                      MachineBasicBlock::iterator Pos = MI;<br>
                  -  const TargetRegisterInfo *TRI =
                  MF.getSubtarget().getRegisterI<wbr>nfo();<br>
                        if (Flags == MachineMemOperand::MOStore)<br>
                        storeRegToStackSlot(*MBB, Pos, MO.getReg(),
                  MO.isKill(), FI, RC, TRI);<br>
                  <br>
                  Modified: llvm/trunk/lib/Target/X86/X86I<wbr>nstrInfo.cpp<br>
                  URL: <a moz-do-not-send="true"
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.cpp?rev=287792&r1=287791&r2=287792&view=diff"
                    rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-pr<wbr>oject/llvm/trunk/lib/Target/X8<wbr>6/X86InstrInfo.cpp?rev=287792&<wbr>r1=287791&r2=287792&view=diff</a><br>
                  ==============================<wbr>==============================<wbr>==================<br>
                  --- llvm/trunk/lib/Target/X86/X86I<wbr>nstrInfo.cpp
                  (original)<br>
                  +++ llvm/trunk/lib/Target/X86/X86I<wbr>nstrInfo.cpp
                  Wed Nov 23 12:33:49 2016<br>
                  @@ -6843,6 +6843,14 @@ X86InstrInfo::foldMemoryOperan<wbr>dImpl(Mach<br>
                      if (!MF.getFunction()->optForSize<wbr>()
                  && hasPartialRegUpdate(MI.getOpco<wbr>de()))<br>
                        return nullptr;<br>
                    +  // Don't fold subreg spills, or reloads that use
                  a high subreg.<br>
                  +  for (auto Op : Ops) {<br>
                  +    MachineOperand &MO = MI.getOperand(Op);<br>
                  +    auto SubReg = MO.getSubReg();<br>
                  +    if (SubReg && (MO.isDef() || SubReg ==
                  X86::sub_8bit_hi))<br>
                  +      return nullptr;<br>
                  +  }<br>
                  +<br>
                      const MachineFrameInfo &MFI =
                  MF.getFrameInfo();<br>
                      unsigned Size = MFI.getObjectSize(FrameIndex);<br>
                      unsigned Alignment =
                  MFI.getObjectAlignment(FrameIn<wbr>dex);<br>
                  @@ -6967,6 +6975,14 @@ MachineInstr
                  *X86InstrInfo::foldMemoryOp<br>
                        MachineFunction &MF, MachineInstr &MI,
                  ArrayRef<unsigned> Ops,<br>
                        MachineBasicBlock::iterator InsertPt,
                  MachineInstr &LoadMI,<br>
                        LiveIntervals *LIS) const {<br>
                  +<br>
                  +  // TODO: Support the case where LoadMI loads a wide
                  register, but MI<br>
                  +  // only uses a subreg.<br>
                  +  for (auto Op : Ops) {<br>
                  +    if (MI.getOperand(Op).getSubReg()<wbr>)<br>
                  +      return nullptr;<br>
                  +  }<br>
                  +<br>
                      // If loading from a FrameIndex, fold directly
                  from the FrameIndex.<br>
                      unsigned NumOps = LoadMI.getDesc().getNumOperand<wbr>s();<br>
                      int FrameIndex;<br>
                  <br>
                  Modified: llvm/trunk/lib/Target/X86/X86I<wbr>nstrInfo.h<br>
                  URL: <a moz-do-not-send="true"
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.h?rev=287792&r1=287791&r2=287792&view=diff"
                    rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-pr<wbr>oject/llvm/trunk/lib/Target/X8<wbr>6/X86InstrInfo.h?rev=287792&r1<wbr>=287791&r2=287792&view=diff</a><br>
                  ==============================<wbr>==============================<wbr>==================<br>
                  --- llvm/trunk/lib/Target/X86/X86I<wbr>nstrInfo.h
                  (original)<br>
                  +++ llvm/trunk/lib/Target/X86/X86I<wbr>nstrInfo.h Wed
                  Nov 23 12:33:49 2016<br>
                  @@ -378,6 +378,10 @@ public:<br>
                        bool expandPostRAPseudo(MachineInst<wbr>r
                  &MI) const override;<br>
                    +  /// Check whether the target can fold a load that
                  feeds a subreg operand<br>
                  +  /// (or a subreg operand that feeds a store).<br>
                  +  bool isSubregFoldable() const override { return
                  true; }<br>
                  +<br>
                      /// foldMemoryOperand - If this target supports
                  it, fold a load or store of<br>
                      /// the specified stack slot into the specified
                  machine instruction for the<br>
                      /// specified operand(s).  If this is possible,
                  the target should perform the<br>
                  <br>
                  Modified: llvm/trunk/test/CodeGen/X86/pa<wbr>rtial-fold32.ll<br>
                  URL: <a moz-do-not-send="true"
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/partial-fold32.ll?rev=287792&r1=287791&r2=287792&view=diff"
                    rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-pr<wbr>oject/llvm/trunk/test/CodeGen/<wbr>X86/partial-fold32.ll?rev=<wbr>287792&r1=287791&r2=287792&<wbr>view=diff</a><br>
                  ==============================<wbr>==============================<wbr>==================<br>
                  --- llvm/trunk/test/CodeGen/X86/pa<wbr>rtial-fold32.ll
                  (original)<br>
                  +++ llvm/trunk/test/CodeGen/X86/pa<wbr>rtial-fold32.ll
                  Wed Nov 23 12:33:49 2016<br>
                  @@ -3,8 +3,7 @@<br>
                    define fastcc i8 @fold32to8(i32 %add, i8 %spill) {<br>
                    ; CHECK-LABEL: fold32to8:<br>
                    ; CHECK:    movl %ecx, (%esp) # 4-byte Spill<br>
                  -; CHECK:    movl (%esp), %eax # 4-byte Reload<br>
                  -; CHECK:    subb %al, %dl<br>
                  +; CHECK:    subb (%esp), %dl  # 1-byte Folded Reload<br>
                    entry:<br>
                      tail call void asm sideeffect "",
                  "~{eax},~{ebx},~{ecx},~{edi},~<wbr>{esi},~{ebp},~{dirflag},~{fpsr<wbr>},~{flags}"()<br>
                      %trunc = trunc i32 %add to i8<br>
                  <br>
                  Modified: llvm/trunk/test/CodeGen/X86/pa<wbr>rtial-fold64.ll<br>
                  URL: <a moz-do-not-send="true"
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/partial-fold64.ll?rev=287792&r1=287791&r2=287792&view=diff"
                    rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-pr<wbr>oject/llvm/trunk/test/CodeGen/<wbr>X86/partial-fold64.ll?rev=<wbr>287792&r1=287791&r2=287792&<wbr>view=diff</a><br>
                  ==============================<wbr>==============================<wbr>==================<br>
                  --- llvm/trunk/test/CodeGen/X86/pa<wbr>rtial-fold64.ll
                  (original)<br>
                  +++ llvm/trunk/test/CodeGen/X86/pa<wbr>rtial-fold64.ll
                  Wed Nov 23 12:33:49 2016<br>
                  @@ -3,8 +3,7 @@<br>
                    define i32 @fold64to32(i64 %add, i32 %spill) {<br>
                    ; CHECK-LABEL: fold64to32:<br>
                    ; CHECK:    movq %rdi, -{{[0-9]+}}(%rsp) # 8-byte
                  Spill<br>
                  -; CHECK:    movq -{{[0-9]+}}(%rsp), %rax # 8-byte
                  Reload<br>
                  -; CHECK:    subl %eax, %esi<br>
                  +; CHECK:    subl -{{[0-9]+}}(%rsp), %esi # 4-byte
                  Folded Reload<br>
                    entry:<br>
                      tail call void asm sideeffect "",
                  "~{rax},~{rbx},~{rcx},~{rdx},~<wbr>{rdi},~{rbp},~{r8},~{r9},~{r10<wbr>},~{r11},~{r12},~{r13},~{r14},<wbr>~{r15},~{dirflag},~{fpsr},~{<wbr>flags}"()<br>
                      %trunc = trunc i64 %add to i32<br>
                  @@ -15,8 +14,7 @@ entry:<br>
                    define i8 @fold64to8(i64 %add, i8 %spill) {<br>
                    ; CHECK-LABEL: fold64to8:<br>
                    ; CHECK:    movq %rdi, -{{[0-9]+}}(%rsp) # 8-byte
                  Spill<br>
                  -; CHECK:    movq -{{[0-9]+}}(%rsp), %rax # 8-byte
                  Reload<br>
                  -; CHECK:    subb %al, %sil<br>
                  +; CHECK:    subb -{{[0-9]+}}(%rsp), %sil # 1-byte
                  Folded Reload<br>
                    entry:<br>
                      tail call void asm sideeffect "",
                  "~{rax},~{rbx},~{rcx},~{rdx},~<wbr>{rdi},~{rbp},~{r8},~{r9},~{r10<wbr>},~{r11},~{r12},~{r13},~{r14},<wbr>~{r15},~{dirflag},~{fpsr},~{<wbr>flags}"()<br>
                      %trunc = trunc i64 %add to i8<br>
                  <br>
                  Modified: llvm/trunk/test/CodeGen/X86/ve<wbr>ctor-half-conversions.ll<br>
                  URL: <a moz-do-not-send="true"
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/vector-half-conversions.ll?rev=287792&r1=287791&r2=287792&view=diff"
                    rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-pr<wbr>oject/llvm/trunk/test/CodeGen/<wbr>X86/vector-half-conversions.<wbr>ll?rev=287792&r1=287791&r2=<wbr>287792&view=diff</a><br>
                  ==============================<wbr>==============================<wbr>==================<br>
                  --- llvm/trunk/test/CodeGen/X86/ve<wbr>ctor-half-conversions.ll
                  (original)<br>
                  +++ llvm/trunk/test/CodeGen/X86/ve<wbr>ctor-half-conversions.ll
                  Wed Nov 23 12:33:49 2016<br>
                  @@ -4788,9 +4788,8 @@ define <8 x i16>
                  @cvt_8f64_to_8i16(<8 x<br>
                    ; AVX1-NEXT:    orl %ebx, %r14d<br>
                    ; AVX1-NEXT:    shlq $32, %r14<br>
                    ; AVX1-NEXT:    orq %r15, %r14<br>
                  -; AVX1-NEXT:    vmovupd (%rsp), %ymm0 # 32-byte
                  Reload<br>
                  -; AVX1-NEXT:    vpermilpd {{.*#+}} xmm0 = xmm0[1,0]<br>
                  -; AVX1-NEXT:    vzeroupper<br>
                  +; AVX1-NEXT:    vpermilpd $1, (%rsp), %xmm0 # 16-byte
                  Folded Reload<br>
                  +; AVX1-NEXT:    # xmm0 = mem[1,0]<br>
                    ; AVX1-NEXT:    callq __truncdfhf2<br>
                    ; AVX1-NEXT:    movw %ax, %bx<br>
                    ; AVX1-NEXT:    shll $16, %ebx<br>
                  @@ -4856,9 +4855,8 @@ define <8 x i16>
                  @cvt_8f64_to_8i16(<8 x<br>
                    ; AVX2-NEXT:    orl %ebx, %r14d<br>
                    ; AVX2-NEXT:    shlq $32, %r14<br>
                    ; AVX2-NEXT:    orq %r15, %r14<br>
                  -; AVX2-NEXT:    vmovupd (%rsp), %ymm0 # 32-byte
                  Reload<br>
                  -; AVX2-NEXT:    vpermilpd {{.*#+}} xmm0 = xmm0[1,0]<br>
                  -; AVX2-NEXT:    vzeroupper<br>
                  +; AVX2-NEXT:    vpermilpd $1, (%rsp), %xmm0 # 16-byte
                  Folded Reload<br>
                  +; AVX2-NEXT:    # xmm0 = mem[1,0]<br>
                    ; AVX2-NEXT:    callq __truncdfhf2<br>
                    ; AVX2-NEXT:    movw %ax, %bx<br>
                    ; AVX2-NEXT:    shll $16, %ebx<br>
                  @@ -5585,9 +5583,8 @@ define void
                  @store_cvt_8f64_to_8i16(<8 x<br>
                    ; AVX1-NEXT:    vzeroupper<br>
                    ; AVX1-NEXT:    callq __truncdfhf2<br>
                    ; AVX1-NEXT:    movw %ax, {{[0-9]+}}(%rsp) # 2-byte
                  Spill<br>
                  -; AVX1-NEXT:    vmovupd {{[0-9]+}}(%rsp), %ymm0 #
                  32-byte Reload<br>
                  -; AVX1-NEXT:    vpermilpd {{.*#+}} xmm0 = xmm0[1,0]<br>
                  -; AVX1-NEXT:    vzeroupper<br>
                  +; AVX1-NEXT:    vpermilpd $1, {{[0-9]+}}(%rsp), %xmm0
                  # 16-byte Folded Reload<br>
                  +; AVX1-NEXT:    # xmm0 = mem[1,0]<br>
                    ; AVX1-NEXT:    callq __truncdfhf2<br>
                    ; AVX1-NEXT:    movl %eax, %r12d<br>
                    ; AVX1-NEXT:    vmovupd {{[0-9]+}}(%rsp), %ymm0 #
                  32-byte Reload<br>
                  @@ -5654,9 +5651,8 @@ define void
                  @store_cvt_8f64_to_8i16(<8 x<br>
                    ; AVX2-NEXT:    vzeroupper<br>
                    ; AVX2-NEXT:    callq __truncdfhf2<br>
                    ; AVX2-NEXT:    movw %ax, {{[0-9]+}}(%rsp) # 2-byte
                  Spill<br>
                  -; AVX2-NEXT:    vmovupd {{[0-9]+}}(%rsp), %ymm0 #
                  32-byte Reload<br>
                  -; AVX2-NEXT:    vpermilpd {{.*#+}} xmm0 = xmm0[1,0]<br>
                  -; AVX2-NEXT:    vzeroupper<br>
                  +; AVX2-NEXT:    vpermilpd $1, {{[0-9]+}}(%rsp), %xmm0
                  # 16-byte Folded Reload<br>
                  +; AVX2-NEXT:    # xmm0 = mem[1,0]<br>
                    ; AVX2-NEXT:    callq __truncdfhf2<br>
                    ; AVX2-NEXT:    movl %eax, %r12d<br>
                    ; AVX2-NEXT:    vmovupd {{[0-9]+}}(%rsp), %ymm0 #
                  32-byte Reload<br>
                  <br>
                  <br>
                </blockquote>
                <br>
                <br>
              </div>
            </div>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
    <p><br>
    </p>
  </body>
</html>