<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p>I think I've fixed everything now: tweaks to the vector
      legalizer and an isOperationLegal check in SimplifyDemandedBits
      seem to have done the trick.</p>
    <p>Shout if you are still seeing problems.</p>
    <p>Simon.<br>
    </p>
    <div class="moz-cite-prefix">On 25/06/2019 08:55, Eric Christopher
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CALehDX5=aqnkC7jb0UNYLL3HFMQH0dcmkbb6VBDn+rUfHvn96w@mail.gmail.com">
      <div dir="auto">X86, and I did send a repro. :)
        <div dir="auto"><br>
        </div>
        <div dir="auto">There are a few more messages from Craig and me
          in the thread.</div>
      </div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Tue, Jun 25, 2019, 12:37 AM
          Simon Pilgrim <<a href="mailto:llvm-dev@redking.me.uk"
            moz-do-not-send="true">llvm-dev@redking.me.uk</a>> wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0 0 0
          .8ex;border-left:1px #ccc solid;padding-left:1ex">Is this on
          X86 or some other target? The SimplifyDemanded* functions can<br>
          be a little quick to change opcodes without properly testing
          that the new opcode is legal...<br>
          <br>
          If you can send me a repro I'll take a look.<br>
          <br>
          Simon.<br>
          <br>
          On 25/06/2019 00:58, Eric Christopher wrote:<br>
          > Hi Simon,<br>
          ><br>
          > I'm seeing at least one occurrence of "fatal error: error
          in backend:<br>
          > Cannot emit physreg copy instruction" after this patch
          attempting to<br>
          > build an mp4 library. I'm not entirely sure from where or
          how (inline<br>
          > assembly, normal code, etc), but wanted to let you know
          as we<br>
          > investigate. The code has, of course, been building up to
          this point<br>
          > so my inclination is to revert and get a testcase reduced
          out of the<br>
          > sources.<br>
          ><br>
          > Thoughts?<br>
          ><br>
          > -eric<br>
          ><br>
          > On Wed, Jun 19, 2019 at 6:54 AM Simon Pilgrim via
          llvm-commits<br>
          > <<a href="mailto:llvm-commits@lists.llvm.org"
            target="_blank" rel="noreferrer" moz-do-not-send="true">llvm-commits@lists.llvm.org</a>>
          wrote:<br>
          >> Author: rksimon<br>
          >> Date: Wed Jun 19 06:58:02 2019<br>
          >> New Revision: 363802<br>
          >><br>
          >> URL: <a
            href="http://llvm.org/viewvc/llvm-project?rev=363802&view=rev"
            rel="noreferrer noreferrer" target="_blank"
            moz-do-not-send="true">http://llvm.org/viewvc/llvm-project?rev=363802&view=rev</a><br>
          >> Log:<br>
          >> [TargetLowering] SimplifyDemandedBits
          SIGN_EXTEND_VECTOR_INREG -> ANY/ZERO_EXTEND_VECTOR_INREG<br>
          >><br>
          >> Simplify SIGN_EXTEND_VECTOR_INREG if the extended
          bits are not required/known zero.<br>
          >><br>
          >> Matches what we already do for SIGN_EXTEND.<br>
          >><br>
          >> Modified:<br>
          >>     
          llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp<br>
          >>      llvm/trunk/test/CodeGen/X86/pmul.ll<br>
          >>      llvm/trunk/test/CodeGen/X86/xop-ifma.ll<br>
          >><br>
          >> Modified:
          llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp<br>
          >> URL: <a
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp?rev=363802&r1=363801&r2=363802&view=diff"
            rel="noreferrer noreferrer" target="_blank"
            moz-do-not-send="true">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp?rev=363802&r1=363801&r2=363802&view=diff</a><br>
          >>
==============================================================================<br>
          >> ---
          llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp
          (original)<br>
          >> +++
          llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp Wed Jun
          19 06:58:02 2019<br>
          >> @@ -1413,9 +1413,11 @@ bool
          TargetLowering::SimplifyDemandedBit<br>
          >>       bool IsVecInReg = Op.getOpcode() ==
          ISD::SIGN_EXTEND_VECTOR_INREG;<br>
          >><br>
          >>       // If none of the top bits are demanded,
          convert this into an any_extend.<br>
          >> -    // TODO: Add SIGN_EXTEND_VECTOR_INREG -
          ANY_EXTEND_VECTOR_INREG fold.<br>
          >> -    if (DemandedBits.getActiveBits() <= InBits
          && !IsVecInReg)<br>
          >> -      return TLO.CombineTo(Op,
          TLO.DAG.getNode(ISD::ANY_EXTEND, dl, VT, Src));<br>
          >> +    if (DemandedBits.getActiveBits() <= InBits)<br>
          >> +      return TLO.CombineTo(<br>
          >> +          Op, TLO.DAG.getNode(IsVecInReg ?
          ISD::ANY_EXTEND_VECTOR_INREG<br>
          >> +                                         :
          ISD::ANY_EXTEND,<br>
          >> +                              dl, VT, Src));<br>
          >><br>
          >>       APInt InDemandedBits =
          DemandedBits.trunc(InBits);<br>
          >>       APInt InDemandedElts =
          DemandedElts.zextOrSelf(InElts);<br>
          >> @@ -1434,9 +1436,11 @@ bool
          TargetLowering::SimplifyDemandedBit<br>
          >>       Known = Known.sext(BitWidth);<br>
          >><br>
          >>       // If the sign bit is known zero, convert this
          to a zero extend.<br>
          >> -    // TODO: Add SIGN_EXTEND_VECTOR_INREG -
          ZERO_EXTEND_VECTOR_INREG fold.<br>
          >> -    if (Known.isNonNegative() &&
          !IsVecInReg)<br>
          >> -      return TLO.CombineTo(Op,
          TLO.DAG.getNode(ISD::ZERO_EXTEND, dl, VT, Src));<br>
          >> +    if (Known.isNonNegative())<br>
          >> +      return TLO.CombineTo(<br>
          >> +          Op, TLO.DAG.getNode(IsVecInReg ?
          ISD::ZERO_EXTEND_VECTOR_INREG<br>
          >> +                                         :
          ISD::ZERO_EXTEND,<br>
          >> +                              dl, VT, Src));<br>
          >>       break;<br>
          >>     }<br>
          >>     case ISD::ANY_EXTEND: {<br>
          >><br>
          >> Modified: llvm/trunk/test/CodeGen/X86/pmul.ll<br>
          >> URL: <a
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/pmul.ll?rev=363802&r1=363801&r2=363802&view=diff"
            rel="noreferrer noreferrer" target="_blank"
            moz-do-not-send="true">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/pmul.ll?rev=363802&r1=363801&r2=363802&view=diff</a><br>
          >>
==============================================================================<br>
          >> --- llvm/trunk/test/CodeGen/X86/pmul.ll (original)<br>
          >> +++ llvm/trunk/test/CodeGen/X86/pmul.ll Wed Jun 19
          06:58:02 2019<br>
          >> @@ -1326,15 +1326,13 @@ define <8 x i64>
          @mul_v8i64_sext(<8 x i1<br>
          >>   ; SSE41-NEXT:    pshufd {{.*#+}} xmm3 =
          xmm0[1,1,2,3]<br>
          >>   ; SSE41-NEXT:    pmovsxwq %xmm3, %xmm6<br>
          >>   ; SSE41-NEXT:    pmovsxwq %xmm0, %xmm7<br>
          >> -; SSE41-NEXT:    pshufd {{.*#+}} xmm0 =
          xmm2[2,3,0,1]<br>
          >> -; SSE41-NEXT:    pmovsxdq %xmm0, %xmm3<br>
          >> +; SSE41-NEXT:    pshufd {{.*#+}} xmm3 =
          xmm2[2,2,3,3]<br>
          >>   ; SSE41-NEXT:    pmuldq %xmm4, %xmm3<br>
          >> -; SSE41-NEXT:    pmovsxdq %xmm2, %xmm2<br>
          >> +; SSE41-NEXT:    pmovzxdq {{.*#+}} xmm2 =
          xmm2[0],zero,xmm2[1],zero<br>
          >>   ; SSE41-NEXT:    pmuldq %xmm5, %xmm2<br>
          >> -; SSE41-NEXT:    pshufd {{.*#+}} xmm0 =
          xmm1[2,3,0,1]<br>
          >> -; SSE41-NEXT:    pmovsxdq %xmm0, %xmm4<br>
          >> +; SSE41-NEXT:    pshufd {{.*#+}} xmm4 =
          xmm1[2,2,3,3]<br>
          >>   ; SSE41-NEXT:    pmuldq %xmm6, %xmm4<br>
          >> -; SSE41-NEXT:    pmovsxdq %xmm1, %xmm0<br>
          >> +; SSE41-NEXT:    pmovzxdq {{.*#+}} xmm0 =
          xmm1[0],zero,xmm1[1],zero<br>
          >>   ; SSE41-NEXT:    pmuldq %xmm7, %xmm0<br>
          >>   ; SSE41-NEXT:    movdqa %xmm4, %xmm1<br>
          >>   ; SSE41-NEXT:    retq<br>
          >><br>
          >> Modified: llvm/trunk/test/CodeGen/X86/xop-ifma.ll<br>
          >> URL: <a
href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/xop-ifma.ll?rev=363802&r1=363801&r2=363802&view=diff"
            rel="noreferrer noreferrer" target="_blank"
            moz-do-not-send="true">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/xop-ifma.ll?rev=363802&r1=363801&r2=363802&view=diff</a><br>
          >>
==============================================================================<br>
          >> --- llvm/trunk/test/CodeGen/X86/xop-ifma.ll
          (original)<br>
          >> +++ llvm/trunk/test/CodeGen/X86/xop-ifma.ll Wed Jun
          19 06:58:02 2019<br>
          >> @@ -67,12 +67,10 @@ define <8 x i32>
          @test_mul_v8i32_add_v8i<br>
          >>   define <4 x i64>
          @test_mulx_v4i32_add_v4i64(<4 x i32> %a0, <4 x
          i32> %a1, <4 x i64> %a2) {<br>
          >>   ; XOP-AVX1-LABEL: test_mulx_v4i32_add_v4i64:<br>
          >>   ; XOP-AVX1:       # %bb.0:<br>
          >> -; XOP-AVX1-NEXT:    vpmovsxdq %xmm0, %xmm3<br>
          >> -; XOP-AVX1-NEXT:    vpshufd {{.*#+}} xmm0 =
          xmm0[2,3,0,1]<br>
          >> -; XOP-AVX1-NEXT:    vpmovsxdq %xmm0, %xmm0<br>
          >> -; XOP-AVX1-NEXT:    vpmovsxdq %xmm1, %xmm4<br>
          >> -; XOP-AVX1-NEXT:    vpshufd {{.*#+}} xmm1 =
          xmm1[2,3,0,1]<br>
          >> -; XOP-AVX1-NEXT:    vpmovsxdq %xmm1, %xmm1<br>
          >> +; XOP-AVX1-NEXT:    vpmovzxdq {{.*#+}} xmm3 =
          xmm0[0],zero,xmm0[1],zero<br>
          >> +; XOP-AVX1-NEXT:    vpshufd {{.*#+}} xmm0 =
          xmm0[2,1,3,3]<br>
          >> +; XOP-AVX1-NEXT:    vpmovzxdq {{.*#+}} xmm4 =
          xmm1[0],zero,xmm1[1],zero<br>
          >> +; XOP-AVX1-NEXT:    vpshufd {{.*#+}} xmm1 =
          xmm1[2,1,3,3]<br>
          >>   ; XOP-AVX1-NEXT:    vextractf128 $1, %ymm2, %xmm5<br>
          >>   ; XOP-AVX1-NEXT:    vpmacsdql %xmm5, %xmm1, %xmm0,
          %xmm0<br>
          >>   ; XOP-AVX1-NEXT:    vpmacsdql %xmm2, %xmm4, %xmm3,
          %xmm1<br>
        </blockquote>
      </div>
    </blockquote>
  </body>
</html>