<html>

    <head>

      <base href="https://bugs.llvm.org/">

    </head>

    <body><span class="vcard"><a class="email" href="mailto:llvm-dev@redking.me.uk" title="Simon Pilgrim <llvm-dev@redking.me.uk>"> <span class="fn">Simon Pilgrim</span></a>

</span> changed

          <a class="bz_bug_link 

          bz_status_RESOLVED  bz_closed"

   title="RESOLVED FIXED - [X86] AVX2 should use an extract_subvector and phadd for the first step of a pairwise v8i32 addition reduction"

   href="https://bugs.llvm.org/show_bug.cgi?id=39921">bug 39921</a>

          <br>

             <table border="1" cellspacing="0" cellpadding="8">

          <tr>

            <th>What</th>

            <th>Removed</th>

            <th>Added</th>

          </tr>

         <tr>

           <td style="text-align:right;">Fixed By Commit(s)</td>

           <td>r359491

           </td>

           <td>r359491,r362327

           </td>

         </tr>

         <tr>

           <td style="text-align:right;">Resolution</td>

           <td>---

           </td>

           <td>FIXED

           </td>

         </tr>

         <tr>

           <td style="text-align:right;">Status</td>

           <td>NEW

           </td>

           <td>RESOLVED

           </td>

         </tr></table>

      <p>

        <div>

            <b><a class="bz_bug_link 

          bz_status_RESOLVED  bz_closed"

   title="RESOLVED FIXED - [X86] AVX2 should use an extract_subvector and phadd for the first step of a pairwise v8i32 addition reduction"

   href="https://bugs.llvm.org/show_bug.cgi?id=39921#c2">Comment # 2</a>

              on <a class="bz_bug_link 

          bz_status_RESOLVED  bz_closed"

   title="RESOLVED FIXED - [X86] AVX2 should use an extract_subvector and phadd for the first step of a pairwise v8i32 addition reduction"

   href="https://bugs.llvm.org/show_bug.cgi?id=39921">bug 39921</a>

              from <span class="vcard"><a class="email" href="mailto:llvm-dev@redking.me.uk" title="Simon Pilgrim <llvm-dev@redking.me.uk>"> <span class="fn">Simon Pilgrim</span></a>

</span></b>

        <pre>(In reply to Simon Pilgrim from <a href="show_bug.cgi?id=39921#c1">comment #1</a>)

<span class="quote">> But Intel targets can get stuck as the 'fast shuffle' attribute gets in the

> way:

> 

> pairwise_reduction8i32: # @pairwise_reduction8i32

>   vmovdqa .LCPI0_0(%rip), %ymm1 # ymm1 = [0,2,4,6,4,6,6,7]

>   vpermd %ymm0, %ymm1, %ymm1

>   vmovdqa .LCPI0_1(%rip), %ymm2 # ymm2 = [1,3,5,7,5,7,6,7]

>   vpermd %ymm0, %ymm2, %ymm0

>   vpaddd %xmm0, %xmm1, %xmm0

>   vpshufd $232, %xmm0, %xmm1 # xmm1 = xmm0[0,2,2,3]

>   vpshufd $237, %xmm0, %xmm0 # xmm0 = xmm0[1,3,2,3]

>   vpaddd %xmm0, %xmm1, %xmm0

>   vpshufd $229, %xmm0, %xmm1 # xmm1 = xmm0[1,1,2,3]

>   vpaddd %xmm1, %xmm0, %xmm0

>   vmovd %xmm0, %eax

>   vzeroupper

>   retq</span >

Resolving - the fast-variable-shuffle issue was fixed in rL362327. Current codegen:

pairwise_reduction8i32: # @pairwise_reduction8i32

  vextracti128 $1, %ymm0, %xmm1

  vphaddd %xmm1, %xmm0, %xmm0

  vphaddd %xmm0, %xmm0, %xmm0

  vpshufd $229, %xmm0, %xmm1 # xmm1 = xmm0[1,1,2,3]

  vpaddd %xmm1, %xmm0, %xmm0

  vmovd %xmm0, %eax

  vzeroupper

  retq</pre>
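        <pre>As a rough cross-check, the two sequences above can be modelled lane by lane.
This is only a sketch in Python, not LLVM code: the helper names (permute, add,
hadd, reduce_vpermd, reduce_phadd) are hypothetical, and it assumes the usual
semantics of vpermd/vpshufd (lane select), vpaddd (lane-wise i32 add with
wraparound) and vphaddd (adjacent-pair add of both sources).

```python
M = 2**32  # i32 lanes wrap modulo 2**32


def permute(src, idx):
    """Model vpermd/vpshufd: result lane i takes src[idx[i]]."""
    return [src[i] for i in idx]


def add(a, b):
    """Model vpaddd: lane-wise 32-bit add."""
    return [(x + y) % M for x, y in zip(a, b)]


def hadd(a, b):
    """Model 128-bit vphaddd: adjacent pairs of a, then adjacent pairs of b."""
    return [(a[0] + a[1]) % M, (a[2] + a[3]) % M,
            (b[0] + b[1]) % M, (b[2] + b[3]) % M]


def reduce_vpermd(v):
    """Quoted sequence: two vpermd shuffles feed the first add."""
    ymm1 = permute(v, [0, 2, 4, 6, 4, 6, 6, 7])
    ymm0 = permute(v, [1, 3, 5, 7, 5, 7, 6, 7])
    xmm0 = add(ymm0[:4], ymm1[:4])          # vpaddd on the low 128 bits
    xmm1 = permute(xmm0, [0, 2, 2, 3])      # vpshufd $232
    xmm0 = permute(xmm0, [1, 3, 2, 3])      # vpshufd $237
    xmm0 = add(xmm0, xmm1)
    xmm1 = permute(xmm0, [1, 1, 2, 3])      # vpshufd $229
    xmm0 = add(xmm1, xmm0)
    return xmm0[0]                          # vmovd


def reduce_phadd(v):
    """Sequence after the fix: extract the high half, then two vphaddd."""
    xmm0, xmm1 = v[:4], v[4:]               # vextracti128 $1
    xmm0 = hadd(xmm0, xmm1)
    xmm0 = hadd(xmm0, xmm0)
    xmm1 = permute(xmm0, [1, 1, 2, 3])      # vpshufd $229
    xmm0 = add(xmm1, xmm0)
    return xmm0[0]                          # vmovd


if __name__ == "__main__":
    v = [1, 2, 3, 4, 5, 6, 7, 8]
    print(reduce_vpermd(v), reduce_phadd(v))
```

Under these assumptions both functions return the same low lane for any 8-lane
input, which is what makes the shorter vextracti128 + vphaddd form a valid
replacement for the vpermd-based first step.</pre>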

        </div>

      </p>


    </body>

</html>