<html>

    <head>

      <base href="https://bugs.llvm.org/">

    </head>

    <body><span class="vcard"><a class="email" href="mailto:spatel+llvm@rotateright.com" title="Sanjay Patel <spatel+llvm@rotateright.com>"> <span class="fn">Sanjay Patel</span></a>

</span> changed

          <a class="bz_bug_link 

          bz_status_RESOLVED  bz_closed"

   title="RESOLVED FIXED - Loop vectorizer seems very reluctant to make use of PMULLD"

   href="https://bugs.llvm.org/show_bug.cgi?id=22703">bug 22703</a>

          <br>

             <table border="1" cellspacing="0" cellpadding="8">

          <tr>

            <th>What</th>

            <th>Removed</th>

            <th>Added</th>

          </tr>

         <tr>

           <td style="text-align:right;">Resolution</td>

           <td>---

           </td>

           <td>FIXED

           </td>

         </tr>

         <tr>

           <td style="text-align:right;">Status</td>

           <td>NEW

           </td>

           <td>RESOLVED

           </td>

         </tr></table>

      <p>

        <div>

            <b><a class="bz_bug_link 

          bz_status_RESOLVED  bz_closed"

   title="RESOLVED FIXED - Loop vectorizer seems very reluctant to make use of PMULLD"

   href="https://bugs.llvm.org/show_bug.cgi?id=22703#c3">Comment # 3</a>

              on <a class="bz_bug_link 

          bz_status_RESOLVED  bz_closed"

   title="RESOLVED FIXED - Loop vectorizer seems very reluctant to make use of PMULLD"

   href="https://bugs.llvm.org/show_bug.cgi?id=22703">bug 22703</a>

              from <span class="vcard"><a class="email" href="mailto:spatel+llvm@rotateright.com" title="Sanjay Patel <spatel+llvm@rotateright.com>"> <span class="fn">Sanjay Patel</span></a>

</span></b>

        <pre>The attached program is no longer a valid test. LLVM pre-computes the whole
thing - no multiplies needed!

If we modify main to take 'argc' and the loop to be:

    for (i=0;i&lt;argc;i++) a[i]= (float) (i*i);

...then we can see with r318307:

$ ./clang++ -w -S -O2 -msse4.1 floop.cpp -o -  -emit-llvm | grep mul
  %5 = mul nsw &lt;4 x i32&gt; %vec.ind25, %vec.ind25
  %6 = mul nsw &lt;4 x i32&gt; %step.add26, %step.add26
  %13 = mul nsw &lt;4 x i32&gt; %vec.ind.next28, %vec.ind.next28
  %14 = mul nsw &lt;4 x i32&gt; %step.add26.1, %step.add26.1
  %21 = mul nsw &lt;4 x i32&gt; %vec.ind25.unr, %vec.ind25.unr
  %22 = mul nsw &lt;4 x i32&gt; %step.add26.epil, %step.add26.epil

and:

$ ./clang++ -w -S -O2 -msse4.1 floop.cpp -o -   |grep mul
        pmulld  %xmm6, %xmm6
        pmulld  %xmm5, %xmm5
        pmulld  %xmm5, %xmm5
        pmulld  %xmm6, %xmm6
        pmulld  %xmm0, %xmm0
        pmulld  %xmm1, %xmm1

So I'm going to close this as fixed. If there are other cases where we still do
not produce the right vector multiply, please do open a new bug report.</pre>

        </div>
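        <div>
        <pre>The loop change described in the comment can be sketched as below. This is
only a minimal stand-in, not the attached floop.cpp: the function name
fill_squares, the pointer parameter and the parameter n (playing the role of
'argc') are all assumptions made for illustration. The point it shows is that
the trip count is only known at run time, so the squares cannot be
pre-computed at compile time.

```cpp
// Hypothetical stand-in for the modified loop; the names and the buffer
// handling are assumptions, not taken from the attached test program.
void fill_squares(float *a, int n) {
    // n is only known at run time (it stands in for argc), so the
    // compiler cannot fold the whole loop into constants.
    for (int i = 0; i < n; i++)
        a[i] = (float)(i * i);
}
```

Calling fill_squares(a, argc) from main and building with the flags quoted in
the comment (-O2 -msse4.1) would be one way to look for the vector multiplies
reported there; the exact output may differ between LLVM revisions.</pre>
        </div>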

      </p>


    </body>

</html>