<html>
    <head>
      <base href="https://llvm.org/bugs/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - LSR needs to be improved for x86"
   href="https://llvm.org/bugs/show_bug.cgi?id=23384">23384</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>LSR needs to be improved for x86
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Backend: X86
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>wmi@google.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvmbugs@cs.uiuc.edu
          </td>
        </tr>

        <tr>
          <th>Classification</th>
          <td>Unclassified
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Created <span class=""><a href="attachment.cgi?id=14272" name="attach_14272" title="testcase 1.cc">attachment 14272</a> <a href="attachment.cgi?id=14272&action=edit" title="testcase 1.cc">[details]</a></span>
testcase 1.cc

For the attached testcase, LLVM currently generates three AddRecs in the kernel
loop, although a single AddRec would suffice if complex addressing modes were
used. Each AddRec is lowered to an add instruction at the end of the loop, so
the number of AddRecs matters for loop performance. It matters more than
NumRegs, especially when register pressure is not very high.

~/workarea/llvm-r234391/build/bin/clang++ -std=c++11 -O2
-fno-omit-frame-pointer -S 1.cc -o 1.s

The kernel loop as currently generated (three adds per iteration):
.LBB1_10:                               # %for.body6.i
        movss   -4(%rcx), %xmm2         # xmm2 = mem[0],zero,zero,zero
        movss   (%rcx), %xmm1           # xmm1 = mem[0],zero,zero,zero
        mulss   -4(%rdi), %xmm2
        addss   %xmm0, %xmm2
        mulss   (%rdi), %xmm1
        addss   %xmm2, %xmm1
        addq    $8, %rdi
        addq    $8, %rcx
        addl    $-2, %esi
        movaps  %xmm1, %xmm0
        jne     .LBB1_10 

A better version, with a single add per iteration:
.LBB1_8:                                # %for.body6.i
        movss   (%rdi,%rcx,4), %xmm0    # xmm0 = mem[0],zero,zero,zero
        mulss   (%rsi,%rcx,4), %xmm0
        addss   %xmm1, %xmm0
        movss   4(%rdi,%rcx,4), %xmm1   # xmm1 = mem[0],zero,zero,zero
        mulss   4(%rsi,%rcx,4), %xmm1
        addss   %xmm0, %xmm1
        addq    $2, %rcx
        cmpl    %ecx, %eax
        movaps  %xmm1, %xmm0
        jne     .LBB1_8</pre>
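        <pre>The difference between the two loops above can be sketched at the source
level. This is only an illustration: the attached 1.cc is not reproduced here,
so the function names, signatures, and the assumption that the kernel is a
dot product unrolled by two are all hypothetical. The compiler is free to
rewrite either form, so the sketch shows the two induction strategies rather
than a way to reproduce the bug.

```cpp
// Hypothetical sketch; names and signatures are assumptions, not taken
// from the attached testcase. n is assumed to be even and non-negative.

// Mirrors the current output: two pointer inductions plus a down-counter,
// so three values are updated on every iteration.
float dot_three_inductions(const float *a, const float *b, int n) {
  float sum = 0.0f;
  for (int left = n; left != 0; left -= 2) {
    sum += a[0] * b[0];
    sum += a[1] * b[1];
    a += 2;  // first pointer induction
    b += 2;  // second pointer induction
  }
  return sum;
}

// Mirrors the better output: one index shared by both arrays, with the
// element offsets folded into base + index*4 + displacement addressing.
float dot_one_induction(const float *a, const float *b, int n) {
  float sum = 0.0f;
  for (int i = 0; i != n; i += 2) {
    sum += a[i] * b[i];
    sum += a[i + 1] * b[i + 1];
  }
  return sum;
}
```

Both functions compute the same result. The second form needs only one
increment per iteration because the index is shared by every memory access.</pre>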
        </div>
      </p>
    </body>
</html>