<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - RegAllocGreedy/InlineSpiller inserts 3 reloads instead of 1."
   href="https://bugs.llvm.org/show_bug.cgi?id=43405">43405</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>RegAllocGreedy/InlineSpiller inserts 3 reloads instead of 1.
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Register Allocator
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>paulsson@linux.vnet.ibm.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org, quentin.colombet@gmail.com
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Created <span class=""><a href="attachment.cgi?id=22543" name="attach_22543" title="reduced testcase">attachment 22543</a> <a href="attachment.cgi?id=22543&action=edit" title="reduced testcase">[details]</a></span>
reduced testcase

While experimenting with improving the folding of memory operands (by swapping
operands of comparison instructions), I found cases where this led to
more reload/spill instructions. If I instead disabled folding of all compare
operands (in foldMemoryOperandImpl()), there were fewer reload/spill
instructions than with trunk. Since folding a reload into the
instruction should be an advantage (it avoids the reload snippet), this is
unexpected.

I have reduced one test case (attached), which generates 3 reloads instead of
1. (llc -mcpu=z14  -o out.s ./tc_regalloc.ll -disable-block-placement)
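
To compare reload/spill counts between builds, the spill comments that llc
prints in out.s can be tallied. The following is only a rough sketch, not part
of the attached testcase: the helper name count_spill_comments is hypothetical,
and it assumes the comments have the form "N-byte [Folded ]Spill/Reload" as in
the listing further down.

```python
import re
from collections import Counter

# Assumed comment format, taken from the listing in this report, e.g.
# "# 8-byte Folded Reload" or "# 8-byte Folded Spill".
SPILL_COMMENT = re.compile(r"#\s*\d+-byte (Folded )?(Spill|Reload)\b")


def count_spill_comments(asm_text):
    """Count spill and reload comments in assembly text (hypothetical helper)."""
    counts = Counter()
    for line in asm_text.splitlines():
        match = SPILL_COMMENT.search(line)
        if match:
            # group(2) is either "Spill" or "Reload".
            counts[match.group(2)] += 1
    return counts


# Excerpt of the instructions quoted in this report.
listing = """\
        stg     %r0, 208(%r15)          # 8-byte Folded Spill
        lg      %r0, 208(%r15)          # 8-byte Folded Reload
        lg      %r0, 208(%r15)          # 8-byte Folded Reload
        agr     %r5, %r0
        lg      %r0, 208(%r15)          # 8-byte Folded Reload
"""
print(count_spill_comments(listing))
```

On the excerpt this reports three reloads and one spill. Running it on the
out.s from each build gives a quick way to compare the totals.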

A value in %r0 is multiplied (mghi) by 24, and the result is used later by two
additions (agr). Without the folding of a compare operand (not shown), register
%r0 is used for the snippet between the unfolded reload and the compare. This
causes a different allocation for the mghi interval, resulting in just one
reload immediately before the additions, which is the only place it is needed.

This looks like a case where the register allocator
could do better, since the value should only need to be reloaded in the
block where it is used. I don't know exactly what is going wrong -
perhaps the loop structure is making things more complicated?

.LBB0_5:                                # %bb3
        mghi    %r0, 24
        stg     %r0, 208(%r15)          # 8-byte Folded Spill
...
.LBB0_12:                               # %bb63
        lg      %r0, 208(%r15)          # 8-byte Folded Reload
# %bb.13:
        lg      %r0, 208(%r15)          # 8-byte Folded Reload
.LBB0_14:                               # %bb36
.LBB0_16:                               # %bb69
        agr     %r5, %r0
        agr     %r6, %r0
...
.LBB0_21:                               # %bb87
        lg      %r0, 208(%r15)          # 8-byte Folded Reload
...
.LBB0_24:                               # %bb92
.Lfunc_end0:</pre>
        </div>
      </p>


    </body>
</html>