<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - MergeBlockIntoPredecessor in UnrollLoop is very slow for sufficiently complicated loops"
   href="https://bugs.llvm.org/show_bug.cgi?id=47746">47746</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>MergeBlockIntoPredecessor in UnrollLoop is very slow for sufficiently complicated loops
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Transformation Utilities
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>ctetreau@quicinc.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr></table>
      <p>
        <div>
        <pre>UnrollLoop in Transforms/Utils/LoopUnroll.cpp calls MergeBlockIntoPredecessor
for each of the latch basic blocks. MergeBlockIntoPredecessor calls
RemoveRedundantDbgInstrs, which iterates over the instructions of the basic
block twice. For sufficiently complicated loops, this can cause extremely long
compile times.

We're seeing compile times of more than 10 minutes on internal codebases.
Unfortunately, I cannot share any code for these cases. Under a profiler, ~98%
of the runtime is spent inside RemoveRedundantDbgInstrs.

I believe it might be possible to add a flag to MergeBlockIntoPredecessor to
skip the call to RemoveRedundantDbgInstrs, and then call
RemoveRedundantDbgInstrs for each basic block inside of
simplifyLoopAfterUnroll. However, I am not familiar enough with this code to
know for certain that there will be no unintended consequences. I experimented
with this in my downstream, and it reduced the compile time from 10 minutes to
4 seconds.</pre>
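        <pre>To illustrate why deferring the cleanup could help, here is a rough toy
model. It is not LLVM code: the function names and block sizes are
hypothetical, and it assumes that the merged block keeps growing as successive
blocks are folded into it, which the report above does not state explicitly.
It only counts how many instructions a two-pass cleanup would visit in each
scheme.

```python
def merge_with_eager_cleanup(block_sizes):
    """Merge blocks one by one and rescan the merged block after every merge.

    Models a cleanup that makes two passes over the whole merged block
    each time another block is folded in.
    """
    merged = 0   # instructions in the growing merged block (assumption)
    visits = 0   # total instructions visited by the cleanup
    for size in block_sizes:
        merged += size
        visits += 2 * merged  # two passes over everything merged so far
    return visits


def merge_with_deferred_cleanup(block_sizes):
    """Merge all blocks first, then run the two-pass cleanup a single time."""
    merged = sum(block_sizes)
    return 2 * merged


if __name__ == "__main__":
    # Hypothetical workload: 1000 blocks of 100 instructions each.
    sizes = [100] * 1000
    print("eager visits:   ", merge_with_eager_cleanup(sizes))
    print("deferred visits:", merge_with_deferred_cleanup(sizes))
```

Under these assumptions the eager scheme grows quadratically with the number
of merged blocks, while the deferred scheme stays linear. Whether the real
slowdown follows this pattern would need to be confirmed with a profiler.</pre>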
        </div>
      </p>


    </body>
</html>