<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - AsmPrinter puts NamedRegionTimer around every instruction, causing big overhead in clang -ftime-report"
   href="https://bugs.llvm.org/show_bug.cgi?id=40303">40303</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>AsmPrinter puts NamedRegionTimer around every instruction, causing big overhead in clang -ftime-report
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Common Code Generator Code
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>aras@nesnausk.org
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Clang's -ftime-report adds quite a lot of overhead compared to compiling
without the flag. Curiously, in the reported times, passes like "X86 Assembly
Printer" take up a lot of time.

Overhead of having -ftime-report on is as high as 60% (roughly 45-60% for most
of the files below), which sounds like a lot. Some tests I did locally (actual
files don't matter; basically any non-trivial .cpp file compilation will do):

                regular   -ftime-report
catch.cpp       1.337     1.950                 
catch.cpp -O2   4.616     6.586         
stl.cpp         0.882     1.269                 
unityformat.cpp 7.195     7.312                 
range-compr.cpp 5.352     6.129                 
shader.cpp      6.000     9.555                 
shader.cpp -O2  12.635    20.061                
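
For reference, the relative overhead of each row above can be worked out
directly. This is only a small sketch: the helper and constant names are made
up for illustration, and the timings are copied from the table as-is.

```cpp
// Relative overhead in percent: how much longer the -ftime-report run took.
// (Hypothetical helper, named here for illustration only.)
constexpr double overheadPercent(double Regular, double WithReport) {
  return (WithReport - Regular) / Regular * 100.0;
}

// Timings copied from the table above.
constexpr double Catch       = overheadPercent(1.337, 1.950);   // roughly 46%
constexpr double CatchO2     = overheadPercent(4.616, 6.586);   // roughly 43%
constexpr double Stl         = overheadPercent(0.882, 1.269);   // roughly 44%
constexpr double UnityFormat = overheadPercent(7.195, 7.312);   // roughly 2%
constexpr double RangeCompr  = overheadPercent(5.352, 6.129);   // roughly 15%
constexpr double Shader      = overheadPercent(6.000, 9.555);   // roughly 59%
constexpr double ShaderO2    = overheadPercent(12.635, 20.061); // roughly 59%
```

So five of the seven rows land between roughly 43% and 59%, while
unityformat.cpp and range-compr.cpp are affected much less.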

My guess is that this is because lib/CodeGen/AsmPrinter/AsmPrinter.cpp creates
two timer regions (NamedRegionTimer) per instruction for every "Handler" that
it invokes.

EmitFunctionBody basically looks like:

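for (auto &MBB : *MF) {
  for (auto &MI : MBB) {
    // One timer region per handler, before every instruction.
    for (const HandlerInfo &HI : Handlers) {
      NamedRegionTimer T(...);
      HI.Handler->beginInstruction(&MI);
    }
    // ...
    // And again one timer region per handler, after every instruction.
    for (const HandlerInfo &HI : Handlers) {
      NamedRegionTimer T(...);
      HI.Handler->endInstruction();
    }
  }
}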

And each timer region of course involves sampling elapsed time, process times,
and memory usage twice (at the beginning and at the end of the region).
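
To get a feel for how many samples that adds up to, here is a minimal counting
sketch. CountingRegionTimer is a hypothetical stand-in, not LLVM's
NamedRegionTimer: it takes no real measurements and only counts one sample at
construction and one at destruction. The second function shows one possible
alternative shape (a single region around the whole body) purely for
comparison, not as a proposed patch.

```cpp
// Hypothetical stand-in for a scoped region timer. It only counts how many
// samples would be taken: one when the region starts, one when it ends.
struct CountingRegionTimer {
  explicit CountingRegionTimer(unsigned long &Count) : Samples(Count) {
    ++Samples; // sample at the beginning of the region
  }
  ~CountingRegionTimer() {
    ++Samples; // sample at the end of the region
  }
  unsigned long &Samples;
};

// Same loop shape as the EmitFunctionBody sketch: one region per handler
// before each instruction, and one region per handler after it.
unsigned long samplesPerInstructionRegions(unsigned Instructions,
                                           unsigned Handlers) {
  unsigned long Samples = 0;
  for (unsigned I = 0; I != Instructions; ++I) {
    for (unsigned H = 0; H != Handlers; ++H) {
      CountingRegionTimer T(Samples); // "beginInstruction" region
    }
    for (unsigned H = 0; H != Handlers; ++H) {
      CountingRegionTimer T(Samples); // "endInstruction" region
    }
  }
  return Samples;
}

// Assumed alternative for comparison: a single region around the whole body.
unsigned long samplesSingleRegion(unsigned Instructions, unsigned Handlers) {
  unsigned long Samples = 0;
  {
    CountingRegionTimer T(Samples); // one region for everything
    for (unsigned I = 0; I != Instructions; ++I) {
      for (unsigned H = 0; H != Handlers; ++H) {
        // handler work would run here, untimed
      }
    }
  }
  return Samples;
}
```

With 1000 instructions and 2 handlers, the first shape takes 8000 samples
(4 per instruction per handler), while the single region takes 2.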


I'm testing this with a couple-of-days-old clang trunk (8.0.0), but the issue
seems to have been there for a while. I haven't tracked down how far back it goes.</pre>
        </div>
      </p>


    </body>
</html>