[llvm-bugs] [Bug 40303] New: AsmPrinter puts NamedRegionTimer around every instruction, causing big overhead in clang -ftime-report

via llvm-bugs llvm-bugs at lists.llvm.org
Mon Jan 14 00:16:42 PST 2019


https://bugs.llvm.org/show_bug.cgi?id=40303

            Bug ID: 40303
           Summary: AsmPrinter puts NamedRegionTimer around every
                    instruction, causing big overhead in clang
                    -ftime-report
           Product: libraries
           Version: trunk
          Hardware: All
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Common Code Generator Code
          Assignee: unassignedbugs at nondot.org
          Reporter: aras at nesnausk.org
                CC: llvm-bugs at lists.llvm.org

Clang -ftime-report seems to have quite a lot of overhead, compared to when the
flag is not used. In the time reported, curiously passes like "X86 Assembly
Printer" take up a lot of of time.

Overhead of having -ftime-report on is 30-60% which sounds like a lot. Some
tests I did locally (actual files don't matter; basically any non-trivial .cpp
file compilation will do):

                regular   -ftime-report
catch.cpp       1.337     1.950                 
catch.cpp -O2   4.616     6.586         
stl.cpp         0.882     1.269                 
unityformat.cpp 7.195     7.312                 
range-compr.cpp 5.352     6.129                 
shader.cpp      6.000     9.555                 
shader.cpp -O2  12.635    20.061                

My guess is that's because lib/CodeGen/AsmPrinter/AsmPrinter.cpp basically has
two timer samples (NamedRegionTimer) for every instruction, and for every
"Handler" that it invokes.

EmitFunctionBody basically looks like:

for (auto &MBB : *MF) {
  for (auto &MI : MBB) {
    for (const HandlerInfo &HI : Handlers) {
      NamedRegionTimer T(...);
      beginInstruction();
    }
    // ...
    for (const HandlerInfo &HI : Handlers) {
      NamedRegionTimer T(...);
      endInstruction();
    }
  }
}

And then every timer sample is of course involves getting elapsed time, process
times, and memory usage samples twice (for beginning and end of region).


I'm testing this with a couple days old clang trunk (8.0.0), but seemingly the
issue has been there for a while. Haven't tracked down how far back it exists.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-bugs/attachments/20190114/d633f98d/attachment.html>


More information about the llvm-bugs mailing list