[llvm-bugs] [Bug 33662] New: Building and running 'check-llvm' with ENABLE_SANITIZER=Address;Undefined is too slow

Fri Jun 30 10:53:49 PDT 2017

https://bugs.llvm.org/show_bug.cgi?id=33662

            Bug ID: 33662
           Summary: Building and running 'check-llvm' with
                    ENABLE_SANITIZER=Address;Undefined is too slow
           Product: new-bugs
           Version: unspecified
          Hardware: PC
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: new bugs
          Assignee: unassignedbugs at nondot.org
          Reporter: chandlerc at gmail.com
                CC: djasper at google.com, eugeni.stepanov at gmail.com,
                    kcc at google.com, llvm-bugs at lists.llvm.org

I was reminded that I should be using the sanitizers as part of my upstream
development more frequently and more consistently, and in fact *all* developers
should.

So I switched over and ran 'ninja check-llvm'. I had -O2 on, and ASan and UBSan
enabled.

The test execution time was *over 10x slower*!!!! This is way worse than what
ASan advertises, and probably explains why so few developers are willing to use
it routinely.

I would like to change this. I think addressing this is a good test for whether
ASan+(some)UBSan is a realistic mode for developers to use as their default
"check for bugs" mode.

Here are my initial findings after profiling where the time went:

1) We spend 25.82% of the time inside of 'llc' doing work (mostly verifying
that each machine function pass preserved analyses, which is really silly, but
unrelated to this bug).

2) We spend *well over 10%* of the total test time in LSan for 'llc' executions
alone:
   - 7.26% __lsan::ScanRangeForPointers
      + 0.31% page_fault
        0.04% __lsan::ScanRangeForPointers
   - 3.91% __lsan::PointsIntoChunk

3) We spend over 6% of the time in llvm-symbolizer. 5.8% of the time is in
symbolizeInlinedCode. This is really weird as I think there was only one crash
and it didn't get a symbolized backtrace.

4) We spend over 5.2% of the time in LSan for 'opt' executions. This is *more
time than we spend running 'opt'*! (3.4% in ScanRangeForPointers, 1.8% in
PointsIntoChunk)

5) We spend, wait for it, 1.74% of the time running 'opt' for actual tests.
This is less time than we spend running FileCheck, llvm-symbolizer, LSan, or
any other part of the test suite really.

So there are some serious problems here IMO.

The most glaring is that some 15% of the test time is LSan. The above times are
actually a lower bound on the cost there because they don't account for much of
the time in the kernel dealing with faulting in all the pages.
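As a quick sanity check on that figure, here is a minimal sketch that adds up
the LSan percentages quoted in findings 2 and 4. The variable and function
names are made up for illustration; only the numbers come from the profile
above.

```python
# Percentages of total test time, copied from the profile in this report.
# The dictionary and function names are illustrative, not from any tool.
lsan_in_llc = {"ScanRangeForPointers": 7.26, "PointsIntoChunk": 3.91}
lsan_in_opt = {"ScanRangeForPointers": 3.4, "PointsIntoChunk": 1.8}


def total_percent(samples):
    """Sum the percentage entries, rounded to two decimals."""
    return round(sum(samples.values()), 2)


lsan_llc_total = total_percent(lsan_in_llc)
lsan_opt_total = total_percent(lsan_in_opt)
lsan_total = round(lsan_llc_total + lsan_opt_total, 2)

print(f"LSan under llc: {lsan_llc_total}%")
print(f"LSan under opt: {lsan_opt_total}%")
print(f"LSan combined:  {lsan_total}%")
```

The listed user-space symbols alone come to a bit over 16%, which is in line
with the "some 15%" estimate and, as noted, still excludes the kernel time
spent faulting in pages.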

The second most glaring is that llvm-symbolizer remains amazingly hot in this
profile considering there weren't 100s of failures being symbolized. Why is
that?

The third most glaring (or perhaps the most glaring but also hardest to fix) is
why the ratio of time between 'llc' and 'opt' is so very bad. But this
isn't ASan specific. In a normal (with asserts) test run, 'llc' takes 25% of
the time and opt's main takes 1.4% of the time. We spend less time testing
things with opt than we do *registering passes with opt* or time running the
'initialize' routines of LLVM for opt. It's amazing really.
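
To put a number on that ratio, a small sketch using only the percentages
quoted in this report (the names are hypothetical, and the 25% figure for the
asserts-only run is already rounded, so treat the second ratio as
approximate):

```python
# Share of total test time spent in each tool, as quoted in this report.
# "sanitized" is the ASan+UBSan run; "asserts_only" is the normal run.
sanitized = {"llc": 25.82, "opt": 1.74}
asserts_only = {"llc": 25.0, "opt": 1.4}


def llc_to_opt_ratio(profile):
    """How many times more test time goes to llc than to opt."""
    return profile["llc"] / profile["opt"]


sanitized_ratio = llc_to_opt_ratio(sanitized)
asserts_ratio = llc_to_opt_ratio(asserts_only)

print(f"sanitized run:    llc/opt = {sanitized_ratio:.1f}x")
print(f"asserts-only run: llc/opt = {asserts_ratio:.1f}x")
```

Both runs show 'llc' taking well over ten times the test time of 'opt', which
supports the point that this imbalance is not ASan specific.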
