[cfe-dev] [ubsan] Check recovery performance, optimizations

Fri Dec 7 08:52:05 PST 2012

On Fri, Dec 7, 2012 at 8:20 AM, Will Dietz <willdtz at gmail.com> wrote:
> Hi all,
>
> A short while back support to recover from checks was added as a -cc1
> option "-fsanitize-recover".
>
> Largest concern regarding recovery was the performance impact.
>
> To explore this, I ran some experiments using SPEC CINT2006.  These
> aren't intended to represent much other than comparing the cost of
> recovery vs non-recovery.
>
> Flags used were "-fsanitize=integer
> -fno-sanitize=unsigned-integer-overflow,shift", with unsigned and
> shift checks disabled so the no-recover variant didn't abort on every
> program :).  Even so, perl, gcc, and h264 were removed from the runs
> since they aborted on the no-recovery configuration.
>
> Without recovery, overhead was 15% on average, and with recovery it
> was 35%. So it does seem the performance concerns were rather
> justified.  See the attached graphs for per-benchmark breakdown
> ("NoRecover" and "Recover" respectively).
>
> Exploring why, I made some tweaks and re-ran the same experiments,
> these are shown in the "NoRecoverColdBr" and "RecoverColdBr" groups.
> As can be seen, this reduces overhead of recovery to 22%, which as
> shown on the second page brings the average overhead relative to
> NoRecover down to 6% (second page).

Awesome - thanks for running the numbers. (ideally it'd be nice if
these sort of experiments could easily be run with LNT as that's what
we like to track perf with (not that it leaves nothing to be desired -
weighting the tests there appropriately & adding more relevant ones,
etc) - I know we've discussed LNT before, but were there particular
issues that meant LNT+the nightly test suite wasn't applicable? One of
the issues I had with it was noise. How was the noise on your SPEC
test runs?)

> The 'tweaks' made were:
> * Branch metadata on the checks indicating the checks are very
> unlikely to trigger.  Performance counter experiments showed much
> worse icache performance with recovery enabled, and branch metadata
> helps block placement avoid these issues.
>
> * Use of ColdCC on ubsan runtime calls.  This alleviates the other
> issue, severe register allocation damage incurred by inserting
> function calls everywhere.  The idea here is the same: optimize
> assuming these checks should never be invoked.  Unfortunately this
> requires:
> ** Fixing ColdCC support in LLVM (see
> http://llvm.org/bugs/show_bug.cgi?id=14481, which includes the patch
> used for these experiments)
> ** Platform-specific code in ubsan to save/restore registers around
> the handlers.  Currently I'm doing this with hand-written asm, it'd be
> much better if we could simply mark the functions themselves with
> ___attribute__(coldcc) or so.
>
> These fixes seem worth exploring: adding branch metadata seems like
> the right thing to do now regardless, and fixing ColdCC support (and
> providing a reasonable way to make cold functions) seems of interest
> to the community as well.

I suppose the other relevant data point: how much does that (ColdCC on
ubsan runtime calls) speed up the non-recovery case? (kind of unfair
to compare recovery with ColdCC to non-recovery without it)

> Anyway, thoughts/feedback welcome, especially on how to best proceed.
> Also, is the reduced overhead low enough to make recovery a reasonable
> default option?

I'm not sure about that value judgement but I expect (perhaps this is
already the assumption) that recovery will be necessary regardless of
whether it's the default.

> Thank you for your time, and have a great Friday! :)

Thanks again,
- David