[cfe-dev] [ubsan] Add -fsanitize-warn-once, only emit runtime error once per check

Sun Dec 30 01:15:01 PST 2012

On Fri, Dec 28, 2012 at 9:37 PM, Will Dietz <willdtz at gmail.com> wrote:
> Thanks for discussion, sorry for the delay in responding.  Holiday and all :).
>
> Reply inline.
>
> On Mon, Dec 17, 2012 at 1:06 AM, Alexey Samsonov <samsonov at google.com> wrote:
>>> Previously IOC used a short linear scan table (~10-20 elements was
>>> sweet spot IIRC) with fallback to a larger hashtable to manage
>>> duplicates, but that was always a performance issue.  As a useful data
>>> point, a quick spot-check of 403.gcc shows 96 static locations
>>> triggered a total of 3,476,066 times dynamically when just processing
>>> one of the inputs used for the 'ref' input set (166.i).  More on this
>>> below.
>
> I think the point I was trying to make above got lost in the other
> details (and wasn't shown in the table):
>
> Looking at 403.gcc, there were 13205 calls inserted, 13205 "possible"
> locations for ubsan to fire.
> Of these 13205 calls, 96 triggered dynamically a total of ~3.5 million times.
>
> Avoiding linear scanning a list of 96 elements 3 million times seems
> worth a 1% code increase, was the point.
>
> While I've only gathered these numbers for 403.gcc recently, previous
> experience with SPEC CINT2000 suggests this is not uncommon at all (4
> of the 8 benchmarks that had any overflows had more than 25 unique
> locations trigger many many times).
>
>> I didn't extensively test ubsan on real-world applications so it's hard for
>> me
>> to estimate the number of error reports it prints. But I think we need to
>> count
>> not the number of calls to __ubsan_handle (i.e. number of places in code
>> where
>> an error _might_ happen), but the number of actual unique reports printed by
>> ubsan.
>> If, say, it's at most 10-20, then storing PCs of all the erroneous
>> instructions and doing
>> a linear scan before printing another report might be better than bloating
>> the binary size by 1%.
>>
>
> Hard decision to make.  Agreed regarding the 10-20 being threshold,
> that's what I found when tuning IOC's runtime previously.
>
> Given the commonly very high invocation count (thousands is common, if
> not millions) I'd rather err on known minor code size increase with
> predictable performance behavior instead of optimizing for cases where
> few checks are invoked a small number of times and scaling poorly when
> that's not the case.
> Does this seem reasonable?

Here's what I would suggest: make the static data objects passed to
the ubsan handlers non-constant, and modify the SourceLocation in
place to mark the diagnostic as disabled. This should be possible with
no binary size impact (other than a little more code in the ubsan
runtime).

>> That said, I think that the de-duplication functionality should definitely
>> be implemented one
>> way or another, and it should be "on" by default (and I can't imagine a
>> reason why a user
>> may decide to turn it off).
>>
>
> Sounds good to me, and agreed :).

SGTM too.