[cfe-dev] [ubsan] Add -fsanitize-warn-once, only emit runtime error once per check

Will Dietz willdtz at gmail.com
Fri Dec 28 21:37:52 PST 2012


Thanks for the discussion, and sorry for the delay in responding.  Holiday and all :).

Reply inline.

On Mon, Dec 17, 2012 at 1:06 AM, Alexey Samsonov <samsonov at google.com> wrote:
>> Previously IOC used a short linear scan table (~10-20 elements was
>> sweet spot IIRC) with fallback to a larger hashtable to manage
>> duplicates, but that was always a performance issue.  As a useful data
>> point, a quick spot-check of 403.gcc shows 96 static locations
>> triggered a total of 3,476,066 times dynamically when just processing
>> one of the inputs used for the 'ref' input set (166.i).  More on this
>> below.

I think the point I was trying to make above got lost in the other
details (and wasn't shown in the table):

Looking at 403.gcc, there were 13205 calls inserted, 13205 "possible"
locations for ubsan to fire.
Of these 13205 calls, 96 triggered dynamically a total of ~3.5 million times.

The point was that avoiding a linear scan over a 96-element list
~3.5 million times seems worth a 1% code size increase.

While I've only gathered these numbers for 403.gcc recently, previous
experience with SPEC CINT2000 suggests this is not uncommon at all (4
of the 8 benchmarks that had any overflows had more than 25 unique
locations that each triggered many, many times).
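To make the tradeoff concrete, here is a minimal sketch of the
per-call-site guard the patch effectively adds -- hand-expanded as a
macro with invented names, not the actual compiler-emitted code: one
static flag per check location, so the hot path costs a load and a
branch rather than a scan.

```c
#include <stdbool.h>
#include <stdio.h>

/* Count of reports emitted; stands in for the real handler's output. */
static int report_count = 0;

/* Hypothetical stand-in for a __ubsan_handle_* routine. */
static void handle_overflow(const char *loc) {
    ++report_count;
    fprintf(stderr, "%s: runtime error: signed integer overflow\n", loc);
}

/* One static "already reported" flag per expansion site -- this is the
 * per-location state that costs ~1% code size but makes duplicate
 * suppression O(1) no matter how many times the site fires. */
#define CHECK_ONCE(cond, loc)                   \
    do {                                        \
        static bool already_reported = false;   \
        if ((cond) && !already_reported) {      \
            already_reported = true;            \
            handle_overflow(loc);               \
        }                                       \
    } while (0)
```

Hitting a single CHECK_ONCE site millions of times prints one report,
while each distinct site still gets its own first report.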

> I didn't extensively test ubsan on real-world applications, so it's hard
> for me to estimate the number of error reports it prints. But I think we
> need to count not the number of calls to __ubsan_handle (i.e. the number
> of places in code where an error _might_ happen), but the number of
> actual unique reports printed by ubsan. If, say, it's at most 10-20,
> then storing the PCs of all the erroneous instructions and doing a
> linear scan before printing another report might be better than bloating
> the binary size by 1%.
>

It's a hard decision to make.  Agreed regarding 10-20 being the
threshold; that's what I found when tuning IOC's runtime previously.

Given the commonly very high invocation counts (thousands are common,
if not millions), I'd rather err on the side of a known, minor code
size increase with predictable performance than optimize for the case
where only a few checks fire a small number of times, scaling poorly
when that's not so.
Does this seem reasonable?
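For contrast, the runtime-side alternative described above might look
like the sketch below -- again hypothetical names, not the ubsan
runtime: record the PC of each reported check and scan that list
before printing another report.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: a small table of already-reported program
 * counters, scanned linearly before each report.  Cheap while the
 * table stays in the 10-20 entry range, but every duplicate firing
 * pays the scan. */
#define SEEN_CAP 32

static uintptr_t seen_pcs[SEEN_CAP];
static size_t seen_count = 0;

/* Returns true the first time a PC is seen (so: report it), false on
 * duplicates -- or once the table is full, erring toward silence. */
static bool should_report(uintptr_t pc) {
    for (size_t i = 0; i < seen_count; ++i)
        if (seen_pcs[i] == pc)
            return false;
    if (seen_count == SEEN_CAP)
        return false;
    seen_pcs[seen_count++] = pc;
    return true;
}
```

With 96 hot locations firing ~3.5 million times, as in 403.gcc, this
scan runs on every duplicate hit -- which is exactly the cost being
weighed against the 1% size increase.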

> That said, I think that the de-duplication functionality should
> definitely be implemented one way or another, and it should be "on" by
> default (and I can't imagine a reason why a user may decide to turn it
> off).
>

Sounds good to me, and agreed :).

~Will
