[cfe-commits] [Review] [ubsan] Deduplication

Will Dietz wdietz2 at illinois.edu
Tue Jan 8 18:01:06 PST 2013


On Tue, Jan 8, 2013 at 3:02 PM, Richard Smith <richard at metafoo.co.uk> wrote:
> On Tue, Jan 8, 2013 at 12:21 AM, Will Dietz <wdietz2 at illinois.edu> wrote:
>> Thanks for taking a look.  Responses inline.
>>
>> On Mon, Jan 7, 2013 at 10:16 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>>> Thanks! This patch is highlighting a pressing need to refactor out the
>>> repetition in ubsan_handlers.cc...
>>>
>>
>> Heh, yep.  One day... :).
>>
>>> The atomic compare-exchange isn't sufficient to remove the data race,
>>> since the copy to the local Location object is non-atomic. How about
>>> something like...
>>>
>>> SourceLocation SourceLocation::acquire() {
>>>   return SourceLocation(Filename, Line,
>>> __sync_val_compare_and_swap(&Column, Column, ~u32(0)));
>
> Oops, this is actually wrong, too, since the load of Column is
> non-atomic. What we actually want here is
>
>   __atomic_exchange_n(&Column, ~u32(0), __ATOMIC_RELAXED)
>
> ... and this should be much quicker than __sync_* on some platforms,
> since it's not a synchronization operation. The downside is that this
> is pretty new (g++ 4.7 and Clang 3.1), so we should wrap it in #ifdef
> __ATOMIC_RELAXED and use the __sync_ builtin otherwise.
>

Sounds good to me, and thank you for your explanation!  Certainly
clears things up about the UB.

Accordingly, I've removed all racing accesses to the Column field
(including the misguided optimization) and
switched to using __sync_lock_test_and_set when __atomic_exchange_n is
unavailable.  This avoids the UB load and avoids the need for a full
barrier (or a compare) as compared to the CAS approach.  TAS is
guaranteed at least acquire semantics, which is more than we need but
sufficient (Clang however gives it seq_cst, not sure why).

Tricky business, thanks for helping me get this right! :)

>>> }
>>> bool SourceLocation::isDisabled() {
>>>   return Column == ~u32(0);
>>> }
>>>
>>> // ... then ...
>>> SourceLocation Loc = Data->Loc.acquire();
>>> if (Loc.isDisabled())
>>>   return;
>>>
>>
>> I like how this forces the copying of the SourceLocation in order to
>> use the deduplication feature, something I didn't like about the
>> approach I used.
>>
>> That said, I'm not sure I understand your concern about the local
>> SourceLocation copy.  Would you mind briefly explaining?  As far as I
>> can tell, the load used for the copy must happen before our own CAS,
>> and if the load in the local copy contains anything other than the
>> original pristine Column value then we'll see the full ~u32(0) value
>> in the CAS and discard the copy/loaded value without using it.  I
>> suppose the question is if such a raced load is UB (unused or not,
>> this is wrong) or simply an 'undef' value (like uninitialized,
>> harmless if not used).  My reading says it's 'undef', but it's quite
>> possible I'm missing something.  Thoughts?  Thanks!
>
> In theory, C++11 [intro.multithread]p21: "The execution of a program
> contains a data race if it contains two conflicting actions in
> different threads, at least one of which is not atomic, and neither
> happens before the other. Any such data race results in
> undefined behavior."
>
> In practice, since the compiler knows that a non-atomic load of Column
> dominates the atomic exchange of Column, it also knows there can be no
> concurrent modifications, so it can transform the atomic exchange into
> an atomic store, which defeats your attempt to emit only one
> diagnostic. I'm not sure whether there are compilers which actually do
> that (yet).
>

Again, thanks for the detailed explanation.  Makes good sense, and
much appreciated! :)

>> Additionally, one intentional feature of the approach used in my patch
>> is to avoid the call to __sync_bool_compare_and_swap() in the
>> potentially 'common' case of the diagnostic indeed being disabled.
>> This helps performance on programs with many concurrent duplicate
>> overflows.  Does this seem valid to you?
>
> No, for the same reason. But __atomic_exchange_n might be faster than
> a branch anyway.
>

Sounds good to me.  This is likely plenty fast.

Updated patches attached, thanks!

~Will

>>> On Mon, Jan 7, 2013 at 3:17 PM, Will Dietz <wdietz2 at illinois.edu> wrote:
>>>> Updated, slightly neater patches attached.
>>>>
>>>> Thanks!
>>>>
>>>> ~Will
>>>>
>>>> On Tue, Jan 1, 2013 at 9:49 PM, Will Dietz <wdietz2 at illinois.edu> wrote:
>>>>> Updated to apply cleanly to latest clang/compiler-rt.
>>>>>
>>>>> Thanks!
>>>>>
>>>>> ~Will
>>>>>
>>>>> On Sun, Dec 30, 2012 at 6:22 PM, Will Dietz <wdietz2 at illinois.edu> wrote:
>>>>>> (Moving to cfe-commits@, was previously:
>>>>>> http://lists.cs.uiuc.edu/pipermail/cfe-dev/2012-December/026519.html)
>>>>>>
>>>>>> Please see attached! :)
>>>>>>
>>>>>> Thank you,
>>>>>>
>>>>>> ~Will
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-ubsan-Add-deduplication-functionality-always-enabled.patch
Type: application/octet-stream
Size: 10251 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20130108/adf5f0ac/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-ubsan-Make-static-check-data-non-const-so-it-can-be-.patch
Type: application/octet-stream
Size: 3553 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20130108/adf5f0ac/attachment-0001.obj>


More information about the cfe-commits mailing list