[cfe-commits] [Review] [ubsan] Deduplication
Alexey Samsonov
samsonov at google.com
Sun Jan 13 23:27:59 PST 2013
On Wed, Jan 9, 2013 at 6:01 AM, Will Dietz <wdietz2 at illinois.edu> wrote:
> On Tue, Jan 8, 2013 at 3:02 PM, Richard Smith <richard at metafoo.co.uk>
> wrote:
> > On Tue, Jan 8, 2013 at 12:21 AM, Will Dietz <wdietz2 at illinois.edu>
> wrote:
> >> Thanks for taking a look. Responses inline.
> >>
> >> On Mon, Jan 7, 2013 at 10:16 PM, Richard Smith <richard at metafoo.co.uk>
> wrote:
> >>> Thanks! This patch is highlighting a pressing need to refactor out the
> >>> repetition in ubsan_handlers.cc...
> >>>
> >>
> >> Heh, yep. One day... :).
> >>
> >>> The atomic compare-exchange isn't sufficient to remove the data race,
> >>> since the copy to the local Location object is non-atomic. How about
> >>> something like...
> >>>
> >>> SourceLocation SourceLocation::acquire() {
> >>> return SourceLocation(Filename, Line,
> >>> __sync_val_compare_and_swap(&Column, Column, ~u32(0)));
> >
> > Oops, this is actually wrong, too, since the load of Column is
> > non-atomic. What we actually want here is
> >
> > __atomic_exchange_n(&Column, ~u32(0), __ATOMIC_RELAXED)
> >
> > ... and this should be much quicker than __sync_* on some platforms,
> > since it's not a synchronization operation. The downside is that this
> > is pretty new (g++ 4.7 and Clang 3.1), so we should wrap it in #ifdef
> > __ATOMIC_RELAXED and use the __sync_ builtin otherwise.
> >
>
> Sounds good to me, and thank you for your explanation! Certainly
> clears things up about the UB.
>
> Accordingly, I've removed all racing accesses to the Column field
> (including the misguided optimization) and
> switched to using __sync_lock_test_and_set when __atomic_exchange_n is
> unavailable. This avoids the UB load and avoids the need for a full
> barrier (or a compare) as compared to the CAS approach. TAS is
> guaranteed at least acquire semantics, which is more than we need but
> sufficient (Clang however gives it seq_cst, not sure why).
>
> Tricky business, thanks for helping me get this right! :)
>
Can you re-use atomic_exchange defined in sanitizer_common library? I took
a glance at
atomic_exchange in sanitizer_atomic_clang.h and it uses
__sync_lock_test_and_set.
Maybe, we should put optional use of __atomic_exchange_n there?
>
> >>> }
> >>> bool SourceLocation::isDisabled() {
> >>> return Column == ~u32(0);
> >>> }
> >>>
> >>> // ... then ...
> >>> SourceLocation Loc = Data->Loc.acquire();
> >>> if (Loc.isDisabled())
> >>> return;
> >>>
> >>
> >> I like how this forces the copying of the SourceLocation in order to
> >> use the deduplication feature, something I didn't like about the
> >> approach I used.
> >>
> >> That said, I'm not sure I understand your concern about the local
> >> SourceLocation copy. Would you mind briefly explaining? As far as I
> >> can tell, the load used for the copy must happen before our own CAS,
> >> and if the load in the local copy contains anything other than the
> >> original pristine Column value then we'll see the full ~u32(0) value
> >> in the CAS and discard the copy/loaded value without using it. I
> >> suppose the question is if such a raced load is UB (unused or not,
> >> this is wrong) or simply an 'undef' value (like uninitialized,
> >> harmless if not used). My reading says it's 'undef', but it's quite
> >> possible I'm missing something. Thoughts? Thanks!
> >
> > In theory, C++11 [intro.multithread]p21: "The execution of a program
> > contains a data race if it contains two conflicting actions in
> > different threads, at least one of which is not atomic, and neither
> > happens before the other. Any such data race results in
> > undefined behavior."
> >
> > In practice, since the compiler knows that a non-atomic load of Column
> > dominates the atomic exchange of Column, it also knows there can be no
> > concurrent modifications, so it can transform the atomic exchange into
> > an atomic store, which defeats your attempt to emit only one
> > diagnostic. I'm not sure whether there are compilers which actually do
> > that (yet).
> >
>
> Again, thanks for the detailed explanation. Makes good sense, and
> much appreciated! :)
>
> >> Additionally, one intentional feature of the approach used in my patch
> >> is to avoid the call to __sync_bool_compare_and_swap() in the
> >> potentially 'common' case of the diagnostic indeed being disabled.
> >> This helps performance on programs with many concurrent duplicate
> >> overflows. Does this seem valid to you?
> >
> > No, for the same reason. But __atomic_exchange_n might be faster than
> > a branch anyway.
> >
>
> Sounds good to me. This is likely plenty fast.
>
> Updated patches attached, thanks!
>
> ~Will
>
> >>> On Mon, Jan 7, 2013 at 3:17 PM, Will Dietz <wdietz2 at illinois.edu>
> wrote:
> >>>> Updated, slightly neater patches attached.
> >>>>
> >>>> Thanks!
> >>>>
> >>>> ~Will
> >>>>
> >>>> On Tue, Jan 1, 2013 at 9:49 PM, Will Dietz <wdietz2 at illinois.edu>
> wrote:
> >>>>> Updated to apply cleanly to latest clang/compiler-rt.
> >>>>>
> >>>>> Thanks!
> >>>>>
> >>>>> ~Will
> >>>>>
> >>>>> On Sun, Dec 30, 2012 at 6:22 PM, Will Dietz <wdietz2 at illinois.edu>
> wrote:
> >>>>>> (Moving to cfe-commits@, was previously:
> >>>>>>
> http://lists.cs.uiuc.edu/pipermail/cfe-dev/2012-December/026519.html)
> >>>>>>
> >>>>>> Please see attached! :)
> >>>>>>
> >>>>>> Thank you,
> >>>>>>
> >>>>>> ~Will
>
> _______________________________________________
> cfe-commits mailing list
> cfe-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>
>
--
Alexey Samsonov, MSK
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20130114/b31e1a8c/attachment.html>
More information about the cfe-commits
mailing list