[LLVMdev] Intended semantics for ``fence seq_cst``

Wed Jul 31 18:10:19 PDT 2013

> FWIW, I don't think we should add that, because it's an attempt to
> define behavior that's undefined for other reasons (the data race on
> the volatile).

I had a discussion with Chandler and others, and something I
misunderstood was pointed out: it is not an explicit goal of LLVM to
support or continue supporting legacy code that did what it had to to
express functional concurrent code. It may happen to work now, but
unless enough LLVM users express interest this may break one day, and
__sync_synchronize may not order anything (it may just emit a fence
without forcing any ordering). It was pointed out that it's not clear
that __sync_synchronize has a clear spec, and that implementing it
properly in LLVM may not be tractable or worthwhile.

> If you (PNaCl?) explicitly want to define the behavior of legacy code
> that used 'volatile' for synchronization (which has always been
> undefined behavior outside of assembly, even before the memory model;
> it just happened to work in many cases), could you compile volatile
> accesses to 'atomic volatile monotonic' accesses? Then the normal
> memory model would apply, and I don't think the instructions emitted
> would change at all on the platforms I'm familiar with.

I actually go further for now and promote volatiles to seq_cst
atomics. This promotion happens after opt, but before most
architecture-specific optimizations. I could have used relaxed
ordering, but as a conservative first approach went with seq_cst. For
PNaCl it's correct because we only support 8/16/32/64 bit types,
require natural alignment (though we should provide better
diagnostics), packed volatiles can be split, we don't allow direct
device access, and we don't allow mapping the same physical address at
multiple virtual addresses. We could relax this at a later time, but
we'd really be better off if C++11's better-defined memory model were
used in the code.

This thread (and the side discussion) answered my concern, though with
a solution I didn't expect: trying to make user code that doesn't use
volatile or atomic, but does use __sync_synchronize, work as intended
isn't one of LLVM's goals.