[llvm-commits] [PATCH] New (C++0x) atomics: start implementation of 'fence' instruction

Mon Jul 25 16:09:12 PDT 2011

On Sat, Jul 23, 2011 at 6:59 PM, Jeffrey Yasskin <jyasskin at google.com> wrote:
> On Fri, Jul 22, 2011 at 5:40 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>> Per subject; patch attached.  This is the replacement for the very
>> confusing llvm.memory.barrier intrinsic.  It is based on Jeffrey
>> Yasskin's branch, where he started work on this.  The patch is a
>> little on the large side, but I couldn't usefully commit anything much
>> smaller.
>
> Thanks for picking this up!

No problem. :)

>> There's one thing about the definition of fence in this patch I'm not
>> sure about: it doesn't provide any obvious way to order a load/store
>> marked !nontemporal.  We don't really want to require that by default
>> because it's a performance hit and not generally useful, but someone
>> might need it.
>
> Two possible ways to support that would be to add values to the
> SynchronizationScope enum or to attach !nontemporal to the fence
> instruction. The alternative is less about emitting extra instructions
> for ordinary fences so that they also cover nontemporal operations, and
> more about having the x86 code generator automatically insert an
> sfence after any sequence of nontemporal stores. The code generator,
> of course, might not do as good a job as the optimizers in deciding
> where to put sfences.

Hmm... I guess I'll leave this for now; llvm.memory.barrier never
clearly specified this either.
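
For illustration only (this is not part of the patch), the hand-written
pattern being discussed looks roughly like the C++ sketch below: a run of
nontemporal stores followed by one sfence. The names fill_streaming and
demo_fill are made up for this example, and the SSE2 intrinsics are only
used when __SSE2__ is defined; otherwise the sketch falls back to ordinary
stores.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__SSE2__)
#include <xmmintrin.h>  // _mm_sfence
#include <emmintrin.h>  // _mm_stream_si32
#endif

// Hypothetical helper: write `value` to dst[0..n) with nontemporal stores
// where available, then issue a single store fence for the whole run.
void fill_streaming(int* dst, std::size_t n, int value) {
    assert(dst != nullptr || n == 0);
#if defined(__SSE2__)
    for (std::size_t i = 0; i < n; ++i) {
        _mm_stream_si32(dst + i, value);  // nontemporal (movnti) store
    }
    // Order the preceding nontemporal stores before any later stores.
    _mm_sfence();
#else
    // Assumed fallback for non-x86 targets: plain stores, no fence needed here.
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = value;
    }
#endif
}

// Hypothetical driver: fill a small buffer and sum it back.
int demo_fill() {
    int buf[8] = {};
    fill_streaming(buf, 8, 7);
    int sum = 0;
    for (int v : buf) {
        sum += v;
    }
    return sum;
}
```

The point of the sketch is placement: one sfence after the whole sequence,
rather than one per store, which is the decision that either the optimizers
or the code generator would have to make.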

>> Index: include/llvm/Instruction.def
>> ===================================================================
>> --- include/llvm/Instruction.def      (revision 135822)
>> +++ include/llvm/Instruction.def      (working copy)
>> @@ -133,41 +133,42 @@
>>  HANDLE_MEMORY_INST(27, Load  , LoadInst  )  // Memory manipulation instrs
>>  HANDLE_MEMORY_INST(28, Store , StoreInst )
>>  HANDLE_MEMORY_INST(29, GetElementPtr, GetElementPtrInst)
>> -  LAST_MEMORY_INST(29)
>> +HANDLE_MEMORY_INST(30, Fence , FenceInst )
>
> I assume you're skipping 31 and 32 here to save space for atomicrmw and cmpxchg?
>
>> +  LAST_MEMORY_INST(32)

Yes.

And I just spotted an error a few lines down here; minor editing glitch.

>> Index: docs/LangRef.html
>> ===================================================================
>> --- docs/LangRef.html (revision 135822)
>> +++ docs/LangRef.html (working copy)
>> @@ -4544,6 +4545,63 @@
>>  </div>
>>
>>  <!-- _______________________________________________________________________ -->
>> +<div class="doc_subsubsection"> <a name="i_fence">'<tt>fence</tt>'
>> +Instruction</a> </div>
>> +
>> +<div class="doc_text">
>> +
>> +<h5>Syntax:</h5>
>> +<pre>
>> +  fence [singlethread] <ordering>                   <i>; yields {void}</i>
>> +</pre>
>> +
>> +<h5>Overview:</h5>
>> +<p>The '<tt>fence</tt>' instruction is used to introduce happens-before edges
>> +between operations.</p>
>> +
>> +<h5>Arguments:</h5> <p>'<code>fence</code>' instructions take an <a
>> +href="#ordering">ordering</a> argument which defines what
>> +<i>synchronizes-with</i> edges they add.  They can only be given
>> +<code>acquire</code>, <code>release</code>, <code>acq_rel</code>, and
>> +<code>seq_cst</code> orderings.</p>
>> +
>> +<h5>Semantics:</h5>
>> +<p>A fence <var>A</var> which has (at least) <code>release</code> ordering
>> +semantics <i>synchronizes with</i> a fence <var>B</var> with (at least)
>> +<code>acquire</code> ordering semantics if and only if there exist atomic
>> +operations <var>X</var> and <var>Y</var>, both operating on some atomic object
>> +<var>M</var>, such that <var>A</var> is sequenced before <var>X</var>,
>> +<var>X</var> modifies <var>M</var> (either directly or through some side effect
>> +of a sequence headed by <var>X</var>), <var>Y</var> is sequenced before
>> +<var>B</var>, and <var>Y</var> observes <var>M</var>. This provides a
>> +<i>happens-before</i> dependency between <var>A</var> and <var>B</var>. Rather
>> +than an explicit <code>fence</code>, one (but not both) of the atomic operations
>> +<var>X</var> or <var>Y</var> might provide a <code>release</code> or
>> +<code>acquire</code> (resp.) ordering constraint and still
>> +<i>synchronize-with</i> the explicit <code>fence</code> and establish the
>> +<i>happens-before</i> edge.</p>
>> +
>> +<p>A <code>fence</code> which has <code>seq_cst</code> ordering, in addition to
>> +having both <code>acquire</code> and <code>release</code> semantics specified
>> +above, participates in the global program order of other <code>seq_cst</code>
>> +operations and/or fences.</p>
>> +
>> +<p>The optional "<a href="#singlethread"><code>singlethread</code></a>" argument
>> +specifies that the fence only synchronizes with other fences in the same
>> +thread.</p>
>
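
As a concrete reading of the semantics paragraph quoted above, here is a
minimal C++ sketch using the corresponding C++0x library fences. It is an
illustration under assumptions, not something taken from the patch: payload,
flag, and publish_and_consume are made-up names, and the comments map each
statement onto A, B, X, Y, and M from the text.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;            // ordinary, non-atomic data
std::atomic<int> flag{0};   // the atomic object M

// Hypothetical demo: returns the payload value the consumer observed.
int publish_and_consume() {
    payload = 0;
    flag.store(0, std::memory_order_relaxed);
    int seen = -1;

    std::thread producer([] {
        payload = 42;                                         // plain store
        std::atomic_thread_fence(std::memory_order_release);  // fence A
        flag.store(1, std::memory_order_relaxed);             // X: modifies M
    });

    std::thread consumer([&seen] {
        while (flag.load(std::memory_order_relaxed) != 1) {   // Y: observes M
            std::this_thread::yield();
        }
        std::atomic_thread_fence(std::memory_order_acquire);  // fence B
        // A synchronizes with B, so the store to payload happens before this read.
        assert(payload == 42);
        seen = payload;
    });

    producer.join();
    consumer.join();
    return seen;
}
```

Both atomic accesses are relaxed here, so all of the ordering comes from the
two fences; replacing fence A with a release store, or fence B with an
acquire load, is the "one (but not both)" variation the text describes.
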
> Do you want to mention that 'singlethread' fences are intended
> for/useful in signal handlers?

Sure.
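
For the signal-handler case, a rough C++ sketch of the intended use is below.
It relies on std::atomic_signal_fence, the source-level counterpart of a
'singlethread' fence, and on std::raise running the handler in the calling
thread before it returns. The names payload, ready, observed, on_signal, and
run_signal_demo are invented for the example.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>

int payload = 0;                // ordinary data written by the main code
std::atomic<int> ready{0};      // flag shared with the handler
std::atomic<int> observed{-1};  // what the handler saw

// Hypothetical handler: reads payload only after seeing the flag.
extern "C" void on_signal(int) {
    if (ready.load(std::memory_order_relaxed) == 1) {
        // Orders the flag read before the payload read, within this thread only.
        std::atomic_signal_fence(std::memory_order_acquire);
        observed.store(payload, std::memory_order_relaxed);
    }
}

// Hypothetical driver: publish payload, deliver the signal, report the result.
int run_signal_demo() {
    std::signal(SIGINT, on_signal);
    payload = 42;
    // Keeps the compiler from moving the payload store past the flag store.
    std::atomic_signal_fence(std::memory_order_release);
    ready.store(1, std::memory_order_relaxed);
    std::raise(SIGINT);            // handler runs in this thread
    std::signal(SIGINT, SIG_DFL);  // restore the default disposition
    int result = observed.load(std::memory_order_relaxed);
    assert(result == 42);
    return result;
}
```

Because the handler and the interrupted code share a thread, only compiler
reordering has to be prevented, so no hardware barrier instruction should be
needed for this kind of fence.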

> Looks good to me. (But I would say that. ;)

;)

-Eli