[llvm-dev] RFC: non-temporal fencing in LLVM IR

JF Bastien via llvm-dev llvm-dev at lists.llvm.org
Fri Jan 15 00:15:21 PST 2016


>
> I agree that it's fine to use a locked instruction as a seq_cst fence if
> MFENCE is not available.
>
> It's not clear to me this is true if the seq_cst fence is expected to
> fence non-temporal stores.  I think in practice, you'd be very unlikely to
> notice a difference, but I can't point to anything in the Intel docs which
> justifies a lock prefixed instruction as sufficient to fence any
> non-temporal access.
>

Correct, that's why changing the memory model is critical: a seq_cst fence
wouldn't provide any guarantee w.r.t. non-temporal accesses.


> What exactly would the non-temporal fences be?  It seems that on x86, the
> load and store case may differ.  In theory, there's also a before vs. after
> question.  In practice code using MOVNTA seems to assume that you only need
> an SFENCE afterwards.  I can't back that up with spec verbiage.  I don't
> know about MOVNTDQA.  What about ARM?
>
> I'll leave this to JF to answer.  I'm not knowledgeable enough about
> non-temporals to answer without substantial research first.
>

I'm proposing two builtins:
- __builtin_nontemporal_load_fence
- __builtin_nontemporal_store_fence

If I've got this right, on x86 they would respectively be a nop and an
sfence.

They otherwise act as code motion barriers for memory accesses unless the
accesses are proven not to alias. I think it may be possible to loosen the
rule so they act closer to acquire/release (allowing accesses to move into
the pair), but I'm not convinced that this works for every ISA, so I'd err
on the side of caution (since this can be loosened later).
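
To make the intended usage concrete, here is a rough C sketch of how a
producer might pair non-temporal stores with the store fence before
publishing a flag. The two proposed builtins don't exist yet, so
nt_store_fence and nt_load_fence below are hypothetical stand-ins, as are
publish and HAVE_X86_NT. On x86-64 the sketch assumes the lowering
described above (sfence for the store fence, no instruction for the load
fence, modeled as a compiler-only barrier); on other targets it falls back
to a seq_cst fence purely as a conservative guess.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(_M_X64)
#include <emmintrin.h> /* SSE2: _mm_stream_si32; also provides _mm_sfence */
#define HAVE_X86_NT 1
#else
#define HAVE_X86_NT 0
#endif

/* Hypothetical stand-in for the proposed __builtin_nontemporal_store_fence. */
static inline void nt_store_fence(void) {
#if HAVE_X86_NT
    _mm_sfence(); /* orders earlier non-temporal stores before later stores */
#else
    /* Assumption: a full fence is a safe over-approximation elsewhere. */
    atomic_thread_fence(memory_order_seq_cst);
#endif
}

/* Hypothetical stand-in for the proposed __builtin_nontemporal_load_fence. */
static inline void nt_load_fence(void) {
#if HAVE_X86_NT
    /* No instruction emitted, but the compiler may not move accesses across. */
    atomic_signal_fence(memory_order_seq_cst);
#else
    atomic_thread_fence(memory_order_seq_cst);
#endif
}

/* Fill buf with non-temporal stores, fence, then publish a ready flag. */
static void publish(int32_t *buf, size_t n, int32_t value, atomic_int *ready) {
    assert(buf != NULL && ready != NULL);
    for (size_t i = 0; i < n; ++i) {
#if HAVE_X86_NT
        _mm_stream_si32((int *)&buf[i], value); /* non-temporal store */
#else
        buf[i] = value; /* plain store where no NT hint is available */
#endif
    }
    nt_store_fence(); /* the release store below does not cover NT stores */
    atomic_store_explicit(ready, 1, memory_order_release);
}
```

A consumer would load the flag with acquire ordering and, if it reads the
buffer with non-temporal loads, call nt_load_fence() first. Both stand-ins
are written as full code motion barriers rather than acquire/release-style
ones, matching the cautious rule above.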