[LLVMdev] Non-temporal moves in memset [Was: ASM output with JIT / codegen barriers]

Chandler Carruth chandlerc at google.com
Tue Jan 5 10:10:18 PST 2010

On Tue, Jan 5, 2010 at 9:53 AM, James Y Knight <foom at fuhm.net> wrote:
> On Jan 5, 2010, at 1:09 AM, Chandler Carruth wrote:
>>>>>> Consider that 'memset' to zero is often codegened to a non-temporal
>>>>>> store to memory. This exempts it from all ordering considerations
> Hm...off topic from my original email since I think this is only relevant
> for multithreaded code...
> But from what I can tell, an implementation of memset that does not contain
> an sfence after using movnti is considered broken. Callers of memset would
> not (and should not need to) know that they must use an actual memory
> barrier (sfence) after the memset call to get the usual x86 store-store
> guarantee.
> Thread describing that bug in glibc memset implementation:
> http://sourceware.org/ml/libc-alpha/2007-11/msg00017.html
> Redhat errata including that fix in a stable update:
> http://rhn.redhat.com/errata/RHBA-2008-0083.html
> Then there's a recent discussion on the topic of who is responsible for
> calling sfence on the gcc mailing list:
> http://www.mail-archive.com/gcc@gcc.gnu.org/msg45939.html
> Unfortunately, that thread didn't seem to have any firm conclusion, but ISTM
> that the current default assumption is (b): anything that uses movnti is
> assumed to surround such uses with memory fences so that other code doesn't
> need to.

I didn't mean to imply that the fence was missing after the
non-temporal store (yikes!!), rather that it was an example of a not
uncommon situation where fencing may be required even in
single-threaded x86 code. That said, Jeffrey raised good points: it
isn't at all clear to what extent non-temporal stores deviate from
the ordering constraints of typical x86 code. From the threads you
cite, there is also dispute about the best way to manage those
deviations. At least w.r.t. memset, I would agree with you and
assume that it provides the fencing needed.
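
To make the memset case concrete, here is a minimal sketch in C of a
zeroing routine that uses non-temporal stores and then issues the
sfence itself, so callers keep the usual x86 store-store ordering.
The name zero_nontemporal is hypothetical, and this is not the glibc
or LLVM implementation; it only assumes the SSE2 intrinsics
_mm_stream_si32 (movnti) and _mm_sfence, and falls back to plain
memset when SSE2 is not available.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <xmmintrin.h>  /* _mm_sfence */
#include <emmintrin.h>  /* _mm_stream_si32, which compiles to movnti */
#endif

/* Hypothetical helper for illustration; not any library's real memset. */
static void zero_nontemporal(void *dst, size_t len)
{
    unsigned char *p = dst;

#if defined(__SSE2__)
    /* Ordinary byte stores until p is 4-byte aligned. */
    while (len > 0 && ((uintptr_t)p & 3u) != 0) {
        *p++ = 0;
        len--;
    }

    /* Non-temporal 32-bit stores. These are weakly ordered with
     * respect to other stores, unlike ordinary x86 stores. */
    while (len >= 4) {
        _mm_stream_si32((int *)p, 0);
        p += 4;
        len -= 4;
    }

    /* Fence inside the routine, so that stores the caller issues
     * after we return cannot become visible before the zeroing. */
    _mm_sfence();
#endif

    /* Remaining tail bytes (or the whole buffer without SSE2). */
    memset(p, 0, len);
}
```

With the fence inside the routine, a caller can zero a buffer and
then store to a "ready" flag with an ordinary store, without knowing
whether movnti was used underneath. Dropping the _mm_sfence call
gives the kind of bug described in the glibc thread cited above.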
