[LLVMdev] Non-temporal moves in memset [Was: ASM output with JIT / codegen barriers]
James Y Knight
foom at fuhm.net
Tue Jan 5 09:53:46 PST 2010
On Jan 5, 2010, at 1:09 AM, Chandler Carruth wrote:
>>>>> Consider that 'memset' to zero is often codegened to a non-temporal
>>>>> store to memory. This exempts it from all ordering considerations
Hm...off topic from my original email since I think this is only
relevant for multithreaded code...
But from what I can tell, an implementation of memset that does not
contain an sfence after using movnti is considered broken. Callers of
memset would not (and should not need to) know that they must use an
actual memory barrier (sfence) after the memset call to get the usual
x86 store-store guarantee.
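
To make that concrete, here is a minimal sketch of what I mean, assuming an
x86 target with SSE2. The function name zero_words_nt is hypothetical and
this is not glibc's actual memset; it only shows the non-temporal stores
(movnti, via the _mm_stream_si32 intrinsic) followed by an sfence before
returning, so the caller keeps the usual store-store ordering:

```c
#include <emmintrin.h>  /* _mm_stream_si32 (movnti), SSE2 */
#include <xmmintrin.h>  /* _mm_sfence */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration only, not glibc's memset.
   Zeroes `count` 32-bit words using non-temporal stores. */
static void zero_words_nt(int32_t *dst, size_t count)
{
    for (size_t i = 0; i < count; i++)
        _mm_stream_si32((int *)&dst[i], 0);  /* compiles to movnti */

    /* Non-temporal stores are weakly ordered. The sfence makes them
       globally visible before any store the caller issues afterwards,
       so the caller does not need a barrier of its own. */
    _mm_sfence();
}
```

With the fence inside the routine, a caller can zero a buffer and then
publish it with an ordinary store (say, setting a "ready" flag) without
knowing that movnti was used underneath.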
Thread describing that bug in glibc memset implementation:
Redhat errata including that fix in a stable update:
Then there's a recent discussion on the topic of who is responsible
for calling sfence on the gcc mailing list:
Unfortunately, that thread didn't seem to have any firm conclusion,
but ISTM that the current default assumption is (b): anything that
uses movnti is assumed to surround such uses with memory fences so
that other code doesn't need to.