[cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon

Thu Jul 25 12:18:42 PDT 2019

Update: MSVC++ should gain this ability in 2019 Update 4; so it would be nice to enable these as soon as possible. The user in https://developercommunity.visualstudio.com/content/problem/274938/vs2017-1567158p2-stdatomicload-causes-write-access.html has been long suffering 🙂

Billy3
________________________________
From: Billy O'Neal (VC LIBS) <bion at microsoft.com>
Sent: Thursday, March 28, 2019 03:13 PM
To: Reid Kleckner <rnk at google.com>; Friedman, Eli <efriedma at codeaurora.org>; Grang, Mandeep Singh <mgrang at codeaurora.org>
Cc: JF Bastien <jfbastien at apple.com>; cfe-dev <cfe-dev at lists.llvm.org>
Subject: Re: [cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon

>The optimizers may change the order of volatile operations relative to non-volatile operations. This is not Java’s “volatile” and has no cross-thread synchronization behavior.

Right, but if they want to link with MSVC++ generated code they need to at least respect _ReadWriteBarrier<https://docs.microsoft.com/en-us/cpp/intrinsics/readwritebarrier?view=vs-2017>.

I can certainly mess with our <atomic> to play more nicely with you folks if you want us calling different builtins instead. <atomic> is entirely library tech for us.

Billy3
________________________________
From: Reid Kleckner <rnk at google.com>
Sent: Thursday, March 28, 2019 03:08 PM
To: Billy O'Neal (VC LIBS); Friedman, Eli; Grang, Mandeep Singh
Cc: JF Bastien; cfe-dev
Subject: Re: [cfe-dev] FYI, Intel folks might be looking to add the __iso_volatile_Xxx family for MSVC STL <atomic> soon

On Thu, Mar 28, 2019 at 2:28 PM Billy O'Neal (VC LIBS) <bion at microsoft.com<mailto:bion at microsoft.com>> wrote:
>Are you sure we shouldn't be marking these as atomic instead of volatile? Volatile is usually not suitable for anything except

I am implementing <atomic> 🙂

My concern is that volatile in LLVM has nothing to do with atomicity, and I think what you really want is LLVM atomic loads and stores, with real ordering markers of monotonic, seq_cst, release, acquire, etc. This is all described here:
https://llvm.org/docs/LangRef.html#volatile-memory-accesses<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fllvm.org%2Fdocs%2FLangRef.html%23volatile-memory-accesses&data=02%7C01%7Cbion%40microsoft.com%7Cd5b30d145383489891e408d6b3ca007c%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636894077507135048&sdata=73J4avlu9DO5KGNF%2BONhLJVMzl1Z%2F0TMDnJE9fpUkmY%3D&reserved=0>
https://llvm.org/docs/LangRef.html#memory-model-for-concurrent-operations<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fllvm.org%2Fdocs%2FLangRef.html%23memory-model-for-concurrent-operations&data=02%7C01%7Cbion%40microsoft.com%7Cd5b30d145383489891e408d6b3ca007c%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636894077507145039&sdata=Vn1qu6pXuqVPMWJtclPUeFOYp5eK4aLRf2WWJxpzEfA%3D&reserved=0>
In particular: "The optimizers may change the order of volatile operations relative to non-volatile operations. This is not Java’s “volatile” and has no cross-thread synchronization behavior."

So, I'm concerned that there is something subtly incorrect about using __iso_volatile_* to implement atomics.

After looking at the xatomic.h source, I think I have a better understanding of what is going on. I added some folks who probably have a better understanding of LLVM IR atomics, and maybe they can help guide the discussion to a simpler implementation of Visual C++ <atomic> that plays well with LLVM. We had a similar discussion about over- or under-emitting dmb fences in this code review: https://reviews.llvm.org/D52807<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Freviews.llvm.org%2FD52807&data=02%7C01%7Cbion%40microsoft.com%7Cd5b30d145383489891e408d6b3ca007c%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636894077507145039&sdata=RroXJ2BBZBaTokulSxd2PTOUor6e56Gm0ii%2Bau9om%2Bw%3D&reserved=0>.

The Visual C++ headers currently add fences for ARM themselves. The code looks like this: https://reviews.llvm.org/P8137<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Freviews.llvm.org%2FP8137&data=02%7C01%7Cbion%40microsoft.com%7Cd5b30d145383489891e408d6b3ca007c%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636894077507155032&sdata=UORTGrVnJ%2F8IcsoG1Bezaov2sSB7WrXejzHGyCTH4Cc%3D&reserved=0> In small test cases, this generates the appropriate code, a single load followed by a fence.

-----

Completely aside from improving the implementation of <atomic> with clang, I will go ahead and make those __iso_volatile_* intrinsics available everywhere.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20190725/ca8a9a47/attachment.html>