[llvm] [NVPTX] Add Volta Load/Store Atomics (.relaxed, .acquire, .release) and Volatile (.mmio/.volatile) support (PR #98022)
via llvm-commits
llvm-commits at lists.llvm.org
Thu Jul 11 16:39:15 PDT 2024
================
@@ -700,6 +700,140 @@ static unsigned int getCodeAddrSpace(MemSDNode *N) {
return NVPTX::PTXLdStInstCode::GENERIC;
}
+static unsigned int getCodeMemorySemantic(MemSDNode *N,
+ const NVPTXSubtarget *Subtarget) {
+ AtomicOrdering Ordering = N->getSuccessOrdering();
+ auto CodeAddrSpace = getCodeAddrSpace(N);
+
+ bool HasMemoryOrdering = Subtarget->hasMemoryOrdering();
+ bool HasRelaxedMMIO = Subtarget->hasRelaxedMMIO();
+
+ // TODO: lowering for SequentiallyConsistent Operations: for now, we error.
+ // TODO: lowering for AcquireRelease Operations: for now, we error.
+ //
+
+ // clang-format off
+
+ // Lowering for non-SequentiallyConsistent Operations
+ //
+ // | Atomic | Volatile | Statespace | Lowering sm_60- | Lowering sm_70+ |
+ // |---------|----------|-------------------------------|-----------------|------------------------------------------------------|
+ // | No | No | All | plain | .weak |
+ // | No | Yes | Generic / Shared / Global [0] | .volatile | .volatile |
+ // | No | Yes | Local / Const / Param | plain [1] | .weak [1] |
+ // | Relaxed | No | Generic / Shared / Global [0] | .volatile | <atomic sem> |
+ // | Other | No | Generic / Shared / Global [0] | Error [2] | <atomic sem> |
+ // | Yes | No | Local / Const / Param | plain [1] | .weak [1] |
+ // | Relaxed | Yes | Generic / Shared [0] | .volatile | .volatile |
+ // | Relaxed | Yes | Global [0] | .volatile | .mmio.relaxed.sys (PTX 8.2+) or .volatile (PTX 8.1-) |
+ // | Relaxed | Yes | Local / Const / Param | plain [1] | .weak [1] |
+ // | Other | Yes | Generic / Shared / Global [0] | Error [2] | <atomic sem> [3] |
+
+ // clang-format on
+
+ // [0]: volatile and atomics are only supported on generic addressing to
+ // shared or global, or shared, or global.
+ // MMIO requires generic addressing to global or global, but
+ // (TODO) we only implement it for global.
+
+ // [1]: TODO: this implementation exhibits PTX Undefined Behavior; it
+ // fails to preserve the side-effects of atomics and volatile
+ // accesses in LLVM IR to local / const / param, causing
+ // well-formed LLVM-IR & CUDA C++ programs to be miscompiled
+ // in sm_70+.
----------------
gonzalobg wrote:
I have improved this comment. Please take a look and let me know if it is clear now :)
https://github.com/llvm/llvm-project/pull/98022
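
For readers who want to check the lowering table in the quoted hunk row by row, it can be restated as a small standalone function. This is only a sketch and not the code in the patch: `Ordering`, `Space`, `Target`, and `lowerLdSt` are hypothetical names, the strings returned are just the table's cell labels, and `<atomic sem>` is kept as a placeholder because the hunk does not show how it is expanded. The table has no row for "Other" ordering plus volatile on local / const / param; the sketch assumes it behaves like the other local / const / param rows.

```cpp
#include <string>

// Hypothetical stand-ins for illustration; none of these names are from the patch.
enum class Ordering { NotAtomic, Relaxed, Other };
enum class Space { Generic, Shared, Global, Local, Const, Param };

struct Target {
  bool HasMemoryOrdering; // the "sm_70+" column of the table
  bool HasRelaxedMMIO;    // the "PTX 8.2+" case of the table
};

// Returns the table cell for one (atomic, volatile, statespace) combination.
std::string lowerLdSt(Ordering O, bool Volatile, Space S, const Target &T) {
  // Plain accesses print nothing before sm_70 and ".weak" from sm_70 on.
  const std::string Default = T.HasMemoryOrdering ? ".weak" : "plain";

  // Local / const / param rows: atomic and volatile qualifiers are dropped.
  // (Assumption: this also covers the unlisted Other + volatile combination.)
  const bool SupportsQualifiers =
      S == Space::Generic || S == Space::Shared || S == Space::Global;
  if (!SupportsQualifiers)
    return Default;

  // Non-atomic rows: only volatility matters.
  if (O == Ordering::NotAtomic)
    return Volatile ? ".volatile" : Default;

  // Atomic rows before sm_70: relaxed becomes .volatile, anything else errors.
  if (!T.HasMemoryOrdering)
    return O == Ordering::Relaxed ? ".volatile" : "error";

  // Atomic rows on sm_70+.
  if (O == Ordering::Other || !Volatile)
    return "<atomic sem>";

  // Relaxed + volatile on sm_70+: MMIO only for global, and only if available.
  if (S == Space::Global && T.HasRelaxedMMIO)
    return ".mmio.relaxed.sys";
  return ".volatile";
}
```

For example, `lowerLdSt(Ordering::Relaxed, true, Space::Global, Target{true, true})` yields `.mmio.relaxed.sys`, while the same access on a target without relaxed MMIO support falls back to `.volatile`, matching the "Relaxed / Yes / Global" row.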