[llvm] [NVPTX] Add Volta Atomic SequentiallyConsistent Load and Store Operations (PR #98551)
Artem Belevich via llvm-commits
llvm-commits at lists.llvm.org
Wed Jul 31 11:15:19 PDT 2024
================
@@ -82,45 +153,139 @@ define void @generic_volatile(ptr %a, ptr %b, ptr %c, ptr %d) local_unnamed_addr
; CHECK: st.volatile.f64 [%rd{{[0-9]+}}], %fd{{[0-9]+}}
store volatile double %f.add, ptr %c
+ ; TODO: volatile, atomic, and volatile atomic memory operations on vector types.
+ ; Currently, LLVM:
+ ; - does not allow atomic operations on vectors.
+ ; - it allows volatile operations but not clear what that means.
+ ; Following both semantics make sense in general and PTX supports both:
+ ; - volatile/atomic/volatile atomic applies to the whole vector
+ ; - volatile/atomic/volatile atomic applies elementwise
+ ; Actions required:
+ ; - clarify LLVM semantics for volatile on vectors and align the NVPTX backend with those
+ ; Below tests show that the current implementation picks the semantics in an inconsistent way
+ ; * volatile <2 x i8> lowers to "elementwise volatile"
+ ; * <4 x i8> lowers to "full vector volatile"
+ ; - provide support for vector atomics, e.g., by extending LLVM IR or via intrinsics
+ ; - update tests in load-store-sm70.ll as well.
+
+ ; TODO: make this operation consistent with the one for <4 x i8>
+ ; This operation lowers to a "element wise volatile PTX operation".
+ ; CHECK: ld.volatile.v2.u8 {%rs{{[0-9]+}}, %rs{{[0-9]+}}}, [%rd{{[0-9]+}}]
+ %h.load = load volatile <2 x i8>, ptr %b
+ %h.add = add <2 x i8> %h.load, <i8 1, i8 1>
+ ; CHECK: st.volatile.v2.u8 [%rd{{[0-9]+}}], {%rs{{[0-9]+}}, %rs{{[0-9]+}}}
+ store volatile <2 x i8> %h.add, ptr %b
+
+ ; TODO: make this operation consistent with the one for <2 x i8>
+ ; This operation lowers to a "full vector volatile PTX operation".
+ ; CHECK: ld.volatile.u32 %r{{[0-9]+}}, [%rd{{[0-9]+}}]
+ %i.load = load volatile <4 x i8>, ptr %c
----------------
Artem-B wrote:
We probably want `<8 x i8>`, and, maybe, `<6 x i8>` too. i8 in PTX is a really odd type.
Speaking of odd types, we should also test `i1`.
https://github.com/llvm/llvm-project/pull/98551
More information about the llvm-commits
mailing list