[libcxx-commits] [libcxx] [libcxx] Adjust inline assembly constraints for the AMDGPU target (PR #101747)
Joseph Huber via libcxx-commits
libcxx-commits at lists.llvm.org
Mon Aug 5 10:26:48 PDT 2024
================
@@ -291,17 +291,27 @@ struct is_same<T, T> { enum {value = 1}; };
// when optimizations are enabled.
template <class Tp>
inline Tp const& DoNotOptimize(Tp const& value) {
- asm volatile("" : : "r,m"(value) : "memory");
- return value;
+ // The `m` constraint is invalid in the AMDGPU backend.
+# if defined(__AMDGPU__) || defined(__NVPTX__)
+ asm volatile("" : : "r"(value) : "memory");
+# else
+ asm volatile("" : : "r,m"(value) : "memory");
+# endif
+ return value;
}
template <class Tp>
inline Tp& DoNotOptimize(Tp& value) {
-#if defined(__clang__)
+ // The `m` and `r` output constraint is invalid in the AMDGPU backend as well
+ // as i8 / i1 arguments, so we just capture the pointer instead.
+# if defined(__AMDGPU__)
----------------
jhuber6 wrote:
This is how `libcxx` does it apparently.
https://github.com/llvm/llvm-project/pull/101747
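
For readers who want to try the constraint selection outside the test suite, here is a minimal host-compilable sketch. It is not the code from the PR: the names DoNotOptimizeSketch, DoNotOptimizePtrSketch and SumWithBarrier are hypothetical, the non-GNU fallback branch is an assumption, and the pointer-capturing variant is only one plausible reading of "capture the pointer instead". The AMDGPU branch of the real patch is cut off in the hunk above and is not reproduced here.

```cpp
#include <cassert>
#include <memory>

// Hypothetical name. Mirrors the const-reference overload quoted in the hunk:
// GPU targets get the register-only "r" constraint, other GNU-compatible
// compilers keep the original "r,m" alternatives.
template <class Tp>
inline Tp const& DoNotOptimizeSketch(Tp const& value) {
#if defined(__AMDGPU__) || defined(__NVPTX__)
  asm volatile("" : : "r"(value) : "memory");
#elif defined(__GNUC__) || defined(__clang__)
  asm volatile("" : : "r,m"(value) : "memory");
#else
  // Assumption: no GNU-style inline asm available, so this branch is NOT an
  // optimization barrier. It only keeps the sketch compiling.
  (void)value;
#endif
  return value;
}

// Hypothetical variant that passes the object's address as an asm input
// rather than the object itself. Together with the "memory" clobber, the
// compiler has to assume the pointee may be read or written.
template <class Tp>
inline Tp& DoNotOptimizePtrSketch(Tp& value) {
  Tp* ptr = std::addressof(value);
#if defined(__GNUC__) || defined(__clang__)
  asm volatile("" : : "r"(ptr) : "memory");
#else
  (void)ptr;  // Assumption: no barrier without GNU-style inline asm.
#endif
  return *ptr;
}

// Small usage helper: the barrier inside the loop keeps `total` observable
// on every iteration instead of letting the loop fold to a constant.
inline int SumWithBarrier(int n) {
  int total = 0;
  for (int i = 0; i < n; ++i) {
    total += i;
    DoNotOptimizePtrSketch(total);
  }
  return DoNotOptimizeSketch(total);
}

// Both helpers return a reference to the same object they were given.
inline void CheckIdentity() {
  int x = 7;
  assert(&DoNotOptimizeSketch(x) == &x);
  assert(&DoNotOptimizePtrSketch(x) == &x);
}
```

Passing the address as an input sidesteps the question of which output constraints a given backend accepts, at the cost of forcing the object to have an address. Whether that matches what the patch ends up doing on AMDGPU is not something this sketch claims.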