[PATCH] D85044: Add __atomic_is_lock_free to compiler-rt
James Y Knight via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 20 12:47:48 PDT 2020
jyknight added a comment.
There is going to be a bunch more complexity required here, I'm afraid.
Not only do we need to detect the CPU capabilities in order to implement __atomic_is_lock_free, but we also need to ensure that the implementations of __atomic_load_c / etc can actually use those atomic instructions as appropriate.
This means we need to compile the function with different flags, e.g. -mcx16 for x86-64, in order for the compiler to be able to emit cmpxchg16b, or -march=i586 on 32bit x86. But -- we can't compile the entire file that way. Only the functions called after the runtime detection has assured that we're running on the appropriate hardware can be built with those flags.
GCC handles this by compiling the source files multiple times, with different flags, adding some additional #defines which change the function names for each of those compiles. Then, it arranges to dispatch to the correct function depending on the runtime CPU detection. Ideally, we'd also be able to use the "target" attribute to do this on an individual function within the file, but it looks like that may not actually work properly.
================
Comment at: compiler-rt/lib/builtins/atomic.c:34
+
+#if defined(COMPILER_RT_HAS_AUXV) && defined(COMPILER_RT_HAS_AUXV)
+#include <asm/hwcap.h>
----------------
Same check used twice? Also, I think we don't need a configure check for this, `#if __has_include(<asm/hwcap.h>)` would be ok.
================
Comment at: compiler-rt/lib/builtins/atomic.c:154-156
+ if (__have_atomic_cas == -1) {
+ __have_atomic_cas = check_x86_atomic_cas() != 0 ? 1 : 0;
+ }
----------------
This isn't threadsafe as written. It would be okay to use a relaxed atomic to load and store __have_atomic_cas (Relaxed is okay, because it's okay to potentially call check_x86_atomic_cas multiple times if this is reached simultaneously on multiple threads, and it will result in the same answer each time).
================
Comment at: compiler-rt/lib/builtins/atomic.c:162-166
+ case 8:
+#ifdef __x86_64__
+ case 16:
+#endif
+ return __have_atomic_cas;
----------------
It's confusing to have "have_atomic_cas" sometimes mean "have CMPXCHG8b" and sometimes "have CMPXCHG16b".
And that confusion has led to a bug here -- on x86-64, cmpxchg8b is always available, but this code returns false for 8-byte atomics, unless CMPXCHG16b is also available.
================
Comment at: compiler-rt/lib/builtins/atomic.c:177
+ if (__have_atomic_cap == -1) {
+ __have_atomic_cap = (getauxval(AT_HWCAP) & HWCAP_ATOMICS) != 0 ? 1 : 0;
+ }
----------------
AArch64 always supports atomics, the HWCAP_ATOMICS is needed only for the newer optimized instructions instead of the older LL/SC loop. (We may want to use those instructions for performance in the future, but it's not necessary for correctness, so we can leave out that complexity for now.
================
Comment at: compiler-rt/lib/builtins/atomic.c:197
+ case 8:
+ return false; // FIXME: not sure the check similar to aarch64 works
+ }
----------------
Let's leave aside ARM32 support for the first draft and just concentrate on x86 to get the overall framework in place.
================
Comment at: compiler-rt/lib/builtins/atomic.c:202-203
+#else
+// Returns true if this platform support atomic operation for the given size in
+// bytes.
+static inline bool have_atomic_cap(int) { return false; }
----------------
At the end, it'll be better to fail to compile on platforms we haven't implemented, than to do the wrong thing.
But in order to allow implementing platforms piecemeal at the beginning, I'm ok to leave this fallback for now.
================
Comment at: compiler-rt/lib/builtins/atomic.c:229
+ CHECK_LOCK_FREE_POW2(2);
+ // [[clang::fallthrough]];
+ case 4:
----------------
In C code (compiled with Clang), this can be spelled `__attribute__((fallthrough))`.
================
Comment at: compiler-rt/test/builtins/Unit/atomic_lock_free_test.cc:3
+// REQUIRES: librt_has_atomic
+//===-- atomic_lock_free_test.c - Test is_lock_free function ------===//
+//
----------------
Given that the contract for __atomic_is_lock_free doesn't require actual pointers, I'd leave out the construction of real objects in this test, and just pass in constructed values e.g. `(void*)~7`.
Also, all these assertions will need to be platform-specific.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D85044/new/
https://reviews.llvm.org/D85044
More information about the llvm-commits
mailing list