[PATCH] D136525: [M68k] Add codegen pattern for atomic load / store

Tue Nov 1 21:58:59 PDT 2022

myhsu added a comment.

Let me try to answer most of the questions at once. But first, here is the workflow I would do:

1. For 68020 and later, `setMaxAtomicSizeInBitsSupported(32)` and `setMaxAtomicSizeInBitsSupported(0)` otherwise. AtomicExpandPass then replaces everything not in the size ranges with `__atomic_*`.
2. AtomicLoad, AtomicStore and AtomicCmpXchg in target >= 68020 will be lowered to native instructions.
3. Mark every other `ISD::ATOMIC_*` as LibCall, this effectively lowers them into library calls to `__sync_*`.

Now here are the questions:

> Marking it as LibCall can stop the type legalizer from expanding the AtomicStore with 64bits operand to AtomicSwap,

I'm not sure how did you get this, given the fact that 64-bit AtomicStore should already been replaced by __atomic_store by AtomicExpandPass. So you don't need to mark ISD::ATOMIC_STORE as LibCall.

> transformed to __sync_lock_test_and_set --- a function that I can't find in libatomic.a.

Correct, because `__sync_*` come from libgcc <https://github.com/gcc-mirror/gcc/tree/master/libgcc>, the primary compiler runtime library for GNU toolchain. In the case of m68k <https://github.com/gcc-mirror/gcc/blob/master/libgcc/config/m68k/linux-atomic.c>, these `__sync_*` functions are "basically" lock-free -- they simply make a syscall <https://github.com/torvalds/linux/blob/master/arch/m68k/kernel/syscalls/syscall.tbl#L345> that provides the cmpxchg feature (though that also means it's less portable than, said libatomic which is a pure C implementation). To my best understanding, the reason legalizer chooses to lower atomic operations into `__sync_*` rather than `__atomic_*` is because the former are more lower level and more importantly, libgcc is always available during the linking (when using GNU driver of course, which is the case for m68k-linux-gnu).

> So I cannot lower atomic_load / atomic_store to native instruction whilst lower atomic_compare_and_swap to library call --- all of them should either be lowered to native instruction or library calls ?
> Why the atomic width is not a property of a specific instruction ?

To my best understanding, LLVM makes the decision to segregate native atomic supports by size,  rather than a specific instruction, simply because //most// architectures do that. You either have full or no atomic support for a certain size. Meaning, m68k, as an anomaly, needs to do some extra works on this matter, which is not difficult because I believe the workflow I put at the beginning of this comment can lower atomic load / store to native instructions (on supported sub architectures) while lowering other atomic operations to libcalls for a given size.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136525/new/

https://reviews.llvm.org/D136525