[PATCH] D141406: [AArch64] Codegen for FEAT_LSE128

Wed Jan 18 09:44:29 PST 2023

lenary added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:21997
+  assert(ISDOpcode != ISD::ATOMIC_LOAD_CLR &&
+         "ATOMIC_LOAD_AND should be lowered to LDCLRP directly");
+  assert(ISDOpcode != ISD::ATOMIC_LOAD_ADD && "There is no 128 bit LDADD");
----------------
tmatheson wrote:
> lenary wrote:
> > I am also confused about this assert message.
> See `ReplaceATOMIC_LOAD_128Results` which has more explanation. It will convert `ATOMIC_LOAD_AND` to 2x `XOR` + `LDCLRP` (determined by this function, `getAtomicLoad128Opcode`). This is in contrast to other sizes, which convert `ATOMIC_LOAD_AND` to `ATOMIC_LOAD_CLR` earlier, because i128 is not legal. The assert is checking that no `ATOMIC_LOAD_CLR` made it through, because for 128 we should see only `ATOMIC_LOAD_AND` here.
Ok, I see, and yet I worry I'd forget this again looking directly at the assert message, which never mentions `_CLR` (the link between `_AND` and `_CLR` seems implicit to me)

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:22176
+    assert(N->getValueType(0) != MVT::i128 &&
+           "128-bit ATOMIC_LOAD_AND should be lowered directly to LDCLRP");
+    break;
----------------
tmatheson wrote:
> lenary wrote:
> > The message here doesn't correspond to the case?
> This assert is also checking that no 128-bit `ATOMIC_LOAD_CLR` appeared, which would indicate that `ATOMIC_LOAD_AND` was not lowered directly to a MachineInstr.
I think it might be clearer if the assert message said "ATOMIC_LOAD_CLR should be legal" maybe?

================
Comment at: llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-store-lse128.ll:119
 define dso_local void @store_atomic_i128_aligned_unordered(i128 %value, ptr %ptr) {
-; -O0-LABEL: store_atomic_i128_aligned_unordered:
-; -O0:    casp x0, x1, x2, x3, [x8]
----------------
tmatheson wrote:
> lenary wrote:
> > I thought we tested O0 because of globalisel, but here both O0 and O1 are updated, despite not having changed globalisel. Why?
> GlobalISel does not support quite a lot of the atomics, so it is falling back to SelectionDAG. This is also the reason so many of the tests have identical output for O0 and O1. The idea with using optimisation levels rather than forcing -global-isel is that what we care about at the end of the day is the actual output, rather than how complete GlobalISel is.
Right, ok. This does seem like something we could improve, but it's fine to happen later.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141406/new/

https://reviews.llvm.org/D141406