[PATCH] D141429: [AArch64] Codegen for FEAT_LRCPC3

Fri Mar 17 06:33:53 PDT 2023

tmatheson added a subscriber: LukeGeeson.
tmatheson added inline comments.

================
Comment at: llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomic-load-rcpc3.ll:283
+; CHECK:    ldp x1, x0, [x0]
+; CHECK:    dmb ish
     %r = load atomic i128, ptr %ptr seq_cst, align 16
----------------
tmatheson wrote:
> efriedma wrote:
> > efriedma wrote:
> > > Sort of orthogonal to this change, but can someone at ARM verify if this is the sequence we actually want for sequentially consistent loads with lse2, as opposed to using caspal?  (I'm a bit concerned given the issues we ran into with narrower widths on Windows; see D141748.)
> > Any update here?
> Not yet but I haven't forgotten about it.
Sorry this has taken a while; I've had to work on other things. The first thing to note is that a CASP-based implementation is slower, so as long as both sequences are correct we should prefer `ldp+dmb`.

@LukeGeeson has been working on adding CASP to herd7 so that we can compare `ldp+dmb` with `casp`. His Telechat tool can compare the C++ semantics with the machine code semantics and identify new behaviours introduced by the machine code. We aren't seeing any differences so far, although admittedly the number of cases we've checked is low because each CASP case has to be written manually.

We are still looking for a way to run more extensive testing automatically, but if you have any concerns about specific scenarios we can check them.
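
For anyone who wants to suggest a scenario, here is a minimal sketch of the kind of case in question: a store-buffering litmus test written against the C++ memory model. It is only an illustration, not how herd7 or Telechat work internally. The function name `count_forbidden_sb_outcomes` and the default 64-bit value type are assumptions made for this example; for the i128 case under discussion one would presumably instantiate it with a 128-bit type such as `unsigned __int128` (a compiler extension that may also need libatomic). With `seq_cst` on every access, the outcome `r0 == 0 && r1 == 0` is forbidden by the C++ semantics, so a correct lowering must never produce it.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Store-buffering (SB) litmus test, run repeatedly on real threads.
// Returns how many iterations observed the outcome r0 == 0 && r1 == 0,
// which C++ forbids when every access is memory_order_seq_cst.
// T defaults to a 64-bit integer; that default is an assumption chosen
// to keep the sketch portable.
template <typename T = std::uint64_t>
int count_forbidden_sb_outcomes(int iterations) {
  assert(iterations >= 0);
  int forbidden = 0;
  for (int i = 0; i < iterations; ++i) {
    std::atomic<T> x{0}, y{0};
    T r0 = 0, r1 = 0;

    // Thread 0: write x, then read y.
    std::thread t0([&] {
      x.store(1, std::memory_order_seq_cst);
      r0 = y.load(std::memory_order_seq_cst);
    });
    // Thread 1: write y, then read x.
    std::thread t1([&] {
      y.store(1, std::memory_order_seq_cst);
      r1 = x.load(std::memory_order_seq_cst);
    });

    // join() orders the writes to r0 and r1 before the reads below.
    t0.join();
    t1.join();

    if (r0 == 0 && r1 == 0)
      ++forbidden;
  }
  return forbidden;
}
```

Calling `count_forbidden_sb_outcomes(100000)` should return 0 on a correct implementation. A stress test like this can only fail to observe a bad outcome on a given machine; it cannot show that the outcome is impossible, which is why we are checking against the model instead.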

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141429/new/

https://reviews.llvm.org/D141429