[PATCH] D47882: [RISCV] Codegen for i8, i16, and i32 atomicrmw with RV32A

Fri Jun 8 10:36:30 PDT 2018

jyknight added a comment.

In https://reviews.llvm.org/D47882#1126302, @asb wrote:

> > can only contain other instructions from the base “I” subset, excluding loads, stores,
> >  backward jumps or taken backward branches,

Oh, nice! That changed since the last time I read it in 2016; it used to prohibit -all- taken branches, not just backwards ones.

================
Comment at: lib/Target/RISCV/RISCVExpandPseudoInsts.cpp:179
+
+  // andi offset, addr, 3
+  // andi alignedaddr, addr, -4
----------------
asb wrote:
> jyknight wrote:
> > I'd note that all of the code which is OUTSIDE the ll/sc loop doesn't really need to be implemented within this pseudo. I'm not sure if it's worthwhile extracting that or not, but it could theoretically have benefit, due to being able to potentially reuse e.g. the address math.
> > 
> > Have you considered making such a split?
> Yes, I've thought about this quite a bit. The ideal is to expand everything other than the LL+SC+branch at the IR level. I had been thinking about a new inlineasm-like 'black box' construct that would be expanded by target-dependent code only at the very last moment, when lowering to MCInst. But when I started sketching it out in more detail with the aim of incorporating it as an option in an RFC on atomics lowering I started thinking that there's not much benefit in this versus expanding pseudo-instructions in preEmitPass2. Targets can guarantee that runs after everything else that might cause problems (such as the machine outliner). I'll play with this some more.
I agree -- adding an intrinsic, say "llvm.riscv.atomicrmw.masked.llsc.loop", and then teaching AtomicExpandPass to emit that for the inner loop, seems like it should be fine without adding any new concepts to LLVM.

That's pretty easy for atomicrmw, but not so much for cmpxchg, if you want to avoid extra conditionals and jumps on the way out and also support the separate success/failure fences. But this review is for atomicrmw, so we can ignore that for now. (But: there's already work ongoing to allow for "calls" which have multiple successors, which seems like it'll help to fix that issue once it lands.)

I'd be fine with you landing this, and going back and rework it to expand the prefix/suffix in AtomicExpandPass in followup changes.

https://reviews.llvm.org/D47882