[PATCH] D47882: [RISCV] Codegen for i8, i16, and i32 atomicrmw with RV32A
James Y Knight via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jun 7 14:40:32 PDT 2018
jyknight added a comment.
The problem with min and max is that you must not have a branch within the ll/sc loop. But, they can be done branchless with not too many instructions, so ISTM we should just implement them that way.
I think the following will work -- it computes `b ^ ((a^b) & -(a < b))`. The `& -(a < b)` either returns the LHS (if comparison is true) or returns 0 (if comparison is false), so you either get b^a^b, or b.
min(a0, a1) -> a0:
slt a2, a0, a1
neg a2, a2
xor a0, a0, a1
and a0, a0, a2
xor a0, a1, a0
And I think just use sltu instead of slt for umin/umax, and reverse the args of the slt for max/umax.
================
Comment at: lib/Target/RISCV/RISCVExpandPseudoInsts.cpp:179
+
+ // andi offset, addr, 3
+ // andi alignedaddr, addr, -4
----------------
I'd note that all of the code which is OUTSIDE the ll/sc loop doesn't really need to be implemented within this pseudo. I'm not sure if it's worthwhile extracting that or not, but it could theoretically have benefit, due to being able to potentially reuse e.g. the address math.
Have you considered making such a split?
================
Comment at: lib/Target/RISCV/RISCVTargetMachine.cpp:104
+
+void RISCVPassConfig::addPreSched2() { addPass(createRISCVExpandPseudoPass()); }
----------------
Is it okay to do this pre-scheduling? Are we guaranteed that the scheduler won't do something invalid here? (I'm not sure the set of transforms it's capable of at the moment.)
https://reviews.llvm.org/D47882
More information about the llvm-commits
mailing list