[PATCH] D97604: [SystemZ] Reimplement the 1-byte compare-and-swap logic
Jonas Paulsson via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 1 17:17:10 PST 2021
jonpa added a comment.
> Not necessarily. Our ABI does require that "char" and "short" parameters and return values are extended, but that can be either a zero- or a sign-extension depending on the type. Also, this is implemented via the zeroext/signext type attributes on the parameters in code generated by clang; with LLVM IR generated elsewhere (like in those test cases!), we may get a plain i8 or i16 that is not extended. And of course if the i8 or i16 in question is not a function parameter but the result of some intermediate computation, it is not guaranteed to be extended anyway.
>
> So in short, yes, the CmpVal may have to be extended. However, it is probably worthwhile to detect those (common) cases where it already *is* extended to avoid redundant effort. This is hard(er) to do at the MI level, so I think the extension is best done in SystemZTargetLowering::lowerATOMIC_CMP_SWAP at the Select\ionDAG level before emitting the ATOMIC_CMP_SWAPW MI instruction.
I added an AND to zero-out the high bits to perform the zero-extension from the narrow type. It seemed that if the source was a constant (e.g. '1'), the DAG.getNode(ISD::AND, ...) call folded the AND on the fly. And if the source was a parameter with the 'zeroext' attribute (or rather any result with an AssertZext node) ,the AND goes away during DAGCombine2. So for what I could see, there is not much extra work to do here.
> As an aside, it seems the code does now require one extra register. It might be worthwhile to avoid this by rearranging the statements a bit:
>
> As an added bonus, this would make Dest already zero-extended, so it might be possible to avoid any extra zero-extension on the result (by adding an AssertZExt ISD node after the ATOMIC_SWAP_CMPW).
Ah, yes, that makes sense now.
At this point I am wondering about how to treat the signedness of the input/output: if the template type of std::atomic is signed, then the result of e.g. a signed char should be sign-extended to i32, right? So both CmpVal and Dest should either be sign- or zero-extended. The patch currently always zero-extends...
I see that with signed char:
Optimized lowered selection DAG: %bb.0 '_Z3funa:entry'
SelectionDAG has 15 nodes:
t0: ch = EntryToken
t8: ch = lifetime.start<0 to 1> t0, TargetFrameIndex:i64<0>
t12: ch = store<(store 1 into %ir.0, align 2)> t8, Constant:i8<1>, FrameIndex:i64<0>, undef:i64
t14: i8,i1,ch = AtomicCmpSwapWithSuccess<(volatile load store acquire monotonic 1 on %ir.0)> t12, FrameIndex:i64<0>, Constant:i8<1>, Constant:i8<0>
t15: i8,ch = AtomicLoad<(volatile dereferenceable load seq_cst 1 from %ir.0, align 2)> t14:2, FrameIndex:i64<0>
t16: ch = lifetime.end<0 to 1> t15:1, TargetFrameIndex:i64<0>
t18: i64 = sign_extend t15
t20: ch,glue = CopyToReg t16, Register:i64 $r2d, t18
t21: ch = SystemZISD::RET_FLAG t20, Register:i64 $r2d, t20:1
Type-legalized selection DAG: %bb.0 '_Z3funa:entry'
SelectionDAG has 17 nodes:
t16: ch = lifetime.end<0 to 1> t27:1, TargetFrameIndex:i64<0>
t28: i64 = any_extend t27
t30: i64 = sign_extend_inreg t28, ValueType:ch:i8
t20: ch,glue = CopyToReg t16, Register:i64 $r2d, t30
t0: ch = EntryToken
t8: ch = lifetime.start<0 to 1> t0, TargetFrameIndex:i64<0>
t24: ch = store<(store 1 into %ir.0, align 2), trunc to i8> t8, Constant:i32<1>, FrameIndex:i64<0>, undef:i64
t26: i32,i32,ch = AtomicCmpSwapWithSuccess<(volatile load store acquire monotonic 1 on %ir.0)> t24, FrameIndex:i64<0>, Constant:i32<1>, Constant:i32<0>
t27: i32,ch = AtomicLoad<(volatile dereferenceable load seq_cst 1 from %ir.0, align 2)> t26:2, FrameIndex:i64<0>
t21: ch = SystemZISD::RET_FLAG t20, Register:i64 $r2d, t20:1
For a signed type (above), there is a sign_extend of the result (and for the unsigned case a zero_extend). But after the type-legalizer, the AtomicCmpSwapWithSuccess has been replaced with a new one of i32 type, but the extension node is gone. The memory operand has no information of what extension is supposed to happen, so I wonder how the result (original value) is supposed to be properly extended..?
This happens in DAGTypeLegalizer::PromoteIntRes_AtomicCmpSwap() which uses TLI.getExtendForAtomicCmpSwapArg(). This also seem to ignore the original type (takes no parameters), so I am not sure if I am missing something?
It seems nice if we could extend CmpVal and Dest with the same signedness and just change the opcodes for the extensions with this new approach...
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D97604/new/
https://reviews.llvm.org/D97604
More information about the llvm-commits
mailing list