[PATCH] D135948: [LoongArch] Add codegen support for cmpxchg on LA64
Gong LingQin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 18 05:01:53 PDT 2022
gonglingqin added inline comments.
================
Comment at: llvm/test/CodeGen/LoongArch/ir-instruction/atomic-cmpxchg.ll:20
+; LA64-NEXT: .LBB0_1: # =>This Inner Loop Header: Depth=1
+; LA64-NEXT: dbar 0
+; LA64-NEXT: ll.w $a4, $a3, 0
----------------
hev wrote:
> xry111 wrote:
> > xry111 wrote:
> > > hev wrote:
> > > > Not needed.
> > > >
> > > > ll: full-membar + load-exclusive
> > > Jiaxun told me (via linux-mips):
> > >
> > > > I had checked with Loongson guys and they confirmed that the workaround still needs to be applied to latest 3A4000
> > > > processors, including 3A4000 for MIPS and 3A5000 for LoongArch. Though, the reason behind the workaround varies with the evaluation
> > > > of their uArch, for GS464V based core, barrier is required as the uArch design allows regular load to be reordered after an atomic linked
> > > > load, and that would break assumption of compiler atomic constraints.
> > >
> > > In GCC we use `dbar 0x700`, so in the future HW engineers can fix this issue and make `dbar 0x700` no-op.
> > Ouch, I mean the `dbar 1792` instruction at `LBB0_3`. Yes this one can be removed for 3A5000.
> >
> > But what should we do if `LLDBAR` bit is 0 in CPUCFG?
> Nice question.
>
> In which cases do we need a memory barrier before an atomic op? An atomic op with store-release semantics? Why not make sc act as membar + store-conditional?
>
> If the memory barrier semantics of ll remain membar + load-exclusive in the future, and the atomic op has load-acquire semantics, I think we can't make dbar 0x700 a no-op.
@hev, @xry111, thank you for your input. I will remove this dbar. In addition, following @hev's suggestion, a dbar is added before sc to ensure correctness when the LLDBAR bit is 0.
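For readers following the thread from the source level: the IR cmpxchg under test is what a C11 compare-and-exchange typically lowers to. The sketch below is only an illustration. The helper name try_update and the chosen memory orders are assumptions, not taken from the patch, and the dbar placement discussed above is a backend decision that is not visible at this level.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: atomically replace *p with `desired` if it
 * currently holds `expected`. Returns true when the store happened.
 * The acq_rel/acquire orderings are an illustrative choice only. */
static bool try_update(_Atomic uint32_t *p, uint32_t expected,
                       uint32_t desired) {
    return atomic_compare_exchange_strong_explicit(
        p, &expected, desired,
        memory_order_acq_rel,   /* ordering on success */
        memory_order_acquire);  /* ordering on failure */
}
```

On an LL/SC target a call like this becomes a retry loop around ll.w and sc.w, which is the loop whose barriers are being reviewed here.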
================
Comment at: llvm/test/CodeGen/LoongArch/ir-instruction/atomic-cmpxchg.ll:25-27
+; LA64-NEXT: xor $a5, $a4, $a2
+; LA64-NEXT: and $a5, $a5, $a0
+; LA64-NEXT: xor $a5, $a4, $a5
----------------
hev wrote:
> I think we should reduce the number of instructions between ll and sc to make ll/sc complete as fast as possible.
>
> For reference: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/arch/loongarch/include/asm/cmpxchg.h?h=next-20221014#n114
Thanks, I will modify it.
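To illustrate the point, here is a minimal C sketch of what the xor/and/xor sequence computes, next to one possible shorter form. The register roles are assumptions read off the test lines ($a4 as the loaded word, $a2 as the new value, $a0 as the mask), and the function names are hypothetical. The shorter form relies on the new value being masked before the loop, which is an assumption and not something this review states.

```c
#include <stdint.h>

/* Three-operation merge, mirroring xor/and/xor:
 * takes bits from newval where mask is set, from old elsewhere. */
static uint32_t merge_xor(uint32_t old, uint32_t newval, uint32_t mask) {
    uint32_t t = old ^ newval;  /* xor $a5, $a4, $a2 */
    t &= mask;                  /* and $a5, $a5, $a0 */
    return old ^ t;             /* xor $a5, $a4, $a5 */
}

/* Two-operation merge (and-not, then or). Only equivalent when
 * newval_premasked has no bits set outside mask. */
static uint32_t merge_andn_or(uint32_t old, uint32_t newval_premasked,
                              uint32_t mask) {
    return (old & ~mask) | newval_premasked;
}
```

With old = 0xAABBCCDD, a pre-masked new value of 0x00001100 and mask = 0x0000FF00, both functions return 0xAABB11DD, so moving the masking of the new value out of the loop saves one instruction between ll and sc.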
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D135948/new/
https://reviews.llvm.org/D135948