[PATCH] D110069: AArch64: use `CAS` instead of `LDX`/`STX` for more ops if available
Eli Friedman via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Dec 12 18:21:17 PST 2022
efriedma added a comment.
Didn't realize this was up for review; happened to spot it on the list.
Some of these sequences seem extremely long. They should be a little better on main, since we improved ccmp formation, but can we rearrange the operations somehow so that we need fewer mov instructions in the fast path?
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:17674
+ return Subtarget->hasLSE() ? AtomicExpansionKind::CmpXChg
+ : AtomicExpansionKind::LLSC;
}
----------------
A comment here might be useful.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:17712
// succeed. So at -O0 lower this operation to a CAS loop.
- if (getTargetMachine().getOptLevel() == CodeGenOpt::None)
+ if (getTargetMachine().getOptLevel() == CodeGenOpt::None || Subtarget->hasLSE())
return AtomicExpansionKind::CmpXChg;
----------------
This line is over 80 columns.
The comment above also needs to be updated, since the CAS loop is no longer used only at -O0.
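For illustration, the decision this hunk now makes can be modelled as a small standalone sketch. Everything here is an assumption made for the example: `pickExpansion`, `IsOptNone`, and `HasLSE` are hypothetical stand-ins for the real queries, the enum is a cut-down copy of the real one, and the comment wording is only a suggestion.

```cpp
// Cut-down stand-in for the real AtomicExpansionKind (illustration only).
enum class AtomicExpansionKind { LLSC, CmpXChg };

// Hypothetical helper modelling the condition in the hunk above.
// Suggested comment: at -O0 an LL/SC loop may never succeed, so lower to a
// CAS loop; when LSE is available, also prefer a CAS loop since the CAS
// instructions can be used directly.
AtomicExpansionKind pickExpansion(bool IsOptNone, bool HasLSE) {
  if (IsOptNone || HasLSE)
    return AtomicExpansionKind::CmpXChg;
  return AtomicExpansionKind::LLSC;
}
```

Written this way the condition fits in 80 columns, and the comment covers both reasons for choosing the CAS loop.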
================
Comment at: llvm/test/CodeGen/AArch64/arm64-atomic-128.ll:836
+; OUTLINE-NEXT: // =>This Inner Loop Header: Depth=1
+; OUTLINE-NEXT: ldaxp xzr, x8, [x2]
+; OUTLINE-NEXT: stlxp w8, x0, x1, [x2]
----------------
This bug got fixed, right?
================
Comment at: llvm/test/CodeGen/AArch64/atomicrmw-xchg-fp.ll:107
+; LSE-NEXT: mov x4, x6
+; LSE-NEXT: mov x5, x7
+; LSE-NEXT: caspal x4, x5, x2, x3, [x0]
----------------
These moves seem very strange.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D110069/new/
https://reviews.llvm.org/D110069