[PATCH] D110069: AArch64: use `CAS` instead of `LDX`/`STX` for more ops if available
Eli Friedman via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Dec 12 18:21:17 PST 2022

efriedma added a comment.
Didn't realize this was up for review; happened to spot it on the list.
Some of these sequences seem extremely long.  It should be a little better on main, since we improved ccmp formation, but can we rearrange the operations somehow so that we need fewer mov instructions in the fast path?
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:17674
+  return Subtarget->hasLSE() ? AtomicExpansionKind::CmpXChg
+                             : AtomicExpansionKind::LLSC;
 }
----------------
A comment here might be useful.
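To illustrate the kind of comment meant here, a minimal standalone sketch follows. It is not LLVM code: the enum `ExpansionKindModel` and the function `pickExpansion` are hypothetical stand-ins for the names in the hunk, and the wording of the comment is only a guess at the rationale, based on the patch title.

```cpp
// Standalone model, NOT the LLVM sources. ExpansionKindModel and
// pickExpansion are hypothetical names that mirror the hunk above.
enum class ExpansionKindModel { LLSC, CmpXChg };

// Assumed rationale (taken from the patch title): when LSE is available,
// expand to a cmpxchg loop so that CAS-family instructions can be selected;
// otherwise keep the exclusive load/store (LL/SC) loop.
constexpr ExpansionKindModel pickExpansion(bool HasLSE) {
  return HasLSE ? ExpansionKindModel::CmpXChg : ExpansionKindModel::LLSC;
}

static_assert(pickExpansion(true) == ExpansionKindModel::CmpXChg,
              "LSE selects the cmpxchg expansion");
static_assert(pickExpansion(false) == ExpansionKindModel::LLSC,
              "no LSE keeps the LL/SC expansion");
```

The actual comment should state whatever the real motivation is; the point is only that the LSE/non-LSE split deserves a sentence of explanation next to the ternary.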
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:17712
   // succeed. So at -O0 lower this operation to a CAS loop.
-  if (getTargetMachine().getOptLevel() == CodeGenOpt::None)
+  if (getTargetMachine().getOptLevel() == CodeGenOpt::None || Subtarget->hasLSE())
     return AtomicExpansionKind::CmpXChg;
----------------
The new line exceeds 80 columns.
The comment above it also needs to be updated to cover the hasLSE() case.
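A rough sketch of what both fixes could look like together, written as a standalone model rather than against the real sources. `OptLevelModel`, `RMWExpansionModel` and `pickRMWExpansion` are hypothetical names, the comment text is only a suggestion, and the `LLSC` fall-through at the end is an assumption, since the hunk does not show what follows the `if`.

```cpp
// Standalone model, NOT the LLVM sources; every name here is hypothetical.
enum class OptLevelModel { None, Default };
enum class RMWExpansionModel { LLSC, CmpXChg };

constexpr RMWExpansionModel pickRMWExpansion(OptLevelModel Opt, bool HasLSE) {
  // Suggested comment wording: at -O0, or whenever LSE is available, lower
  // this operation to a CAS loop.
  // The condition is wrapped so that no line exceeds 80 columns.
  if (Opt == OptLevelModel::None ||
      HasLSE)
    return RMWExpansionModel::CmpXChg;
  // Assumed fall-through; the code after the `if` is not part of the hunk.
  return RMWExpansionModel::LLSC;
}

static_assert(pickRMWExpansion(OptLevelModel::None, false) ==
                  RMWExpansionModel::CmpXChg,
              "-O0 uses a CAS loop");
static_assert(pickRMWExpansion(OptLevelModel::Default, true) ==
                  RMWExpansionModel::CmpXChg,
              "LSE uses a CAS loop");
static_assert(pickRMWExpansion(OptLevelModel::Default, false) ==
                  RMWExpansionModel::LLSC,
              "otherwise LL/SC");
```

In the real file, clang-format should produce the appropriate line break automatically.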
================
Comment at: llvm/test/CodeGen/AArch64/arm64-atomic-128.ll:836
+; OUTLINE-NEXT:    // =>This Inner Loop Header: Depth=1
+; OUTLINE-NEXT:    ldaxp xzr, x8, [x2]
+; OUTLINE-NEXT:    stlxp w8, x0, x1, [x2]
----------------
This bug got fixed, right?
================
Comment at: llvm/test/CodeGen/AArch64/atomicrmw-xchg-fp.ll:107
+; LSE-NEXT:    mov x4, x6
+; LSE-NEXT:    mov x5, x7
+; LSE-NEXT:    caspal x4, x5, x2, x3, [x0]
----------------
These moves seem very strange.
Repository:
  rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110069/new/
https://reviews.llvm.org/D110069
    
    