[PATCH] D157265: [AMDGPU] Reorder atomic optimizer to avoid CAS loop.
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Aug 7 03:02:53 PDT 2023
foad added a comment.
> Expand-Atomic pass emits the CAS loop for FP operations
> which limits the optimizations offered by atomic optimizer.
>
> Moving atomic optimizer before expand-atomics allows
> better codegen.
So the intention is that you still get a CAS loop, but the whole loop is executed by a single lane, instead of switching into and out of single-lane mode each time around the loop?
Makes sense to me.
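For readers following along, here is a rough CPU-side analogy of the idea, not the actual AMDGPU lowering. It assumes a floating-point atomic add emulated with a compare-and-swap loop, and models the optimizer's effect as combining the per-lane values first so that only one lane runs a single CAS loop. The names atomic_fadd_cas and wave_fadd_single_lane are hypothetical and do not come from the patch.

```cpp
#include <atomic>
#include <cassert>
#include <numeric>
#include <vector>

// CAS-loop emulation of an atomic float add (hypothetical helper).
// Returns the value that was stored before the add.
float atomic_fadd_cas(std::atomic<float>& target, float value) {
    float expected = target.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak reloads `expected` with the
    // current contents, so the sum is recomputed on every retry.
    while (!target.compare_exchange_weak(expected, expected + value,
                                         std::memory_order_seq_cst,
                                         std::memory_order_relaxed)) {
    }
    return expected;
}

// Hypothetical model of the optimized form: the contributions of all
// active lanes are reduced first, then a single lane performs one
// CAS loop for the combined value.
float wave_fadd_single_lane(std::atomic<float>& target,
                            const std::vector<float>& lane_values) {
    float sum = std::accumulate(lane_values.begin(), lane_values.end(), 0.0f);
    return atomic_fadd_cas(target, sum);
}
```

In this sketch the retry loop sits entirely inside the single-lane region, which is the shape the question above describes, as opposed to entering and leaving single-lane mode on every iteration.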
================
Comment at: llvm/test/CodeGen/AMDGPU/atomic_optimizations_pixelshader.ll:44
; GFX7-NEXT: s_wqm_b64 s[4:5], -1
+; GFX7-NEXT: s_and_b64 s[4:5], s[4:5], s[4:5]
; GFX7-NEXT: s_andn2_b64 vcc, exec, s[4:5]
----------------
It would be nice to at least understand where these weird redundant instructions are coming from (the added s_and_b64 ANDs s[4:5] with itself).
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D157265/new/
https://reviews.llvm.org/D157265