[PATCH] D157265: [AMDGPU] Reorder atomic optimizer to avoid CAS loop.
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Aug 7 07:50:25 PDT 2023
arsenm added a comment.
In D157265#4565286 <https://reviews.llvm.org/D157265#4565286>, @pravinjagtap wrote:
> In D157265#4565049 <https://reviews.llvm.org/D157265#4565049>, @foad wrote:
>
>>> Expand-Atomic pass emits the CAS loop for FP operations
>>> which limits the optimizations offered by atomic optimizer.
>>>
>>> Moving atomic optimizer before expand-atomics allows
>>> better codegen.
>>
>> So the intention is that you still get a CAS loop, but the whole loop is executed by a single lane, instead of switching into and out of single-lane mode each time around the loop?
>
> At the moment, yes, that is the current behavior. As an extension, I will create a new patch where atomic-expand calls `simplifyCFG` on the relevant blocks if it emits a CAS loop. [suggested by @arsenm]
AArch64 deals with this by inserting an extra simplifyCFG pass run, which seems excessive given that we're only making local changes.
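For reference, the kind of CAS loop under discussion for an FP atomic add has roughly the following shape. This is only a minimal C++ sketch using `std::atomic`, not the IR that the expand-atomics pass actually emits; the helper name `atomic_fadd_cas` and the chosen memory orderings are assumptions made for illustration.

```cpp
#include <atomic>

// Hypothetical helper (not from the patch): atomically performs
// `target += value` with a compare-and-swap loop and returns the old value.
float atomic_fadd_cas(std::atomic<float> &target, float value) {
  // Start from a snapshot of the current value.
  float expected = target.load(std::memory_order_relaxed);
  // If another thread changed `target` in the meantime, the exchange fails,
  // `expected` is refreshed with the current value, and the sum is recomputed
  // on the next iteration.
  while (!target.compare_exchange_weak(expected, expected + value,
                                       std::memory_order_seq_cst,
                                       std::memory_order_relaxed)) {
  }
  return expected;
}
```

The loop retries until the exchange succeeds. This is why running the whole loop in a single lane, as described above, avoids switching into and out of single-lane mode on every iteration.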
================
Comment at: llvm/test/CodeGen/AMDGPU/r600.global_atomics.ll:2
+; RUN: llc -march=r600 -mcpu=cypress -amdgpu-atomic-optimizer-strategy=None -verify-machineinstrs < %s | FileCheck -check-prefix=EG -check-prefix=FUNC %s
+; RUN: llc -march=r600 -mcpu=cayman -amdgpu-atomic-optimizer-strategy=None -verify-machineinstrs < %s | FileCheck -check-prefix=EG -check-prefix=FUNC %s
----------------
It seems you're working around the pass not handling r600. Can you just not add the pass in that case?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D157265/new/
https://reviews.llvm.org/D157265