[PATCH] D131560: AMDGPU: Improve atomicrmw fadd selection
Petar Avramovic via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 16 03:50:10 PDT 2022
Petar.Avramovic added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:12781
+ // Global fadd f32 no-rtn for gfx908 (and gfx11+).
+ if (!Subtarget->hasGFX90AInsts() && AS == AMDGPUAS::GLOBAL_ADDRESS &&
+ Ty->isFloatTy() && RMW->use_empty())
----------------
b-sumner wrote:
> rampitec wrote:
> > I think both are still unsafe even when supported, on both targets.
> I agree, these are unsafe.
In order to simplify all these conditions:
Obvious first requirement is that subtarget needs to have instruction.
- for gfx940 (MI300) having instruction is enough
- for all other subtargets fadd is unsafe and can be selected only when
1) function has amdgpu-unsafe-fp-atomics=true
2) and atomic rmw needs to be non-system scope and not "one-as"
Second bullet is already implemented for gfx90a (MI200).
If I understand this correctly, fadd atomic rmw for gfx908(MI100) and gfx11 is unsafe like for MI200 and can be selected under same conditions.
summary of logic
```
// mi300
if `subtarget == gfx940` and `subtarget has instruction`
don't expand and return
// mi100, mi200, gfx11+
if 'amdgpu-unsafe-fp-atomics=true' and 'scope is non-system' and `subtarget has instruction`
don't expand and return
expand
```
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D131560/new/
https://reviews.llvm.org/D131560
More information about the llvm-commits
mailing list