[PATCH] D152649: [AMDGPU] Enable Atomic Optimizer and Default to Iterative Scan Strategy.
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jun 19 04:01:40 PDT 2023
foad added a comment.
> The pass seems to take an atomic operation that lowers to a single instruction and replace it with a loop over active lanes, each of which calls that same instruction.
No - it takes an atomic operation that is executed by (we assume) many lanes, and replaces it with an atomic that is only executed by a single lane, because it is inside some kind of "if (laneid==0)" check.
To make this work you might have to fettle the inputs or outputs of the atomic op, so that it behaves "as if" it was executed many times by many lanes. E.g. for an atomic add you have to do a plus-reduction of the inputs to the many-lane atomic adds, to get the value to pass into the single-lane atomic add. That's where the loop comes in: it is one way of calculating the plus-reduction. But since the loop is only doing ALU work, it is still supposed to be better than running a whole bunch of serialised atomic memory operations.
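
To illustrate the idea, here is a minimal sketch in Python that simulates one wavefront on the host. It is not the pass's actual code or output: the function names, the list-based "lanes" and the one-element `memory` list are all made up for the example. It also assumes the atomic's return value is used, in which case each lane gets the old memory value plus the sum of the inputs of the active lanes before it, which is one valid ordering of the original many-lane adds.

```python
def per_lane_atomic_add(memory, values, active):
    """Baseline: every active lane issues its own atomic add (serialised)."""
    returned = [None] * len(values)
    atomics_issued = 0
    for lane, (value, is_active) in enumerate(zip(values, active)):
        if is_active:
            returned[lane] = memory[0]  # an atomic add returns the old value
            memory[0] += value
            atomics_issued += 1
    return returned, atomics_issued


def single_lane_atomic_add(memory, values, active):
    """Sketch of the optimised form: ALU-only loop, then a single atomic."""
    prefix = [None] * len(values)
    total = 0
    # The loop over active lanes: pure ALU work, no memory traffic.
    for lane, (value, is_active) in enumerate(zip(values, active)):
        if is_active:
            prefix[lane] = total  # sum of the earlier active lanes' inputs
            total += value        # running plus-reduction
    if not any(active):
        return prefix, 0
    # Only one lane performs the atomic, passing in the reduced value.
    old = memory[0]
    memory[0] += total
    # Fix up the outputs "as if" each lane had run its own atomic.
    returned = [None if p is None else old + p for p in prefix]
    return returned, 1


values = [3, 1, 4, 1, 5, 9, 2, 6]
active = [True, True, False, True, True, False, True, True]
baseline_mem, optimised_mem = [100], [100]
baseline = per_lane_atomic_add(baseline_mem, values, active)
optimised = single_lane_atomic_add(optimised_mem, values, active)
```

Both versions leave the same value in memory and hand the same results back to each lane, but the first issues one atomic per active lane and the second issues exactly one.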
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D152649/new/
https://reviews.llvm.org/D152649