[PATCH] D115747: [AMDGPU] Hoist waitcnt out of loops when they unnecessarily wait for stores
Baptiste Saleil via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Dec 20 12:02:09 PST 2021
bsaleil added a comment.
@arsenm the idea is to rotate the pass's own processing loop (the loop over basic blocks in runOnMachineFunction of the SIInsertWaitcnt pass), not the machine IR loop that contains the waitcnt, so that decisions can be made from a block's predecessors:
In D115747#3194595 <https://reviews.llvm.org/D115747#3194595>, @foad wrote:
> Note that the loop over basic blocks in runOnMachineFunction currently looks like:
>
>   for each block:
>     get saved state for this block
>     process block
>     merge new state into saved state for each successor
>
> But if necessary we could change it to:
>
>   for each block:
>     merge saved state from each predecessor
>     process block
>     save state for this block
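
To make the two loop shapes quoted above concrete, here is a minimal toy sketch. It is not the SIInsertWaitcnt code. It assumes a state is just a bitmask of pending events and that merging is a bitwise OR. Block, processBlock, runPush, runPull and makeLoopCFG are hypothetical names invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy model: a state is a bitmask of pending events and a
// merge is a bitwise OR. This only stands in for the pass's real state.
using State = unsigned;

struct Block {
  std::vector<std::size_t> Preds;
  std::vector<std::size_t> Succs;
  State Gen = 0;  // events this block leaves pending
  State Kill = 0; // events this block waits for
};

// Toy transfer function: drop the waited-for events, add the new ones.
State processBlock(const Block &B, State In) { return (In & ~B.Kill) | B.Gen; }

// Current shape: process a block, then merge its new state into the saved
// entry state of each successor.
std::vector<State> runPush(const std::vector<Block> &Blocks) {
  std::vector<State> In(Blocks.size(), 0);
  for (bool Changed = true; Changed;) {
    Changed = false;
    for (std::size_t I = 0; I < Blocks.size(); ++I) {
      State Out = processBlock(Blocks[I], In[I]);
      for (std::size_t S : Blocks[I].Succs) {
        State Merged = In[S] | Out;
        if (Merged != In[S]) {
          In[S] = Merged;
          Changed = true;
        }
      }
    }
  }
  return In;
}

// Rotated shape: merge the saved exit state of each predecessor, process
// the block, then save its exit state. All predecessor states are visible
// at the point where the block is processed.
std::vector<State> runPull(const std::vector<Block> &Blocks) {
  std::vector<State> In(Blocks.size(), 0), Out(Blocks.size(), 0);
  for (bool Changed = true; Changed;) {
    Changed = false;
    for (std::size_t I = 0; I < Blocks.size(); ++I) {
      State Merged = 0;
      for (std::size_t P : Blocks[I].Preds) {
        assert(P < Blocks.size() && "predecessor index out of range");
        Merged |= Out[P];
      }
      In[I] = Merged;
      State NewOut = processBlock(Blocks[I], Merged);
      if (NewOut != Out[I]) {
        Out[I] = NewOut;
        Changed = true;
      }
    }
  }
  return In;
}

// Hypothetical CFG: preheader (0) -> loop body (1, with a back edge) -> exit (2).
std::vector<Block> makeLoopCFG() {
  std::vector<Block> Blocks(3);
  Blocks[0].Succs = {1};
  Blocks[0].Gen = 0x1; // e.g. a pending store issued before the loop
  Blocks[1].Preds = {0, 1};
  Blocks[1].Succs = {1, 2};
  Blocks[1].Gen = 0x2; // e.g. a pending load issued inside the loop
  Blocks[2].Preds = {1};
  Blocks[2].Kill = 0x3; // the exit block waits for everything
  return Blocks;
}
```

In this toy model both shapes converge to the same entry states. The difference is that the rotated form has each predecessor's state separately available when a block is processed, which is what lets a decision be made from the predecessors.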
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D115747/new/
https://reviews.llvm.org/D115747