[PATCH] D121437: [AMDGPU] Add s_nop WaitStates between neighboring mfma
Austin Kerbow via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Mar 23 11:41:31 PDT 2022
kerbowa added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp:1397
  if (WaitStatesNeeded == MaxWaitStates)
    return WaitStatesNeeded; // Early exit.
----------------
rampitec wrote:
> Longest MAI is 64 cycles. You may want to move your code to the top as it can bring longest nop sequence.
Isn't MFMA32x32WritesAGPRAccVgprReadWaitStates longer than 64 cycles? The padding should need at most 16 wait states, versus 18 for that hazard.
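
To illustrate the ordering point, here is a minimal standalone sketch; it is not the code in GCNHazardRecognizer.cpp. The values 18 and 16 come from the comment above. Every name in it (kAccVgprReadWaitStates, kMaxPaddingWaitStates, kMaxWaitStates, waitStatesNeeded, ChecksRun) is hypothetical, and it assumes MaxWaitStates equals the 18-wait-state hazard.

```cpp
#include <algorithm>
#include <initializer_list>

// Values from the review discussion; the names are hypothetical.
constexpr int kAccVgprReadWaitStates = 18; // MFMA32x32WritesAGPRAccVgprReadWaitStates
constexpr int kMaxPaddingWaitStates = 16;  // upper bound assumed for the mfma padding
constexpr int kMaxWaitStates = 18;         // assumed largest value any check can report

// Returns the largest requirement among Checks, evaluated in order, and
// stops as soon as the maximum is reached. ChecksRun reports how many
// checks were evaluated before returning.
int waitStatesNeeded(std::initializer_list<int> Checks, int &ChecksRun) {
  int Needed = 0;
  ChecksRun = 0;
  for (int C : Checks) {
    ++ChecksRun;
    Needed = std::max(Needed, C);
    if (Needed == kMaxWaitStates)
      return Needed; // Early exit.
  }
  return Needed;
}
```

Under these assumptions, a padding check capped at 16 wait states can never trigger the early exit by itself, so moving it to the top would not let the function return sooner; only the 18-wait-state check can.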
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D121437/new/
https://reviews.llvm.org/D121437