[PATCH] D121437: [AMDGPU] Add s_nop WaitStates between neighboring mfma

Austin Kerbow via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 23 11:41:31 PDT 2022


kerbowa added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp:1397
 
     if (WaitStatesNeeded == MaxWaitStates)
       return WaitStatesNeeded; // Early exit.
----------------
rampitec wrote:
> The longest MAI is 64 cycles. You may want to move your code to the top, as it can produce the longest nop sequence.
Isn't MFMA32x32WritesAGPRAccVgprReadWaitStates longer than 64 cycles? The max wait for padding should be 16 wait states, versus 18 for that hazard.
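
For readers following the ordering argument, here is a minimal standalone sketch of why check order interacts with the early exit. It is not the code in GCNHazardRecognizer.cpp: the `HazardCheck` struct, the `waitStatesNeeded` helper, and the constants are hypothetical names for illustration, and the values 16 and 18 are just the figures quoted in this thread.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical constants, assumed from the figures quoted in the thread.
constexpr int PaddingWaitStates = 16;     // assumed worst case for MFMA padding
constexpr int AccVgprReadWaitStates = 18; // assumed worst case for the AGPR read hazard
constexpr int MaxWaitStates = 18;         // largest requirement among the checks

// One hazard check: how many wait states it demands and how many have
// already elapsed since the producing instruction.
struct HazardCheck {
  int RequiredWaitStates;
  int WaitStatesSince;
};

// Returns the wait states still needed across all checks. ChecksRun reports
// how many checks were evaluated before returning.
int waitStatesNeeded(const std::vector<HazardCheck> &Checks, int &ChecksRun) {
  int Needed = 0;
  ChecksRun = 0;
  for (const HazardCheck &C : Checks) {
    ++ChecksRun;
    Needed = std::max(Needed, C.RequiredWaitStates - C.WaitStatesSince);
    assert(Needed <= MaxWaitStates && "no check may exceed the maximum");
    if (Needed == MaxWaitStates)
      return Needed; // Early exit: no later check can raise the result.
  }
  return Needed;
}
```

In this sketch the early exit can only fire on a check whose requirement equals `MaxWaitStates`, so a 16-wait-state padding check placed first never triggers it on its own, while an 18-wait-state check placed first can skip everything after it.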


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121437/new/

https://reviews.llvm.org/D121437


