[PATCH] D121437: [AMDGPU] Add s_nop WaitStates between neighboring mfma

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 23 11:49:01 PDT 2022


rampitec accepted this revision.
rampitec added a comment.
This revision is now accepted and ready to land.

LGTM



================
Comment at: llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp:1397
 
     if (WaitStatesNeeded == MaxWaitStates)
       return WaitStatesNeeded; // Early exit.
----------------
kerbowa wrote:
> rampitec wrote:
> > The longest MAI is 64 cycles. You may want to move your code to the top, as it can produce the longest nop sequence.
> Isn't MFMA32x32WritesAGPRAccVgprReadWaitStates longer than 64 cycles? Max wait for padding should be 16 wait states versus 18.
Right, I forgot that the cycle count is divided by 4 to get wait states.
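
To spell out the arithmetic in this exchange, here is a minimal sketch. It takes its numbers from the comments above: the longest MAI is 64 cycles, a cycle count is divided by 4 to get wait states, and MFMA32x32WritesAGPRAccVgprReadWaitStates is 18. The helper cyclesToWaitStates and the names CyclesPerWaitState, LongestMAICycles and MaxPaddingWaitStates are hypothetical, chosen for illustration only; this is not the code in the patch.

```cpp
#include <algorithm>

// Assumption taken from the thread: a cycle count is divided by 4
// to obtain wait states. The name is hypothetical.
constexpr int CyclesPerWaitState = 4;

// From the review comment: the longest MAI takes 64 cycles.
constexpr int LongestMAICycles = 64;

// Value as quoted in the thread (18 wait states).
constexpr int MFMA32x32WritesAGPRAccVgprReadWaitStates = 18;

// Hypothetical helper: convert a cycle count to wait states.
constexpr int cyclesToWaitStates(int Cycles) {
  return Cycles / CyclesPerWaitState;
}

// Upper bound on the padding requested between neighboring MFMAs.
constexpr int MaxPaddingWaitStates = cyclesToWaitStates(LongestMAICycles);

static_assert(MaxPaddingWaitStates == 16, "64 cycles / 4 = 16 wait states");
static_assert(MaxPaddingWaitStates < MFMA32x32WritesAGPRAccVgprReadWaitStates,
              "padding never exceeds the existing 18-wait-state hazard");

// The larger of the two bounds is what could yield the longest nop sequence.
constexpr int worstCaseWaitStates() {
  return std::max(MaxPaddingWaitStates,
                  MFMA32x32WritesAGPRAccVgprReadWaitStates);
}
```

Under these assumptions the padding check tops out at 16 wait states, below the existing 18, so hoisting it to the top would not let the early exit trigger any sooner.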


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121437/new/

https://reviews.llvm.org/D121437


