[llvm] [AMDGPU] IGLP: Fixes for VMEM load detection and unsigned int handling (PR #135090)

Fri Apr 11 13:08:32 PDT 2025

================
@@ -2079,6 +2083,9 @@ class MFMASmallGemmSingleWaveOpt final : public IGLPStrategy {
 static unsigned DSWCount = 0;
 static unsigned DSWWithPermCount = 0;
 static unsigned DSWWithSharedVMEMCount = 0;
+static void resetDSWCounters() {
+  DSWCount = DSWWithPermCount = DSWWithSharedVMEMCount = 0;
+}
----------------
ro-i wrote:

Because the original authors actually wanted to maintain state across scheduling phases.
There is this scheduling phase enum:
https://github.com/llvm/llvm-project/blob/a45b133d400b0e57ca1ba70d50a91fbdf11d3b93/llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.h#L20

Take `MFMASmallGemmSingleWaveOpt` as an example. It is instantiated in `createIGLPStrategy`:
https://github.com/llvm/llvm-project/blob/a45b133d400b0e57ca1ba70d50a91fbdf11d3b93/llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp#L2329-L2343

`createIGLPStrategy` is called by an instance of `IGroupLPDAGMutation`:
https://github.com/llvm/llvm-project/blob/a45b133d400b0e57ca1ba70d50a91fbdf11d3b93/llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp#L2689

That's why `IGroupLPDAGMutation` is itself instantiated with a phase parameter in `createIGroupLPDAGMutation`:
https://github.com/llvm/llvm-project/blob/a45b133d400b0e57ca1ba70d50a91fbdf11d3b93/llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp#L2705-L2707
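
To make the chain above concrete, here is a minimal standalone sketch of how a phase can be threaded from the mutation factory down to a freshly created strategy. It is not the actual LLVM code: every name ending in `Sketch` is hypothetical, the real signatures differ, and only the enumerator names are taken from the phases mentioned in this comment.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Enumerator names follow the phases mentioned in this comment.
enum class SchedulingPhase { Initial, PreRAReentry, PostRA };

// Hypothetical strategy interface; not the real IGLPStrategy signature.
struct StrategySketch {
  virtual ~StrategySketch() = default;
  virtual std::string apply(SchedulingPhase Phase) = 0;
};

// Hypothetical strategy that only reports what it would do in a given phase.
struct SmallGemmStrategySketch final : StrategySketch {
  std::string apply(SchedulingPhase Phase) override {
    return Phase == SchedulingPhase::Initial ? "compute" : "reuse";
  }
};

// Hypothetical factory, standing in for createIGLPStrategy.
std::unique_ptr<StrategySketch> createStrategySketch() {
  return std::make_unique<SmallGemmStrategySketch>();
}

// Hypothetical mutation, standing in for IGroupLPDAGMutation: it stores the
// phase it was created with and forwards it to a newly built strategy.
class MutationSketch {
  SchedulingPhase Phase;

public:
  explicit MutationSketch(SchedulingPhase Phase) : Phase(Phase) {}

  std::string apply() const {
    std::unique_ptr<StrategySketch> S = createStrategySketch();
    assert(S && "factory returned no strategy");
    return S->apply(Phase);
  }
};

// Hypothetical counterpart of createIGroupLPDAGMutation.
std::unique_ptr<MutationSketch> createMutationSketch(SchedulingPhase Phase) {
  return std::make_unique<MutationSketch>(Phase);
}
```

In this sketch the strategy object is rebuilt on every `apply()`, so anything it wants to remember between phases cannot live in its own members.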

`createIGroupLPDAGMutation` is called, for example, (1) by `GCNSchedStage`* with phase parameters "Initial" or "PreRAReentry":
https://github.com/llvm/llvm-project/blob/a45b133d400b0e57ca1ba70d50a91fbdf11d3b93/llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp#L1192
or (2) by `GCNScheduleDAGMILive` with phase parameter "Initial":
https://github.com/llvm/llvm-project/blob/72144d119a7291f8b6b8e022a2947fbe31e66afc/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp#L595
or (3) by `GCNPostScheduleDAGMILive` with phase parameter "PostRA":
https://github.com/llvm/llvm-project/blob/a45b133d400b0e57ca1ba70d50a91fbdf11d3b93/llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp#L2065
https://github.com/llvm/llvm-project/blob/72144d119a7291f8b6b8e022a2947fbe31e66afc/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp#L1112

*Those stages themselves are objects maintained, e.g., by `GCNScheduleDAGMILive`:
https://github.com/llvm/llvm-project/blob/a45b133d400b0e57ca1ba70d50a91fbdf11d3b93/llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp#L777

And the GCN schedule DAGs themselves are created by `AMDGPUTargetMachine`/`GCNTargetMachine`.

Coming back to our `MFMASmallGemmSingleWaveOpt`: it wants to maintain some cached values that are computed in the "Initial" phase and reused in later phases:
https://github.com/llvm/llvm-project/blob/ae0aa2dea2dee3af01326e5ff96ab436628f7e2b/llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp#L2092-L2094
https://github.com/llvm/llvm-project/blob/ae0aa2dea2dee3af01326e5ff96ab436628f7e2b/llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp#L2116-L2169
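
A simplified sketch of that caching pattern, under stated assumptions: `DSWCount` and `resetDSWCounters` mirror the names in the diff (reduced to one counter), while `InstKind`, `SmallGemmSketch`, and the point at which the reset is called are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <vector>

// Enumerator names follow the phases mentioned in this comment.
enum class SchedulingPhase { Initial, PreRAReentry, PostRA };

// File-scope state that outlives any single strategy object.
static unsigned DSWCount = 0;
static void resetDSWCounters() { DSWCount = 0; }

// Hypothetical instruction classification, for illustration only.
enum class InstKind { DSWrite, DSRead, Other };

// Hypothetical strategy. A fresh object may be created for each phase, so the
// count is kept in the file-scope static rather than in a member.
struct SmallGemmSketch {
  unsigned apply(const std::vector<InstKind> &Insts, SchedulingPhase Phase) {
    if (Phase == SchedulingPhase::Initial) {
      // Assumption of this sketch: the caller resets the counter beforehand.
      assert(DSWCount == 0 && "counter must be reset before the initial phase");
      for (InstKind K : Insts)
        if (K == InstKind::DSWrite)
          ++DSWCount;
    }
    // Later phases skip the recount and reuse the cached value.
    return DSWCount;
  }
};
```

With this layout, a value counted in the "Initial" phase is still visible to a different strategy object in "PostRA", and a stale value survives into the next region unless something calls `resetDSWCounters()` first.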

That's a rough description of the situation as far as I can see it.

https://github.com/llvm/llvm-project/pull/135090