[llvm] [AMDGPU][True16][CodeGen] update waitcnt for true16 (PR #128927)

Brox Chen via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 7 14:24:09 PST 2025


================
@@ -137,10 +137,10 @@ enum WaitEventType {
 // We reserve a fixed number of VGPR slots in the scoring tables for
 // special tokens like SCMEM_LDS (needed for buffer load to LDS).
 enum RegisterMapping {
-  SQ_MAX_PGM_VGPRS = 512, // Maximum programmable VGPRs across all targets.
-  AGPR_OFFSET = 256,      // Maximum programmable ArchVGPRs across all targets.
-  SQ_MAX_PGM_SGPRS = 256, // Maximum programmable SGPRs across all targets.
-  NUM_EXTRA_VGPRS = 9,    // Reserved slots for DS.
+  SQ_MAX_PGM_VGPRS = 1024, // Maximum programmable VGPRs across all targets.
----------------
broxigarchen wrote:

If I understand correctly, this is mainly about compile speed, right?

It seems determineWait and setScoreByInterval both run for loops over a register interval with boundary checks. I assume we mostly update the waitcnt scores of only a few registers at a time, right? If that's true, we could benefit from maintaining a sorted list of the scores, so that the for loop can be turned into a binary search. The gain in search time should outweigh the extra cost of updating values.
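
A rough sketch of one way to read this idea, and not the actual code in SIInsertWaitcnts.cpp: keep only the registers with a pending (nonzero) score in a container sorted by register slot, so an interval query starts with a binary search and then visits only the pending entries. The names SparseScores, setScore and maxScoreInInterval are made up for illustration, and the assumption that a score of 0 means "nothing pending" is mine.

```cpp
#include <algorithm>
#include <cassert>
#include <map>

// Hypothetical sketch only; none of these names exist in LLVM.
// Assumption: a score of 0 means "no pending event" for that slot.
class SparseScores {
  // Register slot -> nonzero score, kept sorted by slot.
  std::map<unsigned, unsigned> Pending;

public:
  // Update costs O(log n) instead of a plain array store.
  void setScore(unsigned Slot, unsigned Score) {
    if (Score == 0)
      Pending.erase(Slot);
    else
      Pending[Slot] = Score;
  }

  // Largest pending score in the half-open interval [First, Last).
  // lower_bound is the binary search; the loop then touches only
  // slots that actually have a pending score.
  unsigned maxScoreInInterval(unsigned First, unsigned Last) const {
    assert(First <= Last && "malformed interval");
    unsigned Max = 0;
    for (auto It = Pending.lower_bound(First), E = Pending.end();
         It != E && It->first < Last; ++It)
      Max = std::max(Max, It->second);
    return Max;
  }
};
```

Whether this wins in practice depends on how sparse the pending set is and how often scores are written. A dense array with a tracked upper bound may still be faster for small intervals.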

https://github.com/llvm/llvm-project/pull/128927

