[llvm] [AMDGPU][True16][CodeGen] update waitcnt for true16 (PR #128927)
Brox Chen via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 7 14:24:09 PST 2025
================
@@ -137,10 +137,10 @@ enum WaitEventType {
// We reserve a fixed number of VGPR slots in the scoring tables for
// special tokens like SCMEM_LDS (needed for buffer load to LDS).
enum RegisterMapping {
- SQ_MAX_PGM_VGPRS = 512, // Maximum programmable VGPRs across all targets.
- AGPR_OFFSET = 256, // Maximum programmable ArchVGPRs across all targets.
- SQ_MAX_PGM_SGPRS = 256, // Maximum programmable SGPRs across all targets.
- NUM_EXTRA_VGPRS = 9, // Reserved slots for DS.
+ SQ_MAX_PGM_VGPRS = 1024, // Maximum programmable VGPRs across all targets.
----------------
broxigarchen wrote:
If I understand correctly, this is mainly about compile speed, right?
It seems determineWait and setScoreByInterval run for loops over a register interval and check the bounds. I assume we mostly update the waitcnt of only a few registers at a time, right? If that's true, we could benefit from maintaining a sorted list of the scores so that the for loop can become a binary search. The gain in search time should outweigh the extra cost of updating values.
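A rough sketch of one possible reading of this idea, not the actual SIInsertWaitcnts code. All names here (RegIndex, Score, maxScoreDense, SparseScores, checkAgreement) are hypothetical. The sketch assumes the query is "maximum pending score over a register interval" and keeps the sorted container keyed by register index rather than by score, which is an assumption about what the suggestion means. The interval start is located with a binary search (lower_bound), and only registers that have a pending score are visited.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <vector>

// Hypothetical types for illustration only.
using RegIndex = unsigned;
using Score = unsigned;

// Dense table: one slot per register, scanned linearly over [first, last).
// Cost grows with the interval width, even if most slots are zero.
Score maxScoreDense(const std::vector<Score> &scores, RegIndex first,
                    RegIndex last) {
  Score best = 0;
  for (RegIndex r = first; r < last && r < scores.size(); ++r)
    best = std::max(best, scores[r]);
  return best;
}

// Sparse table: stores only registers with a nonzero pending score,
// ordered by register index.
class SparseScores {
  std::map<RegIndex, Score> pending;

public:
  // O(log n) update; a zero score removes the entry.
  void setScore(RegIndex reg, Score s) {
    if (s == 0)
      pending.erase(reg);
    else
      pending[reg] = s;
  }

  // Binary search for the first populated register >= first, then walk
  // only the populated entries that fall below last.
  Score maxScore(RegIndex first, RegIndex last) const {
    Score best = 0;
    for (auto it = pending.lower_bound(first);
         it != pending.end() && it->first < last; ++it)
      best = std::max(best, it->second);
    return best;
  }
};

// Checks that both representations agree on a few intervals.
void checkAgreement() {
  std::vector<Score> dense(1024, 0);
  SparseScores sparse;
  const RegIndex regs[] = {3, 300, 700};
  const Score vals[] = {5, 9, 2};
  for (int i = 0; i < 3; ++i) {
    dense[regs[i]] = vals[i];
    sparse.setScore(regs[i], vals[i]);
  }
  assert(maxScoreDense(dense, 0, 1024) == sparse.maxScore(0, 1024));
  assert(maxScoreDense(dense, 4, 300) == sparse.maxScore(4, 300));
  assert(maxScoreDense(dense, 301, 1024) == sparse.maxScore(301, 1024));
}
```

Whether this would pay off depends on how many registers typically carry a pending score. With only a few live entries the sparse walk touches far fewer slots than a scan over a 1024-entry table, but each update costs a tree operation instead of a plain store.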
https://github.com/llvm/llvm-project/pull/128927