[llvm] [AMDGPU][CodeGen][True16] Track waitcnt as vgpr32 instead of vgpr16 for D16 Instructions in GFX11 (PR #157795)

Mon Sep 15 07:35:04 PDT 2025

github-actions[bot] wrote:




:warning: The C/C++ code formatter, clang-format, found issues in your code. :warning:

<details>
<summary>
You can test this locally with the following command:
</summary>

``````````bash
git-clang-format --diff origin/main HEAD --extensions h,cpp -- llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
``````````

:warning:
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing `origin/main` to the base branch/commit you want to compare against.
:warning:

</details>

<details>
<summary>
View the diff from clang-format here.
</summary>

``````````diff

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp b/llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
index 21a94db32..7daeae54c 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
@@ -38,9 +38,7 @@ bool AMDGPUSubtarget::useRealTrue16Insts() const {
   return hasTrue16BitInsts() && EnableRealTrue16Insts;
 }
 
-bool AMDGPUSubtarget::has16bitD16HWBug() const {
-  return Enable16bitD16HWBug;
-}
+bool AMDGPUSubtarget::has16bitD16HWBug() const { return Enable16bitD16HWBug; }
 
 // Returns the maximum per-workgroup LDS allocation size (in bytes) that still
 // allows the given function to achieve an occupancy of NWaves waves per
diff --git a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
index 95e45614f..9d329a656 100644
--- a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
@@ -847,7 +847,8 @@ RegInterval WaitcntBrackets::getRegInterval(const MachineInstr *MI,
     Result.second = Result.first + (Size / 16);
 
     if (Size == 16 && Context->ST->has16bitD16HWBug()) {
-      // Regardless of which lo16/hi16 is used, consider the full 32-bit register used.
+      // Regardless of which lo16/hi16 is used, consider the full 32-bit
+      // register used.
       if (AMDGPU::isHi16Reg(MCReg, *TRI))
         Result.first -= 1;
       else

``````````

</details>
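
For readers skimming the reformatted comment in `getRegInterval`, the following is a minimal standalone sketch of the widening it describes: a 16-bit access is tracked as if it touched both halves of its 32-bit VGPR. It is not the code from the PR. The names `RegInterval` (as a plain pair), `intervalFor16BitAccess`, and `demo`, and the slot numbering `2 * vgprIndex + isHi16`, are assumptions made for illustration. The `else` branch is cut off in the diff above, so the sketch assumes it extends the upper bound by one slot.

``````````cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for the pass's RegInterval: a half-open range
// [first, second) of 16-bit register slots.
using RegInterval = std::pair<int, int>;

// Sketch only: compute the tracked interval for a 16-bit VGPR access.
// The slot numbering (2 * vgprIndex + isHi16) is an assumption of this example.
RegInterval intervalFor16BitAccess(int vgprIndex, bool isHi16, bool hasD16Bug) {
  RegInterval result;
  result.first = 2 * vgprIndex + (isHi16 ? 1 : 0);
  result.second = result.first + 1; // a 16-bit access covers one slot
  if (hasD16Bug) {
    if (isHi16)
      result.first -= 1;  // also cover the lo16 half
    else
      result.second += 1; // assumed: also cover the hi16 half
  }
  return result;
}

// With the workaround enabled, both halves of v3 map to the same two slots.
void demo() {
  assert(intervalFor16BitAccess(3, /*isHi16=*/true, /*hasD16Bug=*/true) ==
         intervalFor16BitAccess(3, /*isHi16=*/false, /*hasD16Bug=*/true));
}
``````````

Under these assumptions, either half of a register yields the same two-slot interval when the workaround is enabled, which is the behaviour the PR title describes as tracking waitcnt per vgpr32 rather than per vgpr16.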


https://github.com/llvm/llvm-project/pull/157795