[llvm] [AMDGPU][CodeGen][True16] Track waitcnt of vgpr32 instead of vgpr16 for 16bit reg in GFX11 (PR #157795)

Joe Nash via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 15 07:20:41 PDT 2025


================
@@ -845,6 +845,14 @@ RegInterval WaitcntBrackets::getRegInterval(const MachineInstr *MI,
     assert(Result.first >= 0 && Result.first < SQ_MAX_PGM_VGPRS);
     assert(Size % 16 == 0);
     Result.second = Result.first + (Size / 16);
+
+    if (Size == 16 && Context->ST->has16bitD16HWBug()) {
+      // also update the other half since lo16/hi16 interfere with each other
----------------
Sisyph wrote:

```suggestion
      // Regardless of which half (lo16 or hi16) is used, treat the full 32-bit register as used.
```
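
For illustration only, the widening that the suggested comment describes can be sketched as a standalone model. This is not the code from the PR: `getSlotInterval` and `TrackFull32` are hypothetical names (the flag stands in for `Context->ST->has16bitD16HWBug()`), and the layout where slot `2*N` is the lo16 half and slot `2*N+1` is the hi16 half of VGPR `N` is an assumption made for this sketch.

```cpp
#include <cassert>
#include <utility>

// Half-open interval [first, second) of 16-bit tracking slots.
using RegInterval = std::pair<int, int>;

// Hypothetical standalone model, not the LLVM implementation.
// Assumption: slot 2*N is the lo16 half and slot 2*N+1 is the hi16 half
// of VGPR N. TrackFull32 stands in for the subtarget query in the diff.
RegInterval getSlotInterval(int FirstSlot, int SizeBits, bool TrackFull32) {
  assert(FirstSlot >= 0);
  assert(SizeBits > 0 && SizeBits % 16 == 0);

  // Default: cover exactly the slots the operand occupies.
  RegInterval Result{FirstSlot, FirstSlot + SizeBits / 16};

  if (SizeBits == 16 && TrackFull32) {
    // Regardless of which half is used, cover the full 32-bit register.
    Result.first = FirstSlot & ~1;    // round down to the lo16 slot
    Result.second = Result.first + 2; // one past the hi16 slot
  }
  return Result;
}
```

Under these assumptions, a 16-bit access to either slot 4 or slot 5 yields the interval `[4, 6)` when the flag is set, so a pending write to one half also blocks a read of the other half.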

https://github.com/llvm/llvm-project/pull/157795

