[llvm] [AMDGPU][CodeGen][True16] Track waitcnt as vgpr32 instead of vgpr16 for D16 Instructions in GFX11 (PR #157795)

Jay Foad via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 16 00:23:56 PDT 2025


================
@@ -586,6 +586,12 @@ def FeatureRealTrue16Insts : SubtargetFeature<"real-true16",
   "Use true 16-bit registers"
 >;
 
+def Feature16bitD16HWBug : SubtargetFeature<"d16-hw-bug",
----------------
jayfoad wrote:

I'd prefer to find a more descriptive name for the feature. The symptom is that _for waitcnt insertion purposes_ you need to treat D16 loads as if they write to a full 32-bit VGPR, right? So maybe something like "D16Writes32BitVgpr" or "D16LoadsWriteFullVgpr"?
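
To illustrate the distinction those names are meant to capture, here is a minimal toy sketch in C++. It is not taken from the PR or from LLVM's actual waitcnt insertion code. `ToyScoreboard`, `recordD16Load`, `needsWait`, and the `D16LoadsWriteFullVgpr` flag are all hypothetical names, and the model assumes each 32-bit VGPR is tracked as two independent 16-bit halves.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical toy model, not LLVM code. Each 32-bit VGPR N is modelled as
// two 16-bit halves: slot 2*N is the low half, slot 2*N+1 is the high half.
struct ToyScoreboard {
  static constexpr std::size_t NumVgprs = 256;

  // Halves that still have an outstanding (not yet waited-for) load.
  std::bitset<2 * NumVgprs> Pending;

  // Hypothetical flag standing in for the proposed subtarget feature.
  bool D16LoadsWriteFullVgpr = false;

  // Record a D16 load whose destination is one half of VGPR `Vgpr`.
  void recordD16Load(std::size_t Vgpr, bool HighHalf) {
    assert(Vgpr < NumVgprs && "VGPR index out of range");
    if (D16LoadsWriteFullVgpr) {
      // Conservative tracking: treat the load as writing the whole VGPR.
      Pending.set(2 * Vgpr);
      Pending.set(2 * Vgpr + 1);
    } else {
      // Precise tracking: only the destination half is pending.
      Pending.set(2 * Vgpr + (HighHalf ? 1 : 0));
    }
  }

  // Would a use of this half need a wait before reading it?
  bool needsWait(std::size_t Vgpr, bool HighHalf) const {
    assert(Vgpr < NumVgprs && "VGPR index out of range");
    return Pending.test(2 * Vgpr + (HighHalf ? 1 : 0));
  }
};
```

With the flag set, a load into the low half of a VGPR also makes a later use of the high half wait. The flag changes nothing else, which is why a name describing that effect may read more clearly than a generic "hw-bug" label.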

https://github.com/llvm/llvm-project/pull/157795

