[llvm] [AMDGPU][CodeGen][True16] Track waitcnt as vgpr32 instead of vgpr16 for D16 Instructions in GFX11 (PR #157795)
Jay Foad via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 16 00:23:56 PDT 2025
================
@@ -586,6 +586,12 @@ def FeatureRealTrue16Insts : SubtargetFeature<"real-true16",
"Use true 16-bit registers"
>;
+def Feature16bitD16HWBug : SubtargetFeature<"d16-hw-bug",
----------------
jayfoad wrote:
I'd prefer to find a more descriptive name for the feature. The symptom is that _for waitcnt insertion purposes_ you need to treat D16 loads as if they write to a full 32-bit VGPR, right? So maybe something like "D16Writes32BitVgpr" or "D16LoadsWriteFullVgpr"?
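
To illustrate the behaviour such a name would describe, here is a minimal standalone sketch. It is a toy model, not the code in SIInsertWaitcnts: `PendingLoadTracker`, `Half`, and the `D16WritesFullVgpr` flag are hypothetical names invented for this example, and the per-half bookkeeping is only an assumption about how 16-bit tracking might look. With the flag set, a D16 load marks both halves of the destination VGPR as pending, so a later read of either half needs a wait.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical model: each 32-bit VGPR is tracked as two 16-bit halves.
enum class Half : std::uint8_t { Lo = 0, Hi = 1 };

class PendingLoadTracker {
public:
  // D16WritesFullVgpr stands in for the proposed subtarget feature
  // (an assumed name, not an existing LLVM feature).
  explicit PendingLoadTracker(bool D16WritesFullVgpr)
      : D16WritesFullVgpr(D16WritesFullVgpr) {}

  // Record an outstanding D16 load that architecturally writes one half.
  void recordD16Load(unsigned Vgpr, Half Written) {
    if (D16WritesFullVgpr) {
      // Conservative tracking: treat the load as writing all 32 bits.
      Pending.at(Vgpr)[0] = true;
      Pending.at(Vgpr)[1] = true;
    } else {
      Pending.at(Vgpr)[index(Written)] = true;
    }
  }

  // Would reading this half require waiting for an outstanding load?
  bool needsWait(unsigned Vgpr, Half Read) const {
    return Pending.at(Vgpr)[index(Read)];
  }

  // Model a wait that retires every outstanding load.
  void waitForAllLoads() {
    for (auto &Halves : Pending)
      Halves.fill(false);
  }

private:
  static unsigned index(Half H) { return static_cast<unsigned>(H); }

  static constexpr unsigned NumVgprs = 256; // assumed register file size
  bool D16WritesFullVgpr;
  std::array<std::array<bool, 2>, NumVgprs> Pending{};
};
```

In this model, after `recordD16Load(3, Half::Lo)` with the flag set, `needsWait(3, Half::Hi)` returns true even though only the low half was named by the load. With the flag clear, only the low half is reported as pending.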
https://github.com/llvm/llvm-project/pull/157795