[PATCH] D156853: [AMDGPU] Add metadata to track preloaded arguments

Mon Sep 4 23:47:12 PDT 2023

kerbowa added a comment.

In D156853#4553034 <https://reviews.llvm.org/D156853#4553034>, @rampitec wrote:

> I have a problem with the 'current FW'. What if an user uses not a current FW? Preloading shall be accompanied with the 1) code reading SGPRs 2) FW writing these SGPRs 3) RT undertstanding all of that. If any of that disagrees things will get broken. From this review request I do not see how is that going to happen in a lockstep.

The compiler is supposed to produce backwards-compatible code. New FW is the only way to reach code that expects preloaded kern args.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerKernelArguments.cpp:181-184
+          llvm::MDNode::get(Ctx, {MDIndex, MDAllocSizeSGPRs}));
+    } else {
+      InPreloadSequence = false;
+    }
----------------
jdoerfert wrote:
> Swap cases. No need for llvm::
Thanks, this condition disappeared in child revisions since we continue here, so no need to flip the cases.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerKernelArguments.cpp:182
+          llvm::MDNode::get(Ctx, {MDIndex, MDAllocSizeSGPRs}));
+    } else {
+      InPreloadSequence = false;
----------------
The metadata communicates which arguments, and how many, should be preloaded. While not strictly needed for correctness since isel should be able to lower any arguments not handled in this pass, it would be nice to have some indication that some amount of preloading has been requested and to not rely entirely on the inreg attribute. If you would rather do away with the metadata entirely I can consider that, but I am someone concerned that inreg may be added when it is not intended to indicate preloading.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerKernelArguments.cpp:195
+    if (Arg.hasInRegAttr() && (InPreloadSequence || !HasPreloadArgs) &&
+        PreloadInfo.allocPreloadArg(AllocSize)) {
+      // Preload this argument.
----------------
arsenm wrote:
> This AllocSize thing isn't sophisticated enough and won't handle aggregates correctly. You need to check the type and number of parts from getNumRegistersForCallingConv and getRegisterTypeForCallingConv
I was planning on ignoring aggregate types for now.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156853/new/

https://reviews.llvm.org/D156853