[llvm] [Offload] Guard HSA implicit arguments if they aren't created (PR #133073)

Wed Mar 26 06:07:30 PDT 2025

https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/133073

Summary:
We conditionally allocate the implicit arguments, so they possibly are
null. The flang compiler seems to hit this case, even though it
shouldn't when it's supposed to conform to the HSA code object. For now
guard this to fix the regression and cover a case in the future where
someone rolls a fully custom implementatation.

Fixes: https://github.com/llvm/llvm-project/issues/132982


>From 3d5032535885cca674b77f2b0cf2a899390d83ac Mon Sep 17 00:00:00 2001
From: Joseph Huber <huberjn at outlook.com>
Date: Wed, 26 Mar 2025 08:05:43 -0500
Subject: [PATCH] [Offload] Guard HSA implicit arguments if they aren't created

Summary:
We conditionally allocate the implicit arguments, so they possibly are
null. The flang compiler seems to hit this case, even though it
shouldn't when it's supposed to conform to the HSA code object. For now
guard this to fix the regression and cover a case in the future where
someone rolls a fully custom implementatation.

Fixes: https://github.com/llvm/llvm-project/issues/132982
---
 offload/plugins-nextgen/amdgpu/src/rtl.cpp | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/offload/plugins-nextgen/amdgpu/src/rtl.cpp b/offload/plugins-nextgen/amdgpu/src/rtl.cpp
index b2ede888b542d..097a324aa04a1 100644
--- a/offload/plugins-nextgen/amdgpu/src/rtl.cpp
+++ b/offload/plugins-nextgen/amdgpu/src/rtl.cpp
@@ -3386,16 +3386,18 @@ Error AMDGPUKernelTy::launchImpl(GenericDeviceTy &GenericDevice,
     return Err;
 
   // Set the COV5+ implicit arguments to the appropriate values.
-  ImplArgs->BlockCountX = NumBlocks[0];
-  ImplArgs->BlockCountY = NumBlocks[1];
-  ImplArgs->BlockCountZ = NumBlocks[2];
-  ImplArgs->GroupSizeX = NumThreads[0];
-  ImplArgs->GroupSizeY = NumThreads[1];
-  ImplArgs->GroupSizeZ = NumThreads[2];
-  ImplArgs->GridDims = NumBlocks[2] * NumThreads[2] > 1
-                           ? 3
-                           : 1 + (NumBlocks[1] * NumThreads[1] != 1);
-  ImplArgs->DynamicLdsSize = KernelArgs.DynCGroupMem;
+  if (ImplArgs) {
+    ImplArgs->BlockCountX = NumBlocks[0];
+    ImplArgs->BlockCountY = NumBlocks[1];
+    ImplArgs->BlockCountZ = NumBlocks[2];
+    ImplArgs->GroupSizeX = NumThreads[0];
+    ImplArgs->GroupSizeY = NumThreads[1];
+    ImplArgs->GroupSizeZ = NumThreads[2];
+    ImplArgs->GridDims = NumBlocks[2] * NumThreads[2] > 1
+                             ? 3
+                             : 1 + (NumBlocks[1] * NumThreads[1] != 1);
+    ImplArgs->DynamicLdsSize = KernelArgs.DynCGroupMem;
+  }
 
   // Push the kernel launch into the stream.
   return Stream->pushKernelLaunch(*this, AllArgs, NumThreads, NumBlocks,