[clang][CodeGen][SPIR-V][AMDGPU] Tweak AMDGCNSPIRV ABI to allow for the correct handling of aggregates passed to kernels / functions. (PR #102776)

Mon Aug 19 09:31:34 PDT 2024

================
@@ -78,18 +101,52 @@ ABIArgInfo SPIRVABIInfo::classifyKernelArgumentType(QualType Ty) const {
       return ABIArgInfo::getDirect(LTy, 0, nullptr, false);
     }
 
-    // Force copying aggregate type in kernel arguments by value when
-    // compiling CUDA targeting SPIR-V. This is required for the object
-    // copied to be valid on the device.
-    // This behavior follows the CUDA spec
-    // https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#global-function-argument-processing,
-    // and matches the NVPTX implementation.
-    if (isAggregateTypeForABI(Ty))
-      return getNaturalAlignIndirect(Ty, /* byval */ true);
+    if (isAggregateTypeForABI(Ty)) {
+      if (getTarget().getTriple().getVendor() == llvm::Triple::AMD)
+        // TODO: The AMDGPU kernel ABI passes aggregates byref, which is not
+        // currently expressible in SPIR-V; SPIR-V passes aggregates byval,
+        // which the AMDGPU kernel ABI does not allow. Passing aggregates as
+        // direct works around this impedance mismatch, as it retains type info
+        // and can be correctly handled, post reverse-translation, by the AMDGPU
+        // BE, which has to support this CC for legacy OpenCL purposes. It can
+        // be brittle and does lead to performance degradation in certain
+        // pathological cases. This will be revisited / optimised in the future,
+        // once a way to deal with the byref/byval impedance mismatch is
+        // identified.
+        return ABIArgInfo::getDirect(LTy, 0, nullptr, false);
+      else
----------------
arsenm wrote:

No `else` after `return`: the AMD branch already returns, so the `else` can be dropped.
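
For illustration only, here is a minimal, self-contained sketch of the control-flow shape being requested. `Vendor`, `PassKind`, and both function names are hypothetical stand-ins, not Clang APIs. The quoted hunk is cut off at the `else`, so the non-AMD path is assumed here to keep the previous by-value indirect behaviour.

```cpp
#include <cassert>

// Hypothetical stand-ins for the real Clang types; illustrative only.
enum class Vendor { AMD, Other };
enum class PassKind { Direct, IndirectByVal };

// Shape flagged in the review: an `else` following a branch that has
// already returned.
PassKind classifyWithElse(Vendor V) {
  if (V == Vendor::AMD)
    return PassKind::Direct;
  else
    return PassKind::IndirectByVal; // assumed fallback, not shown in the hunk
}

// Shape being asked for: return early, then continue at the outer
// indentation level with no `else`.
PassKind classifyEarlyReturn(Vendor V) {
  if (V == Vendor::AMD)
    return PassKind::Direct;
  return PassKind::IndirectByVal; // assumed fallback, not shown in the hunk
}
```

Both functions return the same result for every input; the second form only removes one level of nesting, which is the point of the LLVM coding-standards rule on `else` after `return`.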

https://github.com/llvm/llvm-project/pull/102776