[clang] [clang][CodeGen][SPIR-V][AMDGPU] Tweak AMDGCNSPIRV ABI to allow for the correct handling of aggregates passed to kernels / functions. (PR #102776)
Matt Arsenault via cfe-commits
cfe-commits at lists.llvm.org
Mon Aug 19 09:31:34 PDT 2024
================
@@ -78,18 +101,52 @@ ABIArgInfo SPIRVABIInfo::classifyKernelArgumentType(QualType Ty) const {
return ABIArgInfo::getDirect(LTy, 0, nullptr, false);
}
- // Force copying aggregate type in kernel arguments by value when
- // compiling CUDA targeting SPIR-V. This is required for the object
- // copied to be valid on the device.
- // This behavior follows the CUDA spec
- // https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#global-function-argument-processing,
- // and matches the NVPTX implementation.
- if (isAggregateTypeForABI(Ty))
- return getNaturalAlignIndirect(Ty, /* byval */ true);
+ if (isAggregateTypeForABI(Ty)) {
+ if (getTarget().getTriple().getVendor() == llvm::Triple::AMD)
+ // TODO: The AMDGPU kernel ABI passes aggregates byref, which is not
+ // currently expressible in SPIR-V; SPIR-V passes aggregates byval,
+ // which the AMDGPU kernel ABI does not allow. Passing aggregates as
+ // direct works around this impedance mismatch, as it retains type info
+ // and can be correctly handled, post reverse-translation, by the AMDGPU
+ // BE, which has to support this CC for legacy OpenCL purposes. It can
+ // be brittle and does lead to performance degradation in certain
+ // pathological cases. This will be revisited / optimised in the future,
+ // once a way to deal with the byref/byval impedance mismatch is
+ // identified.
+ return ABIArgInfo::getDirect(LTy, 0, nullptr, false);
+ else
----------------
arsenm wrote:
No else after return: the `if` branch already returns, so the `else` is redundant.
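For context, this follows the LLVM Coding Standards guideline "Don't use else after a return". A minimal sketch of the two shapes is below. The `Vendor` and `PassKind` enums and both function names are hypothetical stand-ins, not Clang APIs. The body of the `else` is cut off in the quoted hunk, so the sketch assumes it falls back to the previous byval behaviour, and the non-aggregate result is a placeholder.

```cpp
// Hypothetical stand-ins for the real Clang types; illustrative only.
enum class Vendor { AMD, Other };
enum class PassKind { Direct, IndirectByVal };

// Shape under review: an `else` after a branch that has already returned.
PassKind classifyWithElse(bool IsAggregate, Vendor V) {
  if (IsAggregate) {
    if (V == Vendor::AMD)
      return PassKind::Direct;
    else
      return PassKind::IndirectByVal; // assumed fallback; not shown in the hunk
  }
  return PassKind::Direct; // placeholder for the non-aggregate path
}

// Shape requested: drop the `else`; control only reaches the second
// return when the vendor check fails, so behaviour is unchanged.
PassKind classifyEarlyReturn(bool IsAggregate, Vendor V) {
  if (IsAggregate) {
    if (V == Vendor::AMD)
      return PassKind::Direct;
    return PassKind::IndirectByVal; // assumed fallback; not shown in the hunk
  }
  return PassKind::Direct; // placeholder for the non-aggregate path
}
```

Both functions return the same result for every input; the second just keeps the fallback at the same indentation as the vendor check.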
https://github.com/llvm/llvm-project/pull/102776