[llvm] [AMDGPU] Defaults for missing dimensions in SYCL required wg size (PR #68872)

Jakub Chlanda via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 24 01:12:59 PDT 2023


================
@@ -317,10 +317,38 @@ static bool processUse(CallInst *CI, bool IsV5OrAbove) {
   return MadeChange;
 }
 
+// SYCL allows required work-group size attribute to be partially specified
----------------
jchlanda wrote:

> First of all, why would this happen? I would expect all producers of this to agree on one consistent format

The reason why this happens is because a discrepancy in how `reqd_work_group_size` is handled in OpenCL and SYCL. OpenCL mandates that all 3 dimensions are specified, padded by `1` if applicable (6.7.2 Optional Attribute Qualifiers of OpenCL 1.2 spec):

> The optional __attribute__((reqd_work_group_size(X, Y, Z))) is the work-
group size that must be used as the local_work_size argument to clEnqueueNDRangeKernel.
This allows the compiler to optimize the generated code appropriately for this kernel. The
optional __attribute__((reqd_work_group_size(X, Y, Z))), if specified, must
be (1, 1, 1) if the kernel is executed via clEnqueueTask

However, SYCL allows for any of the dimensions to be specified (Table 180 of SYCL 2020 spec):

> reqd_work_group_size(dim0)
> reqd_work_group_size(dim0, dim1)
> reqd_work_group_size(dim0, dim1, dim2)

Furthermore, the SYCL runtime is supposed to throw an exception when a kernel is launched with a number of dimensions that does not match the required work group size dimensionality, which makes padding the dimensions up to 3 very awkward. 

> Second, a random backend pass shouldn't be going out of its way to canonicalize metadata.

I think this is a bit too harsh. It is a lower kernel attributes pass and it handles just that, an attribute. While I agree, that this is not "a fault" of AMD backend, I do think that sanitising the values here is correct, there are many places in the codebase where backends make special provisions for languages that are not in tree (rust for instance). And it solves the problem (admittedly, SYCL's problem) of loosing the dimensionality at the point where it is save to do so. Alternatively, maybe the [verifier](https://github.com/llvm/llvm-project/blob/856bd99aaedb295a4c9f221a8073e4f67c209d3a/llvm/lib/BinaryFormat/AMDGPUMetadataVerifier.cpp#L232) could be used to canonicalize SYCL required work group size?

https://github.com/llvm/llvm-project/pull/68872


More information about the llvm-commits mailing list