[Mlir-commits] [mlir] Skip address space checks for memrefs between launchOp and kernel func (PR #102925)

Petr Kurapov llvmlistbot at llvm.org
Wed Aug 14 02:32:08 PDT 2024


================
@@ -401,8 +401,29 @@ LogicalResult GPUDialect::verifyOperationAttribute(Operation *op,
              << expectedNumArguments;
 
     auto functionType = kernelGPUFunction.getFunctionType();
+    auto typesMatch = [&](Type launchOpArgType, Type gpuFuncArgType) {
+      auto launchOpMemref = dyn_cast<MemRefType>(launchOpArgType);
+      auto kernelMemref = dyn_cast<MemRefType>(gpuFuncArgType);
+      // Allow address space incompatibility for OpenCL kernels: `gpu.launch`'s
+      // argument memref without address space attribute will match a kernel
+      // function's memref argument with address space `Global`.
+      if (launchOpMemref && kernelMemref) {
+        auto launchAS = llvm::dyn_cast_or_null<gpu::AddressSpaceAttr>(
+            launchOpMemref.getMemorySpace());
+        auto kernelAS = llvm::dyn_cast_or_null<gpu::AddressSpaceAttr>(
+            kernelMemref.getMemorySpace());
+        if (!launchAS && kernelAS &&
+            kernelAS.getValue() == gpu::AddressSpace::Global)
----------------
kurapov-peter wrote:

Thanks for the links!

> According to the OpenCL spec https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_API.html#setting-kernel-arguments

I mean the SLM pointer passed from host would be `null` (and having just one pointer is meaningless anyway since each WG would have its own); this is also in the spec, same section:
> If the argument is declared with the local qualifier, the arg_value entry must be NULL.

We can allow it, but I don't see any practical sense. If a kernel has a parameter marked as local, then the caller should already know about it. In other words, the memory pointer (let's forget that it is NULL for a moment) that is passed to the kernel does not come from a runtime-managed buffer that can migrate between devices. Hence, no such problem should exist for those.

Current definition of the address space enum:
```
def GPU_AddressSpaceEnum : GPU_I32Enum<
  "AddressSpace", "GPU address space", [
    GPU_AddressSpaceGlobal,
    GPU_AddressSpaceWorkgroup,
    GPU_AddressSpacePrivate
  ]>;
```
I suppose for constants it should still have address space global and retrieve constness in the lowering?

P.s. the only reason I'm suggesting to make the check specific at all is that the check can actually be useful I assume, so it seems natural to try and preserve it for all the other cases.


https://github.com/llvm/llvm-project/pull/102925


More information about the Mlir-commits mailing list