[Mlir-commits] [mlir] [mlir] Let GPU ID bounds work on any FunctionOpInterfaces (PR #95166)
Mehdi Amini
llvmlistbot at llvm.org
Tue Jun 11 15:43:56 PDT 2024
================
@@ -73,12 +85,16 @@ static std::optional<uint64_t> getKnownLaunchDim(Op op, LaunchDims type) {
return value.getZExtValue();
}
- if (auto func = op->template getParentOfType<GPUFuncOp>()) {
+ if (auto func = op->template getParentOfType<FunctionOpInterface>()) {
switch (type) {
case LaunchDims::Block:
- return llvm::transformOptional(func.getKnownBlockSize(dim), zext);
+ return llvm::transformOptional(
+ getKnownLaunchAttr(func, GPUFuncOp::getKnownBlockSizeAttrName(), dim),
+ zext);
case LaunchDims::Grid:
- return llvm::transformOptional(func.getKnownGridSize(dim), zext);
+ return llvm::transformOptional(
+ getKnownLaunchAttr(func, GPUFuncOp::getKnownGridSizeAttrName(), dim),
+ zext);
----------------
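A minimal sketch of what the `getKnownLaunchAttr` helper called in this hunk could look like. This is an illustrative reconstruction, not the PR's actual code, under the assumption that the bound is stored on the surrounding function as a `DenseI32ArrayAttr` of per-dimension values (the shape `gpu.func` uses for its known block/grid sizes):

```cpp
#include <cstdint>
#include <optional>

#include "mlir/IR/BuiltinAttributes.h"
#include "mlir/Interfaces/FunctionInterfaces.h"

using namespace mlir;

// Illustrative reconstruction: look up a per-dimension launch bound
// (e.g. the known block size) stored as a DenseI32ArrayAttr on the
// function enclosing the ID op.
static std::optional<uint32_t>
getKnownLaunchAttr(FunctionOpInterface func, StringRef attrName,
                   unsigned dim) {
  auto bounds = func->getAttrOfType<DenseI32ArrayAttr>(attrName);
  if (!bounds)
    return std::nullopt;
  ArrayRef<int32_t> values = bounds.asArrayRef();
  if (dim >= values.size())
    return std::nullopt;
  return static_cast<uint32_t>(values[dim]);
}
```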
joker-eph wrote:
> One other note is that the GPU dialect already has two official, somewhat independent purposes:
Yes! This is something we should document better; it's "almost" two separate dialects (let's say "sub-dialects" ;))
> and I'm pretty sure you're allowed to just use one half and replace the other half with your own custom arrangement for that same task and have everything basically work (if slightly less seamlessly).
That seems like a bug to me :)
If we want `launch_func` to accept other implementations of a kernel, then I would argue we should have a `KernelEntryPointOpInterface` (right now the verifier just checks for `FunctionOpInterface`, which does not seem like the right check for something that will run as a kernel, IMO). Anyway, that does not invalidate your point.
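As a rough illustration of what that would buy us, here is a sketch; `KernelEntryPointOpInterface` is hypothetical (it only exists as the proposal above), so this does not compile against upstream MLIR:

```cpp
#include "mlir/IR/Operation.h"
#include "mlir/Support/LogicalResult.h"

using namespace mlir;

// Hypothetical: KernelEntryPointOpInterface is the interface proposed
// above, not an existing upstream interface. With it, the gpu.launch_func
// verifier could require an actual kernel entry point instead of
// accepting any FunctionOpInterface implementation.
static LogicalResult verifyLaunchFuncCallee(Operation *callee) {
  // gpu.func would implement the interface; a downstream dialect's kernel
  // op could opt in the same way, while a plain func.func would be rejected.
  if (!isa<KernelEntryPointOpInterface>(callee))
    return callee->emitOpError(
        "expected callee to implement KernelEntryPointOpInterface");
  return success();
}
```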
> My claim is that operations like gpu.thread_id and gpu.shuffle are in a third class: abstractions around platform-specific GPU intrinsics. That is, these are operations meant to abstract across what are almost inevitably platform-specific intrinsics, allowing people to write code generation schemes that can target "a GPU" (though somewhere in their context they're likely to know which).
That seems fair, thanks for elaborating on this.
Can we start by working on a documentation update for the dialect that describes these 3 categories?
1) Host code manipulation and offloading support
2) "Container" objects (gpu.module, gpu.func)
3) GPU-specific operations (to be mixed with arith and other dialects)
> If we forced all uses of basic GPU intrinsics into a gpu.module, then you'd have a hard time defining
> std::optional<something::TargetInfoAttr> getTargetInfo(mlir::FunctionOpInterface func);
I don't follow this part; I am concerned that you're saying we **can't** use a gpu.module, when that is where the GPU target info should live?
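To make that concrete, here is a sketch of how the quoted lookup could be written when the function *is* nested in a gpu.module, reading the module's optional `targets` array; returning the first entry is an arbitrary simplification for the example, not a proposed API:

```cpp
#include <optional>

#include "mlir/Dialect/GPU/IR/GPUDialect.h"
#include "mlir/Interfaces/FunctionInterfaces.h"

using namespace mlir;

// Sketch only: illustrates where the target info can be found when the
// function sits inside a gpu.module.
static std::optional<Attribute> getTargetInfo(FunctionOpInterface func) {
  // Inside a gpu.module, the module's `targets` attribute (if present)
  // carries the platform descriptions, e.g. #nvvm.target or #rocdl.target.
  if (auto module = func->getParentOfType<gpu::GPUModuleOp>())
    if (ArrayAttr targets = module.getTargetsAttr())
      if (!targets.empty())
        return targets[0];
  // For a function outside any gpu.module there is no standard place to
  // look, which is the tension the quoted comment describes.
  return std::nullopt;
}
```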
https://github.com/llvm/llvm-project/pull/95166