[Mlir-commits] [mlir] [mlir] Let GPU ID bounds work on any FunctionOpInterfaces (PR #95166)

Mehdi Amini <llvmlistbot at llvm.org>
Tue Jun 11 16:31:48 PDT 2024


================
@@ -73,12 +85,16 @@ static std::optional<uint64_t> getKnownLaunchDim(Op op, LaunchDims type) {
       return value.getZExtValue();
   }
 
-  if (auto func = op->template getParentOfType<GPUFuncOp>()) {
+  if (auto func = op->template getParentOfType<FunctionOpInterface>()) {
     switch (type) {
     case LaunchDims::Block:
-      return llvm::transformOptional(func.getKnownBlockSize(dim), zext);
+      return llvm::transformOptional(
+          getKnownLaunchAttr(func, GPUFuncOp::getKnownBlockSizeAttrName(), dim),
+          zext);
     case LaunchDims::Grid:
-      return llvm::transformOptional(func.getKnownGridSize(dim), zext);
+      return llvm::transformOptional(
+          getKnownLaunchAttr(func, GPUFuncOp::getKnownGridSizeAttrName(), dim),
+          zext);
----------------
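To make the diff concrete: the lookup now walks up to the nearest FunctionOpInterface op and reads a named attribute from it, instead of calling a GPUFuncOp-specific getter. A minimal sketch of the resulting IR, assuming the attribute prints as gpu.known_block_size (the actual name is whatever GPUFuncOp::getKnownBlockSizeAttrName() returns) and using made-up function names:

    func.func @kernel_body() attributes {gpu.known_block_size = array<i32: 128, 1, 1>} {
      // The bound on thread_id can now be derived from the enclosing
      // func.func, not only from a gpu.func.
      %tid = gpu.thread_id x
      return
    }
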
joker-eph wrote:

> The problem with forcing a gpu.module, especially since it requires its immediate parent to have gpu.container_module on it, is that it breaks a general multi-target compilation scheme. GPU modules would have to have an extra level of nesting when there isn't a corresponding, say, x86.module.

I didn't follow this part; could you elaborate?

> In summary, insisting on gpu.module for GPU functions breaks other possible abstractions for keeping the code for different devices separate.

Even if this diagnosis is accurate (I haven't followed why it would be?), it's not obvious to me that there is a single answer: that is, can we fix whatever limitations you feel exist around gpu.module to make it usable in such a situation?
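For context, a minimal sketch of the layout the quoted concern appears to describe, with made-up symbol names: gpu.module wants an enclosing module carrying gpu.container_module, so device code sits one nesting level deeper than any host-side peer.

    module attributes {gpu.container_module} {
      gpu.module @device_code {
        gpu.func @kernel() kernel {
          gpu.return
        }
      }
      // Host (e.g. x86) code has no corresponding wrapper module.
      func.func @host_entry() {
        return
      }
    }
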

https://github.com/llvm/llvm-project/pull/95166

