[Mlir-commits] [mlir] [MLIR] Improve KernelOutlining to avoid introducing an extra block (PR #90359)

Mehdi Amini llvmlistbot at llvm.org
Sun Apr 28 06:17:18 PDT 2024


================
@@ -241,24 +241,19 @@ static gpu::GPUFuncOp outlineKernelFuncImpl(gpu::LaunchOp launchOp,
     map.map(operand.value(), entryBlock.getArgument(operand.index()));
 
   // Clone the region of the gpu.launch operation into the gpu.func operation.
-  // TODO: If cloneInto can be modified such that if a mapping for
-  // a block exists, that block will be used to clone operations into (at the
-  // end of the block), instead of creating a new block, this would be much
-  // cleaner.
   launchOpBody.cloneInto(&outlinedFuncBody, map);
 
-  // Branch from entry of the gpu.func operation to the block that is cloned
-  // from the entry block of the gpu.launch operation.
-  Block &launchOpEntry = launchOpBody.front();
-  Block *clonedLaunchOpEntry = map.lookup(&launchOpEntry);
-  builder.setInsertionPointToEnd(&entryBlock);
-  builder.create<cf::BranchOp>(loc, clonedLaunchOpEntry);
-
-  outlinedFunc.walk([](gpu::TerminatorOp op) {
-    OpBuilder replacer(op);
-    replacer.create<gpu::ReturnOp>(op.getLoc());
-    op.erase();
-  });
+  // Splice now the entry block of the gpu.launch operation at the end of the
+  // gpu.func entry block and erase the redundant block.
+  Block *clonedLaunchOpEntry = map.lookup(&launchOpBody.front());
+  Operation *terminator = clonedLaunchOpEntry->getTerminator();
+  OpBuilder replacer(terminator);
+  replacer.create<gpu::ReturnOp>(terminator->getLoc());
----------------
joker-eph wrote:

Added a test. I am not sure why we should just arbitrarily restrict it? The body of the gpu.launch is like a function body to me.

https://github.com/llvm/llvm-project/pull/90359


More information about the Mlir-commits mailing list