[Mlir-commits] [mlir] [MLIR] Improve KernelOutlining to avoid introducing an extra block (PR #90359)
Fabian Mora
llvmlistbot at llvm.org
Sat Apr 27 14:41:27 PDT 2024
================
@@ -241,24 +241,19 @@ static gpu::GPUFuncOp outlineKernelFuncImpl(gpu::LaunchOp launchOp,
map.map(operand.value(), entryBlock.getArgument(operand.index()));
// Clone the region of the gpu.launch operation into the gpu.func operation.
- // TODO: If cloneInto can be modified such that if a mapping for
- // a block exists, that block will be used to clone operations into (at the
- // end of the block), instead of creating a new block, this would be much
- // cleaner.
launchOpBody.cloneInto(&outlinedFuncBody, map);
- // Branch from entry of the gpu.func operation to the block that is cloned
- // from the entry block of the gpu.launch operation.
- Block &launchOpEntry = launchOpBody.front();
- Block *clonedLaunchOpEntry = map.lookup(&launchOpEntry);
- builder.setInsertionPointToEnd(&entryBlock);
- builder.create<cf::BranchOp>(loc, clonedLaunchOpEntry);
-
- outlinedFunc.walk([](gpu::TerminatorOp op) {
- OpBuilder replacer(op);
- replacer.create<gpu::ReturnOp>(op.getLoc());
- op.erase();
- });
+ // Splice now the entry block of the gpu.launch operation at the end of the
+ // gpu.func entry block and erase the redundant block.
+ Block *clonedLaunchOpEntry = map.lookup(&launchOpBody.front());
+ Operation *terminator = clonedLaunchOpEntry->getTerminator();
+ OpBuilder replacer(terminator);
+ replacer.create<gpu::ReturnOp>(terminator->getLoc());
----------------
fabianmcg wrote:
Coincidentally we don't have any outlining tests with multiple blocks.
https://github.com/llvm/llvm-project/pull/90359
More information about the Mlir-commits
mailing list