[PATCH] D75287: [mlir][GPU] Expose the functionality to create a gpu.GPUFuncOp from a gpu.GPULaunchOp
Mahesh Ravishankar via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 28 15:30:48 PST 2020
mravishankar marked 4 inline comments as done.
mravishankar added inline comments.
================
Comment at: mlir/lib/Dialect/GPU/Transforms/KernelOutlining.cpp:72
+ // Prune those that can be inlined.
+ llvm::SetVector<Value> inlinedOperands;
----------------
mehdi_amini wrote:
> Please don't use "inline" for other aspect that the inliner.
>
> What about sink?
Changed to "sink" and updated all variables names.
================
Comment at: mlir/lib/Dialect/GPU/Transforms/KernelOutlining.cpp:77
+ Operation *operandOp = operand.getDefiningOp();
+ if (!operandOp || !isInliningBeneficiary(operandOp))
+ continue;
----------------
mehdi_amini wrote:
> herhut wrote:
> > ftynse wrote:
> > > mehdi_amini wrote:
> > > > This whole sinking transformation does not seem safe in general: this should check *legality* rather than "benefit".
> > > > Also it isn't clear to me why this is done during the outlining and not as a pre-pass. The launch operation with the region abstraction seems perfectly suited to model this. I rather have this exposed in a separate API / as a separate step.
> > > > This whole sinking transformation does not seem safe in general: this should check *legality* rather than "benefit".
> > >
> > > The function just seems misnamed, should be something like `shouldSink` because it mixes validity and benefit. In practice, it only returns `true` for `constant` and `dim` operations that don't have side effects.
> > > This whole sinking transformation does not seem safe in general: this should check *legality* rather than "benefit".
> >
> > Well, it should check both. You do not want to move all legal operation either :)
> >
> > > Also it isn't clear to me why this is done during the outlining and not as a pre-pass. The launch operation with the region abstraction seems perfectly suited to model this. I rather have this exposed in a separate API / as a separate step
> >
> > This has purely historical reasons. Not long ago, the `gpu.launch` was closed from above, so this transformation was done when moving to function form. I have a separate pass for this in a local client, which I can send out next week. It just needs tests.
> >
> > It was implemented as a "post transformation" to the outlining and I would prefer if we do not mix it into the outlining transformation itself. When written separately, the transformations are trivial.
> > It was implemented as a "post transformation" to the outlining and
>
> Pre-outlining seems easier to manage because region vs inter-procedural (and also can be kept a function pass).
>
> > I would prefer if we do not mix it into the outlining transformation itself. When written separately, the transformations are trivial.
>
> Seems like we're in agreement :)
A pre-pass is fine, but I think it would be better to leave it here. Eventually, it would be good if all transformations can be expressed as a pattern match and rewrite. This "outlining" is essentially converting a gpu.launchOp to a gpu.launchFuncOp. If you need to have a separate pass to sink the instructions, then it breaks the ability of going from loops -> GPU -> NVVM/SPIR-V. I am not saying anybody does this today (not doing this in IREE), but in general it seems like it would be beneficial to have transformations as patterns, and passes as just a light-weight wrapper around patterns.
Re: being able to keep it as a function pass, is related to where the gpu.module is created. As set up right now it is put outside of the function that the gpu.launch operation lives in. Thats a a very specific choice and would be very useful to allow "clients" of the outlining to decide where to put the gpu.module.
================
Comment at: mlir/lib/Dialect/GPU/Transforms/KernelOutlining.cpp:116
+ for (auto *inlinedOp : inlinedOperations)
+ entryBlock.push_back(inlinedOp->clone(map));
----------------
ftynse wrote:
> Will this update the users of the inlinedOps? I don't see the map updated anywhere.
It is done within the clone(map) operation. The results of the operation are added to the map as well.
================
Comment at: mlir/lib/Dialect/GPU/Transforms/KernelOutlining.cpp:119
+ // Clone the rest of the region.
+ for (Block &b : launchOp.body()) {
+ Block *newBlock = map.lookupOrNull(&b);
----------------
ftynse wrote:
> Will this work for blocks whose dominance relation is inverse of their textual order? E.g.
>
> ```
> ^entry:
> br ^bb2:
> ^bb1:
> "use"(%0) : (index) -> ()
> return
> ^bb2:
> %0 = "def"() : () -> (index)
> br ^bb1
> ```
Thanks for pointing this out. I updated the method to use cloneInto which handles this case (see comment on changes to cloneInto. I can make the change here if you think that is reasonable)
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D75287/new/
https://reviews.llvm.org/D75287
More information about the llvm-commits
mailing list