[PATCH] D75287: [mlir][GPU] Expose the functionality to create a gpu.GPUFuncOp from a gpu.GPULaunchOp

Mahesh Ravishankar via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 28 15:30:48 PST 2020


mravishankar marked 4 inline comments as done.
mravishankar added inline comments.


================
Comment at: mlir/lib/Dialect/GPU/Transforms/KernelOutlining.cpp:72
 
+  // Prune those that can be inlined.
+  llvm::SetVector<Value> inlinedOperands;
----------------
mehdi_amini wrote:
> Please don't use "inline" for other aspect that the inliner. 
> 
> What about sink?
Changed to "sink" and updated all variables names.


================
Comment at: mlir/lib/Dialect/GPU/Transforms/KernelOutlining.cpp:77
+    Operation *operandOp = operand.getDefiningOp();
+    if (!operandOp || !isInliningBeneficiary(operandOp))
+      continue;
----------------
mehdi_amini wrote:
> herhut wrote:
> > ftynse wrote:
> > > mehdi_amini wrote:
> > > > This whole sinking transformation does not seem safe in general: this should check *legality* rather than "benefit". 
> > > > Also it isn't clear to me why this is done during the outlining and not as a pre-pass. The launch operation with the region abstraction seems perfectly suited to model this. I rather have this exposed in a separate API / as a separate step.
> > > > This whole sinking transformation does not seem safe in general: this should check *legality* rather than "benefit".
> > > 
> > > The function just seems misnamed, should be something like `shouldSink` because it mixes validity and benefit. In practice, it only returns `true` for `constant` and `dim` operations that don't have side effects.
> > > This whole sinking transformation does not seem safe in general: this should check *legality* rather than "benefit".
> > 
> > Well, it should check both. You do not want to move all legal operation either :)
> > 
> > > Also it isn't clear to me why this is done during the outlining and not as a pre-pass. The launch operation with the region abstraction seems perfectly suited to model this. I rather have this exposed in a separate API / as a separate step
> > 
> > This has purely historical reasons. Not long ago, the `gpu.launch` was closed from above, so this transformation was done when moving to function form. I have a separate pass for this in a local client, which I can send out next week. It just needs tests.
> > 
> > It was implemented as a "post transformation" to the outlining and I would prefer if we do not mix it into the outlining transformation itself. When written separately, the transformations are trivial.
> > It was implemented as a "post transformation" to the outlining and
> 
> Pre-outlining seems easier to manage because region vs inter-procedural (and also can be kept a function pass). 
> 
> > I would prefer if we do not mix it into the outlining transformation itself. When written separately, the transformations are trivial.
> 
> Seems like we're in agreement :)
A pre-pass is fine, but I think it would be better to leave it here. Eventually, it would be good if all transformations can be expressed as a pattern match and rewrite. This "outlining" is essentially converting a gpu.launchOp to a gpu.launchFuncOp. If you need to have a separate pass to sink the instructions, then it breaks the ability of going from loops -> GPU -> NVVM/SPIR-V. I am not saying anybody does this today (not doing this in IREE), but in general it seems like it would be beneficial to have transformations as patterns, and passes as just a light-weight wrapper around patterns.

Re: being able to keep it as a function pass, is related to where the gpu.module is created. As set up right now it is put outside of the function that the gpu.launch operation lives in. Thats a a very specific choice and would be very useful to allow "clients" of the outlining to decide where to put the gpu.module. 


================
Comment at: mlir/lib/Dialect/GPU/Transforms/KernelOutlining.cpp:116
+  for (auto *inlinedOp : inlinedOperations)
+    entryBlock.push_back(inlinedOp->clone(map));
 
----------------
ftynse wrote:
> Will this update the users of the inlinedOps? I don't see the map updated anywhere.
It is done within the clone(map) operation. The results of the operation are added to the map as well.


================
Comment at: mlir/lib/Dialect/GPU/Transforms/KernelOutlining.cpp:119
+  // Clone the rest of the region.
+  for (Block &b : launchOp.body()) {
+    Block *newBlock = map.lookupOrNull(&b);
----------------
ftynse wrote:
> Will this work for blocks whose dominance relation is inverse of their textual order?  E.g. 
> 
> ```
> ^entry:
>   br ^bb2:
> ^bb1:
>   "use"(%0) : (index) -> ()
>   return
> ^bb2:
>   %0 = "def"() : () -> (index)
>   br ^bb1
> ```
Thanks for pointing this out. I updated the method to use cloneInto which handles this case (see comment on changes to cloneInto. I can make the change here if you think that is reasonable)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75287/new/

https://reviews.llvm.org/D75287





More information about the llvm-commits mailing list