[PATCH] D75287: [mlir][GPU] Expose the functionality to create a gpu.GPUFuncOp from a gpu.GPULaunchOp

Mehdi AMINI via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 28 22:22:27 PST 2020


mehdi_amini requested changes to this revision.
mehdi_amini added inline comments.
This revision now requires changes to proceed.


================
Comment at: mlir/lib/Dialect/GPU/Transforms/KernelOutlining.cpp:77
+    Operation *operandOp = operand.getDefiningOp();
+    if (!operandOp || !isInliningBeneficiary(operandOp))
+      continue;
----------------
mravishankar wrote:
> mehdi_amini wrote:
> > herhut wrote:
> > > ftynse wrote:
> > > > mehdi_amini wrote:
> > > > > This whole sinking transformation does not seem safe in general: this should check *legality* rather than "benefit". 
> > > > > Also it isn't clear to me why this is done during the outlining and not as a pre-pass. The launch operation with the region abstraction seems perfectly suited to model this. I'd rather have this exposed in a separate API / as a separate step.
> > > > > This whole sinking transformation does not seem safe in general: this should check *legality* rather than "benefit".
> > > > 
> > > > The function just seems misnamed; it should be something like `shouldSink`, because it mixes validity and benefit. In practice, it only returns `true` for `constant` and `dim` operations that don't have side effects.
> > > > This whole sinking transformation does not seem safe in general: this should check *legality* rather than "benefit".
> > > 
> > > Well, it should check both. You do not want to move all legal operations either :)
> > > 
> > > > Also it isn't clear to me why this is done during the outlining and not as a pre-pass. The launch operation with the region abstraction seems perfectly suited to model this. I'd rather have this exposed in a separate API / as a separate step
> > > 
> > > This has purely historical reasons. Not long ago, the `gpu.launch` was closed from above, so this transformation was done when moving to function form. I have a separate pass for this in a local client, which I can send out next week. It just needs tests.
> > > 
> > > It was implemented as a "post transformation" after the outlining, and I would prefer that we not mix it into the outlining transformation itself. When written separately, the transformations are trivial.
> > > It was implemented as a "post transformation" after the outlining, and
> > 
> > Pre-outlining seems easier to manage because it works on the region rather than inter-procedurally (and it can also be kept a function pass).
> > 
> > > I would prefer that we not mix it into the outlining transformation itself. When written separately, the transformations are trivial.
> > 
> > Seems like we're in agreement :)
> A pre-pass is fine, but I think it would be better to leave it here. Eventually, it would be good if all transformations could be expressed as a pattern match and rewrite. This "outlining" is essentially converting a `gpu.launch` op to a `gpu.launch_func` op. If you need a separate pass to sink the instructions, then it breaks the ability to go from loops -> GPU -> NVVM/SPIR-V. I am not saying anybody does this today (we are not doing this in IREE), but in general it seems like it would be beneficial to have transformations as patterns, and passes as just lightweight wrappers around patterns.
> 
> Re: being able to keep it as a function pass: that is related to where the gpu.module is created. As set up right now, it is put outside of the function that the gpu.launch operation lives in. That's a very specific choice, and it would be very useful to allow "clients" of the outlining to decide where to put the gpu.module.
> If you need a separate pass to sink the instructions, then it breaks the ability to go from loops -> GPU -> NVVM/SPIR-V.

I don't understand what you mean; can you elaborate?

> but in general it seems like it would be beneficial to have transformations as patterns, and passes as just lightweight wrappers around patterns.

This is mixing an optimization into an unrelated transformation: it just does not belong here IMO.

> Re: being able to keep it as a function pass: that is related to where the gpu.module is created.

I don't know what you mean, or how it answers the point about the function pass.
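
For concreteness, here is a rough sketch of what the separate sinking pre-pass discussed above could look like. This is illustrative only, not the code in this patch: `shouldSink` and `sinkIntoLaunch` are made-up names, and accessors such as `body()` or the header paths may be spelled differently depending on the MLIR revision.

  #include "llvm/ADT/STLExtras.h"
  #include "llvm/ADT/SetVector.h"
  #include "mlir/Dialect/GPU/GPUDialect.h"
  #include "mlir/Dialect/StandardOps/IR/Ops.h"
  #include "mlir/IR/Builders.h"
  #include "mlir/Transforms/RegionUtils.h"

  using namespace mlir;

  // Mixes legality (side-effect free, safe to duplicate) with benefit (the
  // result no longer needs to be captured by the region and later passed as
  // a kernel argument) -- the two criteria discussed above.
  static bool shouldSink(Operation *op) {
    return isa<ConstantOp>(op) || isa<DimOp>(op);
  }

  // Clone qualifying defining ops to the top of the gpu.launch body and
  // redirect only the uses nested inside the launch; uses outside the
  // region keep the original value.
  static void sinkIntoLaunch(gpu::LaunchOp launch) {
    Region &body = launch.body();
    OpBuilder builder(&body.front(), body.front().begin());

    llvm::SetVector<Value> valuesDefinedAbove;
    getUsedValuesDefinedAbove(body, valuesDefinedAbove);
    for (Value value : valuesDefinedAbove) {
      Operation *def = value.getDefiningOp();
      if (!def || !shouldSink(def))
        continue;
      Operation *clone = builder.clone(*def);
      for (unsigned i = 0, e = def->getNumResults(); i != e; ++i) {
        Value oldResult = def->getResult(i);
        Value newResult = clone->getResult(i);
        for (OpOperand &use :
             llvm::make_early_inc_range(oldResult.getUses()))
          if (launch.getOperation()->isProperAncestor(use.getOwner()))
            use.set(newResult);
      }
    }
  }

A pass (or a pattern) would then just walk the module and call this on every gpu.launch before outlining, which keeps the outlining transformation itself purely mechanical.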


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75287/new/

https://reviews.llvm.org/D75287




