[mlir] [MLIR][Linalg] Add aggregate ops decomposition pass and softmax decom… (PR #97582)
Andrzej Warzyński
llvmlistbot at llvm.org
Tue Jul 9 01:44:52 PDT 2024
banach-space wrote:
> The implementation ends up generating semantically the same code as the previous decomposition implementation (with minor deviation as you noted).
IMO this is key - please add that to the summary.
>
> > Also, what's the difference between "transform-op-decompose.mlir" and "decompose-named-ops.mlir"? Is it just "how" the decomposition is "driven"? (TD vs Pass) Couldn't that be one test instead?
>
> The first one tests the new pass. The second one uses the transform interpreter and the decompose_interface op (which happen to partially rely on the same code now).
From what I can tell, both tests verify the decomposition of `linalg.softmax`:
```mlir
func.func @softmax(%arg0: tensor<2x16x32xf32>, %dst: tensor<2x16x32xf32>) -> tensor<2x16x32xf32> {
%1 = linalg.softmax dimension(2) ins(%arg0 : tensor<2x16x32xf32>) outs(%dst: tensor<2x16x32xf32>) -> tensor<2x16x32xf32>
return %1 : tensor<2x16x32xf32>
}
```
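For context, the expansion that both tests check should follow the op's description. A hedged outline of that decomposition (illustrative names and ops, not the exact IR in either test):

```mlir
// Numerically stable softmax along dimension(2), sketched as pseudo-ops:
//   %max     = reduce<maxnumf>(%arg0, dim = 2)   // per-row running max
//   %shifted = %arg0 - broadcast(%max)           // shift for stability
//   %exp     = exp(%shifted)
//   %sum     = reduce<addf>(%exp, dim = 2)       // normalization factor
//   %result  = %exp / broadcast(%sum)
// The actual decomposition materializes these steps as linalg.generic ops.
```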
Couldn't we re-use the input and the `CHECK` lines to avoid duplication?
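For the transform-interpreter variant, the driver would be a small transform script along these lines (a hedged sketch; the exact signature of the `decompose_interface` op mentioned above is assumed):

```mlir
// Sketch: matching linalg.softmax and decomposing it via the
// transform interpreter rather than the new pass.
module attributes {transform.with_named_sequence} {
  transform.named_sequence @__transform_main(
      %module: !transform.any_op {transform.readonly}) {
    %softmax = transform.structured.match ops{["linalg.softmax"]} in %module
        : (!transform.any_op) -> !transform.any_op
    // Assumed result type; the op may return multiple handles.
    %decomposed = transform.structured.decompose_interface %softmax
        : (!transform.any_op) -> !transform.any_op
    transform.yield
  }
}
```

If both tests share this input and the same expected expansion, the `CHECK` lines could plausibly live in one file with two `RUN` lines.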
>
> > If this claims alignment with e.g. PyTorch (or, more specifically, torch-mlir), shouldn't there be a link to torch-mlir docs/issues/code/test instead?
>
> The IR presented is the IR you get by lowering PyTorch to torch-mlir.
I know where the IR is coming from.
FWIW (without links to documentation), that IR is just an implementation detail of torch-mlir. In this PR we are discussing an implementation detail of Linalg. Are you saying that the implementation in Linalg should match torch-mlir? Why? What if the implementation in torch-mlir changes?
I'm trying to understand the motivation here and the overall design that we are converging towards.
>
> > Are you saying that this PR is changing the semantics of softmax in Linalg?
>
> I'd say it sets it. The implementation follows the op description, so there's no real 'change'.
>
This is key info; please highlight it in the summary.
> > > now aggregate ops are decomposed first and then generalized
> >
> >
> > I am a bit confused, there's `-linalg-decompose-named-ops` and `-linalg-generalize-named-ops` - which one are you referring to?
>
> Decomposition is performed by the newly introduced `-linalg-decompose-named-ops` (as the name suggests). Generalization is done by the default `-linalg-generalize-named-ops`.
I know what the options are, thanks. To me, your comment implies that `-linalg-decompose-named-ops` is meant to be followed by `-linalg-generalize-named-ops` ("aggregate ops are _decomposed_ first and then _generalized_"). Is that what you had in mind?
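If that pipeline is the intended usage, a test RUN line combining the two passes might look like this (hypothetical invocation, assuming the pass names discussed above):

```mlir
// RUN: mlir-opt %s -linalg-decompose-named-ops -linalg-generalize-named-ops \
// RUN:   | FileCheck %s
```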
https://github.com/llvm/llvm-project/pull/97582