[Mlir-commits] [mlir] [mlir] Expose linalg vectorization without replacement (PR #144158)
Han-Chung Wang
llvmlistbot at llvm.org
Tue Jun 17 18:08:17 PDT 2025
https://github.com/hanhanW requested changes to this pull request.
TLDR: @Max191 I think it is better to return `std::optional<VectorizationResult>` like other transform methods, but let's see how the discussion goes before you make other changes.
---
I just came back from vacation and I'm still catching up on the threads. If I read it correctly, there are a few concerns:
1. The expectation of the return values from the `vectorize()` method.
2. Test coverage for the new code.
3. Test coverage between upstream and downstream projects.
4. The longer-term trajectory for vectorization.
(1), (2), and (3) are all important, as others could follow bad PRs in the future, and we should keep the review bar as consistent as we can. I need to chat with @Max191 about downstream changes, but I can share my thoughts, which are orthogonal to the downstream implementation details.
RE (1): I've been thinking about this since I saw other PRs about refactoring "patterns" and "transforms". TLDR: I'm +1 on returning the transformed IR from these methods. The line between patterns and functions, to me, is whether the function returns the final results. People can use functions to build patterns, either by using a single function or by assembling pieces from multiple functions. This happens not only in the linalg dialect but also in other dialects. Since we expose the `vectorize` method today, I'd expect it to return `std::optional<VectorizationResult>`. We have such structures for other linalg transforms, e.g., tiling, promotion, etc., but not for `vectorize()`. I think we had the same intention for vectorization, but it got lost at some point. Thus, I took a stab at digging into the history.
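For concreteness, a minimal sketch of what such a value-returning API could look like (the struct name and field are illustrative assumptions, not the exact API in this PR):
```cpp
/// Hypothetical sketch, mirroring the *Result style used by tiling and
/// promotion; names and fields are illustrative, not the actual upstream API.
struct VectorizationResult {
  /// Values that should replace the results of the original op.
  SmallVector<Value> replacements;
};
std::optional<VectorizationResult> vectorize(RewriterBase &rewriter,
                                             Operation *op);
```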
When we lived in the very old world, we developed most transformations using the pattern-rewriting framework with linalg transform filters (see the [RFC](https://discourse.llvm.org/t/psa-retire-linalg-filter-based-patterns/63785) for the retirement plan), and that included tiling, promotion, vectorization, etc. We spent a lot of time moving the core logic into standalone C++ functions and reusing them everywhere, including in transform ops. Tiling is a good example to follow. Here is the [oldest commit](https://github.com/llvm/llvm-project/commit/307cfdf5338641e3a895857ef02dc9da35cd0eb6) that shows the change. It returned `void` at first because it was only the first step.
At some point, we had the `std::optional<*Result>` style, like TilingResult, TileAndFuseResult, etc.:
```cpp
/// Emit a suitable vector form for a Linalg op with fully static shape.
struct VectorizedLinalgOp {
  SmallVector<Value> tensorResults;
  VectorizedLinalgOp &operator=(const VectorizedLinalgOp &) = default;
};
Optional<VectorizedLinalgOp> vectorizeLinalgOp(OpBuilder &builder,
                                               Operation *op);
```
Later, a contributor refactored the handling of the result into an output function argument for some reason: https://github.com/llvm/llvm-project/commit/c1a4cd551f1c577008c33d78972929ba6593efcc and it became
```cpp
LogicalResult vectorizeLinalgOp(OpBuilder &builder, Operation *op,
                                SmallVectorImpl<Value> &newResults);
```
We still tracked the return values until [another refactoring](https://reviews.llvm.org/D116665). As you can see in that change, all the other transformations return the transformed op except the `vectorize()` method. I can't find any comments about it in the review; it may be because we wanted to implement a better vectorizer, so we were being conservative. Or maybe it is just because it looked straightforward in the change itself, e.g., it's easier because you don't need to introduce a new struct, and it'd be very similar to the deprecated one. I don't know.
However, if we get back to the line between patterns and functions, I think we should return the transformed IR from the functions; it is the pattern's responsibility to handle the return value and perform any further transformation within its scope.
Note: other transforms like `generalizeNamedOp` perform the replacement and return the new ops. I think the vectorize method can follow the same API style.
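To illustrate that split in responsibilities, here is a minimal sketch of a pattern built on top of the hypothetical value-returning `vectorize()` sketched above; the pattern, not the function, performs the replacement:
```cpp
/// Illustrative only: assumes the hypothetical std::optional-returning
/// vectorize() and VectorizationResult from the sketch above.
struct VectorizeLinalgOpPattern
    : public OpInterfaceRewritePattern<linalg::LinalgOp> {
  using OpInterfaceRewritePattern::OpInterfaceRewritePattern;

  LogicalResult matchAndRewrite(linalg::LinalgOp op,
                                PatternRewriter &rewriter) const override {
    std::optional<VectorizationResult> result = vectorize(rewriter, op);
    if (!result)
      return failure();
    // The replacement happens here, in the pattern, not inside vectorize().
    rewriter.replaceOp(op, result->replacements);
    return success();
  }
};
```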
RE (2): If I read the code correctly, the current implementation does not introduce dead code, because all paths go through the old `vectorize` method, which calls the new `vectorize` method. It is not obvious, IMO. What I'd argue is: should we return `std::optional<SmallVector<Value>>` or `std::optional<VectorizationResult>` instead? It'd break downstream users, but I think it is a reasonable move. What do you think?
RE (3): I think all the code is well tested upstream, and there is no dead code atm. As I mentioned above, I'd suggest returning `std::optional<VectorizationResult>`, which makes it clearer. One of the things I like is that upstream has good test coverage for vectorization, so I only need a few tests in the IREE project, e.g., making sure that vector size inference works when masking is enabled in IREE. The vector size inference could be IREE specific, so it's not upstreamed. The way I see it, the core functionality is tested upstream, and IREE just has slightly higher-level tests within the pass scope.
RE (4):
I'm happy to see the discussion and happy to provide feedback. There was a [discussion](https://reviews.llvm.org/D150495) a long time ago, but we have not made progress so far. I hope that the above explanation addresses your concerns and that this does not become a blocker.
I've been wanting to investigate how to move vectorization to the next chapter, as we may have more changes to linalg ops in the future and people would like to extend the existing vectorizer. My rough idea is that we need an interface; any operation that implements the interface can be vectorized by the new vectorizer through the interface. Roughly, we'd need three interface methods. One reads data from tensor/memref into vectors and returns the loaded vectors. Another writes vectors into the destination, taking a sequence of vectors. The third is similar to the `vectorize()` method: it takes the input vectors, applies some computation, and returns the result vectors; e.g., a linalg op uses its body to perform the computation.
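A very rough, hypothetical sketch of those three methods (none of these names exist upstream, and a real version would likely be a TableGen-defined OpInterface rather than a plain C++ class):
```cpp
/// Purely illustrative; not an existing interface.
class VectorizableOpInterfaceSketch {
public:
  virtual ~VectorizableOpInterfaceSketch() = default;

  /// Read data from the op's tensor/memref operands into vectors and return
  /// the loaded vectors.
  virtual SmallVector<Value> readOperands(OpBuilder &builder) = 0;

  /// Apply the op's computation on the input vectors (e.g., a linalg op would
  /// use its body) and return the result vectors.
  virtual SmallVector<Value> vectorizeBody(OpBuilder &builder,
                                           ValueRange inputVectors) = 0;

  /// Write the result vectors into the op's destination(s).
  virtual void writeResults(OpBuilder &builder, ValueRange resultVectors) = 0;
};
```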
However, what's been confusing me is whether we should require vectorizable ops to implement the DPS interface, as we may need to pass input vector sizes for writing vectors. Today, these are well defined for linalg ops because the inits are passed and we capture the iteration-domain sizes in the indexing maps. I'm happy to chat with you offline, or we can prepare an RFC together, and hopefully (4) won't become a blocker for this change.
https://github.com/llvm/llvm-project/pull/144158