[PATCH] D77678: [mlir][Linalg] Add loop.parallel lowering for all Linalg Ops.

Uday Bondhugula via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 10 19:57:05 PDT 2020


bondhugula added a comment.

In D77678#1969889 <https://reviews.llvm.org/D77678#1969889>, @mravishankar wrote:

> In D77678#1968752 <https://reviews.llvm.org/D77678#1968752>, @bondhugula wrote:
>
> > In D77678#1968714 <https://reviews.llvm.org/D77678#1968714>, @mravishankar wrote:
> >
> > > Another point, off the top of my head, if the recommendation is to go through the affine dialect: there is already a mechanism to generate loop.parallel when tiling linalg operations. AFAIK, the tile size can be dynamic, and therefore cannot be expressed using affine.parallel loops.
> >
> >
> > I've pointed out a couple of times that this isn't accurate - you can represent non-constant tile sizes using either affine.parallel or affine.for (https://llvm.discourse.group/t/beginner-q-help-with-loops-affine-linalg/707/4).
>
>
> Thanks for the pointer. As was done in that post, I had just looked at the op definition and reached the conclusion about parametric tiling. I haven't worked with the affine dialect enough to know about such things. It's definitely something I want to look into in due course.
>
> >> So if the code generation process is tiling linalg ops and then lowering the tiled ops to loops, you can end up in a situation where the outer loops are in the loop dialect but the inner loops are in the affine dialect. I am not sure there is an issue with that, because eventually you can lower the affine loops to the loop dialect, but it's just something that I haven't fully reasoned about for myself.
> > 
> > Second, there is no issue with using a mix of affine and loop dialect ops - '-lower-to-affine' should be able to handle it by design. From a mix of affine.for and loop.for, it'll take you to just loop.for's. Please file a bug report if it doesn't!
>
> Agreed (and said so earlier). It should be OK to mix loop.parallel/loop.for with affine.for/affine.parallel. But based on your post, is it possible to generate affine.for/affine.parallel while tiling linalg ops as well? That way, the same benefit of going through affine.for/affine.parallel would be available for the inter-tile loops as well.


Yes, of course. That's exactly what I've been pointing out: loop.for/loop.parallel is currently being used unnecessarily in all these cases where it is possible to just go through affine.for/affine.parallel. Even for the more general computation you may need in the future (where dim/symbol restrictions get in the way), with affine.scope ops (renamed from graybox ops), you'd never need loop.for/if to lower any tensor/memref indexing computation; and when you then lower to loop.for/if, you'd get what you want.

Is there a need to maintain a non-unit symbolic step for loop.for? Your bounds are already arbitrary index-typed SSA values - wouldn't your code be simplified if you just canonicalized to a unit stride? With semi-affine maps, even with the current support, you get strictly *more* than what loop.for gives you.
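To make the parametric-tiling point concrete, here is a hypothetical sketch (not taken from the patch under review; all map and value names are illustrative) of a 1-d loop tiled with a dynamic tile size using only affine.for. The trip count %N and tile size %ts are bound as symbols, so the bounds are (semi-)affine maps even though neither value is a compile-time constant; this relies on semi-affine expressions (symbolic ceildiv/mul), which is what the Discourse thread linked above discusses:

```mlir
// Outer loop over tile indices, inner loop over elements of the tile,
// clamped to %N with a min map. %ts is a symbol, not a constant.
#ub_tiles = affine_map<()[s0, s1] -> (s0 ceildiv s1)>
#lb_inner = affine_map<(d0)[s0] -> (d0 * s0)>
#ub_inner = affine_map<(d0)[s0, s1] -> (d0 * s0 + s0, s1)>

func @tiled(%N: index, %ts: index) {
  affine.for %t = 0 to #ub_tiles()[%N, %ts] {
    affine.for %i = #lb_inner(%t)[%ts] to min #ub_inner(%t)[%ts, %N] {
      // ... original body, indexed by %i ...
    }
  }
  return
}
```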

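The unit-stride canonicalization suggested above can be sketched as follows (hypothetical IR, assuming the std-dialect arithmetic ops of this era): a loop.for with a symbolic step %s, i.e. `loop.for %i = %lb to %ub step %s`, becomes a unit-stride loop over the trip count, with the original induction variable recovered in the body:

```mlir
// Trip count: ceildiv(%ub - %lb, %s), computed as (%ub - %lb + %s - 1) / %s.
%c0   = constant 0 : index
%c1   = constant 1 : index
%diff = subi %ub, %lb : index
%sm1  = subi %s, %c1 : index
%sum  = addi %diff, %sm1 : index
%tc   = divi_signed %sum, %s : index
loop.for %j = %c0 to %tc step %c1 {
  // Recover the original IV: %i = %lb + %j * %s.
  %off = muli %j, %s : index
  %i   = addi %lb, %off : index
  // ... original body, using %i ...
}
```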

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77678/new/

https://reviews.llvm.org/D77678
