[Mlir-commits] [mlir] [mlir][SCF] Deprecate `linalg::tileToForallOp` and `linalg::tileToForallOpUsingTileSizes` (PR #91878)

Fri May 31 11:57:46 PDT 2024

MaheshRavishankar wrote:

> My general thinking in such large PRs is to look out for potential causes of differences.
The ideal scenario is when tests don't change(i.e. the PR is NFC) and the APIs don't change too drastically.
So what I am evaluating here is whether this is good to go despite changes to test files of whether something material was changed.

The change to the test file is minimal. I think the only change (and please do verify) is the case where I think the existing lowering was wrong. Essentially in a purely dynamic case, you need to generate the `affine.max`. I dont know how the existing path avoided it, but it looks wrong. So I fixed the test. Please do correct my understanding if I am off here.

> Beyond the API complexity increase / reduction of redundant IR, there may be a catch: I see this PR introduces --cse in one of the test. My immediate gut feeling is we are introducing more IR in other places. If --cse can be dropped then I think this first point is resolved.

CSE was added just to account for constants. The constants are really annoying with `%[[C0]]` and `%[[C0_0]]` and `%[[C0_1]]` and their ordering. 

> Re 2: calculateTileOffsetsAndSizes has related logic in the same code location. Now the logic seems split across getTileOffsetAndSizes and getTileSizes/getUserTileSizesAndNumThreads. The rename to getUserTileSizesAndNumThreads helps. So it seems we are doing a getUserTileSizesAndNumThreads before any loop is generated and later a refinemenent of getTileOffsetAndSizes within the body (passed via. lambda). The part that is putting all this within the same function is what was confusing I believe. Can we hoist out the lambdas as helper functions and reduce the complexity here? It was really not clear to me that we had an inline lambda to wrangle with: my recollections from the refactoring 6 months ago do not include this.

Ok, Ill move the inline-lambda out, but the lambda here is essentially playing the same roles as labmdas do for creating the loop body when used with `rewriter.create<scf::For>` or `rewriter.create<scf::ForAll>`. It is the loop body of the innermost tiled loop. I actually wanted to add this as a method on `LoopLikeOpInterface` but there were concerns that this is really just needed for tiling and does not belong in the interface. I dont want to add to the diff here though. Ill keep it as a inline-lambda and then I can change it. I am not for inline-lambdas as well, but in this case, it seemed warranted, but happy to change it as a follow up.

> Re 3: this should not be in this PR IMO. This is a material change that looks important by itself (bug fixing), I don't think we should have it in the middle of such a big PR with so much code complexity.

I dont know what the original implementation was doing, and where it was going wrong. I didnt try to fix the bug, but the refactoring showed a change in the lit test and then I verified that the current PR is doing the right thing and the old path is wrong. I dont see much value in triaging a bug on a path that is being deleted.

https://github.com/llvm/llvm-project/pull/91878