[PATCH] D114336: [Polly] Generalize the pattern matching to the case of tensor contractions.
Michael Kruse via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon May 16 15:59:13 PDT 2022
Meinersbur added a comment.
In D114336#3500858 <https://reviews.llvm.org/D114336#3500858>, @gareevroman wrote:
> 1
> Yes, it was intended. The transformation helps to optimize a class of programs, which is broader then a tensor contraction. However, it heavily depends on the codegen part. I think that the improvement of the detection can be the goal of the future work.
Please document what pattern is intended to be recognized. I don't think the doc for `isTCPattern` is sufficient, it only mentioned what is checked. Documenting the intended pattern would help identifying if a check has been forgotten. E.g. for the statement domain.
> As far as I know, in these cases, the codegen modifies some memory accesses. Consequently, they are not correspond to the current pattern.
>
> ReadAccess := [Reduction Type: NONE] [Scalar: 0]
> { Stmt3[i0, i1, i2, i3] -> MemRef0[i0, i2, 1 + i3] };
> ReadAccess := [Reduction Type: NONE] [Scalar: 0]
> { Stmt3[i0, i1, i2, i3] -> MemRef1[1 + i3, i1, i2] };
> ReadAccess := [Reduction Type: +] [Scalar: 0]
> { Stmt3[i0, i1, i2, i3] -> MemRef2[i0, i1] };
> MustWriteAccess := [Reduction Type: +] [Scalar: 0]
> { Stmt3[i0, i1, i2, i3] -> MemRef2[i0, i1] };
What do you mean by "codegen modifies some memory accesses"? Polly's Codegen? What is the check to exclude this? Is the `1 + i3` memory access expression? Where does it come from?
> 4
>
>> Here, i occurs as indices for A, B, and C and detected as TC. Is this supported?
>
>
>
>> void foo(double C[64][64], double A[64][64][64], double B[64][64][64]) {
>>
>> for (int i = 0; i < 32; i++)
>> for (int j = 0; j < 32; j++)
>> for (int l = 0; l < 32; l++)
>> for (int w = 0; w < 32; w++)
>> C[i][j] += A[i][l][w] * B[w][j][i];
>>
>> }
>
> For some reason, I cannot reproduce that. I have added a corresponding test case. As far as I understand, that should be detected because of the line 1365.
Should or should not be detected?
While debugging, it is now rejected because of this:
``
if (!TCI.B) {
// IndexSet should be a union of J and P sets.
if (unite(TCI.P, TCI.J) != IndexSet)
return false;
``
Could you choose more meaningful identifiers than `I`, `J`, `P` and `J`, or use them in the pattern described in `isTCPattern`? I think of something like:
for (...) {
...
for (...) {
if (c)
auto acc = C[P]; // ReadFromC
auto a = A[Pa, I];
auto b = B[Pb,J];
auto arg = f(a,b);
acc = acc op arg;
C[P] = acc; // WriteToC
}
}
}
where P, I, J are sets of indices of the surrounding loops, Pa and Pb are subsets of P, I are the indices only occurring in the subscript of reading from A, J are the indices only occurring in the subscript for reading from B. There must be no indices not occuring in either P, I or J. `op=` is a commutative operation ...., c is an affine condition usually just `true`. `f` is a side-effect free operation.
(I don't whether this is correct, I want to understand whether the checked conditions are sufficient).
> 5
>
> I think that that it is redundant to require that bands are marked as permutable, since we check the form of dependencies and memory accesses. I propose to remove such checks for pattern matching optimizations.
Ok.
================
Comment at: polly/lib/Transform/MatmulOptimizer.cpp:210
+
+ /// @{
+ /// Input dimension of the schedule space, which represents contracted
----------------
`@{` is not needed when documenting just a single member.
================
Comment at: polly/lib/Transform/MatmulOptimizer.cpp:1310
+ // Obtain the set I.
+ TCI.I = std::move(set_difference(IndexSet, TCI.P));
+ if (!isSuperset(IandJIndexSet, TCI.I))
----------------
Compiler warning:
```
/home/meinersbur/src/llvm-project/polly/lib/Transform/MatmulOptimizer.cpp:1310:13: warning: moving a temporary object prevents copy elision [-Wpessimizing-move]
TCI.I = std::move(set_difference(IndexSet, TCI.P));
```
The result of `set_difference` is already an r-value, no need to cast it to an r-value.
================
Comment at: polly/lib/Transform/MatmulOptimizer.cpp:1315
+ // Obtain the set J.
+ TCI.J = std::move(set_difference(IandJIndexSet, TCI.I));
+
----------------
Same compiler warning.
================
Comment at: polly/lib/Transform/MatmulOptimizer.cpp:1395
+/// and false, otherwise.
+static bool isTcDep(isl::set DepDelta, unsigned Pos, isl::set BoundDeltas,
+ SmallDenseSet<int> *IndexSet) {
----------------
gareevroman wrote:
> Meinersbur wrote:
> > This seems to check setermine whether there is a reduction (contraction) carried by loop number `Pos`. The function name could be more meaningful. Suggestion: `isDepCarryingReductionOverDim` (not nice, but "TcDep" can mean anything)
> Could we name it isReductionCarriedOverDim? I think, in this case, we should rename the parameter Pos to Dim to make it more readable.
Sounds ok.
================
Comment at: polly/lib/Transform/MatmulOptimizer.cpp:1223
+ const std::set<int> &IandJIndexSet,
+ SmallVector<int, 30> Dimensions, TCInfoTy &TCI) {
+ if (!TCI.A) {
----------------
gareevroman wrote:
> Meinersbur wrote:
> >
> As far as I understand, we cannot do this here because of the assignment to TCI.ADimensions and TCI.BDimensions
The assignments should just make a copy of the array . With `Dimensions` being passed by-value, the caller has to make the copy which it should not need to.
`SmallVector` has no overload for being assigned an `ArrayRef`, but you could use `llvm::append_range` to insert all the values.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D114336/new/
https://reviews.llvm.org/D114336
More information about the llvm-commits
mailing list