[PATCH] D114336: [Polly] Generalize the pattern matching to the case of tensor contractions.

Michael Kruse via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon May 16 15:59:13 PDT 2022


Meinersbur added a comment.

In D114336#3500858 <https://reviews.llvm.org/D114336#3500858>, @gareevroman wrote:

> 1
> Yes, it was intended. The transformation helps to optimize a class of programs, which is broader then a tensor contraction. However, it heavily depends on the codegen part. I think that the improvement of the detection can be the goal of the future work.

Please document what pattern is intended to be recognized. I don't think the doc for `isTCPattern` is sufficient, it only mentioned what is checked. Documenting the intended pattern would help identifying if a check has been forgotten. E.g. for the statement domain.

> As far as I know, in these cases, the codegen modifies some memory accesses. Consequently, they are not correspond to the current pattern.
>
>   ReadAccess :=	[Reduction Type: NONE] [Scalar: 0]
>       { Stmt3[i0, i1, i2, i3] -> MemRef0[i0, i2, 1 + i3] };
>   ReadAccess :=	[Reduction Type: NONE] [Scalar: 0]
>       { Stmt3[i0, i1, i2, i3] -> MemRef1[1 + i3, i1, i2] };
>   ReadAccess :=	[Reduction Type: +] [Scalar: 0]
>       { Stmt3[i0, i1, i2, i3] -> MemRef2[i0, i1] };
>   MustWriteAccess :=	[Reduction Type: +] [Scalar: 0]
>       { Stmt3[i0, i1, i2, i3] -> MemRef2[i0, i1] };

What do you mean by "codegen modifies some memory accesses"? Polly's Codegen? What is the check to exclude this? Is the `1 + i3` memory access expression? Where does it come from?

> 4
>
>> Here, i occurs as indices for A, B, and C and detected as TC. Is this supported?
>
>
>
>> void foo(double C[64][64], double A[64][64][64], double B[64][64][64]) {
>>
>>   for (int i = 0; i < 32; i++)
>>   for (int j = 0; j < 32; j++)
>>   for (int l = 0; l < 32; l++)
>>   for (int w = 0; w < 32; w++)
>>       C[i][j] += A[i][l][w] * B[w][j][i];
>>
>> }
>
> For some reason, I cannot reproduce that. I have added a corresponding test case. As far as I understand, that should be detected because of the line 1365.

Should or should not be detected?

While debugging, it is now rejected because of this:
``

  if (!TCI.B) {
    // IndexSet should be a union of J and P sets.
    if (unite(TCI.P, TCI.J) != IndexSet)
      return false;

``

Could you choose more meaningful identifiers than `I`, `J`, `P` and `J`, or use them in the pattern described in `isTCPattern`? I think of something like:

  for (...) {
    ...
    for (...) {
      if (c)
        auto acc = C[P]; // ReadFromC
        auto a = A[Pa, I];
        auto b = B[Pb,J];
        auto arg = f(a,b);
        acc = acc op arg;
        C[P] = acc; // WriteToC
      }
    }
  }
  
  where P, I, J are sets of indices of the surrounding loops, Pa and Pb are subsets of P, I are the indices only occurring in the subscript of reading from A, J are the indices only occurring in the subscript for reading from B. There must be no indices not occuring in either P, I or J. `op=` is a commutative operation ...., c is an affine condition usually just `true`. `f` is a side-effect free operation.

(I don't whether this is correct, I want to understand whether the checked conditions are sufficient).

> 5
>
> I think that that it is redundant to require that bands are marked as permutable, since we check the form of dependencies and memory accesses. I propose to remove such checks for pattern matching optimizations.

Ok.



================
Comment at: polly/lib/Transform/MatmulOptimizer.cpp:210
+
+  /// @{
+  /// Input dimension of the schedule space, which represents contracted
----------------
`@{` is not needed when documenting just a single member.


================
Comment at: polly/lib/Transform/MatmulOptimizer.cpp:1310
+    // Obtain the set I.
+    TCI.I = std::move(set_difference(IndexSet, TCI.P));
+    if (!isSuperset(IandJIndexSet, TCI.I))
----------------
Compiler warning:
```
/home/meinersbur/src/llvm-project/polly/lib/Transform/MatmulOptimizer.cpp:1310:13: warning: moving a temporary object prevents copy elision [-Wpessimizing-move]
    TCI.I = std::move(set_difference(IndexSet, TCI.P));
```
The result of `set_difference` is already an r-value, no need to cast it to an r-value.


================
Comment at: polly/lib/Transform/MatmulOptimizer.cpp:1315
+    // Obtain the set J.
+    TCI.J = std::move(set_difference(IandJIndexSet, TCI.I));
+
----------------
Same compiler warning.


================
Comment at: polly/lib/Transform/MatmulOptimizer.cpp:1395
+///         and false, otherwise.
+static bool isTcDep(isl::set DepDelta, unsigned Pos, isl::set BoundDeltas,
+                    SmallDenseSet<int> *IndexSet) {
----------------
gareevroman wrote:
> Meinersbur wrote:
> > This seems to check setermine whether there is a reduction (contraction) carried by loop number `Pos`. The function name could be more meaningful. Suggestion: `isDepCarryingReductionOverDim` (not nice, but "TcDep" can mean anything)
> Could we name it isReductionCarriedOverDim? I think, in this case, we should rename the parameter Pos to Dim to make it more readable.
Sounds ok.


================
Comment at: polly/lib/Transform/MatmulOptimizer.cpp:1223
+                   const std::set<int> &IandJIndexSet,
+                   SmallVector<int, 30> Dimensions, TCInfoTy &TCI) {
+  if (!TCI.A) {
----------------
gareevroman wrote:
> Meinersbur wrote:
> > 
> As far as I understand, we cannot do this here because of the assignment to TCI.ADimensions and TCI.BDimensions
The assignments should just make a copy of the array . With `Dimensions` being passed by-value, the caller has to make the copy which it should not need to.

`SmallVector` has no overload for being assigned an `ArrayRef`, but you could use `llvm::append_range` to insert all the values.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114336/new/

https://reviews.llvm.org/D114336



More information about the llvm-commits mailing list