[Mlir-commits] [mlir] Full slices when tiling full loop trip count (PR #127197)

Wed Feb 19 22:45:24 PST 2025

================
@@ -214,3 +214,35 @@ module attributes {transform.with_named_sequence} {
     transform.yield
   }
 }
+
+// -----
+
+// CHECK-LABEL: func @non_monotonic_affine_expr
+//  CHECK-SAME:   %[[ARG0:[a-zA-Z0-9_]+]]: tensor<7xf32>
+func.func @non_monotonic_affine_expr(%arg0 : tensor<7xf32>) -> tensor<7xf32> {
+  %c0 = arith.constant 0 : index
+  %0 = tensor.dim %arg0, %c0 : tensor<7xf32>
+  %empty = tensor.empty() : tensor<7xf32>
+
+  // CHECK: %[[OUT:.*]] = tensor.empty() : tensor<7xf32>
+  // CHECK: scf.for {{.*}} to {{.*}} step {{.*}} iter_args(%[[TC0:.*]] = %[[OUT]]) -> (tensor<7xf32>) {
+  // CHECK: tensor.extract_slice %[[TC0]][0] [7] [1] : tensor<7xf32> to tensor<7xf32>
+  %generic = linalg.generic
+    {indexing_maps = [affine_map<(d0) -> (d0 mod 4)>,
+                      affine_map<(d0) -> (d0)>],
+     iterator_types = ["parallel"]}
+    ins(%arg0: tensor<7xf32>)
+    outs(%empty : tensor<7xf32>) {
+    ^bb0(%in : f32, %out: f32):
+      linalg.yield %in : f32
+    } -> tensor<7xf32>
+  return %generic : tensor<7xf32>
+}
----------------
MaheshRavishankar wrote:


> Furthermore, in the presence of non-monotonic expressions just trying to calculate the slice using a static tensor dim size results in invalid slices... For the example in this test case, `7 mod 4` would result in a slice of size 3 instead of 4. And that's why we stop slicing in those cases.
> 

This is a problem with having such expressions in indexing maps. Transformations like tiling will not be able to compute the right slice. I am not convinced that using the full slice is the right answer here. You cannot compute the slice needed because it is not a contiguous slice (or a slice with strides), i.e., you cannot represent the slice required for operands accessed with such expressions.
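
To make the point concrete, here is a minimal Python sketch (not MLIR, and not the tiling implementation) that enumerates the input indices a tile would read under `d0 mod 4` for the 7-element operand in the test above. The helper names `accessed_indices` and `as_strided_slice` are hypothetical and exist only for this illustration.

```python
def accessed_indices(offset, size, modulus=4):
    """Input indices read by the iteration tile [offset, offset + size)
    when the operand is indexed with the map (d0) -> (d0 mod modulus)."""
    return [d0 % modulus for d0 in range(offset, offset + size)]


def as_strided_slice(indices):
    """Return (offset, size, stride) if `indices` form one strided slice
    with a positive stride, otherwise None. Hypothetical helper."""
    if not indices:
        return None
    if len(indices) == 1:
        return (indices[0], 1, 1)
    stride = indices[1] - indices[0]
    if stride <= 0:
        return None
    for prev, cur in zip(indices, indices[1:]):
        if cur - prev != stride:
            return None
    return (indices[0], len(indices), stride)


# Full trip count of 7: the access wraps around after index 3.
full = accessed_indices(0, 7)

# Plugging the static dim size into the expression gives 7 mod 4 = 3,
# but the largest index actually touched is 3, so 4 elements are needed.
naive_size = 7 % 4
needed_size = max(full) + 1
```

With these definitions, `full` is `[0, 1, 2, 3, 0, 1, 2]` and `as_strided_slice(full)` is `None`, since the wrap-around breaks the constant stride. A tile that happens to stay within one period, such as `accessed_indices(0, 4)`, is a plain `(0, 4, 1)` slice, while `accessed_indices(2, 4)` yields `[2, 3, 0, 1]` and again has no offset/size/stride representation.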


https://github.com/llvm/llvm-project/pull/127197