[Mlir-commits] [mlir] [MLIR][affine] Fix for #115849 Illegal affine loop fusion with vector types (PR #117617)
llvmlistbot at llvm.org
Tue Dec 3 16:17:26 PST 2024
brod4910 wrote:
The big issue here is that even when we are able to validate that the load/store can be fused, the IR the transformation produces is still invalid. For example:
```mlir
func.func @should_fuse_across_memref_store_load_bounds() {
%a = memref.alloc() : memref<64x512xf32>
%b = memref.alloc() : memref<64x512xf32>
%c = memref.alloc() : memref<64x512xf32>
%d = memref.alloc() : memref<64x4096xf32>
%e = memref.alloc() : memref<64x4096xf32>
affine.for %j = 0 to 8 {
%lhs = affine.vector_load %a[0, %j * 64] : memref<64x512xf32>, vector<64x64xf32>
%rhs = affine.vector_load %b[0, %j * 64] : memref<64x512xf32>, vector<64x64xf32>
%res = arith.addf %lhs, %rhs : vector<64x64xf32>
affine.vector_store %res, %c[0, %j * 64] : memref<64x512xf32>, vector<64x64xf32>
}
affine.for %j = 0 to 8 {
%lhs = affine.vector_load %c[0, 0] : memref<64x512xf32>, vector<64x32xf32>
%rhs = affine.vector_load %d[0, %j * 32] : memref<64x4096xf32>, vector<64x32xf32>
%res = arith.subf %lhs, %rhs : vector<64x32xf32>
affine.vector_store %res, %d[0, %j * 32] : memref<64x4096xf32>, vector<64x32xf32>
}
return
}
```
This produces the following invalid IR:
```mlir
func.func @should_fuse_across_memref_store_load_bounds() {
%alloc = memref.alloc() : memref<1x1xf32>
%c0 = arith.constant 0 : index
%alloc_0 = memref.alloc() : memref<64x512xf32>
%alloc_1 = memref.alloc() : memref<64x512xf32>
%alloc_2 = memref.alloc() : memref<64x4096xf32>
%alloc_3 = memref.alloc() : memref<64x4096xf32>
affine.for %arg0 = 0 to 8 {
%0 = affine.vector_load %alloc_0[0, %c0 * 64] : memref<64x512xf32>, vector<64x64xf32>
%1 = affine.vector_load %alloc_1[0, %c0 * 64] : memref<64x512xf32>, vector<64x64xf32>
%2 = arith.addf %0, %1 : vector<64x64xf32>
affine.vector_store %2, %alloc[0, 0] : memref<1x1xf32>, vector<64x64xf32>
%3 = affine.vector_load %alloc[0, 0] : memref<1x1xf32>, vector<64x32xf32>
%4 = affine.vector_load %alloc_2[0, %arg0 * 32] : memref<64x4096xf32>, vector<64x32xf32>
%5 = arith.subf %3, %4 : vector<64x32xf32>
affine.vector_store %5, %alloc_2[0, %arg0 * 32] : memref<64x4096xf32>, vector<64x32xf32>
}
return
}
```
The private alloc `%alloc` that fusion creates is `memref<1x1xf32>` instead of the proper size, so the `vector<64x64xf32>` store into it does not fit.
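For comparison, here is a rough sketch of what the fused loop might look like if the private buffer were sized from the vector footprint rather than from a single scalar element. It assumes the buffer only has to cover the 64x64 region written by the `affine.vector_store`; the function name is made up for illustration, and this is not necessarily the IR the PR ends up producing:

```mlir
// Hypothetical function name; illustration only.
func.func @fused_with_vector_sized_private_buffer() {
  // Assumption: the private buffer covers the full 64x64 vector footprint
  // instead of the 1x1 scalar-element region.
  %alloc = memref.alloc() : memref<64x64xf32>
  %c0 = arith.constant 0 : index
  %alloc_0 = memref.alloc() : memref<64x512xf32>
  %alloc_1 = memref.alloc() : memref<64x512xf32>
  %alloc_2 = memref.alloc() : memref<64x4096xf32>
  affine.for %arg0 = 0 to 8 {
    %0 = affine.vector_load %alloc_0[0, %c0 * 64] : memref<64x512xf32>, vector<64x64xf32>
    %1 = affine.vector_load %alloc_1[0, %c0 * 64] : memref<64x512xf32>, vector<64x64xf32>
    %2 = arith.addf %0, %1 : vector<64x64xf32>
    // The 64x64 store now stays within the bounds of the private buffer.
    affine.vector_store %2, %alloc[0, 0] : memref<64x64xf32>, vector<64x64xf32>
    // The 64x32 load reads a sub-region of what was just stored.
    %3 = affine.vector_load %alloc[0, 0] : memref<64x64xf32>, vector<64x32xf32>
    %4 = affine.vector_load %alloc_2[0, %arg0 * 32] : memref<64x4096xf32>, vector<64x32xf32>
    %5 = arith.subf %3, %4 : vector<64x32xf32>
    affine.vector_store %5, %alloc_2[0, %arg0 * 32] : memref<64x4096xf32>, vector<64x32xf32>
  }
  return
}
```

The only change relative to the invalid output is the type of `%alloc` and the memref types on the two accesses to it.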
https://github.com/llvm/llvm-project/pull/117617