[Mlir-commits] [mlir] [mlir][ArmSME] Support widening outer products (PR #78975)

Benjamin Maxwell llvmlistbot at llvm.org
Mon Jan 22 07:54:49 PST 2024


================
@@ -736,6 +736,7 @@ class OuterProductResultTileTypeConstraint<string operand> :
 
 def OuterProductOp :
   ArmSME_Op<"outerproduct", [
+    Pure,
----------------
MacDue wrote:

No ArmSME tile ops are pure: they implicitly operate on ZA, which is not modeled in the IR.

Consider: 
```
%0 = arm_sme.outerproduct %a, %b : vector<[4]x[4]xf32> // assume result is all 1 (allocs a new tile)
%1 = arm_sme.outerproduct %a, %b : vector<[4]x[4]xf32> // assume result is all 1 (allocs a new tile)
// Move tile slice into %1 (assume slice is all 0)
%2 = arm_sme.move_vector_to_tile_slice %1 ....
arm_sme.tile_store %0 ...
arm_sme.tile_store %2 ...
```
This stores `%0`, which is all ones, followed by `%2`, which is all ones except for one row of zeros.

With this change `arm_sme.outerproduct` is marked `Pure`, so CSE can rewrite this to:
```
%0 = arm_sme.outerproduct %a, %b : vector<[4]x[4]xf32> // assume result is all 1
// Move tile slice into %0 (assume slice is all 0)
%1 = arm_sme.move_vector_to_tile_slice %0 ....
arm_sme.tile_store %0 ...
arm_sme.tile_store %1 ...
```
In this case all operations (after tile allocation) work on the same tile, so _both_ stores write a tile with one slice of zeros, which is incorrect.
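
To make the aliasing concrete, here is a rough sketch in Python of the two programs above. It is only a toy model, not how the dialect or tile allocation is implemented: the `ZA` class, its method names, the fixed 4x4 tile size, the explicit tile ids, and the choice of row 0 for the slice are all assumptions made for illustration.

```python
N = 4  # toy tile dimension; real SME tiles are scalable


class ZA:
    """Hypothetical stand-in for the ZA array: maps a tile id to an N x N tile."""

    def __init__(self):
        self.tiles = {}

    def outerproduct(self, tile_id):
        # Assume the outer product of %a and %b is all ones.
        self.tiles[tile_id] = [[1] * N for _ in range(N)]
        return tile_id

    def move_vector_to_tile_slice(self, tile_id, row):
        # Overwrites one slice in place; the result lives in the same tile
        # as the input (assume the slice is all zeros).
        self.tiles[tile_id][row] = [0] * N
        return tile_id

    def tile_store(self, tile_id):
        # Snapshot of what would be written to memory.
        return [r[:] for r in self.tiles[tile_id]]


def original():
    za = ZA()
    t0 = za.outerproduct(0)  # %0, gets its own tile
    t1 = za.outerproduct(1)  # %1, gets its own tile
    t2 = za.move_vector_to_tile_slice(t1, 0)  # %2, shares %1's tile
    return za.tile_store(t0), za.tile_store(t2)


def after_cse():
    za = ZA()
    t0 = za.outerproduct(0)  # %0, the only outer product left after CSE
    t1 = za.move_vector_to_tile_slice(t0, 0)  # %1, now shares %0's tile
    return za.tile_store(t0), za.tile_store(t1)
```

In this model `original()` returns an all-ones tile followed by a tile with one zeroed row, while `after_cse()` returns the zeroed-row tile twice, because the slice move clobbers the only tile left.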

I don't think this can be fixed without proper tile value semantics.

https://github.com/llvm/llvm-project/pull/78975

