[Mlir-commits] [mlir] [mlir][vector] Fix 0-d vector transfer mask inference (PR #116526)
Kunwar Grover
llvmlistbot at llvm.org
Mon Nov 18 01:44:58 PST 2024
https://github.com/Groverkss requested changes to this pull request.
I don't really understand why we are forcing a vector<1xi1> mask instead of a vector<i1> mask. Something seems wrong here.
I'll take 3 examples:
```
vector.transfer_read %tensor[], permutation_map<() -> (0, 0, 0)> : tensor<f32>, vector<4x4x4xf32>
vector.transfer_read %tensor[%idx], permutation_map<(d0) -> (0, 0, 0)> : tensor<1xf32>, vector<4x4x4xf32>
vector.transfer_read %tensor[%idx, %idx2], permutation_map<(d0, d1) -> (0, 0, 0)> : tensor<1x1xf32>, vector<4x4x4xf32>
```
From this PR, the mask for each of these vector.transfer_read ops will be vector<1xi1>, which doesn't really make sense to me. The permutation_map attribute specifies a mapping from the memory/tensor space to the vector space. The masking needs to be done on the memory/tensor space, and each of these examples has a different dimensionality for the memory/tensor space. I would expect the masks to be:
```
vector.transfer_read %tensor[], permutation_map<() -> (0, 0, 0)> : tensor<f32>, vector<4x4x4xf32> // mask: vector<i1>
vector.transfer_read %tensor[%idx], permutation_map<(d0) -> (0, 0, 0)> : tensor<1xf32>, vector<4x4x4xf32> // mask: vector<1xi1>
vector.transfer_read %tensor[%idx, %idx2], permutation_map<(d0, d1) -> (0, 0, 0)> : tensor<1x1xf32>, vector<4x4x4xf32> // mask: vector<1x1xi1>
```
If we are taking an inverse of the permutation map, with broadcasting on the range, then the mask shape should always match the memory/tensor shape. I don't think we should do any implicit rank reduction.
Instead of always forcing the mask to be vector<1xi1>, can we fix the inference so that the mask's dimensionality matches the domain of the permutation map, where any dimension not used in the results of the permutation_map is simply 1, since it is broadcasted? This will also make the operation much more consistent w.r.t. masking: you can always expect the dimensionality of the mask to be the same as that of the memory/tensor.
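To make the proposal concrete, here is a minimal sketch of the inference rule I have in mind. This is my own illustration, not code from this PR, and inferMaskShape is a hypothetical helper; it only assumes the standard MLIR AffineMap/AffineExpr API. The mask rank equals the rank of the map's domain; each domain dimension that appears as a result takes the corresponding vector extent, and every unused (broadcast) dimension keeps extent 1.
```cpp
#include "mlir/IR/AffineExpr.h"
#include "mlir/IR/AffineMap.h"
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallVector.h"

using namespace mlir;

// Hypothetical helper (not part of this PR): infer the mask shape from the
// permutation map and the vector (result) shape. The mask lives in the
// memory/tensor space, so its rank is the rank of the map's domain.
static llvm::SmallVector<int64_t>
inferMaskShape(AffineMap permutationMap, llvm::ArrayRef<int64_t> vectorShape) {
  // Every domain dimension starts out broadcast, i.e. extent 1.
  llvm::SmallVector<int64_t> maskShape(permutationMap.getNumDims(), 1);
  for (auto [resultIdx, expr] :
       llvm::enumerate(permutationMap.getResults())) {
    // A result that reads a domain dimension pins that dimension's extent to
    // the matching vector extent; constant-0 (broadcast) results stay 1.
    if (auto dimExpr = dyn_cast<AffineDimExpr>(expr))
      maskShape[dimExpr.getPosition()] = vectorShape[resultIdx];
  }
  return maskShape;
}
```
With this rule, the three examples above get masks of vector<i1>, vector<1xi1>, and vector<1x1xi1> respectively, and a non-broadcast read like permutation_map<(d0, d1) -> (d1, d0)> with vector<4x8xf32> gets vector<8x4xi1>, i.e. the mask shape always matches the memory/tensor footprint.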
https://github.com/llvm/llvm-project/pull/116526