[Mlir-commits] [mlir] [mlir][vector] Add verification for incorrect vector.extract (PR #115824)

Sun Nov 17 08:06:24 PST 2024

================
@@ -1339,6 +1339,50 @@ bool ExtractOp::isCompatibleReturnTypes(TypeRange l, TypeRange r) {
   return l == r;
 }
 
+// Common verification rules for `InsertOp` and `ExtractOp` involving indices.
+// `indexedType` is the vector type being indexed in the operation, i.e., the
+// destination type in InsertOp and the source type in ExtractOp.
+// `vecOrScalarType` is the type that is not indexed in the op and can be
+// either a scalar or a vector, i.e., the source type in InsertOp and the
+// return type in ExtractOp.
+static LogicalResult verifyInsertExtractIndices(Operation *op,
+                                                VectorType indexedType,
+                                                int64_t numIndices,
+                                                Type vecOrScalarType) {
+  int64_t indexedRank = indexedType.getRank();
+  if (numIndices > indexedRank)
+    return op->emitOpError(
+        "expected a number of indices no greater than the indexed vector rank");
+
+  if (auto nonIndexedVecType = dyn_cast<VectorType>(vecOrScalarType)) {
+    // Vector case, including:
+    //  * 0-D vector:
+    //    * vector.extract %src[2]: vector<f32> from vector<8xf32)
----------------
banach-space wrote:

> You are looking at it from the dimensionality perspective. I'm looking at it from the number of elements perspective. 

Right, thanks for clarifying! This explains the confusion.

>  The first case is extracting a vector with one element from a vector with one element, which is redundant. The second case is extracting a vector with one element from a vector with 8 elements, which is a meaningful operation.

But should the verifier be this opinionated? 🤔 

Fundamentally, both operations are syntactically and semantically correct:
```mlir
 vector.extract %src[2] : vector<f32> from vector<8xf32>
 vector.extract %arg0[0] : vector<f32> from vector<1xf32>
```

If we only allow the first case, a few concerns arise:

* **For users**, this behavior feels unintuitive:
  * In both cases, I’m extracting one element from a vector that has at least one element, with a valid index. Why would one pass verification while the other fails?
* **For compiler developers**, this adds unnecessary complexity:
  * Now, I need to handle two separate cases for 0-D extracts - one allowed, one disallowed (based on the number of elements in the indexed vector). This will make pattern development more cumbersome.


This calls for some clear guiding principles. I couldn’t find anything explicit in the documentation, so below is a suggestion based on the existing [vector.extract](https://mlir.llvm.org/docs/Dialects/Vector/#vectorextract-vectorextractop) docs.

---

> Proposed Guiding Rules for `vector.extract` and `vector.insert`

**Rule 1 (input vs output rank)**

It is **always valid** to extract/insert a `k-rank` vector from/into an `n-rank` vector, provided `k <= n`. In particular:
* Extracting/inserting a 0-D vector is always allowed.
* Operations with `k > n` are never valid. 

**Rule 2 (single element insertion/extraction)**

It is **always valid** to extract/insert exactly 1 element. Indeed, there are no empty vectors.

**Rule 3 (dimensional consistency)**

All dimensions of the `k-rank` non-indexed argument **must match** the trailing `k` dimensions of the indexed argument.

It seems that **Rule 2**  is contentious. Specifically, when extracting one element, should the result be:

 * `f32` (a scalar),
* `vector<f32>` (a 0-D vector), or
* `vector<1xf32>` (a 1-D with size 1)?

The [current docs](https://mlir.llvm.org/docs/Dialects/Vector/#vectorextract-vectorextractop) state:

> Degenerates to an element type if n-k is zero.

This suggests `f32`, right? If that’s the standard, we should stick to it. 

Alternatively, if all three cases are allowed, we should ensure consistency throughout. In particular, hooks for inferring types (e.g. `inferReturnTypes`) should be disabled in such cases (assert) and it should be for the user to decide which return type to use. The verifier would only make sure that it's one of the 3 options above.

---

> We could let the first case to be valid as well but it's a redundant operation that can't be trivially folded. If we remove the vector.extract and propagate the vector<1xf32> source vector forward, there would be a type mismatch because a single element 1-D vector type is not a 0-D vector type. We would have to introduce a canonicalization pattern that turns that vector.extract into a bitcast or something like that.

The presence of:
```mlir
vector.extract %arg0[0] : vector<f32> from vector<1xf32>
``` 

suggests a deeper issue: something upstream generated a 1-D vector (`vector<1xf32>`) where a `0-D` (`f32`) was expected, right? Whether or not `vector.extract` supports this case, it will still be tricky to handle downstream.

Wouldn’t it make more sense to address this earlier in the pipeline? Specifically, why not directly generate `f32` instead of `vector<1xf32>` (or `vector<f32>`)? That feels like the root issue here.

--- 

**Final points**

Apologies for the length of this response - it's clear that 0-D vectors are in need of some serious TLC. Thank you for tackling this!

Now, I recognise that you have much more experience here and have likely thought through these considerations multiple times. I am also mindful that I might be missing some nuances that are hard to see from the point of view of the verifier. So please bare with me :)

https://github.com/llvm/llvm-project/pull/115824