[Mlir-commits] [mlir] [mlir][vector] Update the folder for vector.{insert|extract} (PR #136579)

Tue Apr 22 04:44:30 PDT 2025

banach-space wrote:

> I thought that -1 (as an attribute) corresponded to a %poison = ub.poison : index value.

Ah, I see how you interpreted that - thanks for clarifying! That also made me realise we actually have three (rather than two) relevant options:

```mlir
  // Option 1   
  %1 = vector.extract %src[-1, 0] : f32 from vector<4x5xf32>

  // Option 2
  %c_neg_1 = arith.constant -1 : inde
  %2 = vector.extract %src[%c_neg_1, 0] : f32 from vector<4x5xf32>

  // Option 3  
  %poison = ub.poison : index
  %3 = vector.extract %src[%poison, 0] : f32 from vector<4x5xf32>
```

Your PR takes care of **Option 1**, and I’m currently addressing **Option 2**. That still leaves **Option 3** open.
Now, (extra emphasis by me):

> it would be fine (but **weird** I agree) to have a different meaning than a %neg_1 = arith.constant -1 : index constant.

Indeed, we should strive for consistency unless there's a strong reason not to. From the [Vector dialect docs](https://mlir.llvm.org/docs/Dialects/Vector/#vectorinsert-vectorinsertop):
> The value -1 represents a poison index, which specifies that the resulting vector is poison.

To me, `-1` refers to a compile-time constant — and that should include both attribute and SSA representations (e.g. `%neg_1 = arith.constant -1 : index`). That’s the rationale for treating **Option 1** and **Option 2** consistently.

Also from the docs:

> The result is undefined if any index is out-of-bounds. 

So we're already making a special exception for -1, and I agree it would be helpful to update the documentation to make this exception clearer.

> I guess things would be clearer if we decide what is the meaning of any OOB access, wether we decide it is UB or poison.

Yes, good point — and actually @dcaballe has voiced support for treating any OOB access as poison (see [this comment](https://github.com/llvm/llvm-project/issues/134516#issuecomment-2789687910)):
> I don't see a problem with allowing more OOB indices. This was initially constrained to test the waters. I would canonicalize OOB to -1, though, to avoid having arbitrary odd numbers in the IR.

That would also make the semantics more consistent with LLVM’s own behavior ([LangRef](https://llvm.org/docs/LangRef.html#id184)):
> If idx exceeds the length of val for a fixed-length vector, the result is a [poison value](https://llvm.org/docs/LangRef.html#poisonvalues)

**NEXT STEPS**

I would land this as is - to me this change is consistent with the existing docs. As for other OOB indices (other than `-1`) and using `%poison = ub.poison : index`, I'd move that to a separate PR. WDYT? Oh, and thanks for the comments and the discussion 🙏🏻 

https://github.com/llvm/llvm-project/pull/136579