[Mlir-commits] [mlir] [mlir][memref] Allow out-of-bounds semantics for memref.subview (PR #152164)

Wed Aug 6 03:41:45 PDT 2025

rengolin wrote:

> vector.transfer_read masking does not use the source memref size to determine if masking is needed. A mask needs to be provided by the user. I think you are talking about the in_bounds attribute, which I agree would work weirdly with memrefs that can be out-of-bounds.

Don't forget about scalable vector extensions, which get the mask from the `first-fault`. @banach-space 

> > What does it even mean for a memref to have a certain size if part of the memref points to invalid/unallocated memory? (In other words: why does the memref have a size?)
> 
> This is the more fundamental question that I'd like to answer.

The value of a higher-level abstraction is that you can trust it when lowering/vectorizing, because you know the information is complete. If we add a clause where UB is only triggered if you actually access the out-of-bounds memory (i.e., one that needs to look at a DAG of ops), then we're back at the LLVM IR level, and the lower part of the stack gets confused.

The type of validation that would be required to assess UB here would be the second case @matthias-springer mentions:

> (b) looking at attributes, types, values and other operations.

The basic validation (a) only looks at the op itself. (b) needs to be a separate pattern (one that either fails or converts the value to `ub.poison`), which is what I think Matthias proposed.

Alternatively, we can reinforce the semantics of attributes like `in-bounds`, `partial-out-of-bounds` and `totally-out-of-bounds`, and convert the result to poison or not accordingly. That would help `vector.transfer_read` check static bounds against a potentially `out-of-bounds` memref slice without needing a full code-scan pass.
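
To make the three-way classification concrete, here is a rough sketch in Python (not MLIR code) of how a slice with fully static offsets, sizes and strides could be classified against a static source shape. The names `Bounds`, `classify_dim` and `classify_slice` are hypothetical, and the three labels are only the attribute names floated above, not existing MLIR attributes.

```python
from enum import Enum


class Bounds(Enum):
    # Hypothetical labels mirroring the attribute names suggested above.
    IN_BOUNDS = "in-bounds"
    PARTIAL = "partial-out-of-bounds"
    TOTAL = "totally-out-of-bounds"


def classify_dim(offset, size, stride, extent):
    """Classify one static dimension of a slice against a source extent."""
    # Enumerate every index the slice touches along this dimension.
    touched = [offset + i * stride for i in range(size)]
    valid = [0 <= idx < extent for idx in touched]
    if all(valid):
        return Bounds.IN_BOUNDS
    if not any(valid):
        return Bounds.TOTAL
    return Bounds.PARTIAL


def classify_slice(offsets, sizes, strides, shape):
    """Combine the per-dimension results for an n-D static slice."""
    # An empty slice touches no memory, so treat it as in-bounds (assumption).
    if any(s == 0 for s in sizes):
        return Bounds.IN_BOUNDS
    per_dim = [
        classify_dim(o, s, st, e)
        for o, s, st, e in zip(offsets, sizes, strides, shape)
    ]
    # If one dimension is entirely outside, every element of the slice is.
    if any(d is Bounds.TOTAL for d in per_dim):
        return Bounds.TOTAL
    if all(d is Bounds.IN_BOUNDS for d in per_dim):
        return Bounds.IN_BOUNDS
    return Bounds.PARTIAL
```

For example, `classify_slice([6, 0], [4, 4], [1, 1], [8, 8])` touches rows 6 through 9 of an 8x8 source, so it comes out as `Bounds.PARTIAL`. This only needs the slice's own static operands and the source type, which is the point: no scan over the users of the slice is required.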

https://github.com/llvm/llvm-project/pull/152164