[Mlir-commits] [mlir] [mlir][memref] Canonicalize memref.reinterpret_cast when offset/sizes/strides are constants. (PR #163505)

Mon Oct 20 08:59:00 PDT 2025

jeanPerier wrote:

> > Ok so it could be enforced in the verifier. We are experimenting with some FIR to MemRef passes and FIR represents the dynamic size as `-1` where MemRef represents it as `std::numeric_limits<int64_t>::min()`. So I guess we need to align our representation.
> 
> Why not directly use the `ShapedType::kDynamic` dynamic size type? If you also cannot determine the current dynamic size at runtime, I suggest using the `ub.poison` value to represent it.

`-1` is used in the Fortran descriptors to encode assumed-size arrays (`ARRAY(n, m, *)`). In these arrays, the extent of the outer dimension (the last one in Fortran) is never known and does not matter (it is not needed to generate pointer arithmetic). There are a lot of restrictions on these arrays (the user can basically only index or pass them, and should not do anything that would require the compiler to know the effective size of the array).
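
To illustrate why the last extent is not needed for pointer arithmetic, here is a minimal C++ sketch of column-major linearization. It is not flang code: `columnMajorOffset` is a hypothetical helper, and it assumes zero-based indices and contiguous storage.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, for illustration only: zero-based column-major
// element offset. The stride of dimension k is the product of
// extents[0..k-1], so the extent of the last dimension is never read.
template <std::size_t Rank>
std::int64_t columnMajorOffset(const std::array<std::int64_t, Rank> &extents,
                               const std::array<std::int64_t, Rank> &indices) {
  std::int64_t offset = 0;
  std::int64_t stride = 1;
  for (std::size_t k = 0; k < Rank; ++k) {
    offset += indices[k] * stride;
    if (k + 1 < Rank)
      stride *= extents[k]; // extents[Rank - 1] is intentionally unused
  }
  return offset;
}
```

With extents `{n, m, -1}`, the result is the same as with any real last extent, which is why the `-1` marker never reaches the address computation.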

This value may later be used in code to detect assumed-size arrays (e.g. `SELECT RANK`). This specific value is also mandated in C-Fortran interoperability contexts (Fortran 2023 section 18.5.3).

So in the FIR "equivalent" of memref, `fir.box`, `-1` is a well-specified and expected value. We do not want to use poison. Another example of its importance is runtime bounds checking (which flang does not have yet): `-1` will let bounds checking know that there is no way to check the index when addressing the last dimension of an assumed-size array.
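
As a rough sketch of that bounds-checking idea (purely hypothetical, since flang has no such check today), a checker could treat `-1` as a well-defined "cannot check" marker. The names `kAssumedSizeExtent`, `BoundsCheck`, and `checkIndex` are made up for this example.

```cpp
#include <cstdint>

// Assumption for this sketch: -1 marks the unknown last extent of an
// assumed-size array, as in the Fortran descriptors.
constexpr std::int64_t kAssumedSizeExtent = -1;

enum class BoundsCheck { InBounds, OutOfBounds, Uncheckable };

// Hypothetical helper: classify an index against one dimension's extent.
// Fortran arrays default to a lower bound of 1.
BoundsCheck checkIndex(std::int64_t index, std::int64_t extent,
                       std::int64_t lowerBound = 1) {
  if (extent == kAssumedSizeExtent)
    return BoundsCheck::Uncheckable; // extent unknown: nothing to compare against
  if (index < lowerBound || index >= lowerBound + extent)
    return BoundsCheck::OutOfBounds;
  return BoundsCheck::InBounds;
}
```

The point of the sketch is that the checker can branch on a specified value, which would not be possible if the extent were poison.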

I think the issue here is that memref is probably not currently designed to support this Fortran use case.
@NexMing, how are you translating `fir.embox` to memref to deal with assumed-size arrays?

https://github.com/llvm/llvm-project/pull/163505