[Mlir-commits] [mlir] [mlir][bufferization] Use Type instead of Value in unknown conversion (PR #144658)

Han-Chung Wang llvmlistbot at llvm.org
Tue Jul 1 13:57:32 PDT 2025


hanhanW wrote:

> > > Generally, bufferization should be able to create a memref from a tensor without needing to know more than just an mlir::Type.
> > 
> > 
> > Is that true?
> 
> I would imagine so. I mean, at least this makes sense: you get a type in, you get a type out. It is a type conversion, not a value-to-type conversion.
> 
> In [IREE](https://github.com/iree-org/iree/blob/63f625d428a31e6ccf2fd594e544c3bef659c63f/compiler/src/iree/compiler/Codegen/Common/IREEComprehensiveBufferizePass.cpp#L143-L164), we have special logic for constants. I don't remember all the details; my guess is that we'd like to use private memory for small constants. I can try to make our project happy, but the change itself looks off to me. We name the function `unknownTypeConverterFn`, but you always pass tensor types. I was wondering whether passing a Value lets you handle custom tensor types better, because you can define and use your own type system in your dialect.
> 
> Thus far, what Matthias and I have come up with is: TensorLike + BufferLike give us custom type support, while the options serve the builtin tensor -> builtin memref conversion. I guess this makes sense (I haven't seen issues, but I'm only in the middle of the process with these changes) - unknown type conversion is kind of a last-mile fallback (for builtins?). Supposedly, if you're already inside a custom type (via TensorLike), you wouldn't need it?

The idea seems okay to me now. My memory of bufferization is not fresh, but what you said and the comments in the codebase make sense.
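
To make sure we're on the same page, here is a rough sketch of my understanding of the before/after for the builtin fallback path. The signatures are paraphrased from `BufferizationOptions`; the exact parameter lists in the PR may differ:

```cpp
// Rough sketch, not the exact upstream signatures. Before this PR, the
// callback received a Value, so it could inspect the defining op, e.g. to
// special-case constants (roughly what the IREE snippet linked above does):
options.unknownTypeConverterFn =
    [](Value value, Attribute memorySpace,
       const bufferization::BufferizationOptions &opts) {
      auto tensorType = cast<TensorType>(value.getType());
      if (value.getDefiningOp<arith::ConstantOp>())
        return bufferization::getMemRefTypeWithStaticIdentityLayout(
            tensorType, memorySpace);
      return bufferization::getMemRefTypeWithFullyDynamicLayout(tensorType,
                                                                memorySpace);
    };

// After this PR, the callback is a pure type-to-type conversion; per-value
// context such as the defining op is no longer available:
options.unknownTypeConverterFn =
    [](TensorType tensorType, Attribute memorySpace,
       const bufferization::BufferizationOptions &opts) {
      return bufferization::getMemRefTypeWithFullyDynamicLayout(tensorType,
                                                                memorySpace);
    };
```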

> Honestly, I am completely lost in all these layout peculiarities upstream. Our downstream does it slightly differently: we generally have strides only for "subview"-like operations (e.g. in tiling), and all other IR assumes "dense" (if I may) buffers. But what we actually do for constants is strip the strides manually (via a [canonicalizer](https://github.com/openvinotoolkit/npu_compiler/blob/d5d219ee0ffa5c6a0e3af6b99675b94e28e25583/src/vpux_compiler/src/dialect/const/ops.cpp#L314) - because we have our own constant operation). Now that I think about it, perhaps this is exactly your problem as well? Maybe the issue is the _default_ behaviour/implementation? I plan to look at the builtin tensor -> memref conversion as well and make sure the tensor encoding gets correctly mapped to the memref layout. Perhaps it makes sense to revisit what should be done w.r.t. dynamic layouts to solve this issue for both of us?

Yeah, I think the main difference is that you have your own constant op in your project, so you can define your own canonicalization patterns to achieve this under your own assumptions. IREE uses upstream dialects (e.g., arith, spir-v, etc.), and that setup has been very stable. Maybe the issue is in the default behavior upstream, or maybe IREE should evolve to the next phase. Again, it has been stable for a long time, and I don't have much bandwidth to review such a change, as it touches many components and a few different backends, so it might not happen in the near future. If you identify something in the default behavior, I'm happy to learn about it. Thanks for sharing! 🙂
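
For anyone following along, here is a minimal sketch of what "stripping the strides" for a constant's result type could look like. `stripTrivialStrides` is a hypothetical helper under my reading of the discussion, not the actual downstream canonicalizer:

```cpp
// Hypothetical sketch: if a memref's explicit strided layout is just the
// canonical row-major layout with offset 0, rebuild the type with the
// implicit identity layout instead.
static MemRefType stripTrivialStrides(MemRefType type) {
  if (!type.hasStaticShape() || type.getLayout().isIdentity())
    return type;
  int64_t offset;
  SmallVector<int64_t> strides;
  if (failed(getStridesAndOffset(type, strides, offset)) || offset != 0)
    return type;
  // Compare against the canonical row-major strides of the same shape.
  int64_t expected = 1;
  for (int64_t dim = type.getRank() - 1; dim >= 0; --dim) {
    if (strides[dim] != expected)
      return type;
    expected *= type.getDimSize(dim);
  }
  return MemRefType::get(type.getShape(), type.getElementType(),
                         MemRefLayoutAttrInterface(), type.getMemorySpace());
}
```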

https://github.com/llvm/llvm-project/pull/144658

