[Mlir-commits] [mlir] [MLIR][XeGPU][TransformOps] Add insert_prefetch op (PR #167356)

Mon Nov 10 23:55:20 PST 2025

tkarna wrote:

> I think the op design can be more flexible if we further split the current op into get_load_op and then insert_prefetch applied to the load op instead of dpas. When the use case becomes more complex, insert_prefetch for dpas becomes less intuitive: operand A may not be inside the K loop, or it may be inside the loop but come from the last dpas result plus some post-op, and sometimes the load can be a load from SLM. At that point, the user would like to work against the load directly and insert the prefetch there, without caring about the dpas op.

This API is not dpas-specific: insert_prefetch takes a handle to a Value (typically of `vector` type). It's not immediately clear that having an API for the load op is better than an API for the value the load op produces. If the producer chain is complex, the user can pass some intermediate value to the insert_prefetch op; it does not have to be a dpas operand.

That said, I'd propose we postpone this change instead of adding a new `get_load_op` now. If in the future we have a generic `find_producer_of_type` op in the transform dialect, that would work for both `get_desc_op` and `get_load_op` use cases.
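
To make the idea concrete, here is a rough Python sketch of what a generic producer lookup could do. It is only an illustration on a toy def-use graph, not MLIR code and not an existing transform op: the `Op` class, the `find_producer_of_type` function, and the example chain are all assumptions made up for this sketch, and the op names are used as plain strings.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Op:
    """Toy operation: a name plus the ops that produce its operands."""
    name: str
    operands: List["Op"] = field(default_factory=list)


def find_producer_of_type(start: Op, op_name: str) -> Optional[Op]:
    """Walk backwards through producers, depth-first, starting at `start`.

    Returns the first op whose name equals `op_name`, or None if the
    producer chain contains no such op. `start` itself is also checked.
    """
    stack = [start]
    seen = set()
    while stack:
        op = stack.pop()
        if id(op) in seen:
            continue
        seen.add(id(op))
        if op.name == op_name:
            return op
        # Push operands in reverse so the first operand is visited first.
        stack.extend(reversed(op.operands))
    return None


# Hypothetical chain: create_nd_tdesc -> load_nd -> dpas, for operands A and B.
desc_a = Op("xegpu.create_nd_tdesc")
load_a = Op("xegpu.load_nd", [desc_a])
desc_b = Op("xegpu.create_nd_tdesc")
load_b = Op("xegpu.load_nd", [desc_b])
dpas = Op("xegpu.dpas", [load_a, load_b])

# The same lookup covers both the get_desc_op and the get_load_op use cases.
found_desc = find_producer_of_type(load_a, "xegpu.create_nd_tdesc")
found_load = find_producer_of_type(dpas, "xegpu.load_nd")
```

In this sketch the starting point can be the dpas op, the load op, or any intermediate op in the chain, which mirrors the point that insert_prefetch does not need to be tied to a dpas operand.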

https://github.com/llvm/llvm-project/pull/167356