[llvm] [SDAG] Add partial_reduce_sumla node (PR #141267)
Alexey Karyakin via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 1 12:58:45 PDT 2025
quic-akaryaki wrote:
Sorry for the out-of-the-blue questions... Is there a plan to handle partial reductions when the vector sizes do not exactly match a hardware instruction (e.g. UDOT)? Right now, it looks like such cases do not match and fall back to the ladder algorithm. Do you (folks @ARM) think it may be worth lowering them to, e.g., a sequence of UDOTs instead? An example is `@llvm.experimental.vector.partial.reduce.add.v2i32.v16i32`.
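To make the question concrete, here is a rough scalar sketch in plain C++ (not LLVM code) of what "a sequence of UDOTs" could mean for the 16-to-2 case. It assumes the v16i32 operand is a product of two zero-extended v16i8 vectors and that the 64-bit UDOT form (two i32 lanes, eight bytes per operand) is usable. The names `udot_2s`, `partial_reduce_16_to_2`, `total_sum` and `check_total_preserved` are made up for illustration. As far as I understand the intrinsic, only the overall sum has to be preserved, so this particular lane assignment is just one possible choice.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Acc2 = std::array<uint32_t, 2>;     // models a v2i32 accumulator
using Bytes16 = std::array<uint8_t, 16>;  // models a v16i8 operand

// Scalar model of a 64-bit UDOT step (assumed form: 2 x i32 lanes, 8 bytes
// per operand): each lane gains the sum of four adjacent u8*u8 products.
Acc2 udot_2s(Acc2 acc, const uint8_t *a, const uint8_t *b) {
  for (int lane = 0; lane < 2; ++lane)
    for (int k = 0; k < 4; ++k)
      acc[lane] += uint32_t(a[4 * lane + k]) * uint32_t(b[4 * lane + k]);
  return acc;
}

// Hypothetical lowering of a 16 -> 2 partial reduction: two UDOT steps
// chained through the same accumulator, one per 8-byte half of the input.
Acc2 partial_reduce_16_to_2(Acc2 acc, const Bytes16 &a, const Bytes16 &b) {
  acc = udot_2s(acc, a.data(), b.data());
  acc = udot_2s(acc, a.data() + 8, b.data() + 8);
  return acc;
}

// Reference value: all accumulator lanes plus all 16 products.
uint32_t total_sum(Acc2 acc, const Bytes16 &a, const Bytes16 &b) {
  uint32_t sum = acc[0] + acc[1];
  for (int i = 0; i < 16; ++i)
    sum += uint32_t(a[i]) * uint32_t(b[i]);
  return sum;
}

// Checks the property the sketch relies on: the lanes of the result add up
// to the same total as the accumulator plus every product.
void check_total_preserved(Acc2 acc, const Bytes16 &a, const Bytes16 &b) {
  Acc2 r = partial_reduce_16_to_2(acc, a, b);
  assert(r[0] + r[1] == total_sum(acc, a, b));
}
```

With this shape, the second step depends on the first through the accumulator; whether that chain beats the current fallback on real hardware is exactly what I am unsure about.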
How does the upper layer that generates partial reductions (VPlan?) choose the vector sizes? It has to know the exact CPU variant and the IR becomes tied to that variant, is that right?
https://github.com/llvm/llvm-project/pull/141267