[llvm] Handle VECREDUCE intrinsics in NVPTX backend (PR #136253)

Wed Jun 18 22:02:04 PDT 2025

Prince781 wrote:

Actually, I'm going to move the proposal of metadata / attributes affecting lowering decisions to another discussion in a future PR or RFC.

I've just updated the PR. We'll use the default shuffle reduction in SelectionDAG when packed ops are available on the target for the element type (e.g. `v2f16`, `v2bf16`, `v2i16`). Otherwise we'll use a tree or sequential reduction, depending on whether the fast-math option or the `reassoc` flag is set, and on whether special operations like `fmin3` / `fmax3` are available.
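To make the selection rule above concrete, here is a minimal sketch in plain C++. It is not the code in the PR: `ReductionStrategy`, `ReductionQuery`, and `chooseStrategy` are hypothetical names invented for illustration, and the sketch deliberately leaves out how `fmin3` / `fmax3` availability affects the lowering.

```cpp
#include <cassert>

// Hypothetical names for illustration only; none of these exist in LLVM.
enum class ReductionStrategy { Shuffle, Tree, Sequential };

struct ReductionQuery {
  // Target has a packed operation for this element type (e.g. v2f16).
  bool HasPackedOp;
  // Operands may be reordered (fast-math option or `reassoc` flag).
  bool CanReassociate;
};

// Mirrors the rule described in the comment: packed ops win, otherwise
// reassociation decides between a tree and a sequential reduction.
ReductionStrategy chooseStrategy(const ReductionQuery &Q) {
  if (Q.HasPackedOp)
    return ReductionStrategy::Shuffle;
  return Q.CanReassociate ? ReductionStrategy::Tree
                          : ReductionStrategy::Sequential;
}
```

For example, `chooseStrategy({false, true})` yields `ReductionStrategy::Tree`, while `chooseStrategy({false, false})` falls back to `ReductionStrategy::Sequential`.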

This allows us to keep using packed operations like `max.f16x2` (which use fewer registers) while switching to tree reduction in every other case where we can reassociate operands. We're deciding this based on the target's support for the element type with the operation, so I think it makes sense to handle these intrinsics at the SelectionDAG level and not at the IR level inside `shouldExpandReductions()`.
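For readers less familiar with the three reduction shapes mentioned here, the following is a rough host-side sketch in plain C++ using `max` over floats. It only models the order in which operands are combined; it is not how SelectionDAG or the NVPTX backend implements them. The function names and the `packedWidth` parameter are assumptions made for illustration, and the shuffle variant assumes a non-empty, power-of-two-sized input.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sequential: a left fold that preserves the original operand order.
// Assumes v is non-empty.
float reduceSequential(const std::vector<float> &v) {
  float acc = v[0];
  for (std::size_t i = 1; i < v.size(); ++i)
    acc = std::max(acc, v[i]);
  return acc;
}

// Tree: combine adjacent pairs level by level, giving a dependency
// chain of roughly log2(n) operations. Requires reassociation to be legal.
float reduceTree(std::vector<float> v) {
  while (v.size() > 1) {
    std::vector<float> next;
    for (std::size_t i = 0; i + 1 < v.size(); i += 2)
      next.push_back(std::max(v[i], v[i + 1]));
    if (v.size() % 2 != 0)
      next.push_back(v.back()); // odd element carries over to the next level
    v = std::move(next);
  }
  return v[0];
}

// Shuffle-style: repeatedly combine the low half with the high half
// elementwise (each step models one round of packed vector ops), then
// finish the last `packedWidth` elements with scalar operations.
// Assumes v.size() is a power of two and packedWidth >= 1.
float reduceShuffle(std::vector<float> v, std::size_t packedWidth = 2) {
  while (v.size() > packedWidth) {
    std::size_t half = v.size() / 2;
    for (std::size_t i = 0; i < half; ++i)
      v[i] = std::max(v[i], v[i + half]);
    v.resize(half);
  }
  return reduceSequential(v);
}
```

Because `max` on these values is associative and commutative, all three functions return the same result for an input such as `{3, 7, 1, 9, 4, 2, 8, 5}`; they differ only in how many operations can be packed or run in parallel.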

I also noticed that because we now fall back to the shuffle reduction generated by SelectionDAG instead of ExpandReductions, the codegen is cleaner and the last packed operation is scalarized. So this PR may supersede https://github.com/llvm/llvm-project/pull/143943

https://github.com/llvm/llvm-project/pull/136253