[Mlir-commits] [mlir] [mlir][gpu] Allow subgroup reductions over 1-d vector types (PR #76015)

Guray Ozen llvmlistbot at llvm.org
Wed Dec 20 05:10:34 PST 2023


grypp wrote:

> For example we can perform 4 independent f16 reductions with a series of gpu.shuffles over i64, reducing the final number of gpu.shuffles by 4x.

This sounds good to me. 

PTX only allows shuffling 32-bit registers, so a 4xf16 vector (64 bits) needs two shuffles. Can the PR support that?
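
To make the arithmetic concrete, here is a rough sketch in plain Python (not MLIR, and not code from the PR). The helper names `shuffles_needed`, `pack_f16x4`, and `unpack_f16x4` are hypothetical and used only for illustration. It assumes each shuffle moves exactly one register of the given width.

```python
import struct

# Assumption: one PTX shuffle moves a single 32-bit register.
PTX_SHUFFLE_BITS = 32


def shuffles_needed(num_elements, element_bits, shuffle_bits=PTX_SHUFFLE_BITS):
    """Number of shuffles needed to move a packed vector (ceiling division)."""
    total_bits = num_elements * element_bits
    return -(-total_bits // shuffle_bits)


def pack_f16x4(values):
    """Pack four f16 values into two 32-bit words, one word per shuffle."""
    raw = struct.pack("<4e", *values)  # 'e' is IEEE 754 half precision
    return list(struct.unpack("<2I", raw))


def unpack_f16x4(words):
    """Inverse of pack_f16x4: recover the four f16 values from two words."""
    raw = struct.pack("<2I", *words)
    return list(struct.unpack("<4e", raw))


if __name__ == "__main__":
    print(shuffles_needed(4, 16))                    # 32-bit shuffle width
    print(shuffles_needed(4, 16, shuffle_bits=64))   # hypothetical 64-bit width
    print(unpack_f16x4(pack_f16x4([1.0, 2.0, 0.5, -3.0])))
```

With a 32-bit shuffle width, the 4xf16 case takes two shuffles per reduction step instead of the single i64 shuffle described in the quote, which is still half as many as shuffling each f16 separately.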

https://github.com/llvm/llvm-project/pull/76015

