krzysz00 wrote: Wanted to ping because, even outside of the `subgroup_reduce` case, rewriting certain `gpu.shuffle` instances to `ds_swizzle` may be advantageous; see https://github.com/llvm/llvm-project/pull/137109