https://github.com/kuhar commented: I wonder if it would make sense to have this in the gpu dialect? I think it should be portable and generalize to other architectures. https://github.com/llvm/llvm-project/pull/152740