Hardcode84 wrote: GPU dialect launch(func) ops are supposed to be a high-level abstraction over different GPU runtimes. I'm not sure that adding such vendor-specific details to them is a good idea. https://github.com/llvm/llvm-project/pull/95545