[clang] [AMDGPU] Introduce 'amdgpu_num_workgroups_{xyz}' builtin (PR #83927)

Joseph Huber via cfe-commits cfe-commits at lists.llvm.org
Tue Mar 5 20:06:00 PST 2024


jhuber6 wrote:

> I think we would be better off teaching an IR optimizer pass to recognize the divide pattern and remap it to the load from the new location, rather than forcing the complexity into every frontend

That's fair. I would've argued that this version should've been the builtin and the grid size the computed value, but it's definitely not ideal to have multiple versions of this. I'll try to find a place to do this peephole optimization.
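
For reference, the divide pattern under discussion can be modeled on the host roughly as follows. This is only a sketch: `LaunchDims`, `num_workgroups_by_divide`, and `num_workgroups_by_load` are hypothetical names for illustration, not anything in the PR, and it assumes the grid size is an exact multiple of the workgroup size.

```cpp
#include <cstdint>

// Hypothetical model of the per-dimension launch values a kernel can query.
struct LaunchDims {
  uint32_t grid_size;       // total number of work-items in this dimension
  uint32_t workgroup_size;  // work-items per workgroup in this dimension
  uint32_t num_workgroups;  // precomputed count, standing in for the new location
};

// The divide pattern a frontend emits today: workgroup count derived
// from the grid size (assumes grid_size % workgroup_size == 0).
constexpr uint32_t num_workgroups_by_divide(const LaunchDims &d) {
  return d.grid_size / d.workgroup_size;
}

// What a peephole could remap the divide to: a direct read of the
// precomputed value, with no division.
constexpr uint32_t num_workgroups_by_load(const LaunchDims &d) {
  return d.num_workgroups;
}
```

Both functions should agree whenever the launch is uniform, which is what would make the rewrite safe in that case.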

https://github.com/llvm/llvm-project/pull/83927

