[clang] [AMDGPU] Introduce 'amdgpu_num_workgroups_{xyz}' builtin (PR #83927)
Joseph Huber via cfe-commits
cfe-commits at lists.llvm.org
Tue Mar 5 20:06:00 PST 2024
jhuber6 wrote:
> I think we would be better off teaching an IR optimizer pass to recognize the divide pattern and remap it to the load from the new location, rather than forcing the complexity into every frontend
That's fair. I would've argued that this version should've been the builtin and the grid size the computed one, but it's definitely not ideal to have multiple versions of this. I'll try to find a place to do this peephole optimization.
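
For illustration only, the peephole under discussion can be sketched on a toy expression tree. This is not LLVM IR and not the code from the PR: the `Expr` type, the `Kind` enum, and `foldNumWorkgroups` are hypothetical names, and the sketch assumes the pattern of interest is an unsigned divide of the grid size by the workgroup size in the same dimension.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Toy expression tree (hypothetical, NOT LLVM IR): just enough node kinds
// to show the rewrite.
enum class Kind { GridSize, WorkgroupSize, NumWorkgroups, UDiv };

struct Expr {
  Kind kind;
  int dim;                         // 0 = x, 1 = y, 2 = z; -1 for non-leaf nodes
  std::shared_ptr<Expr> lhs, rhs;  // operands, used by UDiv only
};

using ExprPtr = std::shared_ptr<Expr>;

// Build a leaf query such as grid_size(x) or workgroup_size(y).
ExprPtr leaf(Kind k, int dim) {
  return std::make_shared<Expr>(Expr{k, dim, nullptr, nullptr});
}

// Build an unsigned division node.
ExprPtr udiv(ExprPtr a, ExprPtr b) {
  return std::make_shared<Expr>(Expr{Kind::UDiv, -1, std::move(a), std::move(b)});
}

// Peephole: udiv(grid_size(d), workgroup_size(d)) -> num_workgroups(d).
// Any other expression is returned unchanged.
ExprPtr foldNumWorkgroups(const ExprPtr &e) {
  if (e->kind == Kind::UDiv && e->lhs && e->rhs &&
      e->lhs->kind == Kind::GridSize &&
      e->rhs->kind == Kind::WorkgroupSize &&
      e->lhs->dim == e->rhs->dim)
    return leaf(Kind::NumWorkgroups, e->lhs->dim);
  return e;
}
```

The point of doing this in an optimizer rather than in a builtin is that every frontend can keep emitting the plain division, and only the one pass needs to know that a precomputed value is available. A real pass would match the corresponding IR instructions and intrinsics instead of this toy tree.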
https://github.com/llvm/llvm-project/pull/83927