[PATCH] D47009: AMDGPU: Add pass to optimize reqd_work_group_size

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu May 17 17:48:23 PDT 2018


rampitec added a comment.

As far as I understand, this is only applicable in one of two cases:

- reqd_work_group_size is used and the program is compiled with -cl-uniform-work-group-size, or
- reqd_work_group_size is used and the program is compiled with -cl-std less than 2.0.
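To illustrate why these conditions matter, here is a minimal sketch in Python (not LLVM code) of the local size seen along one dimension. It assumes the OpenCL 2.0 behaviour that, without -cl-uniform-work-group-size, the last work-group may be a partial one. The function name and parameters are hypothetical and exist only for this illustration.

```python
def local_size(global_size, reqd_size, group_id, uniform):
    """Model of the local size seen by work-group `group_id` in one dimension.

    Hypothetical helper for illustration only; it is not part of any API.
    """
    if uniform:
        # Uniform work-groups (pre-2.0, or -cl-uniform-work-group-size):
        # the global size must be a multiple of the required size,
        # so every work-group is full and the answer is a constant.
        if global_size % reqd_size != 0:
            raise ValueError("global size must be a multiple of the local size")
        return reqd_size

    # Non-uniform work-groups: the trailing group holds only the remainder.
    remaining = global_size - group_id * reqd_size
    if remaining <= 0:
        raise ValueError("group_id is out of range")
    return min(reqd_size, remaining)


# Uniform case: every group reports the required size, so it can be folded.
uniform_sizes = [local_size(256, 64, g, uniform=True) for g in range(4)]

# Non-uniform case: 200 work-items with a required size of 64 leave a
# trailing group of 8, so the value is no longer a compile-time constant.
non_uniform_sizes = [local_size(200, 64, g, uniform=False) for g in range(4)]
```

In the uniform case the query can be replaced by the constant from reqd_work_group_size; in the non-uniform case it cannot, which is why the optimization is restricted to the two cases listed above.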

Potentially other languages can benefit from it as well, depending on what each language standard guarantees.

It may be easier for an FE to call a simplified function, but an FE will not solve the issue with calls from a non-kernel function. Since you are writing a whole pass for this, it makes sense to address that case as well.
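As a rough sketch of the non-kernel case, written in Python rather than as an LLVM pass: a helper function carries no reqd_work_group_size of its own, so the size is only known if every kernel that can reach it agrees on one value. This is just one possible approach, not what the patch does, and the call graph, kernel names and function name below are all hypothetical.

```python
def known_group_size(func, callers, kernel_sizes):
    """Return the work-group size known inside `func`, or None if unknown.

    callers:      maps a function name to the functions that call it.
    kernel_sizes: maps a kernel name to its reqd_work_group_size tuple,
                  or to None if the kernel has no such attribute.
    Both structures are assumptions made for this illustration.
    """
    sizes = set()
    seen = set()
    worklist = [func]
    while worklist:
        f = worklist.pop()
        if f in seen:
            continue  # already visited; also guards against recursion
        seen.add(f)
        if f in kernel_sizes:
            sizes.add(kernel_sizes[f])
        worklist.extend(callers.get(f, ()))
    # The size is usable only if all reaching kernels agree on a single value.
    if len(sizes) == 1:
        return next(iter(sizes))
    return None


kernel_sizes = {"k1": (64, 1, 1), "k2": (64, 1, 1), "k3": None}
callers = {"helper": ["k1", "k2"], "shared": ["helper", "k3"]}

helper_size = known_group_size("helper", callers, kernel_sizes)
shared_size = known_group_size("shared", callers, kernel_sizes)
```

Here "helper" is reached only from kernels that agree on (64, 1, 1), while "shared" is also reached from a kernel without the attribute, so nothing can be assumed there.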


https://reviews.llvm.org/D47009




