[Openmp-commits] [PATCH] D119313: [OpenMP][CUDA] Set the hard team limit to 2^31-1
Shilei Tian via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Wed Feb 9 06:24:42 PST 2022
tianshilei1992 added a comment.
In D119313#3307102 <https://reviews.llvm.org/D119313#3307102>, @jdoerfert wrote:
> I don't think we should hardcode this at all. It seems to me like a device specific value, no? As such we should store it per device and query it in the beginning.
> If this is always going to be 2^31-1, ok, but if there is nothing specifying that I'd say we make it dynamic. I can see little use in a static value anyway.
No. This value is the hard limit, not the per-device value. The per-device value is `BlocksPerGrid`, which is set by querying the CUDA interface. It is then compared against this hard limit and capped accordingly. As a result, even if the device can support more blocks, the value is always capped at 65536.
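To make the capping concrete, here is a minimal sketch of the logic as I understand it. The names `OldHardTeamLimit`, `NewHardTeamLimit`, and `capBlocksPerGrid` are hypothetical and are not the plugin's actual identifiers; the device-reported maximum is passed in as a plain parameter rather than queried from CUDA.

```cpp
#include <cstdint>

// Hypothetical constants for illustration only.
constexpr int32_t OldHardTeamLimit = 65536;      // the cap described above
constexpr int32_t NewHardTeamLimit = 2147483647; // 2^31-1, per the patch title

// Cap the per-device block count by the hard limit.
// DeviceMaxBlocks stands in for the value reported by the CUDA interface.
constexpr int32_t capBlocksPerGrid(int32_t DeviceMaxBlocks, int32_t HardLimit) {
  return DeviceMaxBlocks > HardLimit ? HardLimit : DeviceMaxBlocks;
}
```

Under these assumptions, a device reporting 2^31-1 blocks ends up with 65536 under the old hard limit and keeps its full value under the new one, while a device reporting fewer blocks than the hard limit is unaffected either way.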
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D119313/new/
https://reviews.llvm.org/D119313