[Openmp-commits] [PATCH] D119313: [OpenMP][CUDA] Set the hard team limit to 2^31-1

Shilei Tian via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Wed Feb 9 06:24:42 PST 2022

tianshilei1992 added a comment.

In D119313#3307102 <https://reviews.llvm.org/D119313#3307102>, @jdoerfert wrote:

> I don't think we should hardcode this at all. It seems to me like a device specific value, no? As such we should store it per device and query it in the beginning.
> If this is always going to be 2^31-1, ok, but if there is nothing specifying that I'd say we make it dynamic. I can see little use in a static value anyway.

No. This value is the hard limit, not the per-device value. The per-device value is `BlocksPerGrid`, which is set by querying the CUDA interface. It is then compared against this hard limit and capped accordingly. As a result, even if the device can support more blocks, it will always be capped at 65536.
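
To make the capping concrete, here is a minimal sketch of the behavior described above. It is not the plugin's actual code: the names `OldHardTeamLimit`, `NewHardTeamLimit`, and `clampBlocksPerGrid` are hypothetical, and the device-reported maximum is passed in as a plain integer rather than queried from CUDA.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical names for illustration only; not the plugin's identifiers.
// 65536 is the cap mentioned in the comment above.
constexpr std::int32_t OldHardTeamLimit = 1 << 16;
// 2^31-1 is the value proposed in the patch title.
constexpr std::int32_t NewHardTeamLimit =
    std::numeric_limits<std::int32_t>::max();

// Cap the device-reported block count at the hard limit.
// DeviceMax stands in for the value obtained from the CUDA interface.
constexpr std::int32_t clampBlocksPerGrid(std::int32_t DeviceMax,
                                          std::int32_t HardLimit) {
  return DeviceMax < HardLimit ? DeviceMax : HardLimit;
}
```

Assuming a device that reports a maximum of 2147483647 blocks, `clampBlocksPerGrid(2147483647, OldHardTeamLimit)` yields 65536, while the same call with `NewHardTeamLimit` leaves the device value unchanged. A device that reports less than the hard limit keeps its own value in either case.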

  rG LLVM Github Monorepo


