[Openmp-commits] [PATCH] D98832: [libomptarget] Tune the number of teams and threads for kernel launch.

Dhruva Chakrabarti via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Thu Mar 18 11:25:15 PDT 2021

dhruvachak added a comment.

In D98832#2634168 <https://reviews.llvm.org/D98832#2634168>, @JonChesterfield wrote:

> This is really interesting. The idea seems to be to choose the dispatch parameters based on the kernel metadata and the limits of the machine.
> What's the underlying heuristic? Break across N CU's in chunks that match the occupancy limits of each CU?

Yes, that's the idea.

> If so we probably want to compare LDS usage as well to avoid partitioning poorly for that.
> Maybe others - there might be a performance cliff on amount of private memory too.

Agreed. However, I don't see LDS usage in the metadata table in the image. Is it present there?

In theory, a very high SGPR count can limit the number of workgroups that fit on a CU if it is not factored in when determining the number of threads. In practice, though, VGPRs tend to be the primary limiting factor. So perhaps we can start by using only VGPRs for this purpose and let experience guide us in the future.
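
To illustrate the VGPR-only starting point, here is a minimal sketch of the kind of calculation being discussed. It is not the code in this patch. `DeviceLimits`, `LaunchConfig`, `wavesPerSIMD`, and `chooseLaunch` are hypothetical names, and the default limits (4 SIMDs per CU, 256 VGPRs per lane per SIMD, at most 10 waves per SIMD, wavefront size 64) are assumptions modeled on a GCN-style GPU; a real implementation would query them from the device and read the VGPR count from the kernel metadata.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical device limits. The defaults are assumptions, not queried values.
struct DeviceLimits {
  uint32_t NumCUs = 64;
  uint32_t SIMDsPerCU = 4;
  uint32_t VGPRsPerSIMD = 256;     // VGPR budget per lane on each SIMD
  uint32_t MaxWavesPerSIMD = 10;   // hardware cap on resident waves per SIMD
  uint32_t WavefrontSize = 64;
  uint32_t MaxWorkgroupSize = 1024;
};

// Hypothetical result type: what would be passed to the kernel launch.
struct LaunchConfig {
  uint32_t NumTeams;
  uint32_t ThreadsPerTeam;
};

// Waves that can be resident on one SIMD when each wave needs KernelVGPRs
// registers per lane. Only VGPRs are considered; SGPRs and LDS are ignored.
uint32_t wavesPerSIMD(const DeviceLimits &D, uint32_t KernelVGPRs) {
  if (KernelVGPRs == 0)
    return D.MaxWavesPerSIMD;
  uint32_t ByVGPR = D.VGPRsPerSIMD / KernelVGPRs;
  // Clamp to [1, MaxWavesPerSIMD] so a launch is always possible.
  return std::max<uint32_t>(1, std::min(D.MaxWavesPerSIMD, ByVGPR));
}

// Size each team to the occupancy limit of a CU, then spread teams over all CUs.
LaunchConfig chooseLaunch(const DeviceLimits &D, uint32_t KernelVGPRs) {
  uint32_t ThreadsPerCU =
      wavesPerSIMD(D, KernelVGPRs) * D.SIMDsPerCU * D.WavefrontSize;
  uint32_t ThreadsPerTeam = std::min(D.MaxWorkgroupSize, ThreadsPerCU);
  uint32_t TeamsPerCU = std::max<uint32_t>(1, ThreadsPerCU / ThreadsPerTeam);
  return {D.NumCUs * TeamsPerCU, ThreadsPerTeam};
}
```

With these assumed limits, a kernel using 128 VGPRs fits 2 waves per SIMD, so `chooseLaunch(DeviceLimits{}, 128)` yields 64 teams of 512 threads, while a kernel using 24 VGPRs reaches the 10-wave cap and yields 128 teams of 1024 threads. Adding an SGPR or LDS limit later would amount to taking a further `std::min` inside `wavesPerSIMD`.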

  rG LLVM Github Monorepo


