[Openmp-commits] [PATCH] D102107: [OpenMP] Codegen aggregate for outlined function captures

Johannes Doerfert via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Wed Sep 22 07:09:24 PDT 2021

jdoerfert added a comment.

In D102107#3014759 <https://reviews.llvm.org/D102107#3014759>, @JonChesterfield wrote:

> In D102107#3014743 <https://reviews.llvm.org/D102107#3014743>, @pdhaliwal wrote:
>> I got this after changing __kmpc_impl_malloc to return 0xdeadbeef. So, this confirms that the missing malloc implementation is the root cause.
>>> Memory access fault by GPU node-4 (Agent handle: 0x1bc5000) on address 0xdeadb000. Reason: Page not present or supervisor privilege.
> Nice! In that case I think the way to go is to audit the (probably few) places where __kmpc_impl_malloc is called and add a check for whether the return value is 0. With that in place we can reland this and get more graceful failure (at a guess, we should fall back to the host when GPU memory is exhausted, or maybe just print an 'out of GPU heap memory' style message and abort; I don't know).

We should only fail to remove the `__kmpc_alloc_shared` call at O0. Since we need `__kmpc_alloc_shared` for all non-trivial codes, those would always fail on AMDGPU. That said,
why is the shared memory stack not catching this? It's a 64-byte stack for the main thread and we are looking at a 24-byte allocation for `declare_mapper_target.cpp`.
Can you determine why the first two conditionals in `__kmpc_alloc_shared` don't catch this and return proper memory?
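
To make the expectation concrete, here is a minimal host-side sketch of a stack-first allocator with a heap fallback. It is not the device runtime's code: `SharedStack`, `allocShared`, `stubMalloc`, and the 8-byte alignment are hypothetical and chosen only for illustration. The 64-byte capacity and the 24-byte request come from the discussion above. Under these assumptions, a 24-byte request on an empty stack should be served from the stack and never reach the fallback, and a null fallback result is passed to the caller so it can be checked.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names and sizes; an illustration, not the runtime's code.
constexpr std::size_t kStackBytes = 64; // stack size mentioned in the thread
constexpr std::size_t kAlign = 8;       // assumed alignment

struct SharedStack {
  alignas(kAlign) unsigned char data[kStackBytes];
  std::size_t used = 0; // bytes already handed out
};

struct AllocResult {
  void *ptr;      // nullptr if both the stack and the fallback failed
  bool fromStack; // true if the request was served from the stack
};

using FallbackAlloc = void *(*)(std::size_t);

// Round a request up to the next multiple of kAlign.
constexpr std::size_t alignUp(std::size_t n) {
  return (n + kAlign - 1) & ~(kAlign - 1);
}

// Stand-in for a missing device malloc: always fails.
inline void *stubMalloc(std::size_t) { return nullptr; }

// Serve the request from the stack if it fits, else use the fallback.
inline AllocResult allocShared(SharedStack &s, std::size_t bytes,
                               FallbackAlloc fallback) {
  assert(s.used <= kStackBytes);
  const std::size_t need = alignUp(bytes);
  if (need <= kStackBytes - s.used) {
    void *p = s.data + s.used;
    s.used += need;
    return {p, true};
  }
  // The stack is exhausted; the caller must check ptr for nullptr.
  void *p = fallback ? fallback(need) : nullptr;
  return {p, false};
}
```

With these sizes, two 24-byte requests fit in the stack and the third goes to the fallback. With a stubbed-out malloc, the third request returns a null pointer that the caller has to handle.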


