[Openmp-commits] [PATCH] D119357: [Libomptarget] Increase stack size for bug49779 test

Joseph Huber via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Wed Feb 9 10:50:25 PST 2022

jhuber6 created this revision.
jhuber6 added reviewers: Meinersbur, jdoerfert.
jhuber6 requested review of this revision.
Herald added a project: OpenMP.
Herald added a subscriber: openmp-commits.

The 'bug49779.cpp' test has been failing recently. Without
optimizations, the runtime code for nested parallelism is complex
enough that the CUDA tools cannot statically determine the stack size.
As a result, the kernel can exceed the per-thread stack size and crash.
Work around this by setting the 'LIBOMPTARGET_STACK_SIZE' environment
variable in the test, and add an FAQ entry for this situation.

Fixes #53670

Repository:
  rG LLVM Github Monorepo



Index: openmp/libomptarget/test/offloading/bug49779.cpp
--- openmp/libomptarget/test/offloading/bug49779.cpp
+++ openmp/libomptarget/test/offloading/bug49779.cpp
@@ -1,8 +1,8 @@
-// RUN: %libomptarget-compilexx-run-and-check-aarch64-unknown-linux-gnu
-// RUN: %libomptarget-compilexx-run-and-check-powerpc64-ibm-linux-gnu
-// RUN: %libomptarget-compilexx-run-and-check-powerpc64le-ibm-linux-gnu
-// RUN: %libomptarget-compilexx-run-and-check-x86_64-pc-linux-gnu
-// RUN: %libomptarget-compilexx-run-and-check-nvptx64-nvidia-cuda
+// RUN: %libomptarget-compilexx-generic && \
+// RUN:   env LIBOMPTARGET_STACK_SIZE=2048 %libomptarget-run-generic
+// UNSUPPORTED: amdgcn-amd-amdhsa
+// UNSUPPORTED: amdgcn-amd-amdhsa-newDriver
 #include <cassert>
 #include <iostream>
Index: openmp/docs/SupportAndFAQ.rst
--- openmp/docs/SupportAndFAQ.rst
+++ openmp/docs/SupportAndFAQ.rst
@@ -313,3 +313,16 @@
 are C and C++ with Fortran support planned in the future. Compiler support is
 best for Clang but this module should work for other compiler vendors such as
+Q: What does 'Stack size for entry function cannot be statically determined' mean?
+This is a warning that the Nvidia tools will sometimes emit if the offloading
+region is too complex. Normally, the CUDA tools attempt to statically determine
+how much stack memory each thread needs. This way when the kernel is launched each
+thread will have as much memory as it needs. If the control flow of the kernel
+is too complex, containing recursive calls or nested parallelism, this analysis
+can fail. If this warning is triggered it means that the kernel may run out of
+stack memory during execution and crash. The environment variable
+``LIBOMPTARGET_STACK_SIZE`` can be used to increase the stack size if this

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D119357.407213.patch
Type: text/x-patch
Size: 2004 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/openmp-commits/attachments/20220209/81fd04d9/attachment.bin>
