<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/71747">71747</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [libomptarget][tests] Maximum assumed device heap size.
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            openmp:libomp
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
            jdoerfert
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          Meinersbur
      </td>
    </tr>
</table>

<pre>
    The GPU of the [openmp-offload-cuda-project](https://lab.llvm.org/staging/#/builders/68) and [openmp-offload-cuda-runtime](https://lab.llvm.org/staging/#/builders/7) buildbots has 4 GiB of memory, but the default device heap size returned by `cuCtxGetLimit(CU_LIMIT_MALLOC_HEAP_SIZE, ...)` is only 8388608 bytes (8 MiB). [offloading/malloc.c](https://github.com/llvm/llvm-project/blob/main/openmp/libomptarget/test/offloading/malloc.c) requires ~55 MB of device heap, and [offloading/malloc_parallel.c](https://github.com/llvm/llvm-project/blob/main/openmp/libomptarget/test/offloading/malloc_parallel.c) requires even more, so both tests fail on these machines.

I don't know how CUDA determines the default heap size limit, but I assume it is constant and inherited from the earliest days of CUDA.

To fix this, we can either
 1. limit the amount of heap allocated by any test to 8 MiB (e.g. by reducing the number of teams in malloc_parallel.c to 48), or
 2. set `LIBOMPTARGET_HEAP_SIZE` to at least the maximum heap size allocated by any test. The following patch fixes the two malloc tests, reducing the number of failed tests to 30:

```
diff --git a/openmp/libomptarget/test/lit.cfg b/openmp/libomptarget/test/lit.cfg
index 6dab31bd35a9..e288827c50f6 100644
--- a/openmp/libomptarget/test/lit.cfg
+++ b/openmp/libomptarget/test/lit.cfg
@@ -31,6 +31,8 @@ if 'LIBOMPTARGET_LOCK_MAPPED_HOST_BUFFERS' in os.environ:
 if 'OMP_TARGET_OFFLOAD' in os.environ:
     config.environment['OMP_TARGET_OFFLOAD'] = os.environ['OMP_TARGET_OFFLOAD']

+config.environment['LIBOMPTARGET_HEAP_SIZE'] = '134217728' # 128 MiB
+
 # set default environment variables for test
 if 'CHECK_OPENMP_ENV' in os.environ:
     test_env = os.environ['CHECK_OPENMP_ENV'].split()
```

A 64 MiB heap is sufficient for the malloc.c test, but not for malloc_parallel.c, which is why the patch uses 128 MiB.
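
A possible variant of the patch, sketched here only as an illustration: let a `LIBOMPTARGET_HEAP_SIZE` already set in the caller's environment take precedence, and fall back to 128 MiB otherwise. The helper name `heap_size_env` and the `default_mib` parameter are hypothetical and not part of lit.cfg; a plain dict stands in for `config.environment` so the snippet runs on its own.

```python
import os

MIB = 1024 * 1024  # bytes per MiB


def heap_size_env(environ, default_mib=128):
    """Return the LIBOMPTARGET_HEAP_SIZE value (in bytes, as a string) for the tests.

    Hypothetical helper: an explicit setting in `environ` wins; otherwise the
    default of `default_mib` MiB is used.
    """
    if 'LIBOMPTARGET_HEAP_SIZE' in environ:
        return environ['LIBOMPTARGET_HEAP_SIZE']
    return str(default_mib * MIB)


# In lit.cfg the target would be config.environment; a dict stands in for it here.
environment = {}
environment['LIBOMPTARGET_HEAP_SIZE'] = heap_size_env(os.environ)
```

With no override, this yields the same '134217728' as the patch above; whether an override should be allowed at all on the buildbots is a separate question.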
</pre>