[llvm] [openmp] [OpenMP][AMDGPU] Adapt dynamic callstack sizes to HIP behavior (PR #74080)

Tue Mar 5 07:12:19 PST 2024

mhalk wrote:

> The TargetParser does seem to only be exposing boolean-ish looking names, which unfortunately would imply introducing N different feature names there. I've lost track of why you need this in the first place; if you over-commit on the stack size, won't the lower level API fail for you?

When launching a kernel which requires a dynamic callstack we want to use the maximum target-specific `StackSize`.

Yes, such a failure will occur, but IIRC this is not desired:
```
AMDGPU fatal error 1: Received error in queue 0x7f24b2a44000:
HSA_STATUS_ERROR_OUT_OF_RESOURCES: The runtime failed to allocate the necessary resources.
This error may also occur when the core runtime library needs to spawn threads or create internal OS-specific events.
Aborted (core dumped)
```
(IMHO in a complex scenario this message might not be very helpful.)

With the current changes, the user will be informed and the launch will proceed without aborting, using the maximum scratch memory.
(Unfortunately, it's still a matter of relaying the info into the OpenMP amdgpu plugin, so basically the situation hasn't changed much.)
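To illustrate the intended behavior, here is a minimal sketch of "clamp and inform instead of abort". It is not the code from this PR: `chooseDynamicStackSize`, `StackSizeChoice` and the `MaxTargetStackSize` parameter are hypothetical names, and the actual target-specific maximum is assumed to be supplied by the caller.

```cpp
#include <cstdint>
#include <cstdio>

// Result of picking a per-thread stack size for a kernel launch.
struct StackSizeChoice {
  uint64_t Size; // Stack size that will actually be used.
  bool Clamped;  // True if the request exceeded the target maximum.
};

// Hypothetical helper, not the plugin's API. MaxTargetStackSize stands in
// for the target-specific limit; its real value is an assumption of the caller.
StackSizeChoice chooseDynamicStackSize(uint64_t Requested,
                                       uint64_t MaxTargetStackSize) {
  if (Requested <= MaxTargetStackSize)
    return {Requested, false};

  // Inform the user and fall back to the maximum instead of letting the
  // lower-level runtime abort with an out-of-resources error.
  std::fprintf(stderr,
               "warning: requested stack size %llu exceeds target maximum "
               "%llu; using the maximum\n",
               static_cast<unsigned long long>(Requested),
               static_cast<unsigned long long>(MaxTargetStackSize));
  return {MaxTargetStackSize, true};
}
```

A caller would pass the user-requested size and the limit for the current target, then launch with the returned `Size`. The `Clamped` flag is only there so the caller can relay the information further if needed.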

https://github.com/llvm/llvm-project/pull/74080