[flang-commits] [flang] [Flang][OpenMP] Heap-allocate GPU dynamic private arrays in distribute parallel do (PR #200841)
Spencer Bryngelson via flang-commits
flang-commits at lists.llvm.org
Sat Jun 20 18:45:47 PDT 2026
sbryngelson wrote:
Verified the fix (PR llvm/llvm-project#200841) on gfx90a (MI210).
The public Jun-12 ROCm nightly (`therock-dist-linux-gfx90a-7.14.0a20260612`, fork commit `52226beb`) still faults — it builds from a branch that doesn't have the fix yet. The fix is only on `amd-staging`, so I built flang from source:
```
flang version 23.0.0git (https://github.com/ROCm/llvm-project.git 09cac6e4d442814e2448d62fe7a57b8d135e3d09) [amd-staging]
```
Original minimal reproducer (VLA `private` array sized from an assumed-shape dummy) on gfx90a:
| compiler | result |
|---|---|
| ROCm 7.2.0 (LLVM 22) | GPU memory access fault |
| AFAR 23.2.0 (LLVM 23) | GPU memory access fault |
| Jun-12 nightly 7.14 (no fix) | GPU memory access fault |
| **flang built from amd-staging (fix)** | **`1.`, exit 0, deterministic over 3 runs** |
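
For readers who have not seen the reproducer, the pattern described above (a `private` automatic array whose extent comes from an assumed-shape dummy, used inside a `distribute parallel do` on the device) looks roughly like the sketch below. This is a hypothetical reconstruction, not the reproducer from the PR: the module, subroutine, and variable names other than `tmp` are made up for illustration, and the array sizes are arbitrary.

```fortran
! Hypothetical sketch of the failing pattern; NOT the reproducer from the PR.
module repro_mod
  implicit none
contains
  subroutine kernel(a, out)
    real, intent(in)  :: a(:)      ! assumed-shape dummy
    real, intent(out) :: out(:)
    real :: tmp(size(a))           ! automatic array: extent known only at run time
    integer :: i, j

    ! Each device thread needs its own copy of tmp, whose size is not a
    ! compile-time constant. This is the case the PR moves to the device heap.
    !$omp target teams distribute parallel do private(tmp, j) &
    !$omp&   map(to: a) map(from: out)
    do i = 1, size(out)
      do j = 1, size(a)
        tmp(j) = a(j)
      end do
      out(i) = tmp(1)
    end do
    !$omp end target teams distribute parallel do
  end subroutine kernel
end module repro_mod

program main
  use repro_mod
  implicit none
  real :: a(8), out(4)

  a = 1.0
  out = 0.0
  call kernel(a, out)
  print *, out(1)
end program main
```

Assuming a flang build with AMDGPU offload support, something like `flang -fopenmp --offload-arch=gfx90a -Xoffload-linker -lc repro.f90` should exercise the private-array path; the file name is an assumption, and the exact flags may differ by toolchain.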
Confirmed it's the fix doing the work:
- compiling emits the new diagnostic `OpenMP private dynamic array 'tmp' ... using device heap allocation`
- the device kernel now references `malloc`/`free` instead of an `addrspace(5)` scratch alloca
Note for anyone hitting this: as of 2026-06-20 the fix is only on `amd-staging`, not in any shipping or nightly build, so a stock `amdflang` still faults until the fix is promoted from `amd-staging`. Device linking of the heap-allocated path needs the GPU libc (`-Xoffload-linker -lc`); `malloc` resolves via `__ockl_dm_alloc`.
https://github.com/llvm/llvm-project/pull/200841