[llvm-branch-commits] [flang] [mlir] [Flang][OpenMP] Add pass to replace allocas with device shared memory (PR #161863)

Sergio Afonso via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Mon Mar 2 06:05:09 PST 2026


skatrak wrote:

> This is useful for optimization cases where generic is converted to SPMD and we need to decide which stores to guard and which not. In that case, if we only let the main thread write `setID`, all other threads get garbage because it is local to each thread. Making it shared solves that problem.

I see what you mean. Having taken a look at the SPMD-ization logic, my understanding is that the store instructions that get guarded are those that write to shared memory. In this case, `setID` would be allocated in private stack memory (it's not used inside a parallel region), so OpenMPOpt should not introduce any guards for it.

On the other hand, the array descriptor for `Quad%AngSetPtrArray` / `ASet` would use shared memory because: (i) it's passed to an `llvm.intr.memcpy` call before the parallel region (we conservatively assume any pointer passed to another function might reach a parallel region); and (ii) it's read from within the parallel region.
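To make the classification described above concrete, here is a toy C++ model of it. This is only a sketch of the rule as I understand it, not the pass's actual code: `AllocaUses`, `MemSpace`, `classify`, and the two example values are hypothetical names, and I'm assuming `setID` is not passed to any call.

```cpp
#include <cassert>

// Hypothetical summary of how one stack allocation is used in a target region.
struct AllocaUses {
  bool usedInParallelRegion; // read or written inside a parallel region
  bool passedToCall;         // pointer handed to a call or intrinsic
};

enum class MemSpace { PrivateStack, DeviceShared };

// Conservative rule: a pointer passed to another function might reach a
// parallel region, so it is treated like a direct use inside one.
constexpr MemSpace classify(AllocaUses uses) {
  return (uses.usedInParallelRegion || uses.passedToCall)
             ? MemSpace::DeviceShared
             : MemSpace::PrivateStack;
}

// `setID`: not used inside a parallel region (assumed: not passed to a call).
constexpr AllocaUses setIdUses{false, false};
// Descriptor: passed to a memcpy and read inside the parallel region.
constexpr AllocaUses descriptorUses{true, true};

static_assert(classify(setIdUses) == MemSpace::PrivateStack,
              "setID stays in private stack memory");
static_assert(classify(descriptorUses) == MemSpace::DeviceShared,
              "the descriptor is placed in device shared memory");
```

Under this model either condition alone is enough to move the descriptor to shared memory, while `setID` meets neither.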

I think the problem here is not that we're using the wrong memory spaces, but rather that OpenMPOpt does not add a guard to the `llvm.intr.memcpy` call that initializes the descriptor in shared memory before the parallel region. `AAKernelInfoCallSite::initialize` never adds a guard to an intrinsic, though in this case we should probably do so if the destination pointer of the memory copy points to shared memory, similar to the `llvm.store` case.
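As a rough illustration of the difference, the following toy C++ model compares the guarding decision as I understand it today with the one suggested here. It does not use any LLVM API: `OpKind`, `Op`, `needsGuardCurrent`, and `needsGuardSuggested` are hypothetical names, and the "current" predicate only encodes my reading of the SPMD-ization logic.

```cpp
#include <cassert>

enum class MemSpace { PrivateStack, DeviceShared };
enum class OpKind { Store, MemcpyIntrinsic };

// Hypothetical view of a side-effecting operation in the sequential part of
// the kernel: what it is, and which memory space its destination lives in.
struct Op {
  OpKind kind;
  MemSpace dest;
};

// Current behavior as I understand it: only stores to shared memory are
// guarded; intrinsics never are.
constexpr bool needsGuardCurrent(Op op) {
  return op.kind == OpKind::Store && op.dest == MemSpace::DeviceShared;
}

// Suggested behavior: also guard a memory copy whose destination is in
// shared memory, mirroring the store case.
constexpr bool needsGuardSuggested(Op op) {
  return (op.kind == OpKind::Store || op.kind == OpKind::MemcpyIntrinsic) &&
         op.dest == MemSpace::DeviceShared;
}

// The descriptor initialization: a memcpy into device shared memory.
constexpr Op descriptorInit{OpKind::MemcpyIntrinsic, MemSpace::DeviceShared};

static_assert(!needsGuardCurrent(descriptorInit),
              "today the memcpy is left unguarded");
static_assert(needsGuardSuggested(descriptorInit),
              "with the suggested rule it would be guarded");
```

The two predicates agree on plain stores and only differ for a memory copy into shared memory, which is the case hit by the descriptor initialization.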

Let me know @abidh if that analysis makes sense or if I'm missing something.

https://github.com/llvm/llvm-project/pull/161863

