[Mlir-commits] [mlir] [OpenMP][MLIR] Hoist static `alloca`s emitted by private `init` regions to the allocation IP of the construct (PR #171597)

Wed Dec 10 09:18:49 PST 2025

================
@@ -1596,10 +1596,64 @@ static llvm::Expected<llvm::Value *> initPrivateVar(
   return phis[0];
 }
 
+/// Beginning with \p startBlock, this function visits all reachable successor
+/// blocks. For each such block, static alloca instructions (i.e. non-array
+/// allocas) are collected. Then, these collected alloca instructions are moved
+/// to the \p allocaIP insertion point.
+///
+/// This is useful in cases where, for example, more than one allocatable or
+/// array are privatized. In such cases, we allocate a number of temporary
+/// descriptors to handle the initialization logic. Additionally, for each
+/// private value, there is branching logic based on the value of the original
+/// private variable's allocation state. Therefore, we end up with descriptor
+/// alloca instructions preceded by conditional branches, which causes runtime
+/// issues, at least on the GPU.
+static void hoistStaticAllocasToAllocaIP(
+    llvm::BasicBlock *startBlock,
+    const llvm::OpenMPIRBuilder::InsertPointTy &allocaIP) {
+  llvm::SmallVector<llvm::BasicBlock *> inlinedBlocks{startBlock};
+  llvm::SmallPtrSet<llvm::BasicBlock *, 4> seenBlocks;
+  llvm::SmallVector<llvm::Instruction *> staticAllocas;
+
+  while (!inlinedBlocks.empty()) {
+    llvm::BasicBlock *curBlock = inlinedBlocks.front();
+    inlinedBlocks.erase(inlinedBlocks.begin());
+    llvm::Instruction *terminator = curBlock->getTerminator();
+
+    for (llvm::Instruction &inst : *curBlock) {
+      if (auto *allocaInst = mlir::dyn_cast<llvm::AllocaInst>(&inst)) {
+        if (!allocaInst->isArrayAllocation()) {
----------------
tblah wrote:

Why are array allocations special? I would have thought the case worth worrying about is an array allocation whose size is determined dynamically; a statically sized array allocation should work okay.

Would multiple of these dynamically sized arrays still crash the GPU code?
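
To make the distinction concrete, here is a minimal sketch of the two candidate filters on a toy model. It does not use the LLVM API: `ToyAlloca`, `ToyBlock`, `hasConstantSize`, and `hoistAllocas` are hypothetical names invented for illustration. The toy `isArrayAllocation` mirrors my understanding of `llvm::AllocaInst::isArrayAllocation()`, which is true whenever the array size is anything other than the constant 1, so it also rejects statically sized arrays.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <optional>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-in for an alloca. The element count is either a
// compile-time constant or unknown until run time (std::nullopt).
struct ToyAlloca {
  std::string name;
  std::optional<std::size_t> constantCount;

  // True for anything other than a constant count of exactly 1.
  bool isArrayAllocation() const {
    return !constantCount || *constantCount != 1;
  }
  // True for scalars and for arrays with a compile-time constant count.
  bool hasConstantSize() const { return constantCount.has_value(); }
};

// Hypothetical stand-in for a basic block.
struct ToyBlock {
  std::vector<ToyAlloca> allocas;
  std::vector<ToyBlock *> successors;
};

// Breadth-first walk over the blocks reachable from `start`. Every alloca
// accepted by `shouldHoist` is removed from its block and appended to `entry`.
template <typename Pred>
void hoistAllocas(ToyBlock &start, ToyBlock &entry, Pred shouldHoist) {
  std::deque<ToyBlock *> worklist{&start};
  std::unordered_set<ToyBlock *> seen{&start};
  std::vector<ToyAlloca> hoisted;

  while (!worklist.empty()) {
    ToyBlock *cur = worklist.front();
    worklist.pop_front();

    std::vector<ToyAlloca> kept;
    for (ToyAlloca &a : cur->allocas) {
      if (shouldHoist(a))
        hoisted.push_back(std::move(a));
      else
        kept.push_back(std::move(a));
    }
    cur->allocas = std::move(kept);

    for (ToyBlock *succ : cur->successors)
      if (seen.insert(succ).second)
        worklist.push_back(succ);
  }

  // Append only after the walk, so `entry` is never modified mid-iteration.
  for (ToyAlloca &a : hoisted)
    entry.allocas.push_back(std::move(a));
}
```

With a scalar descriptor, a 16-element constant array, and a dynamically sized array below a branch, the `!isArrayAllocation()` filter hoists only the descriptor, while a `hasConstantSize()` filter also hoists the constant array and leaves only the dynamic one in place. Whether the remaining dynamic allocas are still a problem on the GPU is the open question.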

https://github.com/llvm/llvm-project/pull/171597