[Mlir-commits] [mlir] [OpenMP][MLIR] Hoist static `alloca`s emitted by private `init` regions to the allocation IP of the construct (PR #171597)

Kareem Ergawy llvmlistbot at llvm.org
Thu Dec 11 01:16:53 PST 2025


================
@@ -1596,10 +1596,64 @@ static llvm::Expected<llvm::Value *> initPrivateVar(
   return phis[0];
 }
 
+/// Beginning with \p startBlock, this function visits all reachable successor
+/// blocks. For each such block, static alloca instructions (i.e. non-array
+/// allocas) are collected. Then, these collected alloca instructions are moved
+/// to the \p allocaIP insertion point.
+///
+/// This is useful in cases where, for example, more than one allocatable or
+/// array is privatized. In such cases, we allocate a number of temporary
+/// descriptors to handle the initialization logic. Additionally, for each
+/// private value, there is branching logic based on the value of the original
+/// private variable's allocation state. Therefore, we end up with descriptor
+/// alloca instructions preceded by conditional branches, which causes runtime
+/// issues at least on the GPU.
+static void hoistStaticAllocasToAllocaIP(
+    llvm::BasicBlock *startBlock,
+    const llvm::OpenMPIRBuilder::InsertPointTy &allocaIP) {
+  llvm::SmallVector<llvm::BasicBlock *> inlinedBlocks{startBlock};
+  llvm::SmallPtrSet<llvm::BasicBlock *, 4> seenBlocks;
+  llvm::SmallVector<llvm::Instruction *> staticAllocas;
+
+  while (!inlinedBlocks.empty()) {
+    llvm::BasicBlock *curBlock = inlinedBlocks.front();
+    inlinedBlocks.erase(inlinedBlocks.begin());
+    llvm::Instruction *terminator = curBlock->getTerminator();
+
+    for (llvm::Instruction &inst : *curBlock) {
+      if (auto *allocaInst = mlir::dyn_cast<llvm::AllocaInst>(&inst)) {
+        if (!allocaInst->isArrayAllocation()) {
----------------
ergawy wrote:

For descriptors, the problematic allocations are the allocations of the temporary descriptor structures used to initialize the private storage. All such allocations are static allocations because they are just structs.

The dynamic arrays are allocated only when the original value is allocated, so these allocations have to stay in the proper branch, since we read the shape from the original value.

The problem when we have many descriptors is that:
1. We inline the `init` region of descriptor number 1 which includes temp allocations + the if-else branch for initialization.
2. We do the same for descriptor number 2.
3. .....
Because of that, such temp allocations are emitted between the if-else branches of consecutive descriptors.

I think the compiler is smart enough to keep the dynamic allocations in place, since they are obviously guarded inside a branch, whereas some of the static allocations emitted after the branches join again are problematic because they are supposed to be unconditional.
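
To make the shape of the problem concrete, here is a minimal sketch in plain C++ that models the situation with toy types rather than the LLVM API. `Inst`, `Block`, `hoistStaticAllocas`, and `demoHoist` are all hypothetical names invented for this illustration. Appending to the end of an "alloca block" is a simplification of the real `allocaIP` insertion point. The sketch inlines two `init` regions back to back, so the second descriptor's temporary lands after the first descriptor's if-else has joined, then walks the reachable blocks and moves only the non-array allocas to the entry block.

```cpp
#include <cstddef>
#include <deque>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for IR entities (hypothetical, not the LLVM classes).
struct Inst {
  std::string name;
  bool isAlloca;
  bool isArrayAlloca; // true when the element count is only known at run time
};

struct Block {
  std::string name;
  std::vector<Inst> insts;
  std::vector<Block *> succs;
};

Inst staticAlloca(const std::string &n) { return Inst{n, true, false}; }
Inst arrayAlloca(const std::string &n) { return Inst{n, true, true}; }
Inst other(const std::string &n) { return Inst{n, false, false}; }

// Breadth-first walk over every block reachable from `start`. Non-array
// allocas are appended to `allocaBlock`; everything else stays where it is.
// Returns the number of hoisted instructions.
std::size_t hoistStaticAllocas(Block &start, Block &allocaBlock) {
  std::deque<Block *> worklist{&start};
  std::set<Block *> seen{&start};
  std::size_t hoisted = 0;

  while (!worklist.empty()) {
    Block *cur = worklist.front();
    worklist.pop_front();

    std::vector<Inst> kept;
    for (const Inst &inst : cur->insts) {
      bool isStatic = inst.isAlloca && !inst.isArrayAlloca;
      if (isStatic && cur != &allocaBlock) {
        allocaBlock.insts.push_back(inst);
        ++hoisted;
      } else {
        kept.push_back(inst);
      }
    }
    cur->insts = std::move(kept);

    for (Block *succ : cur->succs)
      if (seen.insert(succ).second) // enqueue each block only once
        worklist.push_back(succ);
  }
  return hoisted;
}

struct DemoResult {
  std::vector<std::string> entryInsts; // entry block contents after hoisting
  std::vector<std::string> then1Insts; // "allocated" branch of descriptor 1
  std::size_t hoisted;
};

// Two init regions inlined one after the other: descriptor 2's temporary
// ends up in the join block of descriptor 1's if-else.
DemoResult demoHoist() {
  Block entry{"entry", {}, {}};
  Block init1{"init1", {staticAlloca("desc1.tmp"), other("cond_br")}, {}};
  Block then1{"then1", {arrayAlloca("desc1.data")}, {}};
  Block else1{"else1", {}, {}};
  Block join1{"join1", {staticAlloca("desc2.tmp"), other("cond_br")}, {}};
  Block then2{"then2", {arrayAlloca("desc2.data")}, {}};
  Block else2{"else2", {}, {}};
  Block join2{"join2", {}, {}};

  entry.succs = {&init1};
  init1.succs = {&then1, &else1};
  then1.succs = {&join1};
  else1.succs = {&join1};
  join1.succs = {&then2, &else2};
  then2.succs = {&join2};
  else2.succs = {&join2};

  DemoResult result;
  result.hoisted = hoistStaticAllocas(init1, entry);
  for (const Inst &inst : entry.insts)
    result.entryInsts.push_back(inst.name);
  for (const Inst &inst : then1.insts)
    result.then1Insts.push_back(inst.name);
  return result;
}
```

In this toy model, calling `demoHoist()` leaves both descriptor temporaries in the entry block, while the array allocation for descriptor 1 stays inside its guarded branch. That is the split between static and dynamic allocations described above.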

https://github.com/llvm/llvm-project/pull/171597

