[Openmp-commits] [openmp] [Flang][OpenMP][MLIR] Initial array section mapping MLIR -> LLVM-IR lowering utilising omp.bounds (PR #68689)

Thu Oct 19 02:44:25 PDT 2023

@@ -1629,13 +1622,153 @@ getRefPtrIfDeclareTarget(mlir::Value value,
   return nullptr;
+// A small helper structure that gathers the data needed
+// for map lowering and coalesces it into one place,
+// avoiding extra computations such as searching the
+// LLVM module for lowered mapped variables or checking
+// whether something is declare target (and retrieving
+// the value).
+struct MapData {
+  bool isDeclareTarget = false;
+  mlir::Operation *mapClause;
+  llvm::Value *basePointer;
+  llvm::Value *pointer;
+  llvm::Value *kernelValue;
+  llvm::Type *underlyingType;
+  llvm::Value *sizeInBytes;
+uint64_t getArrayElementSizeInBits(LLVM::LLVMArrayType arrTy, DataLayout &dl) {
+  if (auto nestedArrTy = llvm::dyn_cast_if_present<LLVM::LLVMArrayType>(
+          arrTy.getElementType()))
+    return getArrayElementSizeInBits(nestedArrTy, dl);
+  return dl.getTypeSizeInBits(arrTy.getElementType());
+// This function calculates the size to be offloaded for a specified type, given
+// its associated map clause (which can contain bounds information that affects
+// the total size). The size is calculated from the underlying element type,
+// e.g. given a 1-D array of ints, we calculate the size as the size of the
+// integer type * the number of elements in the array. This size can be used in
+// other calculations but is ultimately used as an argument to the OpenMP
+// runtime's kernel argument structure, which is generated through the
+// combinedInfo data structures.
+// This function is somewhat equivalent to Clang's getExprTypeSize inside of
+// CGOpenMPRuntime.cpp.
+llvm::Value *getSizeInBytes(DataLayout &dl, const mlir::Type &type,
agozillon wrote:

I also prefer the uniformity and ease of use, but perhaps I'm too close to the source on this one. I do understand the dislike for the excess generated arguments though, and the possible performance impact if subsequent optimisation passes can't tidy them up.

However, it's not the only thing I've encountered recently during the lowering that will need some polishing when we can get to it. Allocatables, for some odd reason, spawn multiple allocas of the descriptor + data structure that all get the same value assigned to them, and then only one of them is actually used. I'm unsure why (perhaps someone else knows and there is a good reason, or maybe it's an OpenMP lowering oddity), but it might be worth looking into in the future.
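
For readers following the quoted hunk, the size calculation its comment describes (innermost element size multiplied by the number of mapped elements) can be sketched roughly as below. This is a standalone illustration and not the code from the PR: `Type`, `Bounds`, `makeScalar`, `makeArray`, `innermostElementSizeInBits`, `fullSizeInBits` and `sizeInBytes` are all hypothetical stand-ins that use plain integers instead of `llvm::Value` and MLIR types, and the inclusive-bounds convention is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for an LLVM dialect type: either a scalar with a
// known bit width, or an array of some element type.
struct Type {
  uint64_t scalarBits = 0;       // meaningful only for scalars
  uint64_t numElements = 0;      // meaningful only for arrays
  std::shared_ptr<Type> element; // non-null only for arrays
  bool isArray() const { return element != nullptr; }
};

// Hypothetical stand-in for one dimension of bounds information.
// Assumption: both ends are inclusive.
struct Bounds {
  int64_t lower;
  int64_t upper;
};

Type makeScalar(uint64_t bits) {
  Type t;
  t.scalarBits = bits;
  return t;
}

Type makeArray(uint64_t n, const Type &elem) {
  Type t;
  t.numElements = n;
  t.element = std::make_shared<Type>(elem);
  return t;
}

// Recurse through nested array types until a non-array element is found,
// mirroring the shape of getArrayElementSizeInBits in the quoted hunk.
uint64_t innermostElementSizeInBits(const Type &ty) {
  if (ty.isArray())
    return innermostElementSizeInBits(*ty.element);
  return ty.scalarBits;
}

// Size of the whole type in bits, used when no bounds are supplied.
uint64_t fullSizeInBits(const Type &ty) {
  if (ty.isArray())
    return ty.numElements * fullSizeInBits(*ty.element);
  return ty.scalarBits;
}

// With bounds on an array: product of per-dimension extents times the
// innermost element size. Otherwise: the size of the full type.
uint64_t sizeInBytes(const Type &ty, const std::vector<Bounds> &bounds) {
  if (!ty.isArray() || bounds.empty())
    return fullSizeInBits(ty) / 8;
  uint64_t elementCount = 1;
  for (const Bounds &b : bounds) {
    assert(b.upper >= b.lower && "bounds must describe a non-empty range");
    elementCount *= static_cast<uint64_t>(b.upper - b.lower + 1);
  }
  return elementCount * (innermostElementSizeInBits(ty) / 8);
}
```

In this sketch, a 10 x 20 array of 32-bit integers mapped with bounds `[1, 4]` and `[0, 9]` covers 4 * 10 elements of 4 bytes each, while the same array mapped without bounds covers the full 10 * 20 elements.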


