[Mlir-commits] [mlir] [Flang][OpenMP][MLIR] Initial array section mapping MLIR -> LLVM-IR lowering utilising omp.bounds (PR #68689)
Akash Banerjee
llvmlistbot at llvm.org
Tue Oct 17 07:43:26 PDT 2023
================
@@ -1629,13 +1622,153 @@ getRefPtrIfDeclareTarget(mlir::Value value,
return nullptr;
}
+// A small helper structure to contain data gathered
+// for map lowering and coalesce it into one area,
+// avoiding extra computations such as searches in the
+// llvm module for lowered mapped variables or checking
+// if something is declare target (and retrieving the
+// value).
+struct MapData {
+ bool isDeclareTarget = false;
+ mlir::Operation *mapClause;
+ llvm::Value *basePointer;
+ llvm::Value *pointer;
+ llvm::Value *kernelValue;
+ llvm::Type *underlyingType;
+ llvm::Value *sizeInBytes;
+};
+
+uint64_t getArrayElementSizeInBits(LLVM::LLVMArrayType arrTy, DataLayout &dl) {
+ if (auto nestedArrTy = llvm::dyn_cast_if_present<LLVM::LLVMArrayType>(
+ arrTy.getElementType()))
+ return getArrayElementSizeInBits(nestedArrTy, dl);
+ return dl.getTypeSizeInBits(arrTy.getElementType());
+}
+
+// This function calculates the size to be offloaded for a specified type, given
+// its associated map clause (which can contain bounds information which affects
+// the total size), this size is calculated based on the underlying element type
+// e.g. given a 1-D array of ints, we will calculate the size from the integer
+// type * number of elements in the array. This size can be used in other
+// calculations but is ultimately used as an argument to the OpenMP runtimes
+// kernel argument structure which is generated through the combinedInfo data
+// structures.
+// This function is somewhat equivalent to Clang's getExprTypeSize inside of
+// CGOpenMPRuntime.cpp.
+llvm::Value *getSizeInBytes(DataLayout &dl, const mlir::Type &type,
----------------
TIFitis wrote:
Consider the following example:
```
subroutine omp_target(a, b, c)
integer, intent(in) :: a, b, c
integer :: x(a, b, c)
!$omp target map(tofrom : x)
!$omp end target
end subroutine omp_target
```
Here's a slice of the llvm IR generated:
```
%.offload_sizes = alloca [1 x i64], align 8
%kernel_args = alloca %struct.__tgt_kernel_arguments, align 8
%4 = load i32, ptr %0, align 4
%5 = sext i32 %4 to i64
%6 = icmp sgt i64 %5, 0
%7 = select i1 %6, i64 %5, i64 0
%8 = load i32, ptr %1, align 4
%9 = sext i32 %8 to i64
%10 = icmp sgt i64 %9, 0
%11 = select i1 %10, i64 %9, i64 0
%12 = load i32, ptr %2, align 4
%13 = sext i32 %12 to i64
%14 = icmp sgt i64 %13, 0
%15 = select i1 %14, i64 %13, i64 0
%16 = mul i64 1, %7
%17 = mul i64 %16, %11
%18 = mul i64 %17, %15
%19 = alloca i32, i64 %18, align 4
%20 = sub i64 %7, 1
%21 = sub i64 %11, 1
%22 = sub i64 %15, 1
br label %entry
entry: ; preds = %3
%23 = sub i64 %20, 0
%24 = add i64 %23, 1
%25 = sub i64 %21, 0
%26 = add i64 %25, 1
%27 = mul i64 %24, %26
%28 = sub i64 %22, 0
%29 = add i64 %28, 1
%30 = mul i64 %27, %29
%31 = mul i64 %30, 4
%34 = getelementptr inbounds [1 x i64], ptr %.offload_sizes, i32 0, i32 0
store i64 %31, ptr %34, align 8
```
The code above recomputes the element count as `%30`, which is the same value as `%18`, already computed for the `alloca` instruction.
My view is that we don't need a boundsOp, and should neither use nor generate one, unless the user has provided explicit bounds.
Others, however, have already expressed that we would like to have a boundsOp present at all times and use it whenever possible, as that favours a single solution for all cases. I am not strictly against this, but I prefer the former approach.
From what I can tell, Clang also reuses the equivalent of `%18` for the offload size in this case.
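To make the redundancy concrete, here is a minimal C++ sketch of the arithmetic in the IR slice. It is an illustration only, not code from the PR: the function names are hypothetical, and the 4-byte element size is taken from the `mul i64 %30, 4` instruction for `i32` elements.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Clamp a runtime extent to be non-negative, mirroring the
// icmp sgt + select pairs in the IR (%6/%7, %10/%11, %14/%15).
inline std::int64_t clampExtent(std::int32_t n) {
  return std::max<std::int64_t>(static_cast<std::int64_t>(n), 0);
}

// Element count as computed for the alloca (%16 to %18).
inline std::int64_t
allocaElementCount(const std::array<std::int32_t, 3> &dims) {
  std::int64_t count = 1;
  for (std::int32_t d : dims)
    count *= clampExtent(d);
  return count;
}

// Element count as recomputed from the bounds (%20 to %30):
// upper = extent - 1, lower = 0, per-dimension size = upper - lower + 1.
inline std::int64_t
boundsElementCount(const std::array<std::int32_t, 3> &dims) {
  std::int64_t count = 1;
  for (std::int32_t d : dims) {
    std::int64_t upper = clampExtent(d) - 1;
    std::int64_t lower = 0;
    count *= (upper - lower) + 1;
  }
  return count;
}

// Size in bytes stored into .offload_sizes (%31), assuming 4-byte elements.
inline std::int64_t
offloadSizeInBytes(const std::array<std::int32_t, 3> &dims) {
  return boundsElementCount(dims) * 4;
}
```

For any extents, `boundsElementCount` returns the same value as `allocaElementCount`, since `(extent - 1) - 0 + 1` is just the clamped extent. That is why the second chain of `sub`/`add`/`mul` instructions adds nothing when no explicit bounds are given.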
https://github.com/llvm/llvm-project/pull/68689