[Mlir-commits] [llvm] [mlir] [OpenMP][MLIR][OMPIRBuilder] Add a small optional constant alloca raise function pass to finalize, utilised in convertTarget (PR #78818)
llvmlistbot at llvm.org
Mon Feb 19 20:50:26 PST 2024
================
@@ -0,0 +1,43 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+// A small condensed version of a problem requiring constant alloca raising in
+// Target Region Entries for user injected code, found in an issue in the Flang
+// compiler. Certain LLVM IR optimisation passes will perform runtime breaking
+// transformations on allocations not found to be in the entry block, current
+// OpenMP dialect lowering of TargetOp's will inject user allocations after
+// compiler generated entry code, in a seperate block, this test checks that
+// a small function which attempts to raise some of these (specifically
+// constant sized) allocations performs its task reasonably in these
+// scenarios.
+
+module attributes {omp.is_target_device = true} {
+ llvm.func @_QQmain() attributes {omp.declare_target = #omp.declaretarget<device_type = (host), capture_clause = (to)>} {
+ %1 = llvm.mlir.constant(1 : i64) : i64
+ %2 = llvm.alloca %1 x !llvm.struct<(ptr)> : (i64) -> !llvm.ptr
+ %3 = omp.map_info var_ptr(%2 : !llvm.ptr, !llvm.struct<(ptr)>) map_clauses(tofrom) capture(ByRef) -> !llvm.ptr
+ omp.target map_entries(%3 -> %arg0 : !llvm.ptr) {
+ ^bb0(%arg0: !llvm.ptr):
+ %4 = llvm.mlir.constant(1 : i32) : i32
+ %5 = llvm.alloca %4 x !llvm.struct<(ptr)> {alignment = 8 : i64} : (i32) -> !llvm.ptr
+ %6 = llvm.mlir.constant(50 : i32) : i32
+ %7 = llvm.mlir.constant(1 : i64) : i64
+ %8 = llvm.alloca %7 x i32 : (i64) -> !llvm.ptr
+ llvm.store %6, %8 : i32, !llvm.ptr
+ %9 = llvm.mlir.undef : !llvm.struct<(ptr)>
+ %10 = llvm.insertvalue %8, %9[0] : !llvm.struct<(ptr)>
+ llvm.store %10, %5 : !llvm.struct<(ptr)>, !llvm.ptr
+ %88 = llvm.call @_ExternalCall(%arg0, %5) : (!llvm.ptr, !llvm.ptr) -> !llvm.struct<()>
----------------
agozillon wrote:
I made the following small test case using blocks. It attempts to reproduce the same behavior as the C++ test case above, with some extras on top:
```
PROGRAM main
  integer, allocatable :: a
  allocate(a)
  a = 5
  !$omp target map(tofrom:a)
  sub_call : BLOCK
    INTEGER :: b
    b = a + 2
    CALL Sub(b)
    do i = 1, 10
      BLOCK
        integer :: j
        j = 10
        j = j + 10
        a = a + j
      END BLOCK
      a = a + b
    end do
  END BLOCK sub_call
  !$omp end target
  print *, a
END PROGRAM main
SUBROUTINE Sub(B)
  INTEGER :: b
  b = b * b
END SUBROUTINE Sub
```
It yields the same answer on both the target and the regular Fortran host, and moving `j` out into the first block gives identical results as well. What is interesting is that when I force undefined behavior by removing the initial assignment to `j`, the regular Fortran host code prints a garbage value, whereas the target returns a consistent value, effectively not emitting the addition of `j` to `a` in the loop. This happens with or without the alloca raise pass here, so it seems to be a separate inconsistency.
I did, however, have to use the deprecated FIR flow via `-flang-deprecated-no-hlfir` (which still uses the alloca raise, so it would still produce the relevant inconsistencies), because this little test has unfortunately exposed another compiler crash for allocatables, very similar to the one this PR fixes (it crashes with or without this PR, sadly). So that will be fun to look into. I am unsure at this point whether it's a related or an entirely new issue, but I will look into it more in the near future.
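For anyone checking the printed value by hand, the arithmetic of the Fortran test case above can be mirrored on the host. This is only a rough Python transliteration under stated assumptions, not part of the PR and not the C++ test case mentioned earlier. The names `sub` and `run_block_test` are hypothetical, and the in-place update that `CALL Sub(b)` performs is modelled here with a return value.

```python
def sub(b: int) -> int:
    """Stand-in for SUBROUTINE Sub (hypothetical name).

    Fortran passes b by reference and squares it in place;
    Python ints are immutable, so the result is returned instead.
    """
    return b * b


def run_block_test() -> int:
    """Mirror the body of the target region, ignoring OpenMP offload entirely."""
    a = 5                    # a = 5 after allocate(a)
    b = a + 2                # b = 7
    b = sub(b)               # CALL Sub(b): b becomes 49
    for _ in range(1, 11):   # do i = 1, 10
        j = 10               # j is local to the inner BLOCK
        j = j + 10           # j = 20
        a = a + j
        a = a + b
    return a


if __name__ == "__main__":
    print(run_block_test())  # prints 695
```

Each of the 10 iterations adds 20 + 49 = 69 to `a`, so a correct build should print 5 + 690 = 695 on both host and target when `j` is initialized. The sketch says nothing about the undefined-behavior variant, where the result depends on the compiler.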
https://github.com/llvm/llvm-project/pull/78818